enhance todo tool prompt

Update README.md
chore: bump version to 0.5.0
2025-12-15 10:26:59 -08:00 · 2025-12-15 10:11:08 -08:00 · 2025-12-15 08:21:03 -08:00 · 2025-12-15 08:21:03 -08:00 · 2025-12-15 07:41:33 -08:00 · 2025-12-15 19:39:47 +04:00
66 changed files with 5509 additions and 818 deletions
--- a/.github/logo.png
+++ b/.github/logo.png
--- a/.github/workflows/build-release.yml
+++ b/.github/workflows/build-release.yml
@@ -0,0 +1,78 @@
+name: Build & Release
+
+on:
+  push:
+    tags:
+      - 'v*'
+  workflow_dispatch:
+
+jobs:
+  build:
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - os: macos-latest
+            target: macos-arm64
+          - os: macos-15-intel
+            target: macos-x86_64
+          - os: ubuntu-latest
+            target: linux-x86_64
+          - os: windows-latest
+            target: windows-x86_64
+
+    runs-on: ${{ matrix.os }}
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - uses: snok/install-poetry@v1
+
+      - name: Build
+        shell: bash
+        run: |
+          poetry install --with dev
+          poetry run pyinstaller strix.spec --noconfirm
+
+          VERSION=$(poetry version -s)
+          mkdir -p dist/release
+
+          if [[ "${{ runner.os }}" == "Windows" ]]; then
+            cp dist/strix.exe "dist/release/strix-${VERSION}-${{ matrix.target }}.exe"
+            (cd dist/release && 7z a "strix-${VERSION}-${{ matrix.target }}.zip" "strix-${VERSION}-${{ matrix.target }}.exe")
+          else
+            cp dist/strix "dist/release/strix-${VERSION}-${{ matrix.target }}"
+            chmod +x "dist/release/strix-${VERSION}-${{ matrix.target }}"
+            tar -C dist/release -czvf "dist/release/strix-${VERSION}-${{ matrix.target }}.tar.gz" "strix-${VERSION}-${{ matrix.target }}"
+          fi
+
+      - uses: actions/upload-artifact@v4
+        with:
+          name: strix-${{ matrix.target }}
+          path: |
+            dist/release/*.tar.gz
+            dist/release/*.zip
+          if-no-files-found: error
+
+  release:
+    needs: build
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+
+    steps:
+      - uses: actions/download-artifact@v4
+        with:
+          path: release
+          merge-multiple: true
+
+      - name: Create Release
+        uses: softprops/action-gh-release@v2
+        with:
+          prerelease: ${{ !startsWith(github.ref, 'refs/tags/') }}
+          generate_release_notes: true
+          files: release/*
--- a/.gitignore
+++ b/.gitignore
@@ -79,6 +79,7 @@ logs/
 tensorboard/

 # Agent execution traces
+strix_runs/
 agent_runs/

 # Misc
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -101,7 +101,7 @@ We welcome feature ideas! Please:

 ## 🤝 Community

- **Discord**: [Join our community](https://discord.gg/J48Fzuh7)
+- **Discord**: [Join our community](https://discord.gg/YjKFvEZSdZ)
 - **Issues**: [GitHub Issues](https://github.com/usestrix/strix/issues)

 ## ✨ Recognition
@@ -113,4 +113,4 @@ We value all contributions! Contributors will be:

 ---

-**Questions?** Reach out on [Discord](https://discord.gg/J48Fzuh7) or create an issue. We're here to help!
+**Questions?** Reach out on [Discord](https://discord.gg/YjKFvEZSdZ) or create an issue. We're here to help!
--- a/README.md
+++ b/README.md
@@ -1,82 +1,125 @@
-<div align="center">
+<p align="center">
+  <a href="https://usestrix.com/">
+    <img src=".github/logo.png" width="150" alt="Strix Logo">
+  </a>
+</p>

-# Strix
+<h1 align="center">Strix</h1>

-### Open-source AI hackers for your apps
-
-[![Strix](https://img.shields.io/badge/Strix-usestrix.com-1a1a1a.svg)](https://usestrix.com)
-[![Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
-[![Discord](https://img.shields.io/badge/Discord-join-5865F2?logo=discord&logoColor=white)](https://discord.gg/J48Fzuh7)
-[![PyPI Downloads](https://static.pepy.tech/personalized-badge/strix-agent?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLACK&left_text=Downloads)](https://pepy.tech/projects/strix-agent)
-[![GitHub stars](https://img.shields.io/github/stars/usestrix/strix.svg?style=social&label=Star)](https://github.com/usestrix/strix)
-</div>
+<h2 align="center">Open-source AI Hackers to secure your Apps</h2>

 <div align="center">
-<img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px; box-shadow: 0 20px 40px rgba(0, 0, 0, 0.3), 0 0 0 1px rgba(255, 255, 255, 0.1), inset 0 1px 0 rgba(255, 255, 255, 0.2); transform: perspective(1000px) rotateX(2deg); transition: transform 0.3s ease;">
+
+[![Python](https://img.shields.io/pypi/pyversions/strix-agent?color=3776AB)](https://pypi.org/project/strix-agent/)
+[![PyPI](https://img.shields.io/pypi/v/strix-agent?color=10b981)](https://pypi.org/project/strix-agent/)
+![PyPI Downloads](https://static.pepy.tech/personalized-badge/strix-agent?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=RED&left_text=Downloads)
+[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
+
+[![GitHub Stars](https://img.shields.io/github/stars/usestrix/strix)](https://github.com/usestrix/strix)
+[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.gg/YjKFvEZSdZ)
+[![Website](https://img.shields.io/badge/Website-usestrix.com-2d3748.svg)](https://usestrix.com)
+
+<a href="https://trendshift.io/repositories/15362" target="_blank"><img src="https://trendshift.io/api/badge/repositories/15362" alt="usestrix%2Fstrix | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+
+
+[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/usestrix/strix)
+
 </div>

+<br>
+
+<div align="center">
+  <img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px;">
+</div>
+
+<br>
+
+> [!TIP]
+> **New!** Strix now integrates seamlessly with GitHub Actions and CI/CD pipelines. Automatically scan for vulnerabilities on every pull request and block insecure code before it reaches production!
+
 ---

 ## 🦉 Strix Overview

-Strix are autonomous AI agents that act just like real hackers - they run your code dynamically, find vulnerabilities, and validate them through actual exploitation. Built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.
+Strix are autonomous AI agents that act just like real hackers - they run your code dynamically, find vulnerabilities, and validate them through actual proofs of concept. Built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.

- **Full hacker toolkit** out of the box
- **Teams of agents** that collaborate and scale
- **Real validation** via exploitation and PoC, not false positives
- **Developer‑first** CLI with actionable reports
- **Auto‑fix & reporting** to accelerate remediation
+**Key Capabilities:**
+
+- 🔧 **Full hacker toolkit** out of the box
+- 🤝 **Teams of agents** that collaborate and scale
+- ✅ **Real validation** with PoCs, not false positives
+- 💻 **Developer‑first** CLI with actionable reports
+- 🔄 **Auto‑fix & reporting** to accelerate remediation
+
+
+## 🎯 Use Cases
+
+- **Application Security Testing** - Detect and validate critical vulnerabilities in your applications
+- **Rapid Penetration Testing** - Get penetration tests done in hours, not weeks, with compliance reports
+- **Bug Bounty Automation** - Automate bug bounty research and generate PoCs for faster reporting
+- **CI/CD Integration** - Run tests in CI/CD to block vulnerabilities before they reach production

 ---

-### 🎯 Use Cases
+## 🚀 Quick Start

- Detect and validate critical vulnerabilities in your applications.
- Get penetration tests done in hours, not weeks, with compliance reports.
- Automate bug bounty research and generate PoCs for faster reporting.
- Run tests in CI/CD to block vulnerabilities before reaching production.
-
---
-
-### 🚀 Quick Start
-
-Prerequisites:
+**Prerequisites:**
 - Docker (running)
- Python 3.12+
- An LLM provider key (or a local LLM)
+- An LLM provider key (e.g. an [OpenAI API key](https://platform.openai.com/api-keys)) or a local LLM
+
+### Installation & First Scan

 ```bash
-# Install
+# Install Strix
+curl -sSL https://strix.ai/install | bash
+
+# Or via pipx
 pipx install strix-agent

-# Configure AI provider
+# Configure your AI provider
 export STRIX_LLM="openai/gpt-5"
 export LLM_API_KEY="your-api-key"

-# Run security assessment
+# Run your first security assessment
 strix --target ./app-directory
 ```

-First run pulls the sandbox Docker image. Results are saved under `agent_runs/<run-name>`.
+> [!NOTE]
+> The first run automatically pulls the sandbox Docker image. Results are saved to `strix_runs/<run-name>`.

-### ☁️ Cloud Hosted
+## ☁️ Run Strix in Cloud

-Want to skip the setup? Try our cloud-hosted version: **[usestrix.com](https://usestrix.com)**
+Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.usestrix.com](https://usestrix.com)**.
+
+Launch a scan in just a few minutes—no setup or configuration required—and you’ll get:
+
+- **A full pentest report** with validated findings and clear remediation steps
+- **Shareable dashboards** your team can use to track fixes over time
+- **CI/CD and GitHub integrations** to block risky changes before production
+- **Continuous monitoring** so new vulnerabilities are caught quickly
+
+[**Run your first pentest now →**](https://usestrix.com)
+
+---

 ## ✨ Features

 ### 🛠️ Agentic Security Tools

- **🔌 Full HTTP Proxy** - Full request/response manipulation and analysis
- **🌐 Browser Automation** - Multi-tab browser for testing of XSS, CSRF, auth flows
- **💻 Terminal Environments** - Interactive shells for command execution and testing
- **🐍 Python Runtime** - Custom exploit development and validation
- **🔍 Reconnaissance** - Automated OSINT and attack surface mapping
- **📁 Code Analysis** - Static and dynamic analysis capabilities
- **📝 Knowledge Management** - Structured findings and attack documentation
+Strix agents come equipped with a comprehensive security testing toolkit:
+
+- **Full HTTP Proxy** - Full request/response manipulation and analysis
+- **Browser Automation** - Multi-tab browser for testing XSS, CSRF, and auth flows
+- **Terminal Environments** - Interactive shells for command execution and testing
+- **Python Runtime** - Custom exploit development and validation
+- **Reconnaissance** - Automated OSINT and attack surface mapping
+- **Code Analysis** - Static and dynamic analysis capabilities
+- **Knowledge Management** - Structured findings and attack documentation

 ### 🎯 Comprehensive Vulnerability Detection

+Strix can identify and validate a wide range of security vulnerabilities:
+
 - **Access Control** - IDOR, privilege escalation, auth bypass
 - **Injection Attacks** - SQL, NoSQL, command injection
 - **Server-Side** - SSRF, XXE, deserialization flaws
@@ -87,55 +130,51 @@ Want to skip the setup? Try our cloud-hosted version: **[usestrix.com](https://u

 ### 🕸️ Graph of Agents

+Advanced multi-agent orchestration for comprehensive security testing:
+
 - **Distributed Workflows** - Specialized agents for different attacks and assets
 - **Scalable Testing** - Parallel execution for fast comprehensive coverage
 - **Dynamic Coordination** - Agents collaborate and share discoveries

+---

 ## 💻 Usage Examples

+### Basic Usage
+
 ```bash
-# Local codebase analysis
+# Scan a local codebase
 strix --target ./app-directory

-# Repository security review
+# Security review of a GitHub repository
 strix --target https://github.com/org/repo

-# Web application assessment
+# Black-box web application assessment
 strix --target https://your-app.com
-
-# Multi-target white-box testing (source code + deployed app)
-strix -t https://github.com/org/app -t https://your-app.com
-
-# Test multiple environments simultaneously
-strix -t https://dev.your-app.com -t https://staging.your-app.com -t https://prod.your-app.com
-
-# Focused testing with instructions
-strix --target api.your-app.com --instruction "Prioritize authentication and authorization testing"
-
-# Testing with credentials
-strix --target https://your-app.com --instruction "Test with credentials: testuser/testpass. Focus on privilege escalation and access control bypasses."
 ```

-### ⚙️ Configuration
+### Advanced Testing Scenarios

 ```bash
-export STRIX_LLM="openai/gpt-5"
-export LLM_API_KEY="your-api-key"
+# Grey-box authenticated testing
+strix --target https://your-app.com --instruction "Perform authenticated testing using credentials: user:pass"

-# Optional
-export LLM_API_BASE="your-api-base-url"  # if using a local model, e.g. Ollama, LMStudio
-export PERPLEXITY_API_KEY="your-api-key"  # for search capabilities
+# Multi-target testing (source code + deployed app)
+strix -t https://github.com/org/app -t https://your-app.com
+
+# Focused testing with custom instructions
+strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"
+
+# Provide detailed instructions through a file (e.g., rules of engagement, scope, exclusions)
+strix --target api.your-app.com --instruction-file ./instruction.md
 ```

-[📚 View supported AI models](https://docs.litellm.ai/docs/providers)
-
 ### 🤖 Headless Mode

 Run Strix programmatically without the interactive UI using the `-n/--non-interactive` flag, which suits servers and automated jobs. The CLI prints vulnerability findings in real time and the final report before exiting. It exits with a non-zero code when vulnerabilities are found.

 ```bash
-strix -n --target https://your-app.com --instruction "Focus on authentication and authorization vulnerabilities"
+strix -n --target https://your-app.com
 ```

 ### 🔄 CI/CD (GitHub Actions)
@@ -152,63 +191,49 @@ jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6

      - name: Install Strix
-        run: pipx install strix-agent
+        run: curl -sSL https://strix.ai/install | bash

      - name: Run Strix
        env:
          STRIX_LLM: ${{ secrets.STRIX_LLM }}
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}

-        run: strix -n -t ./
+        run: strix -n -t ./ --scan-mode quick
 ```

-## 🏆 Enterprise Platform
+### ⚙️ Configuration

-Our managed platform provides:
+```bash
+export STRIX_LLM="openai/gpt-5"
+export LLM_API_KEY="your-api-key"

- **📈 Executive Dashboards**
- **🧠 Custom Fine-Tuned Models**
- **⚙️ CI/CD Integration**
- **🔍 Large-Scale Scanning**
- **🔌 Third-Party Integrations**
- **🎯 Enterprise Support**
+# Optional
+export LLM_API_BASE="your-api-base-url"  # if using a local model, e.g. Ollama, LMStudio
+export PERPLEXITY_API_KEY="your-api-key"  # for search capabilities
+```

-[**Get Enterprise Demo →**](https://usestrix.com)
-
-## 🔒 Security Architecture
-
- **Container Isolation** - All testing in sandboxed Docker environments
- **Local Processing** - Testing runs locally, no data sent to external services
-
-> [!WARNING]
-> Only test systems you own or have permission to test. You are responsible for using Strix ethically and legally.
+[OpenAI's GPT-5](https://openai.com/api/) (`openai/gpt-5`) and [Anthropic's Claude Sonnet 4.5](https://claude.com/platform/api) (`anthropic/claude-sonnet-4-5`) are the recommended models for best results with Strix. We also support many [other options](https://docs.litellm.ai/docs/providers), including cloud and local models, though their performance and reliability may vary.

 ## 🤝 Contributing

-We welcome contributions from the community! There are several ways to contribute:
+We welcome contributions of code, docs, and new prompt modules - check out our [Contributing Guide](CONTRIBUTING.md) to get started or open a [pull request](https://github.com/usestrix/strix/pulls)/[issue](https://github.com/usestrix/strix/issues).

-### Code Contributions
-See our [Contributing Guide](CONTRIBUTING.md) for details on:
- Setting up your development environment
- Running tests and quality checks
- Submitting pull requests
- Code style guidelines
+## 👥 Join Our Community

-### Prompt Modules Collection
-Help expand our collection of specialized prompt modules for AI agents:
- Advanced testing techniques for vulnerabilities, frameworks, and technologies
- See [Prompt Modules Documentation](strix/prompts/README.md) for guidelines
- Submit via [pull requests](https://github.com/usestrix/strix/pulls) or [issues](https://github.com/usestrix/strix/issues)
+Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/YjKFvEZSdZ)**

 ## 🌟 Support the Project

 **Love Strix?** Give us a ⭐ on GitHub!
+## 🙏 Acknowledgements

-## 👥 Join Our Community
+Strix builds on the incredible work of open-source projects like [LiteLLM](https://github.com/BerriAI/litellm), [Caido](https://github.com/caido/caido), [ProjectDiscovery](https://github.com/projectdiscovery), [Playwright](https://github.com/microsoft/playwright), and [Textual](https://github.com/Textualize/textual). Huge thanks to their maintainers!

-Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/J48Fzuh7)**
+
+> [!WARNING]
+> Only test apps you own or have permission to test. You are responsible for using Strix ethically and legally.

 </div>
--- a/containers/Dockerfile
+++ b/containers/Dockerfile
@@ -158,7 +158,7 @@ RUN mkdir -p /workspace && chown -R pentester:pentester /workspace /app
 COPY pyproject.toml poetry.lock ./

 USER pentester
-RUN poetry install --no-root --without dev
+RUN poetry install --no-root --without dev --extras sandbox
 RUN poetry run playwright install chromium

 RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \
--- a/poetry.lock
+++ b/poetry.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "strix-agent"
-version = "0.3.1"
+version = "0.5.0"
 description = "Open-source AI Hackers for your apps"
 authors = ["Strix <hi@usestrix.com>"]
 readme = "README.md"
@@ -26,6 +26,8 @@ classifiers = [
  "Programming Language :: Python :: 3",
  "Programming Language :: Python :: 3 :: Only",
  "Programming Language :: Python :: 3.12",
+  "Programming Language :: Python :: 3.13",
+  "Programming Language :: Python :: 3.14",
 ]
 packages = [
  { include = "strix", format = ["sdist", "wheel"] }
@@ -43,24 +45,33 @@ strix = "strix.interface.main:main"

 [tool.poetry.dependencies]
 python = "^3.12"
-fastapi = "*"
-uvicorn = "*"
-litellm = { version = "~1.75.8", extras = ["proxy"] }
-openai = ">=1.99.5,<1.100.0"
+# Core CLI dependencies
+litellm = { version = "~1.80.7", extras = ["proxy"] }
 tenacity = "^9.0.0"
-numpydoc = "^1.8.0"
 pydantic = {extras = ["email"], version = "^2.11.3"}
-ipython = "^9.3.0"
-openhands-aci = "^0.3.0"
-playwright = "^1.48.0"
 rich = "*"
 docker = "^7.1.0"
-gql = {extras = ["requests"], version = "^3.5.3"}
 textual = "^4.0.0"
 xmltodict = "^0.13.0"
-pyte = "^0.8.1"
 requests = "^2.32.0"
-libtmux = "^0.46.2"
+
+# Optional LLM provider dependencies
+google-cloud-aiplatform = { version = ">=1.38", optional = true }
+
+# Sandbox-only dependencies (only needed inside Docker container)
+fastapi = { version = "*", optional = true }
+uvicorn = { version = "*", optional = true }
+ipython = { version = "^9.3.0", optional = true }
+openhands-aci = { version = "^0.3.0", optional = true }
+playwright = { version = "^1.48.0", optional = true }
+gql = { version = "^3.5.3", extras = ["requests"], optional = true }
+pyte = { version = "^0.8.1", optional = true }
+libtmux = { version = "^0.46.2", optional = true }
+numpydoc = { version = "^1.8.0", optional = true }
+
+[tool.poetry.extras]
+vertex = ["google-cloud-aiplatform"]
+sandbox = ["fastapi", "uvicorn", "ipython", "openhands-aci", "playwright", "gql", "pyte", "libtmux", "numpydoc"]

 [tool.poetry.group.dev.dependencies]
 # Type checking and static analysis
@@ -81,6 +92,9 @@ pre-commit = "^4.2.0"
 black = "^25.1.0"
 isort = "^6.0.1"

+# Build tools
+pyinstaller = { version = "^6.17.0", python = ">=3.12,<3.15" }
+
 [build-system]
 requires = ["poetry-core"]
 build-backend = "poetry.core.masonry.api"
@@ -129,9 +143,15 @@ module = [
    "textual.*",
    "pyte.*",
    "libtmux.*",
+    "pytest.*",
 ]
 ignore_missing_imports = true

+# Relax strict rules for test files (pytest decorators are not fully typed)
+[[tool.mypy.overrides]]
+module = ["tests.*"]
+disallow_untyped_decorators = false
+
 # ============================================================================
 # Ruff Configuration (Fast Python Linter & Formatter)
 # ============================================================================
@@ -321,7 +341,6 @@ addopts = [
    "--cov-report=term-missing",
    "--cov-report=html",
    "--cov-report=xml",
-    "--cov-fail-under=80"
 ]
 testpaths = ["tests"]
 python_files = ["test_*.py", "*_test.py"]
--- a/scripts/build.sh
+++ b/scripts/build.sh
@@ -0,0 +1,98 @@
+#!/bin/bash
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+
+echo -e "${BLUE}🦉 Strix Build Script${NC}"
+echo "================================"
+
+OS="$(uname -s)"
+ARCH="$(uname -m)"
+
+case "$OS" in
+    Linux*)     OS_NAME="linux";;
+    Darwin*)    OS_NAME="macos";;
+    MINGW*|MSYS*|CYGWIN*) OS_NAME="windows";;
+    *)          OS_NAME="unknown";;
+esac
+
+case "$ARCH" in
+    x86_64|amd64)   ARCH_NAME="x86_64";;
+    arm64|aarch64)  ARCH_NAME="arm64";;
+    *)              ARCH_NAME="$ARCH";;
+esac
+
+echo -e "${YELLOW}Platform:${NC} $OS_NAME-$ARCH_NAME"
+
+cd "$PROJECT_ROOT"
+
+if ! command -v poetry &> /dev/null; then
+    echo -e "${RED}Error: Poetry is not installed${NC}"
+    echo "Please install Poetry first: https://python-poetry.org/docs/#installation"
+    exit 1
+fi
+
+echo -e "\n${BLUE}Installing dependencies...${NC}"
+poetry install --with dev
+
+VERSION=$(poetry version -s)
+echo -e "${YELLOW}Version:${NC} $VERSION"
+
+echo -e "\n${BLUE}Cleaning previous builds...${NC}"
+rm -rf build/ dist/
+
+echo -e "\n${BLUE}Building binary with PyInstaller...${NC}"
+poetry run pyinstaller strix.spec --noconfirm
+
+RELEASE_DIR="dist/release"
+mkdir -p "$RELEASE_DIR"
+
+BINARY_NAME="strix-${VERSION}-${OS_NAME}-${ARCH_NAME}"
+
+if [ "$OS_NAME" = "windows" ]; then
+    if [ ! -f "dist/strix.exe" ]; then
+        echo -e "${RED}Build failed: Binary not found${NC}"
+        exit 1
+    fi
+    BINARY_NAME="${BINARY_NAME}.exe"
+    cp "dist/strix.exe" "$RELEASE_DIR/$BINARY_NAME"
+    echo -e "\n${BLUE}Creating zip...${NC}"
+    ARCHIVE_NAME="${BINARY_NAME%.exe}.zip"
+
+    if command -v 7z &> /dev/null; then
+        7z a "$RELEASE_DIR/$ARCHIVE_NAME" "$RELEASE_DIR/$BINARY_NAME"
+    else
+        powershell -Command "Compress-Archive -Path '$RELEASE_DIR/$BINARY_NAME' -DestinationPath '$RELEASE_DIR/$ARCHIVE_NAME'"
+    fi
+    echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
+else
+    if [ ! -f "dist/strix" ]; then
+        echo -e "${RED}Build failed: Binary not found${NC}"
+        exit 1
+    fi
+    cp "dist/strix" "$RELEASE_DIR/$BINARY_NAME"
+    chmod +x "$RELEASE_DIR/$BINARY_NAME"
+    echo -e "\n${BLUE}Creating tarball...${NC}"
+    ARCHIVE_NAME="${BINARY_NAME}.tar.gz"
+    tar -czvf "$RELEASE_DIR/$ARCHIVE_NAME" -C "$RELEASE_DIR" "$BINARY_NAME"
+    echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
+fi
+
+echo -e "\n${GREEN}Build successful!${NC}"
+echo "================================"
+echo -e "${YELLOW}Binary:${NC} $RELEASE_DIR/$BINARY_NAME"
+
+SIZE=$(ls -lh "$RELEASE_DIR/$BINARY_NAME" | awk '{print $5}')
+echo -e "${YELLOW}Size:${NC} $SIZE"
+
+echo -e "\n${BLUE}Testing binary...${NC}"
+"$RELEASE_DIR/$BINARY_NAME" --help > /dev/null 2>&1 && echo -e "${GREEN}Binary test passed!${NC}" || echo -e "${RED}Binary test failed${NC}"
+
+echo -e "\n${GREEN}Done!${NC}"
--- a/scripts/install.sh
+++ b/scripts/install.sh
@@ -0,0 +1,328 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+APP=strix
+REPO="usestrix/strix"
+STRIX_IMAGE="ghcr.io/usestrix/strix-sandbox:0.1.10"
+
+MUTED='\033[0;2m'
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+CYAN='\033[0;36m'
+NC='\033[0m'
+
+requested_version=${VERSION:-}
+SKIP_DOWNLOAD=false
+
+raw_os=$(uname -s)
+os=$(echo "$raw_os" | tr '[:upper:]' '[:lower:]')
+case "$raw_os" in
+  Darwin*) os="macos" ;;
+  Linux*) os="linux" ;;
+  MINGW*|MSYS*|CYGWIN*) os="windows" ;;
+esac
+
+arch=$(uname -m)
+if [[ "$arch" == "aarch64" ]]; then
+  arch="arm64"
+fi
+if [[ "$arch" == "x86_64" ]]; then
+  arch="x86_64"
+fi
+
+if [ "$os" = "macos" ] && [ "$arch" = "x86_64" ]; then
+  rosetta_flag=$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)
+  if [ "$rosetta_flag" = "1" ]; then
+    arch="arm64"
+  fi
+fi
+
+combo="$os-$arch"
+case "$combo" in
+  linux-x86_64|macos-x86_64|macos-arm64|windows-x86_64)
+    ;;
+  *)
+    echo -e "${RED}Unsupported OS/Arch: $os/$arch${NC}"
+    exit 1
+    ;;
+esac
+
+archive_ext=".tar.gz"
+if [ "$os" = "windows" ]; then
+  archive_ext=".zip"
+fi
+
+target="$os-$arch"
+
+if [ "$os" = "linux" ]; then
+    if ! command -v tar >/dev/null 2>&1; then
+         echo -e "${RED}Error: 'tar' is required but not installed.${NC}"
+         exit 1
+    fi
+fi
+
+if [ "$os" = "windows" ]; then
+    if ! command -v unzip >/dev/null 2>&1; then
+        echo -e "${RED}Error: 'unzip' is required but not installed.${NC}"
+        exit 1
+    fi
+fi
+
+INSTALL_DIR=$HOME/.strix/bin
+mkdir -p "$INSTALL_DIR"
+
+if [ -z "$requested_version" ]; then
+    specific_version=$(curl -s "https://api.github.com/repos/$REPO/releases/latest" | sed -n 's/.*"tag_name": *"v\([^"]*\)".*/\1/p' || true)
+    if [[ -z "$specific_version" ]]; then
+        echo -e "${RED}Failed to fetch version information${NC}"
+        exit 1
+    fi
+else
+    specific_version=$requested_version
+fi
+
+filename="$APP-${specific_version}-${target}${archive_ext}"
+url="https://github.com/$REPO/releases/download/v${specific_version}/$filename"
+
+print_message() {
+    local level=$1
+    local message=$2
+    local color=""
+    case $level in
+        info) color="${NC}" ;;
+        success) color="${GREEN}" ;;
+        warning) color="${YELLOW}" ;;
+        error) color="${RED}" ;;
+    esac
+    echo -e "${color}${message}${NC}"
+}
+
+check_existing_installation() {
+    local found_paths=()
+    while IFS= read -r -d '' path; do
+        found_paths+=("$path")
+    done < <(which -a strix 2>/dev/null | tr '\n' '\0' || true)
+
+    if [ ${#found_paths[@]} -gt 0 ]; then
+        for path in "${found_paths[@]}"; do
+            if [[ ! -e "$path" ]] || [[ "$path" == "$INSTALL_DIR/strix"* ]]; then
+                continue
+            fi
+
+            if [[ -n "$path" ]]; then
+                echo -e "${MUTED}Found existing strix at: ${NC}$path"
+
+                if [[ "$path" == *".local/bin"* ]]; then
+                    echo -e "${MUTED}Removing old pipx installation...${NC}"
+                    if command -v pipx >/dev/null 2>&1; then
+                        pipx uninstall strix-agent 2>/dev/null || true
+                    fi
+                    rm -f "$path" 2>/dev/null || true
+                elif [[ -L "$path" || -f "$path" ]]; then
+                    echo -e "${MUTED}Removing old installation...${NC}"
+                    rm -f "$path" 2>/dev/null || true
+                fi
+            fi
+        done
+    fi
+}
+
+check_version() {
+    check_existing_installation
+
+    if [[ -x "$INSTALL_DIR/strix" ]]; then
+        installed_version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "")
+        if [[ "$installed_version" == "$specific_version" ]]; then
+            print_message info "${GREEN}✓ Strix ${NC}$specific_version${GREEN} already installed${NC}"
+            SKIP_DOWNLOAD=true
+        elif [[ -n "$installed_version" ]]; then
+            print_message info "${MUTED}Installed: ${NC}$installed_version ${MUTED}→ Upgrading to ${NC}$specific_version"
+        fi
+    fi
+}
+
+download_and_install() {
+    print_message info "\n${CYAN}🦉 Installing Strix${NC} ${MUTED}version: ${NC}$specific_version"
+    print_message info "${MUTED}Platform: ${NC}$target\n"
+
+    local tmp_dir=$(mktemp -d)
+    cd "$tmp_dir"
+
+    echo -e "${MUTED}Downloading...${NC}"
+    curl -# -L -o "$filename" "$url"
+
+    if [ ! -f "$filename" ]; then
+        echo -e "${RED}Download failed${NC}"
+        exit 1
+    fi
+
+    echo -e "${MUTED}Extracting...${NC}"
+    if [ "$os" = "windows" ]; then
+        unzip -q "$filename"
+        mv "strix-${specific_version}-${target}.exe" "$INSTALL_DIR/strix.exe"
+    else
+        tar -xzf "$filename"
+        mv "strix-${specific_version}-${target}" "$INSTALL_DIR/strix"
+        chmod 755 "$INSTALL_DIR/strix"
+    fi
+
+    cd - > /dev/null
+    rm -rf "$tmp_dir"
+
+    echo -e "${GREEN}✓ Strix installed to $INSTALL_DIR${NC}"
+}
+
+check_docker() {
+    echo ""
+    if ! command -v docker >/dev/null 2>&1; then
+        echo -e "${YELLOW}⚠ Docker not found${NC}"
+        echo -e "${MUTED}Strix requires Docker to run the security sandbox.${NC}"
+        echo -e "${MUTED}Please install Docker: ${NC}https://docs.docker.com/get-docker/"
+        echo ""
+        return 1
+    fi
+
+    if ! docker info >/dev/null 2>&1; then
+        echo -e "${YELLOW}⚠ Docker daemon not running${NC}"
+        echo -e "${MUTED}Please start Docker and run: ${NC}docker pull $STRIX_IMAGE"
+        echo ""
+        return 1
+    fi
+
+    echo -e "${MUTED}Checking for sandbox image...${NC}"
+    if docker image inspect "$STRIX_IMAGE" >/dev/null 2>&1; then
+        echo -e "${GREEN}✓ Sandbox image already available${NC}"
+    else
+        echo -e "${MUTED}Pulling sandbox image (this may take a few minutes)...${NC}"
+        if docker pull "$STRIX_IMAGE"; then
+            echo -e "${GREEN}✓ Sandbox image pulled successfully${NC}"
+        else
+            echo -e "${YELLOW}⚠ Failed to pull sandbox image${NC}"
+            echo -e "${MUTED}You can pull it manually later: ${NC}docker pull $STRIX_IMAGE"
+        fi
+    fi
+    return 0
+}
+
+add_to_path() {
+    local config_file=$1
+    local command=$2
+    if grep -Fxq "$command" "$config_file" 2>/dev/null; then
+        return 0
+    elif [[ -w $config_file ]]; then
+        echo -e "\n# strix" >> "$config_file"
+        echo "$command" >> "$config_file"
+    fi
+}
+
+setup_path() {
+    XDG_CONFIG_HOME=${XDG_CONFIG_HOME:-$HOME/.config}
+    current_shell=$(basename "$SHELL")
+
+    case $current_shell in
+        fish)
+            config_files="$HOME/.config/fish/config.fish"
+            ;;
+        zsh)
+            config_files="$HOME/.zshrc $HOME/.zshenv"
+            ;;
+        bash)
+            config_files="$HOME/.bashrc $HOME/.bash_profile $HOME/.profile"
+            ;;
+        *)
+            config_files="$HOME/.bashrc $HOME/.profile"
+            ;;
+    esac
+
+    config_file=""
+    for file in $config_files; do
+        if [[ -f $file ]]; then
+            config_file=$file
+            break
+        fi
+    done
+
+    if [[ -z $config_file ]]; then
+        config_file="$HOME/.bashrc"
+        touch "$config_file"
+    fi
+
+    if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
+        case $current_shell in
+            fish)
+                add_to_path "$config_file" "fish_add_path $INSTALL_DIR"
+                ;;
+            *)
+                add_to_path "$config_file" "export PATH=\"$INSTALL_DIR:\$PATH\""
+                ;;
+        esac
+    fi
+
+    if [ -n "${GITHUB_ACTIONS-}" ] && [ "${GITHUB_ACTIONS}" == "true" ]; then
+        echo "$INSTALL_DIR" >> "$GITHUB_PATH"
+    fi
+}
+
+verify_installation() {
+    export PATH="$INSTALL_DIR:$PATH"
+
+    local which_strix=$(which strix 2>/dev/null || echo "")
+
+    if [[ "$which_strix" != "$INSTALL_DIR/strix" && "$which_strix" != "$INSTALL_DIR/strix.exe" ]]; then
+        if [[ -n "$which_strix" ]]; then
+            echo -e "${YELLOW}⚠ Found conflicting strix at: ${NC}$which_strix"
+            echo -e "${MUTED}Attempting to remove...${NC}"
+
+            if rm -f "$which_strix" 2>/dev/null; then
+                echo -e "${GREEN}✓ Removed conflicting installation${NC}"
+            else
+                echo -e "${YELLOW}Could not remove automatically.${NC}"
+                echo -e "${MUTED}Please remove manually: ${NC}rm $which_strix"
+            fi
+        fi
+    fi
+
+    if [[ -x "$INSTALL_DIR/strix" ]]; then
+        local version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}')
+        echo -e "${GREEN}✓ Strix ${NC}${version:-unknown}${GREEN} ready${NC}"
+    fi
+}
+
+check_version
+if [ "$SKIP_DOWNLOAD" = false ]; then
+    download_and_install
+fi
+setup_path
+verify_installation
+check_docker
+
+echo ""
+echo -e "${CYAN}"
+echo "   ███████╗████████╗██████╗ ██╗██╗  ██╗"
+echo "   ██╔════╝╚══██╔══╝██╔══██╗██║╚██╗██╔╝"
+echo "   ███████╗   ██║   ██████╔╝██║ ╚███╔╝ "
+echo "   ╚════██║   ██║   ██╔══██╗██║ ██╔██╗ "
+echo "   ███████║   ██║   ██║  ██║██║██╔╝ ██╗"
+echo "   ╚══════╝   ╚═╝   ╚═╝  ╚═╝╚═╝╚═╝  ╚═╝"
+echo -e "${NC}"
+echo -e "${MUTED}  AI Penetration Testing Agent${NC}"
+echo ""
+echo -e "${MUTED}To get started:${NC}"
+echo ""
+echo -e "  ${CYAN}1.${NC} Set your LLM provider:"
+echo -e "     ${MUTED}export STRIX_LLM='openai/gpt-5'${NC}"
+echo -e "     ${MUTED}export LLM_API_KEY='your-api-key'${NC}"
+echo ""
+echo -e "  ${CYAN}2.${NC} Run a penetration test:"
+echo -e "     ${MUTED}strix --target https://example.com${NC}"
+echo ""
+echo -e "${MUTED}For more information visit ${NC}https://usestrix.com"
+echo -e "${MUTED}Join our community ${NC}https://discord.gg/YjKFvEZSdZ"
+echo ""
+
+if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
+    echo -e "${YELLOW}→${NC} Run ${MUTED}source ~/.$(basename "$SHELL")rc${NC} or open a new terminal"
+    echo ""
+fi
--- a/strix.spec
+++ b/strix.spec
@@ -0,0 +1,221 @@
+# -*- mode: python ; coding: utf-8 -*-
+
+import sys
+from pathlib import Path
+from PyInstaller.utils.hooks import collect_data_files, collect_submodules
+
+project_root = Path(SPECPATH)
+strix_root = project_root / 'strix'
+
+datas = []
+
+for jinja_file in strix_root.rglob('*.jinja'):
+    rel_path = jinja_file.relative_to(project_root)
+    datas.append((str(jinja_file), str(rel_path.parent)))
+
+for xml_file in strix_root.rglob('*.xml'):
+    rel_path = xml_file.relative_to(project_root)
+    datas.append((str(xml_file), str(rel_path.parent)))
+
+for tcss_file in strix_root.rglob('*.tcss'):
+    rel_path = tcss_file.relative_to(project_root)
+    datas.append((str(tcss_file), str(rel_path.parent)))
+
+datas += collect_data_files('textual')
+
+datas += collect_data_files('tiktoken')
+datas += collect_data_files('tiktoken_ext')
+
+datas += collect_data_files('litellm')
+
+hiddenimports = [
+    # Core dependencies
+    'litellm',
+    'litellm.llms',
+    'litellm.llms.openai',
+    'litellm.llms.anthropic',
+    'litellm.llms.vertex_ai',
+    'litellm.llms.bedrock',
+    'litellm.utils',
+    'litellm.caching',
+
+    # Textual TUI
+    'textual',
+    'textual.app',
+    'textual.widgets',
+    'textual.containers',
+    'textual.screen',
+    'textual.binding',
+    'textual.reactive',
+    'textual.css',
+    'textual._text_area_theme',
+
+    # Rich console
+    'rich',
+    'rich.console',
+    'rich.panel',
+    'rich.text',
+    'rich.markup',
+    'rich.style',
+    'rich.align',
+    'rich.live',
+
+    # Pydantic
+    'pydantic',
+    'pydantic.fields',
+    'pydantic_core',
+    'email_validator',
+
+    # Docker
+    'docker',
+    'docker.api',
+    'docker.models',
+    'docker.errors',
+
+    # HTTP/Networking
+    'httpx',
+    'httpcore',
+    'requests',
+    'urllib3',
+    'certifi',
+
+    # Jinja2 templating
+    'jinja2',
+    'jinja2.ext',
+    'markupsafe',
+
+    # XML parsing
+    'xmltodict',
+
+    # Tiktoken (for token counting)
+    'tiktoken',
+    'tiktoken_ext',
+    'tiktoken_ext.openai_public',
+
+    # Tenacity retry
+    'tenacity',
+
+    # Strix modules
+    'strix',
+    'strix.interface',
+    'strix.interface.main',
+    'strix.interface.cli',
+    'strix.interface.tui',
+    'strix.interface.utils',
+    'strix.interface.tool_components',
+    'strix.agents',
+    'strix.agents.base_agent',
+    'strix.agents.state',
+    'strix.agents.StrixAgent',
+    'strix.llm',
+    'strix.llm.llm',
+    'strix.llm.config',
+    'strix.llm.utils',
+    'strix.llm.request_queue',
+    'strix.llm.memory_compressor',
+    'strix.runtime',
+    'strix.runtime.runtime',
+    'strix.runtime.docker_runtime',
+    'strix.telemetry',
+    'strix.telemetry.tracer',
+    'strix.tools',
+    'strix.tools.registry',
+    'strix.tools.executor',
+    'strix.tools.argument_parser',
+    'strix.prompts',
+]
+
+hiddenimports += collect_submodules('litellm')
+hiddenimports += collect_submodules('textual')
+hiddenimports += collect_submodules('rich')
+hiddenimports += collect_submodules('pydantic')
+
+excludes = [
+    # Sandbox-only packages
+    'playwright',
+    'playwright.sync_api',
+    'playwright.async_api',
+    'IPython',
+    'ipython',
+    'libtmux',
+    'pyte',
+    'openhands_aci',
+    'openhands-aci',
+    'gql',
+    'fastapi',
+    'uvicorn',
+    'numpydoc',
+
+    # Google Cloud / Vertex AI
+    'google.cloud',
+    'google.cloud.aiplatform',
+    'google.api_core',
+    'google.auth',
+    'google.oauth2',
+    'google.protobuf',
+    'grpc',
+    'grpcio',
+    'grpcio_status',
+
+    # Test frameworks
+    'pytest',
+    'pytest_asyncio',
+    'pytest_cov',
+    'pytest_mock',
+
+    # Development tools
+    'mypy',
+    'ruff',
+    'black',
+    'isort',
+    'pylint',
+    'pyright',
+    'bandit',
+    'pre_commit',
+
+    # Unnecessary for runtime
+    'tkinter',
+    'matplotlib',
+    'numpy',
+    'pandas',
+    'scipy',
+    'PIL',
+    'cv2',
+]
+
+a = Analysis(
+    ['strix/interface/main.py'],
+    pathex=[str(project_root)],
+    binaries=[],
+    datas=datas,
+    hiddenimports=hiddenimports,
+    hookspath=[],
+    hooksconfig={},
+    runtime_hooks=[],
+    excludes=excludes,
+    noarchive=False,
+    optimize=0,
+)
+
+pyz = PYZ(a.pure)
+
+exe = EXE(
+    pyz,
+    a.scripts,
+    a.binaries,
+    a.datas,
+    [],
+    name='strix',
+    debug=False,
+    bootloader_ignore_signals=False,
+    strip=False,
+    upx=False,
+    upx_exclude=[],
+    runtime_tmpdir=None,
+    console=True,
+    disable_windowed_traceback=False,
+    argv_emulation=False,
+    target_arch=None,
+    codesign_identity=None,
+    entitlements_file=None,
+)
--- a/strix/agents/StrixAgent/strix_agent.py
+++ b/strix/agents/StrixAgent/strix_agent.py
@@ -18,13 +18,14 @@ class StrixAgent(BaseAgent):

        super().__init__(config)

-    async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:
+    async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:  # noqa: PLR0912
        user_instructions = scan_config.get("user_instructions", "")
        targets = scan_config.get("targets", [])

        repositories = []
        local_code = []
        urls = []
+        ip_addresses = []

        for target in targets:
            target_type = target["type"]
@@ -53,6 +54,8 @@ class StrixAgent(BaseAgent):

            elif target_type == "web_application":
                urls.append(details["target_url"])
+            elif target_type == "ip_address":
+                ip_addresses.append(details["target_ip"])

        task_parts = []

@@ -74,6 +77,10 @@ class StrixAgent(BaseAgent):
            task_parts.append("\n\nURLs:")
            task_parts.extend(f"- {url}" for url in urls)

+        if ip_addresses:
+            task_parts.append("\n\nIP Addresses:")
+            task_parts.extend(f"- {ip}" for ip in ip_addresses)
+
        task_description = " ".join(task_parts)

        if user_instructions:
--- a/strix/agents/StrixAgent/system_prompt.jinja
+++ b/strix/agents/StrixAgent/system_prompt.jinja
@@ -10,20 +10,24 @@ You follow all instructions and rules provided to you exactly as written in the

 <communication_rules>
 CLI OUTPUT:
-- Never use markdown formatting - you are a CLI agent
-- Output plain text only (no **bold**, `code`, [links], # headers)
+- You may use simple markdown: **bold**, *italic*, `code`, ~~strikethrough~~, [links](url), and # headers
+- Do NOT use complex markdown like bullet lists, numbered lists, or tables
 - Use line breaks and indentation for structure
 - NEVER use "Strix" or any identifiable names/markers in HTTP requests, payloads, user-agents, or any inputs

 INTER-AGENT MESSAGES:
 - NEVER echo inter_agent_message or agent_completion_report XML content that is sent to you in your output.
 - Process these internally without displaying the XML
+- NEVER echo agent_identity XML blocks; treat them as internal metadata for identity only. Do not include them in outputs or tool calls.
+- Minimize inter-agent messaging: only message when essential for coordination or assistance; avoid routine status updates; batch non-urgent information; prefer parent/child completion flows and shared artifacts over messaging

 AUTONOMOUS BEHAVIOR:
 - Work autonomously by default
 - You should NOT ask for user input or confirmation - you should always proceed with your task autonomously.
 - Minimize user messaging: avoid redundancy and repetition; consolidate updates into a single concise message
+- NEVER send an empty or blank message. If you have no content to output or need to wait (for user input, subagent results, or any other reason), you MUST call the wait_for_message tool (or another appropriate tool) instead of emitting an empty response.
 - If there is nothing to execute and no user query to answer any more: do NOT send filler/repetitive text — either call wait_for_message or finish your work (subagents: agent_finish; root: finish_scan)
+- While the agent loop is running, almost every output MUST be a tool call. Do NOT send plain text messages; act via tools. If idle, use wait_for_message; when done, use agent_finish (subagents) or finish_scan (root)
 </communication_rules>

 <execution_guidelines>
@@ -102,7 +106,6 @@ OPERATIONAL PRINCIPLES:
 - Choose appropriate tools for each context
 - Chain vulnerabilities for maximum impact
 - Consider business logic and context in exploitation
-- **OVERUSE THE THINK TOOL** - Use it CONSTANTLY. Every 1-2 messages MINIMUM, and after each tool call!
 - NEVER skip think tool - it's your most important tool for reasoning and success
 - WORK RELENTLESSLY - Don't stop until you've found something significant
 - Try multiple approaches simultaneously - don't wait for one to fail
@@ -210,10 +213,9 @@ SIMPLE WORKFLOW RULES:
 4. **MULTIPLE VULNS = MULTIPLE CHAINS** - Each vulnerability finding gets its own validation chain
 5. **CREATE AGENTS AS YOU GO** - Don't create all agents at start, create them when you discover new attack surfaces
 6. **ONE JOB PER AGENT** - Each agent has ONE specific task only
-7. **VIEW THE AGENT GRAPH BEFORE ACTING** - Always call view_agent_graph before creating or messaging agents to avoid duplicates and to target correctly
-8. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
-9. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
-10. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
+7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
+8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
+9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent

 WHEN TO CREATE NEW AGENTS:

@@ -304,10 +306,25 @@ Tool calls use XML format:
 </function>

 CRITICAL RULES:
+0. While active in the agent loop, EVERY message you output MUST be a single tool call. Do not send plain text-only responses.
 1. One tool call per message
 2. Tool call must be last in message
 3. End response after </function> tag. It's your stop word. Do not continue after it.
-5. Thinking is NOT optional - it's required for reasoning and success
+4. Use ONLY the exact XML format shown above. NEVER use JSON/YAML/INI or any other syntax for tools or parameters.
+5. Tool names must match exactly the tool "name" defined (no module prefixes, dots, or variants).
+   - Correct: <function=think> ... </function>
+   - Incorrect: <thinking_tools.think> ... </function>
+   - Incorrect: <think> ... </think>
+   - Incorrect: {"think": {...}}
+6. Parameters must use <parameter=param_name>value</parameter> exactly. Do NOT pass parameters as JSON or key:value lines. Do NOT add quotes/braces around values.
+7. Do NOT wrap tool calls in markdown/code fences or add any text before or after the tool block.
+
+Example (agent creation tool):
+<function=create_agent>
+<parameter=task>Perform targeted XSS testing on the search endpoint</parameter>
+<parameter=name>XSS Discovery Agent</parameter>
+<parameter=prompt_modules>xss</parameter>
+</function>

 SPRAYING EXECUTION NOTE:
 - When performing large payload sprays or fuzzing, encapsulate the entire spraying loop inside a single python or terminal tool call (e.g., a Python script using asyncio/aiohttp). Do not issue one tool call per payload.
@@ -359,6 +376,7 @@ SPECIALIZED TOOLS:
 PROXY & INTERCEPTION:
 - Caido CLI - Modern web proxy (already running). Used with proxy tool or with python tool (functions already imported).
 - NOTE: If you are seeing proxy errors when sending requests, it usually means you are not sending requests to a correct url/host/port.
+- Ignore Caido proxy-generated 50x HTML error pages; these are proxy issues (they can occur when requesting a wrong host, on SSL/TLS failures, etc.).

 PROGRAMMING:
 - Python 3, Poetry, Go, Node.js/npm
--- a/strix/agents/base_agent.py
+++ b/strix/agents/base_agent.py
@@ -1,4 +1,5 @@
 import asyncio
+import contextlib
 import logging
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Optional
@@ -75,6 +76,8 @@ class BaseAgent(metaclass=AgentMeta):
                max_iterations=self.max_iterations,
            )

+        with contextlib.suppress(Exception):
+            self.llm.set_agent_identity(self.agent_name, self.state.agent_id)
        self._current_task: asyncio.Task[Any] | None = None

        from strix.telemetry.tracer import get_global_tracer
--- a/strix/agents/state.py
+++ b/strix/agents/state.py
@@ -123,7 +123,7 @@ class AgentState(BaseModel):
            return False

        elapsed = (datetime.now(UTC) - self.waiting_start_time).total_seconds()
-        return elapsed > 120
+        return elapsed > 600

    def has_empty_last_messages(self, count: int = 3) -> bool:
        if len(self.messages) < count:
--- a/strix/interface/assets/tui_styles.tcss
+++ b/strix/interface/assets/tui_styles.tcss
@@ -33,18 +33,32 @@ Screen {
    background: transparent;
 }

+#sidebar {
+    width: 25%;
+    background: transparent;
+    margin-left: 1;
+}
+
 #agents_tree {
-    width: 20%;
+    height: 1fr;
    background: transparent;
    border: round #262626;
    border-title-color: #a8a29e;
    border-title-style: bold;
-    margin-left: 1;
    padding: 1;
+    margin-bottom: 0;
+}
+
+#stats_display {
+    height: auto;
+    max-height: 15;
+    background: transparent;
+    padding: 0;
+    margin: 0;
 }

 #chat_area_container {
-    width: 80%;
+    width: 75%;
    background: transparent;
 }

--- a/strix/interface/cli.py
+++ b/strix/interface/cli.py
@@ -1,9 +1,12 @@
 import atexit
 import signal
 import sys
+import threading
+import time
 from typing import Any

 from rich.console import Console
+from rich.live import Live
 from rich.panel import Panel
 from rich.text import Text

@@ -11,7 +14,7 @@ from strix.agents.StrixAgent import StrixAgent
 from strix.llm.config import LLMConfig
 from strix.telemetry.tracer import Tracer, set_global_tracer

-from .utils import get_severity_color
+from .utils import build_final_stats_text, build_live_stats_text, get_severity_color


 async def run_cli(args: Any) -> None:  # noqa: PLR0915
@@ -36,7 +39,7 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915

    results_text = Text()
    results_text.append("📊 Results will be saved to: ", style="bold cyan")
-    results_text.append(f"agent_runs/{args.run_name}", style="bold white")
+    results_text.append(f"strix_runs/{args.run_name}", style="bold white")

    note_text = Text()
    note_text.append("\n\n", style="dim")
@@ -63,6 +66,8 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
    console.print(startup_panel)
    console.print()

+    scan_mode = getattr(args, "scan_mode", "deep")
+
    scan_config = {
        "scan_id": args.run_name,
        "targets": args.targets_info,
@@ -70,7 +75,7 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
        "run_name": args.run_name,
    }

-    llm_config = LLMConfig()
+    llm_config = LLMConfig(scan_mode=scan_mode)
    agent_config = {
        "llm_config": llm_config,
        "max_iterations": 300,
@@ -130,24 +135,80 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915

    set_global_tracer(tracer)

+    def create_live_status() -> Panel:
+        status_text = Text()
+        status_text.append("🦉 ", style="bold white")
+        status_text.append("Running penetration test...", style="bold #22c55e")
+        status_text.append("\n\n")
+
+        stats_text = build_live_stats_text(tracer, agent_config)
+        if stats_text:
+            status_text.append(stats_text)
+
+        return Panel(
+            status_text,
+            title="[bold #22c55e]🔍 Live Penetration Test Status",
+            title_align="center",
+            border_style="#22c55e",
+            padding=(1, 2),
+        )
+
    try:
        console.print()
-        with console.status("[bold cyan]Running penetration test...", spinner="dots") as status:
-            agent = StrixAgent(agent_config)
-            result = await agent.execute_scan(scan_config)
-            status.stop()

-            if isinstance(result, dict) and not result.get("success", True):
-                error_msg = result.get("error", "Unknown error")
-                console.print()
-                console.print(f"[bold red]❌ Penetration test failed:[/] {error_msg}")
-                console.print()
-                sys.exit(1)
+        with Live(
+            create_live_status(), console=console, refresh_per_second=2, transient=False
+        ) as live:
+            stop_updates = threading.Event()
+
+            def update_status() -> None:
+                while not stop_updates.is_set():
+                    try:
+                        live.update(create_live_status())
+                        time.sleep(2)
+                    except Exception:  # noqa: BLE001
+                        break
+
+            update_thread = threading.Thread(target=update_status, daemon=True)
+            update_thread.start()
+
+            try:
+                agent = StrixAgent(agent_config)
+                result = await agent.execute_scan(scan_config)
+
+                if isinstance(result, dict) and not result.get("success", True):
+                    error_msg = result.get("error", "Unknown error")
+                    console.print()
+                    console.print(f"[bold red]❌ Penetration test failed:[/] {error_msg}")
+                    console.print()
+                    sys.exit(1)
+            finally:
+                stop_updates.set()
+                update_thread.join(timeout=1)

    except Exception as e:
        console.print(f"[bold red]Error during penetration test:[/] {e}")
        raise

+    console.print()
+    final_stats_text = Text()
+    final_stats_text.append("📊 ", style="bold cyan")
+    final_stats_text.append("PENETRATION TEST COMPLETED", style="bold green")
+    final_stats_text.append("\n\n")
+
+    stats_text = build_final_stats_text(tracer)
+    if stats_text:
+        final_stats_text.append(stats_text)
+
+    final_stats_panel = Panel(
+        final_stats_text,
+        title="[bold green]✅ Final Statistics",
+        title_align="center",
+        border_style="green",
+        padding=(1, 2),
+    )
+    console.print(final_stats_panel)
+
    if tracer.final_scan_result:
        console.print()

--- a/strix/interface/main.py
+++ b/strix/interface/main.py
@@ -10,6 +10,7 @@ import os
 import shutil
 import sys
 from pathlib import Path
+from typing import Any

 import litellm
 from docker.errors import DockerException
@@ -21,8 +22,7 @@ from strix.interface.cli import run_cli
 from strix.interface.tui import run_tui
 from strix.interface.utils import (
    assign_workspace_subdirs,
-    build_llm_stats_text,
-    build_stats_text,
+    build_final_stats_text,
    check_docker_connection,
    clone_repository,
    collect_local_sources,
@@ -57,10 +57,7 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
    )

    if not os.getenv("LLM_API_KEY"):
-        if not has_base_url:
-            missing_required_vars.append("LLM_API_KEY")
-        else:
-            missing_optional_vars.append("LLM_API_KEY")
+        missing_optional_vars.append("LLM_API_KEY")

    if not has_base_url:
        missing_optional_vars.append("LLM_API_BASE")
@@ -93,13 +90,6 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                    " - Model name to use with litellm (e.g., 'openai/gpt-5')\n",
                    style="white",
                )
-            elif var == "LLM_API_KEY":
-                error_text.append("• ", style="white")
-                error_text.append("LLM_API_KEY", style="bold cyan")
-                error_text.append(
-                    " - API key for the LLM provider (required for cloud providers)\n",
-                    style="white",
-                )

        if missing_optional_vars:
            error_text.append("\nOptional environment variables:\n", style="white")
@@ -107,7 +97,11 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                if var == "LLM_API_KEY":
                    error_text.append("• ", style="white")
                    error_text.append("LLM_API_KEY", style="bold cyan")
-                    error_text.append(" - API key for the LLM provider\n", style="white")
+                    error_text.append(
+                        " - API key for the LLM provider "
+                        "(not needed for local models, Vertex AI, AWS, etc.)\n",
+                        style="white",
+                    )
                elif var == "LLM_API_BASE":
                    error_text.append("• ", style="white")
                    error_text.append("LLM_API_BASE", style="bold cyan")
@@ -126,14 +120,12 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
        error_text.append("\nExample setup:\n", style="white")
        error_text.append("export STRIX_LLM='openai/gpt-5'\n", style="dim white")

-        if "LLM_API_KEY" in missing_required_vars:
-            error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
-
        if missing_optional_vars:
            for var in missing_optional_vars:
                if var == "LLM_API_KEY":
                    error_text.append(
-                        "export LLM_API_KEY='your-api-key-here'  # optional with local models\n",
+                        "export LLM_API_KEY='your-api-key-here'  "
+                        "# not needed for local models, Vertex AI, AWS, etc.\n",
                        style="dim white",
                    )
                elif var == "LLM_API_BASE":
@@ -190,28 +182,31 @@ async def warm_up_llm() -> None:
    try:
        model_name = os.getenv("STRIX_LLM", "openai/gpt-5")
        api_key = os.getenv("LLM_API_KEY")
-
-        if api_key:
-            litellm.api_key = api_key
-
        api_base = (
            os.getenv("LLM_API_BASE")
            or os.getenv("OPENAI_API_BASE")
            or os.getenv("LITELLM_BASE_URL")
            or os.getenv("OLLAMA_API_BASE")
        )
-        if api_base:
-            litellm.api_base = api_base

        test_messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Reply with just 'OK'."},
        ]

-        response = litellm.completion(
-            model=model_name,
-            messages=test_messages,
-        )
+        llm_timeout = int(os.getenv("LLM_TIMEOUT", "600"))
+
+        completion_kwargs: dict[str, Any] = {
+            "model": model_name,
+            "messages": test_messages,
+            "timeout": llm_timeout,
+        }
+        if api_key:
+            completion_kwargs["api_key"] = api_key
+        if api_base:
+            completion_kwargs["api_base"] = api_base
+
+        response = litellm.completion(**completion_kwargs)

        validate_llm_response(response)

@@ -238,6 +233,15 @@ async def warm_up_llm() -> None:
        sys.exit(1)


+def get_version() -> str:
+    try:
+        from importlib.metadata import version
+
+        return version("strix-agent")
+    except Exception:  # noqa: BLE001
+        return "unknown"
+
+
 def parse_arguments() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Strix Multi-Agent Cybersecurity Penetration Testing Tool",
@@ -257,22 +261,36 @@ Examples:
  # Domain penetration test
  strix --target example.com

+  # IP address penetration test
+  strix --target 192.168.1.42
+
  # Multiple targets (e.g., white-box testing with source and deployed app)
  strix --target https://github.com/user/repo --target https://example.com
  strix --target ./my-project --target https://staging.example.com --target https://prod.example.com

-  # Custom instructions
+  # Custom instructions (inline)
  strix --target example.com --instruction "Focus on authentication vulnerabilities"
+
+  # Custom instructions (from file)
+  strix --target example.com --instruction-file ./instructions.txt
+  strix --target https://app.com --instruction-file /path/to/detailed_instructions.md
        """,
    )

+    parser.add_argument(
+        "-v",
+        "--version",
+        action="version",
+        version=f"strix {get_version()}",
+    )
+
    parser.add_argument(
        "-t",
        "--target",
        type=str,
        required=True,
        action="append",
-        help="Target to test (URL, repository, local directory path, or domain name). "
+        help="Target to test (URL, repository, local directory path, domain name, or IP address). "
        "Can be specified multiple times for multi-target scans.",
    )
    parser.add_argument(
@@ -283,7 +301,15 @@ Examples:
        "testing approaches (e.g., 'Perform thorough authentication testing'), "
        "test credentials (e.g., 'Use the following credentials to access the app: "
        "admin:password123'), "
-        "or areas of interest (e.g., 'Check login API endpoint for security issues')",
+        "or areas of interest (e.g., 'Check login API endpoint for security issues').",
+    )
+
+    parser.add_argument(
+        "--instruction-file",
+        type=str,
+        help="Path to a file containing detailed custom instructions for the penetration test. "
+        "Use this option when you have lengthy or complex instructions saved in a file "
+        "(e.g., '--instruction-file ./detailed_instructions.txt').",
    )

    parser.add_argument(
@@ -302,8 +328,38 @@ Examples:
        ),
    )

+    parser.add_argument(
+        "-m",
+        "--scan-mode",
+        type=str,
+        choices=["quick", "standard", "deep"],
+        default="deep",
+        help=(
+            "Scan mode: "
+            "'quick' for fast CI/CD checks, "
+            "'standard' for routine testing, "
+            "'deep' for thorough security reviews. "
+            "Default: deep."
+        ),
+    )
+
    args = parser.parse_args()

+    if args.instruction and args.instruction_file:
+        parser.error(
+            "Cannot specify both --instruction and --instruction-file. Use one or the other."
+        )
+
+    if args.instruction_file:
+        instruction_path = Path(args.instruction_file)
+        try:
+            with instruction_path.open(encoding="utf-8") as f:
+                args.instruction = f.read().strip()
+                if not args.instruction:
+                    parser.error(f"Instruction file '{instruction_path}' is empty")
+        except Exception as e:  # noqa: BLE001
+            parser.error(f"Failed to read instruction file '{instruction_path}': {e}")
+
    args.targets_info = []
    for target in args.target:
        try:
@@ -347,8 +403,7 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
        completion_text.append(" • ", style="dim white")
        completion_text.append("Penetration test interrupted by user", style="white")

-    stats_text = build_stats_text(tracer)
-    llm_stats_text = build_llm_stats_text(tracer)
+    stats_text = build_final_stats_text(tracer)

    target_text = Text()
    if len(args.targets_info) == 1:
@@ -368,9 +423,6 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
    if stats_text.plain:
        panel_parts.extend(["\n", stats_text])

-    if llm_stats_text.plain:
-        panel_parts.extend(["\n", llm_stats_text])
-
    if scan_completed or has_vulnerabilities:
        results_text = Text()
        results_text.append("📊 Results Saved To: ", style="bold cyan")
@@ -392,6 +444,9 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
    console.print("\n")
    console.print(panel)
    console.print()
+    console.print("[dim]🌐 Website:[/] [cyan]https://usestrix.com[/]")
+    console.print("[dim]💬 Discord:[/] [cyan]https://discord.gg/YjKFvEZSdZ[/]")
+    console.print()


 def pull_docker_image() -> None:
@@ -453,7 +508,7 @@ def main() -> None:
    asyncio.run(warm_up_llm())

    if not args.run_name:
-        args.run_name = generate_run_name()
+        args.run_name = generate_run_name(args.targets_info)

    for target_info in args.targets_info:
        if target_info["type"] == "repository":
@@ -469,7 +524,7 @@ def main() -> None:
    else:
        asyncio.run(run_tui(args))

-    results_path = Path("agent_runs") / args.run_name
+    results_path = Path("strix_runs") / args.run_name
    display_completion_message(args, results_path)

    if args.non_interactive:
--- a/strix/interface/tool_components/init.py
+++ b/strix/interface/tool_components/init.py
@@ -1,4 +1,5 @@
 from . import (
+    agent_message_renderer,
    agents_graph_renderer,
    browser_renderer,
    file_edit_renderer,
@@ -10,6 +11,7 @@ from . import (
    scan_info_renderer,
    terminal_renderer,
    thinking_renderer,
+    todo_renderer,
    user_message_renderer,
    web_search_renderer,
 )
@@ -20,6 +22,7 @@ from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer
 __all__ = [
    "BaseToolRenderer",
    "ToolTUIRegistry",
+    "agent_message_renderer",
    "agents_graph_renderer",
    "browser_renderer",
    "file_edit_renderer",
@@ -34,6 +37,7 @@ __all__ = [
    "scan_info_renderer",
    "terminal_renderer",
    "thinking_renderer",
+    "todo_renderer",
    "user_message_renderer",
    "web_search_renderer",
 ]
--- a/strix/interface/tool_components/agent_message_renderer.py
+++ b/strix/interface/tool_components/agent_message_renderer.py
@@ -0,0 +1,70 @@
+import re
+from typing import Any, ClassVar
+
+from textual.widgets import Static
+
+from .base_renderer import BaseToolRenderer
+from .registry import register_tool_renderer
+
+
+def markdown_to_rich(text: str) -> str:
+    # Fenced code blocks: ```lang\n...\n``` or ```\n...\n```
+    text = re.sub(
+        r"```(?:\w*)\n(.*?)```",
+        r"[dim]\1[/dim]",
+        text,
+        flags=re.DOTALL,
+    )
+
+    # Headers
+    text = re.sub(r"^#### (.+)$", r"[bold]\1[/bold]", text, flags=re.MULTILINE)
+    text = re.sub(r"^### (.+)$", r"[bold]\1[/bold]", text, flags=re.MULTILINE)
+    text = re.sub(r"^## (.+)$", r"[bold]\1[/bold]", text, flags=re.MULTILINE)
+    text = re.sub(r"^# (.+)$", r"[bold]\1[/bold]", text, flags=re.MULTILINE)
+
+    # Links
+    text = re.sub(r"\[([^\]]+)\]\(([^)]+)\)", r"[underline]\1[/underline] [dim](\2)[/dim]", text)
+
+    # Bold
+    text = re.sub(r"\*\*(.+?)\*\*", r"[bold]\1[/bold]", text)
+    text = re.sub(r"__(.+?)__", r"[bold]\1[/bold]", text)
+
+    # Italic
+    text = re.sub(r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)", r"[italic]\1[/italic]", text)
+    text = re.sub(r"(?<![_\w])_(?!_)(.+?)(?<!_)_(?![_\w])", r"[italic]\1[/italic]", text)
+
+    # Inline code
+    text = re.sub(r"`([^`]+)`", r"[bold dim]\1[/bold dim]", text)
+
+    # Strikethrough
+    return re.sub(r"~~(.+?)~~", r"[strike]\1[/strike]", text)
+
+
+@register_tool_renderer
+class AgentMessageRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "agent_message"
+    css_classes: ClassVar[list[str]] = ["chat-message", "agent-message"]
+
+    @classmethod
+    def render(cls, message_data: dict[str, Any]) -> Static:
+        content = message_data.get("content", "")
+
+        if not content:
+            return Static("", classes=cls.css_classes)
+
+        formatted_content = cls._format_agent_message(content)
+
+        css_classes = " ".join(cls.css_classes)
+        return Static(formatted_content, classes=css_classes)
+
+    @classmethod
+    def render_simple(cls, content: str) -> str:
+        if not content:
+            return ""
+
+        return cls._format_agent_message(content)
+
+    @classmethod
+    def _format_agent_message(cls, content: str) -> str:
+        escaped_content = cls.escape_markup(content)
+        return markdown_to_rich(escaped_content)
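Review note on `markdown_to_rich` above: the function is a chain of regex substitutions from Markdown syntax to Rich-style markup tags. The following is a condensed, standalone sketch of that idea, not the code from the diff. The name `markdown_to_rich_sketch` is hypothetical, the four header rules are collapsed into one pattern, and the link, italic, and fenced-block rules are left out.

```python
import re


def markdown_to_rich_sketch(text: str) -> str:
    """Condensed stand-in for the regex chain in the diff (assumption: subset only)."""
    # Headers of level 1-4 become bold lines.
    text = re.sub(r"^#{1,4} (.+)$", r"[bold]\1[/bold]", text, flags=re.MULTILINE)
    # **bold** becomes [bold]...[/bold].
    text = re.sub(r"\*\*(.+?)\*\*", r"[bold]\1[/bold]", text)
    # `inline code` becomes [bold dim]...[/bold dim].
    text = re.sub(r"`([^`]+)`", r"[bold dim]\1[/bold dim]", text)
    # ~~strikethrough~~ becomes [strike]...[/strike].
    return re.sub(r"~~(.+?)~~", r"[strike]\1[/strike]", text)


print(markdown_to_rich_sketch("## Findings\nUse **care** with `eval` ~~always~~"))
```

Because the rules run in sequence, later rules also see the output of earlier ones, so the order of substitutions matters when patterns can nest.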
--- a/strix/interface/tool_components/browser_renderer.py
+++ b/strix/interface/tool_components/browser_renderer.py
@@ -1,16 +1,53 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name
+from pygments.styles import get_style_by_name
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
 @register_tool_renderer
 class BrowserRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "browser_action"
    css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_js(cls, code: str) -> str:
+        lexer = get_lexer_by_name("javascript")
+        result_parts: list[str] = []
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+
+            escaped_value = cls.escape_markup(token_value)
+            color = cls._get_token_color(token_type)
+
+            if color:
+                result_parts.append(f"[{color}]{escaped_value}[/]")
+            else:
+                result_parts.append(escaped_value)
+
+        return "".join(result_parts)
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
@@ -115,6 +152,5 @@ class BrowserRenderer(BaseToolRenderer):

    @classmethod
    def _format_js(cls, js_code: str) -> str:
-        if len(js_code) > 200:
-            js_code = js_code[:197] + "..."
-        return f"[white]{cls.escape_markup(js_code)}[/white]"
+        code_display = js_code[:2000] + "..." if len(js_code) > 2000 else js_code
+        return cls._highlight_js(code_display)
--- a/strix/interface/tool_components/file_edit_renderer.py
+++ b/strix/interface/tool_components/file_edit_renderer.py
@@ -1,16 +1,61 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
+from pygments.styles import get_style_by_name
+from pygments.util import ClassNotFound
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
+def _get_lexer_for_file(path: str) -> Any:
+    try:
+        return get_lexer_for_filename(path)
+    except ClassNotFound:
+        return get_lexer_by_name("text")
+
+
 @register_tool_renderer
 class StrReplaceEditorRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "str_replace_editor"
    css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_code(cls, code: str, path: str) -> str:
+        lexer = _get_lexer_for_file(path)
+        result_parts: list[str] = []
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+
+            escaped_value = cls.escape_markup(token_value)
+            color = cls._get_token_color(token_type)
+
+            if color:
+                result_parts.append(f"[{color}]{escaped_value}[/]")
+            else:
+                result_parts.append(escaped_value)
+
+        return "".join(result_parts)
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
@@ -18,6 +63,9 @@ class StrReplaceEditorRenderer(BaseToolRenderer):

        command = args.get("command", "")
        path = args.get("path", "")
+        old_str = args.get("old_str", "")
+        new_str = args.get("new_str", "")
+        file_text = args.get("file_text", "")

        if command == "view":
            header = "📖 [bold #10b981]Reading file[/]"
@@ -32,12 +80,33 @@ class StrReplaceEditorRenderer(BaseToolRenderer):
        else:
            header = "📄 [bold #10b981]File operation[/]"

-        if (result and isinstance(result, dict) and "content" in result) or path:
-            path_display = path[-60:] if len(path) > 60 else path
-            content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
-        else:
-            content_text = f"{header} [dim]Processing...[/]"
+        path_display = path[-60:] if len(path) > 60 else path
+        content_parts = [f"{header} [dim]{cls.escape_markup(path_display)}[/]"]

+        if command == "str_replace" and (old_str or new_str):
+            if old_str:
+                old_display = old_str[:1000] + "..." if len(old_str) > 1000 else old_str
+                highlighted_old = cls._highlight_code(old_display, path)
+                old_lines = highlighted_old.split("\n")
+                content_parts.extend(f"[#ef4444]-[/] {line}" for line in old_lines)
+            if new_str:
+                new_display = new_str[:1000] + "..." if len(new_str) > 1000 else new_str
+                highlighted_new = cls._highlight_code(new_display, path)
+                new_lines = highlighted_new.split("\n")
+                content_parts.extend(f"[#22c55e]+[/] {line}" for line in new_lines)
+        elif command == "create" and file_text:
+            text_display = file_text[:1500] + "..." if len(file_text) > 1500 else file_text
+            highlighted_text = cls._highlight_code(text_display, path)
+            content_parts.append(highlighted_text)
+        elif command == "insert" and new_str:
+            new_display = new_str[:1000] + "..." if len(new_str) > 1000 else new_str
+            highlighted_new = cls._highlight_code(new_display, path)
+            new_lines = highlighted_new.split("\n")
+            content_parts.extend(f"[#22c55e]+[/] {line}" for line in new_lines)
+        elif not (result and isinstance(result, dict) and "content" in result) and not path:
+            content_parts = [f"{header} [dim]Processing...[/]"]
+
+        content_text = "\n".join(content_parts)
        css_classes = cls.get_css_classes("completed")
        return Static(content_text, classes=css_classes)

--- a/strix/interface/tool_components/notes_renderer.py
+++ b/strix/interface/tool_components/notes_renderer.py
@@ -6,6 +6,12 @@ from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+def _truncate(text: str, length: int = 800) -> str:
+    if len(text) <= length:
+        return text
+    return text[: length - 3] + "..."
+
+
 @register_tool_renderer
 class CreateNoteRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "create_note"
@@ -17,23 +23,24 @@ class CreateNoteRenderer(BaseToolRenderer):

        title = args.get("title", "")
        content = args.get("content", "")
+        category = cls.escape_markup(args.get("category", "general"))

-        header = "📝 [bold #fbbf24]Note[/]"
+        header = f"📝 [bold #fbbf24]Note[/] [dim]({category})[/]"

+        lines = [header]
        if title:
-            title_display = title[:100] + "..." if len(title) > 100 else title
-            note_parts = [f"{header}\n  [bold]{cls.escape_markup(title_display)}[/]"]
+            title_display = _truncate(title.strip(), 300)
+            lines.append(f"  {cls.escape_markup(title_display)}")

-            if content:
-                content_display = content[:200] + "..." if len(content) > 200 else content
-                note_parts.append(f"  [dim]{cls.escape_markup(content_display)}[/]")
+        if content:
+            content_display = _truncate(content.strip(), 800)
+            lines.append(f"  [dim]{cls.escape_markup(content_display)}[/]")

-            content_text = "\n".join(note_parts)
-        else:
-            content_text = f"{header}\n  [dim]Creating note...[/]"
+        if len(lines) == 1:
+            lines.append("  [dim]Capturing...[/]")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static("\n".join(lines), classes=css_classes)


 @register_tool_renderer
@@ -43,8 +50,8 @@ class DeleteNoteRenderer(BaseToolRenderer):

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:  # noqa: ARG003
-        header = "🗑️ [bold #fbbf24]Delete Note[/]"
-        content_text = f"{header}\n  [dim]Deleting...[/]"
+        header = "📝 [bold #94a3b8]Note Removed[/]"
+        content_text = header

        css_classes = cls.get_css_classes("completed")
        return Static(content_text, classes=css_classes)
@@ -59,28 +66,24 @@ class UpdateNoteRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})

-        title = args.get("title", "")
-        content = args.get("content", "")
+        title = args.get("title")
+        content = args.get("content")

-        header = "✏️ [bold #fbbf24]Update Note[/]"
+        header = "📝 [bold #fbbf24]Note Updated[/]"
+        lines = [header]

-        if title or content:
-            note_parts = [header]
+        if title:
+            lines.append(f"  {cls.escape_markup(_truncate(title, 300))}")

-            if title:
-                title_display = title[:100] + "..." if len(title) > 100 else title
-                note_parts.append(f"  [bold]{cls.escape_markup(title_display)}[/]")
+        if content:
+            content_display = _truncate(content.strip(), 800)
+            lines.append(f"  [dim]{cls.escape_markup(content_display)}[/]")

-            if content:
-                content_display = content[:200] + "..." if len(content) > 200 else content
-                note_parts.append(f"  [dim]{cls.escape_markup(content_display)}[/]")
-
-            content_text = "\n".join(note_parts)
-        else:
-            content_text = f"{header}\n  [dim]Updating...[/]"
+        if len(lines) == 1:
+            lines.append("  [dim]Updating...[/]")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static("\n".join(lines), classes=css_classes)


 @register_tool_renderer
@@ -92,17 +95,34 @@ class ListNotesRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        result = tool_data.get("result")

-        header = "📋 [bold #fbbf24]Listing notes[/]"
+        header = "📝 [bold #fbbf24]Notes[/]"

-        if result and isinstance(result, dict) and "notes" in result:
-            notes = result["notes"]
-            if isinstance(notes, list):
-                count = len(notes)
-                content_text = f"{header}\n  [dim]{count} notes found[/]"
+        if result and isinstance(result, dict) and result.get("success"):
+            count = result.get("total_count", 0)
+            notes = result.get("notes", []) or []
+            lines = [header]
+
+            if count == 0:
+                lines.append("  [dim]No notes[/]")
            else:
-                content_text = f"{header}\n  [dim]No notes found[/]"
+                for note in notes[:5]:
+                    title = note.get("title", "").strip() or "(untitled)"
+                    category = cls.escape_markup(note.get("category", "general"))
+                    content = note.get("content", "").strip()
+
+                    lines.append(
+                        f"  - {cls.escape_markup(_truncate(title, 300))} [dim]({category})[/]"
+                    )
+                    if content:
+                        content_preview = _truncate(content, 400)
+                        lines.append(f"    [dim]{cls.escape_markup(content_preview)}[/]")
+
+                remaining = max(count - 5, 0)
+                if remaining:
+                    lines.append(f"  [dim]... +{remaining} more[/]")
+            content_text = "\n".join(lines)
        else:
-            content_text = f"{header}\n  [dim]Listing notes...[/]"
+            content_text = f"{header}\n  [dim]Loading...[/]"

        css_classes = cls.get_css_classes("completed")
        return Static(content_text, classes=css_classes)
--- a/strix/interface/tool_components/python_renderer.py
+++ b/strix/interface/tool_components/python_renderer.py
@@ -1,16 +1,53 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import PythonLexer
+from pygments.styles import get_style_by_name
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
 @register_tool_renderer
 class PythonRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "python_action"
    css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_python(cls, code: str) -> str:
+        lexer = PythonLexer()
+        result_parts: list[str] = []
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+
+            escaped_value = cls.escape_markup(token_value)
+            color = cls._get_token_color(token_type)
+
+            if color:
+                result_parts.append(f"[{color}]{escaped_value}[/]")
+            else:
+                result_parts.append(escaped_value)
+
+        return "".join(result_parts)
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
@@ -21,8 +58,9 @@ class PythonRenderer(BaseToolRenderer):
        header = "</> [bold #3b82f6]Python[/]"

        if code and action in ["new_session", "execute"]:
-            code_display = code[:600] + "..." if len(code) > 600 else code
-            content_text = f"{header}\n  [italic white]{cls.escape_markup(code_display)}[/]"
+            code_display = code[:2000] + "..." if len(code) > 2000 else code
+            highlighted_code = cls._highlight_python(code_display)
+            content_text = f"{header}\n{highlighted_code}"
        elif action == "close":
            content_text = f"{header}\n  [dim]Closing session...[/]"
        elif action == "list_sessions":
--- a/strix/interface/tool_components/terminal_renderer.py
+++ b/strix/interface/tool_components/terminal_renderer.py
@@ -1,16 +1,53 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name
+from pygments.styles import get_style_by_name
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
 @register_tool_renderer
 class TerminalRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "terminal_execute"
    css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_bash(cls, code: str) -> str:
+        lexer = get_lexer_by_name("bash")
+        result_parts: list[str] = []
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+
+            escaped_value = cls.escape_markup(token_value)
+            color = cls._get_token_color(token_type)
+
+            if color:
+                result_parts.append(f"[{color}]{escaped_value}[/]")
+            else:
+                result_parts.append(escaped_value)
+
+        return "".join(result_parts)
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
@@ -115,17 +152,15 @@ class TerminalRenderer(BaseToolRenderer):

        if is_input:
            formatted_command = cls._format_command_display(command)
-            return f"{terminal_icon} [#3b82f6]>>>[/] [#22c55e]{formatted_command}[/]"
+            return f"{terminal_icon} [#3b82f6]>>>[/] {formatted_command}"

        formatted_command = cls._format_command_display(command)
-        return f"{terminal_icon} [#22c55e]$ {formatted_command}[/]"
+        return f"{terminal_icon} [#22c55e]$[/] {formatted_command}"

    @classmethod
    def _format_command_display(cls, command: str) -> str:
        if not command:
            return ""

-        if len(command) > 400:
-            command = command[:397] + "..."
-
-        return cls.escape_markup(command)
+        cmd_display = command[:2000] + "..." if len(command) > 2000 else command
+        return cls._highlight_bash(cmd_display)
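Review note: the browser, file-edit, Python, and terminal renderers each carry their own copy of `_get_style_colors`, `_get_token_color`, and a token-to-markup loop. One possible way to share that logic is sketched below under stated assumptions. The names `Tok`, `token_color`, and `highlight_tokens` are hypothetical, and `Tok` is a stdlib-only stand-in that mimics only the `.parent` attribute of Pygments token types, so the sketch runs without Pygments installed. Unlike the diff, which stops the walk on a falsy token type, the stand-in stops on `None`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Tok:
    """Stand-in for a Pygments token type (assumption: only `.parent` is modelled)."""

    name: str
    parent: Tok | None = None


def token_color(token_type: Tok | None, colors: dict[Tok, str]) -> str | None:
    """Walk up the parent chain until a token type with a colour is found."""
    while token_type is not None:
        if token_type in colors:
            return colors[token_type]
        token_type = token_type.parent
    return None


def highlight_tokens(
    tokens: Iterable[tuple[Tok, str]],
    colors: dict[Tok, str],
    escape: Callable[[str], str],
) -> str:
    """Wrap each non-empty token value in a colour tag, escaping its text first."""
    parts: list[str] = []
    for token_type, value in tokens:
        if not value:
            continue
        escaped = escape(value)
        color = token_color(token_type, colors)
        parts.append(f"[{color}]{escaped}[/]" if color else escaped)
    return "".join(parts)


NAME = Tok("Name")
FUNC = Tok("Name.Function", NAME)
TEXT = Tok("Text")
print(highlight_tokens([(FUNC, "f"), (TEXT, "(")], {NAME: "#ffffff"}, lambda s: s))
```

With a helper of this shape, each renderer would only need to choose its lexer and pass the token stream in, so the colour lookup and escaping would live in one place.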
--- a/strix/interface/tool_components/todo_renderer.py
+++ b/strix/interface/tool_components/todo_renderer.py
@@ -0,0 +1,204 @@
+from typing import Any, ClassVar
+
+from textual.widgets import Static
+
+from .base_renderer import BaseToolRenderer
+from .registry import register_tool_renderer
+
+
+STATUS_MARKERS = {
+    "pending": "[ ]",
+    "in_progress": "[~]",
+    "done": "[•]",
+}
+
+
+def _truncate(text: str, length: int = 80) -> str:
+    if len(text) <= length:
+        return text
+    return text[: length - 3] + "..."
+
+
+def _format_todo_lines(
+    cls: type[BaseToolRenderer], result: dict[str, Any], limit: int = 25
+) -> list[str]:
+    todos = result.get("todos")
+    if not isinstance(todos, list) or not todos:
+        return ["  [dim]No todos[/]"]
+
+    lines: list[str] = []
+    total = len(todos)
+
+    for index, todo in enumerate(todos):
+        if index >= limit:
+            remaining = total - limit
+            if remaining > 0:
+                lines.append(f"  [dim]... +{remaining} more[/]")
+            break
+
+        status = todo.get("status", "pending")
+        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
+
+        title = todo.get("title", "").strip() or "(untitled)"
+        title = cls.escape_markup(_truncate(title, 90))
+
+        if status == "done":
+            title_markup = f"[dim strike]{title}[/]"
+        elif status == "in_progress":
+            title_markup = f"[italic]{title}[/]"
+        else:
+            title_markup = title
+
+        lines.append(f"  {marker} {title_markup}")
+
+    return lines
+
+
+@register_tool_renderer
+class CreateTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "create_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #a78bfa]Todo[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Failed to create todo")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Creating...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
+
+
+@register_tool_renderer
+class ListTodosRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "list_todos"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #a78bfa]Todos[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Unable to list todos")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Loading...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
+
+
+@register_tool_renderer
+class UpdateTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "update_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #a78bfa]Todo Updated[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Failed to update todo")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Updating...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
+
+
+@register_tool_renderer
+class MarkTodoDoneRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "mark_todo_done"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #a78bfa]Todo Completed[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Failed to mark todo done")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Marking done...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
+
+
+@register_tool_renderer
+class MarkTodoPendingRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "mark_todo_pending"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #f59e0b]Todo Reopened[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Failed to reopen todo")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Reopening...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
+
+
+@register_tool_renderer
+class DeleteTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "delete_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+        header = "📋 [bold #94a3b8]Todo Removed[/]"
+
+        if result and isinstance(result, dict):
+            if result.get("success"):
+                lines = [header]
+                lines.extend(_format_todo_lines(cls, result))
+                content_text = "\n".join(lines)
+            else:
+                error = result.get("error", "Failed to remove todo")
+                content_text = f"{header}\n  [#ef4444]{cls.escape_markup(error)}[/]"
+        else:
+            content_text = f"{header}\n  [dim]Removing...[/]"
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(content_text, classes=css_classes)
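Review note on `_format_todo_lines` above: every todo renderer delegates to it, so its overflow behaviour determines what the user sees for long lists. The following standalone sketch mirrors the marker selection and the "... +N more" cut-off. It is a simplification under stated assumptions: `format_todo_lines` is a hypothetical name, and markup escaping and title truncation are omitted.

```python
from typing import Any

STATUS_MARKERS = {"pending": "[ ]", "in_progress": "[~]", "done": "[•]"}


def format_todo_lines(todos: list[dict[str, Any]], limit: int = 25) -> list[str]:
    """Simplified stand-in: one line per todo, then a summary line past `limit`."""
    if not todos:
        return ["  [dim]No todos[/]"]

    lines: list[str] = []
    for index, todo in enumerate(todos):
        if index >= limit:
            # Reached only when there are more todos than `limit`.
            lines.append(f"  [dim]... +{len(todos) - limit} more[/]")
            break

        status = todo.get("status", "pending")
        # Unknown statuses fall back to the pending marker.
        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
        title = todo.get("title", "").strip() or "(untitled)"

        if status == "done":
            title = f"[dim strike]{title}[/]"
        elif status == "in_progress":
            title = f"[italic]{title}[/]"
        lines.append(f"  {marker} {title}")
    return lines


sample = [
    {"title": "Map endpoints", "status": "done"},
    {"title": "Test auth", "status": "in_progress"},
    {"title": "Write report"},
]
for line in format_todo_lines(sample, limit=2):
    print(line)
```

The third todo has no `status` key, so it would render with the pending marker if the limit allowed it. With `limit=2` it is folded into the summary line instead.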
--- a/strix/interface/tui.py
+++ b/strix/interface/tui.py
@@ -31,6 +31,7 @@ from textual.widgets import Button, Label, Static, TextArea, Tree
 from textual.widgets.tree import TreeNode

 from strix.agents.StrixAgent import StrixAgent
+from strix.interface.utils import build_live_stats_text
 from strix.llm.config import LLMConfig
 from strix.telemetry.tracer import Tracer, set_global_tracer

@@ -318,7 +319,8 @@ class StrixTUIApp(App):  # type: ignore[misc]
        }

    def _build_agent_config(self, args: argparse.Namespace) -> dict[str, Any]:
-        llm_config = LLMConfig()
+        scan_mode = getattr(args, "scan_mode", "deep")
+        llm_config = LLMConfig(scan_mode=scan_mode)

        config = {
            "llm_config": llm_config,
@@ -393,8 +395,12 @@ class StrixTUIApp(App):  # type: ignore[misc]
            agents_tree.guide_depth = 3
            agents_tree.guide_style = "dashed"

+            stats_display = Static("", id="stats_display")
+
+            sidebar = Vertical(agents_tree, stats_display, id="sidebar")
+
            content_container.mount(chat_area_container)
-            content_container.mount(agents_tree)
+            content_container.mount(sidebar)

            chat_area_container.mount(chat_history)
            chat_area_container.mount(agent_status_display)
@@ -481,6 +487,8 @@ class StrixTUIApp(App):  # type: ignore[misc]

        self._update_agent_status_display()

+        self._update_stats_display()
+
    def _update_agent_node(self, agent_id: str, agent_data: dict[str, Any]) -> bool:
        if agent_id not in self.agent_nodes:
            return False
@@ -658,6 +666,33 @@ class StrixTUIApp(App):  # type: ignore[misc]
        except (KeyError, Exception):
            self._safe_widget_operation(status_display.add_class, "hidden")

+    def _update_stats_display(self) -> None:
+        try:
+            stats_display = self.query_one("#stats_display", Static)
+        except (ValueError, Exception):
+            return
+
+        if not self._is_widget_safe(stats_display):
+            return
+
+        stats_content = Text()
+
+        stats_text = build_live_stats_text(self.tracer, self.agent_config)
+        if stats_text:
+            stats_content.append(stats_text)
+
+        from rich.panel import Panel
+
+        stats_panel = Panel(
+            stats_content,
+            title="📊 Live Stats",
+            title_align="left",
+            border_style="#22c55e",
+            padding=(0, 1),
+        )
+
+        self._safe_widget_operation(stats_display.update, stats_panel)
+
    def _get_agent_verb(self, agent_id: str) -> str:
        if agent_id not in self._agent_verbs:
            self._agent_verbs[agent_id] = random.choice(self._action_verbs)  # nosec B311 # noqa: S311
@@ -953,7 +988,7 @@ class StrixTUIApp(App):  # type: ignore[misc]

    def _render_chat_content(self, msg_data: dict[str, Any]) -> str:
        role = msg_data.get("role")
-        content = escape_markup(msg_data.get("content", ""))
+        content = msg_data.get("content", "")

        if not content:
            return ""
@@ -961,8 +996,11 @@ class StrixTUIApp(App):  # type: ignore[misc]
        if role == "user":
            from strix.interface.tool_components.user_message_renderer import UserMessageRenderer

-            return UserMessageRenderer.render_simple(content)
-        return content
+            return UserMessageRenderer.render_simple(escape_markup(content))
+
+        from strix.interface.tool_components.agent_message_renderer import AgentMessageRenderer
+
+        return AgentMessageRenderer.render_simple(content)

    def _render_tool_content_simple(self, tool_data: dict[str, Any]) -> str:
        tool_name = tool_data.get("tool_name", "Unknown Tool")
--- a/strix/interface/utils.py
+++ b/strix/interface/utils.py
@@ -1,3 +1,4 @@
+import ipaddress
 import re
 import secrets
 import shutil
@@ -37,14 +38,9 @@ def get_severity_color(severity: str) -> str:
    return severity_colors.get(severity, "#6b7280")


-def build_stats_text(tracer: Any) -> Text:
-    stats_text = Text()
-    if not tracer:
-        return stats_text
-
+def _build_vulnerability_stats(stats_text: Text, tracer: Any) -> None:
+    """Build vulnerability section of stats text."""
    vuln_count = len(tracer.vulnerability_reports)
-    tool_count = tracer.get_real_tool_count()
-    agent_count = len(tracer.agents)

    if vuln_count > 0:
        severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
@@ -80,68 +76,194 @@ def build_stats_text(tracer: Any) -> Text:
        stats_text.append(" (No exploitable vulnerabilities detected)", style="dim green")
        stats_text.append("\n")

+
+def _build_llm_stats(stats_text: Text, total_stats: dict[str, Any]) -> None:
+    """Build LLM usage section of stats text."""
+    if total_stats["requests"] > 0:
+        stats_text.append("\n")
+        stats_text.append("📥 Input Tokens: ", style="bold cyan")
+        stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
+
+        if total_stats["cached_tokens"] > 0:
+            stats_text.append(" • ", style="dim white")
+            stats_text.append("⚡ Cached Tokens: ", style="bold green")
+            stats_text.append(format_token_count(total_stats["cached_tokens"]), style="bold white")
+
+        stats_text.append(" • ", style="dim white")
+        stats_text.append("📤 Output Tokens: ", style="bold cyan")
+        stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
+
+        if total_stats["cost"] > 0:
+            stats_text.append(" • ", style="dim white")
+            stats_text.append("💰 Total Cost: ", style="bold cyan")
+            stats_text.append(f"${total_stats['cost']:.4f}", style="bold yellow")
+    else:
+        stats_text.append("\n")
+        stats_text.append("💰 Total Cost: ", style="bold cyan")
+        stats_text.append("$0.0000 ", style="bold yellow")
+        stats_text.append("• ", style="bold white")
+        stats_text.append("📊 Tokens: ", style="bold cyan")
+        stats_text.append("0", style="bold white")
+
+
+def build_final_stats_text(tracer: Any) -> Text:
+    """Build stats text for final output with detailed messages and LLM usage."""
+    stats_text = Text()
+    if not tracer:
+        return stats_text
+
+    _build_vulnerability_stats(stats_text, tracer)
+
+    tool_count = tracer.get_real_tool_count()
+    agent_count = len(tracer.agents)
+
    stats_text.append("🤖 Agents Used: ", style="bold cyan")
    stats_text.append(str(agent_count), style="bold white")
    stats_text.append(" • ", style="dim white")
    stats_text.append("🛠️ Tools Called: ", style="bold cyan")
    stats_text.append(str(tool_count), style="bold white")

+    llm_stats = tracer.get_total_llm_stats()
+    _build_llm_stats(stats_text, llm_stats["total"])
+
    return stats_text


-def build_llm_stats_text(tracer: Any) -> Text:
-    llm_stats_text = Text()
+def build_live_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
+    stats_text = Text()
    if not tracer:
-        return llm_stats_text
+        return stats_text
+
+    vuln_count = len(tracer.vulnerability_reports)
+    tool_count = tracer.get_real_tool_count()
+    agent_count = len(tracer.agents)
+
+    stats_text.append("🔍 Vulnerabilities: ", style="bold white")
+    stats_text.append(f"{vuln_count}", style="dim white")
+    stats_text.append("\n")
+    if vuln_count > 0:
+        severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
+        for report in tracer.vulnerability_reports:
+            severity = report.get("severity", "").lower()
+            if severity in severity_counts:
+                severity_counts[severity] += 1
+
+        severity_parts = []
+        for severity in ["critical", "high", "medium", "low", "info"]:
+            count = severity_counts[severity]
+            if count > 0:
+                severity_color = get_severity_color(severity)
+                severity_text = Text()
+                severity_text.append(f"{severity.upper()}: ", style=severity_color)
+                severity_text.append(str(count), style=f"bold {severity_color}")
+                severity_parts.append(severity_text)
+
+        for i, part in enumerate(severity_parts):
+            stats_text.append(part)
+            if i < len(severity_parts) - 1:
+                stats_text.append(" | ", style="dim white")
+
+        stats_text.append("\n")
+
+    if agent_config:
+        llm_config = agent_config["llm_config"]
+        model = getattr(llm_config, "model_name", "Unknown")
+        stats_text.append(f"🧠 Model: {model}")
+        stats_text.append("\n")
+
+    stats_text.append("🤖 Agents: ", style="bold white")
+    stats_text.append(str(agent_count), style="dim white")
+    stats_text.append(" • ", style="dim white")
+    stats_text.append("🛠️ Tools: ", style="bold white")
+    stats_text.append(str(tool_count), style="dim white")

    llm_stats = tracer.get_total_llm_stats()
    total_stats = llm_stats["total"]

-    if total_stats["requests"] > 0:
-        llm_stats_text.append("📥 Input Tokens: ", style="bold cyan")
-        llm_stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
+    stats_text.append("\n")

-        if total_stats["cached_tokens"] > 0:
-            llm_stats_text.append(" • ", style="dim white")
-            llm_stats_text.append("⚡ Cached: ", style="bold green")
-            llm_stats_text.append(
-                format_token_count(total_stats["cached_tokens"]), style="bold green"
-            )
+    stats_text.append("📥 Input: ", style="bold white")
+    stats_text.append(format_token_count(total_stats["input_tokens"]), style="dim white")

-        llm_stats_text.append(" • ", style="dim white")
-        llm_stats_text.append("📤 Output Tokens: ", style="bold cyan")
-        llm_stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
+    stats_text.append(" • ", style="dim white")
+    stats_text.append("⚡ ", style="bold white")
+    stats_text.append("Cached: ", style="bold white")
+    stats_text.append(format_token_count(total_stats["cached_tokens"]), style="dim white")

-        if total_stats["cost"] > 0:
-            llm_stats_text.append(" • ", style="dim white")
-            llm_stats_text.append("💰 Total Cost: $", style="bold cyan")
-            llm_stats_text.append(f"{total_stats['cost']:.4f}", style="bold yellow")
+    stats_text.append("\n")

-    return llm_stats_text
+    stats_text.append("📤 Output: ", style="bold white")
+    stats_text.append(format_token_count(total_stats["output_tokens"]), style="dim white")
+
+    stats_text.append(" • ", style="dim white")
+    stats_text.append("💰 Cost: ", style="bold white")
+    stats_text.append(f"${total_stats['cost']:.4f}", style="dim white")
+
+    return stats_text


 # Name generation utilities
-def generate_run_name() -> str:
-    # fmt: off
-    adjectives = [
-        "stealthy", "sneaky", "crafty", "elite", "phantom", "shadow", "silent",
-        "rogue", "covert", "ninja", "ghost", "cyber", "digital", "binary",
-        "encrypted", "obfuscated", "masked", "cloaked", "invisible", "anonymous"
-    ]
-    nouns = [
-        "exploit", "payload", "backdoor", "rootkit", "keylogger", "botnet", "trojan",
-        "worm", "virus", "packet", "buffer", "shell", "daemon", "spider", "crawler",
-        "scanner", "sniffer", "honeypot", "firewall", "breach"
-    ]
-    # fmt: on
-    adj = secrets.choice(adjectives)
-    noun = secrets.choice(nouns)
-    number = secrets.randbelow(900) + 100
-    return f"{adj}-{noun}-{number}"
+
+
+def _slugify_for_run_name(text: str, max_length: int = 32) -> str:
+    text = text.lower().strip()
+    text = re.sub(r"[^a-z0-9]+", "-", text)
+    text = text.strip("-")
+    if len(text) > max_length:
+        text = text[:max_length].rstrip("-")
+    return text or "pentest"
+
+
+def _derive_target_label_for_run_name(targets_info: list[dict[str, Any]] | None) -> str:  # noqa: PLR0911
+    if not targets_info:
+        return "pentest"
+
+    first = targets_info[0]
+    target_type = first.get("type")
+    details = first.get("details", {}) or {}
+    original = first.get("original", "") or ""
+
+    if target_type == "web_application":
+        url = details.get("target_url", original)
+        try:
+            parsed = urlparse(url)
+            return str(parsed.netloc or parsed.path or url)
+        except Exception:  # noqa: BLE001
+            return str(url)
+
+    if target_type == "repository":
+        repo = details.get("target_repo", original)
+        parsed = urlparse(repo)
+        path = parsed.path or repo
+        name = path.rstrip("/").split("/")[-1] or path
+        if name.endswith(".git"):
+            name = name[:-4]
+        return str(name)
+
+    if target_type == "local_code":
+        path_str = details.get("target_path", original)
+        try:
+            return str(Path(path_str).name or path_str)
+        except Exception:  # noqa: BLE001
+            return str(path_str)
+
+    if target_type == "ip_address":
+        return str(details.get("target_ip", original) or original)
+
+    return str(original or "pentest")
+
+
+def generate_run_name(targets_info: list[dict[str, Any]] | None = None) -> str:
+    base_label = _derive_target_label_for_run_name(targets_info)
+    slug = _slugify_for_run_name(base_label)
+
+    random_suffix = secrets.token_hex(2)
+
+    return f"{slug}_{random_suffix}"


 # Target processing utilities
-def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
+def infer_target_type(target: str) -> tuple[str, dict[str, str]]:  # noqa: PLR0911
    if not target or not isinstance(target, str):
        raise ValueError("Target must be a non-empty string")

@@ -167,6 +289,13 @@ def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
            return "repository", {"target_repo": target}
        return "web_application", {"target_url": target}

+    try:
+        ip_obj = ipaddress.ip_address(target)
+    except ValueError:
+        pass
+    else:
+        return "ip_address", {"target_ip": str(ip_obj)}
+
    path = Path(target).expanduser()
    try:
        if path.exists():
@@ -191,7 +320,8 @@ def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
        "- A valid URL (http:// or https://)\n"
        "- A Git repository URL (https://github.com/... or git@github.com:...)\n"
        "- A local directory path\n"
-        "- A domain name (e.g., example.com)"
+        "- A domain name (e.g., example.com)\n"
+        "- An IP address (e.g., 192.168.1.10)"
    )


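Review note: the run-name and IP-target changes above can be tried in isolation. The sketch below is a standalone approximation, not the code in `strix/interface/utils.py`: `slugify`, `normalize_ip`, and `run_name` are hypothetical names that mirror the logic of `_slugify_for_run_name`, the `ipaddress.ip_address` branch of `infer_target_type`, and `generate_run_name`.

```python
import ipaddress
import re
import secrets
from typing import Optional


def slugify(text: str, max_length: int = 32) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', and trim to max_length."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower().strip()).strip("-")
    if len(slug) > max_length:
        slug = slug[:max_length].rstrip("-")
    # Fall back to a generic label when nothing usable is left.
    return slug or "pentest"


def normalize_ip(target: str) -> Optional[str]:
    """Return the canonical string form of an IP literal, or None otherwise."""
    try:
        return str(ipaddress.ip_address(target))
    except ValueError:
        return None


def run_name(label: str) -> str:
    """Join the slug with a random 4-character hex suffix."""
    return f"{slugify(label)}_{secrets.token_hex(2)}"
```

With these helpers, a label such as `example.com` yields a name of the form `example-com_xxxx`, where the suffix differs per run. Hostnames fall through `normalize_ip` with `None`, so they can still be handled by the later path and domain checks.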
--- a/strix/llm/__init__.py
+++ b/strix/llm/__init__.py
@@ -11,5 +11,3 @@ __all__ = [
 ]

 litellm._logging._disable_debugging()
-
-litellm.drop_params = True
--- a/strix/llm/config.py
+++ b/strix/llm/config.py
@@ -5,15 +5,19 @@ class LLMConfig:
    def __init__(
        self,
        model_name: str | None = None,
-        temperature: float = 0,
        enable_prompt_caching: bool = True,
        prompt_modules: list[str] | None = None,
+        timeout: int | None = None,
+        scan_mode: str = "deep",
    ):
        self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")

        if not self.model_name:
            raise ValueError("STRIX_LLM environment variable must be set and not empty")

-        self.temperature = max(0.0, min(1.0, temperature))
        self.enable_prompt_caching = enable_prompt_caching
        self.prompt_modules = prompt_modules or []
+
+        self.timeout = timeout or int(os.getenv("LLM_TIMEOUT", "300"))
+
+        self.scan_mode = scan_mode if scan_mode in ["quick", "standard", "deep"] else "deep"
--- a/strix/llm/llm.py
+++ b/strix/llm/llm.py
@@ -2,6 +2,7 @@ import logging
 import os
 from dataclasses import dataclass
 from enum import Enum
+from fnmatch import fnmatch
 from pathlib import Path
 from typing import Any

@@ -12,7 +13,7 @@ from jinja2 import (
    select_autoescape,
 )
 from litellm import ModelResponse, completion_cost
-from litellm.utils import supports_prompt_caching
+from litellm.utils import supports_prompt_caching, supports_vision

 from strix.llm.config import LLMConfig
 from strix.llm.memory_compressor import MemoryCompressor
@@ -24,18 +25,16 @@ from strix.tools import get_tools_prompt

 logger = logging.getLogger(__name__)

-api_key = os.getenv("LLM_API_KEY")
-if api_key:
-    litellm.api_key = api_key
+litellm.drop_params = True
+litellm.modify_params = True

-api_base = (
+_LLM_API_KEY = os.getenv("LLM_API_KEY")
+_LLM_API_BASE = (
    os.getenv("LLM_API_BASE")
    or os.getenv("OPENAI_API_BASE")
    or os.getenv("LITELLM_BASE_URL")
    or os.getenv("OLLAMA_API_BASE")
 )
-if api_base:
-    litellm.api_base = api_base


 class LLMRequestFailedError(Exception):
@@ -45,27 +44,14 @@ class LLMRequestFailedError(Exception):
        self.details = details


-MODELS_WITHOUT_STOP_WORDS = [
-    "gpt-5",
-    "gpt-5-mini",
-    "gpt-5-nano",
-    "o1-mini",
-    "o1-preview",
-    "o1",
-    "o1-2024-12-17",
-    "o3",
-    "o3-2025-04-16",
-    "o3-mini-2025-01-31",
-    "o3-mini",
-    "o4-mini",
-    "o4-mini-2025-04-16",
+SUPPORTS_STOP_WORDS_FALSE_PATTERNS: list[str] = [
+    "o1*",
    "grok-4-0709",
+    "grok-code-fast-1",
+    "deepseek-r1-0528*",
 ]

-REASONING_EFFORT_SUPPORTED_MODELS = [
-    "gpt-5",
-    "gpt-5-mini",
-    "gpt-5-nano",
+REASONING_EFFORT_PATTERNS: list[str] = [
    "o1-2024-12-17",
    "o1",
    "o3",
@@ -76,9 +62,39 @@ REASONING_EFFORT_SUPPORTED_MODELS = [
    "o4-mini-2025-04-16",
    "gemini-2.5-flash",
    "gemini-2.5-pro",
+    "gpt-5*",
+    "deepseek-r1-0528*",
+    "claude-sonnet-4-5*",
+    "claude-haiku-4-5*",
 ]


+def normalize_model_name(model: str) -> str:
+    raw = (model or "").strip().lower()
+    if "/" in raw:
+        name = raw.split("/")[-1]
+        if ":" in name:
+            name = name.split(":", 1)[0]
+    else:
+        name = raw
+    if name.endswith("-gguf"):
+        name = name[: -len("-gguf")]
+    return name
+
+
+def model_matches(model: str, patterns: list[str]) -> bool:
+    raw = (model or "").strip().lower()
+    name = normalize_model_name(model)
+    for pat in patterns:
+        pat_l = pat.lower()
+        if "/" in pat_l:
+            if fnmatch(raw, pat_l):
+                return True
+        elif fnmatch(name, pat_l):
+            return True
+    return False
+
+
 class StepRole(str, Enum):
    AGENT = "agent"
    USER = "user"
@@ -117,13 +133,19 @@ class RequestStats:


 class LLM:
-    def __init__(self, config: LLMConfig, agent_name: str | None = None):
+    def __init__(
+        self, config: LLMConfig, agent_name: str | None = None, agent_id: str | None = None
+    ):
        self.config = config
        self.agent_name = agent_name
+        self.agent_id = agent_id
        self._total_stats = RequestStats()
        self._last_request_stats = RequestStats()

-        self.memory_compressor = MemoryCompressor()
+        self.memory_compressor = MemoryCompressor(
+            model_name=self.config.model_name,
+            timeout=self.config.timeout,
+        )

        if agent_name:
            prompt_dir = Path(__file__).parent.parent / "agents" / agent_name
@@ -136,9 +158,10 @@ class LLM:
            )

            try:
-                prompt_module_content = load_prompt_modules(
-                    self.config.prompt_modules or [], self.jinja_env
-                )
+                modules_to_load = list(self.config.prompt_modules or [])
+                modules_to_load.append(f"scan_modes/{self.config.scan_mode}")
+
+                prompt_module_content = load_prompt_modules(modules_to_load, self.jinja_env)

                def get_module(name: str) -> str:
                    return prompt_module_content.get(name, "")
@@ -156,6 +179,31 @@ class LLM:
        else:
            self.system_prompt = "You are a helpful AI assistant."

+    def set_agent_identity(self, agent_name: str | None, agent_id: str | None) -> None:
+        if agent_name:
+            self.agent_name = agent_name
+        if agent_id:
+            self.agent_id = agent_id
+
+    def _build_identity_message(self) -> dict[str, Any] | None:
+        if not (self.agent_name and str(self.agent_name).strip()):
+            return None
+        identity_name = self.agent_name
+        identity_id = self.agent_id
+        content = (
+            "\n\n"
+            "<agent_identity>\n"
+            "<meta>Internal metadata: do not echo or reference; "
+            "not part of history or tool calls.</meta>\n"
+            "<note>You are now assuming the role of this agent. "
+            "Act strictly as this agent and maintain self-identity for this step. "
+            "Now go answer the next needed step!</note>\n"
+            f"<agent_name>{identity_name}</agent_name>\n"
+            f"<agent_id>{identity_id}</agent_id>\n"
+            "</agent_identity>\n\n"
+        )
+        return {"role": "user", "content": content}
+
    def _add_cache_control_to_content(
        self, content: str | list[dict[str, Any]]
    ) -> str | list[dict[str, Any]]:
@@ -231,6 +279,10 @@ class LLM:
    ) -> LLMResponse:
        messages = [{"role": "system", "content": self.system_prompt}]

+        identity_message = self._build_identity_message()
+        if identity_message:
+            messages.append(identity_message)
+
        compressed_history = list(self.memory_compressor.compress_history(conversation_history))

        conversation_history.clear()
@@ -329,39 +381,79 @@ class LLM:
        if not self.config.model_name:
            return True

-        actual_model_name = self.config.model_name.split("/")[-1].lower()
-        model_name_lower = self.config.model_name.lower()
-
-        return not any(
-            actual_model_name == unsupported_model.lower()
-            or model_name_lower == unsupported_model.lower()
-            for unsupported_model in MODELS_WITHOUT_STOP_WORDS
-        )
+        return not model_matches(self.config.model_name, SUPPORTS_STOP_WORDS_FALSE_PATTERNS)

    def _should_include_reasoning_effort(self) -> bool:
        if not self.config.model_name:
            return False

-        actual_model_name = self.config.model_name.split("/")[-1].lower()
-        model_name_lower = self.config.model_name.lower()
+        return model_matches(self.config.model_name, REASONING_EFFORT_PATTERNS)

-        return any(
-            actual_model_name == supported_model.lower()
-            or model_name_lower == supported_model.lower()
-            for supported_model in REASONING_EFFORT_SUPPORTED_MODELS
-        )
+    def _model_supports_vision(self) -> bool:
+        if not self.config.model_name:
+            return False
+        try:
+            return bool(supports_vision(model=self.config.model_name))
+        except Exception:  # noqa: BLE001
+            return False
+
+    def _filter_images_from_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
+        filtered_messages = []
+        for msg in messages:
+            content = msg.get("content")
+            updated_msg = msg
+            if isinstance(content, list):
+                filtered_content = []
+                for item in content:
+                    if isinstance(item, dict):
+                        if item.get("type") == "image_url":
+                            filtered_content.append(
+                                {
+                                    "type": "text",
+                                    "text": "[Screenshot removed - model does not support "
+                                    "vision. Use view_source or execute_js instead.]",
+                                }
+                            )
+                        else:
+                            filtered_content.append(item)
+                    else:
+                        filtered_content.append(item)
+                if filtered_content:
+                    text_parts = [
+                        item.get("text", "") if isinstance(item, dict) else str(item)
+                        for item in filtered_content
+                    ]
+                    all_text = all(
+                        isinstance(item, dict) and item.get("type") == "text"
+                        for item in filtered_content
+                    )
+                    if all_text:
+                        updated_msg = {**msg, "content": "\n".join(text_parts)}
+                    else:
+                        updated_msg = {**msg, "content": filtered_content}
+                else:
+                    updated_msg = {**msg, "content": ""}
+            filtered_messages.append(updated_msg)
+        return filtered_messages

    async def _make_request(
        self,
        messages: list[dict[str, Any]],
    ) -> ModelResponse:
+        if not self._model_supports_vision():
+            messages = self._filter_images_from_messages(messages)
+
        completion_args: dict[str, Any] = {
            "model": self.config.model_name,
            "messages": messages,
-            "temperature": self.config.temperature,
-            "timeout": 180,
+            "timeout": self.config.timeout,
        }

+        if _LLM_API_KEY:
+            completion_args["api_key"] = _LLM_API_KEY
+        if _LLM_API_BASE:
+            completion_args["api_base"] = _LLM_API_BASE
+
        if self._should_include_stop_param():
            completion_args["stop"] = ["</function>"]

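Review note: the switch from exact-match model lists to glob patterns is easier to check with a few concrete ids. The sketch below is a condensed restatement of `normalize_model_name` and `model_matches` from this file, written to run on its own. The model ids in the usage note are illustrative assumptions, not a list of supported models.

```python
from fnmatch import fnmatch
from typing import List


def normalize_model_name(model: str) -> str:
    """Drop the provider prefix, any ':tag' suffix, and a trailing '-gguf'."""
    raw = (model or "").strip().lower()
    name = raw
    if "/" in raw:
        # Keep only the last path segment, then cut a ':tag' suffix if present.
        name = raw.split("/")[-1].split(":", 1)[0]
    if name.endswith("-gguf"):
        name = name[: -len("-gguf")]
    return name


def model_matches(model: str, patterns: List[str]) -> bool:
    """Patterns containing '/' match the full id; others match the bare name."""
    raw = (model or "").strip().lower()
    name = normalize_model_name(model)
    return any(
        fnmatch(raw if "/" in pat else name, pat.lower()) for pat in patterns
    )
```

Under this logic, an id like `openai/gpt-5-mini` matches the pattern `gpt-5*`, and `ollama/deepseek-r1-0528:8b` matches `deepseek-r1-0528*` because the tag is removed before matching. One consequence worth keeping in mind is that a broad pattern such as `o1*` matches any bare name that starts with `o1`.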
--- a/strix/llm/memory_compressor.py
+++ b/strix/llm/memory_compressor.py
@@ -85,6 +85,7 @@ def _extract_message_text(msg: dict[str, Any]) -> str:
 def _summarize_messages(
    messages: list[dict[str, Any]],
    model: str,
+    timeout: int = 600,
 ) -> dict[str, Any]:
    if not messages:
        empty_summary = "<context_summary message_count='0'>{text}</context_summary>"
@@ -106,7 +107,7 @@ def _summarize_messages(
        completion_args = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
-            "timeout": 180,
+            "timeout": timeout,
        }

        response = litellm.completion(**completion_args)
@@ -146,9 +147,11 @@ class MemoryCompressor:
        self,
        max_images: int = 3,
        model_name: str | None = None,
+        timeout: int = 600,
    ):
        self.max_images = max_images
        self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
+        self.timeout = timeout

        if not self.model_name:
            raise ValueError("STRIX_LLM environment variable must be set and not empty")
@@ -202,7 +205,7 @@ class MemoryCompressor:
        chunk_size = 10
        for i in range(0, len(old_msgs), chunk_size):
            chunk = old_msgs[i : i + chunk_size]
-            summary = _summarize_messages(chunk, model_name)
+            summary = _summarize_messages(chunk, model_name, self.timeout)
            if summary:
                compressed.append(summary)

--- a/strix/llm/request_queue.py
+++ b/strix/llm/request_queue.py
@@ -1,5 +1,6 @@
 import asyncio
 import logging
+import os
 import threading
 import time
 from typing import Any
@@ -26,7 +27,15 @@ def should_retry_exception(exception: Exception) -> bool:


 class LLMRequestQueue:
-    def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 1.0):
+    def __init__(self, max_concurrent: int = 1, delay_between_requests: float = 4.0):
+        rate_limit_delay = os.getenv("LLM_RATE_LIMIT_DELAY")
+        if rate_limit_delay:
+            delay_between_requests = float(rate_limit_delay)
+
+        rate_limit_concurrent = os.getenv("LLM_RATE_LIMIT_CONCURRENT")
+        if rate_limit_concurrent:
+            max_concurrent = int(rate_limit_concurrent)
+
        self.max_concurrent = max_concurrent
        self.delay_between_requests = delay_between_requests
        self._semaphore = threading.BoundedSemaphore(max_concurrent)
@@ -52,8 +61,8 @@ class LLMRequestQueue:
            self._semaphore.release()

    @retry(  # type: ignore[misc]
-        stop=stop_after_attempt(5),
-        wait=wait_exponential(multiplier=2, min=1, max=30),
+        stop=stop_after_attempt(3),
+        wait=wait_exponential(multiplier=8, min=8, max=64),
        retry=retry_if_exception(should_retry_exception),
        reraise=True,
    )
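Review note: the new defaults (one concurrent request, a 4.0 second delay) can be overridden through `LLM_RATE_LIMIT_CONCURRENT` and `LLM_RATE_LIMIT_DELAY`. The sketch below isolates that override logic so it can be tested without constructing the queue. `resolve_rate_limits` is a hypothetical helper, not part of `LLMRequestQueue`, and it takes the environment as a parameter purely for testability.

```python
import os
from typing import Mapping, Tuple


def resolve_rate_limits(
    env: Mapping[str, str] = os.environ,
    max_concurrent: int = 1,
    delay_between_requests: float = 4.0,
) -> Tuple[int, float]:
    """Apply the environment overrides on top of the constructor defaults."""
    rate_limit_delay = env.get("LLM_RATE_LIMIT_DELAY")
    if rate_limit_delay:  # unset or empty string keeps the default
        delay_between_requests = float(rate_limit_delay)

    rate_limit_concurrent = env.get("LLM_RATE_LIMIT_CONCURRENT")
    if rate_limit_concurrent:
        max_concurrent = int(rate_limit_concurrent)

    return max_concurrent, delay_between_requests
```

As in the diff, the environment value wins over an explicit constructor argument, and a non-numeric value raises `ValueError` from `float()` or `int()` rather than falling back silently.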
--- a/strix/prompts/coordination/root_agent.jinja
+++ b/strix/prompts/coordination/root_agent.jinja
@@ -28,7 +28,7 @@ AGENT TYPES YOU CAN CREATE:
 COORDINATION GUIDELINES:
 - Ensure clear task boundaries and success criteria
 - Terminate redundant agents when objectives overlap
-- Use message passing for agent communication
+- Use message passing only when essential (requests/answers or critical handoffs); avoid routine status messages and prefer batched updates
 </agent_management>

 <final_responsibilities>
--- a/strix/prompts/scan_modes/deep.jinja
+++ b/strix/prompts/scan_modes/deep.jinja
@@ -0,0 +1,145 @@
+<scan_mode>
+DEEP SCAN MODE - Exhaustive Security Assessment
+
+This mode is for thorough security reviews where finding vulnerabilities is critical.
+
+PHASE 1: EXHAUSTIVE RECONNAISSANCE AND MAPPING
+Spend significant effort understanding the target before exploitation.
+
+For whitebox (source code available):
+- Map EVERY file, module, and code path in the repository
+- Trace all entry points from HTTP handlers to database queries
+- Identify all authentication mechanisms and their implementations
+- Map all authorization checks and understand the access control model
+- Identify all external service integrations and API calls
+- Analyze all configuration files for secrets and misconfigurations
+- Review all database schemas and understand data relationships
+- Map all background jobs, cron tasks, and async processing
+- Identify all serialization/deserialization points
+- Review all file handling operations (upload, download, processing)
+- Understand the deployment model and infrastructure assumptions
+- Check all dependency versions against known CVE databases
+
+For blackbox (no source code):
+- Exhaustive subdomain enumeration using multiple sources and tools
+- Full port scanning to identify all services
+- Complete content discovery with multiple wordlists
+- Technology fingerprinting on all discovered assets
+- API endpoint discovery through documentation, JavaScript analysis, and fuzzing
+- Identify all parameters including hidden and rarely-used ones
+- Map all user roles by testing with different account types
+- Understand rate limiting, WAF rules, and security controls in place
+- Document the complete application architecture as understood from outside
+
+EXECUTION STRATEGY - HIERARCHICAL AGENT SWARM:
+After Phase 1 (Recon & Mapping) is complete:
+1. Divide the application into major components/parts (e.g., Auth System, Payment Gateway, User Profile, Admin Panel)
+2. Spawn a specialized subagent for EACH major component
+3. Each component agent must then:
+   - Further subdivide its scope into subparts (e.g., Login Form, Registration API, Password Reset)
+   - Spawn sub-subagents for each distinct subpart
+4. At the lowest level (specific functionality), spawn specialized agents for EACH potential vulnerability type:
+   - "Auth System" → "Login Form" → "SQLi Agent", "XSS Agent", "Auth Bypass Agent"
+   - This creates a massive parallel swarm covering every angle
+   - Do NOT overload a single agent with multiple vulnerability types
+   - Scale horizontally to maximum capacity
+
+PHASE 2: DEEP BUSINESS LOGIC ANALYSIS
+Understand the application deeply enough to find logic flaws:
+- CREATE A FULL STORYBOARD of all user flows and state transitions
+- Document every step of the business logic in a structured flow diagram
+- Use the application extensively as every type of user to map the full lifecycle of data
+- Document all state machines and workflows (e.g. Order Created -> Paid -> Shipped)
+- Identify trust boundaries between components
+- Map all integrations with third-party services
+- Understand what invariants the application tries to maintain
+- Identify all points where roles, privileges, or sensitive data changes hands
+- Look for implicit assumptions in the business logic
+- Consider multi-step attacks that abuse normal functionality
+
+PHASE 3: COMPREHENSIVE ATTACK SURFACE TESTING
+Test EVERY input vector with EVERY applicable technique.
+
+Input Handling - Test all parameters, headers, cookies with:
+- Multiple injection payloads (SQL, NoSQL, LDAP, XPath, Command, Template)
+- Various encodings and bypass techniques (double encoding, unicode, null bytes)
+- Boundary conditions and type confusion
+- Large payloads and buffer-related issues
+
+Authentication and Session:
+- Exhaustive brute force protection testing
+- Session fixation, hijacking, and prediction attacks
+- JWT/token manipulation if applicable
+- OAuth flow abuse scenarios
+- Password reset flow vulnerabilities (token leakage, reuse, timing)
+- Multi-factor authentication bypass techniques
+- Account enumeration through all possible channels
+
+Access Control:
+- Test EVERY endpoint for horizontal and vertical access control
+- Parameter tampering on all object references
+- Forced browsing to all discovered resources
+- HTTP method tampering
+- Test access control after session changes (logout, role change)
+
+File Operations:
+- Exhaustive file upload bypass testing (extension, content-type, magic bytes)
+- Path traversal on all file parameters
+- Server-side request forgery through file inclusion
+- XXE through all XML parsing points
+
+Business Logic:
+- Race conditions on all state-changing operations
+- Workflow bypass attempts on every multi-step process
+- Price/quantity manipulation in all transactions
+- Parallel execution attacks
+- Time-of-check to time-of-use vulnerabilities
+
+Advanced Attacks:
+- HTTP request smuggling if multiple proxies/servers are in the request path
+- Cache poisoning and cache deception
+- Subdomain takeover on all subdomains
+- Prototype pollution in JavaScript applications
+- CORS misconfiguration exploitation
+- WebSocket security testing
+- GraphQL specific attacks if applicable
+
+PHASE 4: VULNERABILITY CHAINING
+Don't just find individual bugs - chain them:
+- Combine information disclosure with access control bypass
+- Chain SSRF to access internal services
+- Use low-severity findings to enable high-impact attacks
+- Look for multi-step attack paths that automated tools miss
+- Consider attacks that span multiple application components
+
+CHAINING PRINCIPLES (MAX IMPACT):
+- Treat every finding as a pivot: ask "What does this unlock next?" until you reach maximum privilege / maximum data exposure / maximum control
+- Prefer end-to-end exploit paths over isolated bugs: initial foothold → pivot → privilege gain → sensitive action/data
+- Cross boundaries deliberately: user → admin, external → internal, unauthenticated → authenticated, read → write, single-tenant → cross-tenant
+- Validate chains by executing the full sequence using the available tools (proxy + browser for workflows, python for automation, terminal for supporting commands)
+- When a component agent finds a potential pivot, it must message/spawn the next focused agent to continue the chain in the next component/subpart
+
+PHASE 5: PERSISTENT TESTING
+If initial attempts fail, don't give up:
+- Research specific technologies for known bypasses
+- Try alternative exploitation techniques
+- Look for edge cases and unusual functionality
+- Test with different client contexts
+- Revisit previously tested areas with new information
+- Consider timing-based and blind exploitation techniques
+
+PHASE 6: THOROUGH REPORTING
+- Document EVERY confirmed vulnerability with full details
+- Include all severity levels - even low findings may enable chains
+- Provide complete reproduction steps and PoC
+- Document remediation recommendations
+- Note areas requiring additional review beyond current scope
+
+MINDSET:
+- Relentless - this is about finding what others miss
+- Creative - think of unconventional attack vectors
+- Patient - real vulnerabilities often require deep investigation
+- Thorough - test every parameter, every endpoint, every edge case
+- Persistent - if one approach fails, try ten more
+- Holistic - understand how components interact to find systemic issues
+</scan_mode>
--- a/strix/prompts/scan_modes/quick.jinja
+++ b/strix/prompts/scan_modes/quick.jinja
@@ -0,0 +1,63 @@
+<scan_mode>
+QUICK SCAN MODE - Rapid Security Assessment
+
+This mode is optimized for fast feedback. Focus on HIGH-IMPACT vulnerabilities with minimal overhead.
+
+PHASE 1: RAPID ORIENTATION
+- If source code is available: Focus primarily on RECENT CHANGES (git diff, new commits, modified files)
+- Identify the most critical entry points: authentication endpoints, payment flows, admin interfaces, API endpoints handling sensitive data
+- Quickly understand the tech stack and frameworks in use
+- Skip exhaustive reconnaissance - use what's immediately visible
+
+PHASE 2: TARGETED ATTACK SURFACE
+For whitebox (source code available):
+- Prioritize files changed in recent commits/PRs - these are most likely to contain fresh bugs
+- Look for security-sensitive patterns in diffs: auth checks, input handling, database queries, file operations
+- Trace user-controllable input in changed code paths
+- Check if security controls were modified or bypassed
+
+For blackbox (no source code):
+- Focus on authentication and session management
+- Test the most critical user flows only
+- Check for obvious misconfigurations and exposed endpoints
+- Skip deep content discovery - test what's immediately accessible
+
+PHASE 3: HIGH-IMPACT VULNERABILITY FOCUS
+Prioritize in this order:
+1. Authentication bypass and broken access control
+2. Remote code execution vectors
+3. SQL injection in critical endpoints
+4. Insecure direct object references (IDOR) in sensitive resources
+5. Server-side request forgery (SSRF)
+6. Hardcoded credentials or secrets in code
+
+Skip lower-priority items:
+- Extensive subdomain enumeration
+- Full directory bruteforcing
+- Information disclosure that doesn't lead to exploitation
+- Theoretical vulnerabilities without PoC
+
+PHASE 4: VALIDATION AND REPORTING
+- Validate only critical/high severity findings with minimal PoC
+- Report findings as you discover them - don't wait for completion
+- Focus on exploitability and business impact
+
+QUICK CHAINING RULE:
+- If you find ANY strong primitive (auth weakness, access control gap, injection point, internal reachability), immediately attempt a single high-impact pivot to demonstrate real impact
+- Do not stop at a low-context “maybe”; turn it into a concrete exploit sequence (even if short) that reaches privileged action or sensitive data
+
+OPERATIONAL GUIDELINES:
+- Use the browser tool for quick manual testing of critical flows
+- Use terminal for targeted scans with fast presets (e.g., nuclei with critical/high templates only)
+- Use proxy to inspect traffic on key endpoints
+- Skip extensive fuzzing - use targeted payloads only
+- Create subagents only for parallel high-priority tasks
+- If whitebox: use the file_edit tool to review specific suspicious code sections
+- Use notes tool to track critical findings only
+
+MINDSET:
+- Think like a time-boxed bug bounty hunter going for quick wins
+- Prioritize breadth over depth on critical areas
+- If something looks exploitable, validate quickly and move on
+- Don't get stuck - if an attack vector isn't yielding results quickly, pivot
+</scan_mode>
--- a/strix/prompts/scan_modes/standard.jinja
+++ b/strix/prompts/scan_modes/standard.jinja
@@ -0,0 +1,91 @@
+<scan_mode>
+STANDARD SCAN MODE - Balanced Security Assessment
+
+This mode provides thorough coverage with a structured methodology. Balance depth with efficiency.
+
+PHASE 1: RECONNAISSANCE AND MAPPING
+Understanding the target is critical before exploitation. Never skip this phase.
+
+For whitebox (source code available):
+- Map the entire codebase structure: directories, modules, entry points
+- Identify the application architecture (MVC, microservices, monolith)
+- Understand the routing: how URLs map to handlers/controllers
+- Identify all user input vectors: forms, APIs, file uploads, headers, cookies
+- Map authentication and authorization flows
+- Identify database interactions and ORM usage
+- Review dependency manifests for known vulnerable packages
+- Understand the data model and sensitive data locations
+
+For blackbox (no source code):
+- Crawl the application thoroughly using browser tool - interact with every feature
+- Enumerate all endpoints, parameters, and functionality
+- Identify the technology stack through fingerprinting
+- Map user roles and access levels
+- Understand the business logic by using the application as intended
+- Document all forms, APIs, and data entry points
+- Use proxy tool to capture and analyze all traffic during exploration
+
+PHASE 2: BUSINESS LOGIC UNDERSTANDING
+Before testing for vulnerabilities, understand what the application DOES:
+- What are the critical business flows? (payments, user registration, data access)
+- What actions should be restricted to specific roles?
+- What data should users NOT be able to access?
+- What state transitions exist? (order pending → paid → shipped)
+- Where does money, sensitive data, or privilege flow?
+
+PHASE 3: SYSTEMATIC VULNERABILITY ASSESSMENT
+Test each attack surface methodically. Create focused subagents for different areas.
+
+Entry Point Analysis:
+- Test all input fields for injection vulnerabilities
+- Check all API endpoints for authentication and authorization
+- Test all file upload functionality for validation bypass
+- Test all search and filter functionality
+- Check redirect parameters and URL handling
+
+Authentication and Session:
+- Test login for brute force protection
+- Check session token entropy and handling
+- Test password reset flows for weaknesses
+- Verify logout invalidates sessions
+- Test for authentication bypass techniques
+
+Access Control:
+- For every privileged action, test as an unprivileged user
+- Test horizontal access control (user A accessing user B's data)
+- Test vertical access control (user escalating to admin)
+- Check that API endpoints mirror UI access controls
+- Test direct object references with different user contexts
+
+Business Logic:
+- Attempt to skip steps in multi-step processes
+- Test for race conditions in critical operations
+- Try negative values, zero values, boundary conditions
+- Attempt to replay transactions
+- Test for price manipulation in e-commerce flows
+
+PHASE 4: EXPLOITATION AND VALIDATION
+- Every finding must have a working proof-of-concept
+- Demonstrate actual impact, not theoretical risk
+- Chain vulnerabilities when possible to show maximum impact
+- Document the full attack path from initial access to impact
+- Use python tool for complex exploit development
+
+CHAINING & MAX IMPACT MINDSET:
+- Always ask: "If I can do X, what does that enable me to do next?" Keep pivoting until you reach maximum privilege or maximum sensitive data access
+- Prefer complete end-to-end paths (entry point → pivot → privileged action/data) over isolated bug reports
+- Use the application as a real user would: exploit must survive the actual workflow and state transitions
+- When you discover a useful pivot (info leak, weak boundary, partial access), immediately pursue the next step rather than stopping at the first win
+
+PHASE 5: COMPREHENSIVE REPORTING
+- Report all confirmed vulnerabilities with clear reproduction steps
+- Include severity based on actual exploitability and business impact
+- Provide remediation recommendations
+- Document any areas that need further investigation
+
+MINDSET:
+- Methodical and systematic - cover the full attack surface
+- Document as you go - findings and areas tested
+- Validate everything - no assumptions about exploitability
+- Think about business impact, not just technical severity
+</scan_mode>
--- a/strix/prompts/vulnerabilities/information_disclosure.jinja
+++ b/strix/prompts/vulnerabilities/information_disclosure.jinja
@@ -0,0 +1,222 @@
+<information_disclosure_vulnerability_guide>
+<title>INFORMATION DISCLOSURE</title>
+
+<critical>Information leaks accelerate exploitation by revealing code, configuration, identifiers, and trust boundaries. Treat every response byte, artifact, and header as potential intelligence. Minimize, normalize, and scope disclosure across all channels.</critical>
+
+<scope>
+- Errors and exception pages: stack traces, file paths, SQL, framework versions
+- Debug/dev tooling reachable in prod: debuggers, profilers, feature flags
+- DVCS/build artifacts and temp/backup files: .git, .svn, .hg, .bak, .swp, archives
+- Configuration and secrets: .env, phpinfo, appsettings.json, Docker/K8s manifests
+- API schemas and introspection: OpenAPI/Swagger, GraphQL introspection, gRPC reflection
+- Client bundles and source maps: webpack/Vite maps, embedded env, __NEXT_DATA__, static JSON
+- Headers and response metadata: Server/X-Powered-By, tracing, ETag, Accept-Ranges, Server-Timing
+- Storage/export surfaces: public buckets, signed URLs, export/download endpoints
+- Observability/admin: /metrics, /actuator, /health, tracing UIs (Jaeger, Zipkin), Kibana, Admin UIs
+- Directory listings and indexing: autoindex, sitemap/robots revealing hidden routes
+- Cross-origin signals: CORS misconfig, Referrer-Policy leakage, Expose-Headers
+- File/document metadata: EXIF, PDF/Office properties
+</scope>
+
+<methodology>
+1. Build a channel map: Web, API, GraphQL, WebSocket, gRPC, mobile, background jobs, exports, CDN.
+2. Establish a diff harness: compare owner vs non-owner vs anonymous across transports; normalize on status/body length/ETag/headers.
+3. Trigger controlled failures: send malformed types, boundary values, missing params, and alternate content-types to elicit error detail and stack traces.
+4. Enumerate artifacts: DVCS folders, backups, config endpoints, source maps, client bundles, API docs, observability routes.
+5. Correlate disclosures to impact: versions→CVE, paths→LFI/RCE, keys→cloud access, schemas→auth bypass, IDs→IDOR.
+</methodology>
+
+<surfaces>
+<errors_and_exceptions>
+- SQL/ORM errors: reveal table/column names, DBMS, query fragments
+- Stack traces: absolute paths, class/method names, framework versions, developer emails
+- Template engine probes: {% raw %}{{7*7}}, ${7*7}{% endraw %} identify templating stack and code paths
+- JSON/XML parsers: type mismatches and coercion logs leak internal model names
+</errors_and_exceptions>
+
+<debug_and_env_modes>
+- Debug pages and flags: Django DEBUG, Laravel Telescope, Rails error pages, Flask/Werkzeug debugger, ASP.NET customErrors Off
+- Profiler endpoints: /debug/pprof, /actuator, /_profiler, custom /debug APIs
+- Feature/config toggles exposed in JS or headers; admin/staff banners in HTML
+</debug_and_env_modes>
+
+<dvcs_and_backups>
+- DVCS: /.git/ (HEAD, config, index, objects), .svn/entries, .hg/store → reconstruct source and secrets
+- Backups/temp: .bak/.old/~/.swp/.swo/.tmp/.orig, db dumps, zipped deployments under /backup/, /old/, /archive/
+- Build artifacts: dist artifacts containing .map, env prints, internal URLs
+</dvcs_and_backups>
+
+<configs_and_secrets>
+- Classic: web.config, appsettings.json, settings.py, config.php, phpinfo.php
+- Containers/cloud: Dockerfile, docker-compose.yml, Kubernetes manifests, service account tokens, cloud credentials files
+- Credentials and connection strings; internal hosts and ports; JWT secrets
+</configs_and_secrets>
+
+<api_schemas_and_introspection>
+- OpenAPI/Swagger: /swagger, /api-docs, /openapi.json — enumerate hidden/privileged operations
+- GraphQL: introspection enabled; field suggestions; error disclosure via invalid fields; persisted queries catalogs
+- gRPC: server reflection exposing services/messages; proto download via reflection
+</api_schemas_and_introspection>
+
+<client_bundles_and_maps>
+- Source maps (.map) reveal original sources, comments, and internal logic
+- Client env leakage: NEXT_PUBLIC_/VITE_/REACT_APP_ variables; runtime config; embedded secrets accidentally shipped
+- Next.js data: __NEXT_DATA__ and pre-fetched JSON under /_next/data can include internal IDs, flags, or PII
+- Static JSON/CSV feeds used by the UI that bypass server-side auth filtering
+</client_bundles_and_maps>
+
+<headers_and_response_metadata>
+- Fingerprinting: Server, X-Powered-By, X-AspNet-Version
+- Tracing: X-Request-Id, traceparent, Server-Timing, debug headers
+- Caching oracles: ETag/If-None-Match, Last-Modified/If-Modified-Since, Accept-Ranges/Range (partial content reveals)
+- Content sniffing and MIME metadata that implies backend components
+</headers_and_response_metadata>
+
+<storage_and_exports>
+- Public object storage: S3/GCS/Azure blobs with world-readable ACLs or guessable keys
+- Signed URLs: long-lived, weakly scoped, re-usable across tenants; metadata leaks in headers
+- Export/report endpoints returning foreign data sets or unfiltered fields
+</storage_and_exports>
+
+<observability_and_admin>
+- Metrics: Prometheus /metrics exposing internal hostnames, process args, SQL, credentials by mistake
+- Health/config: /actuator/health, /actuator/env, Spring Boot info endpoints
+- Tracing UIs and dashboards: Jaeger/Zipkin/Kibana/Grafana exposed without auth
+</observability_and_admin>
+
+<directory_and_indexing>
+- Autoindex on /uploads/, /files/, /logs/, /tmp/, /assets/
+- Robots/sitemap reveal hidden paths, admin panels, export feeds
+</directory_and_indexing>
+
+<cross_origin_signals>
+- Referrer leakage: missing or permissive Referrer-Policy leading to path/query/token leaks to third parties
+- CORS: overly permissive Access-Control-Allow-Origin/Expose-Headers revealing data cross-origin; preflight error shapes
+</cross_origin_signals>
+
+<file_metadata>
+- EXIF, PDF/Office properties: authors, paths, software versions, timestamps, embedded objects
+</file_metadata>
+</surfaces>
+
+<advanced_techniques>
+<differential_oracles>
+- Compare owner vs non-owner vs anonymous for the same resource and track: status, length, ETag, Last-Modified, Cache-Control
+- HEAD vs GET: header-only differences can confirm existence or type without content
+- Conditional requests: 304 vs 200 behaviors leak existence/state; binary search content size via Range requests
+</differential_oracles>
+
+<cdn_and_cache_keys>
+- Identity-agnostic caches: CDN/proxy keys missing Authorization/tenant headers → cross-user cached responses
+- Vary misconfiguration: Vary covers User-Agent/Accept-Language but not Authorization, leaking alternate content across users
+- 206 partial content + stale caches leak object fragments
+</cdn_and_cache_keys>
+
+<cross_channel_mirroring>
+- Inconsistent hardening between REST, GraphQL, WebSocket, and gRPC; one channel leaks schema or fields hidden in others
+- SSR vs CSR: server-rendered pages omit fields while JSON API includes them; compare responses
+</cross_channel_mirroring>
+
+<introspection_and_reflection>
+- GraphQL: disabled introspection still leaks via errors, field suggestions, and client bundles containing schema
+- gRPC reflection: list services/messages and infer internal resource names and flows
+</introspection_and_reflection>
+
+<cloud_specific>
+- S3/GCS/Azure: anonymous listing disabled but object reads allowed; metadata headers leak owner/project identifiers
+- Pre-signed URLs: audience not bound; observe key scope and lifetime in URL params
+</cloud_specific>
+</advanced_techniques>
+
+<usefulness_assessment>
+- Actionable signals:
+  - Secrets/keys/tokens that grant new access (DB creds, cloud keys, JWT signing/refresh, signed URL secrets)
+  - Versions with a reachable, unpatched CVE on an exposed path
+  - Cross-tenant identifiers/data or per-user fields that differ by principal
+  - File paths, service hosts, or internal URLs that enable LFI/SSRF/RCE pivots
+  - Cache/CDN differentials (Vary/ETag/Range) that expose other users' content
+  - Schema/introspection revealing hidden operations or fields that return sensitive data
+- Likely benign or intended:
+  - Public docs or non-sensitive metadata explicitly documented as public
+  - Generic server names without precise versions or exploit path
+  - Redacted/sanitized fields with stable length/ETag across principals
+  - Per-user data visible only to the owner and consistent with privacy policy
+</usefulness_assessment>
+
+<triage_rubric>
+- Critical: Credentials/keys; signed URL secrets; config dumps; unrestricted admin/observability panels
+- High: Versions with reachable CVEs; cross-tenant data; caches serving cross-user content; schema enabling auth bypass
+- Medium: Internal paths/hosts enabling LFI/SSRF pivots; source maps revealing hidden endpoints/IDs
+- Low: Generic headers, marketing versions, intended documentation without exploit path
+- Guidance: Always attempt a minimal, reversible proof for Critical/High; if no safe chain exists, document precise blocker and downgrade
+</triage_rubric>
+
+<escalation_playbook>
+- If DVCS/backups/configs → extract secrets; test least-privileged read; rotate after coordinated disclosure
+- If versions → map to CVE; verify exposure; execute minimal PoC under strict scope
+- If schema/introspection → call hidden/privileged fields with non-owner tokens; confirm auth gaps
+- If source maps/client JSON → mine endpoints/IDs/flags; pivot to IDOR/listing; validate filtering
+- If cache/CDN keys → demonstrate cross-user cache leak via Vary/ETag/Range; escalate to broken access control
+- If paths/hosts → target LFI/SSRF with harmless reads (e.g., /etc/hostname, metadata headers); avoid destructive actions
+- If observability/admin → enumerate read-only info first; prove data scope breach; avoid write/exec operations
+</escalation_playbook>
+
+<exploitation_chains>
+<credential_extraction>
+- DVCS/config dumps exposing secrets (DB, SMTP, JWT, cloud)
+- Keys → cloud control plane access; rotate and verify scope
+</credential_extraction>
+
+<version_to_cve>
+1. Derive precise component versions from headers/errors/bundles.
+2. Map to known CVEs and confirm reachability.
+3. Execute minimal proof targeting disclosed component.
+</version_to_cve>
+
+<path_disclosure_to_lfi>
+1. Paths from stack traces/templates reveal filesystem layout.
+2. Use LFI/traversal to fetch config/keys.
+3. Prove controlled access without altering state.
+</path_disclosure_to_lfi>
+
+<schema_to_auth_bypass>
+1. Schema reveals hidden fields/endpoints.
+2. Attempt requests with those fields; confirm missing authorization or field filtering.
+</schema_to_auth_bypass>
+</exploitation_chains>
+
+<validation>
+1. Provide raw evidence (headers/body/artifact) and explain exact data revealed.
+2. Determine intent: cross-check docs/UX; classify per triage rubric (Critical/High/Medium/Low).
+3. Attempt minimal, reversible exploitation or present a concrete step-by-step chain (what to try next and why).
+4. Show reproducibility and minimal request set; include cross-channel confirmation where applicable.
+5. Bound scope (user, tenant, environment) and data sensitivity classification.
+</validation>
+
+<false_positives>
+- Intentional public docs or non-sensitive metadata with no exploit path
+- Generic errors with no actionable details
+- Redacted fields that do not change differential oracles (length/ETag stable)
+- Version banners with no exposed vulnerable surface and no chain
+- Owner-visible-only details that do not cross identity/tenant boundaries
+</false_positives>
+
+<impact>
+- Accelerated exploitation of RCE/LFI/SSRF via precise versions and paths
+- Credential/secret exposure leading to persistent external compromise
+- Cross-tenant data disclosure through exports, caches, or mis-scoped signed URLs
+- Privacy/regulatory violations and business intelligence leakage
+</impact>
+
+<pro_tips>
+1. Start with artifacts (DVCS, backups, maps) before payloads; artifacts yield the fastest wins.
+2. Normalize responses and diff by digest to reduce noise when comparing roles.
+3. Hunt source maps and client data JSON; they often carry internal IDs and flags.
+4. Probe caches/CDNs for identity-unaware keys; verify Vary includes Authorization/tenant.
+5. Treat introspection and reflection as configuration findings across GraphQL/gRPC; validate per environment.
+6. Mine observability endpoints last; they are noisy but high-yield in misconfigured setups.
+7. Chain quickly to a concrete risk and stop—proof should be minimal and reversible.
+</pro_tips>
+
+<remember>Information disclosure is an amplifier. Convert leaks into precise, minimal exploits or clear architectural risks.</remember>
+</information_disclosure_vulnerability_guide>
--- a/strix/prompts/vulnerabilities/open_redirect.jinja
+++ b/strix/prompts/vulnerabilities/open_redirect.jinja
@@ -0,0 +1,177 @@
+<open_redirect_vulnerability_guide>
+<title>OPEN REDIRECT</title>
+
+<critical>Open redirects enable phishing, OAuth/OIDC code and token theft, and allowlist bypass in server-side fetchers that follow redirects. Treat every redirect target as untrusted: canonicalize and enforce exact allowlists per scheme, host, and path.</critical>
+
+<scope>
+- Server-driven redirects (HTTP 3xx Location) and client-driven redirects (window.location, meta refresh, SPA routers)
+- OAuth/OIDC/SAML flows using redirect_uri, post_logout_redirect_uri, RelayState, returnTo/continue/next
+- Multi-hop chains where only the first hop is validated
+- Allowlist/canonicalization bypasses across URL parsers and reverse proxies
+</scope>
+
+<methodology>
+1. Inventory all redirect surfaces: login/logout, password reset, SSO/OAuth flows, payment gateways, email links, invite/verification, unsubscribe, language/locale switches, /out or /r redirectors.
+2. Build a test matrix of scheme×host×path variants and encoding/unicode forms. Compare server-side validation vs browser navigation results.
+3. Exercise multi-hop: trusted-domain → redirector → external. Verify if validation applies pre- or post-redirect.
+4. Prove impact: credential phishing, OAuth code interception, internal egress (if a server fetcher follows redirects).
+</methodology>
+
+<discovery_techniques>
+<injection_points>
+- Params: redirect, url, next, return_to, returnUrl, continue, goto, target, callback, out, dest, back, to, r, u
+- OAuth/OIDC/SAML: redirect_uri, post_logout_redirect_uri, RelayState, state (if used to compute final destination)
+- SPA: router.push/replace, location.assign/href, meta refresh, window.open
+- Headers influencing construction: Host, X-Forwarded-Host/Proto, Referer; and server-side Location echo
+</injection_points>
+
+<parser_differentials>
+<userinfo>
+https://trusted.com@evil.com → many validators parse host as trusted.com, browser navigates to evil.com
+Variants: trusted.com%40evil.com, a%40evil.com%40trusted.com
+</userinfo>
+
+<backslash_and_slashes>
+https://trusted.com\\evil.com, https://trusted.com\\@evil.com, ///evil.com, /\\evil.com
+Windows/backends may normalize \\ to /; browsers differ on interpretation of extra leading slashes
+</backslash_and_slashes>
+
+<whitespace_and_ctrl>
+http%09://evil.com, http%0A://evil.com, trusted.com%09evil.com
+Control/whitespace around the scheme/host can split parsers
+</whitespace_and_ctrl>
+
+<fragment_and_query>
+trusted.com#@evil.com, trusted.com?//@evil.com, ?next=//evil.com#@trusted.com
+Validators often stop at # while the browser parses after it
+</fragment_and_query>
+
+<unicode_and_idna>
+Punycode/IDN: truѕted.com (Cyrillic), trusted.com。evil.com (ideographic full stop), trailing dot trusted.com.
+Test with mixed Unicode normalization and IDNA conversion
+</unicode_and_idna>
+</parser_differentials>
+
+<encoding_bypasses>
+- Single and double encoding: %2f%2fevil.com, %252f%252fevil.com
+- Mixed case and scheme smuggling: hTtPs://evil.com, http:evil.com
+- IP variants: decimal 2130706433, octal 0177.0.0.1, hex 0x7f.1, IPv6 [::ffff:127.0.0.1]
+- User-controlled path bases: /out?url=/\\evil.com
+</encoding_bypasses>
+</discovery_techniques>
+
+<allowlist_evasion>
+<common_mistakes>
+- Substring/regex contains checks: allow trusted.com.evil.com, or path matches that leak to external hosts
+- Wildcards: *.trusted.com also matches attacker.trusted.com.evil.net
+- Missing scheme pinning: data:, javascript:, file:, gopher: accepted
+- Case/IDN drift between validator and browser
+</common_mistakes>
+
+<robust_validation>
+- Canonicalize with a single modern URL parser (WHATWG URL) and compare exact scheme, hostname (post-IDNA), and an explicit allowlist with optional exact path prefixes
+- Require absolute HTTPS; reject protocol-relative // and unknown schemes
+- Normalize and compare the target without following redirects; if redirects must be followed, re-validate the destination at every hop server-side
+</robust_validation>
+</allowlist_evasion>
+
+<oauth_oidc_saml>
+<redirect_uri_abuse>
+- Using an open redirect on a trusted domain for redirect_uri enables code interception
+- Weak prefix/suffix checks: https://trusted.com → https://trusted.com.evil.com; /callback → /callback@evil.com
+- Path traversal/canonicalization: /oauth/../../@evil.com
+- post_logout_redirect_uri often less strictly validated; test both
+- state must be unguessable and bound to client/session; do not recompute final destination from state without validation
+</redirect_uri_abuse>
+
+<defense_notes>
+- Pre-register exact redirect_uri values per client (no wildcards). Enforce exact scheme/host/port/path match
+- For public native apps, follow RFC 8252 (loopback 127.0.0.1 redirect, with the port allowed to vary per request); disallow open web redirectors
+- SAML RelayState should be validated against an allowlist or ignored for absolute URLs
+</defense_notes>
+</oauth_oidc_saml>
+
+<client_side_vectors>
+<javascript_redirects>
+- location.href/assign/replace using user input; ensure targets are normalized and restricted to same-origin or allowlist
+- meta refresh content=0;url=USER_INPUT; browsers treat javascript:/data: differently; still dangerous in client-controlled redirects
+- SPA routers: router.push(searchParams.get('next')); enforce same-origin and strip schemes
+</javascript_redirects>
+
+</client_side_vectors>
+
+<reverse_proxies_and_gateways>
+- Host/X-Forwarded-* may change absolute URL construction; validate against server-derived canonical origin, not client headers
+- CDNs that follow redirects for link checking or prefetching can leak tokens when chained with open redirects
+</reverse_proxies_and_gateways>
+
+<ssrf_chaining>
+- Some server-side fetchers (web previewers, link unfurlers, validators) follow 3xx; combine with an open redirect on an allowlisted domain to pivot to internal targets (169.254.169.254, localhost, cluster addresses)
+- Confirm by observing distinct error/timing for internal vs external, or OAST callbacks when reachable
+</ssrf_chaining>
+
+<framework_notes>
+<server_side>
+- Rails: redirect_to params[:url] without URI parsing; test array params and protocol-relative
+- Django: HttpResponseRedirect(request.GET['next']) without url_has_allowed_host_and_scheme (formerly is_safe_url); safe redirects rely on allowed-host and scheme checks
+- Spring: return "redirect:" + param; ensure UriComponentsBuilder normalization and allowlist
+- Express: res.redirect(req.query.url); use a safe redirect helper enforcing relative paths or a vetted allowlist
+</server_side>
+
+<client_side>
+- React/Next.js/Vue/Angular routing based on URLSearchParams; ensure same-origin policy and disallow external schemes in client code
+</client_side>
+</framework_notes>
+
+<exploitation_scenarios>
+<oauth_code_interception>
+1. Set redirect_uri to https://trusted.example/out?url=https://attacker.tld/cb
+2. IdP sends code to trusted.example which redirects to attacker.tld
+3. Exchange code for tokens; demonstrate account access
+</oauth_code_interception>
+
+<phishing_flow>
+1. Send link on trusted domain: /login?next=https://attacker.tld/fake
+2. Victim authenticates; browser navigates to attacker page
+3. Capture credentials/tokens via cloned UI or injected JS
+</phishing_flow>
+
+<internal_evasion>
+1. Server-side link unfurler fetches https://trusted.example/out?u=http://169.254.169.254/latest/meta-data
+2. The fetcher follows the redirect to the metadata service; confirm via timing/headers or controlled endpoints
+</internal_evasion>
+</exploitation_scenarios>
+
+<validation>
+1. Produce a minimal URL that navigates to an external domain via the vulnerable surface; include the full address bar capture.
+2. Show bypass of the stated validation (regex/allowlist) using canonicalization variants.
+3. Test multi-hop: prove only first hop is validated and second hop escapes constraints.
+4. For OAuth/SAML, demonstrate code/RelayState delivery to an attacker-controlled endpoint with role-separated evidence.
+</validation>
+
+<false_positives>
+- Redirects constrained to relative same-origin paths with robust normalization
+- Exact pre-registered OAuth redirect_uri with strict verifier
+- Validators using a single canonical parser and comparing post-IDNA host and scheme
+- User prompts that show the exact final destination before navigating and refuse unknown schemes
+</false_positives>
+
+<impact>
+- Credential and token theft via phishing and OAuth/OIDC interception
+- Internal data exposure when server fetchers follow redirects (previewers/unfurlers)
+- Policy bypass where allowlists are enforced only on the first hop
+- Cross-application trust erosion and brand abuse
+</impact>
+
+<pro_tips>
+1. Always compare server-side canonicalization to real browser navigation; differences reveal bypasses.
+2. Try userinfo, protocol-relative, Unicode/IDN, and IP numeric variants early; they catch many weak validators.
+3. In OAuth, prioritize post_logout_redirect_uri and less-discussed flows; they’re often looser.
+4. Exercise multi-hop across distinct subdomains and paths; validators commonly check only hop 1.
+5. For SSRF chaining, target services known to follow redirects and log their outbound requests.
+6. Favor allowlists of exact origins plus optional path prefixes; never substring/regex contains checks.
+7. Keep a curated suite of redirect payloads per runtime (Java, Node, Python, Go) reflecting each parser’s quirks.
+</pro_tips>
+
+<remember>Redirection is safe only when the final destination is constrained after canonicalization. Enforce exact origins, verify per hop, and treat client-provided destinations as untrusted across every stack.</remember>
+</open_redirect_vulnerability_guide>
--- a/strix/prompts/vulnerabilities/subdomain_takeover.jinja
+++ b/strix/prompts/vulnerabilities/subdomain_takeover.jinja
@@ -0,0 +1,155 @@
+<subdomain_takeover_guide>
+<title>SUBDOMAIN TAKEOVER</title>
+
+<critical>Subdomain takeover lets an attacker serve content from a trusted subdomain by claiming resources referenced by dangling DNS (CNAME/A/ALIAS/NS) or mis-bound provider configurations. Consequences include phishing on a trusted origin, cookie and CORS pivot, OAuth redirect abuse, and CDN cache poisoning.</critical>
+
+<scope>
+- Dangling CNAME/A/ALIAS to third-party services (hosting, storage, serverless, CDN)
+- Orphaned NS delegations (child zones with abandoned/expired nameservers)
+- Decommissioned SaaS integrations (support, docs, marketing, forms) referenced via CNAME
+- CDN “alternate domain” mappings (CloudFront/Fastly/Azure CDN) lacking ownership verification
+- Storage and static hosting endpoints (S3/Blob/GCS buckets, GitHub/GitLab Pages)
+</scope>
+
+<methodology>
+1. Enumerate subdomains comprehensively (web, API, mobile, legacy): aggregate CT logs, passive DNS, and org inventory. De-duplicate and normalize.
+2. Resolve DNS for all RR types: A/AAAA, CNAME, NS, MX, TXT. Keep CNAME chains; record terminal CNAME targets and provider hints.
+3. HTTP/TLS probe: capture status, body, length, canonical error text, Server/alt-svc headers, certificate SANs, and CDN headers (Via, X-Served-By).
+4. Fingerprint providers: map known “unclaimed/missing resource” signatures to candidate services. Maintain a living dictionary.
+5. Attempt claim (only with authorization): create the missing resource on the provider with the exact required name; bind the custom domain if the provider allows.
+6. Validate control: serve a minimal unique payload; confirm over HTTPS; optionally obtain a DV certificate (CT log evidence) within legal scope.
+</methodology>
+
+<discovery_techniques>
+<enumeration_pipeline>
+- Subdomain inventory: combine CT (crt.sh APIs), passive DNS sources, in-house asset lists, IaC/terraform outputs, mobile app assets, and historical DNS
+- Resolver sweep: use IPv4/IPv6-aware resolvers; track NXDOMAIN vs SERVFAIL vs provider-branded 4xx/5xx responses
+- Record graph: build a CNAME graph and collapse chains to identify external endpoints (e.g., myapp.example.com → foo.azurewebsites.net)
+</enumeration_pipeline>
+
+<dns_indicators>
+- CNAME targets ending in provider domains: github.io, amazonaws.com, cloudfront.net, azurewebsites.net, blob.core.windows.net, fastly.net, vercel.app, netlify.app, herokudns.com, trafficmanager.net, azureedge.net, akamaized.net
+- Orphaned NS: subzone delegated to nameservers on a domain that has expired or no longer hosts authoritative servers, or to nonexistent NS hosts
+- MX to third-party mail providers with decommissioned domains (risk: mail subdomain control or delivery manipulation)
+- TXT/verification artifacts (asuid, _dnsauth, _github-pages-challenge) suggesting previous external bindings
+</dns_indicators>
+
+<http_fingerprints>
+- Service-specific unclaimed messages (examples, not exhaustive):
+  - GitHub Pages: “There isn’t a GitHub Pages site here.”
+  - Fastly: “Fastly error: unknown domain”
+  - Heroku: “No such app” or “There’s nothing here, yet.”
+  - S3 static site: “NoSuchBucket” / “The specified bucket does not exist”
+  - CloudFront (alt domain not configured): 403/400 with “The request could not be satisfied” and no matching distribution
+  - Azure App Service: default 404 for azurewebsites.net unless custom-domain verified (look for asuid TXT requirement)
+  - Shopify: “Sorry, this shop is currently unavailable”
+- TLS clues: certificate CN/SAN referencing provider default host instead of the custom subdomain indicates potential mis-binding
+</http_fingerprints>
+</discovery_techniques>
+
+<exploitation_techniques>
+<claim_third_party_resource>
+- Create the resource with the exact required name:
+  - Storage/hosting: S3 bucket “sub.example.com” (website endpoint) or bucket named after the CNAME target if provider dictates
+  - Pages hosting: create repo/site and add the custom domain (when provider does not enforce prior domain verification)
+  - Serverless/app hosting: create app/site matching the target hostname, then add custom domain mapping
+- Bind the custom domain: some providers require TXT verification (modern hardened path), others historically allowed binding without proof
+</claim_third_party_resource>
+
+<cdn_alternate_domains>
+- Add the victim subdomain as an alternate domain on your CDN distribution if the provider does not enforce domain ownership checks
+- Upload a TLS cert via provider or use managed cert issuance if allowed; confirm 200 on the subdomain with your content
+</cdn_alternate_domains>
+
+<ns_delegation_takeover>
+- If a child zone (e.g., zone.example.com) is delegated to nameservers under an expired domain (ns1.abandoned.tld), register abandoned.tld and host authoritative NS; publish records to control all hosts under the delegated subzone
+- Validate with SOA/NS queries and serve a verification token; then add A/CNAME/MX/TXT as needed
+</ns_delegation_takeover>
+
+<mail_surface>
+- If MX points to a decommissioned provider that allowed inbox creation without domain re-verification (historically), a takeover could enable email receipt for that subdomain; modern providers generally require explicit TXT ownership
+</mail_surface>
+</exploitation_techniques>
+
+<advanced_techniques>
+<blind_and_cache_channels>
+- CDN edge behavior: 404/421 vs 403 differentials reveal whether an alt name is partially configured; probe with Host header manipulation
+- Cache poisoning: once taken over, exploit cache keys and Vary headers to persist malicious responses at the edge
+</blind_and_cache_channels>
+
+<ct_and_tls>
+- Use CT logs to detect unexpected certificate issuance for your subdomain; for PoC, issue a DV cert post-takeover (within scope) to produce verifiable evidence
+</ct_and_tls>
+
+<oauth_and_trust_chains>
+- If the subdomain is whitelisted as an OAuth redirect/callback or in CSP/script-src, a takeover elevates impact to account takeover or script injection on trusted origins
+</oauth_and_trust_chains>
+
+<provider_edges>
+- Many providers hardened domain binding (TXT verification) but legacy projects or specific products remain weak; verify per-product behavior (CDN vs app hosting vs storage)
+- Multi-tenant providers sometimes accept custom domains at the edge even when backend resource is missing; leverage timing and registration windows
+</provider_edges>
+</advanced_techniques>
+
+<bypass_techniques>
+<verification_gaps>
+- Look for providers that accept domain binding prior to TXT verification, or where verification is optional for trial/legacy tiers
+- Race windows: re-claim resource names immediately after victim deletion while DNS still points to provider
+</verification_gaps>
+
+<wildcards_and_fallbacks>
+- Wildcard CNAMEs to providers may expose unbounded subdomains; test random hosts to identify service-wide unclaimed behavior
+- Fallback origins: CDNs configured with multiple origins may expose unknown-domain responses from a default origin that is claimable
+</wildcards_and_fallbacks>
+</bypass_techniques>
+
+<special_contexts>
+<storage_and_static>
+- S3/GCS/Azure Blob static sites: bucket naming constraints dictate whether a bucket can match hostname; website vs API endpoints differ in claimability and fingerprints
+</storage_and_static>
+
+<serverless_and_hosting>
+- GitHub/GitLab Pages, Netlify, Vercel, Azure Static Web Apps: domain binding flows vary; most require TXT now, but historical projects or specific paths may not
+</serverless_and_hosting>
+
+<cdn_and_edge>
+- CloudFront/Fastly/Azure CDN/Akamai: alternate domain verification differs; some products historically allowed alt-domain claims without proof
+</cdn_and_edge>
+
+<dns_delegations>
+- Child-zone NS delegations outrank parent records; control of delegated NS yields full control of all hosts below that label
+</dns_delegations>
+</special_contexts>
+
+<validation>
+1. Before: record DNS chain, HTTP response (status/body length/fingerprint), and TLS details.
+2. After claim: serve unique content and verify over HTTPS at the target subdomain.
+3. Optional: issue a DV certificate (legal scope) and reference CT entry as durable evidence.
+4. Demonstrate impact chains (CSP/script-src trust, OAuth redirect acceptance, cookie Domain scoping) with minimal PoCs.
+</validation>
+
+<false_positives>
+- “Unknown domain” pages that are not claimable due to enforced TXT/ownership checks
+- Provider-branded default pages for valid, owned resources (not a takeover) versus “unclaimed resource” states
+- Soft 404s from your own infrastructure or catch-all vhosts
+</false_positives>
+
+<impact>
+- Content injection under trusted subdomain: phishing, malware delivery, brand damage
+- Cookie and CORS pivot: if parent site sets Domain-scoped cookies or allows subdomain origins in CORS/Trusted Types/CSP
+- OAuth/SSO abuse via whitelisted redirect URIs
+- Email delivery manipulation for subdomain (MX/DMARC/SPF interactions in edge cases)
+</impact>
+
+<pro_tips>
+1. Build a pipeline: enumerate (subfinder/amass) → resolve (dnsx) → probe (httpx) → fingerprint (nuclei/custom) → verify claims.
+2. Maintain a current fingerprint corpus; provider messages change frequently—prefer regex families over exact strings.
+3. Prefer minimal PoCs: static “ownership proof” page and, where allowed, DV cert issuance for auditability.
+4. Monitor CT for unexpected certs on your subdomains; alert and investigate.
+5. Eliminate dangling DNS in decommission workflows first; deletion of the app/service must remove or block the DNS target.
+6. For NS delegations, treat any expired nameserver domain as critical; reassign or remove delegation immediately.
+7. Use CAA to limit certificate issuance while you triage; it reduces the blast radius for taken-over hosts.
+</pro_tips>
+
+<remember>Subdomain safety is lifecycle safety: if DNS points at anything, you must own and verify the thing on every provider and product path. Remove or verify—there is no safe middle.</remember>
+</subdomain_takeover_guide>
--- a/strix/runtime/docker_runtime.py
+++ b/strix/runtime/docker_runtime.py
@@ -203,7 +203,7 @@ class DockerRuntime(AbstractRuntime):
                all=True, filters={"label": f"strix-scan-id={scan_id}"}
            )
            if containers:
-                container = cast("Container", containers[0])
+                container = containers[0]
                if container.status != "running":
                    container.start()
                    time.sleep(2)
@@ -358,11 +358,7 @@ class DockerRuntime(AbstractRuntime):
            container = self.client.containers.get(container_id)
            container.reload()

-            host = "127.0.0.1"
-            if "DOCKER_HOST" in os.environ:
-                docker_host = os.environ["DOCKER_HOST"]
-                if "://" in docker_host:
-                    host = docker_host.split("://")[1].split(":")[0]
+            host = self._resolve_docker_host()

        except NotFound:
            raise ValueError(f"Container {container_id} not found.") from None
@@ -371,6 +367,20 @@ class DockerRuntime(AbstractRuntime):
        else:
            return f"http://{host}:{port}"

+    def _resolve_docker_host(self) -> str:
+        docker_host = os.getenv("DOCKER_HOST", "")
+        if not docker_host:
+            return "127.0.0.1"
+
+        from urllib.parse import urlparse
+
+        parsed = urlparse(docker_host)
+
+        if parsed.scheme in ("tcp", "http", "https") and parsed.hostname:
+            return parsed.hostname
+
+        return "127.0.0.1"
+
    async def destroy_sandbox(self, container_id: str) -> None:
        logger.info("Destroying scan container %s", container_id)
        try:
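Note on `_resolve_docker_host`: a standalone sketch of the same parsing logic, with the `DOCKER_HOST` value passed as an argument so it can be exercised without touching the environment. The function name `resolve_docker_host` and the sample values are illustrative assumptions, not part of this change.

```python
from urllib.parse import urlparse


def resolve_docker_host(docker_host: str) -> str:
    """Mirror of the host resolution logic, taking the DOCKER_HOST value directly."""
    if not docker_host:
        return "127.0.0.1"

    parsed = urlparse(docker_host)

    # Only network schemes carry a usable hostname; everything else falls back.
    if parsed.scheme in ("tcp", "http", "https") and parsed.hostname:
        return parsed.hostname

    return "127.0.0.1"


for value in ("", "tcp://192.168.1.10:2376", "unix:///var/run/docker.sock", "ssh://user@remote"):
    print(repr(value), "->", resolve_docker_host(value))
```

The removed `split("://")[1].split(":")[0]` approach would have returned the socket path for a `unix://` value, whereas the `urlparse` version falls back to `127.0.0.1` for any scheme outside `tcp`, `http`, and `https`.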
--- a/strix/telemetry/tracer.py
+++ b/strix/telemetry/tracer.py
@@ -50,6 +50,7 @@ class Tracer:
        self._run_dir: Path | None = None
        self._next_execution_id = 1
        self._next_message_id = 1
+        self._saved_vuln_ids: set[str] = set()

        self.vulnerability_found_callback: Callable[[str, str, str, str], None] | None = None

@@ -59,7 +60,7 @@ class Tracer:

    def get_run_dir(self) -> Path:
        if self._run_dir is None:
-            runs_dir = Path.cwd() / "agent_runs"
+            runs_dir = Path.cwd() / "strix_runs"
            runs_dir.mkdir(exist_ok=True)

            run_dir_name = self.run_name if self.run_name else self.run_id
@@ -92,6 +93,7 @@ class Tracer:
                report_id, title.strip(), content.strip(), severity.lower().strip()
            )

+        self.save_run_data()
        return report_id

    def set_final_scan_result(
@@ -108,6 +110,7 @@ class Tracer:
        }

        logger.info(f"Set final scan result: success={success}")
+        self.save_run_data(mark_complete=True)

    def log_agent_creation(
        self, agent_id: str, name: str, task: str, parent_id: str | None = None
@@ -197,11 +200,13 @@ class Tracer:
                "max_iterations": config.get("max_iterations", 200),
            }
        )
+        self.get_run_dir()

-    def save_run_data(self) -> None:
+    def save_run_data(self, mark_complete: bool = False) -> None:
        try:
            run_dir = self.get_run_dir()
-            self.end_time = datetime.now(UTC).isoformat()
+            if mark_complete:
+                self.end_time = datetime.now(UTC).isoformat()

            if self.final_scan_result:
                penetration_test_report_file = run_dir / "penetration_test_report.md"
@@ -219,13 +224,13 @@ class Tracer:
                vuln_dir = run_dir / "vulnerabilities"
                vuln_dir.mkdir(exist_ok=True)

-                severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
-                sorted_reports = sorted(
-                    self.vulnerability_reports,
-                    key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
-                )
+                new_reports = [
+                    report
+                    for report in self.vulnerability_reports
+                    if report["id"] not in self._saved_vuln_ids
+                ]

-                for report in sorted_reports:
+                for report in new_reports:
                    vuln_file = vuln_dir / f"{report['id']}.md"
                    with vuln_file.open("w", encoding="utf-8") as f:
                        f.write(f"# {report['title']}\n\n")
@@ -234,30 +239,39 @@ class Tracer:
                        f.write(f"**Found:** {report['timestamp']}\n\n")
                        f.write("## Description\n\n")
                        f.write(f"{report['content']}\n")
+                    self._saved_vuln_ids.add(report["id"])

-                vuln_csv_file = run_dir / "vulnerabilities.csv"
-                with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
-                    import csv
+                if self.vulnerability_reports:
+                    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
+                    sorted_reports = sorted(
+                        self.vulnerability_reports,
+                        key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
+                    )

-                    fieldnames = ["id", "title", "severity", "timestamp", "file"]
-                    writer = csv.DictWriter(f, fieldnames=fieldnames)
-                    writer.writeheader()
+                    vuln_csv_file = run_dir / "vulnerabilities.csv"
+                    with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
+                        import csv

-                    for report in sorted_reports:
-                        writer.writerow(
-                            {
-                                "id": report["id"],
-                                "title": report["title"],
-                                "severity": report["severity"].upper(),
-                                "timestamp": report["timestamp"],
-                                "file": f"vulnerabilities/{report['id']}.md",
-                            }
-                        )
+                        fieldnames = ["id", "title", "severity", "timestamp", "file"]
+                        writer = csv.DictWriter(f, fieldnames=fieldnames)
+                        writer.writeheader()

-                logger.info(
-                    f"Saved {len(self.vulnerability_reports)} vulnerability reports to: {vuln_dir}"
-                )
-                logger.info(f"Saved vulnerability index to: {vuln_csv_file}")
+                        for report in sorted_reports:
+                            writer.writerow(
+                                {
+                                    "id": report["id"],
+                                    "title": report["title"],
+                                    "severity": report["severity"].upper(),
+                                    "timestamp": report["timestamp"],
+                                    "file": f"vulnerabilities/{report['id']}.md",
+                                }
+                            )
+
+                if new_reports:
+                    logger.info(
+                        f"Saved {len(new_reports)} new vulnerability report(s) to: {vuln_dir}"
+                    )
+                logger.info(f"Updated vulnerability index: {vuln_csv_file}")

            logger.info(f"📊 Essential scan data saved to: {run_dir}")

@@ -320,4 +334,4 @@ class Tracer:
        }

    def cleanup(self) -> None:
-        self.save_run_data()
+        self.save_run_data(mark_complete=True)
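Note on the incremental save in `save_run_data`: a minimal sketch of the "write only unseen report ids" pattern, reduced to in-memory bookkeeping. `ReportWriter` and its attributes are hypothetical stand-ins for the `Tracer` fields, and the file write is replaced by appending to a list so the behavior is easy to check.

```python
class ReportWriter:
    """Hypothetical stand-in for the Tracer's incremental report saving."""

    def __init__(self) -> None:
        self.reports: list[dict[str, str]] = []
        self._saved_ids: set[str] = set()
        self.written: list[str] = []  # ids "written" so far, in order

    def save(self) -> int:
        """Write reports not saved before and return how many were written."""
        new_reports = [r for r in self.reports if r["id"] not in self._saved_ids]
        for report in new_reports:
            self.written.append(report["id"])  # stands in for writing the .md file
            self._saved_ids.add(report["id"])
        return len(new_reports)
```

Because `save_run_data` is now called after every report, tracking saved ids keeps repeated calls from rewriting files that already exist on disk, while the CSV index is still regenerated in full each time.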
--- a/strix/tools/init.py
+++ b/strix/tools/init.py
@@ -24,9 +24,13 @@ SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"

 HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))

+DISABLE_BROWSER = os.getenv("STRIX_DISABLE_BROWSER", "false").lower() == "true"
+
 if not SANDBOX_MODE:
    from .agents_graph import *  # noqa: F403
-    from .browser import *  # noqa: F403
+
+    if not DISABLE_BROWSER:
+        from .browser import *  # noqa: F403
    from .file_edit import *  # noqa: F403
    from .finish import *  # noqa: F403
    from .notes import *  # noqa: F403
@@ -35,13 +39,14 @@ if not SANDBOX_MODE:
    from .reporting import *  # noqa: F403
    from .terminal import *  # noqa: F403
    from .thinking import *  # noqa: F403
+    from .todo import *  # noqa: F403

    if HAS_PERPLEXITY_API:
        from .web_search import *  # noqa: F403
 else:
-    from .browser import *  # noqa: F403
+    if not DISABLE_BROWSER:
+        from .browser import *  # noqa: F403
    from .file_edit import *  # noqa: F403
-    from .notes import *  # noqa: F403
    from .proxy import *  # noqa: F403
    from .python import *  # noqa: F403
    from .terminal import *  # noqa: F403
--- a/strix/tools/agents_graph/agents_graph_actions.py
+++ b/strix/tools/agents_graph/agents_graph_actions.py
@@ -230,9 +230,18 @@ def create_agent(

        state = AgentState(task=task, agent_name=name, parent_id=parent_id, max_iterations=300)

-        llm_config = LLMConfig(prompt_modules=module_list)
-
        parent_agent = _agent_instances.get(parent_id)
+
+        timeout = None
+        scan_mode = "deep"
+        if parent_agent and hasattr(parent_agent, "llm_config"):
+            if hasattr(parent_agent.llm_config, "timeout"):
+                timeout = parent_agent.llm_config.timeout
+            if hasattr(parent_agent.llm_config, "scan_mode"):
+                scan_mode = parent_agent.llm_config.scan_mode
+
+        llm_config = LLMConfig(prompt_modules=module_list, timeout=timeout, scan_mode=scan_mode)
+
        agent_config = {
            "llm_config": llm_config,
            "state": state,
--- a/strix/tools/agents_graph/agents_graph_actions_schema.xml
+++ b/strix/tools/agents_graph/agents_graph_actions_schema.xml
@@ -87,10 +87,6 @@ Only create a new agent if no existing agent is handling the specific task.</des
      <description>Response containing: - agent_id: Unique identifier for the created agent - success: Whether the agent was created successfully - message: Status message - agent_info: Details about the created agent</description>
    </returns>
    <examples>
-  # REQUIRED: Check agent graph again before creating another agent
-  <function=view_agent_graph>
-  </function>
-
  # After confirming no SQL testing agent exists, create agent for vulnerability validation
  <function=create_agent>
  <parameter=task>Validate and exploit the suspected SQL injection vulnerability found in
@@ -125,11 +121,16 @@ Only create a new agent if no existing agent is handling the specific task.</des
  </tool>
  <tool name="send_message_to_agent">
    <description>Send a message to another agent in the graph for coordination and communication.</description>
-    <details>This enables agents to communicate with each other during execution for:
+    <details>This enables agents to communicate with each other during execution, but should be used only when essential:
  - Sharing discovered information or findings
  - Asking questions or requesting assistance
  - Providing instructions or coordination
-  - Reporting status or results</details>
+  - Reporting status or results
+
+Best practices:
+- Avoid routine status updates; batch non-urgent information
+- Prefer parent/child completion flows (agent_finish)
+- Do not message when the context is already known</details>
    <parameters>
      <parameter name="target_agent_id" type="string" required="true">
        <description>ID of the agent to send the message to</description>
--- a/strix/tools/browser/browser_actions.py
+++ b/strix/tools/browser/browser_actions.py
@@ -1,8 +1,10 @@
-from typing import Any, Literal, NoReturn
+from typing import TYPE_CHECKING, Any, Literal, NoReturn

 from strix.tools.registry import register_tool

-from .tab_manager import BrowserTabManager, get_browser_tab_manager
+
+if TYPE_CHECKING:
+    from .tab_manager import BrowserTabManager


 BrowserAction = Literal[
@@ -71,7 +73,7 @@ def _validate_file_path(action_name: str, file_path: str | None) -> None:


 def _handle_navigation_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    url: str | None = None,
    tab_id: str | None = None,
@@ -90,7 +92,7 @@ def _handle_navigation_actions(


 def _handle_interaction_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    coordinate: str | None = None,
    text: str | None = None,
@@ -128,7 +130,7 @@ def _raise_unknown_action(action: str) -> NoReturn:


 def _handle_tab_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    url: str | None = None,
    tab_id: str | None = None,
@@ -149,7 +151,7 @@ def _handle_tab_actions(


 def _handle_utility_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    duration: float | None = None,
    js_code: str | None = None,
@@ -191,6 +193,8 @@ def browser_action(
    file_path: str | None = None,
    clear: bool = False,
 ) -> dict[str, Any]:
+    from .tab_manager import get_browser_tab_manager
+
    manager = get_browser_tab_manager()

    try:
--- a/strix/tools/executor.py
+++ b/strix/tools/executor.py
@@ -17,6 +17,10 @@ from .registry import (
 )


+SANDBOX_EXECUTION_TIMEOUT = float(os.getenv("STRIX_SANDBOX_EXECUTION_TIMEOUT", "500"))
+SANDBOX_CONNECT_TIMEOUT = float(os.getenv("STRIX_SANDBOX_CONNECT_TIMEOUT", "10"))
+
+
 async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
    execute_in_sandbox = should_execute_in_sandbox(tool_name)
    sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
@@ -62,10 +66,15 @@ async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: A
        "Content-Type": "application/json",
    }

+    timeout = httpx.Timeout(
+        timeout=SANDBOX_EXECUTION_TIMEOUT,
+        connect=SANDBOX_CONNECT_TIMEOUT,
+    )
+
    async with httpx.AsyncClient(trust_env=False) as client:
        try:
            response = await client.post(
-                request_url, json=request_data, headers=headers, timeout=None
+                request_url, json=request_data, headers=headers, timeout=timeout
            )
            response.raise_for_status()
            response_data = response.json()
--- a/strix/tools/file_edit/file_edit_actions.py
+++ b/strix/tools/file_edit/file_edit_actions.py
@@ -3,9 +3,6 @@ import re
 from pathlib import Path
 from typing import Any, cast

-from openhands_aci import file_editor
-from openhands_aci.utils.shell import run_shell_cmd
-
 from strix.tools.registry import register_tool


@@ -33,6 +30,8 @@ def str_replace_editor(
    new_str: str | None = None,
    insert_line: int | None = None,
 ) -> dict[str, Any]:
+    from openhands_aci import file_editor
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
@@ -64,6 +63,8 @@ def list_files(
    path: str,
    recursive: bool = False,
 ) -> dict[str, Any]:
+    from openhands_aci.utils.shell import run_shell_cmd
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
@@ -116,6 +117,8 @@ def search_files(
    regex: str,
    file_pattern: str = "*",
 ) -> dict[str, Any]:
+    from openhands_aci.utils.shell import run_shell_cmd
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
--- a/strix/tools/notes/notes_actions.py
+++ b/strix/tools/notes/notes_actions.py
@@ -11,7 +11,6 @@ _notes_storage: dict[str, dict[str, Any]] = {}
 def _filter_notes(
    category: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
    search_query: str | None = None,
 ) -> list[dict[str, Any]]:
    filtered_notes = []
@@ -20,9 +19,6 @@ def _filter_notes(
        if category and note.get("category") != category:
            continue

-        if priority and note.get("priority") != priority:
-            continue
-
        if tags:
            note_tags = note.get("tags", [])
            if not any(tag in note_tags for tag in tags):
@@ -43,13 +39,12 @@ def _filter_notes(
    return filtered_notes


-@register_tool
+@register_tool(sandbox_execution=False)
 def create_note(
    title: str,
    content: str,
    category: str = "general",
    tags: list[str] | None = None,
-    priority: str = "normal",
 ) -> dict[str, Any]:
    try:
        if not title or not title.strip():
@@ -58,7 +53,7 @@ def create_note(
        if not content or not content.strip():
            return {"success": False, "error": "Content cannot be empty", "note_id": None}

-        valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
+        valid_categories = ["general", "findings", "methodology", "questions", "plan"]
        if category not in valid_categories:
            return {
                "success": False,
@@ -66,14 +61,6 @@ def create_note(
                "note_id": None,
            }

-        valid_priorities = ["low", "normal", "high", "urgent"]
-        if priority not in valid_priorities:
-            return {
-                "success": False,
-                "error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
-                "note_id": None,
-            }
-
        note_id = str(uuid.uuid4())[:5]
        timestamp = datetime.now(UTC).isoformat()

@@ -82,7 +69,6 @@ def create_note(
            "content": content.strip(),
            "category": category,
            "tags": tags or [],
-            "priority": priority,
            "created_at": timestamp,
            "updated_at": timestamp,
        }
@@ -99,17 +85,14 @@ def create_note(
        }


-@register_tool
+@register_tool(sandbox_execution=False)
 def list_notes(
    category: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
    search: str | None = None,
 ) -> dict[str, Any]:
    try:
-        filtered_notes = _filter_notes(
-            category=category, tags=tags, priority=priority, search_query=search
-        )
+        filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)

        return {
            "success": True,
@@ -126,13 +109,12 @@ def list_notes(
        }


-@register_tool
+@register_tool(sandbox_execution=False)
 def update_note(
    note_id: str,
    title: str | None = None,
    content: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
 ) -> dict[str, Any]:
    try:
        if note_id not in _notes_storage:
@@ -153,15 +135,6 @@ def update_note(
        if tags is not None:
            note["tags"] = tags

-        if priority is not None:
-            valid_priorities = ["low", "normal", "high", "urgent"]
-            if priority not in valid_priorities:
-                return {
-                    "success": False,
-                    "error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
-                }
-            note["priority"] = priority
-
        note["updated_at"] = datetime.now(UTC).isoformat()

        return {
@@ -173,7 +146,7 @@ def update_note(
        return {"success": False, "error": f"Failed to update note: {e}"}


-@register_tool
+@register_tool(sandbox_execution=False)
 def delete_note(note_id: str) -> dict[str, Any]:
    try:
        if note_id not in _notes_storage:
--- a/strix/tools/notes/notes_actions_schema.xml
+++ b/strix/tools/notes/notes_actions_schema.xml
@@ -1,10 +1,9 @@
 <tools>
  <tool name="create_note">
-    <description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
-  the scan.</description>
-    <details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
-  rather than formal vulnerability reports or detailed findings. This is your personal notepad
-  for keeping track of tasks, ideas, and things to remember or follow up on.</details>
+    <description>Create a personal note for observations, findings, and research during the scan.</description>
+    <details>Use this tool for documenting discoveries, observations, methodology notes, and questions.
+  This is your personal notepad for recording information you want to remember or reference later.
+  For tracking actionable tasks, use the todo tool instead.</details>
    <parameters>
      <parameter name="title" type="string" required="true">
        <description>Title of the note</description>
@@ -13,49 +12,41 @@
        <description>Content of the note</description>
      </parameter>
      <parameter name="category" type="string" required="false">
-        <description>Category to organize the note (default: "general", "findings", "methodology", "todo", "questions", "plan")</description>
+        <description>Category to organize the note (default: "general", "findings", "methodology", "questions", "plan")</description>
      </parameter>
      <parameter name="tags" type="string" required="false">
        <description>Tags for categorization</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>Priority level of the note ("low", "normal", "high", "urgent")</description>
-      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
      <description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
    </returns>
    <examples>
-  # Create a TODO reminder
-  <function=create_note>
-  <parameter=title>TODO: Check SSL Certificate Details</parameter>
-  <parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
-               on the HTTPS service discovered on port 443. Also check for certificate
-               transparency logs.</parameter>
-  <parameter=category>todo</parameter>
-  <parameter=tags>["ssl", "certificate", "followup"]</parameter>
-  <parameter=priority>normal</parameter>
-  </function>
-
-  # Planning note
-  <function=create_note>
-  <parameter=title>Scan Strategy Planning</parameter>
-  <parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
-               web apps for OWASP Top 10 3) Check database services for default creds
-               4) Review any custom applications for business logic flaws</parameter>
-  <parameter=category>plan</parameter>
-  <parameter=tags>["planning", "strategy", "next_steps"]</parameter>
-  </function>
-
-  # Side note for later investigation
+  # Document an interesting finding
  <function=create_note>
  <parameter=title>Interesting Directory Found</parameter>
-  <parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
-               for now but worth checking if time permits. Directory listing seems
-               disabled.</parameter>
+  <parameter=content>Found /backup/ directory that might contain sensitive files. Directory listing
+               seems disabled but worth investigating further.</parameter>
  <parameter=category>findings</parameter>
-  <parameter=tags>["directory", "backup", "low_priority"]</parameter>
-  <parameter=priority>low</parameter>
+  <parameter=tags>["directory", "backup"]</parameter>
+  </function>
+
+  # Methodology note
+  <function=create_note>
+  <parameter=title>Authentication Flow Analysis</parameter>
+  <parameter=content>The application uses JWT tokens stored in localStorage. Token expiration is
+               set to 24 hours. Observed that refresh token rotation is not implemented.</parameter>
+  <parameter=category>methodology</parameter>
+  <parameter=tags>["auth", "jwt", "session"]</parameter>
+  </function>
+
+  # Research question
+  <function=create_note>
+  <parameter=title>Custom Header Investigation</parameter>
+  <parameter=content>The API returns a custom X-Request-ID header. Need to research if this
+               could be used for user tracking or has any security implications.</parameter>
+  <parameter=category>questions</parameter>
+  <parameter=tags>["headers", "research"]</parameter>
  </function>
    </examples>
  </tool>
@@ -84,9 +75,6 @@
      <parameter name="tags" type="string" required="false">
        <description>Filter by tags (returns notes with any of these tags)</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>Filter by priority level</description>
-      </parameter>
      <parameter name="search" type="string" required="false">
        <description>Search query to find in note titles and content</description>
      </parameter>
@@ -100,11 +88,6 @@
  <parameter=category>findings</parameter>
  </function>

-  # List high priority items
-  <function=list_notes>
-  <parameter=priority>high</parameter>
-  </function>
-
  # Search for SQL injection related notes
  <function=list_notes>
  <parameter=search>SQL injection</parameter>
@@ -132,9 +115,6 @@
      <parameter name="tags" type="string" required="false">
        <description>New tags for the note</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>New priority level</description>
-      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
      <description>Response containing: - success: Whether the note was updated successfully</description>
@@ -143,7 +123,6 @@
  <function=update_note>
  <parameter=note_id>note_123</parameter>
  <parameter=content>Updated content with new findings...</parameter>
-  <parameter=priority>urgent</parameter>
  </function>
    </examples>
  </tool>
--- a/strix/tools/proxy/proxy_actions.py
+++ b/strix/tools/proxy/proxy_actions.py
@@ -2,8 +2,6 @@ from typing import Any, Literal

 from strix.tools.registry import register_tool

-from .proxy_manager import get_proxy_manager
-

 RequestPart = Literal["request", "response"]

@@ -27,6 +25,8 @@ def list_requests(
    sort_order: Literal["asc", "desc"] = "desc",
    scope_id: str | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.list_requests(
        httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
@@ -41,6 +41,8 @@ def view_request(
    page: int = 1,
    page_size: int = 50,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.view_request(request_id, part, search_pattern, page, page_size)

@@ -53,6 +55,8 @@ def send_request(
    body: str = "",
    timeout: int = 30,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    if headers is None:
        headers = {}
    manager = get_proxy_manager()
@@ -64,6 +68,8 @@ def repeat_request(
    request_id: str,
    modifications: dict[str, Any] | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    if modifications is None:
        modifications = {}
    manager = get_proxy_manager()
@@ -78,6 +84,8 @@ def scope_rules(
    scope_id: str | None = None,
    scope_name: str | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)

@@ -89,6 +97,8 @@ def list_sitemap(
    depth: Literal["DIRECT", "ALL"] = "DIRECT",
    page: int = 1,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.list_sitemap(scope_id, parent_id, depth, page)

@@ -97,5 +107,7 @@ def list_sitemap(
 def view_sitemap_entry(
    entry_id: str,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.view_sitemap_entry(entry_id)
--- a/strix/tools/python/python_actions.py
+++ b/strix/tools/python/python_actions.py
@@ -2,8 +2,6 @@ from typing import Any, Literal

 from strix.tools.registry import register_tool

-from .python_manager import get_python_session_manager
-

 PythonAction = Literal["new_session", "execute", "close", "list_sessions"]

@@ -15,6 +13,8 @@ def python_action(
    timeout: int = 30,
    session_id: str | None = None,
 ) -> dict[str, Any]:
+    from .python_manager import get_python_session_manager
+
    def _validate_code(action_name: str, code: str | None) -> None:
        if not code:
            raise ValueError(f"code parameter is required for {action_name} action")
--- a/strix/tools/terminal/terminal_actions.py
+++ b/strix/tools/terminal/terminal_actions.py
@@ -2,8 +2,6 @@ from typing import Any

 from strix.tools.registry import register_tool

-from .terminal_manager import get_terminal_manager
-

@register_tool
 def terminal_execute(
@@ -13,6 +11,8 @@ def terminal_execute(
    terminal_id: str | None = None,
    no_enter: bool = False,
 ) -> dict[str, Any]:
+    from .terminal_manager import get_terminal_manager
+
    manager = get_terminal_manager()

    try:
--- a/strix/tools/todo/__init__.py
+++ b/strix/tools/todo/__init__.py
@@ -0,0 +1,18 @@
+from .todo_actions import (
+    create_todo,
+    delete_todo,
+    list_todos,
+    mark_todo_done,
+    mark_todo_pending,
+    update_todo,
+)
+
+
+__all__ = [
+    "create_todo",
+    "delete_todo",
+    "list_todos",
+    "mark_todo_done",
+    "mark_todo_pending",
+    "update_todo",
+]
--- a/strix/tools/todo/todo_actions.py
+++ b/strix/tools/todo/todo_actions.py
@@ -0,0 +1,568 @@
+import json
+import uuid
+from datetime import UTC, datetime
+from typing import Any
+
+from strix.tools.registry import register_tool
+
+
+VALID_PRIORITIES = ["low", "normal", "high", "critical"]
+VALID_STATUSES = ["pending", "in_progress", "done"]
+
+_todos_storage: dict[str, dict[str, dict[str, Any]]] = {}
+
+
+def _get_agent_todos(agent_id: str) -> dict[str, dict[str, Any]]:
+    if agent_id not in _todos_storage:
+        _todos_storage[agent_id] = {}
+    return _todos_storage[agent_id]
+
+
+def _normalize_priority(priority: str | None, default: str = "normal") -> str:
+    candidate = str(priority or default or "normal").strip().lower()
+    if candidate not in VALID_PRIORITIES:
+        raise ValueError(f"Invalid priority. Must be one of: {', '.join(VALID_PRIORITIES)}")
+    return candidate
+
+
+def _sorted_todos(agent_id: str) -> list[dict[str, Any]]:
+    agent_todos = _get_agent_todos(agent_id)
+
+    todos_list: list[dict[str, Any]] = []
+    for todo_id, todo in agent_todos.items():
+        entry = todo.copy()
+        entry["todo_id"] = todo_id
+        todos_list.append(entry)
+
+    priority_order = {"critical": 0, "high": 1, "normal": 2, "low": 3}
+    status_order = {"done": 0, "in_progress": 1, "pending": 2}
+
+    todos_list.sort(
+        key=lambda x: (
+            status_order.get(x.get("status", "pending"), 99),
+            priority_order.get(x.get("priority", "normal"), 99),
+            x.get("created_at", ""),
+        )
+    )
+    return todos_list
+
+
+def _normalize_todo_ids(raw_ids: Any) -> list[str]:
+    if raw_ids is None:
+        return []
+
+    if isinstance(raw_ids, str):
+        stripped = raw_ids.strip()
+        if not stripped:
+            return []
+        try:
+            data = json.loads(stripped)
+        except json.JSONDecodeError:
+            data = stripped.split(",") if "," in stripped else [stripped]
+        if isinstance(data, list):
+            return [str(item).strip() for item in data if str(item).strip()]
+        return [str(data).strip()]
+
+    if isinstance(raw_ids, list):
+        return [str(item).strip() for item in raw_ids if str(item).strip()]
+
+    return [str(raw_ids).strip()]
+
+
+def _normalize_bulk_updates(raw_updates: Any) -> list[dict[str, Any]]:
+    if raw_updates is None:
+        return []
+
+    data = raw_updates
+    if isinstance(raw_updates, str):
+        stripped = raw_updates.strip()
+        if not stripped:
+            return []
+        try:
+            data = json.loads(stripped)
+        except json.JSONDecodeError as e:
+            raise ValueError("Updates must be valid JSON") from e
+
+    if isinstance(data, dict):
+        data = [data]
+
+    if not isinstance(data, list):
+        raise TypeError("Updates must be a list of update objects")
+
+    normalized: list[dict[str, Any]] = []
+    for item in data:
+        if not isinstance(item, dict):
+            raise TypeError("Each update must be an object with todo_id")
+
+        todo_id = item.get("todo_id") or item.get("id")
+        if not todo_id:
+            raise ValueError("Each update must include 'todo_id'")
+
+        normalized.append(
+            {
+                "todo_id": str(todo_id).strip(),
+                "title": item.get("title"),
+                "description": item.get("description"),
+                "priority": item.get("priority"),
+                "status": item.get("status"),
+            }
+        )
+
+    return normalized
+
+
+def _normalize_bulk_todos(raw_todos: Any) -> list[dict[str, Any]]:
+    if raw_todos is None:
+        return []
+
+    data = raw_todos
+    if isinstance(raw_todos, str):
+        stripped = raw_todos.strip()
+        if not stripped:
+            return []
+        try:
+            data = json.loads(stripped)
+        except json.JSONDecodeError:
+            entries = [line.strip(" -*\t") for line in stripped.splitlines() if line.strip(" -*\t")]
+            return [{"title": entry} for entry in entries]
+
+    if isinstance(data, dict):
+        data = [data]
+
+    if not isinstance(data, list):
+        raise TypeError("Todos must be provided as a list, dict, or JSON string")
+
+    normalized: list[dict[str, Any]] = []
+    for item in data:
+        if isinstance(item, str):
+            title = item.strip()
+            if title:
+                normalized.append({"title": title})
+            continue
+
+        if not isinstance(item, dict):
+            raise TypeError("Each todo entry must be a string or object with a title")
+
+        title = item.get("title", "")
+        if not isinstance(title, str) or not title.strip():
+            raise ValueError("Each todo entry must include a non-empty 'title'")
+
+        normalized.append(
+            {
+                "title": title.strip(),
+                "description": (item.get("description") or "").strip() or None,
+                "priority": item.get("priority"),
+            }
+        )
+
+    return normalized
+
+
+@register_tool(sandbox_execution=False)
+def create_todo(
+    agent_state: Any,
+    title: str | None = None,
+    description: str | None = None,
+    priority: str = "normal",
+    todos: Any | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        default_priority = _normalize_priority(priority)
+
+        tasks_to_create: list[dict[str, Any]] = []
+
+        if todos is not None:
+            tasks_to_create.extend(_normalize_bulk_todos(todos))
+
+        if title and title.strip():
+            tasks_to_create.append(
+                {
+                    "title": title.strip(),
+                    "description": description.strip() if description else None,
+                    "priority": default_priority,
+                }
+            )
+
+        if not tasks_to_create:
+            return {
+                "success": False,
+                "error": "Provide a title or 'todos' list to create.",
+                "todo_id": None,
+            }
+
+        agent_todos = _get_agent_todos(agent_id)
+        created: list[dict[str, Any]] = []
+
+        for task in tasks_to_create:
+            task_priority = _normalize_priority(task.get("priority"), default_priority)
+            todo_id = str(uuid.uuid4())[:6]
+            timestamp = datetime.now(UTC).isoformat()
+
+            todo = {
+                "title": task["title"],
+                "description": task.get("description"),
+                "priority": task_priority,
+                "status": "pending",
+                "created_at": timestamp,
+                "updated_at": timestamp,
+                "completed_at": None,
+            }
+
+            agent_todos[todo_id] = todo
+            created.append(
+                {
+                    "todo_id": todo_id,
+                    "title": task["title"],
+                    "priority": task_priority,
+                }
+            )
+
+    except (ValueError, TypeError) as e:
+        return {"success": False, "error": f"Failed to create todo: {e}", "todo_id": None}
+    else:
+        todos_list = _sorted_todos(agent_id)
+
+        response: dict[str, Any] = {
+            "success": True,
+            "created": created,
+            "count": len(created),
+            "todos": todos_list,
+            "total_count": len(todos_list),
+        }
+        return response
+
+
+@register_tool(sandbox_execution=False)
+def list_todos(
+    agent_state: Any,
+    status: str | None = None,
+    priority: str | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        agent_todos = _get_agent_todos(agent_id)
+
+        status_filter = status.lower() if isinstance(status, str) else None
+        priority_filter = priority.lower() if isinstance(priority, str) else None
+
+        todos_list = []
+        for todo_id, todo in agent_todos.items():
+            if status_filter and todo.get("status") != status_filter:
+                continue
+
+            if priority_filter and todo.get("priority") != priority_filter:
+                continue
+
+            todo_with_id = todo.copy()
+            todo_with_id["todo_id"] = todo_id
+            todos_list.append(todo_with_id)
+
+        priority_order = {"critical": 0, "high": 1, "normal": 2, "low": 3}
+        status_order = {"done": 0, "in_progress": 1, "pending": 2}
+
+        todos_list.sort(
+            key=lambda x: (
+                status_order.get(x.get("status", "pending"), 99),
+                priority_order.get(x.get("priority", "normal"), 99),
+                x.get("created_at", ""),
+            )
+        )
+
+        summary_counts = {
+            "pending": 0,
+            "in_progress": 0,
+            "done": 0,
+        }
+        for todo in todos_list:
+            status_value = todo.get("status", "pending")
+            if status_value not in summary_counts:
+                summary_counts[status_value] = 0
+            summary_counts[status_value] += 1
+
+        return {
+            "success": True,
+            "todos": todos_list,
+            "total_count": len(todos_list),
+            "summary": summary_counts,
+        }
+
+    except (ValueError, TypeError) as e:
+        return {
+            "success": False,
+            "error": f"Failed to list todos: {e}",
+            "todos": [],
+            "total_count": 0,
+            "summary": {"pending": 0, "in_progress": 0, "done": 0},
+        }
+
+
+def _apply_single_update(
+    agent_todos: dict[str, dict[str, Any]],
+    todo_id: str,
+    title: str | None = None,
+    description: str | None = None,
+    priority: str | None = None,
+    status: str | None = None,
+) -> dict[str, Any] | None:
+    if todo_id not in agent_todos:
+        return {"todo_id": todo_id, "error": f"Todo with ID '{todo_id}' not found"}
+
+    todo = agent_todos[todo_id]
+
+    if title is not None:
+        if not title.strip():
+            return {"todo_id": todo_id, "error": "Title cannot be empty"}
+        todo["title"] = title.strip()
+
+    if description is not None:
+        todo["description"] = description.strip() if description else None
+
+    if priority is not None:
+        try:
+            todo["priority"] = _normalize_priority(priority, str(todo.get("priority", "normal")))
+        except ValueError as exc:
+            return {"todo_id": todo_id, "error": str(exc)}
+
+    if status is not None:
+        status_candidate = status.lower()
+        if status_candidate not in VALID_STATUSES:
+            return {
+                "todo_id": todo_id,
+                "error": f"Invalid status. Must be one of: {', '.join(VALID_STATUSES)}",
+            }
+        todo["status"] = status_candidate
+        if status_candidate == "done":
+            todo["completed_at"] = datetime.now(UTC).isoformat()
+        else:
+            todo["completed_at"] = None
+
+    todo["updated_at"] = datetime.now(UTC).isoformat()
+    return None
+
+
+@register_tool(sandbox_execution=False)
+def update_todo(
+    agent_state: Any,
+    todo_id: str | None = None,
+    title: str | None = None,
+    description: str | None = None,
+    priority: str | None = None,
+    status: str | None = None,
+    updates: Any | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        agent_todos = _get_agent_todos(agent_id)
+
+        updates_to_apply: list[dict[str, Any]] = []
+
+        if updates is not None:
+            updates_to_apply.extend(_normalize_bulk_updates(updates))
+
+        if todo_id is not None:
+            updates_to_apply.append(
+                {
+                    "todo_id": todo_id,
+                    "title": title,
+                    "description": description,
+                    "priority": priority,
+                    "status": status,
+                }
+            )
+
+        if not updates_to_apply:
+            return {
+                "success": False,
+                "error": "Provide todo_id or 'updates' list to update.",
+            }
+
+        updated: list[str] = []
+        errors: list[dict[str, Any]] = []
+
+        for update in updates_to_apply:
+            error = _apply_single_update(
+                agent_todos,
+                update["todo_id"],
+                update.get("title"),
+                update.get("description"),
+                update.get("priority"),
+                update.get("status"),
+            )
+            if error:
+                errors.append(error)
+            else:
+                updated.append(update["todo_id"])
+
+        todos_list = _sorted_todos(agent_id)
+
+        response: dict[str, Any] = {
+            "success": len(errors) == 0,
+            "updated": updated,
+            "updated_count": len(updated),
+            "todos": todos_list,
+            "total_count": len(todos_list),
+        }
+
+        if errors:
+            response["errors"] = errors
+
+    except (ValueError, TypeError) as e:
+        return {"success": False, "error": str(e)}
+    else:
+        return response
+
+
+@register_tool(sandbox_execution=False)
+def mark_todo_done(
+    agent_state: Any,
+    todo_id: str | None = None,
+    todo_ids: Any | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        agent_todos = _get_agent_todos(agent_id)
+
+        ids_to_mark: list[str] = []
+        if todo_ids is not None:
+            ids_to_mark.extend(_normalize_todo_ids(todo_ids))
+        if todo_id is not None:
+            ids_to_mark.append(todo_id)
+
+        if not ids_to_mark:
+            return {"success": False, "error": "Provide todo_id or todo_ids to mark as done."}
+
+        marked: list[str] = []
+        errors: list[dict[str, Any]] = []
+        timestamp = datetime.now(UTC).isoformat()
+
+        for tid in ids_to_mark:
+            if tid not in agent_todos:
+                errors.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
+                continue
+
+            todo = agent_todos[tid]
+            todo["status"] = "done"
+            todo["completed_at"] = timestamp
+            todo["updated_at"] = timestamp
+            marked.append(tid)
+
+        todos_list = _sorted_todos(agent_id)
+
+        response: dict[str, Any] = {
+            "success": len(errors) == 0,
+            "marked_done": marked,
+            "marked_count": len(marked),
+            "todos": todos_list,
+            "total_count": len(todos_list),
+        }
+
+        if errors:
+            response["errors"] = errors
+
+    except (ValueError, TypeError) as e:
+        return {"success": False, "error": str(e)}
+    else:
+        return response
+
+
+@register_tool(sandbox_execution=False)
+def mark_todo_pending(
+    agent_state: Any,
+    todo_id: str | None = None,
+    todo_ids: Any | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        agent_todos = _get_agent_todos(agent_id)
+
+        ids_to_mark: list[str] = []
+        if todo_ids is not None:
+            ids_to_mark.extend(_normalize_todo_ids(todo_ids))
+        if todo_id is not None:
+            ids_to_mark.append(todo_id)
+
+        if not ids_to_mark:
+            return {"success": False, "error": "Provide todo_id or todo_ids to mark as pending."}
+
+        marked: list[str] = []
+        errors: list[dict[str, Any]] = []
+        timestamp = datetime.now(UTC).isoformat()
+
+        for tid in ids_to_mark:
+            if tid not in agent_todos:
+                errors.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
+                continue
+
+            todo = agent_todos[tid]
+            todo["status"] = "pending"
+            todo["completed_at"] = None
+            todo["updated_at"] = timestamp
+            marked.append(tid)
+
+        todos_list = _sorted_todos(agent_id)
+
+        response: dict[str, Any] = {
+            "success": len(errors) == 0,
+            "marked_pending": marked,
+            "marked_count": len(marked),
+            "todos": todos_list,
+            "total_count": len(todos_list),
+        }
+
+        if errors:
+            response["errors"] = errors
+
+    except (ValueError, TypeError) as e:
+        return {"success": False, "error": str(e)}
+    else:
+        return response
+
+
+@register_tool(sandbox_execution=False)
+def delete_todo(
+    agent_state: Any,
+    todo_id: str | None = None,
+    todo_ids: Any | None = None,
+) -> dict[str, Any]:
+    try:
+        agent_id = agent_state.agent_id
+        agent_todos = _get_agent_todos(agent_id)
+
+        ids_to_delete: list[str] = []
+        if todo_ids is not None:
+            ids_to_delete.extend(_normalize_todo_ids(todo_ids))
+        if todo_id is not None:
+            ids_to_delete.append(todo_id)
+
+        if not ids_to_delete:
+            return {"success": False, "error": "Provide todo_id or todo_ids to delete."}
+
+        deleted: list[str] = []
+        errors: list[dict[str, Any]] = []
+
+        for tid in ids_to_delete:
+            if tid not in agent_todos:
+                errors.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
+                continue
+
+            del agent_todos[tid]
+            deleted.append(tid)
+
+        todos_list = _sorted_todos(agent_id)
+
+        response: dict[str, Any] = {
+            "success": len(errors) == 0,
+            "deleted": deleted,
+            "deleted_count": len(deleted),
+            "todos": todos_list,
+            "total_count": len(todos_list),
+        }
+
+        if errors:
+            response["errors"] = errors
+
+    except (ValueError, TypeError) as e:
+        return {"success": False, "error": str(e)}
+    else:
+        return response
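The ID-list parsing in `_normalize_todo_ids` above accepts several input shapes, which is easiest to see by running it. The block below is a minimal sketch: `normalize_todo_ids` is a standalone copy of that helper made for illustration, not part of the commit, and the sample IDs are made up.

```python
import json
from typing import Any


def normalize_todo_ids(raw_ids: Any) -> list[str]:
    """Standalone copy of the ID-normalization logic, for illustration only."""
    if raw_ids is None:
        return []

    if isinstance(raw_ids, str):
        stripped = raw_ids.strip()
        if not stripped:
            return []
        try:
            # A JSON array such as '["abc123", "def456"]' is parsed directly.
            data = json.loads(stripped)
        except json.JSONDecodeError:
            # Otherwise fall back to comma-separated IDs or a single bare ID.
            data = stripped.split(",") if "," in stripped else [stripped]
        if isinstance(data, list):
            return [str(item).strip() for item in data if str(item).strip()]
        return [str(data).strip()]

    if isinstance(raw_ids, list):
        return [str(item).strip() for item in raw_ids if str(item).strip()]

    return [str(raw_ids).strip()]


print(normalize_todo_ids('["abc123", "def456"]'))  # JSON array
print(normalize_todo_ids("abc123, def456"))        # comma-separated string
print(normalize_todo_ids("abc123"))                # single bare ID
print(normalize_todo_ids(None))                    # nothing supplied
```

One edge case may be worth checking against the real helper. A bare ID that happens to be a valid JSON number in exponent form (for example the hypothetical ID `123e45`) is decoded by `json.loads` as a float. It is then stringified differently, so it would no longer match the stored key. IDs passed through the single `todo_id` parameter are not affected, because they skip this helper.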
--- a/strix/tools/todo/todo_actions_schema.xml
+++ b/strix/tools/todo/todo_actions_schema.xml
@@ -0,0 +1,225 @@
+<tools>
+  <important>
+  The todo tool is available for organizing complex tasks when needed. Each subagent has its own
+  separate todo list: your todos are private to you and do not interfere with other agents' todos.
+
+  WHEN TO USE TODOS:
+  - Planning complex multi-step operations
+  - Tracking multiple parallel workstreams
+  - When you need to remember tasks to return to later
+  - Organizing large-scope assessments with many components
+
+  WHEN NOT NEEDED:
+  - Simple, straightforward tasks
+  - Linear workflows where progress is obvious
+  - Short tasks that can be completed quickly
+
+  If you do use todos, batch operations together to minimize tool calls.
+  </important>
+
+  <tool name="create_todo">
+    <description>Create a new todo item to track tasks, goals, and progress.</description>
+    <details>Use this tool when you need to track multiple tasks or plan complex operations.
+  Each subagent maintains its own independent todo list: your todos are yours alone.
+
+  Useful for breaking down complex tasks into smaller, manageable items when the workflow
+  is non-trivial or when you need to track progress across multiple components.</details>
+    <parameters>
+      <parameter name="title" type="string" required="false">
+        <description>Short, actionable title for the todo (e.g., "Test login endpoint for SQL injection")</description>
+      </parameter>
+      <parameter name="todos" type="string" required="false">
+        <description>Create multiple todos at once. Provide a JSON array of {"title": "...", "description": "...", "priority": "..."} objects or a newline-separated bullet list.</description>
+      </parameter>
+      <parameter name="description" type="string" required="false">
+        <description>Detailed description or notes about the task</description>
+      </parameter>
+      <parameter name="priority" type="string" required="false">
+        <description>Priority level: "low", "normal", "high", "critical" (default: "normal")</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - success: Whether the operation succeeded - created: List of created todos with their IDs - count: Number of todos created - todos: Full sorted todo list - total_count: Total number of todos</description>
+    </returns>
+    <examples>
+  # Create a high priority todo
+  <function=create_todo>
+  <parameter=title>Test authentication bypass on /api/admin</parameter>
+  <parameter=description>The admin endpoint seems to have weak authentication. Try JWT manipulation, session fixation, and privilege escalation.</parameter>
+  <parameter=priority>high</parameter>
+  </function>
+
+  # Create a simple todo
+  <function=create_todo>
+  <parameter=title>Enumerate all API endpoints</parameter>
+  </function>
+
+  # Bulk create todos (JSON array)
+  <function=create_todo>
+  <parameter=todos>[{"title": "Map all admin routes", "priority": "high"}, {"title": "Check forgotten password flow"}]</parameter>
+  </function>
+
+  # Bulk create todos (bullet list)
+  <function=create_todo>
+  <parameter=todos>
+  - Capture baseline traffic in proxy
+  - Enumerate S3 buckets for leaked assets
+  - Compare responses for timing differences
+  </parameter>
+  </function>
+    </examples>
+  </tool>
+
+  <tool name="list_todos">
+    <description>List all todos with optional filtering by status or priority.</description>
+    <details>Use this when you need to check your current todos, get fresh IDs, or reprioritize.
+  The list is sorted by status (done first, then in_progress, then pending), then by priority within each status (critical > high > normal > low), then by creation time.
+  Each subagent has its own independent todo list.</details>
+    <parameters>
+      <parameter name="status" type="string" required="false">
+        <description>Filter by status: "pending", "in_progress", "done"</description>
+      </parameter>
+      <parameter name="priority" type="string" required="false">
+        <description>Filter by priority: "low", "normal", "high", "critical"</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - success: Whether the listing succeeded - todos: List of todo items matching the filters - total_count: Number of todos returned - summary: Count by status (pending, in_progress, done)</description>
+    </returns>
+    <examples>
+  # List all todos
+  <function=list_todos>
+  </function>
+
+  # List only pending todos
+  <function=list_todos>
+  <parameter=status>pending</parameter>
+  </function>
+
+  # List high priority items
+  <function=list_todos>
+  <parameter=priority>high</parameter>
+  </function>
+    </examples>
+  </tool>
+
+  <tool name="update_todo">
+    <description>Update one or multiple todo items. Prefer bulk updates in a single call when updating multiple items.</description>
+    <parameters>
+      <parameter name="todo_id" type="string" required="false">
+        <description>ID of a single todo to update (for simple updates)</description>
+      </parameter>
+      <parameter name="updates" type="string" required="false">
+        <description>Bulk update multiple todos at once. JSON array of objects with todo_id and fields to update: [{"todo_id": "abc", "status": "done"}, {"todo_id": "def", "priority": "high"}].</description>
+      </parameter>
+      <parameter name="title" type="string" required="false">
+        <description>New title (used with todo_id)</description>
+      </parameter>
+      <parameter name="description" type="string" required="false">
+        <description>New description (used with todo_id)</description>
+      </parameter>
+      <parameter name="priority" type="string" required="false">
+        <description>New priority: "low", "normal", "high", "critical" (used with todo_id)</description>
+      </parameter>
+      <parameter name="status" type="string" required="false">
+        <description>New status: "pending", "in_progress", "done" (used with todo_id)</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - success: Whether every update succeeded - updated: List of updated todo IDs - updated_count: Number updated - todos: Full sorted todo list - total_count: Total number of todos - errors: Any failed updates</description>
+    </returns>
+    <examples>
+  # Single update
+  <function=update_todo>
+  <parameter=todo_id>abc123</parameter>
+  <parameter=status>in_progress</parameter>
+  </function>
+
+  # Bulk update - mark multiple todos with different statuses in ONE call
+  <function=update_todo>
+  <parameter=updates>[{"todo_id": "abc123", "status": "done"}, {"todo_id": "def456", "status": "in_progress"}, {"todo_id": "ghi789", "priority": "critical"}]</parameter>
+  </function>
+    </examples>
+  </tool>
+
+  <tool name="mark_todo_done">
+    <description>Mark one or multiple todos as completed in a single call.</description>
+    <details>Mark todos as done after completing them. Group multiple completions into one call using todo_ids when possible.</details>
+    <parameters>
+      <parameter name="todo_id" type="string" required="false">
+        <description>ID of a single todo to mark as done</description>
+      </parameter>
+      <parameter name="todo_ids" type="string" required="false">
+        <description>Mark multiple todos done at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - success: Whether every ID was marked - marked_done: List of IDs marked done - marked_count: Number marked - todos: Full sorted list - total_count: Total number of todos - errors: Any failures</description>
+    </returns>
+    <examples>
+  # Mark single todo done
+  <function=mark_todo_done>
+  <parameter=todo_id>abc123</parameter>
+  </function>
+
+  # Mark multiple todos done in ONE call
+  <function=mark_todo_done>
+  <parameter=todo_ids>["abc123", "def456", "ghi789"]</parameter>
+  </function>
+    </examples>
+  </tool>
+
+  <tool name="mark_todo_pending">
+    <description>Mark one or multiple todos as pending (reopen completed tasks).</description>
+    <details>Use this to reopen tasks that were marked done but need more work. Supports bulk operations.</details>
+    <parameters>
+      <parameter name="todo_id" type="string" required="false">
+        <description>ID of a single todo to mark as pending</description>
+      </parameter>
+      <parameter name="todo_ids" type="string" required="false">
+        <description>Mark multiple todos pending at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - marked_pending: List of IDs marked pending - marked_count: Number marked - todos: Full sorted list - errors: Any failures</description>
+    </returns>
+    <examples>
+  # Mark single todo pending
+  <function=mark_todo_pending>
+  <parameter=todo_id>abc123</parameter>
+  </function>
+
+  # Mark multiple todos pending in ONE call
+  <function=mark_todo_pending>
+  <parameter=todo_ids>["abc123", "def456"]</parameter>
+  </function>
+    </examples>
+  </tool>
+
+  <tool name="delete_todo">
+    <description>Delete one or multiple todos in a single call.</description>
+    <details>Use this to remove todos that are no longer relevant. Supports bulk deletion to save tool calls.</details>
+    <parameters>
+      <parameter name="todo_id" type="string" required="false">
+        <description>ID of a single todo to delete</description>
+      </parameter>
+      <parameter name="todo_ids" type="string" required="false">
+        <description>Delete multiple todos at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
+      </parameter>
+    </parameters>
+    <returns type="Dict[str, Any]">
+      <description>Response containing: - deleted: List of deleted IDs - deleted_count: Number deleted - todos: Remaining todos - errors: Any failures</description>
+    </returns>
+    <examples>
+  # Delete single todo
+  <function=delete_todo>
+  <parameter=todo_id>abc123</parameter>
+  </function>
+
+  # Delete multiple todos in ONE call
+  <function=delete_todo>
+  <parameter=todo_ids>["abc123", "def456", "ghi789"]</parameter>
+  </function>
+    </examples>
+  </tool>
+</tools>
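The `todo_ids` parameter above is documented as accepting either a JSON array of IDs or a comma-separated string. A minimal sketch of how such a value could be normalized follows; `parse_todo_ids` is a hypothetical helper written for illustration and is not taken from the Strix source.

```python
import json


def parse_todo_ids(raw: str) -> list[str]:
    """Hypothetical helper: normalize a todo_ids string into a list of IDs.

    Accepts a JSON array ('["abc123", "def456"]') or a comma-separated
    string ("abc123, def456"), as described in the schema above.
    """
    text = raw.strip()
    if not text:
        return []
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = None  # Not JSON: fall through to comma splitting.
    if isinstance(parsed, list):
        return [str(item).strip() for item in parsed if str(item).strip()]
    # Anything else is treated as a comma-separated list of IDs.
    return [part.strip() for part in text.split(",") if part.strip()]
```

Both documented spellings produce the same list under this sketch, so a caller can pass whichever form is easier to generate.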
--- a/tests/__init__.py
+++ b/tests/__init__.py
@@ -0,0 +1 @@
+# Strix Test Suite
--- a/tests/agents/__init__.py
+++ b/tests/agents/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.agents module."""
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -0,0 +1 @@
+"""Pytest configuration and shared fixtures for Strix tests."""
--- a/tests/interface/__init__.py
+++ b/tests/interface/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.interface module."""
--- a/tests/llm/__init__.py
+++ b/tests/llm/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.llm module."""
--- a/tests/runtime/__init__.py
+++ b/tests/runtime/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.runtime module."""
--- a/tests/telemetry/__init__.py
+++ b/tests/telemetry/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.telemetry module."""
--- a/tests/tools/__init__.py
+++ b/tests/tools/__init__.py
@@ -0,0 +1 @@
+"""Tests for strix.tools module."""
--- a/tests/tools/conftest.py
+++ b/tests/tools/conftest.py
@@ -0,0 +1,34 @@
+"""Fixtures for strix.tools tests."""
+
+from collections.abc import Callable
+from typing import Any
+
+import pytest
+
+
+@pytest.fixture
+def sample_function_with_types() -> Callable[..., None]:
+    """Create a sample function with type annotations for testing argument conversion."""
+
+    def func(
+        name: str,
+        count: int,
+        enabled: bool,
+        ratio: float,
+        items: list[Any],
+        config: dict[str, Any],
+        optional: str | None = None,
+    ) -> None:
+        pass
+
+    return func
+
+
+@pytest.fixture
+def sample_function_no_annotations() -> Callable[..., None]:
+    """Create a sample function without type annotations."""
+
+    def func(arg1, arg2, arg3):  # type: ignore[no-untyped-def]
+        pass
+
+    return func
--- a/tests/tools/test_argument_parser.py
+++ b/tests/tools/test_argument_parser.py
@@ -0,0 +1,271 @@
+from collections.abc import Callable
+
+import pytest
+
+from strix.tools.argument_parser import (
+    ArgumentConversionError,
+    _convert_basic_types,
+    _convert_to_bool,
+    _convert_to_dict,
+    _convert_to_list,
+    convert_arguments,
+    convert_string_to_type,
+)
+
+
+class TestConvertToBool:
+    """Tests for the _convert_to_bool function."""
+
+    @pytest.mark.parametrize(
+        "value",
+        ["true", "True", "TRUE", "1", "yes", "Yes", "YES", "on", "On", "ON"],
+    )
+    def test_truthy_values(self, value: str) -> None:
+        """Test that truthy string values are converted to True."""
+        assert _convert_to_bool(value) is True
+
+    @pytest.mark.parametrize(
+        "value",
+        ["false", "False", "FALSE", "0", "no", "No", "NO", "off", "Off", "OFF"],
+    )
+    def test_falsy_values(self, value: str) -> None:
+        """Test that falsy string values are converted to False."""
+        assert _convert_to_bool(value) is False
+
+    def test_non_standard_truthy_string(self) -> None:
+        """Test that non-empty non-standard strings are truthy."""
+        assert _convert_to_bool("anything") is True
+        assert _convert_to_bool("hello") is True
+
+    def test_empty_string(self) -> None:
+        """Test that empty string is falsy."""
+        assert _convert_to_bool("") is False
+
+
+class TestConvertToList:
+    """Tests for the _convert_to_list function."""
+
+    def test_json_array_string(self) -> None:
+        """Test parsing a JSON array string."""
+        result = _convert_to_list('["a", "b", "c"]')
+        assert result == ["a", "b", "c"]
+
+    def test_json_array_with_numbers(self) -> None:
+        """Test parsing a JSON array with numbers."""
+        result = _convert_to_list("[1, 2, 3]")
+        assert result == [1, 2, 3]
+
+    def test_comma_separated_string(self) -> None:
+        """Test parsing a comma-separated string."""
+        result = _convert_to_list("a, b, c")
+        assert result == ["a", "b", "c"]
+
+    def test_comma_separated_no_spaces(self) -> None:
+        """Test parsing comma-separated values without spaces."""
+        result = _convert_to_list("x,y,z")
+        assert result == ["x", "y", "z"]
+
+    def test_single_value(self) -> None:
+        """Test that a single value returns a list with one element."""
+        result = _convert_to_list("single")
+        assert result == ["single"]
+
+    def test_json_non_array_wraps_in_list(self) -> None:
+        """Test that a valid JSON non-array value is wrapped in a list."""
+        result = _convert_to_list('"string"')
+        assert result == ["string"]
+
+    def test_json_object_wraps_in_list(self) -> None:
+        """Test that a JSON object is wrapped in a list."""
+        result = _convert_to_list('{"key": "value"}')
+        assert result == [{"key": "value"}]
+
+    def test_empty_json_array(self) -> None:
+        """Test parsing an empty JSON array."""
+        result = _convert_to_list("[]")
+        assert result == []
+
+
+class TestConvertToDict:
+    """Tests for the _convert_to_dict function."""
+
+    def test_valid_json_object(self) -> None:
+        """Test parsing a valid JSON object string."""
+        result = _convert_to_dict('{"key": "value", "number": 42}')
+        assert result == {"key": "value", "number": 42}
+
+    def test_empty_json_object(self) -> None:
+        """Test parsing an empty JSON object."""
+        result = _convert_to_dict("{}")
+        assert result == {}
+
+    def test_invalid_json_returns_empty_dict(self) -> None:
+        """Test that invalid JSON returns an empty dictionary."""
+        result = _convert_to_dict("not json")
+        assert result == {}
+
+    def test_json_array_returns_empty_dict(self) -> None:
+        """Test that a JSON array returns an empty dictionary."""
+        result = _convert_to_dict("[1, 2, 3]")
+        assert result == {}
+
+    def test_nested_json_object(self) -> None:
+        """Test parsing a nested JSON object."""
+        result = _convert_to_dict('{"outer": {"inner": "value"}}')
+        assert result == {"outer": {"inner": "value"}}
+
+
+class TestConvertBasicTypes:
+    """Tests for the _convert_basic_types function."""
+
+    def test_convert_to_int(self) -> None:
+        """Test converting string to int."""
+        assert _convert_basic_types("42", int) == 42
+        assert _convert_basic_types("-10", int) == -10
+
+    def test_convert_to_float(self) -> None:
+        """Test converting string to float."""
+        assert _convert_basic_types("3.14", float) == 3.14
+        assert _convert_basic_types("-2.5", float) == -2.5
+
+    def test_convert_to_str(self) -> None:
+        """Test converting string to str (passthrough)."""
+        assert _convert_basic_types("hello", str) == "hello"
+
+    def test_convert_to_bool(self) -> None:
+        """Test converting string to bool."""
+        assert _convert_basic_types("true", bool) is True
+        assert _convert_basic_types("false", bool) is False
+
+    def test_convert_to_list_type(self) -> None:
+        """Test converting to list type."""
+        result = _convert_basic_types("[1, 2, 3]", list)
+        assert result == [1, 2, 3]
+
+    def test_convert_to_dict_type(self) -> None:
+        """Test converting to dict type."""
+        result = _convert_basic_types('{"a": 1}', dict)
+        assert result == {"a": 1}
+
+    def test_unknown_type_attempts_json(self) -> None:
+        """Test that unknown types attempt JSON parsing."""
+        result = _convert_basic_types('{"key": "value"}', object)
+        assert result == {"key": "value"}
+
+    def test_unknown_type_returns_original(self) -> None:
+        """Test that unparseable values are returned as-is."""
+        result = _convert_basic_types("plain text", object)
+        assert result == "plain text"
+
+
+class TestConvertStringToType:
+    """Tests for the convert_string_to_type function."""
+
+    def test_basic_type_conversion(self) -> None:
+        """Test basic type conversions."""
+        assert convert_string_to_type("42", int) == 42
+        assert convert_string_to_type("3.14", float) == 3.14
+        assert convert_string_to_type("true", bool) is True
+
+    def test_optional_type(self) -> None:
+        """Test conversion with Optional type."""
+        result = convert_string_to_type("42", int | None)
+        assert result == 42
+
+    def test_union_type(self) -> None:
+        """Test conversion with Union type."""
+        result = convert_string_to_type("42", int | str)
+        assert result == 42
+
+    def test_union_type_with_none(self) -> None:
+        """Test conversion with Union including None."""
+        result = convert_string_to_type("hello", str | None)
+        assert result == "hello"
+
+    def test_modern_union_syntax(self) -> None:
+        """Test conversion with modern union syntax (int | None)."""
+        result = convert_string_to_type("100", int | None)
+        assert result == 100
+
+
+class TestConvertArguments:
+    """Tests for the convert_arguments function."""
+
+    def test_converts_typed_arguments(
+        self, sample_function_with_types: Callable[..., None]
+    ) -> None:
+        """Test that arguments are converted based on type annotations."""
+        kwargs = {
+            "name": "test",
+            "count": "5",
+            "enabled": "true",
+            "ratio": "2.5",
+            "items": "[1, 2, 3]",
+            "config": '{"key": "value"}',
+        }
+        result = convert_arguments(sample_function_with_types, kwargs)
+
+        assert result["name"] == "test"
+        assert result["count"] == 5
+        assert result["enabled"] is True
+        assert result["ratio"] == 2.5
+        assert result["items"] == [1, 2, 3]
+        assert result["config"] == {"key": "value"}
+
+    def test_passes_through_none_values(
+        self, sample_function_with_types: Callable[..., None]
+    ) -> None:
+        """Test that None values are passed through unchanged."""
+        kwargs = {"name": "test", "count": None}
+        result = convert_arguments(sample_function_with_types, kwargs)
+        assert result["count"] is None
+
+    def test_passes_through_non_string_values(
+        self, sample_function_with_types: Callable[..., None]
+    ) -> None:
+        """Test that non-string values are passed through unchanged."""
+        kwargs = {"name": "test", "count": 42}
+        result = convert_arguments(sample_function_with_types, kwargs)
+        assert result["count"] == 42
+
+    def test_unknown_parameter_passed_through(
+        self, sample_function_with_types: Callable[..., None]
+    ) -> None:
+        """Test that parameters not in signature are passed through."""
+        kwargs = {"name": "test", "unknown_param": "value"}
+        result = convert_arguments(sample_function_with_types, kwargs)
+        assert result["unknown_param"] == "value"
+
+    def test_function_without_annotations(
+        self, sample_function_no_annotations: Callable[..., None]
+    ) -> None:
+        """Test handling of functions without type annotations."""
+        kwargs = {"arg1": "value1", "arg2": "42"}
+        result = convert_arguments(sample_function_no_annotations, kwargs)
+        assert result["arg1"] == "value1"
+        assert result["arg2"] == "42"
+
+    def test_raises_error_on_conversion_failure(
+        self, sample_function_with_types: Callable[..., None]
+    ) -> None:
+        """Test that ArgumentConversionError is raised on conversion failure."""
+        kwargs = {"count": "not_a_number"}
+        with pytest.raises(ArgumentConversionError) as exc_info:
+            convert_arguments(sample_function_with_types, kwargs)
+        assert exc_info.value.param_name == "count"
+
+
+class TestArgumentConversionError:
+    """Tests for the ArgumentConversionError exception class."""
+
+    def test_error_with_param_name(self) -> None:
+        """Test creating error with parameter name."""
+        error = ArgumentConversionError("Test error", param_name="test_param")
+        assert error.param_name == "test_param"
+        assert str(error) == "Test error"
+
+    def test_error_without_param_name(self) -> None:
+        """Test creating error without parameter name."""
+        error = ArgumentConversionError("Test error")
+        assert error.param_name is None
+        assert str(error) == "Test error"
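The tests above pin down the expected behavior of the boolean and list converters without showing their implementation. The sketch below is one way to satisfy those particular assertions; `to_bool` and `to_list` are hypothetical names, and the real `strix.tools.argument_parser` functions may be written differently.

```python
import json
from typing import Any

# Assumed keyword sets, taken from the parametrized test values above.
TRUTHY = {"true", "1", "yes", "on"}
FALSY = {"false", "0", "no", "off"}


def to_bool(value: str) -> bool:
    """Map known keywords case-insensitively; otherwise use string truthiness."""
    lowered = value.strip().lower()
    if lowered in TRUTHY:
        return True
    if lowered in FALSY:
        return False
    return bool(value)  # "" is False, any other string is True


def to_list(value: str) -> list[Any]:
    """Parse JSON if possible; otherwise split on commas."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON: treat as comma-separated text.
        return [part.strip() for part in value.split(",")]
    # Valid JSON that is not an array is wrapped in a one-element list.
    return parsed if isinstance(parsed, list) else [parsed]
```

Note the design consequence the tests encode: an unrecognized non-empty string such as "anything" converts to True rather than raising, so a misspelled flag value passes silently.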
Author	SHA1	Message	Date
0xallam	78b6c26652	enhance todo tool prompt	2025-12-15 10:26:59 -08:00
0xallam	d649a7c70b	Update README.md	2025-12-15 10:11:08 -08:00
0xallam	d96852de55	chore: bump version to 0.5.0	2025-12-15 08:21:03 -08:00
0xallam	eb0c52b720	feat: add PyInstaller build for standalone binary distribution - Add PyInstaller spec file and build script for creating standalone executables - Add install.sh for curl \| sh installation from GitHub releases - Add GitHub Actions workflow for multi-platform builds (macOS, Linux, Windows) - Move sandbox-only deps (playwright, ipython, libtmux, etc.) to optional extras - Make google-cloud-aiplatform optional ([vertex] extra) to reduce binary size - Use lazy imports in tool actions to avoid loading sandbox deps at startup - Add -v/--version flag to CLI - Add website and Discord links to completion message - Binary size: ~97MB (down from ~120MB with all deps)	2025-12-15 08:21:03 -08:00
0xallam	2899021a21	chore(todo): encourage batched todo operations Strengthen schema guidance to batch todo creation, status updates, and completions while reducing unnecessary list refreshes to cut tool-call volume.	2025-12-15 07:41:33 -08:00
Ahmed Allam	0fcd5c46b2	Fix badge in README.md	2025-12-15 19:39:47 +04:00
0xallam	dcf77b31fc	chore(tools): raise sandbox execution timeout Increase default sandbox tool execution timeout from 120s to 500s while keeping connect timeout unchanged.	2025-12-14 20:40:00 -08:00
0xallam	37c8cffbe3	feat(tools): add bulk operations support to todo tools - update_todo: add `updates` param for bulk updates in one call - mark_todo_done: add `todo_ids` param to mark multiple todos done - mark_todo_pending: add `todo_ids` param to mark multiple pending - delete_todo: add `todo_ids` param to delete multiple todos - Increase todo renderer display limit from 10 to 25 - Maintains backward compatibility with single-ID usage - Update prompts to keep todos short-horizon and dynamic	2025-12-14 20:31:33 -08:00
0xallam	c29f13fd69	feat: add --scan-mode CLI option with quick/standard/deep modes Introduces scan mode selection to control testing depth and methodology: - quick: optimized for CI/CD, focuses on recent changes and high-impact vulns - standard: balanced coverage with systematic methodology - deep: exhaustive testing with hierarchical agent swarm (now default) Each mode has dedicated prompt modules with detailed pentesting guidelines covering reconnaissance, mapping, business logic analysis, exploitation, and vulnerability chaining strategies. Closes #152	2025-12-14 19:13:08 -08:00
Rohit Martires	5c995628bf	Feat: added support for non vision models STRIX_DISABLE_BROWSER flag (#188 ) Co-authored-by: 0xallam <ahmed39652003@gmail.com>	2025-12-14 23:45:43 +04:00
Ahmed Allam	624f1ed77f	feat(tui): add markdown rendering for agent messages (#197 ) Add AgentMessageRenderer to render agent messages with basic markdown support: - Headers (#, ##, ###, ####) - Bold (**text**) and italic (*text*) - Inline code and fenced code blocks - Links [text](url) and strikethrough Update system prompt to allow agents to use simple markdown formatting.	2025-12-14 22:53:07 +04:00
Ahmed Allam	2b926c733b	feat(tools): add dedicated todo tool for agent task tracking (#196 ) - Add new todo tool with create, list, update, mark_done, mark_pending, delete actions - Each subagent has isolated todo storage keyed by agent_id - Support bulk todo creation via JSON array or bullet list - Add TUI renderers for all todo actions with status markers - Update notes tool to remove priority and todo-related functionality - Add task tracking guidance to StrixAgent system prompt - Fix instruction file error handling in CLI	2025-12-14 22:16:02 +04:00
Ahmed Allam	a075ea1a0a	feat(tui): add syntax highlighting for tool renderers (#195 ) Add Pygments-based syntax highlighting with native hacker theme: - Python renderer: Python code highlighting - Browser renderer: JavaScript code highlighting - Terminal renderer: Bash command highlighting - File edit renderer: Auto-detect language from file extension, diff-style display	2025-12-14 04:39:28 +04:00
0xallam	5e3d14a1eb	chore: add Python 3.13 and 3.14 classifiers	2025-12-13 11:20:30 -08:00
Ahmed Allam	e57b7238f6	Update README to remove duplicate demo image	2025-12-12 21:59:16 +04:00
Ahmed Allam	13fe87d428	Add DeepWiki docs for Strix	2025-12-12 21:58:28 +04:00
K0IN	3e5845a0e1	Update GitHub Actions checkout action version (#189 )	2025-12-11 22:24:20 +04:00
Alexander De Battista Kvamme	9fedcf1551	Fix/ Long text instruction causes crash (#184 )	2025-12-08 23:23:51 +04:00
0xallam	1edd8eda01	fix: lint errors and code style improvements	2025-12-07 17:54:32 +02:00
0xallam	d8cb21bea3	chore: bump version to 0.4.1	2025-12-07 15:13:45 +02:00
0xallam	bd8d927f34	fix: add timeout to sandbox tool execution HTTP calls Replace timeout=None with configurable timeouts (120s execution, 10s connect) to prevent hung sandbox connections from blocking indefinitely. Configurable via STRIX_SANDBOX_EXECUTION_TIMEOUT and STRIX_SANDBOX_CONNECT_TIMEOUT environment variables.	2025-12-07 17:07:25 +04:00
0xallam	fc267564f5	chore: add google-cloud-aiplatform dependency Adds support for Vertex AI models via the google-cloud-aiplatform SDK.	2025-12-07 04:11:37 +04:00
0xallam	37c9b4b0e0	fix: make LLM_API_KEY optional for all providers Some providers like Vertex AI, AWS Bedrock, and local models don't require an API key as they use different authentication mechanisms.	2025-12-07 02:07:28 +02:00
0xallam	208b31a570	fix: filter out image_url content for non-vision models	2025-12-07 02:13:02 +04:00
Ahmed Allam	a14cb41745	chore: Bump litellm version	2025-12-07 01:38:21 +04:00
0xallam	4297c8f6e4	fix: pass api_key directly to litellm completion calls	2025-12-07 01:38:21 +04:00
0xallam	286d53384a	fix: set LITELLM_API_KEY env var for unified API key support	2025-12-07 01:38:21 +04:00
0xallam	ab40dbc33a	fix: improve request queue reliability and reduce stuck requests	2025-12-06 20:44:48 +02:00
dependabot[bot]	b6cb1302ce	chore(deps): bump urllib3 from 2.5.0 to 2.6.0 Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0) --- updated-dependencies: - dependency-name: urllib3 dependency-version: 2.6.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-12-06 16:23:55 +04:00
Ahmed Allam	b74132b2dc	Update README.md	2025-12-03 20:09:22 +00:00
Ahmed Allam	35dd9d0a8f	refactor(tests): reorganize unit tests module structure	2025-12-04 00:02:14 +04:00
Ahmed Allam	6c5c0b0d1c	chore: resolve linting errors in test modules	2025-12-04 00:02:14 +04:00
Jeong-Ryeol	65c3383ecc	test: add initial unit tests for argument_parser module Add comprehensive test suite for the argument_parser module including: - Tests for _convert_to_bool with truthy/falsy values - Tests for _convert_to_list with JSON and comma-separated inputs - Tests for _convert_to_dict with valid/invalid JSON - Tests for convert_string_to_type with various type annotations - Tests for convert_arguments with typed functions - Tests for ArgumentConversionError exception class This establishes the foundation for the project's test infrastructure with pytest configuration already in place.	2025-12-04 00:02:14 +04:00
Vincent Yang	919cb5e248	docs: add file-based instruction example (#165 ) Co-authored-by: 0xallam <ahmed39652003@gmail.com>	2025-12-03 22:59:59 +04:00
Vincent Yang	c97ff94617	feat: Show Model Name in Live Stats Panel (#169 ) Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>	2025-12-03 18:45:01 +00:00
dependabot[bot]	53c9da9213	chore(deps): bump cryptography from 43.0.3 to 44.0.1 (#163 ) Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.3 to 44.0.1. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/43.0.3...44.0.1) --- updated-dependencies: - dependency-name: cryptography dependency-version: 44.0.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-02 21:44:35 +04:00
dependabot[bot]	1e189c1245	chore(deps): bump fonttools from 4.59.1 to 4.61.0 (#161 ) Bumps [fonttools](https://github.com/fonttools/fonttools) from 4.59.1 to 4.61.0. - [Release notes](https://github.com/fonttools/fonttools/releases) - [Changelog](https://github.com/fonttools/fonttools/blob/main/NEWS.rst) - [Commits](https://github.com/fonttools/fonttools/compare/4.59.1...4.61.0) --- updated-dependencies: - dependency-name: fonttools dependency-version: 4.61.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-12-02 19:23:56 +04:00
Ahmed Allam	62f804b8b5	Update link in README	2025-12-01 16:04:46 +04:00
Ahmed Allam	5ff10e9d20	Add acknowledgements in README	2025-11-29 19:27:30 +04:00
Ahmed Allam	9825fb46ec	chore: Bump version for 0.4.0 release	2025-11-25 20:18:44 +04:00
Alexander De Battista Kvamme	c0e547928e	Real-time display panel for agent stats (#134 ) Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>	2025-11-25 12:06:20 +00:00
Trusthoodies	78d0148d58	Add open redirect, subdomain takeover, and info disclosure prompt modules (#132 ) Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>	2025-11-25 10:32:55 +00:00
dependabot[bot]	eebb76de3b	chore(deps): bump pypdf from 6.1.3 to 6.4.0 Bumps [pypdf](https://github.com/py-pdf/pypdf) from 6.1.3 to 6.4.0. - [Release notes](https://github.com/py-pdf/pypdf/releases) - [Changelog](https://github.com/py-pdf/pypdf/blob/main/CHANGELOG.md) - [Commits](https://github.com/py-pdf/pypdf/compare/6.1.3...6.4.0) --- updated-dependencies: - dependency-name: pypdf dependency-version: 6.4.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2025-11-25 12:44:38 +04:00
Ahmed Allam	2ae1b3ddd1	Update README	2025-11-23 22:29:44 +04:00
Ahmed Allam	a11cd09a93	feat: support file-based instructions for detailed test configuration	2025-11-23 00:46:37 +04:00
Ahmed Allam	68ebdb2b6d	feat: enhance run name generation to include target information	2025-11-22 22:54:07 +04:00
Ahmed Allam	5befb32318	feat: implement incremental pentest data persistence	2025-11-22 22:54:07 +04:00
cyberseall	86e6ed49bb	feat(llm): make LLM request queue rate limits configurable and more conservative Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>	2025-11-22 17:07:43 +00:00
Ahmed Allam	0c811845f1	docs: update README	2025-11-21 23:07:11 +04:00
Ahmed Allam	383d53c7a9	feat(agent): implement agent identity guidline and improve system prompt	2025-11-15 16:21:05 +04:00
Ahmed Allam	478bf5d4d3	refactor(llm): remove unused temperature parameter from LLMConfig	2025-11-15 12:44:40 +04:00
Ahmed Allam	d1f7741965	feat(llm): enhance model features handling with pattern matching	2025-11-15 12:43:43 +04:00
Ahmed Allam	821929cd3e	fix(agent): increase waiting time threshold from 120 to 600 seconds	2025-11-15 12:39:46 +04:00
Ahmed Allam	5de16d2953	chore: Bump LiteLLM version	2025-11-15 12:37:22 +04:00
Ahmed Allam	6a2a62c121	chore: Fix formatting in README.md	2025-11-14 16:07:54 +00:00
Ahmed Allam	426dd27454	chore: Minor readme tweaks. Bump version for 0.3.4 release	2025-11-14 20:02:48 +04:00
Mark Percival	cedc65409e	fix: link	2025-11-14 20:02:48 +04:00
Mark Percival	72d5a73386	Chore: Update README	2025-11-14 20:02:48 +04:00
Ahmed Allam	dab69af033	fix(runtime): correct DOCKER_HOST parsing for sandbox URL	2025-11-14 02:41:00 +04:00
Ahmed Allam	6abb53dc02	feat: support scanning IP addresses	2025-11-14 01:38:58 +04:00
Ahmed Allam	f1d2961779	Update README	2025-11-12 19:29:01 +04:00
purpl3horse	2b7a8e3ee7	Update README.md Instruction argument was written in plural in the readme ( a typo )	2025-11-12 19:03:27 +04:00
Ahmed Allam	3e7466a533	chore: Bump version for 0.3.3 release	2025-11-12 18:58:03 +04:00
Ahmed Allam	1abfb360e4	feat: add configurable timeout for LLM requests	2025-11-12 18:58:03 +04:00
Ahmed Allam	795ed02955	docs: update README with recommended models	2025-11-12 15:01:15 +04:00
Alexei Macheret Artur	2cb0c31897	chore(deps): bump starlette from 0.46.2 to 0.49.1 (#75 ) Bumps [starlette](https://github.com/Kludex/starlette) from 0.46.2 to 0.49.1. - [Release notes](https://github.com/Kludex/starlette/releases) - [Changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md) - [Commits](https://github.com/Kludex/starlette/compare/0.46.2...0.49.1) --- updated-dependencies: - dependency-name: starlette dependency-version: 0.49.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-11-10 14:19:18 +04:00
m4ki3lf0	1c8780cf81	Update Readme Co-authored-by: m4ki3lf0 <m4ki3lf0@git.com> Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>	2025-11-10 09:49:37 +00:00
Ahmed Allam	b6d9d941cf	Update README	2025-11-08 15:07:53 +04:00
Ahmed Allam	edd628bbc1	Chore: fix discord link in readme	2025-11-07 18:03:47 +04:00
Ahmed Allam	d76c7c55b2	Fix: update litellm dependency version	2025-11-05 12:40:44 +02:00
Ahmed Allam	b5ddba3867	docs: Update README	2025-11-05 01:21:48 +02:00