chore: Bump strix version to 0.6.0

feat: modernize TUI status bar with sweep animation
- Replace braille spinner with ping-pong sweep animation using colored squares
- Add smooth gradient fade with 8 color steps from dim to bright green
- Modernize keymap styling: keys in white, actions in dim, separated by ·
- Move "esc stop" to left side next to animation
- Change ctrl-c to ctrl-q for quit
- Simplify "Initializing Agent" to just "Initializing"
- Remove italic styling from status text
- Waiting state shows only "Send message to resume" hint
- Remove unused action verbs and related dead code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-12 09:19:19 -08:00 · 2026-01-11 23:54:24 -08:00 · 2026-01-10 15:53:10 -08:00 · 2026-01-10 15:49:03 -08:00 · 2026-01-10 15:49:03 -08:00 · 2026-01-10 15:49:03 -08:00
114 changed files with 9083 additions and 2538 deletions
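
The ping-pong sweep and 8-step gradient described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from this commit: the function names, the bar width, the green-channel endpoints, and the fall-off of two palette steps per square are all assumptions chosen for the example.

```python
from itertools import cycle, islice


def ping_pong(width: int):
    """Return an endless iterator over 0, 1, ..., width-1, width-2, ..., 1, 0, 1, ..."""
    if width < 1:
        raise ValueError("width must be at least 1")
    # Walk forward, then back again without repeating either end point.
    positions = list(range(width)) + list(range(width - 2, 0, -1))
    return cycle(positions)


def green_gradient(steps: int = 8) -> list[str]:
    """Return `steps` hex colors fading from dim to bright green (assumed values)."""
    if steps < 2:
        raise ValueError("a gradient needs at least 2 steps")
    dim, bright = 0x30, 0xF0  # hypothetical green-channel endpoints
    return [f"#00{dim + (bright - dim) * i // (steps - 1):02x}00" for i in range(steps)]


def render_frame(head: int, width: int, palette: list[str]) -> list[str]:
    """Color each square by its distance from the sweep head; distant squares stay dim."""
    brightest = len(palette) - 1
    return [palette[max(0, brightest - abs(i - head) * 2)] for i in range(width)]


def preview(width: int = 5, frames: int = 3) -> list[list[str]]:
    """Render the first few frames, for example to feed a status-bar widget."""
    palette = green_gradient()
    return [render_frame(head, width, palette) for head in islice(ping_pong(width), frames)]
```

Each frame is a list of colors, one per square, so a renderer only has to paint the squares in order on every tick.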
--- a/.github/workflows/build-release.yml
+++ b/.github/workflows/build-release.yml
@@ -0,0 +1,78 @@
+name: Build & Release
+
+on:
+  push:
+    tags:
+      - 'v*'
+  workflow_dispatch:
+
+jobs:
+  build:
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - os: macos-latest
+            target: macos-arm64
+          - os: macos-15-intel
+            target: macos-x86_64
+          - os: ubuntu-latest
+            target: linux-x86_64
+          - os: windows-latest
+            target: windows-x86_64
+
+    runs-on: ${{ matrix.os }}
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - uses: snok/install-poetry@v1
+
+      - name: Build
+        shell: bash
+        run: |
+          poetry install --with dev
+          poetry run pyinstaller strix.spec --noconfirm
+
+          VERSION=$(poetry version -s)
+          mkdir -p dist/release
+
+          if [[ "${{ runner.os }}" == "Windows" ]]; then
+            cp dist/strix.exe "dist/release/strix-${VERSION}-${{ matrix.target }}.exe"
+            (cd dist/release && 7z a "strix-${VERSION}-${{ matrix.target }}.zip" "strix-${VERSION}-${{ matrix.target }}.exe")
+          else
+            cp dist/strix "dist/release/strix-${VERSION}-${{ matrix.target }}"
+            chmod +x "dist/release/strix-${VERSION}-${{ matrix.target }}"
+            tar -C dist/release -czvf "dist/release/strix-${VERSION}-${{ matrix.target }}.tar.gz" "strix-${VERSION}-${{ matrix.target }}"
+          fi
+
+      - uses: actions/upload-artifact@v4
+        with:
+          name: strix-${{ matrix.target }}
+          path: |
+            dist/release/*.tar.gz
+            dist/release/*.zip
+          if-no-files-found: error
+
+  release:
+    needs: build
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+
+    steps:
+      - uses: actions/download-artifact@v4
+        with:
+          path: release
+          merge-multiple: true
+
+      - name: Create Release
+        uses: softprops/action-gh-release@v2
+        with:
+          prerelease: ${{ !startsWith(github.ref, 'refs/tags/') }}
+          generate_release_notes: true
+          files: release/*
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -39,14 +39,14 @@ Thank you for your interest in contributing to Strix! This guide will help you g
   poetry run strix --target https://example.com
   ```

-## 📚 Contributing Prompt Modules
+## 📚 Contributing Skills

-Prompt modules are specialized knowledge packages that enhance agent capabilities. See [strix/prompts/README.md](strix/prompts/README.md) for detailed guidelines.
+Skills are specialized knowledge packages that enhance agent capabilities. See [strix/skills/README.md](strix/skills/README.md) for detailed guidelines.

 ### Quick Guide

 1. **Choose the right category** (`/vulnerabilities`, `/frameworks`, `/technologies`, etc.)
-2. **Create a** `.jinja` file with your prompts
+2. **Create a** `.jinja` file with your skill content
 3. **Include practical examples** - Working payloads, commands, or test cases
 4. **Provide validation methods** - How to confirm findings and avoid false positives
 5. **Submit via PR** with clear description
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
 <p align="center">
-  <a href="https://usestrix.com/">
+  <a href="https://strix.ai/">
    <img src=".github/logo.png" width="150" alt="Strix Logo">
  </a>
 </p>
@@ -12,15 +12,18 @@

 [![Python](https://img.shields.io/pypi/pyversions/strix-agent?color=3776AB)](https://pypi.org/project/strix-agent/)
 [![PyPI](https://img.shields.io/pypi/v/strix-agent?color=10b981)](https://pypi.org/project/strix-agent/)
-[![PyPI Downloads](https://static.pepy.tech/personalized-badge/strix-agent?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=RED&left_text=Downloads)](https://pepy.tech/projects/strix-agent)
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
+[![Docs](https://img.shields.io/badge/Docs-docs.strix.ai-10b981.svg)](https://docs.strix.ai)

 [![GitHub Stars](https://img.shields.io/github/stars/usestrix/strix)](https://github.com/usestrix/strix)
 [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.gg/YjKFvEZSdZ)
-[![Website](https://img.shields.io/badge/Website-usestrix.com-2d3748.svg)](https://usestrix.com)
+[![Website](https://img.shields.io/badge/Website-strix.ai-2d3748.svg)](https://strix.ai)

 <a href="https://trendshift.io/repositories/15362" target="_blank"><img src="https://trendshift.io/api/badge/repositories/15362" alt="usestrix%2Fstrix | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

+
+[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/usestrix/strix)
+
 </div>

 <br>
@@ -62,13 +65,15 @@ Strix are autonomous AI agents that act just like real hackers - they run your c

 **Prerequisites:**
 - Docker (running)
- Python 3.12+
 - An LLM provider key (e.g. [get OpenAI API key](https://platform.openai.com/api-keys) or use a local LLM)

 ### Installation & First Scan

 ```bash
 # Install Strix
+curl -sSL https://strix.ai/install | bash
+
+# Or via pipx
 pipx install strix-agent

 # Configure your AI provider
@@ -84,7 +89,7 @@ strix --target ./app-directory

 ## ☁️ Run Strix in Cloud

-Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.usestrix.com](https://app.usestrix.com)**.
+Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.strix.ai](https://strix.ai)**.

 Launch a scan in just a few minutes—no setup or configuration required—and you’ll get:

@@ -93,7 +98,7 @@ Launch a scan in just a few minutes—no setup or configuration required—and y
 - **CI/CD and GitHub integrations** to block risky changes before production
 - **Continuous monitoring** so new vulnerabilities are caught quickly

-[**Run your first pentest now →**](https://app.usestrix.com)
+[**Run your first pentest now →**](https://strix.ai)

 ---

@@ -159,6 +164,9 @@ strix -t https://github.com/org/app -t https://your-app.com

 # Focused testing with custom instructions
 strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"
+
+# Provide detailed instructions through a file (e.g., rules of engagement, scope, exclusions)
+strix --target api.your-app.com --instruction-file ./instruction.md
 ```

 ### 🤖 Headless Mode
@@ -183,17 +191,17 @@ jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6

      - name: Install Strix
-        run: pipx install strix-agent
+        run: curl -sSL https://strix.ai/install | bash

      - name: Run Strix
        env:
          STRIX_LLM: ${{ secrets.STRIX_LLM }}
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}

-        run: strix -n -t ./
+        run: strix -n -t ./ --scan-mode quick
 ```

 ### ⚙️ Configuration
@@ -205,27 +213,27 @@ export LLM_API_KEY="your-api-key"
 # Optional
 export LLM_API_BASE="your-api-base-url"  # if using a local model, e.g. Ollama, LMStudio
 export PERPLEXITY_API_KEY="your-api-key"  # for search capabilities
+export STRIX_REASONING_EFFORT="high"  # control thinking effort (default: high, quick scan: medium)
 ```

-[OpenAI's GPT-5](https://openai.com/api/) (`openai/gpt-5`) and [Anthropic's Claude Sonnet 4.5](https://claude.com/platform/api) (`anthropic/claude-sonnet-4-5`) are the recommended models for best results with Strix. We also support many [other options](https://docs.litellm.ai/docs/providers), including cloud and local models, though their performance and reliability may vary.
+> [!NOTE]
+> Strix automatically saves your configuration to `~/.strix/cli-config.json`, so you don't have to re-enter it on every run.
+
+**Recommended models for best results:**
+
+- [OpenAI GPT-5](https://openai.com/api/) — `openai/gpt-5`
+- [Anthropic Claude Sonnet 4.5](https://claude.com/platform/api) — `anthropic/claude-sonnet-4-5`
+- [Google Gemini 3 Pro Preview](https://cloud.google.com/vertex-ai) — `vertex_ai/gemini-3-pro-preview`
+
+See the [LLM Providers documentation](https://docs.strix.ai/llm-providers/overview) for all supported providers including Vertex AI, Bedrock, Azure, and local models.
+
+## 📚 Documentation
+
+Full documentation is available at **[docs.strix.ai](https://docs.strix.ai)** — including detailed guides for usage, CI/CD integrations, skills, and advanced configuration.

 ## 🤝 Contributing

-We welcome contributions from the community! There are several ways to contribute:
-
-### Code Contributions
-See our [Contributing Guide](CONTRIBUTING.md) for details on:
- Setting up your development environment
- Running tests and quality checks
- Submitting pull requests
- Code style guidelines
-
-
-### Prompt Modules Collection
-Help expand our collection of specialized prompt modules for AI agents:
- Advanced testing techniques for vulnerabilities, frameworks, and technologies
- See [Prompt Modules Documentation](strix/prompts/README.md) for guidelines
- Submit via [pull requests](https://github.com/usestrix/strix/pulls) or [issues](https://github.com/usestrix/strix/issues)
+We welcome contributions of code, docs, and new skills - check out our [Contributing Guide](https://docs.strix.ai/contributing) to get started or open a [pull request](https://github.com/usestrix/strix/pulls)/[issue](https://github.com/usestrix/strix/issues).

 ## 👥 Join Our Community

@@ -234,6 +242,10 @@ Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://d
 ## 🌟 Support the Project

 **Love Strix?** Give us a ⭐ on GitHub!
+## 🙏 Acknowledgements
+
+Strix builds on the incredible work of open-source projects like [LiteLLM](https://github.com/BerriAI/litellm), [Caido](https://github.com/caido/caido), [ProjectDiscovery](https://github.com/projectdiscovery), [Playwright](https://github.com/microsoft/playwright), and [Textual](https://github.com/Textualize/textual). Huge thanks to their maintainers!
+

 > [!WARNING]
 > Only test apps you own or have permission to test. You are responsible for using Strix ethically and legally.
--- a/containers/Dockerfile
+++ b/containers/Dockerfile
@@ -40,10 +40,11 @@ RUN apt-get update && \
    gdb \
    tmux \
    libnss3 libnspr4 libdbus-1-3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libatspi2.0-0 \
-    libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2 \
+    libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2t64 \
    fonts-unifont fonts-noto-color-emoji fonts-freefont-ttf fonts-dejavu-core ttf-bitstream-vera \
    libnss3-tools

+
 RUN setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip $(which nmap)

 USER pentester
@@ -158,7 +159,7 @@ RUN mkdir -p /workspace && chown -R pentester:pentester /workspace /app
 COPY pyproject.toml poetry.lock ./

 USER pentester
-RUN poetry install --no-root --without dev
+RUN poetry install --no-root --without dev --extras sandbox
 RUN poetry run playwright install chromium

 RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \
--- a/poetry.lock
+++ b/poetry.lock
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "strix-agent"
-version = "0.4.0"
+version = "0.6.0"
 description = "Open-source AI Hackers for your apps"
 authors = ["Strix <hi@usestrix.com>"]
 readme = "README.md"
@@ -26,6 +26,8 @@ classifiers = [
  "Programming Language :: Python :: 3",
  "Programming Language :: Python :: 3 :: Only",
  "Programming Language :: Python :: 3.12",
+  "Programming Language :: Python :: 3.13",
+  "Programming Language :: Python :: 3.14",
 ]
 packages = [
  { include = "strix", format = ["sdist", "wheel"] }
@@ -43,24 +45,34 @@ strix = "strix.interface.main:main"

 [tool.poetry.dependencies]
 python = "^3.12"
-fastapi = "*"
-uvicorn = "*"
-litellm = { version = "~1.79.1", extras = ["proxy"] }
-openai = ">=1.99.5,<1.100.0"
+# Core CLI dependencies
+litellm = { version = "~1.80.7", extras = ["proxy"] }
 tenacity = "^9.0.0"
-numpydoc = "^1.8.0"
 pydantic = {extras = ["email"], version = "^2.11.3"}
-ipython = "^9.3.0"
-openhands-aci = "^0.3.0"
-playwright = "^1.48.0"
 rich = "*"
 docker = "^7.1.0"
-gql = {extras = ["requests"], version = "^3.5.3"}
 textual = "^4.0.0"
 xmltodict = "^0.13.0"
-pyte = "^0.8.1"
 requests = "^2.32.0"
-libtmux = "^0.46.2"
+cvss = "^3.2"
+
+# Optional LLM provider dependencies
+google-cloud-aiplatform = { version = ">=1.38", optional = true }
+
+# Sandbox-only dependencies (only needed inside Docker container)
+fastapi = { version = "*", optional = true }
+uvicorn = { version = "*", optional = true }
+ipython = { version = "^9.3.0", optional = true }
+openhands-aci = { version = "^0.3.0", optional = true }
+playwright = { version = "^1.48.0", optional = true }
+gql = { version = "^3.5.3", extras = ["requests"], optional = true }
+pyte = { version = "^0.8.1", optional = true }
+libtmux = { version = "^0.46.2", optional = true }
+numpydoc = { version = "^1.8.0", optional = true }
+
+[tool.poetry.extras]
+vertex = ["google-cloud-aiplatform"]
+sandbox = ["fastapi", "uvicorn", "ipython", "openhands-aci", "playwright", "gql", "pyte", "libtmux", "numpydoc"]

 [tool.poetry.group.dev.dependencies]
 # Type checking and static analysis
@@ -81,6 +93,9 @@ pre-commit = "^4.2.0"
 black = "^25.1.0"
 isort = "^6.0.1"

+# Build tools
+pyinstaller = { version = "^6.17.0", python = ">=3.12,<3.15" }
+
 [build-system]
 requires = ["poetry-core"]
 build-backend = "poetry.core.masonry.api"
@@ -129,9 +144,16 @@ module = [
    "textual.*",
    "pyte.*",
    "libtmux.*",
+    "pytest.*",
+    "cvss.*",
 ]
 ignore_missing_imports = true

+# Relax strict rules for test files (pytest decorators are not fully typed)
+[[tool.mypy.overrides]]
+module = ["tests.*"]
+disallow_untyped_decorators = false
+
 # ============================================================================
 # Ruff Configuration (Fast Python Linter & Formatter)
 # ============================================================================
@@ -321,7 +343,6 @@ addopts = [
    "--cov-report=term-missing",
    "--cov-report=html",
    "--cov-report=xml",
-    "--cov-fail-under=80"
 ]
 testpaths = ["tests"]
 python_files = ["test_*.py", "*_test.py"]
--- a/scripts/build.sh
+++ b/scripts/build.sh
@@ -0,0 +1,98 @@
+#!/bin/bash
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+
+echo -e "${BLUE}🦉 Strix Build Script${NC}"
+echo "================================"
+
+OS="$(uname -s)"
+ARCH="$(uname -m)"
+
+case "$OS" in
+    Linux*)     OS_NAME="linux";;
+    Darwin*)    OS_NAME="macos";;
+    MINGW*|MSYS*|CYGWIN*) OS_NAME="windows";;
+    *)          OS_NAME="unknown";;
+esac
+
+case "$ARCH" in
+    x86_64|amd64)   ARCH_NAME="x86_64";;
+    arm64|aarch64)  ARCH_NAME="arm64";;
+    *)              ARCH_NAME="$ARCH";;
+esac
+
+echo -e "${YELLOW}Platform:${NC} $OS_NAME-$ARCH_NAME"
+
+cd "$PROJECT_ROOT"
+
+if ! command -v poetry &> /dev/null; then
+    echo -e "${RED}Error: Poetry is not installed${NC}"
+    echo "Please install Poetry first: https://python-poetry.org/docs/#installation"
+    exit 1
+fi
+
+echo -e "\n${BLUE}Installing dependencies...${NC}"
+poetry install --with dev
+
+VERSION=$(poetry version -s)
+echo -e "${YELLOW}Version:${NC} $VERSION"
+
+echo -e "\n${BLUE}Cleaning previous builds...${NC}"
+rm -rf build/ dist/
+
+echo -e "\n${BLUE}Building binary with PyInstaller...${NC}"
+poetry run pyinstaller strix.spec --noconfirm
+
+RELEASE_DIR="dist/release"
+mkdir -p "$RELEASE_DIR"
+
+BINARY_NAME="strix-${VERSION}-${OS_NAME}-${ARCH_NAME}"
+
+if [ "$OS_NAME" = "windows" ]; then
+    if [ ! -f "dist/strix.exe" ]; then
+        echo -e "${RED}Build failed: Binary not found${NC}"
+        exit 1
+    fi
+    BINARY_NAME="${BINARY_NAME}.exe"
+    cp "dist/strix.exe" "$RELEASE_DIR/$BINARY_NAME"
+    echo -e "\n${BLUE}Creating zip...${NC}"
+    ARCHIVE_NAME="${BINARY_NAME%.exe}.zip"
+
+    if command -v 7z &> /dev/null; then
+        7z a "$RELEASE_DIR/$ARCHIVE_NAME" "$RELEASE_DIR/$BINARY_NAME"
+    else
+        powershell -Command "Compress-Archive -Path '$RELEASE_DIR/$BINARY_NAME' -DestinationPath '$RELEASE_DIR/$ARCHIVE_NAME'"
+    fi
+    echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
+else
+    if [ ! -f "dist/strix" ]; then
+        echo -e "${RED}Build failed: Binary not found${NC}"
+        exit 1
+    fi
+    cp "dist/strix" "$RELEASE_DIR/$BINARY_NAME"
+    chmod +x "$RELEASE_DIR/$BINARY_NAME"
+    echo -e "\n${BLUE}Creating tarball...${NC}"
+    ARCHIVE_NAME="${BINARY_NAME}.tar.gz"
+    tar -czvf "$RELEASE_DIR/$ARCHIVE_NAME" -C "$RELEASE_DIR" "$BINARY_NAME"
+    echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
+fi
+
+echo -e "\n${GREEN}Build successful!${NC}"
+echo "================================"
+echo -e "${YELLOW}Binary:${NC} $RELEASE_DIR/$BINARY_NAME"
+
+SIZE=$(ls -lh "$RELEASE_DIR/$BINARY_NAME" | awk '{print $5}')
+echo -e "${YELLOW}Size:${NC} $SIZE"
+
+echo -e "\n${BLUE}Testing binary...${NC}"
+"$RELEASE_DIR/$BINARY_NAME" --help > /dev/null 2>&1 && echo -e "${GREEN}Binary test passed!${NC}" || echo -e "${RED}Binary test failed${NC}"
+
+echo -e "\n${GREEN}Done!${NC}"
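
The platform detection in `scripts/build.sh` boils down to two lookups on the output of `uname -s` and `uname -m`. The sketch below mirrors those two `case` statements in Python to make the mapping easy to check; the function name `normalize_platform` is hypothetical and is not part of the repository.

```python
def normalize_platform(uname_s: str, uname_m: str) -> str:
    """Map raw `uname -s` / `uname -m` output to an `<os>-<arch>` name."""
    # Prefix matches, like the `Linux*`, `Darwin*`, `MINGW*|MSYS*|CYGWIN*` patterns.
    os_prefixes = {
        "Linux": "linux",
        "Darwin": "macos",
        "MINGW": "windows",
        "MSYS": "windows",
        "CYGWIN": "windows",
    }
    os_name = next(
        (name for prefix, name in os_prefixes.items() if uname_s.startswith(prefix)),
        "unknown",
    )
    # Exact matches; any other architecture string is passed through unchanged.
    arch_aliases = {
        "x86_64": "x86_64",
        "amd64": "x86_64",
        "arm64": "arm64",
        "aarch64": "arm64",
    }
    arch_name = arch_aliases.get(uname_m, uname_m)
    return f"{os_name}-{arch_name}"
```

The result corresponds to the `$OS_NAME-$ARCH_NAME` part of the binary name that the script assembles.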
--- a/scripts/install.sh
+++ b/scripts/install.sh
@@ -0,0 +1,328 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+APP=strix
+REPO="usestrix/strix"
+STRIX_IMAGE="ghcr.io/usestrix/strix-sandbox:0.1.10"
+
+MUTED='\033[0;2m'
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+CYAN='\033[0;36m'
+NC='\033[0m'
+
+requested_version=${VERSION:-}
+SKIP_DOWNLOAD=false
+
+raw_os=$(uname -s)
+os=$(echo "$raw_os" | tr '[:upper:]' '[:lower:]')
+case "$raw_os" in
+  Darwin*) os="macos" ;;
+  Linux*) os="linux" ;;
+  MINGW*|MSYS*|CYGWIN*) os="windows" ;;
+esac
+
+arch=$(uname -m)
+if [[ "$arch" == "aarch64" ]]; then
+  arch="arm64"
+fi
+if [[ "$arch" == "x86_64" ]]; then
+  arch="x86_64"
+fi
+
+if [ "$os" = "macos" ] && [ "$arch" = "x86_64" ]; then
+  rosetta_flag=$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)
+  if [ "$rosetta_flag" = "1" ]; then
+    arch="arm64"
+  fi
+fi
+
+combo="$os-$arch"
+case "$combo" in
+  linux-x86_64|macos-x86_64|macos-arm64|windows-x86_64)
+    ;;
+  *)
+    echo -e "${RED}Unsupported OS/Arch: $os/$arch${NC}"
+    exit 1
+    ;;
+esac
+
+archive_ext=".tar.gz"
+if [ "$os" = "windows" ]; then
+  archive_ext=".zip"
+fi
+
+target="$os-$arch"
+
+if [ "$os" = "linux" ]; then
+    if ! command -v tar >/dev/null 2>&1; then
+         echo -e "${RED}Error: 'tar' is required but not installed.${NC}"
+         exit 1
+    fi
+fi
+
+if [ "$os" = "windows" ]; then
+    if ! command -v unzip >/dev/null 2>&1; then
+        echo -e "${RED}Error: 'unzip' is required but not installed.${NC}"
+        exit 1
+    fi
+fi
+
+INSTALL_DIR=$HOME/.strix/bin
+mkdir -p "$INSTALL_DIR"
+
+if [ -z "$requested_version" ]; then
+    specific_version=$(curl -s "https://api.github.com/repos/$REPO/releases/latest" | sed -n 's/.*"tag_name": *"v\([^"]*\)".*/\1/p') || true
+    if [[ -z "$specific_version" ]]; then
+        echo -e "${RED}Failed to fetch version information${NC}"
+        exit 1
+    fi
+else
+    specific_version=$requested_version
+fi
+
+filename="$APP-${specific_version}-${target}${archive_ext}"
+url="https://github.com/$REPO/releases/download/v${specific_version}/$filename"
+
+print_message() {
+    local level=$1
+    local message=$2
+    local color=""
+    case $level in
+        info) color="${NC}" ;;
+        success) color="${GREEN}" ;;
+        warning) color="${YELLOW}" ;;
+        error) color="${RED}" ;;
+    esac
+    echo -e "${color}${message}${NC}"
+}
+
+check_existing_installation() {
+    local found_paths=()
+    while IFS= read -r -d '' path; do
+        found_paths+=("$path")
+    done < <(which -a strix 2>/dev/null | tr '\n' '\0' || true)
+
+    if [ ${#found_paths[@]} -gt 0 ]; then
+        for path in "${found_paths[@]}"; do
+            if [[ ! -e "$path" ]] || [[ "$path" == "$INSTALL_DIR/strix"* ]]; then
+                continue
+            fi
+
+            if [[ -n "$path" ]]; then
+                echo -e "${MUTED}Found existing strix at: ${NC}$path"
+
+                if [[ "$path" == *".local/bin"* ]]; then
+                    echo -e "${MUTED}Removing old pipx installation...${NC}"
+                    if command -v pipx >/dev/null 2>&1; then
+                        pipx uninstall strix-agent 2>/dev/null || true
+                    fi
+                    rm -f "$path" 2>/dev/null || true
+                elif [[ -L "$path" || -f "$path" ]]; then
+                    echo -e "${MUTED}Removing old installation...${NC}"
+                    rm -f "$path" 2>/dev/null || true
+                fi
+            fi
+        done
+    fi
+}
+
+check_version() {
+    check_existing_installation
+
+    if [[ -x "$INSTALL_DIR/strix" ]]; then
+        installed_version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "")
+        if [[ "$installed_version" == "$specific_version" ]]; then
+            print_message info "${GREEN}✓ Strix ${NC}$specific_version${GREEN} already installed${NC}"
+            SKIP_DOWNLOAD=true
+        elif [[ -n "$installed_version" ]]; then
+            print_message info "${MUTED}Installed: ${NC}$installed_version ${MUTED}→ Upgrading to ${NC}$specific_version"
+        fi
+    fi
+}
+
+download_and_install() {
+    print_message info "\n${CYAN}🦉 Installing Strix${NC} ${MUTED}version: ${NC}$specific_version"
+    print_message info "${MUTED}Platform: ${NC}$target\n"
+
+    local tmp_dir=$(mktemp -d)
+    cd "$tmp_dir"
+
+    echo -e "${MUTED}Downloading...${NC}"
+    curl -# -L -o "$filename" "$url"
+
+    if [ ! -f "$filename" ]; then
+        echo -e "${RED}Download failed${NC}"
+        exit 1
+    fi
+
+    echo -e "${MUTED}Extracting...${NC}"
+    if [ "$os" = "windows" ]; then
+        unzip -q "$filename"
+        mv "strix-${specific_version}-${target}.exe" "$INSTALL_DIR/strix.exe"
+    else
+        tar -xzf "$filename"
+        mv "strix-${specific_version}-${target}" "$INSTALL_DIR/strix"
+        chmod 755 "$INSTALL_DIR/strix"
+    fi
+
+    cd - > /dev/null
+    rm -rf "$tmp_dir"
+
+    echo -e "${GREEN}✓ Strix installed to $INSTALL_DIR${NC}"
+}
+
+check_docker() {
+    echo ""
+    if ! command -v docker >/dev/null 2>&1; then
+        echo -e "${YELLOW}⚠ Docker not found${NC}"
+        echo -e "${MUTED}Strix requires Docker to run the security sandbox.${NC}"
+        echo -e "${MUTED}Please install Docker: ${NC}https://docs.docker.com/get-docker/"
+        echo ""
+        return 1
+    fi
+
+    if ! docker info >/dev/null 2>&1; then
+        echo -e "${YELLOW}⚠ Docker daemon not running${NC}"
+        echo -e "${MUTED}Please start Docker and run: ${NC}docker pull $STRIX_IMAGE"
+        echo ""
+        return 1
+    fi
+
+    echo -e "${MUTED}Checking for sandbox image...${NC}"
+    if docker image inspect "$STRIX_IMAGE" >/dev/null 2>&1; then
+        echo -e "${GREEN}✓ Sandbox image already available${NC}"
+    else
+        echo -e "${MUTED}Pulling sandbox image (this may take a few minutes)...${NC}"
+        if docker pull "$STRIX_IMAGE"; then
+            echo -e "${GREEN}✓ Sandbox image pulled successfully${NC}"
+        else
+            echo -e "${YELLOW}⚠ Failed to pull sandbox image${NC}"
+            echo -e "${MUTED}You can pull it manually later: ${NC}docker pull $STRIX_IMAGE"
+        fi
+    fi
+    return 0
+}
+
+add_to_path() {
+    local config_file=$1
+    local command=$2
+    if grep -Fxq "$command" "$config_file" 2>/dev/null; then
+        return 0
+    elif [[ -w $config_file ]]; then
+        echo -e "\n# strix" >> "$config_file"
+        echo "$command" >> "$config_file"
+    fi
+}
+
+setup_path() {
+    XDG_CONFIG_HOME=${XDG_CONFIG_HOME:-$HOME/.config}
+    current_shell=$(basename "$SHELL")
+
+    case $current_shell in
+        fish)
+            config_files="$HOME/.config/fish/config.fish"
+            ;;
+        zsh)
+            config_files="$HOME/.zshrc $HOME/.zshenv"
+            ;;
+        bash)
+            config_files="$HOME/.bashrc $HOME/.bash_profile $HOME/.profile"
+            ;;
+        *)
+            config_files="$HOME/.bashrc $HOME/.profile"
+            ;;
+    esac
+
+    config_file=""
+    for file in $config_files; do
+        if [[ -f $file ]]; then
+            config_file=$file
+            break
+        fi
+    done
+
+    if [[ -z $config_file ]]; then
+        config_file="$HOME/.bashrc"
+        touch "$config_file"
+    fi
+
+    if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
+        case $current_shell in
+            fish)
+                add_to_path "$config_file" "fish_add_path $INSTALL_DIR"
+                ;;
+            *)
+                add_to_path "$config_file" "export PATH=\"$INSTALL_DIR:\$PATH\""
+                ;;
+        esac
+    fi
+
+    if [ -n "${GITHUB_ACTIONS-}" ] && [ "${GITHUB_ACTIONS}" == "true" ]; then
+        echo "$INSTALL_DIR" >> "$GITHUB_PATH"
+    fi
+}
+
+verify_installation() {
+    export PATH="$INSTALL_DIR:$PATH"
+
+    local which_strix=$(which strix 2>/dev/null || echo "")
+
+    if [[ "$which_strix" != "$INSTALL_DIR/strix" && "$which_strix" != "$INSTALL_DIR/strix.exe" ]]; then
+        if [[ -n "$which_strix" ]]; then
+            echo -e "${YELLOW}⚠ Found conflicting strix at: ${NC}$which_strix"
+            echo -e "${MUTED}Attempting to remove...${NC}"
+
+            if rm -f "$which_strix" 2>/dev/null; then
+                echo -e "${GREEN}✓ Removed conflicting installation${NC}"
+            else
+                echo -e "${YELLOW}Could not remove automatically.${NC}"
+                echo -e "${MUTED}Please remove manually: ${NC}rm $which_strix"
+            fi
+        fi
+    fi
+
+    if [[ -x "$INSTALL_DIR/strix" ]]; then
+        local version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "unknown")
+        echo -e "${GREEN}✓ Strix ${NC}$version${GREEN} ready${NC}"
+    fi
+}
+
+check_version
+if [ "$SKIP_DOWNLOAD" = false ]; then
+    download_and_install
+fi
+setup_path
+verify_installation
+check_docker
+
+echo ""
+echo -e "${CYAN}"
+echo "   ███████╗████████╗██████╗ ██╗██╗  ██╗"
+echo "   ██╔════╝╚══██╔══╝██╔══██╗██║╚██╗██╔╝"
+echo "   ███████╗   ██║   ██████╔╝██║ ╚███╔╝ "
+echo "   ╚════██║   ██║   ██╔══██╗██║ ██╔██╗ "
+echo "   ███████║   ██║   ██║  ██║██║██╔╝ ██╗"
+echo "   ╚══════╝   ╚═╝   ╚═╝  ╚═╝╚═╝╚═╝  ╚═╝"
+echo -e "${NC}"
+echo -e "${MUTED}  AI Penetration Testing Agent${NC}"
+echo ""
+echo -e "${MUTED}To get started:${NC}"
+echo ""
+echo -e "  ${CYAN}1.${NC} Set your LLM provider:"
+echo -e "     ${MUTED}export STRIX_LLM='openai/gpt-5'${NC}"
+echo -e "     ${MUTED}export LLM_API_KEY='your-api-key'${NC}"
+echo ""
+echo -e "  ${CYAN}2.${NC} Run a penetration test:"
+echo -e "     ${MUTED}strix --target https://example.com${NC}"
+echo ""
+echo -e "${MUTED}For more information visit ${NC}https://strix.ai"
+echo -e "${MUTED}Join our community ${NC}https://discord.gg/YjKFvEZSdZ"
+echo ""
+
+if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
+    echo -e "${YELLOW}→${NC} Run ${MUTED}source ~/.$(basename "$SHELL")rc${NC} or open a new terminal"
+    echo ""
+fi
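
The installer derives its download URL from three pieces: the release tag with its leading `v` stripped, the `<os>-<arch>` target, and an archive extension that depends on the OS. A minimal Python sketch of that naming scheme follows; the helper names `latest_version` and `asset_url` are hypothetical, and the JSON handling is an assumed stand-in for the script's `sed` expression.

```python
import json

REPO = "usestrix/strix"  # same repository slug the installer uses


def latest_version(release_json: str) -> str:
    """Extract the version from a 'latest release' payload, dropping the leading 'v'."""
    tag = json.loads(release_json).get("tag_name", "")
    # Like the sed pattern, only tags of the form v<version> are accepted.
    if not tag.startswith("v") or len(tag) == 1:
        raise ValueError("release payload has no usable tag_name")
    return tag[1:]


def asset_url(version: str, target: str) -> str:
    """Build the archive URL following the installer's naming scheme."""
    ext = ".zip" if target.startswith("windows-") else ".tar.gz"
    filename = f"strix-{version}-{target}{ext}"
    return f"https://github.com/{REPO}/releases/download/v{version}/{filename}"
```

The file names produced here line up with the archives that the build workflow uploads, which is what lets the installer locate a release asset without listing them first.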
--- a/strix.spec
+++ b/strix.spec
@@ -0,0 +1,221 @@
+# -*- mode: python ; coding: utf-8 -*-
+
+import sys
+from pathlib import Path
+from PyInstaller.utils.hooks import collect_data_files, collect_submodules
+
+project_root = Path(SPECPATH)
+strix_root = project_root / 'strix'
+
+datas = []
+
+for jinja_file in strix_root.rglob('*.jinja'):
+    rel_path = jinja_file.relative_to(project_root)
+    datas.append((str(jinja_file), str(rel_path.parent)))
+
+for xml_file in strix_root.rglob('*.xml'):
+    rel_path = xml_file.relative_to(project_root)
+    datas.append((str(xml_file), str(rel_path.parent)))
+
+for tcss_file in strix_root.rglob('*.tcss'):
+    rel_path = tcss_file.relative_to(project_root)
+    datas.append((str(tcss_file), str(rel_path.parent)))
+
+datas += collect_data_files('textual')
+
+datas += collect_data_files('tiktoken')
+datas += collect_data_files('tiktoken_ext')
+
+datas += collect_data_files('litellm')
+
+hiddenimports = [
+    # Core dependencies
+    'litellm',
+    'litellm.llms',
+    'litellm.llms.openai',
+    'litellm.llms.anthropic',
+    'litellm.llms.vertex_ai',
+    'litellm.llms.bedrock',
+    'litellm.utils',
+    'litellm.caching',
+
+    # Textual TUI
+    'textual',
+    'textual.app',
+    'textual.widgets',
+    'textual.containers',
+    'textual.screen',
+    'textual.binding',
+    'textual.reactive',
+    'textual.css',
+    'textual._text_area_theme',
+
+    # Rich console
+    'rich',
+    'rich.console',
+    'rich.panel',
+    'rich.text',
+    'rich.markup',
+    'rich.style',
+    'rich.align',
+    'rich.live',
+
+    # Pydantic
+    'pydantic',
+    'pydantic.fields',
+    'pydantic_core',
+    'email_validator',
+
+    # Docker
+    'docker',
+    'docker.api',
+    'docker.models',
+    'docker.errors',
+
+    # HTTP/Networking
+    'httpx',
+    'httpcore',
+    'requests',
+    'urllib3',
+    'certifi',
+
+    # Jinja2 templating
+    'jinja2',
+    'jinja2.ext',
+    'markupsafe',
+
+    # XML parsing
+    'xmltodict',
+
+    # Tiktoken (for token counting)
+    'tiktoken',
+    'tiktoken_ext',
+    'tiktoken_ext.openai_public',
+
+    # Tenacity retry
+    'tenacity',
+
+    # Strix modules
+    'strix',
+    'strix.interface',
+    'strix.interface.main',
+    'strix.interface.cli',
+    'strix.interface.tui',
+    'strix.interface.utils',
+    'strix.interface.tool_components',
+    'strix.agents',
+    'strix.agents.base_agent',
+    'strix.agents.state',
+    'strix.agents.StrixAgent',
+    'strix.llm',
+    'strix.llm.llm',
+    'strix.llm.config',
+    'strix.llm.utils',
+    'strix.llm.request_queue',
+    'strix.llm.memory_compressor',
+    'strix.runtime',
+    'strix.runtime.runtime',
+    'strix.runtime.docker_runtime',
+    'strix.telemetry',
+    'strix.telemetry.tracer',
+    'strix.tools',
+    'strix.tools.registry',
+    'strix.tools.executor',
+    'strix.tools.argument_parser',
+    'strix.skills',
+]
+
+hiddenimports += collect_submodules('litellm')
+hiddenimports += collect_submodules('textual')
+hiddenimports += collect_submodules('rich')
+hiddenimports += collect_submodules('pydantic')
+
+excludes = [
+    # Sandbox-only packages
+    'playwright',
+    'playwright.sync_api',
+    'playwright.async_api',
+    'IPython',
+    'ipython',
+    'libtmux',
+    'pyte',
+    'openhands_aci',
+    'openhands-aci',
+    'gql',
+    'fastapi',
+    'uvicorn',
+    'numpydoc',
+
+    # Google Cloud / Vertex AI
+    'google.cloud',
+    'google.cloud.aiplatform',
+    'google.api_core',
+    'google.auth',
+    'google.oauth2',
+    'google.protobuf',
+    'grpc',
+    'grpcio',
+    'grpcio_status',
+
+    # Test frameworks
+    'pytest',
+    'pytest_asyncio',
+    'pytest_cov',
+    'pytest_mock',
+
+    # Development tools
+    'mypy',
+    'ruff',
+    'black',
+    'isort',
+    'pylint',
+    'pyright',
+    'bandit',
+    'pre_commit',
+
+    # Unnecessary for runtime
+    'tkinter',
+    'matplotlib',
+    'numpy',
+    'pandas',
+    'scipy',
+    'PIL',
+    'cv2',
+]
+
+a = Analysis(
+    ['strix/interface/main.py'],
+    pathex=[str(project_root)],
+    binaries=[],
+    datas=datas,
+    hiddenimports=hiddenimports,
+    hookspath=[],
+    hooksconfig={},
+    runtime_hooks=[],
+    excludes=excludes,
+    noarchive=False,
+    optimize=0,
+)
+
+pyz = PYZ(a.pure)
+
+exe = EXE(
+    pyz,
+    a.scripts,
+    a.binaries,
+    a.datas,
+    [],
+    name='strix',
+    debug=False,
+    bootloader_ignore_signals=False,
+    strip=False,
+    upx=False,
+    upx_exclude=[],
+    runtime_tmpdir=None,
+    console=True,
+    disable_windowed_traceback=False,
+    argv_emulation=False,
+    target_arch=None,
+    codesign_identity=None,
+    entitlements_file=None,
+)
--- a/strix/agents/StrixAgent/strix_agent.py
+++ b/strix/agents/StrixAgent/strix_agent.py
@@ -8,13 +8,13 @@ class StrixAgent(BaseAgent):
    max_iterations = 300

    def __init__(self, config: dict[str, Any]):
-        default_modules = []
+        default_skills = []

        state = config.get("state")
        if state is None or (hasattr(state, "parent_id") and state.parent_id is None):
-            default_modules = ["root_agent"]
+            default_skills = ["root_agent"]

-        self.default_llm_config = LLMConfig(prompt_modules=default_modules)
+        self.default_llm_config = LLMConfig(skills=default_skills)

        super().__init__(config)

--- a/strix/agents/StrixAgent/system_prompt.jinja
+++ b/strix/agents/StrixAgent/system_prompt.jinja
@@ -10,8 +10,8 @@ You follow all instructions and rules provided to you exactly as written in the

 <communication_rules>
 CLI OUTPUT:
-- Never use markdown formatting - you are a CLI agent
-- Output plain text only (no **bold**, `code`, [links], # headers)
+- You may use simple markdown: **bold**, *italic*, `code`, ~~strikethrough~~, [links](url), and # headers
+- Do NOT use complex markdown like bullet lists, numbered lists, or tables
 - Use line breaks and indentation for structure
 - NEVER use "Strix" or any identifiable names/markers in HTTP requests, payloads, user-agents, or any inputs

@@ -134,6 +134,7 @@ VALIDATION REQUIREMENTS:
 - Keep going until you find something that matters
 - A vulnerability is ONLY considered reported when a reporting agent uses create_vulnerability_report with full details. Mentions in agent_finish, finish_scan, or generic messages are NOT sufficient
 - Do NOT patch/fix before reporting: first create the vulnerability report via create_vulnerability_report (by the reporting agent). Only after reporting is completed should fixing/patching proceed
+- DEDUPLICATION: The create_vulnerability_report tool uses LLM-based deduplication. If it rejects your report as a duplicate, DO NOT attempt to re-submit the same vulnerability. Accept the rejection and move on to testing other areas. The vulnerability has already been reported by another agent
 </execution_guidelines>

 <vulnerability_focus>
@@ -263,25 +264,25 @@ CRITICAL RULES:
 - **ONE AGENT = ONE TASK** - Don't let agents do multiple unrelated jobs
 - **SPAWN REACTIVELY** - Create new agents based on what you discover
 - **ONLY REPORTING AGENTS** can use create_vulnerability_report tool
-- **AGENT SPECIALIZATION MANDATORY** - Each agent must be highly specialized; prefer 1–3 prompt modules, up to 5 for complex contexts
+- **AGENT SPECIALIZATION MANDATORY** - Each agent must be highly specialized; prefer 1–3 skills, up to 5 for complex contexts
 - **NO GENERIC AGENTS** - Avoid creating broad, multi-purpose agents that dilute focus

 AGENT SPECIALIZATION EXAMPLES:

 GOOD SPECIALIZATION:
-- "SQLi Validation Agent" with prompt_modules: sql_injection
-- "XSS Discovery Agent" with prompt_modules: xss
-- "Auth Testing Agent" with prompt_modules: authentication_jwt, business_logic
-- "SSRF + XXE Agent" with prompt_modules: ssrf, xxe, rce (related attack vectors)
+- "SQLi Validation Agent" with skills: sql_injection
+- "XSS Discovery Agent" with skills: xss
+- "Auth Testing Agent" with skills: authentication_jwt, business_logic
+- "SSRF + XXE Agent" with skills: ssrf, xxe, rce (related attack vectors)

 BAD SPECIALIZATION:
-- "General Web Testing Agent" with prompt_modules: sql_injection, xss, csrf, ssrf, authentication_jwt (too broad)
-- "Everything Agent" with prompt_modules: all available modules (completely unfocused)
-- Any agent with more than 5 prompt modules (violates constraints)
+- "General Web Testing Agent" with skills: sql_injection, xss, csrf, ssrf, authentication_jwt (too broad)
+- "Everything Agent" with skills: all available skills (completely unfocused)
+- Any agent with more than 5 skills (violates constraints)

 FOCUS PRINCIPLES:
 - Each agent should have deep expertise in 1-3 related vulnerability types
-- Agents with single modules have the deepest specialization
+- Agents with single skills have the deepest specialization
 - Related vulnerabilities (like SSRF+XXE or Auth+Business Logic) can be combined
 - Never create "kitchen sink" agents that try to do everything

@@ -323,7 +324,7 @@ Example (agent creation tool):
 <function=create_agent>
 <parameter=task>Perform targeted XSS testing on the search endpoint</parameter>
 <parameter=name>XSS Discovery Agent</parameter>
-<parameter=prompt_modules>xss</parameter>
+<parameter=skills>xss</parameter>
 </function>

 SPRAYING EXECUTION NOTE:
@@ -392,12 +393,12 @@ Directories:
 Default user: pentester (sudo available)
 </environment>

-{% if loaded_module_names %}
+{% if loaded_skill_names %}
 <specialized_knowledge>
-{# Dynamic prompt modules loaded based on agent specialization #}
+{# Dynamic skills loaded based on agent specialization #}

-{% for module_name in loaded_module_names %}
-{{ get_module(module_name) }}
+{% for skill_name in loaded_skill_names %}
+{{ get_skill(skill_name) }}

 {% endfor %}
 </specialized_knowledge>
--- a/strix/agents/base_agent.py
+++ b/strix/agents/base_agent.py
@@ -16,6 +16,7 @@ from jinja2 import (

 from strix.llm import LLM, LLMConfig, LLMRequestFailedError
 from strix.llm.utils import clean_content
+from strix.runtime import SandboxInitializationError
 from strix.tools import process_tool_invocations

 from .state import AgentState
@@ -145,18 +146,16 @@ class BaseAgent(metaclass=AgentMeta):
        if self.state.parent_id is None and agents_graph_actions._root_agent_id is None:
            agents_graph_actions._root_agent_id = self.state.agent_id

-    def cancel_current_execution(self) -> None:
-        if self._current_task and not self._current_task.done():
-            self._current_task.cancel()
-            self._current_task = None
-
    async def agent_loop(self, task: str) -> dict[str, Any]:  # noqa: PLR0912, PLR0915
-        await self._initialize_sandbox_and_state(task)
-
        from strix.telemetry.tracer import get_global_tracer

        tracer = get_global_tracer()

+        try:
+            await self._initialize_sandbox_and_state(task)
+        except SandboxInitializationError as e:
+            return self._handle_sandbox_error(e, tracer)
+
        while True:
            self._check_agent_messages(self.state)

@@ -204,7 +203,11 @@ class BaseAgent(metaclass=AgentMeta):
                self.state.add_message("user", final_warning_msg)

            try:
-                should_finish = await self._process_iteration(tracer)
+                iteration_task = asyncio.create_task(self._process_iteration(tracer))
+                self._current_task = iteration_task
+                should_finish = await iteration_task
+                self._current_task = None
+
                if should_finish:
                    if self.non_interactive:
                        self.state.set_completed({"success": True})
@@ -215,43 +218,22 @@ class BaseAgent(metaclass=AgentMeta):
                    continue

            except asyncio.CancelledError:
+                self._current_task = None
+                if tracer:
+                    partial_content = tracer.finalize_streaming_as_interrupted(self.state.agent_id)
+                    if partial_content and partial_content.strip():
+                        self.state.add_message(
+                            "assistant", f"{partial_content}\n\n[ABORTED BY USER]"
+                        )
                if self.non_interactive:
                    raise
                await self._enter_waiting_state(tracer, error_occurred=False, was_cancelled=True)
                continue

            except LLMRequestFailedError as e:
-                error_msg = str(e)
-                error_details = getattr(e, "details", None)
-                self.state.add_error(error_msg)
-
-                if self.non_interactive:
-                    self.state.set_completed({"success": False, "error": error_msg})
-                    if tracer:
-                        tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
-                        if error_details:
-                            tracer.log_tool_execution_start(
-                                self.state.agent_id,
-                                "llm_error_details",
-                                {"error": error_msg, "details": error_details},
-                            )
-                            tracer.update_tool_execution(
-                                tracer._next_execution_id - 1, "failed", error_details
-                            )
-                    return {"success": False, "error": error_msg}
-
-                self.state.enter_waiting_state(llm_failed=True)
-                if tracer:
-                    tracer.update_agent_status(self.state.agent_id, "llm_failed", error_msg)
-                    if error_details:
-                        tracer.log_tool_execution_start(
-                            self.state.agent_id,
-                            "llm_error_details",
-                            {"error": error_msg, "details": error_details},
-                        )
-                        tracer.update_tool_execution(
-                            tracer._next_execution_id - 1, "failed", error_details
-                        )
+                result = self._handle_llm_error(e, tracer)
+                if result is not None:
+                    return result
                continue

            except (RuntimeError, ValueError, TypeError) as e:
@@ -269,7 +251,7 @@ class BaseAgent(metaclass=AgentMeta):

        if self.state.has_waiting_timeout():
            self.state.resume_from_waiting()
-            self.state.add_message("assistant", "Waiting timeout reached. Resuming execution.")
+            self.state.add_message("user", "Waiting timeout reached. Resuming execution.")

            from strix.telemetry.tracer import get_global_tracer

@@ -334,16 +316,22 @@ class BaseAgent(metaclass=AgentMeta):
        if not sandbox_mode and self.state.sandbox_id is None:
            from strix.runtime import get_runtime

-            runtime = get_runtime()
-            sandbox_info = await runtime.create_sandbox(
-                self.state.agent_id, self.state.sandbox_token, self.local_sources
-            )
-            self.state.sandbox_id = sandbox_info["workspace_id"]
-            self.state.sandbox_token = sandbox_info["auth_token"]
-            self.state.sandbox_info = sandbox_info
+            try:
+                runtime = get_runtime()
+                sandbox_info = await runtime.create_sandbox(
+                    self.state.agent_id, self.state.sandbox_token, self.local_sources
+                )
+                self.state.sandbox_id = sandbox_info["workspace_id"]
+                self.state.sandbox_token = sandbox_info["auth_token"]
+                self.state.sandbox_info = sandbox_info

-            if "agent_id" in sandbox_info:
-                self.state.sandbox_info["agent_id"] = sandbox_info["agent_id"]
+                if "agent_id" in sandbox_info:
+                    self.state.sandbox_info["agent_id"] = sandbox_info["agent_id"]
+            except Exception as e:
+                from strix.telemetry import posthog
+
+                posthog.error("sandbox_init_error", str(e))
+                raise

        if not self.state.task:
            self.state.task = task
@@ -351,9 +339,16 @@ class BaseAgent(metaclass=AgentMeta):
        self.state.add_message("user", task)

    async def _process_iteration(self, tracer: Optional["Tracer"]) -> bool:
-        response = await self.llm.generate(self.state.get_conversation_history())
+        final_response = None
+        async for response in self.llm.generate(self.state.get_conversation_history()):
+            final_response = response
+            if tracer and response.content:
+                tracer.update_streaming_content(self.state.agent_id, response.content)

-        content_stripped = (response.content or "").strip()
+        if final_response is None:
+            return False
+
+        content_stripped = (final_response.content or "").strip()

        if not content_stripped:
            corrective_message = (
@@ -369,17 +364,19 @@ class BaseAgent(metaclass=AgentMeta):
            self.state.add_message("user", corrective_message)
            return False

-        self.state.add_message("assistant", response.content)
+        thinking_blocks = getattr(final_response, "thinking_blocks", None)
+        self.state.add_message("assistant", final_response.content, thinking_blocks=thinking_blocks)
        if tracer:
+            tracer.clear_streaming_content(self.state.agent_id)
            tracer.log_chat_message(
-                content=clean_content(response.content),
+                content=clean_content(final_response.content),
                role="assistant",
                agent_id=self.state.agent_id,
            )

        actions = (
-            response.tool_invocations
-            if hasattr(response, "tool_invocations") and response.tool_invocations
+            final_response.tool_invocations
+            if hasattr(final_response, "tool_invocations") and final_response.tool_invocations
            else []
        )

@@ -420,18 +417,6 @@ class BaseAgent(metaclass=AgentMeta):

        return False

-    async def _handle_iteration_error(
-        self,
-        error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
-        tracer: Optional["Tracer"],
-    ) -> bool:
-        error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
-        logger.exception(error_msg)
-        self.state.add_error(error_msg)
-        if tracer:
-            tracer.update_agent_status(self.state.agent_id, "error")
-        return True
-
    def _check_agent_messages(self, state: AgentState) -> None:  # noqa: PLR0912
        try:
            from strix.tools.agents_graph.agents_graph_actions import _agent_graph, _agent_messages
@@ -516,3 +501,90 @@ class BaseAgent(metaclass=AgentMeta):
            logger = logging.getLogger(__name__)
            logger.warning(f"Error checking agent messages: {e}")
            return
+
+    def _handle_sandbox_error(
+        self,
+        error: SandboxInitializationError,
+        tracer: Optional["Tracer"],
+    ) -> dict[str, Any]:
+        error_msg = str(error.message)
+        error_details = error.details
+        self.state.add_error(error_msg)
+
+        if self.non_interactive:
+            self.state.set_completed({"success": False, "error": error_msg})
+            if tracer:
+                tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
+                if error_details:
+                    exec_id = tracer.log_tool_execution_start(
+                        self.state.agent_id,
+                        "sandbox_error_details",
+                        {"error": error_msg, "details": error_details},
+                    )
+                    tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
+            return {"success": False, "error": error_msg, "details": error_details}
+
+        self.state.enter_waiting_state()
+        if tracer:
+            tracer.update_agent_status(self.state.agent_id, "sandbox_failed", error_msg)
+            if error_details:
+                exec_id = tracer.log_tool_execution_start(
+                    self.state.agent_id,
+                    "sandbox_error_details",
+                    {"error": error_msg, "details": error_details},
+                )
+                tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
+
+        return {"success": False, "error": error_msg, "details": error_details}
+
+    def _handle_llm_error(
+        self,
+        error: LLMRequestFailedError,
+        tracer: Optional["Tracer"],
+    ) -> dict[str, Any] | None:
+        error_msg = str(error)
+        error_details = getattr(error, "details", None)
+        self.state.add_error(error_msg)
+
+        if self.non_interactive:
+            self.state.set_completed({"success": False, "error": error_msg})
+            if tracer:
+                tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
+                if error_details:
+                    exec_id = tracer.log_tool_execution_start(
+                        self.state.agent_id,
+                        "llm_error_details",
+                        {"error": error_msg, "details": error_details},
+                    )
+                    tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
+            return {"success": False, "error": error_msg}
+
+        self.state.enter_waiting_state(llm_failed=True)
+        if tracer:
+            tracer.update_agent_status(self.state.agent_id, "llm_failed", error_msg)
+            if error_details:
+                exec_id = tracer.log_tool_execution_start(
+                    self.state.agent_id,
+                    "llm_error_details",
+                    {"error": error_msg, "details": error_details},
+                )
+                tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
+
+        return None
+
+    async def _handle_iteration_error(
+        self,
+        error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
+        tracer: Optional["Tracer"],
+    ) -> bool:
+        error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
+        logger.exception(error_msg)
+        self.state.add_error(error_msg)
+        if tracer:
+            tracer.update_agent_status(self.state.agent_id, "error")
+        return True
+
+    def cancel_current_execution(self) -> None:
+        if self._current_task and not self._current_task.done():
+            self._current_task.cancel()
+        self._current_task = None
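Review note on the base_agent.py hunks above: the loop now wraps each iteration in its own asyncio task and stores it in `_current_task`, so `cancel_current_execution()` can interrupt one iteration without cancelling the whole agent loop. The sketch below is a minimal, self-contained illustration of that pattern only. `Worker`, `_iteration`, `loop_once` and `main` are hypothetical names, and the sleep stands in for the LLM call; none of this is the Strix implementation.

```python
from __future__ import annotations

import asyncio


class Worker:
    """Hypothetical minimal agent; only the task-tracking pattern mirrors the diff."""

    def __init__(self) -> None:
        self._current_task: asyncio.Task | None = None
        self.log: list[str] = []

    async def _iteration(self) -> bool:
        # Stand-in for a long-running LLM request (assumption for the demo).
        await asyncio.sleep(10)
        return True

    async def loop_once(self) -> None:
        # Run the iteration as a separate task so it can be cancelled on its own.
        task = asyncio.create_task(self._iteration())
        self._current_task = task
        try:
            await task
            self.log.append("finished")
        except asyncio.CancelledError:
            # Only the inner task was cancelled; the loop itself keeps running.
            self.log.append("cancelled")
        finally:
            self._current_task = None

    def cancel_current_execution(self) -> None:
        if self._current_task and not self._current_task.done():
            self._current_task.cancel()
        self._current_task = None


async def main() -> list[str]:
    worker = Worker()
    runner = asyncio.create_task(worker.loop_once())
    await asyncio.sleep(0.01)  # give the iteration time to start
    worker.cancel_current_execution()
    await runner  # completes normally because the cancellation was handled
    return worker.log
```

Calling `asyncio.run(main())` exercises the interrupt path: the inner task is cancelled, the outer coroutine catches `CancelledError` and returns normally.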
--- a/strix/agents/state.py
+++ b/strix/agents/state.py
@@ -43,8 +43,11 @@ class AgentState(BaseModel):
        self.iteration += 1
        self.last_updated = datetime.now(UTC).isoformat()

-    def add_message(self, role: str, content: Any) -> None:
-        self.messages.append({"role": role, "content": content})
+    def add_message(self, role: str, content: Any, thinking_blocks: list[dict[str, Any]] | None = None) -> None:
+        message = {"role": role, "content": content}
+        if thinking_blocks:
+            message["thinking_blocks"] = thinking_blocks
+        self.messages.append(message)
        self.last_updated = datetime.now(UTC).isoformat()

    def add_action(self, action: dict[str, Any]) -> None:
--- a/strix/config/__init__.py
+++ b/strix/config/__init__.py
@@ -0,0 +1,12 @@
+from strix.config.config import (
+    Config,
+    apply_saved_config,
+    save_current_config,
+)
+
+
+__all__ = [
+    "Config",
+    "apply_saved_config",
+    "save_current_config",
+]
--- a/strix/config/config.py
+++ b/strix/config/config.py
@@ -0,0 +1,131 @@
+import contextlib
+import json
+import os
+from pathlib import Path
+from typing import Any
+
+
+class Config:
+    """Configuration Manager for Strix."""
+
+    # LLM Configuration
+    strix_llm = None
+    llm_api_key = None
+    llm_api_base = None
+    openai_api_base = None
+    litellm_base_url = None
+    ollama_api_base = None
+    strix_reasoning_effort = "high"
+    llm_timeout = "300"
+    llm_rate_limit_delay = "4.0"
+    llm_rate_limit_concurrent = "1"
+
+    # Tool & Feature Configuration
+    perplexity_api_key = None
+    strix_disable_browser = "false"
+
+    # Runtime Configuration
+    strix_image = "ghcr.io/usestrix/strix-sandbox:0.1.10"
+    strix_runtime_backend = "docker"
+    strix_sandbox_execution_timeout = "500"
+    strix_sandbox_connect_timeout = "10"
+
+    # Telemetry
+    strix_telemetry = "1"
+
+    @classmethod
+    def _tracked_names(cls) -> list[str]:
+        return [
+            k
+            for k, v in vars(cls).items()
+            if not k.startswith("_") and k[0].islower() and (v is None or isinstance(v, str))
+        ]
+
+    @classmethod
+    def tracked_vars(cls) -> list[str]:
+        return [name.upper() for name in cls._tracked_names()]
+
+    @classmethod
+    def get(cls, name: str) -> str | None:
+        env_name = name.upper()
+        default = getattr(cls, name, None)
+        return os.getenv(env_name, default)
+
+    @classmethod
+    def config_dir(cls) -> Path:
+        return Path.home() / ".strix"
+
+    @classmethod
+    def config_file(cls) -> Path:
+        return cls.config_dir() / "cli-config.json"
+
+    @classmethod
+    def load(cls) -> dict[str, Any]:
+        path = cls.config_file()
+        if not path.exists():
+            return {}
+        try:
+            with path.open("r", encoding="utf-8") as f:
+                data: dict[str, Any] = json.load(f)
+                return data
+        except (json.JSONDecodeError, OSError):
+            return {}
+
+    @classmethod
+    def save(cls, config: dict[str, Any]) -> bool:
+        try:
+            cls.config_dir().mkdir(parents=True, exist_ok=True)
+            config_path = cls.config_file()
+            with config_path.open("w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+        except OSError:
+            return False
+        with contextlib.suppress(OSError):
+            config_path.chmod(0o600)  # may fail on Windows
+        return True
+
+    @classmethod
+    def apply_saved(cls) -> dict[str, str]:
+        saved = cls.load()
+        env_vars = saved.get("env", {})
+        applied = {}
+
+        for var_name, var_value in env_vars.items():
+            if var_name in cls.tracked_vars() and not os.getenv(var_name):
+                os.environ[var_name] = var_value
+                applied[var_name] = var_value
+
+        return applied
+
+    @classmethod
+    def capture_current(cls) -> dict[str, Any]:
+        env_vars = {}
+        for var_name in cls.tracked_vars():
+            value = os.getenv(var_name)
+            if value:
+                env_vars[var_name] = value
+        return {"env": env_vars}
+
+    @classmethod
+    def save_current(cls) -> bool:
+        existing = cls.load().get("env", {})
+        merged = dict(existing)
+
+        for var_name in cls.tracked_vars():
+            value = os.getenv(var_name)
+            if value is None:
+                pass
+            elif value == "":
+                merged.pop(var_name, None)
+            else:
+                merged[var_name] = value
+
+        return cls.save({"env": merged})
+
+
+def apply_saved_config() -> dict[str, str]:
+    return Config.apply_saved()
+
+
+def save_current_config() -> bool:
+    return Config.save_current()
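Review note on config.py above: `Config.get` upper-cases the attribute name, looks it up in the environment, and falls back to the class-level default. The sketch below reproduces just that lookup order in isolation. `MiniConfig` and `demo` are hypothetical names for illustration, and the two attributes are copied from the diff as sample values.

```python
from __future__ import annotations

import os


class MiniConfig:
    """Hypothetical stand-in for Config; keeps only the lookup logic."""

    llm_timeout = "300"
    strix_llm = None

    @classmethod
    def get(cls, name: str) -> str | None:
        # The environment variable (upper-cased attribute name) wins over the default.
        return os.getenv(name.upper(), getattr(cls, name, None))


def demo() -> tuple[str | None, str | None, str | None]:
    # Start from a clean environment for the two names used here.
    os.environ.pop("LLM_TIMEOUT", None)
    os.environ.pop("STRIX_LLM", None)

    default_timeout = MiniConfig.get("llm_timeout")  # class default
    os.environ["LLM_TIMEOUT"] = "60"
    overridden = MiniConfig.get("llm_timeout")  # environment override
    unset = MiniConfig.get("strix_llm")  # no default, no env var

    os.environ.pop("LLM_TIMEOUT", None)
    return default_timeout, overridden, unset
```

Note that `apply_saved` in the diff only writes a saved value into `os.environ` when the variable is not already set, so an explicitly exported variable still takes precedence over the saved file.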
--- a/strix/interface/assets/tui_styles.tcss
+++ b/strix/interface/assets/tui_styles.tcss
@@ -1,13 +1,14 @@
 Screen {
-    background: #1a1a1a;
+    background: #000000;
    color: #d4d4d4;
 }

 #splash_screen {
    height: 100%;
    width: 100%;
-    background: #1a1a1a;
+    background: #000000;
    color: #22c55e;
+    align: center middle;
    content-align: center middle;
    text-align: center;
 }
@@ -17,6 +18,7 @@ Screen {
    height: auto;
    background: transparent;
    text-align: center;
+    content-align: center middle;
    padding: 2;
 }

@@ -24,7 +26,7 @@ Screen {
    height: 100%;
    padding: 0;
    margin: 0;
-    background: #1a1a1a;
+    background: #000000;
 }

 #content_container {
@@ -39,10 +41,14 @@ Screen {
    margin-left: 1;
 }

+#sidebar.-hidden {
+    display: none;
+}
+
 #agents_tree {
    height: 1fr;
    background: transparent;
-    border: round #262626;
+    border: round #333333;
    border-title-color: #a8a29e;
    border-title-style: bold;
    padding: 1;
@@ -57,21 +63,135 @@ Screen {
    margin: 0;
 }

+#vulnerabilities_panel {
+    height: auto;
+    max-height: 12;
+    background: transparent;
+    padding: 0;
+    margin: 0;
+    border: round #333333;
+    overflow-y: auto;
+    scrollbar-background: #000000;
+    scrollbar-color: #333333;
+    scrollbar-corner-color: #000000;
+    scrollbar-size-vertical: 1;
+}
+
+#vulnerabilities_panel.hidden {
+    display: none;
+}
+
+.vuln-item {
+    height: auto;
+    width: 100%;
+    padding: 0 1;
+    background: transparent;
+    color: #d4d4d4;
+}
+
+.vuln-item:hover {
+    background: #1a1a1a;
+    color: #fafaf9;
+}
+
+VulnerabilityDetailScreen {
+    align: center middle;
+    background: #000000 80%;
+}
+
+#vuln_detail_dialog {
+    grid-size: 1;
+    grid-gutter: 1;
+    grid-rows: 1fr auto;
+    padding: 2 3;
+    width: 85%;
+    max-width: 110;
+    height: 85%;
+    max-height: 45;
+    border: solid #262626;
+    background: #0a0a0a;
+}
+
+#vuln_detail_scroll {
+    height: 1fr;
+    background: transparent;
+    scrollbar-background: #0a0a0a;
+    scrollbar-color: #404040;
+    scrollbar-corner-color: #0a0a0a;
+    scrollbar-size: 1 1;
+    padding-right: 1;
+}
+
+#vuln_detail_content {
+    width: 100%;
+    background: transparent;
+    padding: 0;
+}
+
+#vuln_detail_buttons {
+    width: 100%;
+    height: auto;
+    align: right middle;
+    padding-top: 1;
+    margin: 0;
+    border-top: solid #1a1a1a;
+}
+
+#copy_vuln_detail {
+    width: auto;
+    min-width: 12;
+    height: auto;
+    background: transparent;
+    color: #525252;
+    border: none;
+    text-style: none;
+    margin: 0 1;
+    padding: 0 2;
+}
+
+#close_vuln_detail {
+    width: auto;
+    min-width: 10;
+    height: auto;
+    background: transparent;
+    color: #a3a3a3;
+    border: none;
+    text-style: none;
+    margin: 0;
+    padding: 0 2;
+}
+
+#copy_vuln_detail:hover, #copy_vuln_detail:focus {
+    background: transparent;
+    color: #22c55e;
+    border: none;
+}
+
+#close_vuln_detail:hover, #close_vuln_detail:focus {
+    background: transparent;
+    color: #ffffff;
+    border: none;
+}
+
 #chat_area_container {
    width: 75%;
    background: transparent;
 }

+#chat_area_container.-full-width {
+    width: 100%;
+}
+
 #chat_history {
    height: 1fr;
    background: transparent;
-    border: round #1a1a1a;
+    border: round #0a0a0a;
    padding: 0;
    margin-bottom: 0;
    margin-right: 0;
-    scrollbar-background: #0f0f0f;
-    scrollbar-color: #262626;
-    scrollbar-corner-color: #0f0f0f;
+    scrollbar-background: #000000;
+    scrollbar-color: #1a1a1a;
+    scrollbar-corner-color: #000000;
    scrollbar-size: 1 1;
 }

@@ -93,7 +213,7 @@ Screen {
    color: #a3a3a3;
    text-align: left;
    content-align: left middle;
-    text-style: italic;
+    text-style: none;
    margin: 0;
    padding: 0;
 }
@@ -113,11 +233,11 @@ Screen {
 #chat_input_container {
    height: 3;
    background: transparent;
-    border: round #525252;
+    border: round #333333;
    margin-right: 0;
    padding: 0;
    layout: horizontal;
-    align-vertical: middle;
+    align-vertical: top;
 }

 #chat_input_container:focus-within {
@@ -134,7 +254,7 @@ Screen {
    height: 100%;
    padding: 0 0 0 1;
    color: #737373;
-    content-align-vertical: middle;
+    content-align-vertical: top;
 }

 #chat_history:focus {
@@ -144,7 +264,7 @@ Screen {
 #chat_input {
    width: 1fr;
    height: 100%;
-    background: #121212;
+    background: transparent;
    border: none;
    color: #d4d4d4;
    padding: 0;
@@ -155,6 +275,14 @@ Screen {
    border: none;
 }

+#chat_input .text-area--cursor-line {
+    background: transparent;
+}
+
+#chat_input:focus .text-area--cursor-line {
+    background: transparent;
+}
+
 #chat_input > .text-area--placeholder {
    color: #525252;
    text-style: italic;
@@ -198,39 +326,31 @@ Screen {
 }

 .tool-call {
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
+    margin-top: 1;
+    margin-bottom: 0;
    padding: 0 1;
-    background: #0a0a0a;
-    border: round #1a1a1a;
-    border-left: thick #f59e0b;
+    background: transparent;
+    border: none;
    width: 100%;
 }

 .tool-call.status-completed {
-    border-left: thick #22c55e;
-    background: #0d1f12;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
+    background: transparent;
+    margin-top: 1;
+    margin-bottom: 0;
 }

 .tool-call.status-running {
-    border-left: thick #f59e0b;
-    background: #1f1611;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
+    background: transparent;
+    margin-top: 1;
+    margin-bottom: 0;
 }

 .tool-call.status-failed,
 .tool-call.status-error {
-    border-left: thick #ef4444;
-    background: #1f0d0d;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
+    background: transparent;
+    margin-top: 1;
+    margin-bottom: 0;
 }

 .browser-tool,
@@ -242,209 +362,54 @@ Screen {
 .notes-tool,
 .thinking-tool,
 .web-search-tool,
-.finish-tool,
-.reporting-tool,
 .scan-info-tool,
 .subagent-info-tool {
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
-}
-
-.browser-tool {
-    border-left: thick #06b6d4;
-}
-
-.browser-tool.status-completed {
-    border-left: thick #06b6d4;
-    background: transparent;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
-}
-
-.browser-tool.status-running {
-    border-left: thick #0891b2;
-    background: transparent;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
-}
-
-.terminal-tool {
-    border-left: thick #22c55e;
-}
-
-.terminal-tool.status-completed {
-    border-left: thick #22c55e;
-    background: transparent;
-}
-
-.terminal-tool.status-running {
-    border-left: thick #16a34a;
-    background: transparent;
-}
-
-.python-tool {
-    border-left: thick #3b82f6;
-}
-
-.python-tool.status-completed {
-    border-left: thick #3b82f6;
-    background: transparent;
-}
-
-.python-tool.status-running {
-    border-left: thick #2563eb;
-    background: transparent;
-}
-
-.agents-graph-tool {
-    border-left: thick #fbbf24;
-}
-
-.agents-graph-tool.status-completed {
-    border-left: thick #fbbf24;
-    background: transparent;
-}
-
-.agents-graph-tool.status-running {
-    border-left: thick #f59e0b;
-    background: transparent;
-}
-
-.file-edit-tool {
-    border-left: thick #10b981;
-}
-
-.file-edit-tool.status-completed {
-    border-left: thick #10b981;
-    background: transparent;
-}
-
-.file-edit-tool.status-running {
-    border-left: thick #059669;
-    background: transparent;
-}
-
-.proxy-tool {
-    border-left: thick #06b6d4;
-}
-
-.proxy-tool.status-completed {
-    border-left: thick #06b6d4;
-    background: transparent;
-}
-
-.proxy-tool.status-running {
-    border-left: thick #0891b2;
-    background: transparent;
-}
-
-.notes-tool {
-    border-left: thick #fbbf24;
-}
-
-.notes-tool.status-completed {
-    border-left: thick #fbbf24;
-    background: transparent;
-}
-
-.notes-tool.status-running {
-    border-left: thick #f59e0b;
-    background: transparent;
-}
-
-.thinking-tool {
-    border-left: thick #a855f7;
-}
-
-.thinking-tool.status-completed {
-    border-left: thick #a855f7;
-    background: transparent;
-}
-
-.thinking-tool.status-running {
-    border-left: thick #9333ea;
-    background: transparent;
-}
-
-.web-search-tool {
-    border-left: thick #22c55e;
-}
-
-.web-search-tool.status-completed {
-    border-left: thick #22c55e;
-    background: transparent;
-}
-
-.web-search-tool.status-running {
-    border-left: thick #16a34a;
-    background: transparent;
-}
-
-.finish-tool {
-    border-left: thick #dc2626;
-}
-
-.finish-tool.status-completed {
-    border-left: thick #dc2626;
-    background: transparent;
-}
-
-.finish-tool.status-running {
-    border-left: thick #b91c1c;
+    margin-top: 1;
+    margin-bottom: 0;
    background: transparent;
 }

+.finish-tool,
 .reporting-tool {
-    border-left: thick #ea580c;
-}
-
-.reporting-tool.status-completed {
-    border-left: thick #ea580c;
-    background: transparent;
-}
-
-.reporting-tool.status-running {
-    border-left: thick #c2410c;
-    background: transparent;
-}
-
-.scan-info-tool {
-    border-left: thick #22c55e;
-    background: transparent;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
-}
-
-.scan-info-tool.status-completed {
-    border-left: thick #22c55e;
-    background: transparent;
-}
-
-.scan-info-tool.status-running {
-    border-left: thick #16a34a;
-    background: transparent;
-}
-
-.subagent-info-tool {
-    border-left: thick #22c55e;
-    background: transparent;
-    margin: 0 !important;
-    margin-top: 0 !important;
-    margin-bottom: 0 !important;
-}
-
-.subagent-info-tool.status-completed {
-    border-left: thick #22c55e;
+    margin-top: 1;
+    margin-bottom: 0;
    background: transparent;
 }

+.browser-tool.status-completed,
+.browser-tool.status-running,
+.terminal-tool.status-completed,
+.terminal-tool.status-running,
+.python-tool.status-completed,
+.python-tool.status-running,
+.agents-graph-tool.status-completed,
+.agents-graph-tool.status-running,
+.file-edit-tool.status-completed,
+.file-edit-tool.status-running,
+.proxy-tool.status-completed,
+.proxy-tool.status-running,
+.notes-tool.status-completed,
+.notes-tool.status-running,
+.thinking-tool.status-completed,
+.thinking-tool.status-running,
+.web-search-tool.status-completed,
+.web-search-tool.status-running,
+.scan-info-tool.status-completed,
+.scan-info-tool.status-running,
+.subagent-info-tool.status-completed,
 .subagent-info-tool.status-running {
-    border-left: thick #16a34a;
    background: transparent;
+    margin-top: 1;
+    margin-bottom: 0;
+}
+
+.finish-tool.status-completed,
+.finish-tool.status-running,
+.reporting-tool.status-completed,
+.reporting-tool.status-running {
+    background: transparent;
+    margin-top: 1;
+    margin-bottom: 0;
 }

 Tree {
@@ -462,7 +427,7 @@ Tree > .tree--label {
    background: transparent;
    padding: 0 1;
    margin-bottom: 1;
-    border-bottom: solid #262626;
+    border-bottom: solid #1a1a1a;
    text-align: center;
 }

@@ -502,7 +467,7 @@ Tree > .tree--label {
 }

 Tree:focus {
-    border: round #262626;
+    border: round #1a1a1a;
 }

 Tree:focus > .tree--label {
@@ -546,7 +511,7 @@ StopAgentScreen {
    width: 30;
    height: auto;
    border: round #a3a3a3;
-    background: #1a1a1a 98%;
+    background: #000000 98%;
 }

 #stop_agent_title {
@@ -608,8 +573,8 @@ QuitScreen {
    padding: 1;
    width: 24;
    height: auto;
-    border: round #525252;
-    background: #1a1a1a 98%;
+    border: round #333333;
+    background: #000000 98%;
 }

 #quit_title {
@@ -672,7 +637,7 @@ HelpScreen {
    width: 40;
    height: auto;
    border: round #22c55e;
-    background: #1a1a1a 98%;
+    background: #000000 98%;
 }

 #help_title {
--- a/strix/interface/cli.py
+++ b/strix/interface/cli.py
@@ -14,7 +14,10 @@ from strix.agents.StrixAgent import StrixAgent
 from strix.llm.config import LLMConfig
 from strix.telemetry.tracer import Tracer, set_global_tracer

-from .utils import build_final_stats_text, build_live_stats_text, get_severity_color
+from .utils import (
+    build_live_stats_text,
+    format_vulnerability_report,
+)


 async def run_cli(args: Any) -> None:  # noqa: PLR0915
@@ -66,6 +69,8 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
    console.print(startup_panel)
    console.print()

+    scan_mode = getattr(args, "scan_mode", "deep")
+
    scan_config = {
        "scan_id": args.run_name,
        "targets": args.targets_info,
@@ -73,7 +78,7 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
        "run_name": args.run_name,
    }

-    llm_config = LLMConfig()
+    llm_config = LLMConfig(scan_mode=scan_mode)
    agent_config = {
        "llm_config": llm_config,
        "max_iterations": 300,
@@ -86,28 +91,14 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
    tracer = Tracer(args.run_name)
    tracer.set_scan_config(scan_config)

-    def display_vulnerability(report_id: str, title: str, content: str, severity: str) -> None:
-        severity_color = get_severity_color(severity.lower())
+    def display_vulnerability(report: dict[str, Any]) -> None:
+        report_id = report.get("id", "unknown")

-        vuln_text = Text()
-        vuln_text.append("🐞 ", style="bold red")
-        vuln_text.append("VULNERABILITY FOUND", style="bold red")
-        vuln_text.append(" • ", style="dim white")
-        vuln_text.append(title, style="bold white")
-
-        severity_text = Text()
-        severity_text.append("Severity: ", style="dim white")
-        severity_text.append(severity.upper(), style=f"bold {severity_color}")
+        vuln_text = format_vulnerability_report(report)

        vuln_panel = Panel(
-            Text.assemble(
-                vuln_text,
-                "\n\n",
-                severity_text,
-                "\n\n",
-                content,
-            ),
-            title=f"[bold red]🔍 {report_id.upper()}",
+            vuln_text,
+            title=f"[bold red]{report_id.upper()}",
            title_align="left",
            border_style="red",
            padding=(1, 2),
@@ -139,7 +130,7 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
        status_text.append("Running penetration test...", style="bold #22c55e")
        status_text.append("\n\n")

-        stats_text = build_live_stats_text(tracer)
+        stats_text = build_live_stats_text(tracer, agent_config)
        if stats_text:
            status_text.append(stats_text)

@@ -176,8 +167,11 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915

                if isinstance(result, dict) and not result.get("success", True):
                    error_msg = result.get("error", "Unknown error")
+                    error_details = result.get("details")
                    console.print()
                    console.print(f"[bold red]❌ Penetration test failed:[/] {error_msg}")
+                    if error_details:
+                        console.print(f"[dim]{error_details}[/]")
                    console.print()
                    sys.exit(1)
            finally:
@@ -188,25 +182,6 @@ async def run_cli(args: Any) -> None:  # noqa: PLR0915
        console.print(f"[bold red]Error during penetration test:[/] {e}")
        raise

-    console.print()
-    final_stats_text = Text()
-    final_stats_text.append("📊 ", style="bold cyan")
-    final_stats_text.append("PENETRATION TEST COMPLETED", style="bold green")
-    final_stats_text.append("\n\n")
-
-    stats_text = build_final_stats_text(tracer)
-    if stats_text:
-        final_stats_text.append(stats_text)
-
-    final_stats_panel = Panel(
-        final_stats_text,
-        title="[bold green]✅ Final Statistics",
-        title_align="center",
-        border_style="green",
-        padding=(1, 2),
-    )
-    console.print(final_stats_panel)
-
    if tracer.final_scan_result:
        console.print()

--- a/strix/interface/main.py
+++ b/strix/interface/main.py
@@ -6,10 +6,10 @@ Strix Agent Interface
 import argparse
 import asyncio
 import logging
-import os
 import shutil
 import sys
 from pathlib import Path
+from typing import Any

 import litellm
 from docker.errors import DockerException
@@ -17,9 +17,14 @@ from rich.console import Console
 from rich.panel import Panel
 from rich.text import Text

-from strix.interface.cli import run_cli
-from strix.interface.tui import run_tui
-from strix.interface.utils import (
+from strix.config import Config, apply_saved_config, save_current_config
+
+
+apply_saved_config()
+
+from strix.interface.cli import run_cli  # noqa: E402
+from strix.interface.tui import run_tui  # noqa: E402
+from strix.interface.utils import (  # noqa: E402
    assign_workspace_subdirs,
    build_final_stats_text,
    check_docker_connection,
@@ -29,10 +34,12 @@ from strix.interface.utils import (
    image_exists,
    infer_target_type,
    process_pull_line,
+    rewrite_localhost_targets,
    validate_llm_response,
 )
-from strix.runtime.docker_runtime import STRIX_IMAGE
-from strix.telemetry.tracer import get_global_tracer
+from strix.runtime.docker_runtime import HOST_GATEWAY_HOSTNAME  # noqa: E402
+from strix.telemetry import posthog  # noqa: E402
+from strix.telemetry.tracer import get_global_tracer  # noqa: E402


 logging.getLogger().setLevel(logging.ERROR)
@@ -43,30 +50,30 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
    missing_required_vars = []
    missing_optional_vars = []

-    if not os.getenv("STRIX_LLM"):
+    if not Config.get("strix_llm"):
        missing_required_vars.append("STRIX_LLM")

    has_base_url = any(
        [
-            os.getenv("LLM_API_BASE"),
-            os.getenv("OPENAI_API_BASE"),
-            os.getenv("LITELLM_BASE_URL"),
-            os.getenv("OLLAMA_API_BASE"),
+            Config.get("llm_api_base"),
+            Config.get("openai_api_base"),
+            Config.get("litellm_base_url"),
+            Config.get("ollama_api_base"),
        ]
    )

-    if not os.getenv("LLM_API_KEY"):
-        if not has_base_url:
-            missing_required_vars.append("LLM_API_KEY")
-        else:
-            missing_optional_vars.append("LLM_API_KEY")
+    if not Config.get("llm_api_key"):
+        missing_optional_vars.append("LLM_API_KEY")

    if not has_base_url:
        missing_optional_vars.append("LLM_API_BASE")

-    if not os.getenv("PERPLEXITY_API_KEY"):
+    if not Config.get("perplexity_api_key"):
        missing_optional_vars.append("PERPLEXITY_API_KEY")

+    if not Config.get("strix_reasoning_effort"):
+        missing_optional_vars.append("STRIX_REASONING_EFFORT")
+
    if missing_required_vars:
        error_text = Text()
        error_text.append("❌ ", style="bold red")
@@ -92,13 +99,6 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                    " - Model name to use with litellm (e.g., 'openai/gpt-5')\n",
                    style="white",
                )
-            elif var == "LLM_API_KEY":
-                error_text.append("• ", style="white")
-                error_text.append("LLM_API_KEY", style="bold cyan")
-                error_text.append(
-                    " - API key for the LLM provider (required for cloud providers)\n",
-                    style="white",
-                )

        if missing_optional_vars:
            error_text.append("\nOptional environment variables:\n", style="white")
@@ -106,7 +106,11 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                if var == "LLM_API_KEY":
                    error_text.append("• ", style="white")
                    error_text.append("LLM_API_KEY", style="bold cyan")
-                    error_text.append(" - API key for the LLM provider\n", style="white")
+                    error_text.append(
+                        " - API key for the LLM provider "
+                        "(not needed for local models, Vertex AI, AWS, etc.)\n",
+                        style="white",
+                    )
                elif var == "LLM_API_BASE":
                    error_text.append("• ", style="white")
                    error_text.append("LLM_API_BASE", style="bold cyan")
@@ -121,18 +125,24 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                        " - API key for Perplexity AI web search (enables real-time research)\n",
                        style="white",
                    )
+                elif var == "STRIX_REASONING_EFFORT":
+                    error_text.append("• ", style="white")
+                    error_text.append("STRIX_REASONING_EFFORT", style="bold cyan")
+                    error_text.append(
+                        " - Reasoning effort level: none, minimal, low, medium, high, xhigh "
+                        "(default: high)\n",
+                        style="white",
+                    )

        error_text.append("\nExample setup:\n", style="white")
        error_text.append("export STRIX_LLM='openai/gpt-5'\n", style="dim white")

-        if "LLM_API_KEY" in missing_required_vars:
-            error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
-
        if missing_optional_vars:
            for var in missing_optional_vars:
                if var == "LLM_API_KEY":
                    error_text.append(
-                        "export LLM_API_KEY='your-api-key-here'  # optional with local models\n",
+                        "export LLM_API_KEY='your-api-key-here'  "
+                        "# not needed for local models, Vertex AI, AWS, etc.\n",
                        style="dim white",
                    )
                elif var == "LLM_API_BASE":
@@ -145,6 +155,11 @@ def validate_environment() -> None:  # noqa: PLR0912, PLR0915
                    error_text.append(
                        "export PERPLEXITY_API_KEY='your-perplexity-key-here'\n", style="dim white"
                    )
+                elif var == "STRIX_REASONING_EFFORT":
+                    error_text.append(
+                        "export STRIX_REASONING_EFFORT='high'\n",
+                        style="dim white",
+                    )

        panel = Panel(
            error_text,
@@ -187,33 +202,33 @@ async def warm_up_llm() -> None:
    console = Console()

    try:
-        model_name = os.getenv("STRIX_LLM", "openai/gpt-5")
-        api_key = os.getenv("LLM_API_KEY")
-
-        if api_key:
-            litellm.api_key = api_key
-
+        model_name = Config.get("strix_llm")
+        api_key = Config.get("llm_api_key")
        api_base = (
-            os.getenv("LLM_API_BASE")
-            or os.getenv("OPENAI_API_BASE")
-            or os.getenv("LITELLM_BASE_URL")
-            or os.getenv("OLLAMA_API_BASE")
+            Config.get("llm_api_base")
+            or Config.get("openai_api_base")
+            or Config.get("litellm_base_url")
+            or Config.get("ollama_api_base")
        )
-        if api_base:
-            litellm.api_base = api_base

        test_messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Reply with just 'OK'."},
        ]

-        llm_timeout = int(os.getenv("LLM_TIMEOUT", "600"))
+        llm_timeout = int(Config.get("llm_timeout") or "300")

-        response = litellm.completion(
-            model=model_name,
-            messages=test_messages,
-            timeout=llm_timeout,
-        )
+        completion_kwargs: dict[str, Any] = {
+            "model": model_name,
+            "messages": test_messages,
+            "timeout": llm_timeout,
+        }
+        if api_key:
+            completion_kwargs["api_key"] = api_key
+        if api_base:
+            completion_kwargs["api_base"] = api_base
+
+        response = litellm.completion(**completion_kwargs)

        validate_llm_response(response)

@@ -240,6 +255,15 @@ async def warm_up_llm() -> None:
        sys.exit(1)


+def get_version() -> str:
+    try:
+        from importlib.metadata import version
+
+        return version("strix-agent")
+    except Exception:  # noqa: BLE001
+        return "unknown"
+
+
 def parse_arguments() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Strix Multi-Agent Cybersecurity Penetration Testing Tool",
@@ -270,11 +294,18 @@ Examples:
  strix --target example.com --instruction "Focus on authentication vulnerabilities"

  # Custom instructions (from file)
-  strix --target example.com --instruction ./instructions.txt
-  strix --target https://app.com --instruction /path/to/detailed_instructions.md
+  strix --target example.com --instruction-file ./instructions.txt
+  strix --target https://app.com --instruction-file /path/to/detailed_instructions.md
        """,
    )

+    parser.add_argument(
+        "-v",
+        "--version",
+        action="version",
+        version=f"strix {get_version()}",
+    )
+
    parser.add_argument(
        "-t",
        "--target",
@@ -292,15 +323,15 @@ Examples:
        "testing approaches (e.g., 'Perform thorough authentication testing'), "
        "test credentials (e.g., 'Use the following credentials to access the app: "
        "admin:password123'), "
-        "or areas of interest (e.g., 'Check login API endpoint for security issues'). "
-        "You can also provide a path to a file containing detailed instructions "
-        "(e.g., '--instruction ./instructions.txt').",
+        "or areas of interest (e.g., 'Check login API endpoint for security issues').",
    )

    parser.add_argument(
-        "--run-name",
+        "--instruction-file",
        type=str,
-        help="Custom name for this penetration test run",
+        help="Path to a file containing detailed custom instructions for the penetration test. "
+        "Use this option when you have lengthy or complex instructions saved in a file "
+        "(e.g., '--instruction-file ./detailed_instructions.txt').",
    )

    parser.add_argument(
@@ -313,18 +344,37 @@ Examples:
        ),
    )

+    parser.add_argument(
+        "-m",
+        "--scan-mode",
+        type=str,
+        choices=["quick", "standard", "deep"],
+        default="deep",
+        help=(
+            "Scan mode: "
+            "'quick' for fast CI/CD checks, "
+            "'standard' for routine testing, "
+            "'deep' for thorough security reviews. "
+            "Default: deep."
+        ),
+    )
+
    args = parser.parse_args()

-    if args.instruction:
-        instruction_path = Path(args.instruction)
-        if instruction_path.exists() and instruction_path.is_file():
-            try:
-                with instruction_path.open(encoding="utf-8") as f:
-                    args.instruction = f.read().strip()
-                    if not args.instruction:
-                        parser.error(f"Instruction file '{instruction_path}' is empty")
-            except Exception as e:  # noqa: BLE001
-                parser.error(f"Failed to read instruction file '{instruction_path}': {e}")
+    if args.instruction and args.instruction_file:
+        parser.error(
+            "Cannot specify both --instruction and --instruction-file. Use one or the other."
+        )
+
+    if args.instruction_file:
+        instruction_path = Path(args.instruction_file)
+        try:
+            with instruction_path.open(encoding="utf-8") as f:
+                args.instruction = f.read().strip()
+                if not args.instruction:
+                    parser.error(f"Instruction file '{instruction_path}' is empty")
+        except Exception as e:  # noqa: BLE001
+            parser.error(f"Failed to read instruction file '{instruction_path}': {e}")

    args.targets_info = []
    for target in args.target:
@@ -343,6 +393,7 @@ Examples:
            parser.error(f"Invalid target '{target}'")

    assign_workspace_subdirs(args.targets_info)
+    rewrite_localhost_targets(args.targets_info, HOST_GATEWAY_HOSTNAME)

    return args

@@ -410,17 +461,20 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
    console.print("\n")
    console.print(panel)
    console.print()
+    console.print("[dim]🌐 Website:[/] [cyan]https://strix.ai[/]")
+    console.print("[dim]💬 Discord:[/] [cyan]https://discord.gg/YjKFvEZSdZ[/]")
+    console.print()


 def pull_docker_image() -> None:
    console = Console()
    client = check_docker_connection()

-    if image_exists(client, STRIX_IMAGE):
+    if image_exists(client, Config.get("strix_image")):  # type: ignore[arg-type]
        return

    console.print()
-    console.print(f"[bold cyan]🐳 Pulling Docker image:[/] {STRIX_IMAGE}")
+    console.print(f"[bold cyan]🐳 Pulling Docker image:[/] {Config.get('strix_image')}")
    console.print("[dim yellow]This only happens on first run and may take a few minutes...[/]")
    console.print()

@@ -429,7 +483,7 @@ def pull_docker_image() -> None:
            layers_info: dict[str, str] = {}
            last_update = ""

-            for line in client.api.pull(STRIX_IMAGE, stream=True, decode=True):
+            for line in client.api.pull(Config.get("strix_image"), stream=True, decode=True):
                last_update = process_pull_line(line, layers_info, status, last_update)

        except DockerException as e:
@@ -438,7 +492,7 @@ def pull_docker_image() -> None:
            error_text.append("❌ ", style="bold red")
            error_text.append("FAILED TO PULL IMAGE", style="bold red")
            error_text.append("\n\n", style="white")
-            error_text.append(f"Could not download: {STRIX_IMAGE}\n", style="white")
+            error_text.append(f"Could not download: {Config.get('strix_image')}\n", style="white")
            error_text.append(str(e), style="dim red")

            panel = Panel(
@@ -470,8 +524,9 @@ def main() -> None:
    validate_environment()
    asyncio.run(warm_up_llm())

-    if not args.run_name:
-        args.run_name = generate_run_name(args.targets_info)
+    save_current_config()
+
+    args.run_name = generate_run_name(args.targets_info)

    for target_info in args.targets_info:
        if target_info["type"] == "repository":
@@ -482,10 +537,32 @@ def main() -> None:

    args.local_sources = collect_local_sources(args.targets_info)

-    if args.non_interactive:
-        asyncio.run(run_cli(args))
-    else:
-        asyncio.run(run_tui(args))
+    is_whitebox = bool(args.local_sources)
+
+    posthog.start(
+        model=Config.get("strix_llm"),
+        scan_mode=args.scan_mode,
+        is_whitebox=is_whitebox,
+        interactive=not args.non_interactive,
+        has_instructions=bool(args.instruction),
+    )
+
+    exit_reason = "user_exit"
+    try:
+        if args.non_interactive:
+            asyncio.run(run_cli(args))
+        else:
+            asyncio.run(run_tui(args))
+    except KeyboardInterrupt:
+        exit_reason = "interrupted"
+    except Exception as e:
+        exit_reason = "error"
+        posthog.error("unhandled_exception", str(e))
+        raise
+    finally:
+        tracer = get_global_tracer()
+        if tracer:
+            posthog.end(tracer, exit_reason=exit_reason)

    results_path = Path("strix_runs") / args.run_name
    display_completion_message(args, results_path)
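The exit-reason bookkeeping added to `main()` above can be sketched in isolation. This is a minimal sketch, not the project's code: `run_with_exit_reason` and the `on_end` callback are hypothetical stand-ins for the `posthog.end(...)` call, and only the control flow mirrors the diff.

```python
from typing import Callable


def run_with_exit_reason(task: Callable[[], None], on_end: Callable[[str], None]) -> None:
    """Run task and always report why it ended (hypothetical helper)."""
    exit_reason = "user_exit"  # default when the task returns normally
    try:
        task()
    except KeyboardInterrupt:
        exit_reason = "interrupted"  # swallowed, as in the diff
    except Exception:
        exit_reason = "error"
        raise  # errors are still propagated after being classified
    finally:
        on_end(exit_reason)  # runs on every path, including the re-raise


def _interrupt() -> None:
    raise KeyboardInterrupt


reasons: list[str] = []
run_with_exit_reason(lambda: None, reasons.append)
run_with_exit_reason(_interrupt, reasons.append)
print(reasons)  # ['user_exit', 'interrupted']
```

Because the report happens in `finally`, the callback fires exactly once per run whether the task returns, is interrupted, or raises.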
--- a/strix/interface/streaming_parser.py
+++ b/strix/interface/streaming_parser.py
@@ -0,0 +1,119 @@
+import html
+import re
+from dataclasses import dataclass
+from typing import Literal
+
+
+_FUNCTION_TAG_PREFIX = "<function="
+
+
+def _get_safe_content(content: str) -> tuple[str, str]:
+    if not content:
+        return "", ""
+
+    last_lt = content.rfind("<")
+    if last_lt == -1:
+        return content, ""
+
+    suffix = content[last_lt:]
+    target = _FUNCTION_TAG_PREFIX  # "<function="
+
+    if target.startswith(suffix):
+        return content[:last_lt], suffix
+
+    return content, ""
+
+
+@dataclass
+class StreamSegment:
+    type: Literal["text", "tool"]
+    content: str
+    tool_name: str | None = None
+    args: dict[str, str] | None = None
+    is_complete: bool = False
+
+
+def parse_streaming_content(content: str) -> list[StreamSegment]:
+    if not content:
+        return []
+
+    segments: list[StreamSegment] = []
+
+    func_pattern = r"<function=([^>]+)>"
+    func_matches = list(re.finditer(func_pattern, content))
+
+    if not func_matches:
+        safe_content, _ = _get_safe_content(content)
+        text = safe_content.strip()
+        if text:
+            segments.append(StreamSegment(type="text", content=text))
+        return segments
+
+    first_func_start = func_matches[0].start()
+    if first_func_start > 0:
+        text_before = content[:first_func_start].strip()
+        if text_before:
+            segments.append(StreamSegment(type="text", content=text_before))
+
+    for i, match in enumerate(func_matches):
+        tool_name = match.group(1)
+        func_start = match.end()
+
+        func_end_match = re.search(r"</function>", content[func_start:])
+
+        if func_end_match:
+            func_body = content[func_start : func_start + func_end_match.start()]
+            is_complete = True
+            end_pos = func_start + func_end_match.end()
+        else:
+            if i + 1 < len(func_matches):
+                next_func_start = func_matches[i + 1].start()
+                func_body = content[func_start:next_func_start]
+            else:
+                func_body = content[func_start:]
+            is_complete = False
+            end_pos = len(content)
+
+        args = _parse_streaming_params(func_body)
+
+        segments.append(
+            StreamSegment(
+                type="tool",
+                content=func_body,
+                tool_name=tool_name,
+                args=args,
+                is_complete=is_complete,
+            )
+        )
+
+        if is_complete and i + 1 < len(func_matches):
+            next_start = func_matches[i + 1].start()
+            text_between = content[end_pos:next_start].strip()
+            if text_between:
+                segments.append(StreamSegment(type="text", content=text_between))
+
+    return segments
+
+
+def _parse_streaming_params(func_body: str) -> dict[str, str]:
+    args: dict[str, str] = {}
+
+    complete_pattern = r"<parameter=([^>]+)>(.*?)</parameter>"
+    complete_matches = list(re.finditer(complete_pattern, func_body, re.DOTALL))
+    complete_end_pos = 0
+
+    for match in complete_matches:
+        param_name = match.group(1)
+        param_value = html.unescape(match.group(2).strip())
+        args[param_name] = param_value
+        complete_end_pos = max(complete_end_pos, match.end())
+
+    remaining = func_body[complete_end_pos:]
+    incomplete_pattern = r"<parameter=([^>]+)>(.*)$"
+    incomplete_match = re.search(incomplete_pattern, remaining, re.DOTALL)
+    if incomplete_match:
+        param_name = incomplete_match.group(1)
+        param_value = html.unescape(incomplete_match.group(2).strip())
+        args[param_name] = param_value
+
+    return args
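The hold-back logic in `_get_safe_content` is easiest to see against a buffer that grows chunk by chunk. The sketch below restates that function under the name `split_safe` so it runs standalone; `visible_frames` is a hypothetical driver that is not part of the diff and only illustrates what a renderer could show after each chunk.

```python
_FUNCTION_TAG_PREFIX = "<function="


def split_safe(content: str) -> tuple[str, str]:
    """Return (displayable text, held-back tail) for a partially streamed buffer."""
    last_lt = content.rfind("<")
    if last_lt == -1:
        return content, ""
    suffix = content[last_lt:]
    # Hold the tail back only while it could still grow into "<function=".
    if _FUNCTION_TAG_PREFIX.startswith(suffix):
        return content[:last_lt], suffix
    return content, ""


def visible_frames(chunks: list[str]) -> list[str]:
    """Hypothetical driver: accumulate chunks and record what would be shown."""
    buffer = ""
    frames = []
    for chunk in chunks:
        buffer += chunk
        shown, _held = split_safe(buffer)
        frames.append(shown)
    return frames


print(split_safe("Scanning <fun"))  # ('Scanning ', '<fun')
print(split_safe("a < b"))          # ('a < b', '')
print(visible_frames(["Scanning ", "<fun"]))  # ['Scanning ', 'Scanning ']
```

A trailing `<fun` is withheld because it may still become a tool call, while `a < b` is shown in full because `< b` can never be a prefix of `<function=`.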
--- a/strix/interface/tool_components/__init__.py
+++ b/strix/interface/tool_components/__init__.py
@@ -1,4 +1,5 @@
 from . import (
+    agent_message_renderer,
    agents_graph_renderer,
    browser_renderer,
    file_edit_renderer,
@@ -10,6 +11,7 @@ from . import (
    scan_info_renderer,
    terminal_renderer,
    thinking_renderer,
+    todo_renderer,
    user_message_renderer,
    web_search_renderer,
 )
@@ -20,6 +22,7 @@ from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer
 __all__ = [
    "BaseToolRenderer",
    "ToolTUIRegistry",
+    "agent_message_renderer",
    "agents_graph_renderer",
    "browser_renderer",
    "file_edit_renderer",
@@ -34,6 +37,7 @@ __all__ = [
    "scan_info_renderer",
    "terminal_renderer",
    "thinking_renderer",
+    "todo_renderer",
    "user_message_renderer",
    "web_search_renderer",
 ]
--- a/strix/interface/tool_components/agent_message_renderer.py
+++ b/strix/interface/tool_components/agent_message_renderer.py
@@ -0,0 +1,190 @@
+from functools import cache
+from typing import Any, ClassVar
+
+from pygments.lexers import get_lexer_by_name, guess_lexer
+from pygments.styles import get_style_by_name
+from pygments.util import ClassNotFound
+from rich.text import Text
+from textual.widgets import Static
+
+from .base_renderer import BaseToolRenderer
+from .registry import register_tool_renderer
+
+
+_HEADER_STYLES = [
+    ("###### ", 7, "bold #4ade80"),
+    ("##### ", 6, "bold #22c55e"),
+    ("#### ", 5, "bold #16a34a"),
+    ("### ", 4, "bold #15803d"),
+    ("## ", 3, "bold #22c55e"),
+    ("# ", 2, "bold #4ade80"),
+]
+
+
+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
+def _get_token_color(token_type: Any) -> str | None:
+    colors = _get_style_colors()
+    while token_type:
+        if token_type in colors:
+            return colors[token_type]
+        token_type = token_type.parent
+    return None
+
+
+def _highlight_code(code: str, language: str | None = None) -> Text:
+    text = Text()
+
+    try:
+        lexer = get_lexer_by_name(language) if language else guess_lexer(code)
+    except ClassNotFound:
+        text.append(code, style="#d4d4d4")
+        return text
+
+    for token_type, token_value in lexer.get_tokens(code):
+        if not token_value:
+            continue
+        color = _get_token_color(token_type)
+        text.append(token_value, style=color)
+
+    return text
+
+
+def _try_parse_header(line: str) -> tuple[str, str] | None:
+    for prefix, strip_len, style in _HEADER_STYLES:
+        if line.startswith(prefix):
+            return (line[strip_len:], style)
+    return None
+
+
+def _apply_markdown_styles(text: str) -> Text:  # noqa: PLR0912
+    result = Text()
+    lines = text.split("\n")
+
+    in_code_block = False
+    code_block_lang: str | None = None
+    code_block_lines: list[str] = []
+
+    for i, line in enumerate(lines):
+        if i > 0 and not in_code_block:
+            result.append("\n")
+
+        if line.startswith("```"):
+            if not in_code_block:
+                in_code_block = True
+                code_block_lang = line[3:].strip() or None
+                code_block_lines = []
+                if i > 0:
+                    result.append("\n")
+            else:
+                in_code_block = False
+                code_content = "\n".join(code_block_lines)
+                if code_content:
+                    result.append_text(_highlight_code(code_content, code_block_lang))
+                code_block_lines = []
+                code_block_lang = None
+            continue
+
+        if in_code_block:
+            code_block_lines.append(line)
+            continue
+
+        header = _try_parse_header(line)
+        if header:
+            result.append(header[0], style=header[1])
+        elif line.startswith("> "):
+            result.append("┃ ", style="#22c55e")
+            result.append_text(_process_inline_formatting(line[2:]))
+        elif line.startswith(("- ", "* ")):
+            result.append("• ", style="#22c55e")
+            result.append_text(_process_inline_formatting(line[2:]))
+        elif len(line) > 2 and line[0].isdigit() and line[1:3] in (". ", ") "):
+            result.append(line[0] + ". ", style="#22c55e")
+            result.append_text(_process_inline_formatting(line[3:]))
+        elif line.strip() in ("---", "***", "___"):
+            result.append("─" * 40, style="#22c55e")
+        else:
+            result.append_text(_process_inline_formatting(line))
+
+    if in_code_block and code_block_lines:
+        code_content = "\n".join(code_block_lines)
+        result.append_text(_highlight_code(code_content, code_block_lang))
+
+    return result
+
+
+def _process_inline_formatting(line: str) -> Text:
+    result = Text()
+    i = 0
+    n = len(line)
+
+    while i < n:
+        if i + 1 < n and line[i : i + 2] in ("**", "__"):
+            marker = line[i : i + 2]
+            end = line.find(marker, i + 2)
+            if end != -1:
+                result.append(line[i + 2 : end], style="bold #4ade80")
+                i = end + 2
+                continue
+
+        if i + 1 < n and line[i : i + 2] == "~~":
+            end = line.find("~~", i + 2)
+            if end != -1:
+                result.append(line[i + 2 : end], style="strike #525252")
+                i = end + 2
+                continue
+
+        if line[i] == "`":
+            end = line.find("`", i + 1)
+            if end != -1:
+                result.append(line[i + 1 : end], style="bold #22c55e on #0a0a0a")
+                i = end + 1
+                continue
+
+        if line[i] in ("*", "_"):
+            marker = line[i]
+            if i + 1 < n and line[i + 1] != marker:
+                end = line.find(marker, i + 1)
+                if end != -1 and (end + 1 >= n or line[end + 1] != marker):
+                    result.append(line[i + 1 : end], style="italic #86efac")
+                    i = end + 1
+                    continue
+
+        result.append(line[i])
+        i += 1
+
+    return result
+
+
+@register_tool_renderer
+class AgentMessageRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "agent_message"
+    css_classes: ClassVar[list[str]] = ["chat-message", "agent-message"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        content = tool_data.get("content", "")
+
+        if not content:
+            return Static(Text(), classes=" ".join(cls.css_classes))
+
+        styled_text = _apply_markdown_styles(content)
+
+        return Static(styled_text, classes=" ".join(cls.css_classes))
+
+    @classmethod
+    def render_simple(cls, content: str) -> Text:
+        if not content:
+            return Text()
+
+        from strix.llm.utils import clean_content
+
+        cleaned = clean_content(content)
+        if not cleaned:
+            return Text()
+
+        return _apply_markdown_styles(cleaned)
--- a/strix/interface/tool_components/agents_graph_renderer.py
+++ b/strix/interface/tool_components/agents_graph_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -12,11 +13,15 @@ class ViewAgentGraphRenderer(BaseToolRenderer):
    css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]

    @classmethod
-    def render(cls, tool_data: dict[str, Any]) -> Static:  # noqa: ARG003
-        content_text = "🕸️ [bold #fbbf24]Viewing agents graph[/]"
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        status = tool_data.get("status", "unknown")

-        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        text = Text()
+        text.append("◇ ", style="#a78bfa")
+        text.append("viewing agents graph", style="dim")
+
+        css_classes = cls.get_css_classes(status)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -27,20 +32,22 @@ class CreateAgentRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
+        status = tool_data.get("status", "unknown")

        task = args.get("task", "")
        name = args.get("name", "Agent")

-        header = f"🤖 [bold #fbbf24]Creating {cls.escape_markup(name)}[/]"
+        text = Text()
+        text.append("◈ ", style="#a78bfa")
+        text.append("spawning ", style="dim")
+        text.append(name, style="bold #a78bfa")

        if task:
-            task_display = task[:400] + "..." if len(task) > 400 else task
-            content_text = f"{header}\n  [dim]{cls.escape_markup(task_display)}[/]"
-        else:
-            content_text = f"{header}\n  [dim]Spawning agent...[/]"
+            text.append("\n  ")
+            text.append(task, style="dim")

-        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        css_classes = cls.get_css_classes(status)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -51,19 +58,24 @@ class SendMessageToAgentRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
+        status = tool_data.get("status", "unknown")

        message = args.get("message", "")
+        agent_id = args.get("agent_id", "")

-        header = "💬 [bold #fbbf24]Sending message[/]"
+        text = Text()
+        text.append("→ ", style="#60a5fa")
+        if agent_id:
+            text.append(f"to {agent_id}", style="dim")
+        else:
+            text.append("sending message", style="dim")

        if message:
-            message_display = message[:400] + "..." if len(message) > 400 else message
-            content_text = f"{header}\n  [dim]{cls.escape_markup(message_display)}[/]"
-        else:
-            content_text = f"{header}\n  [dim]Sending...[/]"
+            text.append("\n  ")
+            text.append(message, style="dim")

-        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        css_classes = cls.get_css_classes(status)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -79,25 +91,28 @@ class AgentFinishRenderer(BaseToolRenderer):
        findings = args.get("findings", [])
        success = args.get("success", True)

-        header = (
-            "🏁 [bold #fbbf24]Agent completed[/]" if success else "🏁 [bold #fbbf24]Agent failed[/]"
-        )
+        text = Text()
+        text.append("🏁 ")
+
+        if success:
+            text.append("Agent completed", style="bold #fbbf24")
+        else:
+            text.append("Agent failed", style="bold #fbbf24")

        if result_summary:
-            content_parts = [f"{header}\n  [bold]{cls.escape_markup(result_summary)}[/]"]
+            text.append("\n  ")
+            text.append(result_summary, style="bold")

            if findings and isinstance(findings, list):
-                finding_lines = [f"• {finding}" for finding in findings]
-                content_parts.append(
-                    f"  [dim]{chr(10).join([cls.escape_markup(line) for line in finding_lines])}[/]"
-                )
-
-            content_text = "\n".join(content_parts)
+                for finding in findings:
+                    text.append("\n  • ")
+                    text.append(str(finding), style="dim")
        else:
-            content_text = f"{header}\n  [dim]Completing task...[/]"
+            text.append("\n  ")
+            text.append("Completing task...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -108,16 +123,17 @@ class WaitForMessageRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
+        status = tool_data.get("status", "unknown")

-        reason = args.get("reason", "Waiting for messages from other agents or user input")
+        reason = args.get("reason", "")

-        header = "⏸️ [bold #fbbf24]Waiting for messages[/]"
+        text = Text()
+        text.append("○ ", style="#6b7280")
+        text.append("waiting", style="dim")

        if reason:
-            reason_display = reason[:400] + "..." if len(reason) > 400 else reason
-            content_text = f"{header}\n  [dim]{cls.escape_markup(reason_display)}[/]"
-        else:
-            content_text = f"{header}\n  [dim]Agent paused until message received...[/]"
+            text.append("\n  ")
+            text.append(reason, style="dim")

-        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        css_classes = cls.get_css_classes(status)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/base_renderer.py
+++ b/strix/interface/tool_components/base_renderer.py
@@ -1,13 +1,12 @@
 from abc import ABC, abstractmethod
-from typing import Any, ClassVar, cast
+from typing import Any, ClassVar

-from rich.markup import escape as rich_escape
+from rich.text import Text
 from textual.widgets import Static


 class BaseToolRenderer(ABC):
    tool_name: ClassVar[str] = ""
-
    css_classes: ClassVar[list[str]] = ["tool-call"]

    @classmethod
@@ -16,47 +15,80 @@ class BaseToolRenderer(ABC):
        pass

    @classmethod
-    def escape_markup(cls, text: str) -> str:
-        return cast("str", rich_escape(text))
+    def build_text(cls, tool_data: dict[str, Any]) -> Text:  # noqa: ARG003
+        return Text()

    @classmethod
-    def format_args(cls, args: dict[str, Any], max_length: int = 500) -> str:
-        if not args:
-            return ""
-
-        args_parts = []
-        for k, v in args.items():
-            str_v = str(v)
-            if len(str_v) > max_length:
-                str_v = str_v[: max_length - 3] + "..."
-            args_parts.append(f"  [dim]{k}:[/] {cls.escape_markup(str_v)}")
-        return "\n".join(args_parts)
+    def create_static(cls, content: Text, status: str) -> Static:
+        css_classes = cls.get_css_classes(status)
+        return Static(content, classes=css_classes)

    @classmethod
-    def format_result(cls, result: Any, max_length: int = 1000) -> str:
-        if result is None:
-            return ""
-
-        str_result = str(result).strip()
-        if not str_result:
-            return ""
-
-        if len(str_result) > max_length:
-            str_result = str_result[: max_length - 3] + "..."
-        return cls.escape_markup(str_result)
-
-    @classmethod
-    def get_status_icon(cls, status: str) -> str:
-        status_icons = {
-            "running": "[#f59e0b]●[/#f59e0b] In progress...",
-            "completed": "[#22c55e]✓[/#22c55e] Done",
-            "failed": "[#dc2626]✗[/#dc2626] Failed",
-            "error": "[#dc2626]✗[/#dc2626] Error",
+    def status_icon(cls, status: str) -> tuple[str, str]:
+        icons = {
+            "running": ("● In progress...", "#f59e0b"),
+            "completed": ("✓ Done", "#22c55e"),
+            "failed": ("✗ Failed", "#dc2626"),
+            "error": ("✗ Error", "#dc2626"),
        }
-        return status_icons.get(status, "[dim]○[/dim] Unknown")
+        return icons.get(status, ("○ Unknown", "dim"))

    @classmethod
    def get_css_classes(cls, status: str) -> str:
        base_classes = cls.css_classes.copy()
        base_classes.append(f"status-{status}")
        return " ".join(base_classes)
+
+    @classmethod
+    def text_with_style(cls, content: str, style: str | None = None) -> Text:
+        text = Text()
+        text.append(content, style=style)
+        return text
+
+    @classmethod
+    def text_icon_label(
+        cls,
+        icon: str,
+        label: str,
+        icon_style: str | None = None,
+        label_style: str | None = None,
+    ) -> Text:
+        text = Text()
+        text.append(icon, style=icon_style)
+        text.append(" ")
+        text.append(label, style=label_style)
+        return text
+
+    @classmethod
+    def text_header(
+        cls,
+        icon: str,
+        title: str,
+        subtitle: str = "",
+        title_style: str = "bold",
+        subtitle_style: str = "dim",
+    ) -> Text:
+        text = Text()
+        text.append(icon)
+        text.append(" ")
+        text.append(title, style=title_style)
+        if subtitle:
+            text.append(" ")
+            text.append(subtitle, style=subtitle_style)
+        return text
+
+    @classmethod
+    def text_key_value(
+        cls,
+        key: str,
+        value: str,
+        key_style: str = "dim",
+        value_style: str | None = None,
+        indent: int = 2,
+    ) -> Text:
+        text = Text()
+        text.append(" " * indent)
+        text.append(key, style=key_style)
+        text.append(": ")
+        text.append(value, style=value_style)
+        return text
--- a/strix/interface/tool_components/browser_renderer.py
+++ b/strix/interface/tool_components/browser_renderer.py
@@ -1,120 +1,135 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name
+from pygments.styles import get_style_by_name
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
@register_tool_renderer
 class BrowserRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "browser_action"
    css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]

+    SIMPLE_ACTIONS: ClassVar[dict[str, str]] = {
+        "back": "going back in browser history",
+        "forward": "going forward in browser history",
+        "scroll_down": "scrolling down",
+        "scroll_up": "scrolling up",
+        "refresh": "refreshing browser tab",
+        "close_tab": "closing browser tab",
+        "switch_tab": "switching browser tab",
+        "list_tabs": "listing browser tabs",
+        "view_source": "viewing page source",
+        "get_console_logs": "getting console logs",
+        "screenshot": "taking screenshot of browser tab",
+        "wait": "waiting...",
+        "close": "closing browser",
+    }
+
+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_js(cls, code: str) -> Text:
+        lexer = get_lexer_by_name("javascript")
+        text = Text()
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+            color = cls._get_token_color(token_type)
+            text.append(token_value, style=color)
+
+        return text
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
        status = tool_data.get("status", "unknown")

        action = args.get("action", "unknown")
-
-        content = cls._build_sleek_content(action, args)
+        content = cls._build_content(action, args)

        css_classes = cls.get_css_classes(status)
        return Static(content, classes=css_classes)

    @classmethod
-    def _build_sleek_content(cls, action: str, args: dict[str, Any]) -> str:
-        browser_icon = "🌐"
+    def _build_url_action(cls, text: Text, label: str, url: str | None, suffix: str = "") -> None:
+        text.append(label, style="#06b6d4")
+        if url:
+            text.append(url, style="#06b6d4")
+            if suffix:
+                text.append(suffix, style="#06b6d4")
+
+    @classmethod
+    def _build_content(cls, action: str, args: dict[str, Any]) -> Text:
+        text = Text()
+        text.append("🌐 ")
+
+        if action in cls.SIMPLE_ACTIONS:
+            text.append(cls.SIMPLE_ACTIONS[action], style="#06b6d4")
+            return text

        url = args.get("url")
-        text = args.get("text")
-        js_code = args.get("js_code")
-        key = args.get("key")
-        file_path = args.get("file_path")

-        if action in [
-            "launch",
-            "goto",
-            "new_tab",
-            "type",
-            "execute_js",
-            "click",
-            "double_click",
-            "hover",
-            "press_key",
-            "save_pdf",
-        ]:
-            if action == "launch":
-                display_url = cls._format_url(url) if url else None
-                message = (
-                    f"launching {display_url} on browser" if display_url else "launching browser"
-                )
-            elif action == "goto":
-                display_url = cls._format_url(url) if url else None
-                message = f"navigating to {display_url}" if display_url else "navigating"
-            elif action == "new_tab":
-                display_url = cls._format_url(url) if url else None
-                message = f"opening tab {display_url}" if display_url else "opening tab"
-            elif action == "type":
-                display_text = cls._format_text(text) if text else None
-                message = f"typing {display_text}" if display_text else "typing"
-            elif action == "execute_js":
-                display_js = cls._format_js(js_code) if js_code else None
-                message = (
-                    f"executing javascript\n{display_js}" if display_js else "executing javascript"
-                )
-            elif action == "press_key":
-                display_key = cls.escape_markup(key) if key else None
-                message = f"pressing key {display_key}" if display_key else "pressing key"
-            elif action == "save_pdf":
-                display_path = cls.escape_markup(file_path) if file_path else None
-                message = f"saving PDF to {display_path}" if display_path else "saving PDF"
-            else:
-                action_words = {
-                    "click": "clicking",
-                    "double_click": "double clicking",
-                    "hover": "hovering",
-                }
-                message = cls.escape_markup(action_words[action])
-
-            return f"{browser_icon} [#06b6d4]{message}[/]"
-
-        simple_actions = {
-            "back": "going back in browser history",
-            "forward": "going forward in browser history",
-            "scroll_down": "scrolling down",
-            "scroll_up": "scrolling up",
-            "refresh": "refreshing browser tab",
-            "close_tab": "closing browser tab",
-            "switch_tab": "switching browser tab",
-            "list_tabs": "listing browser tabs",
-            "view_source": "viewing page source",
-            "get_console_logs": "getting console logs",
-            "screenshot": "taking screenshot of browser tab",
-            "wait": "waiting...",
-            "close": "closing browser",
+        url_actions = {
+            "launch": ("launching ", " on browser"),
+            "goto": ("navigating to ", ""),
+            "new_tab": ("opening tab ", ""),
        }
+        if action in url_actions:
+            label, suffix = url_actions[action]
+            if action == "launch" and not url:
+                text.append("launching browser", style="#06b6d4")
+            else:
+                cls._build_url_action(text, label, url, suffix)
+            return text

-        if action in simple_actions:
-            return f"{browser_icon} [#06b6d4]{cls.escape_markup(simple_actions[action])}[/]"
+        click_actions = {
+            "click": "clicking",
+            "double_click": "double clicking",
+            "hover": "hovering",
+        }
+        if action in click_actions:
+            text.append(click_actions[action], style="#06b6d4")
+            return text

-        return f"{browser_icon} [#06b6d4]{cls.escape_markup(action)}[/]"
+        handlers: dict[str, tuple[str, str | None]] = {
+            "type": ("typing ", args.get("text")),
+            "press_key": ("pressing key ", args.get("key")),
+            "save_pdf": ("saving PDF to ", args.get("file_path")),
+        }
+        if action in handlers:
+            label, value = handlers[action]
+            text.append(label, style="#06b6d4")
+            if value:
+                text.append(str(value), style="#06b6d4")
+            return text

-    @classmethod
-    def _format_url(cls, url: str) -> str:
-        if len(url) > 300:
-            url = url[:297] + "..."
-        return cls.escape_markup(url)
+        if action == "execute_js":
+            text.append("executing javascript", style="#06b6d4")
+            js_code = args.get("js_code")
+            if js_code:
+                text.append("\n")
+                text.append_text(cls._highlight_js(js_code))
+            return text

-    @classmethod
-    def _format_text(cls, text: str) -> str:
-        if len(text) > 200:
-            text = text[:197] + "..."
-        return cls.escape_markup(text)
-
-    @classmethod
-    def _format_js(cls, js_code: str) -> str:
-        if len(js_code) > 200:
-            js_code = js_code[:197] + "..."
-        return f"[white]{cls.escape_markup(js_code)}[/white]"
+        text.append(action, style="#06b6d4")
+        return text
--- a/strix/interface/tool_components/file_edit_renderer.py
+++ b/strix/interface/tool_components/file_edit_renderer.py
@@ -1,16 +1,56 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
+from pygments.styles import get_style_by_name
+from pygments.util import ClassNotFound
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
+def _get_lexer_for_file(path: str) -> Any:
+    try:
+        return get_lexer_for_filename(path)
+    except ClassNotFound:
+        return get_lexer_by_name("text")
+
+
@register_tool_renderer
 class StrReplaceEditorRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "str_replace_editor"
    css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_code(cls, code: str, path: str) -> Text:
+        lexer = _get_lexer_for_file(path)
+        text = Text()
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+            color = cls._get_token_color(token_type)
+            text.append(token_value, style=color)
+
+        return text
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
@@ -18,28 +58,67 @@ class StrReplaceEditorRenderer(BaseToolRenderer):

        command = args.get("command", "")
        path = args.get("path", "")
+        old_str = args.get("old_str", "")
+        new_str = args.get("new_str", "")
+        file_text = args.get("file_text", "")

-        if command == "view":
-            header = "📖 [bold #10b981]Reading file[/]"
-        elif command == "str_replace":
-            header = "✏️ [bold #10b981]Editing file[/]"
-        elif command == "create":
-            header = "📝 [bold #10b981]Creating file[/]"
-        elif command == "insert":
-            header = "✏️ [bold #10b981]Inserting text[/]"
-        elif command == "undo_edit":
-            header = "↩️ [bold #10b981]Undoing edit[/]"
-        else:
-            header = "📄 [bold #10b981]File operation[/]"
+        text = Text()

-        if (result and isinstance(result, dict) and "content" in result) or path:
+        icons_and_labels = {
+            "view": ("📖 ", "Reading file", "#10b981"),
+            "str_replace": ("✏️ ", "Editing file", "#10b981"),
+            "create": ("📝 ", "Creating file", "#10b981"),
+            "insert": ("✏️ ", "Inserting text", "#10b981"),
+            "undo_edit": ("↩️ ", "Undoing edit", "#10b981"),
+        }
+
+        icon, label, color = icons_and_labels.get(command, ("📄 ", "File operation", "#10b981"))
+        text.append(icon)
+        text.append(label, style=f"bold {color}")
+
+        if path:
            path_display = path[-60:] if len(path) > 60 else path
-            content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
-        else:
-            content_text = f"{header} [dim]Processing...[/]"
+            text.append(" ")
+            text.append(path_display, style="dim")
+
+        if command == "str_replace" and (old_str or new_str):
+            if old_str:
+                highlighted_old = cls._highlight_code(old_str, path)
+                for line in highlighted_old.split("\n"):
+                    text.append("\n")
+                    text.append("-", style="#ef4444")
+                    text.append(" ")
+                    text.append(line)
+
+            if new_str:
+                highlighted_new = cls._highlight_code(new_str, path)
+                for line in highlighted_new.split("\n"):
+                    text.append("\n")
+                    text.append("+", style="#22c55e")
+                    text.append(" ")
+                    text.append(line)
+
+        elif command == "create" and file_text:
+            text.append("\n")
+            text.append_text(cls._highlight_code(file_text, path))
+
+        elif command == "insert" and new_str:
+            highlighted_new = cls._highlight_code(new_str, path)
+            for line in highlighted_new.split("\n"):
+                text.append("\n")
+                text.append("+", style="#22c55e")
+                text.append(" ")
+                text.append(line)
+
+        elif isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif not (result and isinstance(result, dict) and "content" in result) and not path:
+            text.append(" ")
+            text.append("Processing...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -50,19 +129,21 @@ class ListFilesRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
-
        path = args.get("path", "")

-        header = "📂 [bold #10b981]Listing files[/]"
+        text = Text()
+        text.append("📂 ")
+        text.append("Listing files", style="bold #10b981")
+        text.append(" ")

        if path:
            path_display = path[-60:] if len(path) > 60 else path
-            content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
+            text.append(path_display, style="dim")
        else:
-            content_text = f"{header} [dim]Current directory[/]"
+            text.append("Current directory", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -73,27 +154,27 @@ class SearchFilesRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
-
        path = args.get("path", "")
        regex = args.get("regex", "")

-        header = "🔍 [bold purple]Searching files[/]"
+        text = Text()
+        text.append("🔍 ")
+        text.append("Searching files", style="bold purple")
+        text.append(" ")

        if path and regex:
-            path_display = path[-30:] if len(path) > 30 else path
-            regex_display = regex[:30] if len(regex) > 30 else regex
-            content_text = (
-                f"{header} [dim]{cls.escape_markup(path_display)} for "
-                f"'{cls.escape_markup(regex_display)}'[/]"
-            )
+            text.append(path, style="dim")
+            text.append(" for '", style="dim")
+            text.append(regex, style="dim")
+            text.append("'", style="dim")
        elif path:
-            path_display = path[-60:] if len(path) > 60 else path
-            content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
+            text.append(path, style="dim")
        elif regex:
-            regex_display = regex[:60] if len(regex) > 60 else regex
-            content_text = f"{header} [dim]'{cls.escape_markup(regex_display)}'[/]"
+            text.append("'", style="dim")
+            text.append(regex, style="dim")
+            text.append("'", style="dim")
        else:
-            content_text = f"{header} [dim]Searching...[/]"
+            text.append("Searching...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/finish_renderer.py
+++ b/strix/interface/tool_components/finish_renderer.py
@@ -1,11 +1,17 @@
 from typing import Any, ClassVar

+from rich.padding import Padding
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+FIELD_STYLE = "bold #4ade80"
+BG_COLOR = "#141414"
+
+
@register_tool_renderer
 class FinishScanRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "finish_scan"
@@ -15,17 +21,44 @@ class FinishScanRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})

-        content = args.get("content", "")
-        success = args.get("success", True)
+        executive_summary = args.get("executive_summary", "")
+        methodology = args.get("methodology", "")
+        technical_analysis = args.get("technical_analysis", "")
+        recommendations = args.get("recommendations", "")

-        header = (
-            "🏁 [bold #dc2626]Finishing Scan[/]" if success else "🏁 [bold #dc2626]Scan Failed[/]"
-        )
+        text = Text()
+        text.append("🏁 ")
+        text.append("Finishing Scan", style="bold #dc2626")

-        if content:
-            content_text = f"{header}\n  [bold]{cls.escape_markup(content)}[/]"
-        else:
-            content_text = f"{header}\n  [dim]Generating final report...[/]"
+        if executive_summary:
+            text.append("\n\n")
+            text.append("Executive Summary", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(executive_summary)
+
+        if methodology:
+            text.append("\n\n")
+            text.append("Methodology", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(methodology)
+
+        if technical_analysis:
+            text.append("\n\n")
+            text.append("Technical Analysis", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(technical_analysis)
+
+        if recommendations:
+            text.append("\n\n")
+            text.append("Recommendations", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(recommendations)
+
+        if not (executive_summary or methodology or technical_analysis or recommendations):
+            text.append("\n  ")
+            text.append("Generating final report...", style="dim")
+
+        padded = Padding(text, 2, style=f"on {BG_COLOR}")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(padded, classes=css_classes)
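Review note: the four section blocks in `FinishScanRenderer.render` differ only in label and argument key, so they could be driven from a small table. The sketch below is illustrative only. It produces plain `(text, style)` tuples rather than a `rich.text.Text`, and the names `SECTIONS` and `report_segments` are hypothetical, not part of this change. `FIELD_STYLE` is copied from `reporting_renderer.py`.

```python
from __future__ import annotations

FIELD_STYLE = "bold #4ade80"  # same value as in reporting_renderer.py

# (display label, args key) in the order the renderer shows them.
SECTIONS = (
    ("Executive Summary", "executive_summary"),
    ("Methodology", "methodology"),
    ("Technical Analysis", "technical_analysis"),
    ("Recommendations", "recommendations"),
)


def report_segments(args: dict[str, str]) -> list[tuple[str, str | None]]:
    """Build (text, style) segments in the order the renderer appends them."""
    segments: list[tuple[str, str | None]] = [
        ("🏁 ", None),
        ("Finishing Scan", "bold #dc2626"),
    ]

    # Keep only the sections that actually have content.
    present = [(label, args.get(key, "")) for label, key in SECTIONS]
    present = [(label, body) for label, body in present if body]

    for label, body in present:
        segments += [("\n\n", None), (label, FIELD_STYLE), ("\n", None), (body, None)]

    # Placeholder while no section has arrived yet.
    if not present:
        segments += [("\n  ", None), ("Generating final report...", "dim")]
    return segments
```

In the renderer itself, each tuple would presumably be passed on as `text.append(chunk, style=style)`, so the output should stay the same while adding a fifth section becomes a one-line change.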
--- a/strix/interface/tool_components/notes_renderer.py
+++ b/strix/interface/tool_components/notes_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -17,23 +18,28 @@ class CreateNoteRenderer(BaseToolRenderer):

        title = args.get("title", "")
        content = args.get("content", "")
+        category = args.get("category", "general")

-        header = "📝 [bold #fbbf24]Note[/]"
+        text = Text()
+        text.append("📝 ")
+        text.append("Note", style="bold #fbbf24")
+        text.append(" ")
+        text.append(f"({category})", style="dim")

        if title:
-            title_display = title[:100] + "..." if len(title) > 100 else title
-            note_parts = [f"{header}\n  [bold]{cls.escape_markup(title_display)}[/]"]
+            text.append("\n  ")
+            text.append(title.strip())

-            if content:
-                content_display = content[:200] + "..." if len(content) > 200 else content
-                note_parts.append(f"  [dim]{cls.escape_markup(content_display)}[/]")
+        if content:
+            text.append("\n  ")
+            text.append(content.strip(), style="dim")

-            content_text = "\n".join(note_parts)
-        else:
-            content_text = f"{header}\n  [dim]Creating note...[/]"
+        if not title and not content:
+            text.append("\n  ")
+            text.append("Capturing...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -43,11 +49,12 @@ class DeleteNoteRenderer(BaseToolRenderer):

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:  # noqa: ARG003
-        header = "🗑️ [bold #fbbf24]Delete Note[/]"
-        content_text = f"{header}\n  [dim]Deleting...[/]"
+        text = Text()
+        text.append("📝 ")
+        text.append("Note Removed", style="bold #94a3b8")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -59,28 +66,27 @@ class UpdateNoteRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})

-        title = args.get("title", "")
-        content = args.get("content", "")
+        title = args.get("title")
+        content = args.get("content")

-        header = "✏️ [bold #fbbf24]Update Note[/]"
+        text = Text()
+        text.append("📝 ")
+        text.append("Note Updated", style="bold #fbbf24")

-        if title or content:
-            note_parts = [header]
+        if title:
+            text.append("\n  ")
+            text.append(title)

-            if title:
-                title_display = title[:100] + "..." if len(title) > 100 else title
-                note_parts.append(f"  [bold]{cls.escape_markup(title_display)}[/]")
+        if content:
+            text.append("\n  ")
+            text.append(content.strip(), style="dim")

-            if content:
-                content_display = content[:200] + "..." if len(content) > 200 else content
-                note_parts.append(f"  [dim]{cls.escape_markup(content_display)}[/]")
-
-            content_text = "\n".join(note_parts)
-        else:
-            content_text = f"{header}\n  [dim]Updating...[/]"
+        if not title and not content:
+            text.append("\n  ")
+            text.append("Updating...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -92,17 +98,36 @@ class ListNotesRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        result = tool_data.get("result")

-        header = "📋 [bold #fbbf24]Listing notes[/]"
+        text = Text()
+        text.append("📝 ")
+        text.append("Notes", style="bold #fbbf24")

-        if result and isinstance(result, dict) and "notes" in result:
-            notes = result["notes"]
-            if isinstance(notes, list):
-                count = len(notes)
-                content_text = f"{header}\n  [dim]{count} notes found[/]"
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict) and result.get("success"):
+            count = result.get("total_count", 0)
+            notes = result.get("notes", []) or []
+
+            if count == 0:
+                text.append("\n  ")
+                text.append("No notes", style="dim")
            else:
-                content_text = f"{header}\n  [dim]No notes found[/]"
+                for note in notes:
+                    title = (note.get("title") or "").strip() or "(untitled)"
+                    category = note.get("category") or "general"
+                    note_content = (note.get("content") or "").strip()
+
+                    text.append("\n  - ")
+                    text.append(title)
+                    text.append(f" ({category})", style="dim")
+
+                    if note_content:
+                        text.append("\n    ")
+                        text.append(note_content, style="dim")
        else:
-            content_text = f"{header}\n  [dim]Listing notes...[/]"
+            text.append("\n  ")
+            text.append("Loading...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/proxy_renderer.py
+++ b/strix/interface/tool_components/proxy_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -18,38 +19,42 @@ class ListRequestsRenderer(BaseToolRenderer):

        httpql_filter = args.get("httpql_filter")

-        header = "📋 [bold #06b6d4]Listing requests[/]"
+        text = Text()
+        text.append("📋 ")
+        text.append("Listing requests", style="bold #06b6d4")

-        if result and isinstance(result, dict) and "requests" in result:
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict) and "requests" in result:
            requests = result["requests"]
            if isinstance(requests, list) and requests:
-                request_lines = []
-                for req in requests[:3]:
+                for req in requests[:25]:
                    if isinstance(req, dict):
                        method = req.get("method", "?")
                        path = req.get("path", "?")
                        response = req.get("response") or {}
                        status = response.get("statusCode", "?")
-                        line = f"{method} {path} → {status}"
-                        request_lines.append(line)
-
-                if len(requests) > 3:
-                    request_lines.append(f"... +{len(requests) - 3} more")
-
-                escaped_lines = [cls.escape_markup(line) for line in request_lines]
-                content_text = f"{header}\n  [dim]{chr(10).join(escaped_lines)}[/]"
+                        text.append("\n  ")
+                        text.append(f"{method} {path} → {status}", style="dim")
+                if len(requests) > 25:
+                    text.append("\n  ")
+                    text.append(f"... +{len(requests) - 25} more", style="dim")
            else:
-                content_text = f"{header}\n  [dim]No requests found[/]"
+                text.append("\n  ")
+                text.append("No requests found", style="dim")
        elif httpql_filter:
            filter_display = (
-                httpql_filter[:300] + "..." if len(httpql_filter) > 300 else httpql_filter
+                httpql_filter[:500] + "..." if len(httpql_filter) > 500 else httpql_filter
            )
-            content_text = f"{header}\n  [dim]{cls.escape_markup(filter_display)}[/]"
+            text.append("\n  ")
+            text.append(filter_display, style="dim")
        else:
-            content_text = f"{header}\n  [dim]All requests[/]"
+            text.append("\n  ")
+            text.append("All requests", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -64,34 +69,41 @@ class ViewRequestRenderer(BaseToolRenderer):

        part = args.get("part", "request")

-        header = f"👀 [bold #06b6d4]Viewing {cls.escape_markup(part)}[/]"
+        text = Text()
+        text.append("👀 ")
+        text.append(f"Viewing {part}", style="bold #06b6d4")

-        if result and isinstance(result, dict):
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
            if "content" in result:
                content = result["content"]
-                content_preview = content[:500] + "..." if len(content) > 500 else content
-                content_text = f"{header}\n  [dim]{cls.escape_markup(content_preview)}[/]"
+                content_preview = content[:2000] + "..." if len(content) > 2000 else content
+                text.append("\n  ")
+                text.append(content_preview, style="dim")
            elif "matches" in result:
                matches = result["matches"]
                if isinstance(matches, list) and matches:
-                    match_lines = [
-                        match["match"]
-                        for match in matches[:3]
-                        if isinstance(match, dict) and "match" in match
-                    ]
-                    if len(matches) > 3:
-                        match_lines.append(f"... +{len(matches) - 3} more matches")
-                    escaped_lines = [cls.escape_markup(line) for line in match_lines]
-                    content_text = f"{header}\n  [dim]{chr(10).join(escaped_lines)}[/]"
+                    for match in matches[:25]:
+                        if isinstance(match, dict) and "match" in match:
+                            text.append("\n  ")
+                            text.append(match["match"], style="dim")
+                    if len(matches) > 25:
+                        text.append("\n  ")
+                        text.append(f"... +{len(matches) - 25} more matches", style="dim")
                else:
-                    content_text = f"{header}\n  [dim]No matches found[/]"
+                    text.append("\n  ")
+                    text.append("No matches found", style="dim")
            else:
-                content_text = f"{header}\n  [dim]Viewing content...[/]"
+                text.append("\n  ")
+                text.append("Viewing content...", style="dim")
        else:
-            content_text = f"{header}\n  [dim]Loading...[/]"
+            text.append("\n  ")
+            text.append("Loading...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -107,30 +119,39 @@ class SendRequestRenderer(BaseToolRenderer):
        method = args.get("method", "GET")
        url = args.get("url", "")

-        header = f"📤 [bold #06b6d4]Sending {cls.escape_markup(method)}[/]"
+        text = Text()
+        text.append("📤 ")
+        text.append(f"Sending {method}", style="bold #06b6d4")

-        if result and isinstance(result, dict):
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
            status_code = result.get("status_code")
            response_body = result.get("body", "")

            if status_code:
-                response_preview = f"Status: {status_code}"
+                text.append("\n  ")
+                text.append(f"Status: {status_code}", style="dim")
                if response_body:
                    body_preview = (
-                        response_body[:300] + "..." if len(response_body) > 300 else response_body
+                        response_body[:2000] + "..." if len(response_body) > 2000 else response_body
                    )
-                    response_preview += f"\n{body_preview}"
-                content_text = f"{header}\n  [dim]{cls.escape_markup(response_preview)}[/]"
+                    text.append("\n  ")
+                    text.append(body_preview, style="dim")
            else:
-                content_text = f"{header}\n  [dim]Response received[/]"
+                text.append("\n  ")
+                text.append("Response received", style="dim")
        elif url:
-            url_display = url[:400] + "..." if len(url) > 400 else url
-            content_text = f"{header}\n  [dim]{cls.escape_markup(url_display)}[/]"
+            url_display = url[:500] + "..." if len(url) > 500 else url
+            text.append("\n  ")
+            text.append(url_display, style="dim")
        else:
-            content_text = f"{header}\n  [dim]Sending...[/]"
+            text.append("\n  ")
+            text.append("Sending...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -145,31 +166,40 @@ class RepeatRequestRenderer(BaseToolRenderer):

        modifications = args.get("modifications", {})

-        header = "🔄 [bold #06b6d4]Repeating request[/]"
+        text = Text()
+        text.append("🔄 ")
+        text.append("Repeating request", style="bold #06b6d4")

-        if result and isinstance(result, dict):
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
            status_code = result.get("status_code")
            response_body = result.get("body", "")

            if status_code:
-                response_preview = f"Status: {status_code}"
+                text.append("\n  ")
+                text.append(f"Status: {status_code}", style="dim")
                if response_body:
                    body_preview = (
-                        response_body[:300] + "..." if len(response_body) > 300 else response_body
+                        response_body[:2000] + "..." if len(response_body) > 2000 else response_body
                    )
-                    response_preview += f"\n{body_preview}"
-                content_text = f"{header}\n  [dim]{cls.escape_markup(response_preview)}[/]"
+                    text.append("\n  ")
+                    text.append(body_preview, style="dim")
            else:
-                content_text = f"{header}\n  [dim]Response received[/]"
+                text.append("\n  ")
+                text.append("Response received", style="dim")
        elif modifications:
-            mod_text = str(modifications)
-            mod_display = mod_text[:400] + "..." if len(mod_text) > 400 else mod_text
-            content_text = f"{header}\n  [dim]{cls.escape_markup(mod_display)}[/]"
+            mod_str = str(modifications)
+            mod_display = mod_str[:500] + "..." if len(mod_str) > 500 else mod_str
+            text.append("\n  ")
+            text.append(mod_display, style="dim")
        else:
-            content_text = f"{header}\n  [dim]No modifications[/]"
+            text.append("\n  ")
+            text.append("No modifications", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -179,11 +209,14 @@ class ScopeRulesRenderer(BaseToolRenderer):

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:  # noqa: ARG003
-        header = "⚙️ [bold #06b6d4]Updating proxy scope[/]"
-        content_text = f"{header}\n  [dim]Configuring...[/]"
+        text = Text()
+        text.append("⚙️ ")
+        text.append("Updating proxy scope", style="bold #06b6d4")
+        text.append("\n  ")
+        text.append("Configuring...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -195,31 +228,34 @@ class ListSitemapRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        result = tool_data.get("result")

-        header = "🗺️ [bold #06b6d4]Listing sitemap[/]"
+        text = Text()
+        text.append("🗺️ ")
+        text.append("Listing sitemap", style="bold #06b6d4")

-        if result and isinstance(result, dict) and "entries" in result:
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict) and "entries" in result:
            entries = result["entries"]
            if isinstance(entries, list) and entries:
-                entry_lines = []
-                for entry in entries[:4]:
+                for entry in entries[:30]:
                    if isinstance(entry, dict):
                        label = entry.get("label", "?")
                        kind = entry.get("kind", "?")
-                        line = f"{kind}: {label}"
-                        entry_lines.append(line)
-
-                if len(entries) > 4:
-                    entry_lines.append(f"... +{len(entries) - 4} more")
-
-                escaped_lines = [cls.escape_markup(line) for line in entry_lines]
-                content_text = f"{header}\n  [dim]{chr(10).join(escaped_lines)}[/]"
+                        text.append("\n  ")
+                        text.append(f"{kind}: {label}", style="dim")
+                if len(entries) > 30:
+                    text.append("\n  ")
+                    text.append(f"... +{len(entries) - 30} more entries", style="dim")
            else:
-                content_text = f"{header}\n  [dim]No entries found[/]"
+                text.append("\n  ")
+                text.append("No entries found", style="dim")
        else:
-            content_text = f"{header}\n  [dim]Loading...[/]"
+            text.append("\n  ")
+            text.append("Loading...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)


@register_tool_renderer
@@ -231,25 +267,30 @@ class ViewSitemapEntryRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        result = tool_data.get("result")

-        header = "📍 [bold #06b6d4]Viewing sitemap entry[/]"
+        text = Text()
+        text.append("📍 ")
+        text.append("Viewing sitemap entry", style="bold #06b6d4")

-        if result and isinstance(result, dict):
-            if "entry" in result:
-                entry = result["entry"]
-                if isinstance(entry, dict):
-                    label = entry.get("label", "")
-                    kind = entry.get("kind", "")
-                    if label and kind:
-                        entry_info = f"{kind}: {label}"
-                        content_text = f"{header}\n  [dim]{cls.escape_markup(entry_info)}[/]"
-                    else:
-                        content_text = f"{header}\n  [dim]Entry details loaded[/]"
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict) and "entry" in result:
+            entry = result["entry"]
+            if isinstance(entry, dict):
+                label = entry.get("label", "")
+                kind = entry.get("kind", "")
+                if label and kind:
+                    text.append("\n  ")
+                    text.append(f"{kind}: {label}", style="dim")
                else:
-                    content_text = f"{header}\n  [dim]Entry details loaded[/]"
+                    text.append("\n  ")
+                    text.append("Entry details loaded", style="dim")
            else:
-                content_text = f"{header}\n  [dim]Loading entry...[/]"
+                text.append("\n  ")
+                text.append("Entry details loaded", style="dim")
        else:
-            content_text = f"{header}\n  [dim]Loading...[/]"
+            text.append("\n  ")
+            text.append("Loading...", style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)
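Review note: the proxy renderers repeat two patterns, a character-capped preview (`value[:N] + "..."`) and a list capped at N rows with a `... +K more` line. A minimal sketch of how they might be factored out follows. The helper names `preview` and `capped_lines` are hypothetical and do not exist in the codebase.

```python
def preview(value: str, limit: int) -> str:
    """Return `value` unchanged if it fits, else its first `limit` chars plus '...'."""
    return value[:limit] + "..." if len(value) > limit else value


def capped_lines(items: list[str], limit: int, noun: str = "") -> list[str]:
    """Keep the first `limit` items and summarise the remainder in one line."""
    shown = items[:limit]  # slicing copies, so the caller's list is untouched
    extra = len(items) - limit
    if extra > 0:
        suffix = f" {noun}" if noun else ""
        shown.append(f"... +{extra} more{suffix}")
    return shown
```

With helpers like these, the limits used in this file (25 requests, 25 matches, 30 sitemap entries, 500 and 2000 characters) would each appear once as an argument instead of twice per branch.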
--- a/strix/interface/tool_components/python_renderer.py
+++ b/strix/interface/tool_components/python_renderer.py
@@ -1,34 +1,156 @@
+import re
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import PythonLexer
+from pygments.styles import get_style_by_name
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+MAX_OUTPUT_LINES = 50
+MAX_LINE_LENGTH = 200
+
+STRIP_PATTERNS = [
+    r"\.\.\. \[(stdout|stderr|result|output|error) truncated at \d+k? chars\]",
+]
+
+
+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
@register_tool_renderer
 class PythonRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "python_action"
    css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]

+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_python(cls, code: str) -> Text:
+        lexer = PythonLexer()
+        text = Text()
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+            color = cls._get_token_color(token_type)
+            text.append(token_value, style=color)
+
+        return text
+
+    @classmethod
+    def _clean_output(cls, output: str) -> str:
+        cleaned = output
+        for pattern in STRIP_PATTERNS:
+            cleaned = re.sub(pattern, "", cleaned)
+        return cleaned.strip()
+
+    @classmethod
+    def _truncate_line(cls, line: str) -> str:
+        if len(line) > MAX_LINE_LENGTH:
+            return line[: MAX_LINE_LENGTH - 3] + "..."
+        return line
+
+    @classmethod
+    def _format_output(cls, output: str) -> Text:
+        text = Text()
+        lines = output.splitlines()
+        total_lines = len(lines)
+
+        head_count = MAX_OUTPUT_LINES // 2
+        tail_count = MAX_OUTPUT_LINES - head_count - 1
+
+        if total_lines <= MAX_OUTPUT_LINES:
+            display_lines = lines
+            truncated = False
+            hidden_count = 0
+        else:
+            display_lines = lines[:head_count]
+            truncated = True
+            hidden_count = total_lines - head_count - tail_count
+
+        for i, line in enumerate(display_lines):
+            truncated_line = cls._truncate_line(line)
+            text.append("  ")
+            text.append(truncated_line, style="dim")
+            if i < len(display_lines) - 1 or truncated:
+                text.append("\n")
+
+        if truncated:
+            text.append(f"  ... {hidden_count} lines truncated ...", style="dim italic")
+            text.append("\n")
+            tail_lines = lines[-tail_count:]
+            for i, line in enumerate(tail_lines):
+                truncated_line = cls._truncate_line(line)
+                text.append("  ")
+                text.append(truncated_line, style="dim")
+                if i < len(tail_lines) - 1:
+                    text.append("\n")
+
+        return text
+
+    @classmethod
+    def _append_output(cls, text: Text, result: dict[str, Any] | str) -> None:
+        if isinstance(result, str):
+            if result.strip():
+                text.append("\n")
+                text.append_text(cls._format_output(result))
+            return
+
+        stdout = result.get("stdout", "")
+        stderr = result.get("stderr", "")
+
+        stdout = cls._clean_output(stdout) if stdout else ""
+        stderr = cls._clean_output(stderr) if stderr else ""
+
+        if stdout:
+            text.append("\n")
+            formatted_output = cls._format_output(stdout)
+            text.append_text(formatted_output)
+
+        if stderr:
+            text.append("\n")
+            text.append("  stderr: ", style="bold #ef4444")
+            formatted_stderr = cls._format_output(stderr)
+            text.append_text(formatted_stderr)
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
+        status = tool_data.get("status", "unknown")
+        result = tool_data.get("result")

        action = args.get("action", "")
        code = args.get("code", "")

-        header = "</> [bold #3b82f6]Python[/]"
+        text = Text()
+        text.append("</> ", style="dim")

        if code and action in ["new_session", "execute"]:
-            code_display = code[:600] + "..." if len(code) > 600 else code
-            content_text = f"{header}\n  [italic white]{cls.escape_markup(code_display)}[/]"
+            text.append_text(cls._highlight_python(code))
        elif action == "close":
-            content_text = f"{header}\n  [dim]Closing session...[/]"
+            text.append("Closing session...", style="dim")
        elif action == "list_sessions":
-            content_text = f"{header}\n  [dim]Listing sessions...[/]"
+            text.append("Listing sessions...", style="dim")
        else:
-            content_text = f"{header}\n  [dim]Running...[/]"
+            text.append("Running...", style="dim")

-        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        if result and isinstance(result, dict | str):
+            cls._append_output(text, result)
+
+        css_classes = cls.get_css_classes(status)
+        return Static(text, classes=css_classes)
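Review note: the output handling added to `PythonRenderer` strips the sandbox truncation marker, clips long lines, and keeps a head and tail of the output around a single marker line. The standalone sketch below restates that logic without `rich` so it can be checked in isolation. The function names are hypothetical; the constants and the regex are copied from the diff. The tail slice uses an explicit start index, an addition of this sketch, so that a very small `max_lines` cannot yield `lines[-0:]`.

```python
import re

MAX_OUTPUT_LINES = 50
MAX_LINE_LENGTH = 200

# Same marker pattern as STRIP_PATTERNS in python_renderer.py.
TRUNCATION_MARKER = re.compile(
    r"\.\.\. \[(stdout|stderr|result|output|error) truncated at \d+k? chars\]"
)


def clean_output(output: str) -> str:
    """Drop truncation markers and surrounding whitespace."""
    return TRUNCATION_MARKER.sub("", output).strip()


def clip_line(line: str, limit: int = MAX_LINE_LENGTH) -> str:
    """Cap a single line at `limit` characters, ellipsis included."""
    if len(line) > limit:
        return line[: limit - 3] + "..."
    return line


def head_tail(output: str, max_lines: int = MAX_OUTPUT_LINES) -> list[str]:
    """Return at most `max_lines` display lines: head, one marker, tail."""
    lines = [clip_line(line) for line in output.splitlines()]
    if len(lines) <= max_lines:
        return lines

    head_count = max_lines // 2
    tail_count = max_lines - head_count - 1  # one slot is the marker line
    hidden = len(lines) - head_count - tail_count
    marker = f"... {hidden} lines truncated ..."
    return lines[:head_count] + [marker] + lines[len(lines) - tail_count :]
```

For 100 lines of output and the default limit, this keeps lines 0 to 24, a marker reporting 51 hidden lines, and lines 76 to 99, for 50 display lines in total.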
--- a/strix/interface/tool_components/registry.py
+++ b/strix/interface/tool_components/registry.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -47,26 +48,32 @@ def render_tool_widget(tool_data: dict[str, Any]) -> Static:


 def _render_default_tool_widget(tool_data: dict[str, Any]) -> Static:
-    tool_name = BaseToolRenderer.escape_markup(tool_data.get("tool_name", "Unknown Tool"))
+    tool_name = tool_data.get("tool_name", "Unknown Tool")
    args = tool_data.get("args", {})
    status = tool_data.get("status", "unknown")
    result = tool_data.get("result")

-    status_text = BaseToolRenderer.get_status_icon(status)
+    text = Text()

-    header = f"→ Using tool [bold blue]{BaseToolRenderer.escape_markup(tool_name)}[/]"
-    content_parts = [header]
+    text.append("→ Using tool ", style="dim")
+    text.append(tool_name, style="bold blue")
+    text.append("\n")

-    args_str = BaseToolRenderer.format_args(args)
-    if args_str:
-        content_parts.append(args_str)
+    for k, v in list(args.items()):
+        str_v = str(v)
+        text.append("  ")
+        text.append(k, style="dim")
+        text.append(": ")
+        text.append(str_v)
+        text.append("\n")

    if status in ["completed", "failed", "error"] and result is not None:
-        result_str = BaseToolRenderer.format_result(result)
-        if result_str:
-            content_parts.append(f"[bold]Result:[/] {result_str}")
+        result_str = str(result)
+        text.append("Result: ", style="bold")
+        text.append(result_str)
    else:
-        content_parts.append(status_text)
+        icon, color = BaseToolRenderer.status_icon(status)
+        text.append(icon, style=color)

    css_classes = BaseToolRenderer.get_css_classes(status)
-    return Static("\n".join(content_parts), classes=css_classes)
+    return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/reporting_renderer.py
+++ b/strix/interface/tool_components/reporting_renderer.py
@@ -1,53 +1,221 @@
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import PythonLexer
+from pygments.styles import get_style_by_name
+from rich.padding import Padding
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
+FIELD_STYLE = "bold #4ade80"
+BG_COLOR = "#141414"
+
+
@register_tool_renderer
 class CreateVulnerabilityReportRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "create_vulnerability_report"
    css_classes: ClassVar[list[str]] = ["tool-call", "reporting-tool"]

+    SEVERITY_COLORS: ClassVar[dict[str, str]] = {
+        "critical": "#dc2626",
+        "high": "#ea580c",
+        "medium": "#d97706",
+        "low": "#65a30d",
+        "info": "#0284c7",
+    }
+
    @classmethod
-    def render(cls, tool_data: dict[str, Any]) -> Static:
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_python(cls, code: str) -> Text:
+        lexer = PythonLexer()
+        text = Text()
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+            color = cls._get_token_color(token_type)
+            text.append(token_value, style=color)
+
+        return text
+
+    @classmethod
+    def _get_cvss_color(cls, cvss_score: float) -> str:
+        if cvss_score >= 9.0:
+            return "#dc2626"
+        if cvss_score >= 7.0:
+            return "#ea580c"
+        if cvss_score >= 4.0:
+            return "#d97706"
+        if cvss_score >= 0.1:
+            return "#65a30d"
+        return "#6b7280"
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:  # noqa: PLR0912, PLR0915
        args = tool_data.get("args", {})
+        result = tool_data.get("result", {})

        title = args.get("title", "")
-        severity = args.get("severity", "")
-        content = args.get("content", "")
+        description = args.get("description", "")
+        impact = args.get("impact", "")
+        target = args.get("target", "")
+        technical_analysis = args.get("technical_analysis", "")
+        poc_description = args.get("poc_description", "")
+        poc_script_code = args.get("poc_script_code", "")
+        remediation_steps = args.get("remediation_steps", "")

-        header = "🐞 [bold #ea580c]Vulnerability Report[/]"
+        attack_vector = args.get("attack_vector", "")
+        attack_complexity = args.get("attack_complexity", "")
+        privileges_required = args.get("privileges_required", "")
+        user_interaction = args.get("user_interaction", "")
+        scope = args.get("scope", "")
+        confidentiality = args.get("confidentiality", "")
+        integrity = args.get("integrity", "")
+        availability = args.get("availability", "")
+
+        endpoint = args.get("endpoint", "")
+        method = args.get("method", "")
+        cve = args.get("cve", "")
+
+        severity = ""
+        cvss_score = None
+        if isinstance(result, dict):
+            severity = result.get("severity", "")
+            cvss_score = result.get("cvss_score")
+
+        text = Text()
+        text.append("🐞 ")
+        text.append("Vulnerability Report", style="bold #ea580c")

        if title:
-            content_parts = [f"{header}\n  [bold]{cls.escape_markup(title)}[/]"]
+            text.append("\n\n")
+            text.append("Title: ", style=FIELD_STYLE)
+            text.append(title)

-            if severity:
-                severity_color = cls._get_severity_color(severity.lower())
-                content_parts.append(
-                    f"  [dim]Severity: [{severity_color}]"
-                    f"{cls.escape_markup(severity.upper())}[/{severity_color}][/]"
-                )
+        if severity:
+            text.append("\n\n")
+            text.append("Severity: ", style=FIELD_STYLE)
+            severity_color = cls.SEVERITY_COLORS.get(severity.lower(), "#6b7280")
+            text.append(severity.upper(), style=f"bold {severity_color}")

-            if content:
-                content_parts.append(f"  [dim]{cls.escape_markup(content)}[/]")
+        if cvss_score is not None:
+            text.append("\n\n")
+            text.append("CVSS Score: ", style=FIELD_STYLE)
+            cvss_color = cls._get_cvss_color(cvss_score)
+            text.append(str(cvss_score), style=f"bold {cvss_color}")

-            content_text = "\n".join(content_parts)
-        else:
-            content_text = f"{header}\n  [dim]Creating report...[/]"
+        if target:
+            text.append("\n\n")
+            text.append("Target: ", style=FIELD_STYLE)
+            text.append(target)
+
+        if endpoint:
+            text.append("\n\n")
+            text.append("Endpoint: ", style=FIELD_STYLE)
+            text.append(endpoint)
+
+        if method:
+            text.append("\n\n")
+            text.append("Method: ", style=FIELD_STYLE)
+            text.append(method)
+
+        if cve:
+            text.append("\n\n")
+            text.append("CVE: ", style=FIELD_STYLE)
+            text.append(cve)
+
+        if any(
+            [
+                attack_vector,
+                attack_complexity,
+                privileges_required,
+                user_interaction,
+                scope,
+                confidentiality,
+                integrity,
+                availability,
+            ]
+        ):
+            text.append("\n\n")
+            cvss_parts = []
+            if attack_vector:
+                cvss_parts.append(f"AV:{attack_vector}")
+            if attack_complexity:
+                cvss_parts.append(f"AC:{attack_complexity}")
+            if privileges_required:
+                cvss_parts.append(f"PR:{privileges_required}")
+            if user_interaction:
+                cvss_parts.append(f"UI:{user_interaction}")
+            if scope:
+                cvss_parts.append(f"S:{scope}")
+            if confidentiality:
+                cvss_parts.append(f"C:{confidentiality}")
+            if integrity:
+                cvss_parts.append(f"I:{integrity}")
+            if availability:
+                cvss_parts.append(f"A:{availability}")
+            text.append("CVSS Vector: ", style=FIELD_STYLE)
+            text.append("/".join(cvss_parts), style="dim")
+
+        if description:
+            text.append("\n\n")
+            text.append("Description", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(description)
+
+        if impact:
+            text.append("\n\n")
+            text.append("Impact", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(impact)
+
+        if technical_analysis:
+            text.append("\n\n")
+            text.append("Technical Analysis", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(technical_analysis)
+
+        if poc_description:
+            text.append("\n\n")
+            text.append("PoC Description", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(poc_description)
+
+        if poc_script_code:
+            text.append("\n\n")
+            text.append("PoC Code", style=FIELD_STYLE)
+            text.append("\n")
+            text.append_text(cls._highlight_python(poc_script_code))
+
+        if remediation_steps:
+            text.append("\n\n")
+            text.append("Remediation", style=FIELD_STYLE)
+            text.append("\n")
+            text.append(remediation_steps)
+
+        if not title:
+            text.append("\n  ")
+            text.append("Creating report...", style="dim")
+
+        padded = Padding(text, 2, style=f"on {BG_COLOR}")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
-
-    @classmethod
-    def _get_severity_color(cls, severity: str) -> str:
-        severity_colors = {
-            "critical": "#dc2626",
-            "high": "#ea580c",
-            "medium": "#d97706",
-            "low": "#65a30d",
-            "info": "#0284c7",
-        }
-        return severity_colors.get(severity, "#6b7280")
+        return Static(padded, classes=css_classes)
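Review note: the CVSS vector block in `render` above appends each metric with its own `if`, in the fixed order AV, AC, PR, UI, S, C, I, A, skipping empty values. A minimal table-driven sketch of the same assembly follows. The names `CVSS_METRICS` and `build_cvss_vector` are hypothetical and are not part of this change.

```python
# Hypothetical helper, not part of the diff: same ordering and skip-empty
# behaviour as the chain of `if` statements in render().
CVSS_METRICS: tuple[tuple[str, str], ...] = (
    ("AV", "attack_vector"),
    ("AC", "attack_complexity"),
    ("PR", "privileges_required"),
    ("UI", "user_interaction"),
    ("S", "scope"),
    ("C", "confidentiality"),
    ("I", "integrity"),
    ("A", "availability"),
)


def build_cvss_vector(args: dict[str, str]) -> str:
    """Join the metrics that are present as 'ABBR:value' parts separated by '/'."""
    return "/".join(
        f"{abbr}:{args[key]}" for abbr, key in CVSS_METRICS if args.get(key)
    )
```

An empty result means no metric was supplied, which corresponds to the `any([...])` guard in the renderer.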
--- a/strix/interface/tool_components/scan_info_renderer.py
+++ b/strix/interface/tool_components/scan_info_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -15,29 +16,28 @@ class ScanStartInfoRenderer(BaseToolRenderer):
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
        status = tool_data.get("status", "unknown")
-
        targets = args.get("targets", [])

+        text = Text()
+        text.append("🚀 Starting penetration test")
+
        if len(targets) == 1:
-            target_display = cls._build_single_target_display(targets[0])
-            content = f"🚀 Starting penetration test on {target_display}"
+            text.append(" on ")
+            text.append(cls._get_target_display(targets[0]))
        elif len(targets) > 1:
-            content = f"🚀 Starting penetration test on {len(targets)} targets"
+            text.append(f" on {len(targets)} targets")
            for target_info in targets:
-                target_display = cls._build_single_target_display(target_info)
-                content += f"\n   • {target_display}"
-        else:
-            content = "🚀 Starting penetration test"
+                text.append("\n   • ")
+                text.append(cls._get_target_display(target_info))

        css_classes = cls.get_css_classes(status)
-        return Static(content, classes=css_classes)
+        return Static(text, classes=css_classes)

    @classmethod
-    def _build_single_target_display(cls, target_info: dict[str, Any]) -> str:
+    def _get_target_display(cls, target_info: dict[str, Any]) -> str:
        original = target_info.get("original")
        if original:
-            return cls.escape_markup(str(original))
-
+            return str(original)
        return "unknown target"


@@ -51,14 +51,17 @@ class SubagentStartInfoRenderer(BaseToolRenderer):
        args = tool_data.get("args", {})
        status = tool_data.get("status", "unknown")

-        name = args.get("name", "Unknown Agent")
-        task = args.get("task", "")
+        name = str(args.get("name", "Unknown Agent"))
+        task = str(args.get("task", ""))
+
+        text = Text()
+        text.append("◈ ", style="#a78bfa")
+        text.append("subagent ", style="dim")
+        text.append(name, style="bold #a78bfa")

-        name = cls.escape_markup(str(name))
-        content = f"🤖 Spawned subagent {name}"
        if task:
-            task = cls.escape_markup(str(task))
-            content += f"\n    Task: {task}"
+            text.append("\n  ")
+            text.append(task, style="dim")

        css_classes = cls.get_css_classes(status)
-        return Static(content, classes=css_classes)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/terminal_renderer.py
+++ b/strix/interface/tool_components/terminal_renderer.py
@@ -1,131 +1,311 @@
+import re
+from functools import cache
 from typing import Any, ClassVar

+from pygments.lexers import get_lexer_by_name
+from pygments.styles import get_style_by_name
+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
 from .registry import register_tool_renderer


+MAX_OUTPUT_LINES = 50
+MAX_LINE_LENGTH = 200
+
+STRIP_PATTERNS = [
+    (
+        r"\n?\[Command still running after [\d.]+s - showing output so far\.?"
+        r"\s*(?:Use C-c to interrupt if needed\.)?\]"
+    ),
+    r"^\[Below is the output of the previous command\.\]\n?",
+    r"^No command is currently running\. Cannot send input\.$",
+    (
+        r"^A command is already running\. Use is_input=true to send input to it, "
+        r"or interrupt it first \(e\.g\., with C-c\)\.$"
+    ),
+]
+
+
+@cache
+def _get_style_colors() -> dict[Any, str]:
+    style = get_style_by_name("native")
+    return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
+
+
@register_tool_renderer
 class TerminalRenderer(BaseToolRenderer):
    tool_name: ClassVar[str] = "terminal_execute"
    css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]

+    CONTROL_SEQUENCES: ClassVar[set[str]] = {
+        "C-c",
+        "C-d",
+        "C-z",
+        "C-a",
+        "C-e",
+        "C-k",
+        "C-l",
+        "C-u",
+        "C-w",
+        "C-r",
+        "C-s",
+        "C-t",
+        "C-y",
+        "^c",
+        "^d",
+        "^z",
+        "^a",
+        "^e",
+        "^k",
+        "^l",
+        "^u",
+        "^w",
+        "^r",
+        "^s",
+        "^t",
+        "^y",
+    }
+    SPECIAL_KEYS: ClassVar[set[str]] = {
+        "Enter",
+        "Escape",
+        "Space",
+        "Tab",
+        "BTab",
+        "BSpace",
+        "DC",
+        "IC",
+        "Up",
+        "Down",
+        "Left",
+        "Right",
+        "Home",
+        "End",
+        "PageUp",
+        "PageDown",
+        "PgUp",
+        "PgDn",
+        "PPage",
+        "NPage",
+        "F1",
+        "F2",
+        "F3",
+        "F4",
+        "F5",
+        "F6",
+        "F7",
+        "F8",
+        "F9",
+        "F10",
+        "F11",
+        "F12",
+    }
+
+    @classmethod
+    def _get_token_color(cls, token_type: Any) -> str | None:
+        colors = _get_style_colors()
+        while token_type:
+            if token_type in colors:
+                return colors[token_type]
+            token_type = token_type.parent
+        return None
+
+    @classmethod
+    def _highlight_bash(cls, code: str) -> Text:
+        lexer = get_lexer_by_name("bash")
+        text = Text()
+
+        for token_type, token_value in lexer.get_tokens(code):
+            if not token_value:
+                continue
+            color = cls._get_token_color(token_type)
+            text.append(token_value, style=color)
+
+        return text
+
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
        status = tool_data.get("status", "unknown")
-        result = tool_data.get("result", {})
+        result = tool_data.get("result")

        command = args.get("command", "")
        is_input = args.get("is_input", False)
-        terminal_id = args.get("terminal_id", "default")
-        timeout = args.get("timeout")

-        content = cls._build_sleek_content(command, is_input, terminal_id, timeout, result)
+        content = cls._build_content(command, is_input, status, result)

        css_classes = cls.get_css_classes(status)
        return Static(content, classes=css_classes)

    @classmethod
-    def _build_sleek_content(
-        cls,
-        command: str,
-        is_input: bool,
-        terminal_id: str,  # noqa: ARG003
-        timeout: float | None,  # noqa: ARG003
-        result: dict[str, Any],  # noqa: ARG003
-    ) -> str:
+    def _build_content(
+        cls, command: str, is_input: bool, status: str, result: dict[str, Any] | str | None
+    ) -> Text:
+        text = Text()
        terminal_icon = ">_"

        if not command.strip():
-            return f"{terminal_icon} [dim]getting logs...[/]"
-
-        control_sequences = {
-            "C-c",
-            "C-d",
-            "C-z",
-            "C-a",
-            "C-e",
-            "C-k",
-            "C-l",
-            "C-u",
-            "C-w",
-            "C-r",
-            "C-s",
-            "C-t",
-            "C-y",
-            "^c",
-            "^d",
-            "^z",
-            "^a",
-            "^e",
-            "^k",
-            "^l",
-            "^u",
-            "^w",
-            "^r",
-            "^s",
-            "^t",
-            "^y",
-        }
-        special_keys = {
-            "Enter",
-            "Escape",
-            "Space",
-            "Tab",
-            "BTab",
-            "BSpace",
-            "DC",
-            "IC",
-            "Up",
-            "Down",
-            "Left",
-            "Right",
-            "Home",
-            "End",
-            "PageUp",
-            "PageDown",
-            "PgUp",
-            "PgDn",
-            "PPage",
-            "NPage",
-            "F1",
-            "F2",
-            "F3",
-            "F4",
-            "F5",
-            "F6",
-            "F7",
-            "F8",
-            "F9",
-            "F10",
-            "F11",
-            "F12",
-        }
+            text.append(terminal_icon, style="dim")
+            text.append(" ")
+            text.append("getting logs...", style="dim")
+            if result:
+                cls._append_output(text, result, status, command)
+            return text

        is_special = (
-            command in control_sequences
-            or command in special_keys
+            command in cls.CONTROL_SEQUENCES
+            or command in cls.SPECIAL_KEYS
            or command.startswith(("M-", "S-", "C-S-", "C-M-", "S-M-"))
        )

+        text.append(terminal_icon, style="dim")
+        text.append(" ")
+
        if is_special:
-            return f"{terminal_icon} [#ef4444]{cls.escape_markup(command)}[/]"
+            text.append(command, style="#ef4444")
+        elif is_input:
+            text.append(">>>", style="#3b82f6")
+            text.append(" ")
+            text.append_text(cls._format_command(command))
+        else:
+            text.append("$", style="#22c55e")
+            text.append(" ")
+            text.append_text(cls._format_command(command))

-        if is_input:
-            formatted_command = cls._format_command_display(command)
-            return f"{terminal_icon} [#3b82f6]>>>[/] [#22c55e]{formatted_command}[/]"
+        if result:
+            cls._append_output(text, result, status, command)

-        formatted_command = cls._format_command_display(command)
-        return f"{terminal_icon} [#22c55e]$ {formatted_command}[/]"
+        return text

    @classmethod
-    def _format_command_display(cls, command: str) -> str:
-        if not command:
-            return ""
+    def _clean_output(cls, output: str, command: str = "") -> str:
+        cleaned = output

-        if len(command) > 400:
-            command = command[:397] + "..."
+        for pattern in STRIP_PATTERNS:
+            cleaned = re.sub(pattern, "", cleaned, flags=re.MULTILINE)

-        return cls.escape_markup(command)
+        if cleaned.strip():
+            lines = cleaned.splitlines()
+            filtered_lines: list[str] = []
+            for line in lines:
+                if not filtered_lines and not line.strip():
+                    continue
+                if re.match(r"^\[STRIX_\d+\]\$\s*", line):
+                    continue
+                if command and line.strip() == command.strip():
+                    continue
+                if command and re.match(r"^[\$#>]\s*" + re.escape(command.strip()) + r"\s*$", line):
+                    continue
+                filtered_lines.append(line)
+
+            while filtered_lines and re.match(r"^\[STRIX_\d+\]\$\s*", filtered_lines[-1]):
+                filtered_lines.pop()
+
+            cleaned = "\n".join(filtered_lines)
+
+        return cleaned.strip()
+
+    @classmethod
+    def _append_output(
+        cls, text: Text, result: dict[str, Any] | str, tool_status: str, command: str = ""
+    ) -> None:
+        if isinstance(result, str):
+            if result.strip():
+                text.append("\n")
+                text.append_text(cls._format_output(result))
+            return
+
+        raw_output = result.get("content") or ""
+        output = cls._clean_output(raw_output, command)
+        error = result.get("error")
+        exit_code = result.get("exit_code")
+        result_status = result.get("status", "")
+
+        if error and not cls._is_status_message(error):
+            text.append("\n")
+            text.append("  error: ", style="bold #ef4444")
+            text.append(cls._truncate_line(error), style="#ef4444")
+            return
+
+        if result_status == "running" or tool_status == "running":
+            if output and output.strip():
+                text.append("\n")
+                formatted_output = cls._format_output(output)
+                text.append_text(formatted_output)
+            return
+
+        if not output or not output.strip():
+            if exit_code is not None and exit_code != 0:
+                text.append("\n")
+                text.append(f"  exit {exit_code}", style="dim #ef4444")
+            return
+
+        text.append("\n")
+        formatted_output = cls._format_output(output)
+        text.append_text(formatted_output)
+
+        if exit_code is not None and exit_code != 0:
+            text.append("\n")
+            text.append(f"  exit {exit_code}", style="dim #ef4444")
+
+    @classmethod
+    def _is_status_message(cls, message: str) -> bool:
+        status_patterns = [
+            r"No command is currently running",
+            r"A command is already running",
+            r"Cannot send input",
+            r"Use is_input=true",
+            r"Use C-c to interrupt",
+            r"showing output so far",
+        ]
+        return any(re.search(pattern, message) for pattern in status_patterns)
+
+    @classmethod
+    def _format_output(cls, output: str) -> Text:
+        text = Text()
+        lines = output.splitlines()
+        total_lines = len(lines)
+
+        head_count = MAX_OUTPUT_LINES // 2
+        tail_count = MAX_OUTPUT_LINES - head_count - 1
+
+        if total_lines <= MAX_OUTPUT_LINES:
+            display_lines = lines
+            truncated = False
+            hidden_count = 0
+        else:
+            display_lines = lines[:head_count]
+            truncated = True
+            hidden_count = total_lines - head_count - tail_count
+
+        for i, line in enumerate(display_lines):
+            truncated_line = cls._truncate_line(line)
+            text.append("  ")
+            text.append(truncated_line, style="dim")
+            if i < len(display_lines) - 1 or truncated:
+                text.append("\n")
+
+        if truncated:
+            text.append(f"  ... {hidden_count} lines truncated ...", style="dim italic")
+            text.append("\n")
+            tail_lines = lines[-tail_count:]
+            for i, line in enumerate(tail_lines):
+                truncated_line = cls._truncate_line(line)
+                text.append("  ")
+                text.append(truncated_line, style="dim")
+                if i < len(tail_lines) - 1:
+                    text.append("\n")
+
+        return text
+
+    @classmethod
+    def _truncate_line(cls, line: str) -> str:
+        clean_line = re.sub(r"\x1b\[[0-9;]*m", "", line)  # strip ANSI SGR color codes
+        if len(clean_line) > MAX_LINE_LENGTH:
+            return clean_line[: MAX_LINE_LENGTH - 3] + "..."
+        return clean_line
+
+    @classmethod
+    def _format_command(cls, command: str) -> Text:
+        return cls._highlight_bash(command)
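Review note: `_format_output` above keeps the first `MAX_OUTPUT_LINES // 2` lines and the last `MAX_OUTPUT_LINES - head_count - 1` lines, with one marker line in between, so the rendered block never exceeds `MAX_OUTPUT_LINES` lines. The sketch below shows that arithmetic on plain lists of strings, without Rich styling. The function name `truncate_middle` is hypothetical, and it assumes `max_lines >= 1`.

```python
MAX_OUTPUT_LINES = 50  # same cap as in terminal_renderer.py


def truncate_middle(lines: list[str], max_lines: int = MAX_OUTPUT_LINES) -> list[str]:
    """Keep the head and tail of `lines`, replacing the middle with one marker line."""
    if len(lines) <= max_lines:
        return list(lines)

    head_count = max_lines // 2
    # One slot is reserved for the marker, so head + marker + tail == max_lines.
    tail_count = max_lines - head_count - 1
    hidden_count = len(lines) - head_count - tail_count

    marker = f"... {hidden_count} lines truncated ..."
    # Slice from an explicit start index so tail_count == 0 yields an empty tail.
    tail = lines[len(lines) - tail_count:]
    return lines[:head_count] + [marker] + tail
```

With the default cap, 120 input lines become 25 head lines, the marker reporting 71 hidden lines, and 24 tail lines.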
--- a/strix/interface/tool_components/thinking_renderer.py
+++ b/strix/interface/tool_components/thinking_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -14,16 +15,17 @@ class ThinkRenderer(BaseToolRenderer):
    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        args = tool_data.get("args", {})
-
        thought = args.get("thought", "")

-        header = "🧠 [bold #a855f7]Thinking[/]"
+        text = Text()
+        text.append("🧠 ")
+        text.append("Thinking", style="bold #a855f7")
+        text.append("\n  ")

        if thought:
-            thought_display = thought[:600] + "..." if len(thought) > 600 else thought
-            content = f"{header}\n  [italic dim]{cls.escape_markup(thought_display)}[/]"
+            text.append(thought, style="italic dim")
        else:
-            content = f"{header}\n  [italic dim]Thinking...[/]"
+            text.append("Thinking...", style="italic dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content, classes=css_classes)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/todo_renderer.py
+++ b/strix/interface/tool_components/todo_renderer.py
@@ -0,0 +1,225 @@
+from typing import Any, ClassVar
+
+from rich.text import Text
+from textual.widgets import Static
+
+from .base_renderer import BaseToolRenderer
+from .registry import register_tool_renderer
+
+
+STATUS_MARKERS: dict[str, str] = {
+    "pending": "[ ]",
+    "in_progress": "[~]",
+    "done": "[•]",
+}
+
+
+def _format_todo_lines(text: Text, result: dict[str, Any]) -> None:
+    todos = result.get("todos")
+    if not isinstance(todos, list) or not todos:
+        text.append("\n  ")
+        text.append("No todos", style="dim")
+        return
+
+    for todo in todos:
+        status = todo.get("status", "pending")
+        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
+
+        title = todo.get("title", "").strip() or "(untitled)"
+
+        text.append("\n  ")
+        text.append(marker)
+        text.append(" ")
+
+        if status == "done":
+            text.append(title, style="dim strike")
+        elif status == "in_progress":
+            text.append(title, style="italic")
+        else:
+            text.append(title)
+
+
+@register_tool_renderer
+class CreateTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "create_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todo", style="bold #a78bfa")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Failed to create todo")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Creating...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
+
+
+@register_tool_renderer
+class ListTodosRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "list_todos"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todos", style="bold #a78bfa")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Unable to list todos")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Loading...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
+
+
+@register_tool_renderer
+class UpdateTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "update_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todo Updated", style="bold #a78bfa")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Failed to update todo")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Updating...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
+
+
+@register_tool_renderer
+class MarkTodoDoneRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "mark_todo_done"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todo Completed", style="bold #a78bfa")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Failed to mark todo done")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Marking done...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
+
+
+@register_tool_renderer
+class MarkTodoPendingRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "mark_todo_pending"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todo Reopened", style="bold #f59e0b")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Failed to reopen todo")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Reopening...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
+
+
+@register_tool_renderer
+class DeleteTodoRenderer(BaseToolRenderer):
+    tool_name: ClassVar[str] = "delete_todo"
+    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
+
+    @classmethod
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        result = tool_data.get("result")
+
+        text = Text()
+        text.append("📋 ")
+        text.append("Todo Removed", style="bold #94a3b8")
+
+        if isinstance(result, str) and result.strip():
+            text.append("\n  ")
+            text.append(result.strip(), style="dim")
+        elif result and isinstance(result, dict):
+            if result.get("success"):
+                _format_todo_lines(text, result)
+            else:
+                error = result.get("error", "Failed to remove todo")
+                text.append("\n  ")
+                text.append(error, style="#ef4444")
+        else:
+            text.append("\n  ")
+            text.append("Removing...", style="dim")
+
+        css_classes = cls.get_css_classes("completed")
+        return Static(text, classes=css_classes)
--- a/strix/interface/tool_components/user_message_renderer.py
+++ b/strix/interface/tool_components/user_message_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -12,32 +13,38 @@ class UserMessageRenderer(BaseToolRenderer):
    css_classes: ClassVar[list[str]] = ["chat-message", "user-message"]

    @classmethod
-    def render(cls, message_data: dict[str, Any]) -> Static:
-        content = message_data.get("content", "")
+    def render(cls, tool_data: dict[str, Any]) -> Static:
+        content = tool_data.get("content", "")

        if not content:
-            return Static("", classes=cls.css_classes)
+            return Static(Text(), classes=" ".join(cls.css_classes))

-        if len(content) > 300:
-            content = content[:297] + "..."
+        styled_text = cls._format_user_message(content)

-        lines = content.split("\n")
-        bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
-        bordered_content = "\n".join(bordered_lines)
-        formatted_content = f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
-
-        css_classes = " ".join(cls.css_classes)
-        return Static(formatted_content, classes=css_classes)
+        return Static(styled_text, classes=" ".join(cls.css_classes))

    @classmethod
-    def render_simple(cls, content: str) -> str:
+    def render_simple(cls, content: str) -> Text:
        if not content:
-            return ""
+            return Text()

-        if len(content) > 300:
-            content = content[:297] + "..."
+        return cls._format_user_message(content)
+
+    @classmethod
+    def _format_user_message(cls, content: str) -> Text:
+        text = Text()
+
+        text.append("▍", style="#3b82f6")
+        text.append(" ")
+        text.append("You:", style="bold")
+        text.append("\n")

        lines = content.split("\n")
-        bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
-        bordered_content = "\n".join(bordered_lines)
-        return f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
+        for i, line in enumerate(lines):
+            if i > 0:
+                text.append("\n")
+            text.append("▍", style="#3b82f6")
+            text.append(" ")
+            text.append(line)
+
+        return text
--- a/strix/interface/tool_components/web_search_renderer.py
+++ b/strix/interface/tool_components/web_search_renderer.py
@@ -1,5 +1,6 @@
 from typing import Any, ClassVar

+from rich.text import Text
 from textual.widgets import Static

 from .base_renderer import BaseToolRenderer
@@ -16,13 +17,13 @@ class WebSearchRenderer(BaseToolRenderer):
        args = tool_data.get("args", {})
        query = args.get("query", "")

-        header = "🌐 [bold #60a5fa]Searching the web...[/]"
+        text = Text()
+        text.append("🌐 ")
+        text.append("Searching the web...", style="bold #60a5fa")

        if query:
-            query_display = query[:100] + "..." if len(query) > 100 else query
-            content_text = f"{header}\n  [dim]{cls.escape_markup(query_display)}[/]"
-        else:
-            content_text = f"{header}"
+            text.append("\n  ")
+            text.append(query, style="dim")

        css_classes = cls.get_css_classes("completed")
-        return Static(content_text, classes=css_classes)
+        return Static(text, classes=css_classes)
--- a/strix/interface/tui.py
+++ b/strix/interface/tui.py
--- a/strix/interface/utils.py
+++ b/strix/interface/utils.py
@@ -38,6 +38,165 @@ def get_severity_color(severity: str) -> str:
    return severity_colors.get(severity, "#6b7280")


+def get_cvss_color(cvss_score: float) -> str:
+    if cvss_score >= 9.0:
+        return "#dc2626"
+    if cvss_score >= 7.0:
+        return "#ea580c"
+    if cvss_score >= 4.0:
+        return "#d97706"
+    if cvss_score >= 0.1:
+        return "#65a30d"
+    return "#6b7280"
+
+
+def format_vulnerability_report(report: dict[str, Any]) -> Text:  # noqa: PLR0912, PLR0915
+    """Format a vulnerability report for CLI display with all rich fields."""
+    field_style = "bold #4ade80"
+
+    text = Text()
+
+    title = report.get("title", "")
+    if title:
+        text.append("Vulnerability Report", style="bold #ea580c")
+        text.append("\n\n")
+        text.append("Title: ", style=field_style)
+        text.append(title)
+
+    severity = report.get("severity", "")
+    if severity:
+        text.append("\n\n")
+        text.append("Severity: ", style=field_style)
+        severity_color = get_severity_color(severity.lower())
+        text.append(severity.upper(), style=f"bold {severity_color}")
+
+    cvss = report.get("cvss")
+    if cvss is not None:
+        text.append("\n\n")
+        text.append("CVSS Score: ", style=field_style)
+        cvss_color = get_cvss_color(cvss)
+        text.append(f"{cvss:.1f}", style=f"bold {cvss_color}")
+
+    target = report.get("target")
+    if target:
+        text.append("\n\n")
+        text.append("Target: ", style=field_style)
+        text.append(target)
+
+    endpoint = report.get("endpoint")
+    if endpoint:
+        text.append("\n\n")
+        text.append("Endpoint: ", style=field_style)
+        text.append(endpoint)
+
+    method = report.get("method")
+    if method:
+        text.append("\n\n")
+        text.append("Method: ", style=field_style)
+        text.append(method)
+
+    cve = report.get("cve")
+    if cve:
+        text.append("\n\n")
+        text.append("CVE: ", style=field_style)
+        text.append(cve)
+
+    cvss_breakdown = report.get("cvss_breakdown", {})
+    if cvss_breakdown:
+        text.append("\n\n")
+        cvss_parts = []
+        if cvss_breakdown.get("attack_vector"):
+            cvss_parts.append(f"AV:{cvss_breakdown['attack_vector']}")
+        if cvss_breakdown.get("attack_complexity"):
+            cvss_parts.append(f"AC:{cvss_breakdown['attack_complexity']}")
+        if cvss_breakdown.get("privileges_required"):
+            cvss_parts.append(f"PR:{cvss_breakdown['privileges_required']}")
+        if cvss_breakdown.get("user_interaction"):
+            cvss_parts.append(f"UI:{cvss_breakdown['user_interaction']}")
+        if cvss_breakdown.get("scope"):
+            cvss_parts.append(f"S:{cvss_breakdown['scope']}")
+        if cvss_breakdown.get("confidentiality"):
+            cvss_parts.append(f"C:{cvss_breakdown['confidentiality']}")
+        if cvss_breakdown.get("integrity"):
+            cvss_parts.append(f"I:{cvss_breakdown['integrity']}")
+        if cvss_breakdown.get("availability"):
+            cvss_parts.append(f"A:{cvss_breakdown['availability']}")
+        if cvss_parts:
+            text.append("CVSS Vector: ", style=field_style)
+            text.append("/".join(cvss_parts), style="dim")
+
+    description = report.get("description")
+    if description:
+        text.append("\n\n")
+        text.append("Description", style=field_style)
+        text.append("\n")
+        text.append(description)
+
+    impact = report.get("impact")
+    if impact:
+        text.append("\n\n")
+        text.append("Impact", style=field_style)
+        text.append("\n")
+        text.append(impact)
+
+    technical_analysis = report.get("technical_analysis")
+    if technical_analysis:
+        text.append("\n\n")
+        text.append("Technical Analysis", style=field_style)
+        text.append("\n")
+        text.append(technical_analysis)
+
+    poc_description = report.get("poc_description")
+    if poc_description:
+        text.append("\n\n")
+        text.append("PoC Description", style=field_style)
+        text.append("\n")
+        text.append(poc_description)
+
+    poc_script_code = report.get("poc_script_code")
+    if poc_script_code:
+        text.append("\n\n")
+        text.append("PoC Code", style=field_style)
+        text.append("\n")
+        text.append(poc_script_code, style="dim")
+
+    code_file = report.get("code_file")
+    if code_file:
+        text.append("\n\n")
+        text.append("Code File: ", style=field_style)
+        text.append(code_file)
+
+    code_before = report.get("code_before")
+    if code_before:
+        text.append("\n\n")
+        text.append("Code Before", style=field_style)
+        text.append("\n")
+        text.append(code_before, style="dim")
+
+    code_after = report.get("code_after")
+    if code_after:
+        text.append("\n\n")
+        text.append("Code After", style=field_style)
+        text.append("\n")
+        text.append(code_after, style="dim")
+
+    code_diff = report.get("code_diff")
+    if code_diff:
+        text.append("\n\n")
+        text.append("Code Diff", style=field_style)
+        text.append("\n")
+        text.append(code_diff, style="dim")
+
+    remediation_steps = report.get("remediation_steps")
+    if remediation_steps:
+        text.append("\n\n")
+        text.append("Remediation", style=field_style)
+        text.append("\n")
+        text.append(remediation_steps)
+
+    return text
+
+
 def _build_vulnerability_stats(stats_text: Text, tracer: Any) -> None:
    """Build vulnerability section of stats text."""
    vuln_count = len(tracer.vulnerability_reports)
@@ -129,11 +288,17 @@ def build_final_stats_text(tracer: Any) -> Text:
    return stats_text


-def build_live_stats_text(tracer: Any) -> Text:
+def build_live_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
    stats_text = Text()
    if not tracer:
        return stats_text

+    if agent_config:
+        llm_config = agent_config["llm_config"]
+        model = getattr(llm_config, "model_name", "Unknown")
+        stats_text.append(f"🧠 Model: {model}")
+        stats_text.append("\n")
+
    vuln_count = len(tracer.vulnerability_reports)
    tool_count = tracer.get_real_tool_count()
    agent_count = len(tracer.agents)
@@ -196,6 +361,31 @@ def build_live_stats_text(tracer: Any) -> Text:
    return stats_text


+def build_tui_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
+    stats_text = Text()
+    if not tracer:
+        return stats_text
+
+    if agent_config:
+        llm_config = agent_config["llm_config"]
+        model = getattr(llm_config, "model_name", "Unknown")
+        stats_text.append(model, style="dim")
+
+    llm_stats = tracer.get_total_llm_stats()
+    total_stats = llm_stats["total"]
+
+    total_tokens = total_stats["input_tokens"] + total_stats["output_tokens"]
+    if total_tokens > 0:
+        stats_text.append("\n")
+        stats_text.append(f"{format_token_count(total_tokens)} tokens", style="dim")
+
+    if total_stats["cost"] > 0:
+        stats_text.append("\n")
+        stats_text.append(f"${total_stats['cost']:.2f} spent", style="dim")
+
+    return stats_text
+
+
 # Name generation utilities


@@ -398,6 +588,47 @@ def collect_local_sources(targets_info: list[dict[str, Any]]) -> list[dict[str,
    return local_sources


+def _is_localhost_host(host: str) -> bool:
+    host_lower = host.lower().strip("[]")
+
+    if host_lower in ("localhost", "0.0.0.0", "::1"):  # nosec B104
+        return True
+
+    try:
+        ip = ipaddress.ip_address(host_lower)
+        if isinstance(ip, ipaddress.IPv4Address):
+            return ip.is_loopback  # 127.0.0.0/8
+        if isinstance(ip, ipaddress.IPv6Address):
+            return ip.is_loopback  # ::1
+    except ValueError:
+        pass
+
+    return False
+
+
+def rewrite_localhost_targets(targets_info: list[dict[str, Any]], host_gateway: str) -> None:
+    from yarl import URL  # type: ignore[import-not-found]
+
+    for target_info in targets_info:
+        target_type = target_info.get("type")
+        details = target_info.get("details", {})
+
+        if target_type == "web_application":
+            target_url = details.get("target_url", "")
+            try:
+                url = URL(target_url)
+            except (ValueError, TypeError):
+                continue
+
+            if url.host and _is_localhost_host(url.host):
+                details["target_url"] = str(url.with_host(host_gateway))
+
+        elif target_type == "ip_address":
+            target_ip = details.get("target_ip", "")
+            if target_ip and _is_localhost_host(target_ip):
+                details["target_ip"] = host_gateway
+
+
 # Repository utilities
 def clone_repository(repo_url: str, run_name: str, dest_name: str | None = None) -> str:
    console = Console()
@@ -488,9 +719,10 @@ def check_docker_connection() -> Any:
        error_text.append("DOCKER NOT AVAILABLE", style="bold red")
        error_text.append("\n\n", style="white")
        error_text.append("Cannot connect to Docker daemon.\n", style="white")
-        error_text.append("Please ensure Docker is installed and running.\n\n", style="white")
-        error_text.append("Try running: ", style="dim white")
-        error_text.append("sudo systemctl start docker", style="dim cyan")
+        error_text.append(
+            "Please ensure Docker Desktop is installed and running, and try running strix again.\n",
+            style="white",
+        )

        panel = Panel(
            error_text,
--- a/strix/llm/__init__.py
+++ b/strix/llm/__init__.py
@@ -1,3 +1,6 @@
+import logging
+import warnings
+
 import litellm

 from .config import LLMConfig
@@ -11,5 +14,6 @@ __all__ = [
 ]

 litellm._logging._disable_debugging()
-
-litellm.drop_params = True
+logging.getLogger("asyncio").setLevel(logging.CRITICAL)
+logging.getLogger("asyncio").propagate = False
+warnings.filterwarnings("ignore", category=RuntimeWarning, module="asyncio")
--- a/strix/llm/config.py
+++ b/strix/llm/config.py
@@ -1,4 +1,4 @@
-import os
+from strix.config import Config


 class LLMConfig:
@@ -6,15 +6,18 @@ class LLMConfig:
        self,
        model_name: str | None = None,
        enable_prompt_caching: bool = True,
-        prompt_modules: list[str] | None = None,
+        skills: list[str] | None = None,
        timeout: int | None = None,
+        scan_mode: str = "deep",
    ):
-        self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
+        self.model_name = model_name or Config.get("strix_llm")

        if not self.model_name:
            raise ValueError("STRIX_LLM environment variable must be set and not empty")

        self.enable_prompt_caching = enable_prompt_caching
-        self.prompt_modules = prompt_modules or []
+        self.skills = skills or []

-        self.timeout = timeout or int(os.getenv("LLM_TIMEOUT", "600"))
+        self.timeout = timeout or int(Config.get("llm_timeout") or "300")
+
+        self.scan_mode = scan_mode if scan_mode in ["quick", "standard", "deep"] else "deep"
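Note on the fallbacks in strix/llm/config.py: an unknown scan mode silently becomes "deep", and the timeout now defaults to 300 seconds when neither the argument nor the config supplies one. The sketch below isolates those two rules. `resolve_scan_mode` and `resolve_timeout` are hypothetical names, and the `configured` argument stands in for whatever `Config.get("llm_timeout")` returns.

```python
from typing import Optional

# The three modes accepted by the diff; anything else falls back to "deep".
VALID_SCAN_MODES = ("quick", "standard", "deep")


def resolve_scan_mode(scan_mode: str) -> str:
    """Return the mode unchanged if it is known, otherwise "deep"."""
    return scan_mode if scan_mode in VALID_SCAN_MODES else "deep"


def resolve_timeout(explicit: Optional[int], configured: Optional[str]) -> int:
    """Prefer the explicit argument, then the configured value, then 300.

    Because the chain uses `or`, an explicit 0 counts as "not set".
    """
    return explicit or int(configured or "300")


if __name__ == "__main__":
    print(resolve_scan_mode("fast"))
    print(resolve_timeout(None, None))
```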
--- a/strix/llm/dedupe.py
+++ b/strix/llm/dedupe.py
@@ -0,0 +1,218 @@
+import json
+import logging
+import re
+from typing import Any
+
+import litellm
+
+from strix.config import Config
+
+
+logger = logging.getLogger(__name__)
+
+DEDUPE_SYSTEM_PROMPT = """You are an expert vulnerability report deduplication judge.
+Your task is to determine if a candidate vulnerability report describes the SAME vulnerability
+as any existing report.
+
+CRITICAL DEDUPLICATION RULES:
+
+1. SAME VULNERABILITY means:
+   - Same root cause (e.g., "missing input validation" not just "SQL injection")
+   - Same affected component/endpoint/file (exact match or clear overlap)
+   - Same exploitation method or attack vector
+   - Would be fixed by the same code change/patch
+
+2. NOT DUPLICATES if:
+   - Different endpoints even with same vulnerability type (e.g., SQLi in /login vs /search)
+   - Different parameters in same endpoint (e.g., XSS in 'name' vs 'comment' field)
+   - Different root causes (e.g., stored XSS vs reflected XSS in same field)
+   - Different severity levels due to different impact
+   - One is authenticated, other is unauthenticated
+
+3. ARE DUPLICATES even if:
+   - Titles are worded differently
+   - Descriptions have different levels of detail
+   - PoC uses different payloads but exploits same issue
+   - One report is more thorough than another
+   - Minor variations in technical analysis
+
+COMPARISON GUIDELINES:
+- Focus on the technical root cause, not surface-level similarities
+- Same vulnerability type (SQLi, XSS) doesn't mean duplicate - location matters
+- Consider the fix: would fixing one also fix the other?
+- When uncertain, lean towards NOT duplicate
+
+FIELDS TO ANALYZE:
+- title, description: General vulnerability info
+- target, endpoint, method: Exact location of vulnerability
+- technical_analysis: Root cause details
+- poc_description: How it's exploited
+- impact: What damage it can cause
+
+YOU MUST RESPOND WITH EXACTLY THIS XML FORMAT AND NOTHING ELSE:
+
+<dedupe_result>
+<is_duplicate>true</is_duplicate>
+<duplicate_id>vuln-0001</duplicate_id>
+<confidence>0.95</confidence>
+<reason>Both reports describe SQL injection in /api/login via the username parameter</reason>
+</dedupe_result>
+
+OR if not a duplicate:
+
+<dedupe_result>
+<is_duplicate>false</is_duplicate>
+<duplicate_id></duplicate_id>
+<confidence>0.90</confidence>
+<reason>Different endpoints: candidate is /api/search, existing is /api/login</reason>
+</dedupe_result>
+
+RULES:
+- is_duplicate MUST be exactly "true" or "false" (lowercase)
+- duplicate_id MUST be the exact ID from existing reports or empty if not duplicate
+- confidence MUST be a decimal (your confidence level in the decision)
+- reason MUST be a specific explanation mentioning endpoint/parameter/root cause
+- DO NOT include any text outside the <dedupe_result> tags"""
+
+
+def _prepare_report_for_comparison(report: dict[str, Any]) -> dict[str, Any]:
+    relevant_fields = [
+        "id",
+        "title",
+        "description",
+        "impact",
+        "target",
+        "technical_analysis",
+        "poc_description",
+        "endpoint",
+        "method",
+    ]
+
+    cleaned = {}
+    for field in relevant_fields:
+        if report.get(field):
+            value = report[field]
+            if isinstance(value, str) and len(value) > 8000:
+                value = value[:8000] + "...[truncated]"
+            cleaned[field] = value
+
+    return cleaned
+
+
+def _extract_xml_field(content: str, field: str) -> str:
+    pattern = rf"<{field}>(.*?)</{field}>"
+    match = re.search(pattern, content, re.DOTALL | re.IGNORECASE)
+    if match:
+        return match.group(1).strip()
+    return ""
+
+
+def _parse_dedupe_response(content: str) -> dict[str, Any]:
+    result_match = re.search(
+        r"<dedupe_result>(.*?)</dedupe_result>", content, re.DOTALL | re.IGNORECASE
+    )
+
+    if not result_match:
+        logger.warning(f"No <dedupe_result> block found in response: {content[:500]}")
+        raise ValueError("No <dedupe_result> block found in response")
+
+    result_content = result_match.group(1)
+
+    is_duplicate_str = _extract_xml_field(result_content, "is_duplicate")
+    duplicate_id = _extract_xml_field(result_content, "duplicate_id")
+    confidence_str = _extract_xml_field(result_content, "confidence")
+    reason = _extract_xml_field(result_content, "reason")
+
+    is_duplicate = is_duplicate_str.lower() == "true"
+
+    try:
+        confidence = float(confidence_str) if confidence_str else 0.0
+    except ValueError:
+        confidence = 0.0
+
+    return {
+        "is_duplicate": is_duplicate,
+        "duplicate_id": duplicate_id[:64] if duplicate_id else "",
+        "confidence": confidence,
+        "reason": reason[:500] if reason else "",
+    }
+
+
+def check_duplicate(
+    candidate: dict[str, Any], existing_reports: list[dict[str, Any]]
+) -> dict[str, Any]:
+    if not existing_reports:
+        return {
+            "is_duplicate": False,
+            "duplicate_id": "",
+            "confidence": 1.0,
+            "reason": "No existing reports to compare against",
+        }
+
+    try:
+        candidate_cleaned = _prepare_report_for_comparison(candidate)
+        existing_cleaned = [_prepare_report_for_comparison(r) for r in existing_reports]
+
+        comparison_data = {"candidate": candidate_cleaned, "existing_reports": existing_cleaned}
+
+        model_name = Config.get("strix_llm")
+        api_key = Config.get("llm_api_key")
+        api_base = (
+            Config.get("llm_api_base")
+            or Config.get("openai_api_base")
+            or Config.get("litellm_base_url")
+            or Config.get("ollama_api_base")
+        )
+
+        messages = [
+            {"role": "system", "content": DEDUPE_SYSTEM_PROMPT},
+            {
+                "role": "user",
+                "content": (
+                    f"Compare this candidate vulnerability against existing reports:\n\n"
+                    f"{json.dumps(comparison_data, indent=2)}\n\n"
+                    f"Respond with ONLY the <dedupe_result> XML block."
+                ),
+            },
+        ]
+
+        completion_kwargs: dict[str, Any] = {
+            "model": model_name,
+            "messages": messages,
+            "timeout": 120,
+            "temperature": 0,
+        }
+        if api_key:
+            completion_kwargs["api_key"] = api_key
+        if api_base:
+            completion_kwargs["api_base"] = api_base
+
+        response = litellm.completion(**completion_kwargs)
+
+        content = response.choices[0].message.content
+        if not content:
+            return {
+                "is_duplicate": False,
+                "duplicate_id": "",
+                "confidence": 0.0,
+                "reason": "Empty response from LLM",
+            }
+
+        result = _parse_dedupe_response(content)
+
+        logger.info(
+            f"Deduplication check: is_duplicate={result['is_duplicate']}, "
+            f"confidence={result['confidence']}, reason={result['reason'][:100]}"
+        )
+
+    except Exception as e:
+        logger.exception("Error during vulnerability deduplication check")
+        return {
+            "is_duplicate": False,
+            "duplicate_id": "",
+            "confidence": 0.0,
+            "reason": f"Deduplication check failed: {e}",
+            "error": str(e),
+        }
+    else:
+        return result
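Note on the response format in strix/llm/dedupe.py: the judge's reply is parsed with regular expressions rather than an XML parser, which tolerates stray text around the `<dedupe_result>` block. The sketch below shows that parsing step without any LLM call. `extract_field` and `parse_verdict` are hypothetical names, and the sample reply is made up for illustration.

```python
import re
from typing import Any


def extract_field(block: str, field: str) -> str:
    """Return the stripped text between <field> and </field>, or "" if absent."""
    match = re.search(rf"<{field}>(.*?)</{field}>", block, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else ""


def parse_verdict(reply: str) -> dict[str, Any]:
    """Pull the verdict fields out of a reply containing a <dedupe_result> block."""
    outer = re.search(
        r"<dedupe_result>(.*?)</dedupe_result>", reply, re.DOTALL | re.IGNORECASE
    )
    if outer is None:
        raise ValueError("no <dedupe_result> block in reply")
    body = outer.group(1)

    # A missing or non-numeric confidence degrades to 0.0 instead of failing.
    try:
        confidence = float(extract_field(body, "confidence") or 0.0)
    except ValueError:
        confidence = 0.0

    return {
        # Only the literal "true" (any case) counts as a duplicate.
        "is_duplicate": extract_field(body, "is_duplicate").lower() == "true",
        "duplicate_id": extract_field(body, "duplicate_id"),
        "confidence": confidence,
        "reason": extract_field(body, "reason"),
    }


if __name__ == "__main__":
    sample = (
        "<dedupe_result><is_duplicate>true</is_duplicate>"
        "<duplicate_id>vuln-0001</duplicate_id>"
        "<confidence>0.95</confidence>"
        "<reason>same endpoint and parameter</reason></dedupe_result>"
    )
    print(parse_verdict(sample))
```

Treating anything other than "true" as not a duplicate matches the prompt's instruction to lean towards NOT duplicate when uncertain.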
--- a/strix/llm/llm.py
+++ b/strix/llm/llm.py
@@ -1,8 +1,8 @@
+import asyncio
 import logging
-import os
+from collections.abc import AsyncIterator
 from dataclasses import dataclass
 from enum import Enum
-from fnmatch import fnmatch
 from pathlib import Path
 from typing import Any

@@ -12,31 +12,48 @@ from jinja2 import (
    FileSystemLoader,
    select_autoescape,
 )
-from litellm import ModelResponse, completion_cost
-from litellm.utils import supports_prompt_caching
+from litellm import completion_cost, stream_chunk_builder, supports_reasoning
+from litellm.utils import supports_prompt_caching, supports_vision

+from strix.config import Config
 from strix.llm.config import LLMConfig
 from strix.llm.memory_compressor import MemoryCompressor
 from strix.llm.request_queue import get_global_queue
 from strix.llm.utils import _truncate_to_first_function, parse_tool_invocations
-from strix.prompts import load_prompt_modules
+from strix.skills import load_skills
 from strix.tools import get_tools_prompt


+MAX_RETRIES = 5
+RETRY_MULTIPLIER = 8
+RETRY_MIN = 8
+RETRY_MAX = 64
+
+
+def _should_retry(exception: Exception) -> bool:
+    status_code = None
+    if hasattr(exception, "status_code"):
+        status_code = exception.status_code
+    elif hasattr(exception, "response") and hasattr(exception.response, "status_code"):
+        status_code = exception.response.status_code
+    if status_code is not None:
+        return bool(litellm._should_retry(status_code))
+    return True
+
+
 logger = logging.getLogger(__name__)

-api_key = os.getenv("LLM_API_KEY")
-if api_key:
-    litellm.api_key = api_key
+litellm.drop_params = True
+litellm.modify_params = True

-api_base = (
-    os.getenv("LLM_API_BASE")
-    or os.getenv("OPENAI_API_BASE")
-    or os.getenv("LITELLM_BASE_URL")
-    or os.getenv("OLLAMA_API_BASE")
+_LLM_API_KEY = Config.get("llm_api_key")
+_LLM_API_BASE = (
+    Config.get("llm_api_base")
+    or Config.get("openai_api_base")
+    or Config.get("litellm_base_url")
+    or Config.get("ollama_api_base")
 )
-if api_base:
-    litellm.api_base = api_base
+_STRIX_REASONING_EFFORT = Config.get("strix_reasoning_effort")


 class LLMRequestFailedError(Exception):
@@ -46,57 +63,6 @@ class LLMRequestFailedError(Exception):
        self.details = details


-SUPPORTS_STOP_WORDS_FALSE_PATTERNS: list[str] = [
-    "o1*",
-    "grok-4-0709",
-    "grok-code-fast-1",
-    "deepseek-r1-0528*",
-]
-
-REASONING_EFFORT_PATTERNS: list[str] = [
-    "o1-2024-12-17",
-    "o1",
-    "o3",
-    "o3-2025-04-16",
-    "o3-mini-2025-01-31",
-    "o3-mini",
-    "o4-mini",
-    "o4-mini-2025-04-16",
-    "gemini-2.5-flash",
-    "gemini-2.5-pro",
-    "gpt-5*",
-    "deepseek-r1-0528*",
-    "claude-sonnet-4-5*",
-    "claude-haiku-4-5*",
-]
-
-
-def normalize_model_name(model: str) -> str:
-    raw = (model or "").strip().lower()
-    if "/" in raw:
-        name = raw.split("/")[-1]
-        if ":" in name:
-            name = name.split(":", 1)[0]
-    else:
-        name = raw
-    if name.endswith("-gguf"):
-        name = name[: -len("-gguf")]
-    return name
-
-
-def model_matches(model: str, patterns: list[str]) -> bool:
-    raw = (model or "").strip().lower()
-    name = normalize_model_name(model)
-    for pat in patterns:
-        pat_l = pat.lower()
-        if "/" in pat_l:
-            if fnmatch(raw, pat_l):
-                return True
-        elif fnmatch(name, pat_l):
-            return True
-    return False
-
-
 class StepRole(str, Enum):
    AGENT = "agent"
    USER = "user"
@@ -110,6 +76,7 @@ class LLMResponse:
    scan_id: str | None = None
    step_number: int = 1
    role: StepRole = StepRole.AGENT
+    thinking_blocks: list[dict[str, Any]] | None = None  # For reasoning models.


 @dataclass
@@ -144,6 +111,13 @@ class LLM:
        self._total_stats = RequestStats()
        self._last_request_stats = RequestStats()

+        if _STRIX_REASONING_EFFORT:
+            self._reasoning_effort = _STRIX_REASONING_EFFORT
+        elif self.config.scan_mode == "quick":
+            self._reasoning_effort = "medium"
+        else:
+            self._reasoning_effort = "high"
+
        self.memory_compressor = MemoryCompressor(
            model_name=self.config.model_name,
            timeout=self.config.timeout,
@@ -151,28 +125,29 @@ class LLM:

        if agent_name:
            prompt_dir = Path(__file__).parent.parent / "agents" / agent_name
-            prompts_dir = Path(__file__).parent.parent / "prompts"
+            skills_dir = Path(__file__).parent.parent / "skills"

-            loader = FileSystemLoader([prompt_dir, prompts_dir])
+            loader = FileSystemLoader([prompt_dir, skills_dir])
            self.jinja_env = Environment(
                loader=loader,
                autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
            )

            try:
-                prompt_module_content = load_prompt_modules(
-                    self.config.prompt_modules or [], self.jinja_env
-                )
+                skills_to_load = list(self.config.skills or [])
+                skills_to_load.append(f"scan_modes/{self.config.scan_mode}")

-                def get_module(name: str) -> str:
-                    return prompt_module_content.get(name, "")
+                skill_content = load_skills(skills_to_load, self.jinja_env)

-                self.jinja_env.globals["get_module"] = get_module
+                def get_skill(name: str) -> str:
+                    return skill_content.get(name, "")
+
+                self.jinja_env.globals["get_skill"] = get_skill

                self.system_prompt = self.jinja_env.get_template("system_prompt.jinja").render(
                    get_tools_prompt=get_tools_prompt,
-                    loaded_module_names=list(prompt_module_content.keys()),
-                    **prompt_module_content,
+                    loaded_skill_names=list(skill_content.keys()),
+                    **skill_content,
                )
            except (FileNotFoundError, OSError, ValueError) as e:
                logger.warning(f"Failed to load system prompt for {agent_name}: {e}")
@@ -272,12 +247,7 @@ class LLM:

        return cached_messages

-    async def generate(  # noqa: PLR0912, PLR0915
-        self,
-        conversation_history: list[dict[str, Any]],
-        scan_id: str | None = None,
-        step_number: int = 1,
-    ) -> LLMResponse:
+    def _prepare_messages(self, conversation_history: list[dict[str, Any]]) -> list[dict[str, Any]]:
        messages = [{"role": "system", "content": self.system_prompt}]

        identity_message = self._build_identity_message()
@@ -290,80 +260,130 @@ class LLM:
        conversation_history.extend(compressed_history)
        messages.extend(compressed_history)

-        cached_messages = self._prepare_cached_messages(messages)
+        return self._prepare_cached_messages(messages)

-        try:
-            response = await self._make_request(cached_messages)
-            self._update_usage_stats(response)
+    async def _stream_and_accumulate(
+        self,
+        messages: list[dict[str, Any]],
+        scan_id: str | None,
+        step_number: int,
+    ) -> AsyncIterator[LLMResponse]:
+        accumulated_content = ""
+        chunks: list[Any] = []

-            content = ""
+        async for chunk in self._stream_request(messages):
+            chunks.append(chunk)
+            delta = self._extract_chunk_delta(chunk)
+            if delta:
+                accumulated_content += delta
+
+                if "</function>" in accumulated_content:
+                    function_end = accumulated_content.find("</function>") + len("</function>")
+                    accumulated_content = accumulated_content[:function_end]
+
+                yield LLMResponse(
+                    scan_id=scan_id,
+                    step_number=step_number,
+                    role=StepRole.AGENT,
+                    content=accumulated_content,
+                    tool_invocations=None,
+                )
+
+        if chunks:
+            complete_response = stream_chunk_builder(chunks)
+            self._update_usage_stats(complete_response)
+
+        accumulated_content = _truncate_to_first_function(accumulated_content)
+        if "</function>" in accumulated_content:
+            function_end = accumulated_content.find("</function>") + len("</function>")
+            accumulated_content = accumulated_content[:function_end]
+
+        tool_invocations = parse_tool_invocations(accumulated_content)
+
+        # Extract thinking blocks from the complete response if available
+        thinking_blocks = None
+        if chunks and self._should_include_reasoning_effort():
+            # complete_response was already rebuilt from the chunks above; reuse it here.
            if (
-                response.choices
-                and hasattr(response.choices[0], "message")
-                and response.choices[0].message
+                hasattr(complete_response, "choices")
+                and complete_response.choices
+                and hasattr(complete_response.choices[0], "message")
            ):
-                content = getattr(response.choices[0].message, "content", "") or ""
+                message = complete_response.choices[0].message
+                if hasattr(message, "thinking_blocks") and message.thinking_blocks:
+                    thinking_blocks = message.thinking_blocks

-            content = _truncate_to_first_function(content)
+        yield LLMResponse(
+            scan_id=scan_id,
+            step_number=step_number,
+            role=StepRole.AGENT,
+            content=accumulated_content,
+            tool_invocations=tool_invocations if tool_invocations else None,
+            thinking_blocks=thinking_blocks,
+        )

-            if "</function>" in content:
-                function_end_index = content.find("</function>") + len("</function>")
-                content = content[:function_end_index]
+    def _raise_llm_error(self, e: Exception) -> None:
+        error_map: list[tuple[type, str]] = [
+            (litellm.RateLimitError, "Rate limit exceeded"),
+            (litellm.AuthenticationError, "Invalid API key"),
+            (litellm.NotFoundError, "Model not found"),
+            (litellm.ContextWindowExceededError, "Context too long"),
+            (litellm.ContentPolicyViolationError, "Content policy violation"),
+            (litellm.ServiceUnavailableError, "Service unavailable"),
+            (litellm.Timeout, "Request timed out"),
+            (litellm.UnprocessableEntityError, "Unprocessable entity"),
+            (litellm.InternalServerError, "Internal server error"),
+            (litellm.APIConnectionError, "Connection error"),
+            (litellm.UnsupportedParamsError, "Unsupported parameters"),
+            (litellm.BudgetExceededError, "Budget exceeded"),
+            (litellm.APIResponseValidationError, "Response validation error"),
+            (litellm.JSONSchemaValidationError, "JSON schema validation error"),
+            (litellm.InvalidRequestError, "Invalid request"),
+            (litellm.BadRequestError, "Bad request"),
+            (litellm.APIError, "API error"),
+            (litellm.OpenAIError, "OpenAI error"),
+        ]

-            tool_invocations = parse_tool_invocations(content)
+        from strix.telemetry import posthog

-            return LLMResponse(
-                scan_id=scan_id,
-                step_number=step_number,
-                role=StepRole.AGENT,
-                content=content,
-                tool_invocations=tool_invocations if tool_invocations else None,
-            )
+        for error_type, message in error_map:
+            if isinstance(e, error_type):
+                posthog.error(f"llm_{error_type.__name__}", message)
+                raise LLMRequestFailedError(f"LLM request failed: {message}", str(e)) from e

-        except litellm.RateLimitError as e:
-            raise LLMRequestFailedError("LLM request failed: Rate limit exceeded", str(e)) from e
-        except litellm.AuthenticationError as e:
-            raise LLMRequestFailedError("LLM request failed: Invalid API key", str(e)) from e
-        except litellm.NotFoundError as e:
-            raise LLMRequestFailedError("LLM request failed: Model not found", str(e)) from e
-        except litellm.ContextWindowExceededError as e:
-            raise LLMRequestFailedError("LLM request failed: Context too long", str(e)) from e
-        except litellm.ContentPolicyViolationError as e:
-            raise LLMRequestFailedError(
-                "LLM request failed: Content policy violation", str(e)
-            ) from e
-        except litellm.ServiceUnavailableError as e:
-            raise LLMRequestFailedError("LLM request failed: Service unavailable", str(e)) from e
-        except litellm.Timeout as e:
-            raise LLMRequestFailedError("LLM request failed: Request timed out", str(e)) from e
-        except litellm.UnprocessableEntityError as e:
-            raise LLMRequestFailedError("LLM request failed: Unprocessable entity", str(e)) from e
-        except litellm.InternalServerError as e:
-            raise LLMRequestFailedError("LLM request failed: Internal server error", str(e)) from e
-        except litellm.APIConnectionError as e:
-            raise LLMRequestFailedError("LLM request failed: Connection error", str(e)) from e
-        except litellm.UnsupportedParamsError as e:
-            raise LLMRequestFailedError("LLM request failed: Unsupported parameters", str(e)) from e
-        except litellm.BudgetExceededError as e:
-            raise LLMRequestFailedError("LLM request failed: Budget exceeded", str(e)) from e
-        except litellm.APIResponseValidationError as e:
-            raise LLMRequestFailedError(
-                "LLM request failed: Response validation error", str(e)
-            ) from e
-        except litellm.JSONSchemaValidationError as e:
-            raise LLMRequestFailedError(
-                "LLM request failed: JSON schema validation error", str(e)
-            ) from e
-        except litellm.InvalidRequestError as e:
-            raise LLMRequestFailedError("LLM request failed: Invalid request", str(e)) from e
-        except litellm.BadRequestError as e:
-            raise LLMRequestFailedError("LLM request failed: Bad request", str(e)) from e
-        except litellm.APIError as e:
-            raise LLMRequestFailedError("LLM request failed: API error", str(e)) from e
-        except litellm.OpenAIError as e:
-            raise LLMRequestFailedError("LLM request failed: OpenAI error", str(e)) from e
-        except Exception as e:
-            raise LLMRequestFailedError(f"LLM request failed: {type(e).__name__}", str(e)) from e
+        posthog.error("llm_unknown_error", type(e).__name__)
+        raise LLMRequestFailedError(f"LLM request failed: {type(e).__name__}", str(e)) from e
+
+    async def generate(
+        self,
+        conversation_history: list[dict[str, Any]],
+        scan_id: str | None = None,
+        step_number: int = 1,
+    ) -> AsyncIterator[LLMResponse]:
+        messages = self._prepare_messages(conversation_history)
+
+        last_error: Exception | None = None
+        for attempt in range(MAX_RETRIES):
+            try:
+                async for response in self._stream_and_accumulate(messages, scan_id, step_number):
+                    yield response
+                return  # noqa: TRY300
+            except Exception as e:  # noqa: BLE001
+                last_error = e
+                if not _should_retry(e) or attempt == MAX_RETRIES - 1:
+                    break
+                wait_time = min(RETRY_MAX, RETRY_MULTIPLIER * (2**attempt))
+                wait_time = max(RETRY_MIN, wait_time)
+                await asyncio.sleep(wait_time)
+
+        if last_error:
+            self._raise_llm_error(last_error)
+
+    def _extract_chunk_delta(self, chunk: Any) -> str:
+        if chunk.choices and hasattr(chunk.choices[0], "delta"):
+            delta = chunk.choices[0].delta
+            return getattr(delta, "content", "") or ""
+        return ""

    @property
    def usage_stats(self) -> dict[str, dict[str, int | float]]:
@@ -378,43 +398,93 @@ class LLM:
            "supported": supports_prompt_caching(self.config.model_name),
        }

-    def _should_include_stop_param(self) -> bool:
-        if not self.config.model_name:
-            return True
-
-        return not model_matches(self.config.model_name, SUPPORTS_STOP_WORDS_FALSE_PATTERNS)
-
    def _should_include_reasoning_effort(self) -> bool:
        if not self.config.model_name:
            return False
+        try:
+            return bool(supports_reasoning(model=self.config.model_name))
+        except Exception:  # noqa: BLE001
+            return False

-        return model_matches(self.config.model_name, REASONING_EFFORT_PATTERNS)
+    def _model_supports_vision(self) -> bool:
+        if not self.config.model_name:
+            return False
+        try:
+            return bool(supports_vision(model=self.config.model_name))
+        except Exception:  # noqa: BLE001
+            return False

-    async def _make_request(
+    def _filter_images_from_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
+        filtered_messages = []
+        for msg in messages:
+            content = msg.get("content")
+            updated_msg = msg
+            if isinstance(content, list):
+                filtered_content = []
+                for item in content:
+                    if isinstance(item, dict):
+                        if item.get("type") == "image_url":
+                            filtered_content.append(
+                                {
+                                    "type": "text",
+                                    "text": "[Screenshot removed - model does not support "
+                                    "vision. Use view_source or execute_js instead.]",
+                                }
+                            )
+                        else:
+                            filtered_content.append(item)
+                    else:
+                        filtered_content.append(item)
+                if filtered_content:
+                    text_parts = [
+                        item.get("text", "") if isinstance(item, dict) else str(item)
+                        for item in filtered_content
+                    ]
+                    all_text = all(
+                        isinstance(item, dict) and item.get("type") == "text"
+                        for item in filtered_content
+                    )
+                    if all_text:
+                        updated_msg = {**msg, "content": "\n".join(text_parts)}
+                    else:
+                        updated_msg = {**msg, "content": filtered_content}
+                else:
+                    updated_msg = {**msg, "content": ""}
+            filtered_messages.append(updated_msg)
+        return filtered_messages
+
+    async def _stream_request(
        self,
        messages: list[dict[str, Any]],
-    ) -> ModelResponse:
+    ) -> AsyncIterator[Any]:
+        if not self._model_supports_vision():
+            messages = self._filter_images_from_messages(messages)
+
        completion_args: dict[str, Any] = {
            "model": self.config.model_name,
            "messages": messages,
            "timeout": self.config.timeout,
+            "stream_options": {"include_usage": True},
        }

-        if self._should_include_stop_param():
-            completion_args["stop"] = ["</function>"]
+        if _LLM_API_KEY:
+            completion_args["api_key"] = _LLM_API_KEY
+        if _LLM_API_BASE:
+            completion_args["api_base"] = _LLM_API_BASE
+
+        completion_args["stop"] = ["</function>"]

        if self._should_include_reasoning_effort():
-            completion_args["reasoning_effort"] = "high"
+            completion_args["reasoning_effort"] = self._reasoning_effort

        queue = get_global_queue()
-        response = await queue.make_request(completion_args)
-
        self._total_stats.requests += 1
        self._last_request_stats = RequestStats(requests=1)

-        return response
+        async for chunk in queue.stream_request(completion_args):
+            yield chunk

-    def _update_usage_stats(self, response: ModelResponse) -> None:
+    def _update_usage_stats(self, response: Any) -> None:
        try:
            if hasattr(response, "usage") and response.usage:
                input_tokens = getattr(response.usage, "prompt_tokens", 0)
--- a/strix/llm/memory_compressor.py
+++ b/strix/llm/memory_compressor.py
@@ -1,9 +1,10 @@
 import logging
-import os
 from typing import Any

 import litellm

+from strix.config import Config
+

 logger = logging.getLogger(__name__)

@@ -150,7 +151,7 @@ class MemoryCompressor:
        timeout: int = 600,
    ):
        self.max_images = max_images
-        self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
+        self.model_name = model_name or Config.get("strix_llm")
        self.timeout = timeout

        if not self.model_name:
--- a/strix/llm/request_queue.py
+++ b/strix/llm/request_queue.py
@@ -1,48 +1,26 @@
 import asyncio
-import logging
-import os
 import threading
 import time
+from collections.abc import AsyncIterator
 from typing import Any

-import litellm
-from litellm import ModelResponse, completion
-from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential
+from litellm import acompletion
+from litellm.types.utils import ModelResponseStream

-
-logger = logging.getLogger(__name__)
-
-
-def should_retry_exception(exception: Exception) -> bool:
-    status_code = None
-
-    if hasattr(exception, "status_code"):
-        status_code = exception.status_code
-    elif hasattr(exception, "response") and hasattr(exception.response, "status_code"):
-        status_code = exception.response.status_code
-
-    if status_code is not None:
-        return bool(litellm._should_retry(status_code))
-    return True
+from strix.config import Config


 class LLMRequestQueue:
-    def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 5.0):
-        rate_limit_delay = os.getenv("LLM_RATE_LIMIT_DELAY")
-        if rate_limit_delay:
-            delay_between_requests = float(rate_limit_delay)
-
-        rate_limit_concurrent = os.getenv("LLM_RATE_LIMIT_CONCURRENT")
-        if rate_limit_concurrent:
-            max_concurrent = int(rate_limit_concurrent)
-
-        self.max_concurrent = max_concurrent
-        self.delay_between_requests = delay_between_requests
-        self._semaphore = threading.BoundedSemaphore(max_concurrent)
+    def __init__(self) -> None:
+        self.delay_between_requests = float(Config.get("llm_rate_limit_delay") or "4.0")
+        self.max_concurrent = int(Config.get("llm_rate_limit_concurrent") or "1")
+        self._semaphore = threading.BoundedSemaphore(self.max_concurrent)
        self._last_request_time = 0.0
        self._lock = threading.Lock()

-    async def make_request(self, completion_args: dict[str, Any]) -> ModelResponse:
+    async def stream_request(
+        self, completion_args: dict[str, Any]
+    ) -> AsyncIterator[ModelResponseStream]:
        try:
            while not self._semaphore.acquire(timeout=0.2):
                await asyncio.sleep(0.1)
@@ -56,25 +34,18 @@ class LLMRequestQueue:
            if sleep_needed > 0:
                await asyncio.sleep(sleep_needed)

-            return await self._reliable_request(completion_args)
+            async for chunk in self._stream_request(completion_args):
+                yield chunk
        finally:
            self._semaphore.release()

-    @retry(  # type: ignore[misc]
-        stop=stop_after_attempt(7),
-        wait=wait_exponential(multiplier=6, min=12, max=150),
-        retry=retry_if_exception(should_retry_exception),
-        reraise=True,
-    )
-    async def _reliable_request(self, completion_args: dict[str, Any]) -> ModelResponse:
-        response = completion(**completion_args, stream=False)
-        if isinstance(response, ModelResponse):
-            return response
-        self._raise_unexpected_response()
-        raise RuntimeError("Unreachable code")
+    async def _stream_request(
+        self, completion_args: dict[str, Any]
+    ) -> AsyncIterator[ModelResponseStream]:
+        response = await acompletion(**completion_args, stream=True)

-    def _raise_unexpected_response(self) -> None:
-        raise RuntimeError("Unexpected response type")
+        async for chunk in response:
+            yield chunk


 _global_queue: LLMRequestQueue | None = None
--- a/strix/llm/utils.py
+++ b/strix/llm/utils.py
@@ -47,10 +47,14 @@ def parse_tool_invocations(content: str) -> list[dict[str, Any]] | None:


 def _fix_stopword(content: str) -> str:
-    if "<function=" in content and content.count("<function=") == 1:
+    if (
+        "<function=" in content
+        and content.count("<function=") == 1
+        and "</function>" not in content
+    ):
        if content.endswith("</"):
            content = content.rstrip() + "function>"
-        elif not content.rstrip().endswith("</function>"):
+        else:
            content = content + "\n</function>"
    return content

@@ -75,6 +79,12 @@ def clean_content(content: str) -> str:
    tool_pattern = r"<function=[^>]+>.*?</function>"
    cleaned = re.sub(tool_pattern, "", content, flags=re.DOTALL)

+    incomplete_tool_pattern = r"<function=[^>]+>.*$"
+    cleaned = re.sub(incomplete_tool_pattern, "", cleaned, flags=re.DOTALL)
+
+    partial_tag_pattern = r"<f(?:u(?:n(?:c(?:t(?:i(?:o(?:n(?:=(?:[^>]*)?)?)?)?)?)?)?)?)?$"
+    cleaned = re.sub(partial_tag_pattern, "", cleaned)
+
    hidden_xml_patterns = [
        r"<inter_agent_message>.*?</inter_agent_message>",
        r"<agent_completion_report>.*?</agent_completion_report>",
--- a/strix/prompts/README.md
+++ b/strix/prompts/README.md
@@ -1,64 +0,0 @@
-# 📚 Strix Prompt Modules
-
-## 🎯 Overview
-
-Prompt modules are specialized knowledge packages that enhance Strix agents with deep expertise in specific vulnerability types, technologies, and testing methodologies. Each module provides advanced techniques, practical examples, and validation methods that go beyond baseline security knowledge.
-
----
-
-## 🏗️ Architecture
-
-### How Prompts Work
-
-When an agent is created, it can load up to 5 specialized prompt modules relevant to the specific subtask and context at hand:
-
-```python
-# Agent creation with specialized modules
-create_agent(
-    task="Test authentication mechanisms in API",
-    name="Auth Specialist",
-    prompt_modules="authentication_jwt,business_logic"
-)
-```
-
-The modules are dynamically injected into the agent's system prompt, allowing it to operate with deep expertise tailored to the specific vulnerability types or technologies required for the task at hand.
-
----
-
-## 📁 Module Categories
-
-| Category | Purpose |
-|----------|---------|
-| **`/vulnerabilities`** | Advanced testing techniques for core vulnerability classes like authentication bypasses, business logic flaws, and race conditions |
-| **`/frameworks`** | Specific testing methods for popular frameworks e.g. Django, Express, FastAPI, and Next.js |
-| **`/technologies`** | Specialized techniques for third-party services such as Supabase, Firebase, Auth0, and payment gateways |
-| **`/protocols`** | Protocol-specific testing patterns for GraphQL, WebSocket, OAuth, and other communication standards |
-| **`/cloud`** | Cloud provider security testing for AWS, Azure, GCP, and Kubernetes environments |
-| **`/reconnaissance`** | Advanced information gathering and enumeration techniques for comprehensive attack surface mapping |
-| **`/custom`** | Community-contributed modules for specialized or industry-specific testing scenarios |
-
----
-
-## 🎨 Creating New Modules
-
-### What Should a Module Contain?
-
-A good prompt module is a structured knowledge package that typically includes:
-
-- **Advanced techniques** - Non-obvious methods specific to the task and domain
-- **Practical examples** - Working payloads, commands, or test cases with variations
-- **Validation methods** - How to confirm findings and avoid false positives
-- **Context-specific insights** - Environment and version nuances, configuration-dependent behavior, and edge cases
-
-Modules use XML-style tags for structure and focus on deep, specialized knowledge that significantly enhances agent capabilities for that specific context.
-
----
-
-## 🤝 Contributing
-
-Community contributions are more than welcome — contribute new modules via [pull requests](https://github.com/usestrix/strix/pulls) or [GitHub issues](https://github.com/usestrix/strix/issues) to help expand the collection and improve extensibility for Strix agents.
-
----
-
-> [!NOTE]
-> **Work in Progress** - We're actively expanding the prompt module collection with specialized techniques and new categories.
--- a/strix/prompts/__init__.py
+++ b/strix/prompts/__init__.py
@@ -1,109 +0,0 @@
-from pathlib import Path
-
-from jinja2 import Environment
-
-
-def get_available_prompt_modules() -> dict[str, list[str]]:
-    modules_dir = Path(__file__).parent
-    available_modules = {}
-
-    for category_dir in modules_dir.iterdir():
-        if category_dir.is_dir() and not category_dir.name.startswith("__"):
-            category_name = category_dir.name
-            modules = []
-
-            for file_path in category_dir.glob("*.jinja"):
-                module_name = file_path.stem
-                modules.append(module_name)
-
-            if modules:
-                available_modules[category_name] = sorted(modules)
-
-    return available_modules
-
-
-def get_all_module_names() -> set[str]:
-    all_modules = set()
-    for category_modules in get_available_prompt_modules().values():
-        all_modules.update(category_modules)
-    return all_modules
-
-
-def validate_module_names(module_names: list[str]) -> dict[str, list[str]]:
-    available_modules = get_all_module_names()
-    valid_modules = []
-    invalid_modules = []
-
-    for module_name in module_names:
-        if module_name in available_modules:
-            valid_modules.append(module_name)
-        else:
-            invalid_modules.append(module_name)
-
-    return {"valid": valid_modules, "invalid": invalid_modules}
-
-
-def generate_modules_description() -> str:
-    available_modules = get_available_prompt_modules()
-
-    if not available_modules:
-        return "No prompt modules available"
-
-    all_module_names = get_all_module_names()
-
-    if not all_module_names:
-        return "No prompt modules available"
-
-    sorted_modules = sorted(all_module_names)
-    modules_str = ", ".join(sorted_modules)
-
-    description = (
-        f"List of prompt modules to load for this agent (max 5). Available modules: {modules_str}. "
-    )
-
-    example_modules = sorted_modules[:2]
-    if example_modules:
-        example = f"Example: {', '.join(example_modules)} for specialized agent"
-        description += example
-
-    return description
-
-
-def load_prompt_modules(module_names: list[str], jinja_env: Environment) -> dict[str, str]:
-    import logging
-
-    logger = logging.getLogger(__name__)
-    module_content = {}
-    prompts_dir = Path(__file__).parent
-
-    available_modules = get_available_prompt_modules()
-
-    for module_name in module_names:
-        try:
-            module_path = None
-
-            if "/" in module_name:
-                module_path = f"{module_name}.jinja"
-            else:
-                for category, modules in available_modules.items():
-                    if module_name in modules:
-                        module_path = f"{category}/{module_name}.jinja"
-                        break
-
-                if not module_path:
-                    root_candidate = f"{module_name}.jinja"
-                    if (prompts_dir / root_candidate).exists():
-                        module_path = root_candidate
-
-            if module_path and (prompts_dir / module_path).exists():
-                template = jinja_env.get_template(module_path)
-                var_name = module_name.split("/")[-1]
-                module_content[var_name] = template.render()
-                logger.info(f"Loaded prompt module: {module_name} -> {var_name}")
-            else:
-                logger.warning(f"Prompt module not found: {module_name}")
-
-        except (FileNotFoundError, OSError, ValueError) as e:
-            logger.warning(f"Failed to load prompt module {module_name}: {e}")
-
-    return module_content
--- a/strix/runtime/__init__.py
+++ b/strix/runtime/__init__.py
@@ -1,10 +1,19 @@
-import os
+from strix.config import Config

 from .runtime import AbstractRuntime


+class SandboxInitializationError(Exception):
+    """Raised when sandbox initialization fails (e.g., Docker issues)."""
+
+    def __init__(self, message: str, details: str | None = None):
+        super().__init__(message)
+        self.message = message
+        self.details = details
+
+
 def get_runtime() -> AbstractRuntime:
-    runtime_backend = os.getenv("STRIX_RUNTIME_BACKEND", "docker")
+    runtime_backend = Config.get("strix_runtime_backend")

    if runtime_backend == "docker":
        from .docker_runtime import DockerRuntime
@@ -16,4 +25,4 @@ def get_runtime() -> AbstractRuntime:
    )


-__all__ = ["AbstractRuntime", "get_runtime"]
+__all__ = ["AbstractRuntime", "SandboxInitializationError", "get_runtime"]
--- a/strix/runtime/docker_runtime.py
+++ b/strix/runtime/docker_runtime.py
@@ -4,27 +4,49 @@ import os
 import secrets
 import socket
 import time
+from concurrent.futures import ThreadPoolExecutor
+from concurrent.futures import TimeoutError as FuturesTimeoutError
 from pathlib import Path
-from typing import cast
+from typing import Any, cast

 import docker
 from docker.errors import DockerException, ImageNotFound, NotFound
 from docker.models.containers import Container
+from requests.exceptions import ConnectionError as RequestsConnectionError
+from requests.exceptions import Timeout as RequestsTimeout

+from strix.config import Config
+
+from . import SandboxInitializationError
 from .runtime import AbstractRuntime, SandboxInfo


-STRIX_IMAGE = os.getenv("STRIX_IMAGE", "ghcr.io/usestrix/strix-sandbox:0.1.10")
+HOST_GATEWAY_HOSTNAME = "host.docker.internal"
+DOCKER_TIMEOUT = 60  # seconds
+TOOL_SERVER_HEALTH_REQUEST_TIMEOUT = 5  # seconds per health check request
+TOOL_SERVER_HEALTH_RETRIES = 10  # number of retries for health check
 logger = logging.getLogger(__name__)


 class DockerRuntime(AbstractRuntime):
    def __init__(self) -> None:
        try:
-            self.client = docker.from_env()
-        except DockerException as e:
+            self.client = docker.from_env(timeout=DOCKER_TIMEOUT)
+        except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
            logger.exception("Failed to connect to Docker daemon")
-            raise RuntimeError("Docker is not available or not configured correctly.") from e
+            if isinstance(e, RequestsConnectionError | RequestsTimeout):
+                raise SandboxInitializationError(
+                    "Docker daemon unresponsive",
+                    f"Connection timed out after {DOCKER_TIMEOUT} seconds. "
+                    "Please ensure Docker Desktop is installed and running, "
+                    "and try running strix again.",
+                ) from e
+            raise SandboxInitializationError(
+                "Docker is not available",
+                "Docker is not available or not configured correctly. "
+                "Please ensure Docker Desktop is installed and running, "
+                "and try running strix again.",
+            ) from e

        self._scan_container: Container | None = None
        self._tool_server_port: int | None = None
@@ -38,6 +60,23 @@ class DockerRuntime(AbstractRuntime):
            s.bind(("", 0))
            return cast("int", s.getsockname()[1])

+    def _exec_run_with_timeout(
+        self, container: Container, cmd: str, timeout: int = DOCKER_TIMEOUT, **kwargs: Any
+    ) -> Any:
+        with ThreadPoolExecutor(max_workers=1) as executor:
+            future = executor.submit(container.exec_run, cmd, **kwargs)
+            try:
+                return future.result(timeout=timeout)
+            except FuturesTimeoutError:
+                logger.exception(f"exec_run timed out after {timeout}s: {cmd[:100]}...")
+                raise SandboxInitializationError(
+                    "Container command timed out",
+                    f"Command timed out after {timeout} seconds. "
+                    "Docker may be overloaded or unresponsive. "
+                    "Please ensure Docker Desktop is installed and running, "
+                    "and try running strix again.",
+                ) from None
+
    def _get_scan_id(self, agent_id: str) -> str:
        try:
            from strix.telemetry.tracer import get_global_tracer
@@ -80,10 +119,13 @@ class DockerRuntime(AbstractRuntime):
    def _create_container_with_retry(self, scan_id: str, max_retries: int = 3) -> Container:
        last_exception = None
        container_name = f"strix-scan-{scan_id}"
+        image_name = Config.get("strix_image")
+        if not image_name:
+            raise ValueError("STRIX_IMAGE must be configured")

        for attempt in range(max_retries):
            try:
-                self._verify_image_available(STRIX_IMAGE)
+                self._verify_image_available(image_name)

                try:
                    existing_container = self.client.containers.get(container_name)
@@ -105,7 +147,7 @@ class DockerRuntime(AbstractRuntime):
                self._tool_server_token = tool_server_token

                container = self.client.containers.run(
-                    STRIX_IMAGE,
+                    image_name,
                    command="sleep infinity",
                    detach=True,
                    name=container_name,
@@ -121,7 +163,9 @@ class DockerRuntime(AbstractRuntime):
                        "CAIDO_PORT": str(caido_port),
                        "TOOL_SERVER_PORT": str(tool_server_port),
                        "TOOL_SERVER_TOKEN": tool_server_token,
+                        "HOST_GATEWAY": HOST_GATEWAY_HOSTNAME,
                    },
+                    extra_hosts=self._get_extra_hosts(),
                    tty=True,
                )

@@ -131,7 +175,7 @@ class DockerRuntime(AbstractRuntime):
                self._initialize_container(
                    container, caido_port, tool_server_port, tool_server_token
                )
-            except DockerException as e:
+            except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
                last_exception = e
                if attempt == max_retries - 1:
                    logger.exception(f"Failed to create container after {max_retries} attempts")
@@ -147,8 +191,19 @@ class DockerRuntime(AbstractRuntime):
            else:
                return container

-        raise RuntimeError(
-            f"Failed to create Docker container after {max_retries} attempts: {last_exception}"
+        if isinstance(last_exception, RequestsConnectionError | RequestsTimeout):
+            raise SandboxInitializationError(
+                "Failed to create sandbox container",
+                f"Docker daemon unresponsive after {max_retries} attempts "
+                f"(timed out after {DOCKER_TIMEOUT}s). "
+                "Please ensure Docker Desktop is installed and running, "
+                "and try running strix again.",
+            ) from last_exception
+        raise SandboxInitializationError(
+            "Failed to create sandbox container",
+            f"Container creation failed after {max_retries} attempts: {last_exception}. "
+            "Please ensure Docker Desktop is installed and running, "
+            "and try running strix again.",
        ) from last_exception

    def _get_or_create_scan_container(self, scan_id: str) -> Container:  # noqa: PLR0912
@@ -193,7 +248,7 @@ class DockerRuntime(AbstractRuntime):

        except NotFound:
            pass
-        except DockerException as e:
+        except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
            logger.warning(f"Failed to get container by name {container_name}: {e}")
        else:
            return container
@@ -203,7 +258,7 @@ class DockerRuntime(AbstractRuntime):
                all=True, filters={"label": f"strix-scan-id={scan_id}"}
            )
            if containers:
-                container = cast("Container", containers[0])
+                container = containers[0]
                if container.status != "running":
                    container.start()
                    time.sleep(2)
@@ -217,7 +272,7 @@ class DockerRuntime(AbstractRuntime):

                logger.info(f"Found existing container by label for scan {scan_id}")
                return container
-        except DockerException as e:
+        except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
            logger.warning("Failed to find existing container by label for scan %s: %s", scan_id, e)

        logger.info("Creating new Docker container for scan %s", scan_id)
@@ -227,15 +282,18 @@ class DockerRuntime(AbstractRuntime):
        self, container: Container, caido_port: int, tool_server_port: int, tool_server_token: str
    ) -> None:
        logger.info("Initializing Caido proxy on port %s", caido_port)
-        result = container.exec_run(
+        self._exec_run_with_timeout(
+            container,
            f"bash -c 'export CAIDO_PORT={caido_port} && /usr/local/bin/docker-entrypoint.sh true'",
            detach=False,
        )

        time.sleep(5)

-        result = container.exec_run(
-            "bash -c 'source /etc/profile.d/proxy.sh && echo $CAIDO_API_TOKEN'", user="pentester"
+        result = self._exec_run_with_timeout(
+            container,
+            "bash -c 'source /etc/profile.d/proxy.sh && echo $CAIDO_API_TOKEN'",
+            user="pentester",
        )
        caido_token = result.output.decode().strip() if result.exit_code == 0 else ""

@@ -248,7 +306,57 @@ class DockerRuntime(AbstractRuntime):
            user="pentester",
        )

-        time.sleep(5)
+        time.sleep(2)
+
+        host = self._resolve_docker_host()
+        health_url = f"http://{host}:{tool_server_port}/health"
+        self._wait_for_tool_server_health(health_url)
+
+    def _wait_for_tool_server_health(
+        self,
+        health_url: str,
+        max_retries: int = TOOL_SERVER_HEALTH_RETRIES,
+        request_timeout: int = TOOL_SERVER_HEALTH_REQUEST_TIMEOUT,
+    ) -> None:
+        import httpx
+
+        logger.info(f"Waiting for tool server health at {health_url}")
+
+        for attempt in range(max_retries):
+            try:
+                with httpx.Client(trust_env=False, timeout=request_timeout) as client:
+                    response = client.get(health_url)
+                    response.raise_for_status()
+                    health_data = response.json()
+
+                    if health_data.get("status") == "healthy":
+                        logger.info(
+                            f"Tool server is healthy after {attempt + 1} attempt(s): {health_data}"
+                        )
+                        return
+
+                    logger.warning(f"Tool server returned unexpected status: {health_data}")
+
+            except httpx.ConnectError:
+                logger.debug(
+                    f"Tool server not ready (attempt {attempt + 1}/{max_retries}): "
+                    f"Connection refused"
+                )
+            except httpx.TimeoutException:
+                logger.debug(
+                    f"Tool server not ready (attempt {attempt + 1}/{max_retries}): "
+                    f"Request timed out"
+                )
+            except (httpx.RequestError, httpx.HTTPStatusError) as e:
+                logger.debug(f"Tool server not ready (attempt {attempt + 1}/{max_retries}): {e}")
+
+            sleep_time = min(2**attempt * 0.5, 5)
+            time.sleep(sleep_time)
+
+        raise SandboxInitializationError(
+            "Tool server failed to start",
+            "Please ensure Docker Desktop is installed and running, and try running strix again.",
+        )

    def _copy_local_directory_to_container(
        self, container: Container, local_path: str, target_name: str | None = None
@@ -381,6 +489,9 @@ class DockerRuntime(AbstractRuntime):

        return "127.0.0.1"

+    def _get_extra_hosts(self) -> dict[str, str]:
+        return {HOST_GATEWAY_HOSTNAME: "host-gateway"}
+
    async def destroy_sandbox(self, container_id: str) -> None:
        logger.info("Destroying scan container %s", container_id)
        try:
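Note on the retry loop in `_wait_for_tool_server_health` above: the delay between attempts is `min(2**attempt * 0.5, 5)`, which is exponential backoff capped at 5 seconds. The sketch below reproduces only that schedule so the worst-case sleep time can be estimated. The retry count of 6 is an assumed value for illustration; the real `TOOL_SERVER_HEALTH_RETRIES` constant is not shown in this diff, and the helper name `backoff_schedule` is hypothetical.

```python
def backoff_schedule(max_retries: int, base: float = 0.5, cap: float = 5.0) -> list[float]:
    """Return the sleep (in seconds) applied after each failed attempt."""
    # Same formula as the diff: double the delay each attempt, never exceed the cap.
    return [min(2**attempt * base, cap) for attempt in range(max_retries)]


# Assumed retry count, for illustration only.
schedule = backoff_schedule(6)
total_sleep = sum(schedule)
print(schedule)
print(total_sleep)
```

The total excludes the per-request timeout, so the real upper bound on startup wait is higher than the summed sleeps.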
--- a/strix/skills/README.md
+++ b/strix/skills/README.md
@@ -0,0 +1,64 @@
+# 📚 Strix Skills
+
+## 🎯 Overview
+
+Skills are specialized knowledge packages that enhance Strix agents with deep expertise in specific vulnerability types, technologies, and testing methodologies. Each skill provides advanced techniques, practical examples, and validation methods that go beyond baseline security knowledge.
+
+---
+
+## 🏗️ Architecture
+
+### How Skills Work
+
+When an agent is created, it can load up to 5 specialized skills relevant to the specific subtask and context at hand:
+
+```python
+# Agent creation with specialized skills
+create_agent(
+    task="Test authentication mechanisms in API",
+    name="Auth Specialist",
+    skills="authentication_jwt,business_logic"
+)
+```
+
+The skills are dynamically injected into the agent's system prompt, allowing it to operate with deep expertise tailored to the specific vulnerability types or technologies required for the task at hand.
+
+---
+
+## 📁 Skill Categories
+
+| Category | Purpose |
+|----------|---------|
+| **`/vulnerabilities`** | Advanced testing techniques for core vulnerability classes like authentication bypasses, business logic flaws, and race conditions |
+| **`/frameworks`** | Framework-specific testing methods for popular frameworks such as Django, Express, FastAPI, and Next.js |
+| **`/technologies`** | Specialized techniques for third-party services such as Supabase, Firebase, Auth0, and payment gateways |
+| **`/protocols`** | Protocol-specific testing patterns for GraphQL, WebSocket, OAuth, and other communication standards |
+| **`/cloud`** | Cloud provider security testing for AWS, Azure, GCP, and Kubernetes environments |
+| **`/reconnaissance`** | Advanced information gathering and enumeration techniques for comprehensive attack surface mapping |
+| **`/custom`** | Community-contributed skills for specialized or industry-specific testing scenarios |
+
+---
+
+## 🎨 Creating New Skills
+
+### What Should a Skill Contain?
+
+A good skill is a structured knowledge package that typically includes:
+
+- **Advanced techniques** - Non-obvious methods specific to the task and domain
+- **Practical examples** - Working payloads, commands, or test cases with variations
+- **Validation methods** - How to confirm findings and avoid false positives
+- **Context-specific insights** - Environment and version nuances, configuration-dependent behavior, and edge cases
+
+Skills use XML-style tags for structure and focus on deep, specialized knowledge that significantly enhances agent capabilities for that specific context.
+
+---
+
+## 🤝 Contributing
+
+Community contributions are welcome. Contribute new skills via [pull requests](https://github.com/usestrix/strix/pulls) or [GitHub issues](https://github.com/usestrix/strix/issues) to help expand the collection and improve extensibility for Strix agents.
+
+---
+
+> [!NOTE]
+> **Work in Progress** - We're actively expanding the skills collection with specialized techniques and new categories.
--- a/strix/skills/__init__.py
+++ b/strix/skills/__init__.py
@@ -0,0 +1,107 @@
+from pathlib import Path
+
+from jinja2 import Environment
+
+
+def get_available_skills() -> dict[str, list[str]]:
+    skills_dir = Path(__file__).parent
+    available_skills = {}
+
+    for category_dir in skills_dir.iterdir():
+        if category_dir.is_dir() and not category_dir.name.startswith("__"):
+            category_name = category_dir.name
+            skills = []
+
+            for file_path in category_dir.glob("*.jinja"):
+                skill_name = file_path.stem
+                skills.append(skill_name)
+
+            if skills:
+                available_skills[category_name] = sorted(skills)
+
+    return available_skills
+
+
+def get_all_skill_names() -> set[str]:
+    all_skills = set()
+    for category_skills in get_available_skills().values():
+        all_skills.update(category_skills)
+    return all_skills
+
+
+def validate_skill_names(skill_names: list[str]) -> dict[str, list[str]]:
+    available_skills = get_all_skill_names()
+    valid_skills = []
+    invalid_skills = []
+
+    for skill_name in skill_names:
+        if skill_name in available_skills:
+            valid_skills.append(skill_name)
+        else:
+            invalid_skills.append(skill_name)
+
+    return {"valid": valid_skills, "invalid": invalid_skills}
+
+
+def generate_skills_description() -> str:
+    available_skills = get_available_skills()
+
+    if not available_skills:
+        return "No skills available"
+
+    all_skill_names = get_all_skill_names()
+
+    if not all_skill_names:
+        return "No skills available"
+
+    sorted_skills = sorted(all_skill_names)
+    skills_str = ", ".join(sorted_skills)
+
+    description = f"List of skills to load for this agent (max 5). Available skills: {skills_str}. "
+
+    example_skills = sorted_skills[:2]
+    if example_skills:
+        example = f"Example: {', '.join(example_skills)} for specialized agent"
+        description += example
+
+    return description
+
+
+def load_skills(skill_names: list[str], jinja_env: Environment) -> dict[str, str]:
+    import logging
+
+    logger = logging.getLogger(__name__)
+    skill_content = {}
+    skills_dir = Path(__file__).parent
+
+    available_skills = get_available_skills()
+
+    for skill_name in skill_names:
+        try:
+            skill_path = None
+
+            if "/" in skill_name:
+                skill_path = f"{skill_name}.jinja"
+            else:
+                for category, skills in available_skills.items():
+                    if skill_name in skills:
+                        skill_path = f"{category}/{skill_name}.jinja"
+                        break
+
+                if not skill_path:
+                    root_candidate = f"{skill_name}.jinja"
+                    if (skills_dir / root_candidate).exists():
+                        skill_path = root_candidate
+
+            if skill_path and (skills_dir / skill_path).exists():
+                template = jinja_env.get_template(skill_path)
+                var_name = skill_name.split("/")[-1]
+                skill_content[var_name] = template.render()
+                logger.info(f"Loaded skill: {skill_name} -> {var_name}")
+            else:
+                logger.warning(f"Skill not found: {skill_name}")
+
+        except (FileNotFoundError, OSError, ValueError) as e:
+            logger.warning(f"Failed to load skill {skill_name}: {e}")
+
+    return skill_content
--- a/strix/prompts/cloud/.gitkeep
+++ b/strix/prompts/cloud/.gitkeep
--- a/strix/prompts/coordination/root_agent.jinja
+++ b/strix/prompts/coordination/root_agent.jinja
--- a/strix/prompts/custom/.gitkeep
+++ b/strix/prompts/custom/.gitkeep
--- a/strix/prompts/frameworks/fastapi.jinja
+++ b/strix/prompts/frameworks/fastapi.jinja
--- a/strix/prompts/frameworks/nextjs.jinja
+++ b/strix/prompts/frameworks/nextjs.jinja
@@ -31,6 +31,18 @@
 </high_value_targets>

 <advanced_techniques>
+<route_enumeration>
+- __BUILD_MANIFEST.sortedPages: Execute `console.log(__BUILD_MANIFEST.sortedPages.join('\n'))` in browser console to instantly reveal all registered routes (Pages Router and static App Router paths compiled at build time)
+- __NEXT_DATA__: Inspect `<script id="__NEXT_DATA__">` for server-side props, pageProps, buildId, and dynamic route params on the current page; reveals data flow and prop structure
+- Source maps exposure: Check `/_next/static/` for exposed .map files revealing full route structure, server action IDs, API endpoints, and internal function names
+- Client bundle mining: Search main-*.js and page chunks for route definitions; grep for 'pathname:', 'href:', '__next_route__', 'serverActions', and API endpoint strings
+- Static chunk enumeration: Probe `/_next/static/chunks/pages/` and `/_next/static/chunks/app/` for build artifacts; filenames map directly to routes (e.g., `admin.js` → `/admin`)
+- Build manifest fetch: GET `/_next/static/<buildId>/_buildManifest.js` and `/_next/static/<buildId>/_ssgManifest.js` for complete route and static generation metadata
+- Sitemap/robots leakage: Check `/sitemap.xml`, `/robots.txt`, and `/sitemap-*.xml` for unintended exposure of admin/internal/preview paths
+- Server action discovery: Inspect Network tab for POST requests with `Next-Action` header; extract action IDs from response streams and client hydration data
+- Environment variable leakage: `NEXT_PUBLIC_*` values are inlined into client bundles at build time, so enumerating `process.env` from the browser console is typically unreliable; instead grep bundles for inlined `NEXT_PUBLIC_` values and for 'API_KEY', 'SECRET', 'TOKEN', 'PASSWORD' to find accidentally leaked credentials
+</route_enumeration>
+
 <middleware_bypass>
 - Test for CVE-class middleware bypass via `x-middleware-subrequest` crafting and `x-nextjs-data` probing. Look for 307 + `x-middleware-rewrite`/`x-nextjs-redirect` headers and attempt bypass on protected routes.
 - Attempt direct route access on Node vs Edge runtimes; confirm protection parity.
@@ -80,6 +92,14 @@
 - Identify `dangerouslySetInnerHTML`, Markdown renderers, and user-controlled href/src attributes. Validate CSP/Trusted Types coverage for SSR/CSR/hydration.
 - Attack hydration boundaries: server vs client render mismatches can enable gadget-based XSS.
 </client_and_dom>
+
+<data_fetching_over_exposure>
+- getServerSideProps/getStaticProps leakage: Execute `JSON.parse(document.getElementById('__NEXT_DATA__').textContent).props.pageProps` in console to inspect all server-fetched data; look for sensitive fields (emails, tokens, internal IDs, full user objects) passed to client but not rendered in UI
+- Over-fetched database queries: Check if pageProps include entire user records, relations, or admin-only fields when only username is displayed; common when using ORM select-all patterns
+- API response pass-through: Verify if API responses are sanitized before passing to props; developers often forward entire responses including metadata, cursors, or debug info
+- Environment-dependent data: Test if staging/dev accidentally exposes more fields in props than production due to inconsistent serialization logic
+- Nested object inspection: Drill into nested props objects; look for `_metadata`, `_internal`, `__typename` (GraphQL), or framework-added fields containing sensitive context
+</data_fetching_over_exposure>
 </advanced_techniques>

 <bypass_techniques>
@@ -87,6 +107,8 @@
 - Method override/tunneling: `_method`, `X-HTTP-Method-Override`, GET on endpoints unexpectedly accepting writes.
 - Case/param aliasing and query duplication affecting middleware vs handler parsing.
 - Cache key confusion at CDN/proxy (lack of Vary on auth cookies/headers) to leak personalized SSR/ISR content.
+- API route path normalization: Test `/api/users` vs `/api/users/` vs `/api//users` vs `/api/./users`; middleware may normalize differently than route handlers, allowing protection bypass. Try double slashes, trailing slashes, and dot segments.
+- Parameter pollution: Send duplicate query params (`?id=1&id=2`) or array notation (`?filter[]=a&filter[]=b`) to exploit parsing differences between middleware (which may check first value) and handler (which may use last or array).
 </bypass_techniques>

 <special_contexts>
@@ -107,6 +129,10 @@
 3. Demonstrate server action invocation outside UI with insufficient authorization checks.
 4. Show middleware bypass (where applicable) with explicit headers and resulting protected content.
 5. Include runtime parity checks (Edge vs Node) proving inconsistent enforcement.
+6. For route enumeration: verify discovered routes return 200/403 (deployed) not 404 (build artifacts); test with authenticated vs unauthenticated requests.
+7. For leaked credentials: test API keys with minimal read-only calls; filter placeholders (YOUR_API_KEY, demo-token); confirm keys match provider patterns (sk_live_*, pk_prod_*).
+8. For __NEXT_DATA__ over-exposure: test cross-user (User A's props should not contain User B's PII); verify exposed fields are not in DOM; validate token validity with API calls.
+9. For path normalization bypasses: show differential responses (403 vs 200 for path variants); redirects (307/308) don't count—only direct access bypasses matter.
 </validation>

 <pro_tips>
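Note on the `data_fetching_over_exposure` checks above: the `__NEXT_DATA__` inspection can also be done offline against a saved response body. The sketch below extracts `props.pageProps` from HTML and lists key paths whose names hint at sensitive data. The hint list, the function names, and the sample HTML are all illustrative assumptions; a key-name match is only a lead to verify manually, not a finding.

```python
import json
import re

NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)
# Assumed hint list; extend it for the target application.
SENSITIVE_HINTS = ("token", "secret", "password", "email", "api_key", "apikey")


def extract_page_props(html: str) -> dict:
    """Return props.pageProps from the __NEXT_DATA__ script, or {} if absent."""
    match = NEXT_DATA_RE.search(html)
    if not match:
        return {}
    return json.loads(match.group(1)).get("props", {}).get("pageProps", {})


def find_sensitive_keys(value, path: str = "") -> list[str]:
    """Walk nested dicts/lists and collect paths of keys matching a hint."""
    hits = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            if any(hint in key.lower() for hint in SENSITIVE_HINTS):
                hits.append(child_path)
            hits.extend(find_sensitive_keys(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_sensitive_keys(child, f"{path}[{index}]"))
    return hits


# Fabricated sample page, for illustration only.
sample_html = (
    '<html><script id="__NEXT_DATA__" type="application/json">'
    '{"props":{"pageProps":{"user":{"name":"a","email":"a@example.com",'
    '"resetToken":"x"}}},"buildId":"dev"}</script></html>'
)
props = extract_page_props(sample_html)
hits = find_sensitive_keys(props)
print(hits)
```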
--- a/strix/prompts/protocols/graphql.jinja
+++ b/strix/prompts/protocols/graphql.jinja
--- a/strix/prompts/reconnaissance/.gitkeep
+++ b/strix/prompts/reconnaissance/.gitkeep
--- a/strix/skills/scan_modes/deep.jinja
+++ b/strix/skills/scan_modes/deep.jinja
@@ -0,0 +1,145 @@
+<scan_mode>
+DEEP SCAN MODE - Exhaustive Security Assessment
+
+This mode is for thorough security reviews where finding vulnerabilities is critical.
+
+PHASE 1: EXHAUSTIVE RECONNAISSANCE AND MAPPING
+Spend significant effort understanding the target before exploitation.
+
+For whitebox (source code available):
+- Map EVERY file, module, and code path in the repository
+- Trace all entry points from HTTP handlers to database queries
+- Identify all authentication mechanisms and their implementations
+- Map all authorization checks and understand the access control model
+- Identify all external service integrations and API calls
+- Analyze all configuration files for secrets and misconfigurations
+- Review all database schemas and understand data relationships
+- Map all background jobs, cron tasks, and async processing
+- Identify all serialization/deserialization points
+- Review all file handling operations (upload, download, processing)
+- Understand the deployment model and infrastructure assumptions
+- Check all dependency versions against known CVE databases
+
+For blackbox (no source code):
+- Exhaustive subdomain enumeration using multiple sources and tools
+- Full port scanning to identify all services
+- Complete content discovery with multiple wordlists
+- Technology fingerprinting on all discovered assets
+- API endpoint discovery through documentation, JavaScript analysis, and fuzzing
+- Identify all parameters including hidden and rarely-used ones
+- Map all user roles by testing with different account types
+- Understand rate limiting, WAF rules, and security controls in place
+- Document the complete application architecture as understood from outside
+
+EXECUTION STRATEGY - HIERARCHICAL AGENT SWARM:
+After Phase 1 (Recon & Mapping) is complete:
+1. Divide the application into major components/parts (e.g., Auth System, Payment Gateway, User Profile, Admin Panel)
+2. Spawn a specialized subagent for EACH major component
+3. Each component agent must then:
+   - Further subdivide its scope into subparts (e.g., Login Form, Registration API, Password Reset)
+   - Spawn sub-subagents for each distinct subpart
+4. At the lowest level (specific functionality), spawn specialized agents for EACH potential vulnerability type:
+   - "Auth System" → "Login Form" → "SQLi Agent", "XSS Agent", "Auth Bypass Agent"
+   - This creates a massive parallel swarm covering every angle
+   - Do NOT overload a single agent with multiple vulnerability types
+   - Scale horizontally to maximum capacity
+
+PHASE 2: DEEP BUSINESS LOGIC ANALYSIS
+Understand the application deeply enough to find logic flaws:
+- CREATE A FULL STORYBOARD of all user flows and state transitions
+- Document every step of the business logic in a structured flow diagram
+- Use the application extensively as every type of user to map the full lifecycle of data
+- Document all state machines and workflows (e.g. Order Created -> Paid -> Shipped)
+- Identify trust boundaries between components
+- Map all integrations with third-party services
+- Understand what invariants the application tries to maintain
+- Identify all points where roles, privileges, or sensitive data changes hands
+- Look for implicit assumptions in the business logic
+- Consider multi-step attacks that abuse normal functionality
+
+PHASE 3: COMPREHENSIVE ATTACK SURFACE TESTING
+Test EVERY input vector with EVERY applicable technique.
+
+Input Handling - Test all parameters, headers, cookies with:
+- Multiple injection payloads (SQL, NoSQL, LDAP, XPath, Command, Template)
+- Various encodings and bypass techniques (double encoding, unicode, null bytes)
+- Boundary conditions and type confusion
+- Large payloads and buffer-related issues
+
+Authentication and Session:
+- Exhaustive brute force protection testing
+- Session fixation, hijacking, and prediction attacks
+- JWT/token manipulation if applicable
+- OAuth flow abuse scenarios
+- Password reset flow vulnerabilities (token leakage, reuse, timing)
+- Multi-factor authentication bypass techniques
+- Account enumeration through all possible channels
+
+Access Control:
+- Test EVERY endpoint for horizontal and vertical access control
+- Parameter tampering on all object references
+- Forced browsing to all discovered resources
+- HTTP method tampering
+- Test access control after session changes (logout, role change)
+
+File Operations:
+- Exhaustive file upload bypass testing (extension, content-type, magic bytes)
+- Path traversal on all file parameters
+- Server-side request forgery through file inclusion
+- XXE through all XML parsing points
+
+Business Logic:
+- Race conditions on all state-changing operations
+- Workflow bypass attempts on every multi-step process
+- Price/quantity manipulation in all transactions
+- Parallel execution attacks
+- Time-of-check to time-of-use vulnerabilities
+
+Advanced Attacks:
+- HTTP request smuggling if multiple proxies/servers are in the request path
+- Cache poisoning and cache deception
+- Subdomain takeover on all subdomains
+- Prototype pollution in JavaScript applications
+- CORS misconfiguration exploitation
+- WebSocket security testing
+- GraphQL specific attacks if applicable
+
+PHASE 4: VULNERABILITY CHAINING
+Don't just find individual bugs - chain them:
+- Combine information disclosure with access control bypass
+- Chain SSRF to access internal services
+- Use low-severity findings to enable high-impact attacks
+- Look for multi-step attack paths that automated tools miss
+- Consider attacks that span multiple application components
+
+CHAINING PRINCIPLES (MAX IMPACT):
+- Treat every finding as a pivot: ask "What does this unlock next?" until you reach maximum privilege / maximum data exposure / maximum control
+- Prefer end-to-end exploit paths over isolated bugs: initial foothold → pivot → privilege gain → sensitive action/data
+- Cross boundaries deliberately: user → admin, external → internal, unauthenticated → authenticated, read → write, single-tenant → cross-tenant
+- Validate chains by executing the full sequence using the available tools (proxy + browser for workflows, python for automation, terminal for supporting commands)
+- When a component agent finds a potential pivot, it must message/spawn the next focused agent to continue the chain in the next component/subpart
+
+PHASE 5: PERSISTENT TESTING
+If initial attempts fail, don't give up:
+- Research specific technologies for known bypasses
+- Try alternative exploitation techniques
+- Look for edge cases and unusual functionality
+- Test with different client contexts
+- Revisit previously tested areas with new information
+- Consider timing-based and blind exploitation techniques
+
+PHASE 6: THOROUGH REPORTING
+- Document EVERY confirmed vulnerability with full details
+- Include all severity levels - even low findings may enable chains
+- Provide complete reproduction steps and PoC
+- Document remediation recommendations
+- Note areas requiring additional review beyond current scope
+
+MINDSET:
+- Relentless - this is about finding what others miss
+- Creative - think of unconventional attack vectors
+- Patient - real vulnerabilities often require deep investigation
+- Thorough - test every parameter, every endpoint, every edge case
+- Persistent - if one approach fails, try ten more
+- Holistic - understand how components interact to find systemic issues
+</scan_mode>
--- a/strix/skills/scan_modes/quick.jinja
+++ b/strix/skills/scan_modes/quick.jinja
@@ -0,0 +1,63 @@
+<scan_mode>
+QUICK SCAN MODE - Rapid Security Assessment
+
+This mode is optimized for fast feedback. Focus on HIGH-IMPACT vulnerabilities with minimal overhead.
+
+PHASE 1: RAPID ORIENTATION
+- If source code is available: Focus primarily on RECENT CHANGES (git diff, new commits, modified files)
+- Identify the most critical entry points: authentication endpoints, payment flows, admin interfaces, API endpoints handling sensitive data
+- Quickly understand the tech stack and frameworks in use
+- Skip exhaustive reconnaissance - use what's immediately visible
+
+PHASE 2: TARGETED ATTACK SURFACE
+For whitebox (source code available):
+- Prioritize files changed in recent commits/PRs - these are most likely to contain fresh bugs
+- Look for security-sensitive patterns in diffs: auth checks, input handling, database queries, file operations
+- Trace user-controllable input in changed code paths
+- Check if security controls were modified or bypassed
+
+For blackbox (no source code):
+- Focus on authentication and session management
+- Test the most critical user flows only
+- Check for obvious misconfigurations and exposed endpoints
+- Skip deep content discovery - test what's immediately accessible
+
+PHASE 3: HIGH-IMPACT VULNERABILITY FOCUS
+Prioritize in this order:
+1. Authentication bypass and broken access control
+2. Remote code execution vectors
+3. SQL injection in critical endpoints
+4. Insecure direct object references (IDOR) in sensitive resources
+5. Server-side request forgery (SSRF)
+6. Hardcoded credentials or secrets in code
+
+Skip lower-priority items:
+- Extensive subdomain enumeration
+- Full directory bruteforcing
+- Information disclosure that doesn't lead to exploitation
+- Theoretical vulnerabilities without PoC
+
+PHASE 4: VALIDATION AND REPORTING
+- Validate only critical/high severity findings with minimal PoC
+- Report findings as you discover them - don't wait for completion
+- Focus on exploitability and business impact
+
+QUICK CHAINING RULE:
+- If you find ANY strong primitive (auth weakness, access control gap, injection point, internal reachability), immediately attempt a single high-impact pivot to demonstrate real impact
+- Do not stop at a low-context “maybe”; turn it into a concrete exploit sequence (even if short) that reaches privileged action or sensitive data
+
+OPERATIONAL GUIDELINES:
+- Use the browser tool for quick manual testing of critical flows
+- Use terminal for targeted scans with fast presets (e.g., nuclei with critical/high templates only)
+- Use proxy to inspect traffic on key endpoints
+- Skip extensive fuzzing - use targeted payloads only
+- Create subagents only for parallel high-priority tasks
+- If whitebox: use file_edit tool to review specific suspicious code sections
+- Use notes tool to track critical findings only
+
+MINDSET:
+- Think like a time-boxed bug bounty hunter going for quick wins
+- Prioritize breadth over depth on critical areas
+- If something looks exploitable, validate quickly and move on
+- Don't get stuck - if an attack vector isn't yielding results quickly, pivot
+</scan_mode>
--- a/strix/skills/scan_modes/standard.jinja
+++ b/strix/skills/scan_modes/standard.jinja
@@ -0,0 +1,91 @@
+<scan_mode>
+STANDARD SCAN MODE - Balanced Security Assessment
+
+This mode provides thorough coverage with a structured methodology. Balance depth with efficiency.
+
+PHASE 1: RECONNAISSANCE AND MAPPING
+Understanding the target is critical before exploitation. Never skip this phase.
+
+For whitebox (source code available):
+- Map the entire codebase structure: directories, modules, entry points
+- Identify the application architecture (MVC, microservices, monolith)
+- Understand the routing: how URLs map to handlers/controllers
+- Identify all user input vectors: forms, APIs, file uploads, headers, cookies
+- Map authentication and authorization flows
+- Identify database interactions and ORM usage
+- Review dependency manifests for known vulnerable packages
+- Understand the data model and sensitive data locations
+
+For blackbox (no source code):
+- Crawl the application thoroughly using browser tool - interact with every feature
+- Enumerate all endpoints, parameters, and functionality
+- Identify the technology stack through fingerprinting
+- Map user roles and access levels
+- Understand the business logic by using the application as intended
+- Document all forms, APIs, and data entry points
+- Use proxy tool to capture and analyze all traffic during exploration
+
+PHASE 2: BUSINESS LOGIC UNDERSTANDING
+Before testing for vulnerabilities, understand what the application DOES:
+- What are the critical business flows? (payments, user registration, data access)
+- What actions should be restricted to specific roles?
+- What data should users NOT be able to access?
+- What state transitions exist? (order pending → paid → shipped)
+- Where does money, sensitive data, or privilege flow?
+
+PHASE 3: SYSTEMATIC VULNERABILITY ASSESSMENT
+Test each attack surface methodically. Create focused subagents for different areas.
+
+Entry Point Analysis:
+- Test all input fields for injection vulnerabilities
+- Check all API endpoints for authentication and authorization
+- Verify all file upload functionality for bypass
+- Test all search and filter functionality
+- Check redirect parameters and URL handling
+
+Authentication and Session:
+- Test login for brute force protection
+- Check session token entropy and handling
+- Test password reset flows for weaknesses
+- Verify logout invalidates sessions
+- Test for authentication bypass techniques
+
+Access Control:
+- For every privileged action, test as unprivileged user
+- Test horizontal access control (user A accessing user B's data)
+- Test vertical access control (user escalating to admin)
+- Check API endpoints mirror UI access controls
+- Test direct object references with different user contexts
+
+Business Logic:
+- Attempt to skip steps in multi-step processes
+- Test for race conditions in critical operations
+- Try negative values, zero values, boundary conditions
+- Attempt to replay transactions
+- Test for price manipulation in e-commerce flows
+
+PHASE 4: EXPLOITATION AND VALIDATION
+- Every finding must have a working proof-of-concept
+- Demonstrate actual impact, not theoretical risk
+- Chain vulnerabilities when possible to show maximum impact
+- Document the full attack path from initial access to impact
+- Use python tool for complex exploit development
+
+CHAINING & MAX IMPACT MINDSET:
+- Always ask: "If I can do X, what does that enable me to do next?" Keep pivoting until you reach maximum privilege or maximum sensitive data access
+- Prefer complete end-to-end paths (entry point → pivot → privileged action/data) over isolated bug reports
+- Use the application as a real user would: exploit must survive the actual workflow and state transitions
+- When you discover a useful pivot (info leak, weak boundary, partial access), immediately pursue the next step rather than stopping at the first win
+
+PHASE 5: COMPREHENSIVE REPORTING
+- Report all confirmed vulnerabilities with clear reproduction steps
+- Include severity based on actual exploitability and business impact
+- Provide remediation recommendations
+- Document any areas that need further investigation
+
+MINDSET:
+- Methodical and systematic - cover the full attack surface
+- Document as you go - findings and areas tested
+- Validate everything - no assumptions about exploitability
+- Think about business impact, not just technical severity
+</scan_mode>
--- a/strix/prompts/technologies/firebase_firestore.jinja
+++ b/strix/prompts/technologies/firebase_firestore.jinja
--- a/strix/prompts/technologies/supabase.jinja
+++ b/strix/prompts/technologies/supabase.jinja
--- a/strix/prompts/vulnerabilities/authentication_jwt.jinja
+++ b/strix/prompts/vulnerabilities/authentication_jwt.jinja
--- a/strix/prompts/vulnerabilities/broken_function_level_authorization.jinja
+++ b/strix/prompts/vulnerabilities/broken_function_level_authorization.jinja
--- a/strix/prompts/vulnerabilities/business_logic.jinja
+++ b/strix/prompts/vulnerabilities/business_logic.jinja
--- a/strix/prompts/vulnerabilities/csrf.jinja
+++ b/strix/prompts/vulnerabilities/csrf.jinja
--- a/strix/prompts/vulnerabilities/idor.jinja
+++ b/strix/prompts/vulnerabilities/idor.jinja
--- a/strix/prompts/vulnerabilities/information_disclosure.jinja
+++ b/strix/prompts/vulnerabilities/information_disclosure.jinja
--- a/strix/prompts/vulnerabilities/insecure_file_uploads.jinja
+++ b/strix/prompts/vulnerabilities/insecure_file_uploads.jinja
--- a/strix/prompts/vulnerabilities/mass_assignment.jinja
+++ b/strix/prompts/vulnerabilities/mass_assignment.jinja
--- a/strix/prompts/vulnerabilities/open_redirect.jinja
+++ b/strix/prompts/vulnerabilities/open_redirect.jinja
--- a/strix/prompts/vulnerabilities/path_traversal_lfi_rfi.jinja
+++ b/strix/prompts/vulnerabilities/path_traversal_lfi_rfi.jinja
--- a/strix/prompts/vulnerabilities/race_conditions.jinja
+++ b/strix/prompts/vulnerabilities/race_conditions.jinja
--- a/strix/prompts/vulnerabilities/rce.jinja
+++ b/strix/prompts/vulnerabilities/rce.jinja
--- a/strix/prompts/vulnerabilities/sql_injection.jinja
+++ b/strix/prompts/vulnerabilities/sql_injection.jinja
--- a/strix/prompts/vulnerabilities/ssrf.jinja
+++ b/strix/prompts/vulnerabilities/ssrf.jinja
--- a/strix/prompts/vulnerabilities/subdomain_takeover.jinja
+++ b/strix/prompts/vulnerabilities/subdomain_takeover.jinja
--- a/strix/prompts/vulnerabilities/xss.jinja
+++ b/strix/prompts/vulnerabilities/xss.jinja
--- a/strix/prompts/vulnerabilities/xxe.jinja
+++ b/strix/prompts/vulnerabilities/xxe.jinja
--- a/strix/telemetry/README.md
+++ b/strix/telemetry/README.md
@@ -0,0 +1,38 @@
+### Overview
+
+To help make Strix better for everyone, we collect anonymized data that helps us improve our AI security agent, guide the addition of new features, and fix common errors and bugs. This feedback loop is crucial for improving Strix's capabilities and user experience.
+
+We use [PostHog](https://posthog.com), an open-source analytics platform, for data collection and analysis. Our telemetry implementation is fully transparent - you can review the [source code](https://github.com/usestrix/strix/blob/main/strix/telemetry/posthog.py) to see exactly what we track.
+
+### Telemetry Policy
+
+Privacy is our priority. All collected data is anonymized by default. Each session gets a random UUID that is not persisted or tied to you. Your code, scan targets, vulnerability details, and findings always remain private and are never collected.
+
+### What We Track
+
+We collect only **basic** usage data, including:
+
+**Session Errors:** Duration and error types (not messages or stack traces)\
+**System Context:** OS type, architecture, Strix version\
+**Scan Context:** Scan mode (quick/standard/deep), scan type (whitebox/blackbox)\
+**Model Usage:** Which LLM model is being used (not prompts or responses)\
+**Aggregate Metrics:** Vulnerability counts by severity, agent/tool counts, token usage and cost estimates
+
+For complete transparency, you can inspect our [telemetry implementation](https://github.com/usestrix/strix/blob/main/strix/telemetry/posthog.py) to see the exact events we track.
+
+### What We **Never** Collect
+
+- IP addresses, usernames, or any identifying information
+- Scan targets, file paths, target URLs, or domains
+- Vulnerability details, descriptions, or code
+- LLM requests and responses
+
+### How to Opt Out
+
+Telemetry in Strix is entirely **optional**:
+
+```bash
+export STRIX_TELEMETRY=0
+```
+
+You can set this environment variable before running Strix to disable **all** telemetry.
--- a/strix/telemetry/__init__.py
+++ b/strix/telemetry/__init__.py
@@ -1,4 +1,10 @@
+from . import posthog
 from .tracer import Tracer, get_global_tracer, set_global_tracer


-__all__ = ["Tracer", "get_global_tracer", "set_global_tracer"]
+__all__ = [
+    "Tracer",
+    "get_global_tracer",
+    "posthog",
+    "set_global_tracer",
+]
--- a/strix/telemetry/posthog.py
+++ b/strix/telemetry/posthog.py
@@ -0,0 +1,137 @@
+import json
+import platform
+import sys
+import urllib.request
+from pathlib import Path
+from typing import TYPE_CHECKING, Any
+from uuid import uuid4
+
+from strix.config import Config
+
+
+if TYPE_CHECKING:
+    from strix.telemetry.tracer import Tracer
+
+_POSTHOG_PUBLIC_API_KEY = "phc_7rO3XRuNT5sgSKAl6HDIrWdSGh1COzxw0vxVIAR6vVZ"
+_POSTHOG_HOST = "https://us.i.posthog.com"
+
+_SESSION_ID = uuid4().hex[:16]
+
+
+def _is_enabled() -> bool:
+    return (Config.get("strix_telemetry") or "1").lower() not in ("0", "false", "no", "off")
+
+
+def _is_first_run() -> bool:
+    marker = Path.home() / ".strix" / ".seen"
+    if marker.exists():
+        return False
+    try:
+        marker.parent.mkdir(parents=True, exist_ok=True)
+        marker.touch()
+    except Exception:  # noqa: BLE001, S110
+        pass  # nosec B110
+    return True
+
+
+def _get_version() -> str:
+    try:
+        from importlib.metadata import version
+
+        return version("strix-agent")
+    except Exception:  # noqa: BLE001
+        return "unknown"
+
+
+def _send(event: str, properties: dict[str, Any]) -> None:
+    if not _is_enabled():
+        return
+    try:
+        payload = {
+            "api_key": _POSTHOG_PUBLIC_API_KEY,
+            "event": event,
+            "distinct_id": _SESSION_ID,
+            "properties": properties,
+        }
+        req = urllib.request.Request(  # noqa: S310
+            f"{_POSTHOG_HOST}/capture/",
+            data=json.dumps(payload).encode(),
+            headers={"Content-Type": "application/json"},
+        )
+        with urllib.request.urlopen(req, timeout=10):  # noqa: S310  # nosec B310
+            pass
+    except Exception:  # noqa: BLE001, S110
+        pass  # nosec B110
+
+
+def _base_props() -> dict[str, Any]:
+    return {
+        "os": platform.system().lower(),
+        "arch": platform.machine(),
+        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
+        "strix_version": _get_version(),
+    }
+
+
+def start(
+    model: str | None,
+    scan_mode: str | None,
+    is_whitebox: bool,
+    interactive: bool,
+    has_instructions: bool,
+) -> None:
+    _send(
+        "scan_started",
+        {
+            **_base_props(),
+            "model": model or "unknown",
+            "scan_mode": scan_mode or "unknown",
+            "scan_type": "whitebox" if is_whitebox else "blackbox",
+            "interactive": interactive,
+            "has_instructions": has_instructions,
+            "first_run": _is_first_run(),
+        },
+    )
+
+
+def finding(severity: str) -> None:
+    _send(
+        "finding_reported",
+        {
+            **_base_props(),
+            "severity": severity.lower(),
+        },
+    )
+
+
+def end(tracer: "Tracer", exit_reason: str = "completed") -> None:
+    vulnerabilities_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
+    for v in tracer.vulnerability_reports:
+        sev = v.get("severity", "info").lower()
+        if sev in vulnerabilities_counts:
+            vulnerabilities_counts[sev] += 1
+
+    llm = tracer.get_total_llm_stats()
+    total = llm.get("total", {})
+
+    _send(
+        "scan_ended",
+        {
+            **_base_props(),
+            "exit_reason": exit_reason,
+            "duration_seconds": round(tracer._calculate_duration()),
+            "vulnerabilities_total": len(tracer.vulnerability_reports),
+            **{f"vulnerabilities_{k}": v for k, v in vulnerabilities_counts.items()},
+            "agent_count": len(tracer.agents),
+            "tool_count": tracer.get_real_tool_count(),
+            "llm_tokens": llm.get("total_tokens", 0),
+            "llm_cost": total.get("cost", 0.0),
+        },
+    )
+
+
+def error(error_type: str, error_msg: str | None = None) -> None:
+    props = {**_base_props(), "error_type": error_type}
+    if error_msg:
+        props["error_msg"] = error_msg
+    _send("error", props)
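
Note on `_send` above: to see exactly what would leave the machine, the request body can be rebuilt without any network call. The sketch below copies the payload shape from `_send`; `build_capture_request`, the placeholder API key, and the sample properties are illustrative assumptions and not part of the patch.

```python
import json
from typing import Any
from uuid import uuid4


def build_capture_request(
    event: str,
    properties: dict[str, Any],
    api_key: str,
    distinct_id: str,
) -> bytes:
    """Serialize one event the way `_send` does, without touching the network."""
    payload = {
        "api_key": api_key,
        "event": event,
        "distinct_id": distinct_id,
        "properties": properties,
    }
    return json.dumps(payload).encode()


# Hypothetical values for illustration only.
session_id = uuid4().hex[:16]  # same truncation as _SESSION_ID in the diff
body = build_capture_request(
    "finding_reported",
    {"os": "linux", "severity": "high"},
    api_key="phc_example",
    distinct_id=session_id,
)
decoded = json.loads(body)
```

Decoding the bytes again, as done in the last line, is a cheap way to audit which keys an event carries before it is posted.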
--- a/strix/telemetry/tracer.py
+++ b/strix/telemetry/tracer.py
@@ -4,6 +4,8 @@ from pathlib import Path
 from typing import TYPE_CHECKING, Any, Optional
 from uuid import uuid4

+from strix.telemetry import posthog
+

 if TYPE_CHECKING:
    from collections.abc import Callable
@@ -33,6 +35,8 @@ class Tracer:
        self.agents: dict[str, dict[str, Any]] = {}
        self.tool_executions: dict[int, dict[str, Any]] = {}
        self.chat_messages: list[dict[str, Any]] = []
+        self.streaming_content: dict[str, str] = {}
+        self.interrupted_content: dict[str, str] = {}

        self.vulnerability_reports: list[dict[str, Any]] = []
        self.final_scan_result: str | None = None
@@ -52,7 +56,7 @@ class Tracer:
        self._next_message_id = 1
        self._saved_vuln_ids: set[str] = set()

-        self.vulnerability_found_callback: Callable[[str, str, str, str], None] | None = None
+        self.vulnerability_found_callback: Callable[[dict[str, Any]], None] | None = None

    def set_run_name(self, run_name: str) -> None:
        self.run_name = run_name
@@ -69,48 +73,118 @@ class Tracer:

        return self._run_dir

-    def add_vulnerability_report(
+    def add_vulnerability_report(  # noqa: PLR0912
        self,
        title: str,
-        content: str,
        severity: str,
+        description: str | None = None,
+        impact: str | None = None,
+        target: str | None = None,
+        technical_analysis: str | None = None,
+        poc_description: str | None = None,
+        poc_script_code: str | None = None,
+        remediation_steps: str | None = None,
+        cvss: float | None = None,
+        cvss_breakdown: dict[str, str] | None = None,
+        endpoint: str | None = None,
+        method: str | None = None,
+        cve: str | None = None,
+        code_file: str | None = None,
+        code_before: str | None = None,
+        code_after: str | None = None,
+        code_diff: str | None = None,
    ) -> str:
        report_id = f"vuln-{len(self.vulnerability_reports) + 1:04d}"

-        report = {
+        report: dict[str, Any] = {
            "id": report_id,
            "title": title.strip(),
-            "content": content.strip(),
            "severity": severity.lower().strip(),
            "timestamp": datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S UTC"),
        }

+        if description:
+            report["description"] = description.strip()
+        if impact:
+            report["impact"] = impact.strip()
+        if target:
+            report["target"] = target.strip()
+        if technical_analysis:
+            report["technical_analysis"] = technical_analysis.strip()
+        if poc_description:
+            report["poc_description"] = poc_description.strip()
+        if poc_script_code:
+            report["poc_script_code"] = poc_script_code.strip()
+        if remediation_steps:
+            report["remediation_steps"] = remediation_steps.strip()
+        if cvss is not None:
+            report["cvss"] = cvss
+        if cvss_breakdown:
+            report["cvss_breakdown"] = cvss_breakdown
+        if endpoint:
+            report["endpoint"] = endpoint.strip()
+        if method:
+            report["method"] = method.strip()
+        if cve:
+            report["cve"] = cve.strip()
+        if code_file:
+            report["code_file"] = code_file.strip()
+        if code_before:
+            report["code_before"] = code_before.strip()
+        if code_after:
+            report["code_after"] = code_after.strip()
+        if code_diff:
+            report["code_diff"] = code_diff.strip()
+
        self.vulnerability_reports.append(report)
        logger.info(f"Added vulnerability report: {report_id} - {title}")
+        posthog.finding(severity)

        if self.vulnerability_found_callback:
-            self.vulnerability_found_callback(
-                report_id, title.strip(), content.strip(), severity.lower().strip()
-            )
+            self.vulnerability_found_callback(report)

        self.save_run_data()
        return report_id

-    def set_final_scan_result(
-        self,
-        content: str,
-        success: bool = True,
-    ) -> None:
-        self.final_scan_result = content.strip()
+    def get_existing_vulnerabilities(self) -> list[dict[str, Any]]:
+        return list(self.vulnerability_reports)

+    def update_scan_final_fields(
+        self,
+        executive_summary: str,
+        methodology: str,
+        technical_analysis: str,
+        recommendations: str,
+    ) -> None:
        self.scan_results = {
            "scan_completed": True,
-            "content": content,
-            "success": success,
+            "executive_summary": executive_summary.strip(),
+            "methodology": methodology.strip(),
+            "technical_analysis": technical_analysis.strip(),
+            "recommendations": recommendations.strip(),
+            "success": True,
        }

-        logger.info(f"Set final scan result: success={success}")
+        self.final_scan_result = f"""# Executive Summary
+
+{executive_summary.strip()}
+
+# Methodology
+
+{methodology.strip()}
+
+# Technical Analysis
+
+{technical_analysis.strip()}
+
+# Recommendations
+
+{recommendations.strip()}
+"""
+
+        logger.info("Updated scan final fields")
        self.save_run_data(mark_complete=True)
+        posthog.end(self, exit_reason="finished_by_tool")

    def log_agent_creation(
        self, agent_id: str, name: str, task: str, parent_id: str | None = None
@@ -202,7 +276,7 @@ class Tracer:
        )
        self.get_run_dir()

-    def save_run_data(self, mark_complete: bool = False) -> None:
+    def save_run_data(self, mark_complete: bool = False) -> None:  # noqa: PLR0912, PLR0915
        try:
            run_dir = self.get_run_dir()
            if mark_complete:
@@ -230,42 +304,89 @@ class Tracer:
                    if report["id"] not in self._saved_vuln_ids
                ]

+                severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
+                sorted_reports = sorted(
+                    self.vulnerability_reports,
+                    key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
+                )
+
                for report in new_reports:
                    vuln_file = vuln_dir / f"{report['id']}.md"
                    with vuln_file.open("w", encoding="utf-8") as f:
-                        f.write(f"# {report['title']}\n\n")
-                        f.write(f"**ID:** {report['id']}\n")
-                        f.write(f"**Severity:** {report['severity'].upper()}\n")
-                        f.write(f"**Found:** {report['timestamp']}\n\n")
-                        f.write("## Description\n\n")
-                        f.write(f"{report['content']}\n")
+                        f.write(f"# {report.get('title', 'Untitled Vulnerability')}\n\n")
+                        f.write(f"**ID:** {report.get('id', 'unknown')}\n")
+                        f.write(f"**Severity:** {report.get('severity', 'unknown').upper()}\n")
+                        f.write(f"**Found:** {report.get('timestamp', 'unknown')}\n")
+
+                        metadata_fields: list[tuple[str, Any]] = [
+                            ("Target", report.get("target")),
+                            ("Endpoint", report.get("endpoint")),
+                            ("Method", report.get("method")),
+                            ("CVE", report.get("cve")),
+                        ]
+                        cvss_score = report.get("cvss")
+                        if cvss_score is not None:
+                            metadata_fields.append(("CVSS", cvss_score))
+
+                        for label, value in metadata_fields:
+                            if value:
+                                f.write(f"**{label}:** {value}\n")
+
+                        f.write("\n## Description\n\n")
+                        desc = report.get("description") or "No description provided."
+                        f.write(f"{desc}\n\n")
+
+                        if report.get("impact"):
+                            f.write("## Impact\n\n")
+                            f.write(f"{report['impact']}\n\n")
+
+                        if report.get("technical_analysis"):
+                            f.write("## Technical Analysis\n\n")
+                            f.write(f"{report['technical_analysis']}\n\n")
+
+                        if report.get("poc_description") or report.get("poc_script_code"):
+                            f.write("## Proof of Concept\n\n")
+                            if report.get("poc_description"):
+                                f.write(f"{report['poc_description']}\n\n")
+                            if report.get("poc_script_code"):
+                                f.write("```\n")
+                                f.write(f"{report['poc_script_code']}\n")
+                                f.write("```\n\n")
+
+                        if report.get("code_file") or report.get("code_diff"):
+                            f.write("## Code Analysis\n\n")
+                            if report.get("code_file"):
+                                f.write(f"**File:** {report['code_file']}\n\n")
+                            if report.get("code_diff"):
+                                f.write("**Changes:**\n")
+                                f.write("```diff\n")
+                                f.write(f"{report['code_diff']}\n")
+                                f.write("```\n\n")
+
+                        if report.get("remediation_steps"):
+                            f.write("## Remediation\n\n")
+                            f.write(f"{report['remediation_steps']}\n\n")
+
                    self._saved_vuln_ids.add(report["id"])

-                if self.vulnerability_reports:
-                    severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
-                    sorted_reports = sorted(
-                        self.vulnerability_reports,
-                        key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
-                    )
+                vuln_csv_file = run_dir / "vulnerabilities.csv"
+                with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
+                    import csv

-                    vuln_csv_file = run_dir / "vulnerabilities.csv"
-                    with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
-                        import csv
+                    fieldnames = ["id", "title", "severity", "timestamp", "file"]
+                    writer = csv.DictWriter(f, fieldnames=fieldnames)
+                    writer.writeheader()

-                        fieldnames = ["id", "title", "severity", "timestamp", "file"]
-                        writer = csv.DictWriter(f, fieldnames=fieldnames)
-                        writer.writeheader()
-
-                        for report in sorted_reports:
-                            writer.writerow(
-                                {
-                                    "id": report["id"],
-                                    "title": report["title"],
-                                    "severity": report["severity"].upper(),
-                                    "timestamp": report["timestamp"],
-                                    "file": f"vulnerabilities/{report['id']}.md",
-                                }
-                            )
+                    for report in sorted_reports:
+                        writer.writerow(
+                            {
+                                "id": report["id"],
+                                "title": report["title"],
+                                "severity": report["severity"].upper(),
+                                "timestamp": report["timestamp"],
+                                "file": f"vulnerabilities/{report['id']}.md",
+                            }
+                        )

                if new_reports:
                    logger.info(
@@ -291,14 +412,14 @@ class Tracer:
    def get_agent_tools(self, agent_id: str) -> list[dict[str, Any]]:
        return [
            exec_data
-            for exec_data in self.tool_executions.values()
+            for exec_data in list(self.tool_executions.values())
            if exec_data.get("agent_id") == agent_id
        ]

    def get_real_tool_count(self) -> int:
        return sum(
            1
-            for exec_data in self.tool_executions.values()
+            for exec_data in list(self.tool_executions.values())
            if exec_data.get("tool_name") not in ["scan_start_info", "subagent_start_info"]
        )

@@ -333,5 +454,28 @@ class Tracer:
            "total_tokens": total_stats["input_tokens"] + total_stats["output_tokens"],
        }

+    def update_streaming_content(self, agent_id: str, content: str) -> None:
+        self.streaming_content[agent_id] = content
+
+    def clear_streaming_content(self, agent_id: str) -> None:
+        self.streaming_content.pop(agent_id, None)
+
+    def get_streaming_content(self, agent_id: str) -> str | None:
+        return self.streaming_content.get(agent_id)
+
+    def finalize_streaming_as_interrupted(self, agent_id: str) -> str | None:
+        content = self.streaming_content.pop(agent_id, None)
+        if content and content.strip():
+            self.interrupted_content[agent_id] = content
+            self.log_chat_message(
+                content=content,
+                role="assistant",
+                agent_id=agent_id,
+                metadata={"interrupted": True},
+            )
+            return content
+
+        return self.interrupted_content.pop(agent_id, None)
+
    def cleanup(self) -> None:
        self.save_run_data(mark_complete=True)
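
Note on the `vulnerability_found_callback` change in this file: the callback now receives the whole report dict instead of four positional strings. Code written against the old signature could be bridged with a small adapter like the sketch below. `adapt_legacy_callback` is a hypothetical helper, and mapping the removed `content` argument to the optional `description` field is an assumption, not something the patch prescribes.

```python
from typing import Any, Callable

LegacyCallback = Callable[[str, str, str, str], None]
ReportCallback = Callable[[dict[str, Any]], None]


def adapt_legacy_callback(legacy: LegacyCallback) -> ReportCallback:
    """Wrap an old four-argument callback so it accepts the new report dict."""

    def wrapper(report: dict[str, Any]) -> None:
        # `content` no longer exists; `description` is optional, so default to "".
        legacy(
            report["id"],
            report["title"],
            report.get("description", ""),
            report["severity"],
        )

    return wrapper


seen: list[tuple[str, ...]] = []
callback = adapt_legacy_callback(lambda *args: seen.append(args))
callback({"id": "vuln-0001", "title": "Example", "severity": "high"})
```

The keys `id`, `title`, and `severity` are always set by `add_vulnerability_report`; every other field should be read with `.get()` because it is only present when supplied.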
--- a/strix/tools/__init__.py
+++ b/strix/tools/__init__.py
@@ -1,5 +1,7 @@
 import os

+from strix.config import Config
+
 from .executor import (
    execute_tool,
    execute_tool_invocation,
@@ -22,11 +24,15 @@ from .registry import (

 SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"

-HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))
+HAS_PERPLEXITY_API = bool(Config.get("perplexity_api_key"))
+
+DISABLE_BROWSER = (Config.get("strix_disable_browser") or "false").lower() == "true"

 if not SANDBOX_MODE:
    from .agents_graph import *  # noqa: F403
-    from .browser import *  # noqa: F403
+
+    if not DISABLE_BROWSER:
+        from .browser import *  # noqa: F403
    from .file_edit import *  # noqa: F403
    from .finish import *  # noqa: F403
    from .notes import *  # noqa: F403
@@ -35,13 +41,14 @@ if not SANDBOX_MODE:
    from .reporting import *  # noqa: F403
    from .terminal import *  # noqa: F403
    from .thinking import *  # noqa: F403
+    from .todo import *  # noqa: F403

    if HAS_PERPLEXITY_API:
        from .web_search import *  # noqa: F403
 else:
-    from .browser import *  # noqa: F403
+    if not DISABLE_BROWSER:
+        from .browser import *  # noqa: F403
    from .file_edit import *  # noqa: F403
-    from .notes import *  # noqa: F403
    from .proxy import *  # noqa: F403
    from .python import *  # noqa: F403
    from .terminal import *  # noqa: F403
--- a/strix/tools/agents_graph/agents_graph_actions.py
+++ b/strix/tools/agents_graph/agents_graph_actions.py
@@ -190,36 +190,35 @@ def create_agent(
    task: str,
    name: str,
    inherit_context: bool = True,
-    prompt_modules: str | None = None,
+    skills: str | None = None,
 ) -> dict[str, Any]:
    try:
        parent_id = agent_state.agent_id

-        module_list = []
-        if prompt_modules:
-            module_list = [m.strip() for m in prompt_modules.split(",") if m.strip()]
+        skill_list = []
+        if skills:
+            skill_list = [s.strip() for s in skills.split(",") if s.strip()]

-        if len(module_list) > 5:
+        if len(skill_list) > 5:
            return {
                "success": False,
                "error": (
-                    "Cannot specify more than 5 prompt modules for an agent "
-                    "(use comma-separated format)"
+                    "Cannot specify more than 5 skills for an agent (use comma-separated format)"
                ),
                "agent_id": None,
            }

-        if module_list:
-            from strix.prompts import get_all_module_names, validate_module_names
+        if skill_list:
+            from strix.skills import get_all_skill_names, validate_skill_names

-            validation = validate_module_names(module_list)
+            validation = validate_skill_names(skill_list)
            if validation["invalid"]:
-                available_modules = list(get_all_module_names())
+                available_skills = list(get_all_skill_names())
                return {
                    "success": False,
                    "error": (
-                        f"Invalid prompt modules: {validation['invalid']}. "
-                        f"Available modules: {', '.join(available_modules)}"
+                        f"Invalid skills: {validation['invalid']}. "
+                        f"Available skills: {', '.join(available_skills)}"
                    ),
                    "agent_id": None,
                }
@@ -233,14 +232,14 @@ def create_agent(
        parent_agent = _agent_instances.get(parent_id)

        timeout = None
-        if (
-            parent_agent
-            and hasattr(parent_agent, "llm_config")
-            and hasattr(parent_agent.llm_config, "timeout")
-        ):
-            timeout = parent_agent.llm_config.timeout
+        scan_mode = "deep"
+        if parent_agent and hasattr(parent_agent, "llm_config"):
+            if hasattr(parent_agent.llm_config, "timeout"):
+                timeout = parent_agent.llm_config.timeout
+            if hasattr(parent_agent.llm_config, "scan_mode"):
+                scan_mode = parent_agent.llm_config.scan_mode

-        llm_config = LLMConfig(prompt_modules=module_list, timeout=timeout)
+        llm_config = LLMConfig(skills=skill_list, timeout=timeout, scan_mode=scan_mode)

        agent_config = {
            "llm_config": llm_config,
--- a/strix/tools/agents_graph/agents_graph_actions_schema.xml
+++ b/strix/tools/agents_graph/agents_graph_actions_schema.xml
@@ -79,8 +79,8 @@ Only create a new agent if no existing agent is handling the specific task.</des
      <parameter name="inherit_context" type="boolean" required="false">
        <description>Whether the new agent should inherit parent's conversation history and context</description>
      </parameter>
-      <parameter name="prompt_modules" type="string" required="false">
-        <description>Comma-separated list of prompt modules to use for the agent (MAXIMUM 5 modules allowed). Most agents should have at least one module in order to be useful. Agents should be highly specialized - use 1-3 related modules; up to 5 for complex contexts. {{DYNAMIC_MODULES_DESCRIPTION}}</description>
+      <parameter name="skills" type="string" required="false">
+        <description>Comma-separated list of skills to use for the agent (MAXIMUM 5 skills allowed). Most agents should have at least one skill in order to be useful. Agents should be highly specialized - use 1-3 related skills; up to 5 for complex contexts. {{DYNAMIC_SKILLS_DESCRIPTION}}</description>
      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
@@ -92,30 +92,30 @@ Only create a new agent if no existing agent is handling the specific task.</des
  <parameter=task>Validate and exploit the suspected SQL injection vulnerability found in
              the login form. Confirm exploitability and document proof of concept.</parameter>
  <parameter=name>SQLi Validator</parameter>
-  <parameter=prompt_modules>sql_injection</parameter>
+  <parameter=skills>sql_injection</parameter>
  </function>

  <function=create_agent>
  <parameter=task>Test authentication mechanisms, JWT implementation, and session management
              for security vulnerabilities and bypass techniques.</parameter>
  <parameter=name>Auth Specialist</parameter>
-  <parameter=prompt_modules>authentication_jwt, business_logic</parameter>
+  <parameter=skills>authentication_jwt, business_logic</parameter>
  </function>

-  # Example of single-module specialization (most focused)
+  # Example of single-skill specialization (most focused)
  <function=create_agent>
  <parameter=task>Perform comprehensive XSS testing including reflected, stored, and DOM-based
              variants across all identified input points.</parameter>
  <parameter=name>XSS Specialist</parameter>
-  <parameter=prompt_modules>xss</parameter>
+  <parameter=skills>xss</parameter>
  </function>

-  # Example of up to 5 related modules (borderline acceptable)
+  # Example of up to 5 related skills (borderline acceptable)
  <function=create_agent>
  <parameter=task>Test for server-side vulnerabilities including SSRF, XXE, and potential
              RCE vectors in file upload and XML processing endpoints.</parameter>
  <parameter=name>Server-Side Attack Specialist</parameter>
-  <parameter=prompt_modules>ssrf, xxe, rce</parameter>
+  <parameter=skills>ssrf, xxe, rce</parameter>
  </function>
    </examples>
  </tool>
--- a/strix/tools/browser/browser_actions.py
+++ b/strix/tools/browser/browser_actions.py
@@ -1,8 +1,10 @@
-from typing import Any, Literal, NoReturn
+from typing import TYPE_CHECKING, Any, Literal, NoReturn

 from strix.tools.registry import register_tool

-from .tab_manager import BrowserTabManager, get_browser_tab_manager
+
+if TYPE_CHECKING:
+    from .tab_manager import BrowserTabManager


 BrowserAction = Literal[
@@ -71,7 +73,7 @@ def _validate_file_path(action_name: str, file_path: str | None) -> None:


 def _handle_navigation_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    url: str | None = None,
    tab_id: str | None = None,
@@ -90,7 +92,7 @@ def _handle_navigation_actions(


 def _handle_interaction_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    coordinate: str | None = None,
    text: str | None = None,
@@ -128,7 +130,7 @@ def _raise_unknown_action(action: str) -> NoReturn:


 def _handle_tab_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    url: str | None = None,
    tab_id: str | None = None,
@@ -149,7 +151,7 @@ def _handle_tab_actions(


 def _handle_utility_actions(
-    manager: BrowserTabManager,
+    manager: "BrowserTabManager",
    action: str,
    duration: float | None = None,
    js_code: str | None = None,
@@ -191,6 +193,8 @@ def browser_action(
    file_path: str | None = None,
    clear: bool = False,
 ) -> dict[str, Any]:
+    from .tab_manager import get_browser_tab_manager
+
    manager = get_browser_tab_manager()

    try:
--- a/strix/tools/executor.py
+++ b/strix/tools/executor.py
@@ -4,6 +4,8 @@ from typing import Any

 import httpx

+from strix.config import Config
+

 if os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "false":
    from strix.runtime import get_runtime
@@ -17,6 +19,10 @@ from .registry import (
 )


+SANDBOX_EXECUTION_TIMEOUT = float(Config.get("strix_sandbox_execution_timeout") or "500")
+SANDBOX_CONNECT_TIMEOUT = float(Config.get("strix_sandbox_connect_timeout") or "10")
+
+
 async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
    execute_in_sandbox = should_execute_in_sandbox(tool_name)
    sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
@@ -62,10 +68,15 @@ async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: A
        "Content-Type": "application/json",
    }

+    timeout = httpx.Timeout(
+        timeout=SANDBOX_EXECUTION_TIMEOUT,
+        connect=SANDBOX_CONNECT_TIMEOUT,
+    )
+
    async with httpx.AsyncClient(trust_env=False) as client:
        try:
            response = await client.post(
-                request_url, json=request_data, headers=headers, timeout=None
+                request_url, json=request_data, headers=headers, timeout=timeout
            )
            response.raise_for_status()
            response_data = response.json()
--- a/strix/tools/file_edit/file_edit_actions.py
+++ b/strix/tools/file_edit/file_edit_actions.py
@@ -3,9 +3,6 @@ import re
 from pathlib import Path
 from typing import Any, cast

-from openhands_aci import file_editor
-from openhands_aci.utils.shell import run_shell_cmd
-
 from strix.tools.registry import register_tool


@@ -33,6 +30,8 @@ def str_replace_editor(
    new_str: str | None = None,
    insert_line: int | None = None,
 ) -> dict[str, Any]:
+    from openhands_aci import file_editor
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
@@ -64,6 +63,8 @@ def list_files(
    path: str,
    recursive: bool = False,
 ) -> dict[str, Any]:
+    from openhands_aci.utils.shell import run_shell_cmd
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
@@ -116,6 +117,8 @@ def search_files(
    regex: str,
    file_pattern: str = "*",
 ) -> dict[str, Any]:
+    from openhands_aci.utils.shell import run_shell_cmd
+
    try:
        path_obj = Path(path)
        if not path_obj.is_absolute():
--- a/strix/tools/finish/finish_actions.py
+++ b/strix/tools/finish/finish_actions.py
@@ -4,49 +4,40 @@ from strix.tools.registry import register_tool


 def _validate_root_agent(agent_state: Any) -> dict[str, Any] | None:
-    if (
-        agent_state is not None
-        and hasattr(agent_state, "parent_id")
-        and agent_state.parent_id is not None
-    ):
+    if agent_state and hasattr(agent_state, "parent_id") and agent_state.parent_id is not None:
        return {
            "success": False,
-            "message": (
-                "This tool can only be used by the root/main agent. "
-                "Subagents must use agent_finish instead."
-            ),
+            "error": "finish_scan_wrong_agent",
+            "message": "This tool can only be used by the root/main agent",
+            "suggestion": "If you are a subagent, use agent_finish from agents_graph tool instead",
        }
    return None


-def _validate_content(content: str) -> dict[str, Any] | None:
-    if not content or not content.strip():
-        return {"success": False, "message": "Content cannot be empty"}
-    return None
-
-
 def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
    try:
        from strix.tools.agents_graph.agents_graph_actions import _agent_graph

-        current_agent_id = None
-        if agent_state and hasattr(agent_state, "agent_id"):
+        if agent_state and agent_state.agent_id:
            current_agent_id = agent_state.agent_id
+        else:
+            return None

-        running_agents = []
+        active_agents = []
        stopping_agents = []

-        for agent_id, node in _agent_graph.get("nodes", {}).items():
+        for agent_id, node in _agent_graph["nodes"].items():
            if agent_id == current_agent_id:
                continue

-            status = node.get("status", "")
+            status = node.get("status", "unknown")
            if status == "running":
-                running_agents.append(
+                active_agents.append(
                    {
                        "id": agent_id,
                        "name": node.get("name", "Unknown"),
-                        "task": node.get("task", "No task description"),
+                        "task": node.get("task", "Unknown task")[:300],
+                        "status": status,
                    }
                )
            elif status == "stopping":
@@ -54,121 +45,105 @@ def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
                    {
                        "id": agent_id,
                        "name": node.get("name", "Unknown"),
+                        "task": node.get("task", "Unknown task")[:300],
+                        "status": status,
                    }
                )

-        if running_agents or stopping_agents:
-            message_parts = ["Cannot finish scan while other agents are still active:"]
-
-            if running_agents:
-                message_parts.append("\n\nRunning agents:")
-                message_parts.extend(
-                    [
-                        f"  - {agent['name']} ({agent['id']}): {agent['task']}"
-                        for agent in running_agents
-                    ]
-                )
-
-            if stopping_agents:
-                message_parts.append("\n\nStopping agents:")
-                message_parts.extend(
-                    [f"  - {agent['name']} ({agent['id']})" for agent in stopping_agents]
-                )
-
-            message_parts.extend(
-                [
-                    "\n\nSuggested actions:",
-                    "1. Use wait_for_message to wait for all agents to complete",
-                    "2. Send messages to agents asking them to finish if urgent",
-                    "3. Use view_agent_graph to monitor agent status",
-                ]
-            )
-
-            return {
+        if active_agents or stopping_agents:
+            response: dict[str, Any] = {
                "success": False,
-                "message": "\n".join(message_parts),
-                "active_agents": {
-                    "running": len(running_agents),
-                    "stopping": len(stopping_agents),
-                    "details": {
-                        "running": running_agents,
-                        "stopping": stopping_agents,
-                    },
-                },
+                "error": "agents_still_active",
+                "message": "Cannot finish scan: agents are still active",
            }

+            if active_agents:
+                response["active_agents"] = active_agents
+
+            if stopping_agents:
+                response["stopping_agents"] = stopping_agents
+
+            response["suggestions"] = [
+                "Use wait_for_message to wait for all agents to complete",
+                "Use send_message_to_agent if you need agents to complete immediately",
+                "Check agent_status to see current agent states",
+            ]
+
+            response["total_active"] = len(active_agents) + len(stopping_agents)
+
+            return response
+
    except ImportError:
+        pass
+    except Exception:
        import logging

-        logging.warning("Could not check agent graph status - agents_graph module unavailable")
+        logging.exception("Error checking active agents")

    return None


-def _finalize_with_tracer(content: str, success: bool) -> dict[str, Any]:
+@register_tool(sandbox_execution=False)
+def finish_scan(
+    executive_summary: str,
+    methodology: str,
+    technical_analysis: str,
+    recommendations: str,
+    agent_state: Any = None,
+) -> dict[str, Any]:
+    validation_error = _validate_root_agent(agent_state)
+    if validation_error:
+        return validation_error
+
+    active_agents_error = _check_active_agents(agent_state)
+    if active_agents_error:
+        return active_agents_error
+
+    validation_errors = []
+
+    if not executive_summary or not executive_summary.strip():
+        validation_errors.append("Executive summary cannot be empty")
+    if not methodology or not methodology.strip():
+        validation_errors.append("Methodology cannot be empty")
+    if not technical_analysis or not technical_analysis.strip():
+        validation_errors.append("Technical analysis cannot be empty")
+    if not recommendations or not recommendations.strip():
+        validation_errors.append("Recommendations cannot be empty")
+
+    if validation_errors:
+        return {"success": False, "message": "Validation failed", "errors": validation_errors}
+
    try:
        from strix.telemetry.tracer import get_global_tracer

        tracer = get_global_tracer()
        if tracer:
-            tracer.set_final_scan_result(
-                content=content.strip(),
-                success=success,
+            tracer.update_scan_final_fields(
+                executive_summary=executive_summary.strip(),
+                methodology=methodology.strip(),
+                technical_analysis=technical_analysis.strip(),
+                recommendations=recommendations.strip(),
            )

+            vulnerability_count = len(tracer.vulnerability_reports)
+
            return {
                "success": True,
                "scan_completed": True,
-                "message": "Scan completed successfully"
-                if success
-                else "Scan completed with errors",
-                "vulnerabilities_found": len(tracer.vulnerability_reports),
+                "message": "Scan completed successfully",
+                "vulnerabilities_found": vulnerability_count,
            }

        import logging

-        logging.warning("Global tracer not available - final scan result not stored")
+        logging.warning("Current tracer not available - scan results not stored")

-        return {  # noqa: TRY300
-            "success": True,
-            "scan_completed": True,
-            "message": "Scan completed successfully (not persisted)"
-            if success
-            else "Scan completed with errors (not persisted)",
-            "warning": "Final result could not be persisted - tracer unavailable",
-        }
-
-    except ImportError:
+    except (ImportError, AttributeError) as e:
+        return {"success": False, "message": f"Failed to complete scan: {e!s}"}
+    else:
        return {
            "success": True,
            "scan_completed": True,
-            "message": "Scan completed successfully (not persisted)"
-            if success
-            else "Scan completed with errors (not persisted)",
-            "warning": "Final result could not be persisted - tracer module unavailable",
+            "message": "Scan completed (not persisted)",
+            "warning": "Results could not be persisted - tracer unavailable",
        }
-
-
-@register_tool(sandbox_execution=False)
-def finish_scan(
-    content: str,
-    success: bool = True,
-    agent_state: Any = None,
-) -> dict[str, Any]:
-    try:
-        validation_error = _validate_root_agent(agent_state)
-        if validation_error:
-            return validation_error
-
-        validation_error = _validate_content(content)
-        if validation_error:
-            return validation_error
-
-        active_agents_error = _check_active_agents(agent_state)
-        if active_agents_error:
-            return active_agents_error
-
-        return _finalize_with_tracer(content, success)
-
-    except (ValueError, TypeError, KeyError) as e:
-        return {"success": False, "message": f"Failed to complete scan: {e!s}"}
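Note on the hunks above: `_check_active_agents` now returns a structured payload (`error`, `active_agents`, `stopping_agents`, `suggestions`, `total_active`) instead of a preformatted message string. The sketch below reproduces that shape against a plain dictionary so it can be run in isolation. The `graph` argument and the function name `check_active_agents` are illustrative assumptions; the real code reads the module-level `_agent_graph`.

```python
from __future__ import annotations

from typing import Any


def check_active_agents(graph: dict[str, Any], current_agent_id: str) -> dict[str, Any] | None:
    """Return an error payload if any other agent is running or stopping."""
    active: list[dict[str, Any]] = []
    stopping: list[dict[str, Any]] = []

    for agent_id, node in graph["nodes"].items():
        if agent_id == current_agent_id:
            continue  # the caller itself is expected to be running

        status = node.get("status", "unknown")
        entry = {
            "id": agent_id,
            "name": node.get("name", "Unknown"),
            "task": node.get("task", "Unknown task")[:300],  # cap long task text
            "status": status,
        }
        if status == "running":
            active.append(entry)
        elif status == "stopping":
            stopping.append(entry)

    if not (active or stopping):
        return None

    response: dict[str, Any] = {
        "success": False,
        "error": "agents_still_active",
        "message": "Cannot finish scan: agents are still active",
    }
    if active:
        response["active_agents"] = active
    if stopping:
        response["stopping_agents"] = stopping
    response["total_active"] = len(active) + len(stopping)
    return response


if __name__ == "__main__":
    demo = {
        "nodes": {
            "root": {"status": "running", "name": "Root"},
            "a1": {"status": "running", "name": "Recon", "task": "enumerate hosts"},
            "a2": {"status": "completed", "name": "Auth"},
        }
    }
    print(check_active_agents(demo, "root"))
```

Keeping the agent lists as data rather than folding them into one string lets the caller decide how to present them.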
--- a/strix/tools/finish/finish_actions_schema.xml
+++ b/strix/tools/finish/finish_actions_schema.xml
@@ -1,6 +1,6 @@
 <tools>
  <tool name="finish_scan">
-    <description>Complete the main security scan and generate final report.
+    <description>Complete the security scan by providing the final assessment fields as a full penetration test report.

 IMPORTANT: This tool can ONLY be used by the root/main agent.
 Subagents must use agent_finish from agents_graph tool instead.
@@ -8,11 +8,20 @@ Subagents must use agent_finish from agents_graph tool instead.
 IMPORTANT: This tool will NOT allow finishing if any agents are still running or stopping.
 You must wait for all agents to complete before using this tool.

-This tool MUST be called at the very end of the security assessment to:
- Verify all agents have completed their tasks
- Generate the final comprehensive scan report
- Mark the entire scan as completed
- Stop the agent from running
+This tool directly updates the scan report data:
+- executive_summary
+- methodology
+- technical_analysis
+- recommendations
+
+All fields are REQUIRED and map directly to the final report.
+
+This must be the last tool called in the scan. It will:
+1. Verify you are the root agent
+2. Check all subagents have completed
+3. Update the scan with your provided fields
+4. Mark the scan as completed
+5. Stop agent execution

 Use this tool when:
 - You are the main/root agent conducting the security assessment
@@ -23,23 +32,39 @@ Use this tool when:
 IMPORTANT: Calling this tool multiple times will OVERWRITE any previous scan report.
 Make sure you include ALL findings and details in a single comprehensive report.

-If agents are still running, this tool will:
+If agents are still running, the tool will:
 - Show you which agents are still active
 - Suggest using wait_for_message to wait for completion
 - Suggest messaging agents if immediate completion is needed

-Put ALL details in the content - methodology, tools used, vulnerability counts, key findings, recommendations,
-compliance notes, risk assessments, next steps, etc. Be comprehensive and include everything relevant.</description>
+NOTE: Make sure every vulnerability found was reported with the create_vulnerability_report tool; otherwise it will not be tracked and you will not be rewarded.
+Do not report the same vulnerability more than once.
+
+Professional, customer-facing penetration test report rules (PDF-ready):
+- Do NOT include internal or system details: never mention local/absolute paths (e.g., "/workspace"), internal tools, agents, orchestrators, sandboxes, models, system prompts/instructions, connection/tooling issues, or tester environment details.
+- Tone and style: formal, objective, third-person, concise. No internal checklists or engineering runbooks. Content must read as a polished client deliverable.
+- Structure across fields should align to standard pentest reports:
+  - Executive summary: business impact, risk posture, notable criticals, remediation theme.
+  - Methodology: industry-standard methods (e.g., OWASP, OSSTMM, NIST), scope, constraints—no internal execution notes.
+  - Technical analysis: consolidated findings overview referencing created vulnerability reports; avoid raw logs.
+  - Recommendations: prioritized, actionable, aligned to risk and best practices.
+</description>
    <parameters>
-      <parameter name="content" type="string" required="true">
-        <description>Complete scan report including executive summary, methodology, findings, vulnerability details, recommendations, compliance notes, risk assessment, and conclusions. Include everything relevant to the assessment.</description>
+      <parameter name="executive_summary" type="string" required="true">
+        <description>High-level summary for executives: key findings, overall security posture, critical risks, business impact</description>
      </parameter>
-      <parameter name="success" type="boolean" required="false">
-        <description>Whether the scan completed successfully without critical errors</description>
+      <parameter name="methodology" type="string" required="true">
+        <description>Testing methodology: approach, tools used, scope, techniques employed</description>
+      </parameter>
+      <parameter name="technical_analysis" type="string" required="true">
+        <description>Detailed technical findings and security assessment results from the scan</description>
+      </parameter>
+      <parameter name="recommendations" type="string" required="true">
+        <description>Actionable security recommendations and remediation priorities</description>
      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
-      <description>Response containing success status and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
+      <description>Response containing success status, vulnerability count, and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
    </returns>
  </tool>
 </tools>
--- a/strix/tools/notes/notes_actions.py
+++ b/strix/tools/notes/notes_actions.py
@@ -11,7 +11,6 @@ _notes_storage: dict[str, dict[str, Any]] = {}
 def _filter_notes(
    category: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
    search_query: str | None = None,
 ) -> list[dict[str, Any]]:
    filtered_notes = []
@@ -20,9 +19,6 @@ def _filter_notes(
        if category and note.get("category") != category:
            continue

-        if priority and note.get("priority") != priority:
-            continue
-
        if tags:
            note_tags = note.get("tags", [])
            if not any(tag in note_tags for tag in tags):
@@ -43,13 +39,12 @@ def _filter_notes(
    return filtered_notes


-@register_tool
+@register_tool(sandbox_execution=False)
 def create_note(
    title: str,
    content: str,
    category: str = "general",
    tags: list[str] | None = None,
-    priority: str = "normal",
 ) -> dict[str, Any]:
    try:
        if not title or not title.strip():
@@ -58,7 +53,7 @@ def create_note(
        if not content or not content.strip():
            return {"success": False, "error": "Content cannot be empty", "note_id": None}

-        valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
+        valid_categories = ["general", "findings", "methodology", "questions", "plan"]
        if category not in valid_categories:
            return {
                "success": False,
@@ -66,14 +61,6 @@ def create_note(
                "note_id": None,
            }

-        valid_priorities = ["low", "normal", "high", "urgent"]
-        if priority not in valid_priorities:
-            return {
-                "success": False,
-                "error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
-                "note_id": None,
-            }
-
        note_id = str(uuid.uuid4())[:5]
        timestamp = datetime.now(UTC).isoformat()

@@ -82,7 +69,6 @@ def create_note(
            "content": content.strip(),
            "category": category,
            "tags": tags or [],
-            "priority": priority,
            "created_at": timestamp,
            "updated_at": timestamp,
        }
@@ -99,17 +85,14 @@ def create_note(
        }


-@register_tool
+@register_tool(sandbox_execution=False)
 def list_notes(
    category: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
    search: str | None = None,
 ) -> dict[str, Any]:
    try:
-        filtered_notes = _filter_notes(
-            category=category, tags=tags, priority=priority, search_query=search
-        )
+        filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)

        return {
            "success": True,
@@ -126,13 +109,12 @@ def list_notes(
        }


-@register_tool
+@register_tool(sandbox_execution=False)
 def update_note(
    note_id: str,
    title: str | None = None,
    content: str | None = None,
    tags: list[str] | None = None,
-    priority: str | None = None,
 ) -> dict[str, Any]:
    try:
        if note_id not in _notes_storage:
@@ -153,15 +135,6 @@ def update_note(
        if tags is not None:
            note["tags"] = tags

-        if priority is not None:
-            valid_priorities = ["low", "normal", "high", "urgent"]
-            if priority not in valid_priorities:
-                return {
-                    "success": False,
-                    "error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
-                }
-            note["priority"] = priority
-
        note["updated_at"] = datetime.now(UTC).isoformat()

        return {
@@ -173,7 +146,7 @@ def update_note(
        return {"success": False, "error": f"Failed to update note: {e}"}


-@register_tool
+@register_tool(sandbox_execution=False)
 def delete_note(note_id: str) -> dict[str, Any]:
    try:
        if note_id not in _notes_storage:
--- a/strix/tools/notes/notes_actions_schema.xml
+++ b/strix/tools/notes/notes_actions_schema.xml
@@ -1,10 +1,9 @@
 <tools>
  <tool name="create_note">
-    <description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
-  the scan.</description>
-    <details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
-  rather than formal vulnerability reports or detailed findings. This is your personal notepad
-  for keeping track of tasks, ideas, and things to remember or follow up on.</details>
+    <description>Create a personal note for observations, findings, and research during the scan.</description>
+    <details>Use this tool for documenting discoveries, observations, methodology notes, and questions.
+  This is your personal notepad for recording information you want to remember or reference later.
+  For tracking actionable tasks, use the todo tool instead.</details>
    <parameters>
      <parameter name="title" type="string" required="true">
        <description>Title of the note</description>
@@ -13,49 +12,41 @@
        <description>Content of the note</description>
      </parameter>
      <parameter name="category" type="string" required="false">
-        <description>Category to organize the note (default: "general", "findings", "methodology", "todo", "questions", "plan")</description>
+        <description>Category to organize the note (default: "general"; one of "general", "findings", "methodology", "questions", "plan")</description>
      </parameter>
      <parameter name="tags" type="string" required="false">
        <description>Tags for categorization</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>Priority level of the note ("low", "normal", "high", "urgent")</description>
-      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
      <description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
    </returns>
    <examples>
-  # Create a TODO reminder
-  <function=create_note>
-  <parameter=title>TODO: Check SSL Certificate Details</parameter>
-  <parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
-               on the HTTPS service discovered on port 443. Also check for certificate
-               transparency logs.</parameter>
-  <parameter=category>todo</parameter>
-  <parameter=tags>["ssl", "certificate", "followup"]</parameter>
-  <parameter=priority>normal</parameter>
-  </function>
-
-  # Planning note
-  <function=create_note>
-  <parameter=title>Scan Strategy Planning</parameter>
-  <parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
-               web apps for OWASP Top 10 3) Check database services for default creds
-               4) Review any custom applications for business logic flaws</parameter>
-  <parameter=category>plan</parameter>
-  <parameter=tags>["planning", "strategy", "next_steps"]</parameter>
-  </function>
-
-  # Side note for later investigation
+  # Document an interesting finding
  <function=create_note>
  <parameter=title>Interesting Directory Found</parameter>
-  <parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
-               for now but worth checking if time permits. Directory listing seems
-               disabled.</parameter>
+  <parameter=content>Found /backup/ directory that might contain sensitive files. Directory listing
+               seems disabled but worth investigating further.</parameter>
  <parameter=category>findings</parameter>
-  <parameter=tags>["directory", "backup", "low_priority"]</parameter>
-  <parameter=priority>low</parameter>
+  <parameter=tags>["directory", "backup"]</parameter>
+  </function>
+
+  # Methodology note
+  <function=create_note>
+  <parameter=title>Authentication Flow Analysis</parameter>
+  <parameter=content>The application uses JWT tokens stored in localStorage. Token expiration is
+               set to 24 hours. Observed that refresh token rotation is not implemented.</parameter>
+  <parameter=category>methodology</parameter>
+  <parameter=tags>["auth", "jwt", "session"]</parameter>
+  </function>
+
+  # Research question
+  <function=create_note>
+  <parameter=title>Custom Header Investigation</parameter>
+  <parameter=content>The API returns a custom X-Request-ID header. Need to research if this
+               could be used for user tracking or has any security implications.</parameter>
+  <parameter=category>questions</parameter>
+  <parameter=tags>["headers", "research"]</parameter>
  </function>
    </examples>
  </tool>
@@ -84,9 +75,6 @@
      <parameter name="tags" type="string" required="false">
        <description>Filter by tags (returns notes with any of these tags)</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>Filter by priority level</description>
-      </parameter>
      <parameter name="search" type="string" required="false">
        <description>Search query to find in note titles and content</description>
      </parameter>
@@ -100,11 +88,6 @@
  <parameter=category>findings</parameter>
  </function>

-  # List high priority items
-  <function=list_notes>
-  <parameter=priority>high</parameter>
-  </function>
-
  # Search for SQL injection related notes
  <function=list_notes>
  <parameter=search>SQL injection</parameter>
@@ -132,9 +115,6 @@
      <parameter name="tags" type="string" required="false">
        <description>New tags for the note</description>
      </parameter>
-      <parameter name="priority" type="string" required="false">
-        <description>New priority level</description>
-      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
      <description>Response containing: - success: Whether the note was updated successfully</description>
@@ -143,7 +123,6 @@
  <function=update_note>
  <parameter=note_id>note_123</parameter>
  <parameter=content>Updated content with new findings...</parameter>
-  <parameter=priority>urgent</parameter>
  </function>
    </examples>
  </tool>
--- a/strix/tools/proxy/proxy_actions.py
+++ b/strix/tools/proxy/proxy_actions.py
@@ -2,8 +2,6 @@ from typing import Any, Literal

 from strix.tools.registry import register_tool

-from .proxy_manager import get_proxy_manager
-

 RequestPart = Literal["request", "response"]

@@ -27,6 +25,8 @@ def list_requests(
    sort_order: Literal["asc", "desc"] = "desc",
    scope_id: str | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.list_requests(
        httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
@@ -41,6 +41,8 @@ def view_request(
    page: int = 1,
    page_size: int = 50,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.view_request(request_id, part, search_pattern, page, page_size)

@@ -53,6 +55,8 @@ def send_request(
    body: str = "",
    timeout: int = 30,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    if headers is None:
        headers = {}
    manager = get_proxy_manager()
@@ -64,6 +68,8 @@ def repeat_request(
    request_id: str,
    modifications: dict[str, Any] | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    if modifications is None:
        modifications = {}
    manager = get_proxy_manager()
@@ -78,6 +84,8 @@ def scope_rules(
    scope_id: str | None = None,
    scope_name: str | None = None,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)

@@ -89,6 +97,8 @@ def list_sitemap(
    depth: Literal["DIRECT", "ALL"] = "DIRECT",
    page: int = 1,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.list_sitemap(scope_id, parent_id, depth, page)

@@ -97,5 +107,7 @@ def list_sitemap(
 def view_sitemap_entry(
    entry_id: str,
 ) -> dict[str, Any]:
+    from .proxy_manager import get_proxy_manager
+
    manager = get_proxy_manager()
    return manager.view_sitemap_entry(entry_id)
--- a/strix/tools/python/python_actions.py
+++ b/strix/tools/python/python_actions.py
@@ -2,8 +2,6 @@ from typing import Any, Literal

 from strix.tools.registry import register_tool

-from .python_manager import get_python_session_manager
-

 PythonAction = Literal["new_session", "execute", "close", "list_sessions"]

@@ -15,6 +13,8 @@ def python_action(
    timeout: int = 30,
    session_id: str | None = None,
 ) -> dict[str, Any]:
+    from .python_manager import get_python_session_manager
+
    def _validate_code(action_name: str, code: str | None) -> None:
        if not code:
            raise ValueError(f"code parameter is required for {action_name} action")
--- a/strix/tools/registry.py
+++ b/strix/tools/registry.py
@@ -23,17 +23,17 @@ class ImplementedInClientSideOnlyError(Exception):


 def _process_dynamic_content(content: str) -> str:
-    if "{{DYNAMIC_MODULES_DESCRIPTION}}" in content:
+    if "{{DYNAMIC_SKILLS_DESCRIPTION}}" in content:
        try:
-            from strix.prompts import generate_modules_description
+            from strix.skills import generate_skills_description

-            modules_description = generate_modules_description()
-            content = content.replace("{{DYNAMIC_MODULES_DESCRIPTION}}", modules_description)
+            skills_description = generate_skills_description()
+            content = content.replace("{{DYNAMIC_SKILLS_DESCRIPTION}}", skills_description)
        except ImportError:
-            logger.warning("Could not import prompts utilities for dynamic schema generation")
+            logger.warning("Could not import skills utilities for dynamic schema generation")
            content = content.replace(
-                "{{DYNAMIC_MODULES_DESCRIPTION}}",
-                "List of prompt modules to load for this agent (max 5). Module discovery failed.",
+                "{{DYNAMIC_SKILLS_DESCRIPTION}}",
+                "List of skills to load for this agent (max 5). Skill discovery failed.",
            )

    return content
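Note on the hunk above: `_process_dynamic_content` swaps the `{{DYNAMIC_SKILLS_DESCRIPTION}}` placeholder for a generated description and falls back to a fixed string when the skills package cannot be imported. A self-contained sketch of that behavior follows. The `describe` callable is an assumption introduced so the example runs without `strix.skills`; the real function imports `generate_skills_description` inside the `try` block.

```python
from collections.abc import Callable

PLACEHOLDER = "{{DYNAMIC_SKILLS_DESCRIPTION}}"
FALLBACK = "List of skills to load for this agent (max 5). Skill discovery failed."


def process_dynamic_content(content: str, describe: Callable[[], str]) -> str:
    """Replace the placeholder, using FALLBACK if discovery raises ImportError."""
    if PLACEHOLDER not in content:
        return content  # nothing to substitute, so skip the lookup entirely

    try:
        description = describe()
    except ImportError:
        description = FALLBACK

    return content.replace(PLACEHOLDER, description)


if __name__ == "__main__":
    schema = "<description>{{DYNAMIC_SKILLS_DESCRIPTION}}</description>"
    print(process_dynamic_content(schema, lambda: "Available skills: xss, sqli"))
```

Checking for the placeholder first means schemas without it never trigger the import at all.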
--- a/strix/tools/reporting/reporting_actions.py
+++ b/strix/tools/reporting/reporting_actions.py
@@ -3,61 +3,248 @@ from typing import Any
 from strix.tools.registry import register_tool


+def calculate_cvss_and_severity(
+    attack_vector: str,
+    attack_complexity: str,
+    privileges_required: str,
+    user_interaction: str,
+    scope: str,
+    confidentiality: str,
+    integrity: str,
+    availability: str,
+) -> tuple[float, str, str]:
+    try:
+        from cvss import CVSS3
+
+        vector = (
+            f"CVSS:3.1/AV:{attack_vector}/AC:{attack_complexity}/"
+            f"PR:{privileges_required}/UI:{user_interaction}/S:{scope}/"
+            f"C:{confidentiality}/I:{integrity}/A:{availability}"
+        )
+
+        c = CVSS3(vector)
+        scores = c.scores()
+        severities = c.severities()
+
+        base_score = scores[0]
+        base_severity = severities[0]
+
+        severity = base_severity.lower()
+
+    except Exception:
+        import logging
+
+        logging.exception("Failed to calculate CVSS")
+        return 7.5, "high", ""
+    else:
+        return base_score, severity, vector
+
+
+def _validate_required_fields(**kwargs: str | None) -> list[str]:
+    validation_errors: list[str] = []
+
+    required_fields = {
+        "title": "Title cannot be empty",
+        "description": "Description cannot be empty",
+        "impact": "Impact cannot be empty",
+        "target": "Target cannot be empty",
+        "technical_analysis": "Technical analysis cannot be empty",
+        "poc_description": "PoC description cannot be empty",
+        "poc_script_code": "PoC script/code is REQUIRED - provide the actual exploit/payload",
+        "remediation_steps": "Remediation steps cannot be empty",
+    }
+
+    for field_name, error_msg in required_fields.items():
+        value = kwargs.get(field_name)
+        if not value or not str(value).strip():
+            validation_errors.append(error_msg)
+
+    return validation_errors
+
+
+def _validate_cvss_parameters(**kwargs: str) -> list[str]:
+    validation_errors: list[str] = []
+
+    cvss_validations = {
+        "attack_vector": ["N", "A", "L", "P"],
+        "attack_complexity": ["L", "H"],
+        "privileges_required": ["N", "L", "H"],
+        "user_interaction": ["N", "R"],
+        "scope": ["U", "C"],
+        "confidentiality": ["N", "L", "H"],
+        "integrity": ["N", "L", "H"],
+        "availability": ["N", "L", "H"],
+    }
+
+    for param_name, valid_values in cvss_validations.items():
+        value = kwargs.get(param_name)
+        if value not in valid_values:
+            validation_errors.append(
+                f"Invalid {param_name}: {value}. Must be one of: {valid_values}"
+            )
+
+    return validation_errors
+
+
@register_tool(sandbox_execution=False)
 def create_vulnerability_report(
    title: str,
-    content: str,
-    severity: str,
+    description: str,
+    impact: str,
+    target: str,
+    technical_analysis: str,
+    poc_description: str,
+    poc_script_code: str,
+    remediation_steps: str,
+    # CVSS Breakdown Components
+    attack_vector: str,
+    attack_complexity: str,
+    privileges_required: str,
+    user_interaction: str,
+    scope: str,
+    confidentiality: str,
+    integrity: str,
+    availability: str,
+    # Optional fields
+    endpoint: str | None = None,
+    method: str | None = None,
+    cve: str | None = None,
+    code_file: str | None = None,
+    code_before: str | None = None,
+    code_after: str | None = None,
+    code_diff: str | None = None,
 ) -> dict[str, Any]:
-    validation_error = None
-    if not title or not title.strip():
-        validation_error = "Title cannot be empty"
-    elif not content or not content.strip():
-        validation_error = "Content cannot be empty"
-    elif not severity or not severity.strip():
-        validation_error = "Severity cannot be empty"
-    else:
-        valid_severities = ["critical", "high", "medium", "low", "info"]
-        if severity.lower() not in valid_severities:
-            validation_error = (
-                f"Invalid severity '{severity}'. Must be one of: {', '.join(valid_severities)}"
-            )
+    validation_errors = _validate_required_fields(
+        title=title,
+        description=description,
+        impact=impact,
+        target=target,
+        technical_analysis=technical_analysis,
+        poc_description=poc_description,
+        poc_script_code=poc_script_code,
+        remediation_steps=remediation_steps,
+    )

-    if validation_error:
-        return {"success": False, "message": validation_error}
+    validation_errors.extend(
+        _validate_cvss_parameters(
+            attack_vector=attack_vector,
+            attack_complexity=attack_complexity,
+            privileges_required=privileges_required,
+            user_interaction=user_interaction,
+            scope=scope,
+            confidentiality=confidentiality,
+            integrity=integrity,
+            availability=availability,
+        )
+    )
+
+    if validation_errors:
+        return {"success": False, "message": "Validation failed", "errors": validation_errors}
+
+    cvss_score, severity, cvss_vector = calculate_cvss_and_severity(
+        attack_vector,
+        attack_complexity,
+        privileges_required,
+        user_interaction,
+        scope,
+        confidentiality,
+        integrity,
+        availability,
+    )

    try:
        from strix.telemetry.tracer import get_global_tracer

        tracer = get_global_tracer()
        if tracer:
+            from strix.llm.dedupe import check_duplicate
+
+            existing_reports = tracer.get_existing_vulnerabilities()
+
+            candidate = {
+                "title": title,
+                "description": description,
+                "impact": impact,
+                "target": target,
+                "technical_analysis": technical_analysis,
+                "poc_description": poc_description,
+                "poc_script_code": poc_script_code,
+                "endpoint": endpoint,
+                "method": method,
+            }
+
+            dedupe_result = check_duplicate(candidate, existing_reports)
+
+            if dedupe_result.get("is_duplicate"):
+                duplicate_id = dedupe_result.get("duplicate_id") or ""
+
+                duplicate_title = ""
+                for report in existing_reports:
+                    if report.get("id") == duplicate_id:
+                        duplicate_title = report.get("title", "Unknown")
+                        break
+
+                return {
+                    "success": False,
+                    "message": (
+                        f"Potential duplicate of '{duplicate_title}' "
+                        f"(id={duplicate_id[:8]}...). Do not re-report the same vulnerability."
+                    ),
+                    "duplicate_of": duplicate_id,
+                    "duplicate_title": duplicate_title,
+                    "confidence": dedupe_result.get("confidence", 0.0),
+                    "reason": dedupe_result.get("reason", ""),
+                }
+
+            cvss_breakdown = {
+                "attack_vector": attack_vector,
+                "attack_complexity": attack_complexity,
+                "privileges_required": privileges_required,
+                "user_interaction": user_interaction,
+                "scope": scope,
+                "confidentiality": confidentiality,
+                "integrity": integrity,
+                "availability": availability,
+            }
+
            report_id = tracer.add_vulnerability_report(
                title=title,
-                content=content,
+                description=description,
                severity=severity,
+                impact=impact,
+                target=target,
+                technical_analysis=technical_analysis,
+                poc_description=poc_description,
+                poc_script_code=poc_script_code,
+                remediation_steps=remediation_steps,
+                cvss=cvss_score,
+                cvss_breakdown=cvss_breakdown,
+                endpoint=endpoint,
+                method=method,
+                cve=cve,
+                code_file=code_file,
+                code_before=code_before,
+                code_after=code_after,
+                code_diff=code_diff,
            )

            return {
                "success": True,
                "message": f"Vulnerability report '{title}' created successfully",
                "report_id": report_id,
-                "severity": severity.lower(),
+                "severity": severity,
+                "cvss_score": cvss_score,
            }
+
        import logging

-        logging.warning("Global tracer not available - vulnerability report not stored")
+        logging.warning("Current tracer not available - vulnerability report not stored")

-        return {  # noqa: TRY300
-            "success": True,
-            "message": f"Vulnerability report '{title}' created successfully (not persisted)",
-            "warning": "Report could not be persisted - tracer unavailable",
-        }
-
-    except ImportError:
+    except (ImportError, AttributeError) as e:
+        return {"success": False, "message": f"Failed to create vulnerability report: {e!s}"}
+    else:
        return {
            "success": True,
-            "message": f"Vulnerability report '{title}' created successfully (not persisted)",
-            "warning": "Report could not be persisted - tracer module unavailable",
+            "message": f"Vulnerability report '{title}' created (not persisted)",
+            "warning": "Report could not be persisted - tracer unavailable",
        }
-    except (ValueError, TypeError) as e:
-        return {"success": False, "message": f"Failed to create vulnerability report: {e!s}"}
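The diff above calls `calculate_cvss_and_severity` with the eight validated metric letters, but its body is not shown in this section. The following is a minimal sketch of how such a score could be derived, assuming the CVSS v3.1 base-score equations and weights from the public specification. The name `sketch_cvss31`, the lowercase severity labels, and the `CVSS:3.1/` vector prefix are assumptions made for illustration and may not match the project's actual helper.

```python
# Metric weights from the public CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def sketch_cvss31(
    av: str, ac: str, pr: str, ui: str, s: str, c: str, i: str, a: str
) -> tuple[float, str, str]:
    """Hypothetical stand-in: returns (score, severity, vector)."""
    changed = s == "C"

    # Impact sub-score.
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub-score.
    pr_weight = (PR_CHANGED if changed else PR_UNCHANGED)[pr]
    exploitability = 8.22 * AV[av] * AC[ac] * pr_weight * UI[ui]

    if impact <= 0:
        score = 0.0
    elif changed:
        score = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        score = roundup(min(impact + exploitability, 10))

    # Qualitative rating bands from the specification (labels are an assumption).
    if score == 0:
        severity = "none"
    elif score < 4.0:
        severity = "low"
    elif score < 7.0:
        severity = "medium"
    elif score < 9.0:
        severity = "high"
    else:
        severity = "critical"

    vector = f"CVSS:3.1/AV:{av}/AC:{ac}/PR:{pr}/UI:{ui}/S:{s}/C:{c}/I:{i}/A:{a}"
    return score, severity, vector
```

Deriving severity from the score, rather than accepting it as a free-form argument as the removed code did, keeps the rating and the CVSS breakdown from disagreeing.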
--- a/strix/tools/reporting/reporting_actions_schema.xml
+++ b/strix/tools/reporting/reporting_actions_schema.xml
@@ -2,8 +2,9 @@
  <tool name="create_vulnerability_report">
    <description>Create a vulnerability report for a discovered security issue.

-Use this tool to document a specific verified security vulnerability.
-Put ALL details in the content field - affected URLs, parameters, proof of concept, remediation steps, CVE references, CVSS scores, technical details, impact assessment, etc.
+IMPORTANT: This tool includes automatic LLM-based deduplication. Reports that describe the same vulnerability (same root cause on the same asset) as an existing report will be rejected.
+
+Use this tool to document a specific fully verified security vulnerability.

 DO NOT USE:
 - For general security observations without specific vulnerabilities
@@ -11,20 +12,124 @@ DO NOT USE:
 - When you don't have a proof of concept, or still not 100% sure if it's a vulnerability
 - For tracking multiple vulnerabilities (create separate reports)
 - For reporting multiple vulnerabilities at once. Use a separate create_vulnerability_report for each vulnerability.
+- To re-report a vulnerability that was already reported (even with different details)
+
+White-box requirement (when you have access to the code): You MUST include code_file, code_before, code_after, and code_diff. These must contain the actual code (before/after) and a complete, apply-able unified diff.
+
+DEDUPLICATION: If this tool returns with success=false and mentions a duplicate, DO NOT attempt to re-submit. The vulnerability has already been reported. Move on to testing other areas.
+
+Professional, customer-facing report rules (PDF-ready):
+- Do NOT include internal or system details: never mention local or absolute paths (e.g., "/workspace"), internal tools, agents, orchestrators, sandboxes, models, system prompts/instructions, connection issues, internal errors/logs/stack traces, or tester machine environment details.
+- Tone and style: formal, objective, third-person, vendor-neutral, concise. No runbooks, checklists, or engineering notes. Avoid headings like "QUICK", "Approach", or "Techniques" that read like internal guidance.
+- Use a standard penetration testing report structure per finding:
+  1) Overview
+  2) Severity and CVSS (vector only)
+  3) Affected asset(s)
+  4) Technical details
+  5) Proof of concept (repro steps plus code)
+  6) Impact
+  7) Remediation
+  8) Evidence (optional request/response excerpts, etc.) in the technical analysis field.
+- Numbered steps are allowed ONLY within the proof of concept. Elsewhere, use clear, concise paragraphs suitable for customer-facing reports.
+- Language must be precise and non-vague; avoid hedging.
 </description>
    <parameters>
      <parameter name="title" type="string" required="true">
-        <description>Clear, concise title of the vulnerability</description>
+        <description>Clear, specific, concise title (e.g., "SQL Injection in /api/users Login Parameter"). Do not mention the CVE number in the title.</description>
      </parameter>
-      <parameter name="content" type="string" required="true">
-        <description>Complete vulnerability details including affected URLs, technical details, impact, proof of concept, remediation steps, and any relevant references. Be comprehensive and include everything relevant.</description>
+      <parameter name="description" type="string" required="true">
+        <description>Comprehensive description of the vulnerability and how it was discovered</description>
      </parameter>
-      <parameter name="severity" type="string" required="true">
-        <description>Severity level: critical, high, medium, low, or info</description>
+      <parameter name="impact" type="string" required="true">
+        <description>Impact assessment: what attacker can do, business risk, data at risk</description>
+      </parameter>
+      <parameter name="target" type="string" required="true">
+        <description>Affected target: URL, domain, or Git repository</description>
+      </parameter>
+      <parameter name="technical_analysis" type="string" required="true">
+        <description>Technical explanation of the vulnerability mechanism and root cause</description>
+      </parameter>
+      <parameter name="poc_description" type="string" required="true">
+        <description>Step-by-step instructions to reproduce the vulnerability</description>
+      </parameter>
+      <parameter name="poc_script_code" type="string" required="true">
+        <description>Actual proof of concept code, exploit, payload, or script that demonstrates the vulnerability. Python code.</description>
+      </parameter>
+      <parameter name="remediation_steps" type="string" required="true">
+        <description>Specific, actionable steps to fix the vulnerability</description>
+      </parameter>
+      <parameter name="attack_vector" type="string" required="true">
+        <description>CVSS Attack Vector - How the vulnerability is exploited:
+N = Network (remotely exploitable)
+A = Adjacent (same network segment)
+L = Local (local access required)
+P = Physical (physical access required)</description>
+      </parameter>
+      <parameter name="attack_complexity" type="string" required="true">
+        <description>CVSS Attack Complexity - Conditions beyond attacker's control:
+L = Low (no special conditions)
+H = High (special conditions must exist)</description>
+      </parameter>
+      <parameter name="privileges_required" type="string" required="true">
+        <description>CVSS Privileges Required - Level of privileges needed:
+N = None (no privileges needed)
+L = Low (basic user privileges)
+H = High (admin privileges)</description>
+      </parameter>
+      <parameter name="user_interaction" type="string" required="true">
+        <description>CVSS User Interaction - Does exploit require user action:
+N = None (no user interaction needed)
+R = Required (user must perform some action)</description>
+      </parameter>
+      <parameter name="scope" type="string" required="true">
+        <description>CVSS Scope - Can the vulnerability affect resources beyond its security scope:
+U = Unchanged (only affects the vulnerable component)
+C = Changed (affects resources beyond vulnerable component)</description>
+      </parameter>
+      <parameter name="confidentiality" type="string" required="true">
+        <description>CVSS Confidentiality Impact - Impact to confidentiality:
+N = None (no impact)
+L = Low (some information disclosure)
+H = High (all information disclosed)</description>
+      </parameter>
+      <parameter name="integrity" type="string" required="true">
+        <description>CVSS Integrity Impact - Impact to integrity:
+N = None (no impact)
+L = Low (data can be modified but scope is limited)
+H = High (total loss of integrity)</description>
+      </parameter>
+      <parameter name="availability" type="string" required="true">
+        <description>CVSS Availability Impact - Impact to availability:
+N = None (no impact)
+L = Low (reduced performance or interruptions)
+H = High (total loss of availability)</description>
+      </parameter>
+      <parameter name="endpoint" type="string" required="false">
+        <description>API endpoint(s) or URL path(s) (e.g., "/api/login") - for web vulnerabilities, or Git repository path(s) - for code vulnerabilities</description>
+      </parameter>
+      <parameter name="method" type="string" required="false">
+        <description>HTTP method(s) (GET, POST, etc.) - for web vulnerabilities.</description>
+      </parameter>
+      <parameter name="cve" type="string" required="false">
+        <description>CVE identifier (e.g., "CVE-2024-1234"). Verify that it is a valid CVE number using web search or vulnerability databases before including it.</description>
+      </parameter>
+      <parameter name="code_file" type="string" required="false">
+        <description>MANDATORY for white-box testing: exact affected source file path(s).</description>
+      </parameter>
+      <parameter name="code_before" type="string" required="false">
+        <description>MANDATORY for white-box testing: actual vulnerable code snippet(s) copied verbatim from the repository.</description>
+      </parameter>
+      <parameter name="code_after" type="string" required="false">
+        <description>MANDATORY for white-box testing: corrected code snippet(s) exactly as they should appear after the fix.</description>
+      </parameter>
+      <parameter name="code_diff" type="string" required="false">
+        <description>MANDATORY for white-box testing: unified diff showing the code changes. Must be a complete, apply-able unified diff (git format) covering all affected files, with proper file headers, line numbers, and sufficient context.</description>
      </parameter>
    </parameters>
    <returns type="Dict[str, Any]">
-      <description>Response containing success status and message</description>
+      <description>Response containing:
+- On success: success=true, message, report_id, severity, cvss_score (if the tracer is unavailable: success=true, message, warning, and no report_id)
+- On failure: success=false and message, plus either errors (list of validation messages) or, on duplicate detection, duplicate_of (ID), duplicate_title, confidence (0-1), reason (why it's a duplicate)</description>
    </returns>
  </tool>
 </tools>
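The return shapes of `create_vulnerability_report` differ by outcome, so a caller has to branch on more than `success`. Below is a minimal sketch of such a caller, assuming only the dictionary keys visible in the diff above (`report_id`, `severity`, `cvss_score`, `errors`, `duplicate_of`, `warning`). The function name `summarize_report_result` and the wording of the summaries are hypothetical.

```python
from typing import Any


def summarize_report_result(result: dict[str, Any]) -> str:
    """Hypothetical helper: turn a tool result into a one-line next action."""
    if result.get("success"):
        # The tool returns success without report_id when no tracer is available.
        if "report_id" not in result:
            return f"not persisted: {result.get('warning', '')}"
        return (
            f"stored {result['report_id']} "
            f"({result['severity']}, CVSS {result['cvss_score']})"
        )

    # Duplicates must not be resubmitted, per the schema description.
    if "duplicate_of" in result:
        return f"duplicate of {result['duplicate_of']}; do not resubmit"

    # Validation failures carry every problem at once in the errors list.
    if "errors" in result:
        return "fix and resubmit: " + "; ".join(result["errors"])

    return f"failed: {result.get('message', '')}"
```

Checking `duplicate_of` before `errors` reflects the schema's rule that a duplicate rejection is final, whereas a validation failure can be corrected and retried.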