Compare commits
39 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
78b6c26652 | ||
|
|
d649a7c70b | ||
|
|
d96852de55 | ||
|
|
eb0c52b720 | ||
|
|
2899021a21 | ||
|
|
0fcd5c46b2 | ||
|
|
dcf77b31fc | ||
|
|
37c8cffbe3 | ||
|
|
c29f13fd69 | ||
|
|
5c995628bf | ||
|
|
624f1ed77f | ||
|
|
2b926c733b | ||
|
|
a075ea1a0a | ||
|
|
5e3d14a1eb | ||
|
|
e57b7238f6 | ||
|
|
13fe87d428 | ||
|
|
3e5845a0e1 | ||
|
|
9fedcf1551 | ||
|
|
1edd8eda01 | ||
|
|
d8cb21bea3 | ||
|
|
bd8d927f34 | ||
|
|
fc267564f5 | ||
|
|
37c9b4b0e0 | ||
|
|
208b31a570 | ||
|
|
a14cb41745 | ||
|
|
4297c8f6e4 | ||
|
|
286d53384a | ||
|
|
ab40dbc33a | ||
|
|
b6cb1302ce | ||
|
|
b74132b2dc | ||
|
|
35dd9d0a8f | ||
|
|
6c5c0b0d1c | ||
|
|
65c3383ecc | ||
|
|
919cb5e248 | ||
|
|
c97ff94617 | ||
|
|
53c9da9213 | ||
|
|
1e189c1245 | ||
|
|
62f804b8b5 | ||
|
|
5ff10e9d20 |
78
.github/workflows/build-release.yml
vendored
Normal file
78
.github/workflows/build-release.yml
vendored
Normal file
@@ -0,0 +1,78 @@
|
||||
name: Build & Release
|
||||
|
||||
on:
|
||||
push:
|
||||
tags:
|
||||
- 'v*'
|
||||
workflow_dispatch:
|
||||
|
||||
jobs:
|
||||
build:
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
include:
|
||||
- os: macos-latest
|
||||
target: macos-arm64
|
||||
- os: macos-15-intel
|
||||
target: macos-x86_64
|
||||
- os: ubuntu-latest
|
||||
target: linux-x86_64
|
||||
- os: windows-latest
|
||||
target: windows-x86_64
|
||||
|
||||
runs-on: ${{ matrix.os }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: '3.12'
|
||||
|
||||
- uses: snok/install-poetry@v1
|
||||
|
||||
- name: Build
|
||||
shell: bash
|
||||
run: |
|
||||
poetry install --with dev
|
||||
poetry run pyinstaller strix.spec --noconfirm
|
||||
|
||||
VERSION=$(poetry version -s)
|
||||
mkdir -p dist/release
|
||||
|
||||
if [[ "${{ runner.os }}" == "Windows" ]]; then
|
||||
cp dist/strix.exe "dist/release/strix-${VERSION}-${{ matrix.target }}.exe"
|
||||
(cd dist/release && 7z a "strix-${VERSION}-${{ matrix.target }}.zip" "strix-${VERSION}-${{ matrix.target }}.exe")
|
||||
else
|
||||
cp dist/strix "dist/release/strix-${VERSION}-${{ matrix.target }}"
|
||||
chmod +x "dist/release/strix-${VERSION}-${{ matrix.target }}"
|
||||
tar -C dist/release -czvf "dist/release/strix-${VERSION}-${{ matrix.target }}.tar.gz" "strix-${VERSION}-${{ matrix.target }}"
|
||||
fi
|
||||
|
||||
- uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: strix-${{ matrix.target }}
|
||||
path: |
|
||||
dist/release/*.tar.gz
|
||||
dist/release/*.zip
|
||||
if-no-files-found: error
|
||||
|
||||
release:
|
||||
needs: build
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: write
|
||||
|
||||
steps:
|
||||
- uses: actions/download-artifact@v4
|
||||
with:
|
||||
path: release
|
||||
merge-multiple: true
|
||||
|
||||
- name: Create Release
|
||||
uses: softprops/action-gh-release@v2
|
||||
with:
|
||||
prerelease: ${{ !startsWith(github.ref, 'refs/tags/') }}
|
||||
generate_release_notes: true
|
||||
files: release/*
|
||||
42
README.md
42
README.md
@@ -12,7 +12,7 @@
|
||||
|
||||
[](https://pypi.org/project/strix-agent/)
|
||||
[](https://pypi.org/project/strix-agent/)
|
||||
[](https://pepy.tech/projects/strix-agent)
|
||||

|
||||
[](LICENSE)
|
||||
|
||||
[](https://github.com/usestrix/strix)
|
||||
@@ -21,6 +21,9 @@
|
||||
|
||||
<a href="https://trendshift.io/repositories/15362" target="_blank"><img src="https://trendshift.io/api/badge/repositories/15362" alt="usestrix%2Fstrix | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
|
||||
|
||||
|
||||
[](https://deepwiki.com/usestrix/strix)
|
||||
|
||||
</div>
|
||||
|
||||
<br>
|
||||
@@ -62,13 +65,15 @@ Strix are autonomous AI agents that act just like real hackers - they run your c
|
||||
|
||||
**Prerequisites:**
|
||||
- Docker (running)
|
||||
- Python 3.12+
|
||||
- An LLM provider key (e.g. [get OpenAI API key](https://platform.openai.com/api-keys) or use a local LLM)
|
||||
|
||||
### Installation & First Scan
|
||||
|
||||
```bash
|
||||
# Install Strix
|
||||
curl -sSL https://strix.ai/install | bash
|
||||
|
||||
# Or via pipx
|
||||
pipx install strix-agent
|
||||
|
||||
# Configure your AI provider
|
||||
@@ -84,7 +89,7 @@ strix --target ./app-directory
|
||||
|
||||
## ☁️ Run Strix in Cloud
|
||||
|
||||
Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.usestrix.com](https://app.usestrix.com)**.
|
||||
Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.usestrix.com](https://usestrix.com)**.
|
||||
|
||||
Launch a scan in just a few minutes—no setup or configuration required—and you’ll get:
|
||||
|
||||
@@ -93,7 +98,7 @@ Launch a scan in just a few minutes—no setup or configuration required—and y
|
||||
- **CI/CD and GitHub integrations** to block risky changes before production
|
||||
- **Continuous monitoring** so new vulnerabilities are caught quickly
|
||||
|
||||
[**Run your first pentest now →**](https://app.usestrix.com)
|
||||
[**Run your first pentest now →**](https://usestrix.com)
|
||||
|
||||
---
|
||||
|
||||
@@ -159,6 +164,9 @@ strix -t https://github.com/org/app -t https://your-app.com
|
||||
|
||||
# Focused testing with custom instructions
|
||||
strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"
|
||||
|
||||
# Provide detailed instructions through file (e.g., rules of engagement, scope, exclusions)
|
||||
strix --target api.your-app.com --instruction-file ./instruction.md
|
||||
```
|
||||
|
||||
### 🤖 Headless Mode
|
||||
@@ -183,17 +191,17 @@ jobs:
|
||||
security-scan:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/checkout@v6
|
||||
|
||||
- name: Install Strix
|
||||
run: pipx install strix-agent
|
||||
run: curl -sSL https://strix.ai/install | bash
|
||||
|
||||
- name: Run Strix
|
||||
env:
|
||||
STRIX_LLM: ${{ secrets.STRIX_LLM }}
|
||||
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
|
||||
|
||||
run: strix -n -t ./
|
||||
run: strix -n -t ./ --scan-mode quick
|
||||
```
|
||||
|
||||
### ⚙️ Configuration
|
||||
@@ -211,21 +219,7 @@ export PERPLEXITY_API_KEY="your-api-key" # for search capabilities
|
||||
|
||||
## 🤝 Contributing
|
||||
|
||||
We welcome contributions from the community! There are several ways to contribute:
|
||||
|
||||
### Code Contributions
|
||||
See our [Contributing Guide](CONTRIBUTING.md) for details on:
|
||||
- Setting up your development environment
|
||||
- Running tests and quality checks
|
||||
- Submitting pull requests
|
||||
- Code style guidelines
|
||||
|
||||
|
||||
### Prompt Modules Collection
|
||||
Help expand our collection of specialized prompt modules for AI agents:
|
||||
- Advanced testing techniques for vulnerabilities, frameworks, and technologies
|
||||
- See [Prompt Modules Documentation](strix/prompts/README.md) for guidelines
|
||||
- Submit via [pull requests](https://github.com/usestrix/strix/pulls) or [issues](https://github.com/usestrix/strix/issues)
|
||||
We welcome contributions of code, docs, and new prompt modules - check out our [Contributing Guide](CONTRIBUTING.md) to get started or open a [pull request](https://github.com/usestrix/strix/pulls)/[issue](https://github.com/usestrix/strix/issues).
|
||||
|
||||
## 👥 Join Our Community
|
||||
|
||||
@@ -234,6 +228,10 @@ Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://d
|
||||
## 🌟 Support the Project
|
||||
|
||||
**Love Strix?** Give us a ⭐ on GitHub!
|
||||
## 🙏 Acknowledgements
|
||||
|
||||
Strix builds on the incredible work of open-source projects like [LiteLLM](https://github.com/BerriAI/litellm), [Caido](https://github.com/caido/caido), [ProjectDiscovery](https://github.com/projectdiscovery), [Playwright](https://github.com/microsoft/playwright), and [Textual](https://github.com/Textualize/textual). Huge thanks to their maintainers!
|
||||
|
||||
|
||||
> [!WARNING]
|
||||
> Only test apps you own or have permission to test. You are responsible for using Strix ethically and legally.
|
||||
|
||||
@@ -158,7 +158,7 @@ RUN mkdir -p /workspace && chown -R pentester:pentester /workspace /app
|
||||
COPY pyproject.toml poetry.lock ./
|
||||
|
||||
USER pentester
|
||||
RUN poetry install --no-root --without dev
|
||||
RUN poetry install --no-root --without dev --extras sandbox
|
||||
RUN poetry run playwright install chromium
|
||||
|
||||
RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \
|
||||
|
||||
1516
poetry.lock
generated
1516
poetry.lock
generated
File diff suppressed because it is too large
Load Diff
@@ -1,6 +1,6 @@
|
||||
[tool.poetry]
|
||||
name = "strix-agent"
|
||||
version = "0.4.0"
|
||||
version = "0.5.0"
|
||||
description = "Open-source AI Hackers for your apps"
|
||||
authors = ["Strix <hi@usestrix.com>"]
|
||||
readme = "README.md"
|
||||
@@ -26,6 +26,8 @@ classifiers = [
|
||||
"Programming Language :: Python :: 3",
|
||||
"Programming Language :: Python :: 3 :: Only",
|
||||
"Programming Language :: Python :: 3.12",
|
||||
"Programming Language :: Python :: 3.13",
|
||||
"Programming Language :: Python :: 3.14",
|
||||
]
|
||||
packages = [
|
||||
{ include = "strix", format = ["sdist", "wheel"] }
|
||||
@@ -43,24 +45,33 @@ strix = "strix.interface.main:main"
|
||||
|
||||
[tool.poetry.dependencies]
|
||||
python = "^3.12"
|
||||
fastapi = "*"
|
||||
uvicorn = "*"
|
||||
litellm = { version = "~1.79.1", extras = ["proxy"] }
|
||||
openai = ">=1.99.5,<1.100.0"
|
||||
# Core CLI dependencies
|
||||
litellm = { version = "~1.80.7", extras = ["proxy"] }
|
||||
tenacity = "^9.0.0"
|
||||
numpydoc = "^1.8.0"
|
||||
pydantic = {extras = ["email"], version = "^2.11.3"}
|
||||
ipython = "^9.3.0"
|
||||
openhands-aci = "^0.3.0"
|
||||
playwright = "^1.48.0"
|
||||
rich = "*"
|
||||
docker = "^7.1.0"
|
||||
gql = {extras = ["requests"], version = "^3.5.3"}
|
||||
textual = "^4.0.0"
|
||||
xmltodict = "^0.13.0"
|
||||
pyte = "^0.8.1"
|
||||
requests = "^2.32.0"
|
||||
libtmux = "^0.46.2"
|
||||
|
||||
# Optional LLM provider dependencies
|
||||
google-cloud-aiplatform = { version = ">=1.38", optional = true }
|
||||
|
||||
# Sandbox-only dependencies (only needed inside Docker container)
|
||||
fastapi = { version = "*", optional = true }
|
||||
uvicorn = { version = "*", optional = true }
|
||||
ipython = { version = "^9.3.0", optional = true }
|
||||
openhands-aci = { version = "^0.3.0", optional = true }
|
||||
playwright = { version = "^1.48.0", optional = true }
|
||||
gql = { version = "^3.5.3", extras = ["requests"], optional = true }
|
||||
pyte = { version = "^0.8.1", optional = true }
|
||||
libtmux = { version = "^0.46.2", optional = true }
|
||||
numpydoc = { version = "^1.8.0", optional = true }
|
||||
|
||||
[tool.poetry.extras]
|
||||
vertex = ["google-cloud-aiplatform"]
|
||||
sandbox = ["fastapi", "uvicorn", "ipython", "openhands-aci", "playwright", "gql", "pyte", "libtmux", "numpydoc"]
|
||||
|
||||
[tool.poetry.group.dev.dependencies]
|
||||
# Type checking and static analysis
|
||||
@@ -81,6 +92,9 @@ pre-commit = "^4.2.0"
|
||||
black = "^25.1.0"
|
||||
isort = "^6.0.1"
|
||||
|
||||
# Build tools
|
||||
pyinstaller = { version = "^6.17.0", python = ">=3.12,<3.15" }
|
||||
|
||||
[build-system]
|
||||
requires = ["poetry-core"]
|
||||
build-backend = "poetry.core.masonry.api"
|
||||
@@ -129,9 +143,15 @@ module = [
|
||||
"textual.*",
|
||||
"pyte.*",
|
||||
"libtmux.*",
|
||||
"pytest.*",
|
||||
]
|
||||
ignore_missing_imports = true
|
||||
|
||||
# Relax strict rules for test files (pytest decorators are not fully typed)
|
||||
[[tool.mypy.overrides]]
|
||||
module = ["tests.*"]
|
||||
disallow_untyped_decorators = false
|
||||
|
||||
# ============================================================================
|
||||
# Ruff Configuration (Fast Python Linter & Formatter)
|
||||
# ============================================================================
|
||||
@@ -321,7 +341,6 @@ addopts = [
|
||||
"--cov-report=term-missing",
|
||||
"--cov-report=html",
|
||||
"--cov-report=xml",
|
||||
"--cov-fail-under=80"
|
||||
]
|
||||
testpaths = ["tests"]
|
||||
python_files = ["test_*.py", "*_test.py"]
|
||||
|
||||
98
scripts/build.sh
Executable file
98
scripts/build.sh
Executable file
@@ -0,0 +1,98 @@
|
||||
#!/bin/bash
|
||||
set -e
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
|
||||
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m' # No Color
|
||||
|
||||
echo -e "${BLUE}🦉 Strix Build Script${NC}"
|
||||
echo "================================"
|
||||
|
||||
OS="$(uname -s)"
|
||||
ARCH="$(uname -m)"
|
||||
|
||||
case "$OS" in
|
||||
Linux*) OS_NAME="linux";;
|
||||
Darwin*) OS_NAME="macos";;
|
||||
MINGW*|MSYS*|CYGWIN*) OS_NAME="windows";;
|
||||
*) OS_NAME="unknown";;
|
||||
esac
|
||||
|
||||
case "$ARCH" in
|
||||
x86_64|amd64) ARCH_NAME="x86_64";;
|
||||
arm64|aarch64) ARCH_NAME="arm64";;
|
||||
*) ARCH_NAME="$ARCH";;
|
||||
esac
|
||||
|
||||
echo -e "${YELLOW}Platform:${NC} $OS_NAME-$ARCH_NAME"
|
||||
|
||||
cd "$PROJECT_ROOT"
|
||||
|
||||
if ! command -v poetry &> /dev/null; then
|
||||
echo -e "${RED}Error: Poetry is not installed${NC}"
|
||||
echo "Please install Poetry first: https://python-poetry.org/docs/#installation"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo -e "\n${BLUE}Installing dependencies...${NC}"
|
||||
poetry install --with dev
|
||||
|
||||
VERSION=$(poetry version -s)
|
||||
echo -e "${YELLOW}Version:${NC} $VERSION"
|
||||
|
||||
echo -e "\n${BLUE}Cleaning previous builds...${NC}"
|
||||
rm -rf build/ dist/
|
||||
|
||||
echo -e "\n${BLUE}Building binary with PyInstaller...${NC}"
|
||||
poetry run pyinstaller strix.spec --noconfirm
|
||||
|
||||
RELEASE_DIR="dist/release"
|
||||
mkdir -p "$RELEASE_DIR"
|
||||
|
||||
BINARY_NAME="strix-${VERSION}-${OS_NAME}-${ARCH_NAME}"
|
||||
|
||||
if [ "$OS_NAME" = "windows" ]; then
|
||||
if [ ! -f "dist/strix.exe" ]; then
|
||||
echo -e "${RED}Build failed: Binary not found${NC}"
|
||||
exit 1
|
||||
fi
|
||||
BINARY_NAME="${BINARY_NAME}.exe"
|
||||
cp "dist/strix.exe" "$RELEASE_DIR/$BINARY_NAME"
|
||||
echo -e "\n${BLUE}Creating zip...${NC}"
|
||||
ARCHIVE_NAME="${BINARY_NAME%.exe}.zip"
|
||||
|
||||
if command -v 7z &> /dev/null; then
|
||||
7z a "$RELEASE_DIR/$ARCHIVE_NAME" "$RELEASE_DIR/$BINARY_NAME"
|
||||
else
|
||||
powershell -Command "Compress-Archive -Path '$RELEASE_DIR/$BINARY_NAME' -DestinationPath '$RELEASE_DIR/$ARCHIVE_NAME'"
|
||||
fi
|
||||
echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
|
||||
else
|
||||
if [ ! -f "dist/strix" ]; then
|
||||
echo -e "${RED}Build failed: Binary not found${NC}"
|
||||
exit 1
|
||||
fi
|
||||
cp "dist/strix" "$RELEASE_DIR/$BINARY_NAME"
|
||||
chmod +x "$RELEASE_DIR/$BINARY_NAME"
|
||||
echo -e "\n${BLUE}Creating tarball...${NC}"
|
||||
ARCHIVE_NAME="${BINARY_NAME}.tar.gz"
|
||||
tar -czvf "$RELEASE_DIR/$ARCHIVE_NAME" -C "$RELEASE_DIR" "$BINARY_NAME"
|
||||
echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
|
||||
fi
|
||||
|
||||
echo -e "\n${GREEN}Build successful!${NC}"
|
||||
echo "================================"
|
||||
echo -e "${YELLOW}Binary:${NC} $RELEASE_DIR/$BINARY_NAME"
|
||||
|
||||
SIZE=$(ls -lh "$RELEASE_DIR/$BINARY_NAME" | awk '{print $5}')
|
||||
echo -e "${YELLOW}Size:${NC} $SIZE"
|
||||
|
||||
echo -e "\n${BLUE}Testing binary...${NC}"
|
||||
"$RELEASE_DIR/$BINARY_NAME" --help > /dev/null 2>&1 && echo -e "${GREEN}Binary test passed!${NC}" || echo -e "${RED}Binary test failed${NC}"
|
||||
|
||||
echo -e "\n${GREEN}Done!${NC}"
|
||||
328
scripts/install.sh
Executable file
328
scripts/install.sh
Executable file
@@ -0,0 +1,328 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
APP=strix
|
||||
REPO="usestrix/strix"
|
||||
STRIX_IMAGE="ghcr.io/usestrix/strix-sandbox:0.1.10"
|
||||
|
||||
MUTED='\033[0;2m'
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
CYAN='\033[0;36m'
|
||||
NC='\033[0m'
|
||||
|
||||
requested_version=${VERSION:-}
|
||||
SKIP_DOWNLOAD=false
|
||||
|
||||
raw_os=$(uname -s)
|
||||
os=$(echo "$raw_os" | tr '[:upper:]' '[:lower:]')
|
||||
case "$raw_os" in
|
||||
Darwin*) os="macos" ;;
|
||||
Linux*) os="linux" ;;
|
||||
MINGW*|MSYS*|CYGWIN*) os="windows" ;;
|
||||
esac
|
||||
|
||||
arch=$(uname -m)
|
||||
if [[ "$arch" == "aarch64" ]]; then
|
||||
arch="arm64"
|
||||
fi
|
||||
if [[ "$arch" == "x86_64" ]]; then
|
||||
arch="x86_64"
|
||||
fi
|
||||
|
||||
if [ "$os" = "macos" ] && [ "$arch" = "x86_64" ]; then
|
||||
rosetta_flag=$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)
|
||||
if [ "$rosetta_flag" = "1" ]; then
|
||||
arch="arm64"
|
||||
fi
|
||||
fi
|
||||
|
||||
combo="$os-$arch"
|
||||
case "$combo" in
|
||||
linux-x86_64|macos-x86_64|macos-arm64|windows-x86_64)
|
||||
;;
|
||||
*)
|
||||
echo -e "${RED}Unsupported OS/Arch: $os/$arch${NC}"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
|
||||
archive_ext=".tar.gz"
|
||||
if [ "$os" = "windows" ]; then
|
||||
archive_ext=".zip"
|
||||
fi
|
||||
|
||||
target="$os-$arch"
|
||||
|
||||
if [ "$os" = "linux" ]; then
|
||||
if ! command -v tar >/dev/null 2>&1; then
|
||||
echo -e "${RED}Error: 'tar' is required but not installed.${NC}"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
if [ "$os" = "windows" ]; then
|
||||
if ! command -v unzip >/dev/null 2>&1; then
|
||||
echo -e "${RED}Error: 'unzip' is required but not installed.${NC}"
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
INSTALL_DIR=$HOME/.strix/bin
|
||||
mkdir -p "$INSTALL_DIR"
|
||||
|
||||
if [ -z "$requested_version" ]; then
|
||||
specific_version=$(curl -s "https://api.github.com/repos/$REPO/releases/latest" | sed -n 's/.*"tag_name": *"v\([^"]*\)".*/\1/p')
|
||||
if [[ $? -ne 0 || -z "$specific_version" ]]; then
|
||||
echo -e "${RED}Failed to fetch version information${NC}"
|
||||
exit 1
|
||||
fi
|
||||
else
|
||||
specific_version=$requested_version
|
||||
fi
|
||||
|
||||
filename="$APP-${specific_version}-${target}${archive_ext}"
|
||||
url="https://github.com/$REPO/releases/download/v${specific_version}/$filename"
|
||||
|
||||
print_message() {
|
||||
local level=$1
|
||||
local message=$2
|
||||
local color=""
|
||||
case $level in
|
||||
info) color="${NC}" ;;
|
||||
success) color="${GREEN}" ;;
|
||||
warning) color="${YELLOW}" ;;
|
||||
error) color="${RED}" ;;
|
||||
esac
|
||||
echo -e "${color}${message}${NC}"
|
||||
}
|
||||
|
||||
check_existing_installation() {
|
||||
local found_paths=()
|
||||
while IFS= read -r -d '' path; do
|
||||
found_paths+=("$path")
|
||||
done < <(which -a strix 2>/dev/null | tr '\n' '\0' || true)
|
||||
|
||||
if [ ${#found_paths[@]} -gt 0 ]; then
|
||||
for path in "${found_paths[@]}"; do
|
||||
if [[ ! -e "$path" ]] || [[ "$path" == "$INSTALL_DIR/strix"* ]]; then
|
||||
continue
|
||||
fi
|
||||
|
||||
if [[ -n "$path" ]]; then
|
||||
echo -e "${MUTED}Found existing strix at: ${NC}$path"
|
||||
|
||||
if [[ "$path" == *".local/bin"* ]]; then
|
||||
echo -e "${MUTED}Removing old pipx installation...${NC}"
|
||||
if command -v pipx >/dev/null 2>&1; then
|
||||
pipx uninstall strix-agent 2>/dev/null || true
|
||||
fi
|
||||
rm -f "$path" 2>/dev/null || true
|
||||
elif [[ -L "$path" || -f "$path" ]]; then
|
||||
echo -e "${MUTED}Removing old installation...${NC}"
|
||||
rm -f "$path" 2>/dev/null || true
|
||||
fi
|
||||
fi
|
||||
done
|
||||
fi
|
||||
}
|
||||
|
||||
check_version() {
|
||||
check_existing_installation
|
||||
|
||||
if [[ -x "$INSTALL_DIR/strix" ]]; then
|
||||
installed_version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "")
|
||||
if [[ "$installed_version" == "$specific_version" ]]; then
|
||||
print_message info "${GREEN}✓ Strix ${NC}$specific_version${GREEN} already installed${NC}"
|
||||
SKIP_DOWNLOAD=true
|
||||
elif [[ -n "$installed_version" ]]; then
|
||||
print_message info "${MUTED}Installed: ${NC}$installed_version ${MUTED}→ Upgrading to ${NC}$specific_version"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
download_and_install() {
|
||||
print_message info "\n${CYAN}🦉 Installing Strix${NC} ${MUTED}version: ${NC}$specific_version"
|
||||
print_message info "${MUTED}Platform: ${NC}$target\n"
|
||||
|
||||
local tmp_dir=$(mktemp -d)
|
||||
cd "$tmp_dir"
|
||||
|
||||
echo -e "${MUTED}Downloading...${NC}"
|
||||
curl -# -L -o "$filename" "$url"
|
||||
|
||||
if [ ! -f "$filename" ]; then
|
||||
echo -e "${RED}Download failed${NC}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo -e "${MUTED}Extracting...${NC}"
|
||||
if [ "$os" = "windows" ]; then
|
||||
unzip -q "$filename"
|
||||
mv "strix-${specific_version}-${target}.exe" "$INSTALL_DIR/strix.exe"
|
||||
else
|
||||
tar -xzf "$filename"
|
||||
mv "strix-${specific_version}-${target}" "$INSTALL_DIR/strix"
|
||||
chmod 755 "$INSTALL_DIR/strix"
|
||||
fi
|
||||
|
||||
cd - > /dev/null
|
||||
rm -rf "$tmp_dir"
|
||||
|
||||
echo -e "${GREEN}✓ Strix installed to $INSTALL_DIR${NC}"
|
||||
}
|
||||
|
||||
check_docker() {
|
||||
echo ""
|
||||
if ! command -v docker >/dev/null 2>&1; then
|
||||
echo -e "${YELLOW}⚠ Docker not found${NC}"
|
||||
echo -e "${MUTED}Strix requires Docker to run the security sandbox.${NC}"
|
||||
echo -e "${MUTED}Please install Docker: ${NC}https://docs.docker.com/get-docker/"
|
||||
echo ""
|
||||
return 1
|
||||
fi
|
||||
|
||||
if ! docker info >/dev/null 2>&1; then
|
||||
echo -e "${YELLOW}⚠ Docker daemon not running${NC}"
|
||||
echo -e "${MUTED}Please start Docker and run: ${NC}docker pull $STRIX_IMAGE"
|
||||
echo ""
|
||||
return 1
|
||||
fi
|
||||
|
||||
echo -e "${MUTED}Checking for sandbox image...${NC}"
|
||||
if docker image inspect "$STRIX_IMAGE" >/dev/null 2>&1; then
|
||||
echo -e "${GREEN}✓ Sandbox image already available${NC}"
|
||||
else
|
||||
echo -e "${MUTED}Pulling sandbox image (this may take a few minutes)...${NC}"
|
||||
if docker pull "$STRIX_IMAGE"; then
|
||||
echo -e "${GREEN}✓ Sandbox image pulled successfully${NC}"
|
||||
else
|
||||
echo -e "${YELLOW}⚠ Failed to pull sandbox image${NC}"
|
||||
echo -e "${MUTED}You can pull it manually later: ${NC}docker pull $STRIX_IMAGE"
|
||||
fi
|
||||
fi
|
||||
return 0
|
||||
}
|
||||
|
||||
add_to_path() {
|
||||
local config_file=$1
|
||||
local command=$2
|
||||
if grep -Fxq "$command" "$config_file" 2>/dev/null; then
|
||||
return 0
|
||||
elif [[ -w $config_file ]]; then
|
||||
echo -e "\n# strix" >> "$config_file"
|
||||
echo "$command" >> "$config_file"
|
||||
fi
|
||||
}
|
||||
|
||||
setup_path() {
|
||||
XDG_CONFIG_HOME=${XDG_CONFIG_HOME:-$HOME/.config}
|
||||
current_shell=$(basename "$SHELL")
|
||||
|
||||
case $current_shell in
|
||||
fish)
|
||||
config_files="$HOME/.config/fish/config.fish"
|
||||
;;
|
||||
zsh)
|
||||
config_files="$HOME/.zshrc $HOME/.zshenv"
|
||||
;;
|
||||
bash)
|
||||
config_files="$HOME/.bashrc $HOME/.bash_profile $HOME/.profile"
|
||||
;;
|
||||
*)
|
||||
config_files="$HOME/.bashrc $HOME/.profile"
|
||||
;;
|
||||
esac
|
||||
|
||||
config_file=""
|
||||
for file in $config_files; do
|
||||
if [[ -f $file ]]; then
|
||||
config_file=$file
|
||||
break
|
||||
fi
|
||||
done
|
||||
|
||||
if [[ -z $config_file ]]; then
|
||||
config_file="$HOME/.bashrc"
|
||||
touch "$config_file"
|
||||
fi
|
||||
|
||||
if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
|
||||
case $current_shell in
|
||||
fish)
|
||||
add_to_path "$config_file" "fish_add_path $INSTALL_DIR"
|
||||
;;
|
||||
*)
|
||||
add_to_path "$config_file" "export PATH=\"$INSTALL_DIR:\$PATH\""
|
||||
;;
|
||||
esac
|
||||
fi
|
||||
|
||||
if [ -n "${GITHUB_ACTIONS-}" ] && [ "${GITHUB_ACTIONS}" == "true" ]; then
|
||||
echo "$INSTALL_DIR" >> "$GITHUB_PATH"
|
||||
fi
|
||||
}
|
||||
|
||||
verify_installation() {
|
||||
export PATH="$INSTALL_DIR:$PATH"
|
||||
|
||||
local which_strix=$(which strix 2>/dev/null || echo "")
|
||||
|
||||
if [[ "$which_strix" != "$INSTALL_DIR/strix" && "$which_strix" != "$INSTALL_DIR/strix.exe" ]]; then
|
||||
if [[ -n "$which_strix" ]]; then
|
||||
echo -e "${YELLOW}⚠ Found conflicting strix at: ${NC}$which_strix"
|
||||
echo -e "${MUTED}Attempting to remove...${NC}"
|
||||
|
||||
if rm -f "$which_strix" 2>/dev/null; then
|
||||
echo -e "${GREEN}✓ Removed conflicting installation${NC}"
|
||||
else
|
||||
echo -e "${YELLOW}Could not remove automatically.${NC}"
|
||||
echo -e "${MUTED}Please remove manually: ${NC}rm $which_strix"
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ -x "$INSTALL_DIR/strix" ]]; then
|
||||
local version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "unknown")
|
||||
echo -e "${GREEN}✓ Strix ${NC}$version${GREEN} ready${NC}"
|
||||
fi
|
||||
}
|
||||
|
||||
check_version
|
||||
if [ "$SKIP_DOWNLOAD" = false ]; then
|
||||
download_and_install
|
||||
fi
|
||||
setup_path
|
||||
verify_installation
|
||||
check_docker
|
||||
|
||||
echo ""
|
||||
echo -e "${CYAN}"
|
||||
echo " ███████╗████████╗██████╗ ██╗██╗ ██╗"
|
||||
echo " ██╔════╝╚══██╔══╝██╔══██╗██║╚██╗██╔╝"
|
||||
echo " ███████╗ ██║ ██████╔╝██║ ╚███╔╝ "
|
||||
echo " ╚════██║ ██║ ██╔══██╗██║ ██╔██╗ "
|
||||
echo " ███████║ ██║ ██║ ██║██║██╔╝ ██╗"
|
||||
echo " ╚══════╝ ╚═╝ ╚═╝ ╚═╝╚═╝╚═╝ ╚═╝"
|
||||
echo -e "${NC}"
|
||||
echo -e "${MUTED} AI Penetration Testing Agent${NC}"
|
||||
echo ""
|
||||
echo -e "${MUTED}To get started:${NC}"
|
||||
echo ""
|
||||
echo -e " ${CYAN}1.${NC} Set your LLM provider:"
|
||||
echo -e " ${MUTED}export STRIX_LLM='openai/gpt-5'${NC}"
|
||||
echo -e " ${MUTED}export LLM_API_KEY='your-api-key'${NC}"
|
||||
echo ""
|
||||
echo -e " ${CYAN}2.${NC} Run a penetration test:"
|
||||
echo -e " ${MUTED}strix --target https://example.com${NC}"
|
||||
echo ""
|
||||
echo -e "${MUTED}For more information visit ${NC}https://usestrix.com"
|
||||
echo -e "${MUTED}Join our community ${NC}https://discord.gg/YjKFvEZSdZ"
|
||||
echo ""
|
||||
|
||||
if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
|
||||
echo -e "${YELLOW}→${NC} Run ${MUTED}source ~/.$(basename $SHELL)rc${NC} or open a new terminal"
|
||||
echo ""
|
||||
fi
|
||||
221
strix.spec
Normal file
221
strix.spec
Normal file
@@ -0,0 +1,221 @@
|
||||
# -*- mode: python ; coding: utf-8 -*-
|
||||
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from PyInstaller.utils.hooks import collect_data_files, collect_submodules
|
||||
|
||||
project_root = Path(SPECPATH)
|
||||
strix_root = project_root / 'strix'
|
||||
|
||||
datas = []
|
||||
|
||||
for jinja_file in strix_root.rglob('*.jinja'):
|
||||
rel_path = jinja_file.relative_to(project_root)
|
||||
datas.append((str(jinja_file), str(rel_path.parent)))
|
||||
|
||||
for xml_file in strix_root.rglob('*.xml'):
|
||||
rel_path = xml_file.relative_to(project_root)
|
||||
datas.append((str(xml_file), str(rel_path.parent)))
|
||||
|
||||
for tcss_file in strix_root.rglob('*.tcss'):
|
||||
rel_path = tcss_file.relative_to(project_root)
|
||||
datas.append((str(tcss_file), str(rel_path.parent)))
|
||||
|
||||
datas += collect_data_files('textual')
|
||||
|
||||
datas += collect_data_files('tiktoken')
|
||||
datas += collect_data_files('tiktoken_ext')
|
||||
|
||||
datas += collect_data_files('litellm')
|
||||
|
||||
hiddenimports = [
|
||||
# Core dependencies
|
||||
'litellm',
|
||||
'litellm.llms',
|
||||
'litellm.llms.openai',
|
||||
'litellm.llms.anthropic',
|
||||
'litellm.llms.vertex_ai',
|
||||
'litellm.llms.bedrock',
|
||||
'litellm.utils',
|
||||
'litellm.caching',
|
||||
|
||||
# Textual TUI
|
||||
'textual',
|
||||
'textual.app',
|
||||
'textual.widgets',
|
||||
'textual.containers',
|
||||
'textual.screen',
|
||||
'textual.binding',
|
||||
'textual.reactive',
|
||||
'textual.css',
|
||||
'textual._text_area_theme',
|
||||
|
||||
# Rich console
|
||||
'rich',
|
||||
'rich.console',
|
||||
'rich.panel',
|
||||
'rich.text',
|
||||
'rich.markup',
|
||||
'rich.style',
|
||||
'rich.align',
|
||||
'rich.live',
|
||||
|
||||
# Pydantic
|
||||
'pydantic',
|
||||
'pydantic.fields',
|
||||
'pydantic_core',
|
||||
'email_validator',
|
||||
|
||||
# Docker
|
||||
'docker',
|
||||
'docker.api',
|
||||
'docker.models',
|
||||
'docker.errors',
|
||||
|
||||
# HTTP/Networking
|
||||
'httpx',
|
||||
'httpcore',
|
||||
'requests',
|
||||
'urllib3',
|
||||
'certifi',
|
||||
|
||||
# Jinja2 templating
|
||||
'jinja2',
|
||||
'jinja2.ext',
|
||||
'markupsafe',
|
||||
|
||||
# XML parsing
|
||||
'xmltodict',
|
||||
|
||||
# Tiktoken (for token counting)
|
||||
'tiktoken',
|
||||
'tiktoken_ext',
|
||||
'tiktoken_ext.openai_public',
|
||||
|
||||
# Tenacity retry
|
||||
'tenacity',
|
||||
|
||||
# Strix modules
|
||||
'strix',
|
||||
'strix.interface',
|
||||
'strix.interface.main',
|
||||
'strix.interface.cli',
|
||||
'strix.interface.tui',
|
||||
'strix.interface.utils',
|
||||
'strix.interface.tool_components',
|
||||
'strix.agents',
|
||||
'strix.agents.base_agent',
|
||||
'strix.agents.state',
|
||||
'strix.agents.StrixAgent',
|
||||
'strix.llm',
|
||||
'strix.llm.llm',
|
||||
'strix.llm.config',
|
||||
'strix.llm.utils',
|
||||
'strix.llm.request_queue',
|
||||
'strix.llm.memory_compressor',
|
||||
'strix.runtime',
|
||||
'strix.runtime.runtime',
|
||||
'strix.runtime.docker_runtime',
|
||||
'strix.telemetry',
|
||||
'strix.telemetry.tracer',
|
||||
'strix.tools',
|
||||
'strix.tools.registry',
|
||||
'strix.tools.executor',
|
||||
'strix.tools.argument_parser',
|
||||
'strix.prompts',
|
||||
]
|
||||
|
||||
hiddenimports += collect_submodules('litellm')
|
||||
hiddenimports += collect_submodules('textual')
|
||||
hiddenimports += collect_submodules('rich')
|
||||
hiddenimports += collect_submodules('pydantic')
|
||||
|
||||
excludes = [
|
||||
# Sandbox-only packages
|
||||
'playwright',
|
||||
'playwright.sync_api',
|
||||
'playwright.async_api',
|
||||
'IPython',
|
||||
'ipython',
|
||||
'libtmux',
|
||||
'pyte',
|
||||
'openhands_aci',
|
||||
'openhands-aci',
|
||||
'gql',
|
||||
'fastapi',
|
||||
'uvicorn',
|
||||
'numpydoc',
|
||||
|
||||
# Google Cloud / Vertex AI
|
||||
'google.cloud',
|
||||
'google.cloud.aiplatform',
|
||||
'google.api_core',
|
||||
'google.auth',
|
||||
'google.oauth2',
|
||||
'google.protobuf',
|
||||
'grpc',
|
||||
'grpcio',
|
||||
'grpcio_status',
|
||||
|
||||
# Test frameworks
|
||||
'pytest',
|
||||
'pytest_asyncio',
|
||||
'pytest_cov',
|
||||
'pytest_mock',
|
||||
|
||||
# Development tools
|
||||
'mypy',
|
||||
'ruff',
|
||||
'black',
|
||||
'isort',
|
||||
'pylint',
|
||||
'pyright',
|
||||
'bandit',
|
||||
'pre_commit',
|
||||
|
||||
# Unnecessary for runtime
|
||||
'tkinter',
|
||||
'matplotlib',
|
||||
'numpy',
|
||||
'pandas',
|
||||
'scipy',
|
||||
'PIL',
|
||||
'cv2',
|
||||
]
|
||||
|
||||
a = Analysis(
|
||||
['strix/interface/main.py'],
|
||||
pathex=[str(project_root)],
|
||||
binaries=[],
|
||||
datas=datas,
|
||||
hiddenimports=hiddenimports,
|
||||
hookspath=[],
|
||||
hooksconfig={},
|
||||
runtime_hooks=[],
|
||||
excludes=excludes,
|
||||
noarchive=False,
|
||||
optimize=0,
|
||||
)
|
||||
|
||||
pyz = PYZ(a.pure)
|
||||
|
||||
exe = EXE(
|
||||
pyz,
|
||||
a.scripts,
|
||||
a.binaries,
|
||||
a.datas,
|
||||
[],
|
||||
name='strix',
|
||||
debug=False,
|
||||
bootloader_ignore_signals=False,
|
||||
strip=False,
|
||||
upx=False,
|
||||
upx_exclude=[],
|
||||
runtime_tmpdir=None,
|
||||
console=True,
|
||||
disable_windowed_traceback=False,
|
||||
argv_emulation=False,
|
||||
target_arch=None,
|
||||
codesign_identity=None,
|
||||
entitlements_file=None,
|
||||
)
|
||||
@@ -10,8 +10,8 @@ You follow all instructions and rules provided to you exactly as written in the
|
||||
|
||||
<communication_rules>
|
||||
CLI OUTPUT:
|
||||
- Never use markdown formatting - you are a CLI agent
|
||||
- Output plain text only (no **bold**, `code`, [links], # headers)
|
||||
- You may use simple markdown: **bold**, *italic*, `code`, ~~strikethrough~~, [links](url), and # headers
|
||||
- Do NOT use complex markdown like bullet lists, numbered lists, or tables
|
||||
- Use line breaks and indentation for structure
|
||||
- NEVER use "Strix" or any identifiable names/markers in HTTP requests, payloads, user-agents, or any inputs
|
||||
|
||||
|
||||
@@ -66,6 +66,8 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
|
||||
console.print(startup_panel)
|
||||
console.print()
|
||||
|
||||
scan_mode = getattr(args, "scan_mode", "deep")
|
||||
|
||||
scan_config = {
|
||||
"scan_id": args.run_name,
|
||||
"targets": args.targets_info,
|
||||
@@ -73,7 +75,7 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
|
||||
"run_name": args.run_name,
|
||||
}
|
||||
|
||||
llm_config = LLMConfig()
|
||||
llm_config = LLMConfig(scan_mode=scan_mode)
|
||||
agent_config = {
|
||||
"llm_config": llm_config,
|
||||
"max_iterations": 300,
|
||||
@@ -139,7 +141,7 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
|
||||
status_text.append("Running penetration test...", style="bold #22c55e")
|
||||
status_text.append("\n\n")
|
||||
|
||||
stats_text = build_live_stats_text(tracer)
|
||||
stats_text = build_live_stats_text(tracer, agent_config)
|
||||
if stats_text:
|
||||
status_text.append(stats_text)
|
||||
|
||||
|
||||
@@ -10,6 +10,7 @@ import os
|
||||
import shutil
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import litellm
|
||||
from docker.errors import DockerException
|
||||
@@ -56,10 +57,7 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
|
||||
)
|
||||
|
||||
if not os.getenv("LLM_API_KEY"):
|
||||
if not has_base_url:
|
||||
missing_required_vars.append("LLM_API_KEY")
|
||||
else:
|
||||
missing_optional_vars.append("LLM_API_KEY")
|
||||
missing_optional_vars.append("LLM_API_KEY")
|
||||
|
||||
if not has_base_url:
|
||||
missing_optional_vars.append("LLM_API_BASE")
|
||||
@@ -92,13 +90,6 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
|
||||
" - Model name to use with litellm (e.g., 'openai/gpt-5')\n",
|
||||
style="white",
|
||||
)
|
||||
elif var == "LLM_API_KEY":
|
||||
error_text.append("• ", style="white")
|
||||
error_text.append("LLM_API_KEY", style="bold cyan")
|
||||
error_text.append(
|
||||
" - API key for the LLM provider (required for cloud providers)\n",
|
||||
style="white",
|
||||
)
|
||||
|
||||
if missing_optional_vars:
|
||||
error_text.append("\nOptional environment variables:\n", style="white")
|
||||
@@ -106,7 +97,11 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
|
||||
if var == "LLM_API_KEY":
|
||||
error_text.append("• ", style="white")
|
||||
error_text.append("LLM_API_KEY", style="bold cyan")
|
||||
error_text.append(" - API key for the LLM provider\n", style="white")
|
||||
error_text.append(
|
||||
" - API key for the LLM provider "
|
||||
"(not needed for local models, Vertex AI, AWS, etc.)\n",
|
||||
style="white",
|
||||
)
|
||||
elif var == "LLM_API_BASE":
|
||||
error_text.append("• ", style="white")
|
||||
error_text.append("LLM_API_BASE", style="bold cyan")
|
||||
@@ -125,14 +120,12 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
|
||||
error_text.append("\nExample setup:\n", style="white")
|
||||
error_text.append("export STRIX_LLM='openai/gpt-5'\n", style="dim white")
|
||||
|
||||
if "LLM_API_KEY" in missing_required_vars:
|
||||
error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
|
||||
|
||||
if missing_optional_vars:
|
||||
for var in missing_optional_vars:
|
||||
if var == "LLM_API_KEY":
|
||||
error_text.append(
|
||||
"export LLM_API_KEY='your-api-key-here' # optional with local models\n",
|
||||
"export LLM_API_KEY='your-api-key-here' "
|
||||
"# not needed for local models, Vertex AI, AWS, etc.\n",
|
||||
style="dim white",
|
||||
)
|
||||
elif var == "LLM_API_BASE":
|
||||
@@ -189,18 +182,12 @@ async def warm_up_llm() -> None:
|
||||
try:
|
||||
model_name = os.getenv("STRIX_LLM", "openai/gpt-5")
|
||||
api_key = os.getenv("LLM_API_KEY")
|
||||
|
||||
if api_key:
|
||||
litellm.api_key = api_key
|
||||
|
||||
api_base = (
|
||||
os.getenv("LLM_API_BASE")
|
||||
or os.getenv("OPENAI_API_BASE")
|
||||
or os.getenv("LITELLM_BASE_URL")
|
||||
or os.getenv("OLLAMA_API_BASE")
|
||||
)
|
||||
if api_base:
|
||||
litellm.api_base = api_base
|
||||
|
||||
test_messages = [
|
||||
{"role": "system", "content": "You are a helpful assistant."},
|
||||
@@ -209,11 +196,17 @@ async def warm_up_llm() -> None:
|
||||
|
||||
llm_timeout = int(os.getenv("LLM_TIMEOUT", "600"))
|
||||
|
||||
response = litellm.completion(
|
||||
model=model_name,
|
||||
messages=test_messages,
|
||||
timeout=llm_timeout,
|
||||
)
|
||||
completion_kwargs: dict[str, Any] = {
|
||||
"model": model_name,
|
||||
"messages": test_messages,
|
||||
"timeout": llm_timeout,
|
||||
}
|
||||
if api_key:
|
||||
completion_kwargs["api_key"] = api_key
|
||||
if api_base:
|
||||
completion_kwargs["api_base"] = api_base
|
||||
|
||||
response = litellm.completion(**completion_kwargs)
|
||||
|
||||
validate_llm_response(response)
|
||||
|
||||
@@ -240,6 +233,15 @@ async def warm_up_llm() -> None:
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
def get_version() -> str:
|
||||
try:
|
||||
from importlib.metadata import version
|
||||
|
||||
return version("strix-agent")
|
||||
except Exception: # noqa: BLE001
|
||||
return "unknown"
|
||||
|
||||
|
||||
def parse_arguments() -> argparse.Namespace:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Strix Multi-Agent Cybersecurity Penetration Testing Tool",
|
||||
@@ -270,11 +272,18 @@ Examples:
|
||||
strix --target example.com --instruction "Focus on authentication vulnerabilities"
|
||||
|
||||
# Custom instructions (from file)
|
||||
strix --target example.com --instruction ./instructions.txt
|
||||
strix --target https://app.com --instruction /path/to/detailed_instructions.md
|
||||
strix --target example.com --instruction-file ./instructions.txt
|
||||
strix --target https://app.com --instruction-file /path/to/detailed_instructions.md
|
||||
""",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"-v",
|
||||
"--version",
|
||||
action="version",
|
||||
version=f"strix {get_version()}",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"-t",
|
||||
"--target",
|
||||
@@ -292,9 +301,15 @@ Examples:
|
||||
"testing approaches (e.g., 'Perform thorough authentication testing'), "
|
||||
"test credentials (e.g., 'Use the following credentials to access the app: "
|
||||
"admin:password123'), "
|
||||
"or areas of interest (e.g., 'Check login API endpoint for security issues'). "
|
||||
"You can also provide a path to a file containing detailed instructions "
|
||||
"(e.g., '--instruction ./instructions.txt').",
|
||||
"or areas of interest (e.g., 'Check login API endpoint for security issues').",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--instruction-file",
|
||||
type=str,
|
||||
help="Path to a file containing detailed custom instructions for the penetration test. "
|
||||
"Use this option when you have lengthy or complex instructions saved in a file "
|
||||
"(e.g., '--instruction-file ./detailed_instructions.txt').",
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
@@ -313,18 +328,37 @@ Examples:
|
||||
),
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"-m",
|
||||
"--scan-mode",
|
||||
type=str,
|
||||
choices=["quick", "standard", "deep"],
|
||||
default="deep",
|
||||
help=(
|
||||
"Scan mode: "
|
||||
"'quick' for fast CI/CD checks, "
|
||||
"'standard' for routine testing, "
|
||||
"'deep' for thorough security reviews (default). "
|
||||
"Default: deep."
|
||||
),
|
||||
)
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.instruction:
|
||||
instruction_path = Path(args.instruction)
|
||||
if instruction_path.exists() and instruction_path.is_file():
|
||||
try:
|
||||
with instruction_path.open(encoding="utf-8") as f:
|
||||
args.instruction = f.read().strip()
|
||||
if not args.instruction:
|
||||
parser.error(f"Instruction file '{instruction_path}' is empty")
|
||||
except Exception as e: # noqa: BLE001
|
||||
parser.error(f"Failed to read instruction file '{instruction_path}': {e}")
|
||||
if args.instruction and args.instruction_file:
|
||||
parser.error(
|
||||
"Cannot specify both --instruction and --instruction-file. Use one or the other."
|
||||
)
|
||||
|
||||
if args.instruction_file:
|
||||
instruction_path = Path(args.instruction_file)
|
||||
try:
|
||||
with instruction_path.open(encoding="utf-8") as f:
|
||||
args.instruction = f.read().strip()
|
||||
if not args.instruction:
|
||||
parser.error(f"Instruction file '{instruction_path}' is empty")
|
||||
except Exception as e: # noqa: BLE001
|
||||
parser.error(f"Failed to read instruction file '{instruction_path}': {e}")
|
||||
|
||||
args.targets_info = []
|
||||
for target in args.target:
|
||||
@@ -410,6 +444,9 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
|
||||
console.print("\n")
|
||||
console.print(panel)
|
||||
console.print()
|
||||
console.print("[dim]🌐 Website:[/] [cyan]https://usestrix.com[/]")
|
||||
console.print("[dim]💬 Discord:[/] [cyan]https://discord.gg/YjKFvEZSdZ[/]")
|
||||
console.print()
|
||||
|
||||
|
||||
def pull_docker_image() -> None:
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
from . import (
|
||||
agent_message_renderer,
|
||||
agents_graph_renderer,
|
||||
browser_renderer,
|
||||
file_edit_renderer,
|
||||
@@ -10,6 +11,7 @@ from . import (
|
||||
scan_info_renderer,
|
||||
terminal_renderer,
|
||||
thinking_renderer,
|
||||
todo_renderer,
|
||||
user_message_renderer,
|
||||
web_search_renderer,
|
||||
)
|
||||
@@ -20,6 +22,7 @@ from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer
|
||||
__all__ = [
|
||||
"BaseToolRenderer",
|
||||
"ToolTUIRegistry",
|
||||
"agent_message_renderer",
|
||||
"agents_graph_renderer",
|
||||
"browser_renderer",
|
||||
"file_edit_renderer",
|
||||
@@ -34,6 +37,7 @@ __all__ = [
|
||||
"scan_info_renderer",
|
||||
"terminal_renderer",
|
||||
"thinking_renderer",
|
||||
"todo_renderer",
|
||||
"user_message_renderer",
|
||||
"web_search_renderer",
|
||||
]
|
||||
|
||||
70
strix/interface/tool_components/agent_message_renderer.py
Normal file
70
strix/interface/tool_components/agent_message_renderer.py
Normal file
@@ -0,0 +1,70 @@
|
||||
import re
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
def markdown_to_rich(text: str) -> str:
    """Translate a small markdown subset into Rich console markup.

    Supports fenced code blocks, headers, links, bold, italic, inline
    code, and strikethrough. Anything else passes through unchanged.
    """
    # Each rule is (pattern, replacement, flags), applied sequentially in
    # the same order as the original hand-written re.sub chain.
    rules: tuple[tuple[str, str, int], ...] = (
        # Fenced code blocks: ```lang\n...\n``` or ```\n...\n```
        (r"```(?:\w*)\n(.*?)```", r"[dim]\1[/dim]", re.DOTALL),
        # Headers
        (r"^#### (.+)$", r"[bold]\1[/bold]", re.MULTILINE),
        (r"^### (.+)$", r"[bold]\1[/bold]", re.MULTILINE),
        (r"^## (.+)$", r"[bold]\1[/bold]", re.MULTILINE),
        (r"^# (.+)$", r"[bold]\1[/bold]", re.MULTILINE),
        # Links
        (r"\[([^\]]+)\]\(([^)]+)\)", r"[underline]\1[/underline] [dim](\2)[/dim]", 0),
        # Bold
        (r"\*\*(.+?)\*\*", r"[bold]\1[/bold]", 0),
        (r"__(.+?)__", r"[bold]\1[/bold]", 0),
        # Italic (lookarounds keep '**' / '__' markers from matching here)
        (r"(?<!\*)\*(?!\*)(.+?)(?<!\*)\*(?!\*)", r"[italic]\1[/italic]", 0),
        (r"(?<![_\w])_(?!_)(.+?)(?<!_)_(?![_\w])", r"[italic]\1[/italic]", 0),
        # Inline code
        (r"`([^`]+)`", r"[bold dim]\1[/bold dim]", 0),
        # Strikethrough
        (r"~~(.+?)~~", r"[strike]\1[/strike]", 0),
    )
    for pattern, replacement, flags in rules:
        text = re.sub(pattern, replacement, text, flags=flags)
    return text
|
||||
|
||||
|
||||
@register_tool_renderer
class AgentMessageRenderer(BaseToolRenderer):
    """Renders agent chat messages, converting markdown to Rich markup."""

    tool_name: ClassVar[str] = "agent_message"
    css_classes: ClassVar[list[str]] = ["chat-message", "agent-message"]

    @classmethod
    def render(cls, message_data: dict[str, Any]) -> Static:
        """Build a Static widget for ``message_data["content"]``.

        Returns an empty widget when there is no content.
        """
        content = message_data.get("content", "")
        css_classes = " ".join(cls.css_classes)

        if not content:
            # Fix: the empty-content path previously passed the raw
            # cls.css_classes list, while the populated path passed the
            # joined string; Textual's ``classes`` argument expects a
            # space-separated string, so join in both paths.
            return Static("", classes=css_classes)

        formatted_content = cls._format_agent_message(content)
        return Static(formatted_content, classes=css_classes)

    @classmethod
    def render_simple(cls, content: str) -> str:
        """Return the formatted message as a plain markup string ("" if empty)."""
        if not content:
            return ""

        return cls._format_agent_message(content)

    @classmethod
    def _format_agent_message(cls, content: str) -> str:
        # Escape any literal Rich markup in the message first, then apply
        # the markdown -> Rich conversion on the escaped text.
        escaped_content = cls.escape_markup(content)
        return markdown_to_rich(escaped_content)
|
||||
@@ -1,16 +1,53 @@
|
||||
from functools import cache
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from pygments.lexers import get_lexer_by_name
|
||||
from pygments.styles import get_style_by_name
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
@cache
|
||||
def _get_style_colors() -> dict[Any, str]:
|
||||
style = get_style_by_name("native")
|
||||
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
class BrowserRenderer(BaseToolRenderer):
|
||||
tool_name: ClassVar[str] = "browser_action"
|
||||
css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]
|
||||
|
||||
@classmethod
|
||||
def _get_token_color(cls, token_type: Any) -> str | None:
|
||||
colors = _get_style_colors()
|
||||
while token_type:
|
||||
if token_type in colors:
|
||||
return colors[token_type]
|
||||
token_type = token_type.parent
|
||||
return None
|
||||
|
||||
@classmethod
|
||||
def _highlight_js(cls, code: str) -> str:
|
||||
lexer = get_lexer_by_name("javascript")
|
||||
result_parts: list[str] = []
|
||||
|
||||
for token_type, token_value in lexer.get_tokens(code):
|
||||
if not token_value:
|
||||
continue
|
||||
|
||||
escaped_value = cls.escape_markup(token_value)
|
||||
color = cls._get_token_color(token_type)
|
||||
|
||||
if color:
|
||||
result_parts.append(f"[{color}]{escaped_value}[/]")
|
||||
else:
|
||||
result_parts.append(escaped_value)
|
||||
|
||||
return "".join(result_parts)
|
||||
|
||||
@classmethod
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
args = tool_data.get("args", {})
|
||||
@@ -115,6 +152,5 @@ class BrowserRenderer(BaseToolRenderer):
|
||||
|
||||
@classmethod
|
||||
def _format_js(cls, js_code: str) -> str:
|
||||
if len(js_code) > 200:
|
||||
js_code = js_code[:197] + "..."
|
||||
return f"[white]{cls.escape_markup(js_code)}[/white]"
|
||||
code_display = js_code[:2000] + "..." if len(js_code) > 2000 else js_code
|
||||
return cls._highlight_js(code_display)
|
||||
|
||||
@@ -1,16 +1,61 @@
|
||||
from functools import cache
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
|
||||
from pygments.styles import get_style_by_name
|
||||
from pygments.util import ClassNotFound
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
@cache
|
||||
def _get_style_colors() -> dict[Any, str]:
|
||||
style = get_style_by_name("native")
|
||||
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
|
||||
|
||||
|
||||
def _get_lexer_for_file(path: str) -> Any:
|
||||
try:
|
||||
return get_lexer_for_filename(path)
|
||||
except ClassNotFound:
|
||||
return get_lexer_by_name("text")
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
class StrReplaceEditorRenderer(BaseToolRenderer):
|
||||
tool_name: ClassVar[str] = "str_replace_editor"
|
||||
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
|
||||
|
||||
@classmethod
|
||||
def _get_token_color(cls, token_type: Any) -> str | None:
|
||||
colors = _get_style_colors()
|
||||
while token_type:
|
||||
if token_type in colors:
|
||||
return colors[token_type]
|
||||
token_type = token_type.parent
|
||||
return None
|
||||
|
||||
@classmethod
|
||||
def _highlight_code(cls, code: str, path: str) -> str:
|
||||
lexer = _get_lexer_for_file(path)
|
||||
result_parts: list[str] = []
|
||||
|
||||
for token_type, token_value in lexer.get_tokens(code):
|
||||
if not token_value:
|
||||
continue
|
||||
|
||||
escaped_value = cls.escape_markup(token_value)
|
||||
color = cls._get_token_color(token_type)
|
||||
|
||||
if color:
|
||||
result_parts.append(f"[{color}]{escaped_value}[/]")
|
||||
else:
|
||||
result_parts.append(escaped_value)
|
||||
|
||||
return "".join(result_parts)
|
||||
|
||||
@classmethod
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
args = tool_data.get("args", {})
|
||||
@@ -18,6 +63,9 @@ class StrReplaceEditorRenderer(BaseToolRenderer):
|
||||
|
||||
command = args.get("command", "")
|
||||
path = args.get("path", "")
|
||||
old_str = args.get("old_str", "")
|
||||
new_str = args.get("new_str", "")
|
||||
file_text = args.get("file_text", "")
|
||||
|
||||
if command == "view":
|
||||
header = "📖 [bold #10b981]Reading file[/]"
|
||||
@@ -32,12 +80,33 @@ class StrReplaceEditorRenderer(BaseToolRenderer):
|
||||
else:
|
||||
header = "📄 [bold #10b981]File operation[/]"
|
||||
|
||||
if (result and isinstance(result, dict) and "content" in result) or path:
|
||||
path_display = path[-60:] if len(path) > 60 else path
|
||||
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
|
||||
else:
|
||||
content_text = f"{header} [dim]Processing...[/]"
|
||||
path_display = path[-60:] if len(path) > 60 else path
|
||||
content_parts = [f"{header} [dim]{cls.escape_markup(path_display)}[/]"]
|
||||
|
||||
if command == "str_replace" and (old_str or new_str):
|
||||
if old_str:
|
||||
old_display = old_str[:1000] + "..." if len(old_str) > 1000 else old_str
|
||||
highlighted_old = cls._highlight_code(old_display, path)
|
||||
old_lines = highlighted_old.split("\n")
|
||||
content_parts.extend(f"[#ef4444]-[/] {line}" for line in old_lines)
|
||||
if new_str:
|
||||
new_display = new_str[:1000] + "..." if len(new_str) > 1000 else new_str
|
||||
highlighted_new = cls._highlight_code(new_display, path)
|
||||
new_lines = highlighted_new.split("\n")
|
||||
content_parts.extend(f"[#22c55e]+[/] {line}" for line in new_lines)
|
||||
elif command == "create" and file_text:
|
||||
text_display = file_text[:1500] + "..." if len(file_text) > 1500 else file_text
|
||||
highlighted_text = cls._highlight_code(text_display, path)
|
||||
content_parts.append(highlighted_text)
|
||||
elif command == "insert" and new_str:
|
||||
new_display = new_str[:1000] + "..." if len(new_str) > 1000 else new_str
|
||||
highlighted_new = cls._highlight_code(new_display, path)
|
||||
new_lines = highlighted_new.split("\n")
|
||||
content_parts.extend(f"[#22c55e]+[/] {line}" for line in new_lines)
|
||||
elif not (result and isinstance(result, dict) and "content" in result) and not path:
|
||||
content_parts = [f"{header} [dim]Processing...[/]"]
|
||||
|
||||
content_text = "\n".join(content_parts)
|
||||
css_classes = cls.get_css_classes("completed")
|
||||
return Static(content_text, classes=css_classes)
|
||||
|
||||
|
||||
@@ -6,6 +6,12 @@ from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
def _truncate(text: str, length: int = 800) -> str:
|
||||
if len(text) <= length:
|
||||
return text
|
||||
return text[: length - 3] + "..."
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
class CreateNoteRenderer(BaseToolRenderer):
|
||||
tool_name: ClassVar[str] = "create_note"
|
||||
@@ -17,23 +23,24 @@ class CreateNoteRenderer(BaseToolRenderer):
|
||||
|
||||
title = args.get("title", "")
|
||||
content = args.get("content", "")
|
||||
category = args.get("category", "general")
|
||||
|
||||
header = "📝 [bold #fbbf24]Note[/]"
|
||||
header = f"📝 [bold #fbbf24]Note[/] [dim]({category})[/]"
|
||||
|
||||
lines = [header]
|
||||
if title:
|
||||
title_display = title[:100] + "..." if len(title) > 100 else title
|
||||
note_parts = [f"{header}\n [bold]{cls.escape_markup(title_display)}[/]"]
|
||||
title_display = _truncate(title.strip(), 300)
|
||||
lines.append(f" {cls.escape_markup(title_display)}")
|
||||
|
||||
if content:
|
||||
content_display = content[:200] + "..." if len(content) > 200 else content
|
||||
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
|
||||
if content:
|
||||
content_display = _truncate(content.strip(), 800)
|
||||
lines.append(f" [dim]{cls.escape_markup(content_display)}[/]")
|
||||
|
||||
content_text = "\n".join(note_parts)
|
||||
else:
|
||||
content_text = f"{header}\n [dim]Creating note...[/]"
|
||||
if len(lines) == 1:
|
||||
lines.append(" [dim]Capturing...[/]")
|
||||
|
||||
css_classes = cls.get_css_classes("completed")
|
||||
return Static(content_text, classes=css_classes)
|
||||
return Static("\n".join(lines), classes=css_classes)
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
@@ -43,8 +50,8 @@ class DeleteNoteRenderer(BaseToolRenderer):
|
||||
|
||||
@classmethod
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
|
||||
header = "🗑️ [bold #fbbf24]Delete Note[/]"
|
||||
content_text = f"{header}\n [dim]Deleting...[/]"
|
||||
header = "📝 [bold #94a3b8]Note Removed[/]"
|
||||
content_text = header
|
||||
|
||||
css_classes = cls.get_css_classes("completed")
|
||||
return Static(content_text, classes=css_classes)
|
||||
@@ -59,28 +66,24 @@ class UpdateNoteRenderer(BaseToolRenderer):
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
args = tool_data.get("args", {})
|
||||
|
||||
title = args.get("title", "")
|
||||
content = args.get("content", "")
|
||||
title = args.get("title")
|
||||
content = args.get("content")
|
||||
|
||||
header = "✏️ [bold #fbbf24]Update Note[/]"
|
||||
header = "📝 [bold #fbbf24]Note Updated[/]"
|
||||
lines = [header]
|
||||
|
||||
if title or content:
|
||||
note_parts = [header]
|
||||
if title:
|
||||
lines.append(f" {cls.escape_markup(_truncate(title, 300))}")
|
||||
|
||||
if title:
|
||||
title_display = title[:100] + "..." if len(title) > 100 else title
|
||||
note_parts.append(f" [bold]{cls.escape_markup(title_display)}[/]")
|
||||
if content:
|
||||
content_display = _truncate(content.strip(), 800)
|
||||
lines.append(f" [dim]{cls.escape_markup(content_display)}[/]")
|
||||
|
||||
if content:
|
||||
content_display = content[:200] + "..." if len(content) > 200 else content
|
||||
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
|
||||
|
||||
content_text = "\n".join(note_parts)
|
||||
else:
|
||||
content_text = f"{header}\n [dim]Updating...[/]"
|
||||
if len(lines) == 1:
|
||||
lines.append(" [dim]Updating...[/]")
|
||||
|
||||
css_classes = cls.get_css_classes("completed")
|
||||
return Static(content_text, classes=css_classes)
|
||||
return Static("\n".join(lines), classes=css_classes)
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
@@ -92,17 +95,34 @@ class ListNotesRenderer(BaseToolRenderer):
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
result = tool_data.get("result")
|
||||
|
||||
header = "📋 [bold #fbbf24]Listing notes[/]"
|
||||
header = "📝 [bold #fbbf24]Notes[/]"
|
||||
|
||||
if result and isinstance(result, dict) and "notes" in result:
|
||||
notes = result["notes"]
|
||||
if isinstance(notes, list):
|
||||
count = len(notes)
|
||||
content_text = f"{header}\n [dim]{count} notes found[/]"
|
||||
if result and isinstance(result, dict) and result.get("success"):
|
||||
count = result.get("total_count", 0)
|
||||
notes = result.get("notes", []) or []
|
||||
lines = [header]
|
||||
|
||||
if count == 0:
|
||||
lines.append(" [dim]No notes[/]")
|
||||
else:
|
||||
content_text = f"{header}\n [dim]No notes found[/]"
|
||||
for note in notes[:5]:
|
||||
title = note.get("title", "").strip() or "(untitled)"
|
||||
category = note.get("category", "general")
|
||||
content = note.get("content", "").strip()
|
||||
|
||||
lines.append(
|
||||
f" - {cls.escape_markup(_truncate(title, 300))} [dim]({category})[/]"
|
||||
)
|
||||
if content:
|
||||
content_preview = _truncate(content, 400)
|
||||
lines.append(f" [dim]{cls.escape_markup(content_preview)}[/]")
|
||||
|
||||
remaining = max(count - 5, 0)
|
||||
if remaining:
|
||||
lines.append(f" [dim]... +{remaining} more[/]")
|
||||
content_text = "\n".join(lines)
|
||||
else:
|
||||
content_text = f"{header}\n [dim]Listing notes...[/]"
|
||||
content_text = f"{header}\n [dim]Loading...[/]"
|
||||
|
||||
css_classes = cls.get_css_classes("completed")
|
||||
return Static(content_text, classes=css_classes)
|
||||
|
||||
@@ -1,16 +1,53 @@
|
||||
from functools import cache
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from pygments.lexers import PythonLexer
|
||||
from pygments.styles import get_style_by_name
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
@cache
|
||||
def _get_style_colors() -> dict[Any, str]:
|
||||
style = get_style_by_name("native")
|
||||
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
class PythonRenderer(BaseToolRenderer):
|
||||
tool_name: ClassVar[str] = "python_action"
|
||||
css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]
|
||||
|
||||
@classmethod
|
||||
def _get_token_color(cls, token_type: Any) -> str | None:
|
||||
colors = _get_style_colors()
|
||||
while token_type:
|
||||
if token_type in colors:
|
||||
return colors[token_type]
|
||||
token_type = token_type.parent
|
||||
return None
|
||||
|
||||
@classmethod
|
||||
def _highlight_python(cls, code: str) -> str:
|
||||
lexer = PythonLexer()
|
||||
result_parts: list[str] = []
|
||||
|
||||
for token_type, token_value in lexer.get_tokens(code):
|
||||
if not token_value:
|
||||
continue
|
||||
|
||||
escaped_value = cls.escape_markup(token_value)
|
||||
color = cls._get_token_color(token_type)
|
||||
|
||||
if color:
|
||||
result_parts.append(f"[{color}]{escaped_value}[/]")
|
||||
else:
|
||||
result_parts.append(escaped_value)
|
||||
|
||||
return "".join(result_parts)
|
||||
|
||||
@classmethod
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
args = tool_data.get("args", {})
|
||||
@@ -21,8 +58,9 @@ class PythonRenderer(BaseToolRenderer):
|
||||
header = "</> [bold #3b82f6]Python[/]"
|
||||
|
||||
if code and action in ["new_session", "execute"]:
|
||||
code_display = code[:600] + "..." if len(code) > 600 else code
|
||||
content_text = f"{header}\n [italic white]{cls.escape_markup(code_display)}[/]"
|
||||
code_display = code[:2000] + "..." if len(code) > 2000 else code
|
||||
highlighted_code = cls._highlight_python(code_display)
|
||||
content_text = f"{header}\n{highlighted_code}"
|
||||
elif action == "close":
|
||||
content_text = f"{header}\n [dim]Closing session...[/]"
|
||||
elif action == "list_sessions":
|
||||
|
||||
@@ -1,16 +1,53 @@
|
||||
from functools import cache
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from pygments.lexers import get_lexer_by_name
|
||||
from pygments.styles import get_style_by_name
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
@cache
|
||||
def _get_style_colors() -> dict[Any, str]:
|
||||
style = get_style_by_name("native")
|
||||
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
|
||||
|
||||
|
||||
@register_tool_renderer
|
||||
class TerminalRenderer(BaseToolRenderer):
|
||||
tool_name: ClassVar[str] = "terminal_execute"
|
||||
css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]
|
||||
|
||||
@classmethod
|
||||
def _get_token_color(cls, token_type: Any) -> str | None:
|
||||
colors = _get_style_colors()
|
||||
while token_type:
|
||||
if token_type in colors:
|
||||
return colors[token_type]
|
||||
token_type = token_type.parent
|
||||
return None
|
||||
|
||||
@classmethod
|
||||
def _highlight_bash(cls, code: str) -> str:
|
||||
lexer = get_lexer_by_name("bash")
|
||||
result_parts: list[str] = []
|
||||
|
||||
for token_type, token_value in lexer.get_tokens(code):
|
||||
if not token_value:
|
||||
continue
|
||||
|
||||
escaped_value = cls.escape_markup(token_value)
|
||||
color = cls._get_token_color(token_type)
|
||||
|
||||
if color:
|
||||
result_parts.append(f"[{color}]{escaped_value}[/]")
|
||||
else:
|
||||
result_parts.append(escaped_value)
|
||||
|
||||
return "".join(result_parts)
|
||||
|
||||
@classmethod
|
||||
def render(cls, tool_data: dict[str, Any]) -> Static:
|
||||
args = tool_data.get("args", {})
|
||||
@@ -115,17 +152,15 @@ class TerminalRenderer(BaseToolRenderer):
|
||||
|
||||
if is_input:
|
||||
formatted_command = cls._format_command_display(command)
|
||||
return f"{terminal_icon} [#3b82f6]>>>[/] [#22c55e]{formatted_command}[/]"
|
||||
return f"{terminal_icon} [#3b82f6]>>>[/] {formatted_command}"
|
||||
|
||||
formatted_command = cls._format_command_display(command)
|
||||
return f"{terminal_icon} [#22c55e]$ {formatted_command}[/]"
|
||||
return f"{terminal_icon} [#22c55e]$[/] {formatted_command}"
|
||||
|
||||
@classmethod
|
||||
def _format_command_display(cls, command: str) -> str:
|
||||
if not command:
|
||||
return ""
|
||||
|
||||
if len(command) > 400:
|
||||
command = command[:397] + "..."
|
||||
|
||||
return cls.escape_markup(command)
|
||||
cmd_display = command[:2000] + "..." if len(command) > 2000 else command
|
||||
return cls._highlight_bash(cmd_display)
|
||||
|
||||
204
strix/interface/tool_components/todo_renderer.py
Normal file
204
strix/interface/tool_components/todo_renderer.py
Normal file
@@ -0,0 +1,204 @@
|
||||
from typing import Any, ClassVar
|
||||
|
||||
from textual.widgets import Static
|
||||
|
||||
from .base_renderer import BaseToolRenderer
|
||||
from .registry import register_tool_renderer
|
||||
|
||||
|
||||
STATUS_MARKERS = {
|
||||
"pending": "[ ]",
|
||||
"in_progress": "[~]",
|
||||
"done": "[•]",
|
||||
}
|
||||
|
||||
|
||||
def _truncate(text: str, length: int = 80) -> str:
|
||||
if len(text) <= length:
|
||||
return text
|
||||
return text[: length - 3] + "..."
|
||||
|
||||
|
||||
def _format_todo_lines(
    cls: type[BaseToolRenderer], result: dict[str, Any], limit: int = 25
) -> list[str]:
    """Format the todos in *result* as indented Rich-markup lines.

    Shows at most *limit* entries, followed by a "... +N more" summary
    line when there are more. Done items are struck through, in-progress
    items are italicized.
    """
    todos = result.get("todos")
    if not isinstance(todos, list) or not todos:
        return [" [dim]No todos[/]"]

    lines: list[str] = []

    for todo in todos[:limit]:
        status = todo.get("status", "pending")
        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])

        # Unknown/missing titles render as a placeholder rather than blank.
        raw_title = todo.get("title", "").strip() or "(untitled)"
        title = cls.escape_markup(_truncate(raw_title, 90))

        if status == "done":
            styled_title = f"[dim strike]{title}[/]"
        elif status == "in_progress":
            styled_title = f"[italic]{title}[/]"
        else:
            styled_title = title

        lines.append(f" {marker} {styled_title}")

    hidden = len(todos) - limit
    if hidden > 0:
        lines.append(f" [dim]... +{hidden} more[/]")

    return lines
|
||||
|
||||
|
||||
@register_tool_renderer
class CreateTodoRenderer(BaseToolRenderer):
    """Renders the result of the ``create_todo`` tool call."""

    tool_name: ClassVar[str] = "create_todo"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for a create_todo call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #a78bfa]Todo[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Creating...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Failed to create todo")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
|
||||
|
||||
@register_tool_renderer
class ListTodosRenderer(BaseToolRenderer):
    """Renders the result of the ``list_todos`` tool call."""

    tool_name: ClassVar[str] = "list_todos"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for a list_todos call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #a78bfa]Todos[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Loading...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Unable to list todos")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
|
||||
|
||||
@register_tool_renderer
class UpdateTodoRenderer(BaseToolRenderer):
    """Renders the result of the ``update_todo`` tool call."""

    tool_name: ClassVar[str] = "update_todo"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for an update_todo call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #a78bfa]Todo Updated[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Updating...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Failed to update todo")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
|
||||
|
||||
@register_tool_renderer
class MarkTodoDoneRenderer(BaseToolRenderer):
    """Renders the result of the ``mark_todo_done`` tool call."""

    tool_name: ClassVar[str] = "mark_todo_done"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for a mark_todo_done call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #a78bfa]Todo Completed[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Marking done...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Failed to mark todo done")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
|
||||
|
||||
@register_tool_renderer
class MarkTodoPendingRenderer(BaseToolRenderer):
    """Renders the result of the ``mark_todo_pending`` tool call."""

    tool_name: ClassVar[str] = "mark_todo_pending"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for a mark_todo_pending call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #f59e0b]Todo Reopened[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Reopening...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Failed to reopen todo")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
|
||||
|
||||
@register_tool_renderer
class DeleteTodoRenderer(BaseToolRenderer):
    """Renders the result of the ``delete_todo`` tool call."""

    tool_name: ClassVar[str] = "delete_todo"
    css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]

    @classmethod
    def render(cls, tool_data: dict[str, Any]) -> Static:
        """Build a Static widget for a delete_todo call (pending, success, or error)."""
        result = tool_data.get("result")
        header = "📋 [bold #94a3b8]Todo Removed[/]"

        if not (result and isinstance(result, dict)):
            # No result yet — the call is still in flight.
            body = f"{header}\n [dim]Removing...[/]"
        elif result.get("success"):
            body = "\n".join([header, *_format_todo_lines(cls, result)])
        else:
            error = result.get("error", "Failed to remove todo")
            body = f"{header}\n [#ef4444]{cls.escape_markup(error)}[/]"

        return Static(body, classes=cls.get_css_classes("completed"))
|
||||
@@ -319,7 +319,8 @@ class StrixTUIApp(App): # type: ignore[misc]
|
||||
}
|
||||
|
||||
def _build_agent_config(self, args: argparse.Namespace) -> dict[str, Any]:
|
||||
llm_config = LLMConfig()
|
||||
scan_mode = getattr(args, "scan_mode", "deep")
|
||||
llm_config = LLMConfig(scan_mode=scan_mode)
|
||||
|
||||
config = {
|
||||
"llm_config": llm_config,
|
||||
@@ -676,7 +677,7 @@ class StrixTUIApp(App): # type: ignore[misc]
|
||||
|
||||
stats_content = Text()
|
||||
|
||||
stats_text = build_live_stats_text(self.tracer)
|
||||
stats_text = build_live_stats_text(self.tracer, self.agent_config)
|
||||
if stats_text:
|
||||
stats_content.append(stats_text)
|
||||
|
||||
@@ -987,7 +988,7 @@ class StrixTUIApp(App): # type: ignore[misc]
|
||||
|
||||
def _render_chat_content(self, msg_data: dict[str, Any]) -> str:
|
||||
role = msg_data.get("role")
|
||||
content = escape_markup(msg_data.get("content", ""))
|
||||
content = msg_data.get("content", "")
|
||||
|
||||
if not content:
|
||||
return ""
|
||||
@@ -995,8 +996,11 @@ class StrixTUIApp(App): # type: ignore[misc]
|
||||
if role == "user":
|
||||
from strix.interface.tool_components.user_message_renderer import UserMessageRenderer
|
||||
|
||||
return UserMessageRenderer.render_simple(content)
|
||||
return content
|
||||
return UserMessageRenderer.render_simple(escape_markup(content))
|
||||
|
||||
from strix.interface.tool_components.agent_message_renderer import AgentMessageRenderer
|
||||
|
||||
return AgentMessageRenderer.render_simple(content)
|
||||
|
||||
def _render_tool_content_simple(self, tool_data: dict[str, Any]) -> str:
|
||||
tool_name = tool_data.get("tool_name", "Unknown Tool")
|
||||
|
||||
@@ -129,7 +129,7 @@ def build_final_stats_text(tracer: Any) -> Text:
|
||||
return stats_text
|
||||
|
||||
|
||||
def build_live_stats_text(tracer: Any) -> Text:
|
||||
def build_live_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
|
||||
stats_text = Text()
|
||||
if not tracer:
|
||||
return stats_text
|
||||
@@ -165,6 +165,12 @@ def build_live_stats_text(tracer: Any) -> Text:
|
||||
|
||||
stats_text.append("\n")
|
||||
|
||||
if agent_config:
|
||||
llm_config = agent_config["llm_config"]
|
||||
model = getattr(llm_config, "model_name", "Unknown")
|
||||
stats_text.append(f"🧠 Model: {model}")
|
||||
stats_text.append("\n")
|
||||
|
||||
stats_text.append("🤖 Agents: ", style="bold white")
|
||||
stats_text.append(str(agent_count), style="dim white")
|
||||
stats_text.append(" • ", style="dim white")
|
||||
|
||||
@@ -11,5 +11,3 @@ __all__ = [
|
||||
]
|
||||
|
||||
litellm._logging._disable_debugging()
|
||||
|
||||
litellm.drop_params = True
|
||||
|
||||
@@ -8,6 +8,7 @@ class LLMConfig:
|
||||
enable_prompt_caching: bool = True,
|
||||
prompt_modules: list[str] | None = None,
|
||||
timeout: int | None = None,
|
||||
scan_mode: str = "deep",
|
||||
):
|
||||
self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
|
||||
|
||||
@@ -17,4 +18,6 @@ class LLMConfig:
|
||||
self.enable_prompt_caching = enable_prompt_caching
|
||||
self.prompt_modules = prompt_modules or []
|
||||
|
||||
self.timeout = timeout or int(os.getenv("LLM_TIMEOUT", "600"))
|
||||
self.timeout = timeout or int(os.getenv("LLM_TIMEOUT", "300"))
|
||||
|
||||
self.scan_mode = scan_mode if scan_mode in ["quick", "standard", "deep"] else "deep"
|
||||
|
||||
@@ -13,7 +13,7 @@ from jinja2 import (
|
||||
select_autoescape,
|
||||
)
|
||||
from litellm import ModelResponse, completion_cost
|
||||
from litellm.utils import supports_prompt_caching
|
||||
from litellm.utils import supports_prompt_caching, supports_vision
|
||||
|
||||
from strix.llm.config import LLMConfig
|
||||
from strix.llm.memory_compressor import MemoryCompressor
|
||||
@@ -25,18 +25,16 @@ from strix.tools import get_tools_prompt
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
api_key = os.getenv("LLM_API_KEY")
|
||||
if api_key:
|
||||
litellm.api_key = api_key
|
||||
litellm.drop_params = True
|
||||
litellm.modify_params = True
|
||||
|
||||
api_base = (
|
||||
_LLM_API_KEY = os.getenv("LLM_API_KEY")
|
||||
_LLM_API_BASE = (
|
||||
os.getenv("LLM_API_BASE")
|
||||
or os.getenv("OPENAI_API_BASE")
|
||||
or os.getenv("LITELLM_BASE_URL")
|
||||
or os.getenv("OLLAMA_API_BASE")
|
||||
)
|
||||
if api_base:
|
||||
litellm.api_base = api_base
|
||||
|
||||
|
||||
class LLMRequestFailedError(Exception):
|
||||
@@ -160,9 +158,10 @@ class LLM:
|
||||
)
|
||||
|
||||
try:
|
||||
prompt_module_content = load_prompt_modules(
|
||||
self.config.prompt_modules or [], self.jinja_env
|
||||
)
|
||||
modules_to_load = list(self.config.prompt_modules or [])
|
||||
modules_to_load.append(f"scan_modes/{self.config.scan_mode}")
|
||||
|
||||
prompt_module_content = load_prompt_modules(modules_to_load, self.jinja_env)
|
||||
|
||||
def get_module(name: str) -> str:
|
||||
return prompt_module_content.get(name, "")
|
||||
@@ -390,16 +389,71 @@ class LLM:
|
||||
|
||||
return model_matches(self.config.model_name, REASONING_EFFORT_PATTERNS)
|
||||
|
||||
def _model_supports_vision(self) -> bool:
    """Return True when litellm reports vision support for the configured model.

    Any lookup failure (unknown model, litellm error) is treated as
    "no vision support" rather than propagated.
    """
    model = self.config.model_name
    if not model:
        return False
    try:
        return bool(supports_vision(model=model))
    except Exception:  # noqa: BLE001
        # Unknown models or litellm internals may raise; default to no vision.
        return False
|
||||
|
||||
def _filter_images_from_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
|
||||
filtered_messages = []
|
||||
for msg in messages:
|
||||
content = msg.get("content")
|
||||
updated_msg = msg
|
||||
if isinstance(content, list):
|
||||
filtered_content = []
|
||||
for item in content:
|
||||
if isinstance(item, dict):
|
||||
if item.get("type") == "image_url":
|
||||
filtered_content.append(
|
||||
{
|
||||
"type": "text",
|
||||
"text": "[Screenshot removed - model does not support "
|
||||
"vision. Use view_source or execute_js instead.]",
|
||||
}
|
||||
)
|
||||
else:
|
||||
filtered_content.append(item)
|
||||
else:
|
||||
filtered_content.append(item)
|
||||
if filtered_content:
|
||||
text_parts = [
|
||||
item.get("text", "") if isinstance(item, dict) else str(item)
|
||||
for item in filtered_content
|
||||
]
|
||||
all_text = all(
|
||||
isinstance(item, dict) and item.get("type") == "text"
|
||||
for item in filtered_content
|
||||
)
|
||||
if all_text:
|
||||
updated_msg = {**msg, "content": "\n".join(text_parts)}
|
||||
else:
|
||||
updated_msg = {**msg, "content": filtered_content}
|
||||
else:
|
||||
updated_msg = {**msg, "content": ""}
|
||||
filtered_messages.append(updated_msg)
|
||||
return filtered_messages
|
||||
|
||||
async def _make_request(
|
||||
self,
|
||||
messages: list[dict[str, Any]],
|
||||
) -> ModelResponse:
|
||||
if not self._model_supports_vision():
|
||||
messages = self._filter_images_from_messages(messages)
|
||||
|
||||
completion_args: dict[str, Any] = {
|
||||
"model": self.config.model_name,
|
||||
"messages": messages,
|
||||
"timeout": self.config.timeout,
|
||||
}
|
||||
|
||||
if _LLM_API_KEY:
|
||||
completion_args["api_key"] = _LLM_API_KEY
|
||||
if _LLM_API_BASE:
|
||||
completion_args["api_base"] = _LLM_API_BASE
|
||||
|
||||
if self._should_include_stop_param():
|
||||
completion_args["stop"] = ["</function>"]
|
||||
|
||||
|
||||
@@ -27,7 +27,7 @@ def should_retry_exception(exception: Exception) -> bool:
|
||||
|
||||
|
||||
class LLMRequestQueue:
|
||||
def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 5.0):
|
||||
def __init__(self, max_concurrent: int = 1, delay_between_requests: float = 4.0):
|
||||
rate_limit_delay = os.getenv("LLM_RATE_LIMIT_DELAY")
|
||||
if rate_limit_delay:
|
||||
delay_between_requests = float(rate_limit_delay)
|
||||
@@ -61,8 +61,8 @@ class LLMRequestQueue:
|
||||
self._semaphore.release()
|
||||
|
||||
@retry( # type: ignore[misc]
|
||||
stop=stop_after_attempt(7),
|
||||
wait=wait_exponential(multiplier=6, min=12, max=150),
|
||||
stop=stop_after_attempt(3),
|
||||
wait=wait_exponential(multiplier=8, min=8, max=64),
|
||||
retry=retry_if_exception(should_retry_exception),
|
||||
reraise=True,
|
||||
)
|
||||
|
||||
145
strix/prompts/scan_modes/deep.jinja
Normal file
145
strix/prompts/scan_modes/deep.jinja
Normal file
@@ -0,0 +1,145 @@
|
||||
<scan_mode>
|
||||
DEEP SCAN MODE - Exhaustive Security Assessment
|
||||
|
||||
This mode is for thorough security reviews where finding vulnerabilities is critical.
|
||||
|
||||
PHASE 1: EXHAUSTIVE RECONNAISSANCE AND MAPPING
|
||||
Spend significant effort understanding the target before exploitation.
|
||||
|
||||
For whitebox (source code available):
|
||||
- Map EVERY file, module, and code path in the repository
|
||||
- Trace all entry points from HTTP handlers to database queries
|
||||
- Identify all authentication mechanisms and their implementations
|
||||
- Map all authorization checks and understand the access control model
|
||||
- Identify all external service integrations and API calls
|
||||
- Analyze all configuration files for secrets and misconfigurations
|
||||
- Review all database schemas and understand data relationships
|
||||
- Map all background jobs, cron tasks, and async processing
|
||||
- Identify all serialization/deserialization points
|
||||
- Review all file handling operations (upload, download, processing)
|
||||
- Understand the deployment model and infrastructure assumptions
|
||||
- Check all dependency versions against known CVE databases
|
||||
|
||||
For blackbox (no source code):
|
||||
- Exhaustive subdomain enumeration using multiple sources and tools
|
||||
- Full port scanning to identify all services
|
||||
- Complete content discovery with multiple wordlists
|
||||
- Technology fingerprinting on all discovered assets
|
||||
- API endpoint discovery through documentation, JavaScript analysis, and fuzzing
|
||||
- Identify all parameters including hidden and rarely-used ones
|
||||
- Map all user roles by testing with different account types
|
||||
- Understand rate limiting, WAF rules, and security controls in place
|
||||
- Document the complete application architecture as understood from outside
|
||||
|
||||
EXECUTION STRATEGY - HIERARCHICAL AGENT SWARM:
|
||||
After Phase 1 (Recon & Mapping) is complete:
|
||||
1. Divide the application into major components/parts (e.g., Auth System, Payment Gateway, User Profile, Admin Panel)
|
||||
2. Spawn a specialized subagent for EACH major component
|
||||
3. Each component agent must then:
|
||||
- Further subdivide its scope into subparts (e.g., Login Form, Registration API, Password Reset)
|
||||
- Spawn sub-subagents for each distinct subpart
|
||||
4. At the lowest level (specific functionality), spawn specialized agents for EACH potential vulnerability type:
|
||||
- "Auth System" → "Login Form" → "SQLi Agent", "XSS Agent", "Auth Bypass Agent"
|
||||
- This creates a massive parallel swarm covering every angle
|
||||
- Do NOT overload a single agent with multiple vulnerability types
|
||||
- Scale horizontally to maximum capacity
|
||||
|
||||
PHASE 2: DEEP BUSINESS LOGIC ANALYSIS
|
||||
Understand the application deeply enough to find logic flaws:
|
||||
- CREATE A FULL STORYBOARD of all user flows and state transitions
|
||||
- Document every step of the business logic in a structured flow diagram
|
||||
- Use the application extensively as every type of user to map the full lifecycle of data
|
||||
- Document all state machines and workflows (e.g. Order Created -> Paid -> Shipped)
|
||||
- Identify trust boundaries between components
|
||||
- Map all integrations with third-party services
|
||||
- Understand what invariants the application tries to maintain
|
||||
- Identify all points where roles, privileges, or sensitive data changes hands
|
||||
- Look for implicit assumptions in the business logic
|
||||
- Consider multi-step attacks that abuse normal functionality
|
||||
|
||||
PHASE 3: COMPREHENSIVE ATTACK SURFACE TESTING
|
||||
Test EVERY input vector with EVERY applicable technique.
|
||||
|
||||
Input Handling - Test all parameters, headers, cookies with:
|
||||
- Multiple injection payloads (SQL, NoSQL, LDAP, XPath, Command, Template)
|
||||
- Various encodings and bypass techniques (double encoding, unicode, null bytes)
|
||||
- Boundary conditions and type confusion
|
||||
- Large payloads and buffer-related issues
|
||||
|
||||
Authentication and Session:
|
||||
- Exhaustive brute force protection testing
|
||||
- Session fixation, hijacking, and prediction attacks
|
||||
- JWT/token manipulation if applicable
|
||||
- OAuth flow abuse scenarios
|
||||
- Password reset flow vulnerabilities (token leakage, reuse, timing)
|
||||
- Multi-factor authentication bypass techniques
|
||||
- Account enumeration through all possible channels
|
||||
|
||||
Access Control:
|
||||
- Test EVERY endpoint for horizontal and vertical access control
|
||||
- Parameter tampering on all object references
|
||||
- Forced browsing to all discovered resources
|
||||
- HTTP method tampering
|
||||
- Test access control after session changes (logout, role change)
|
||||
|
||||
File Operations:
|
||||
- Exhaustive file upload bypass testing (extension, content-type, magic bytes)
|
||||
- Path traversal on all file parameters
|
||||
- Server-side request forgery through file inclusion
|
||||
- XXE through all XML parsing points
|
||||
|
||||
Business Logic:
|
||||
- Race conditions on all state-changing operations
|
||||
- Workflow bypass attempts on every multi-step process
|
||||
- Price/quantity manipulation in all transactions
|
||||
- Parallel execution attacks
|
||||
- Time-of-check to time-of-use vulnerabilities
|
||||
|
||||
Advanced Attacks:
|
||||
- HTTP request smuggling if multiple proxies/servers
|
||||
- Cache poisoning and cache deception
|
||||
- Subdomain takeover on all subdomains
|
||||
- Prototype pollution in JavaScript applications
|
||||
- CORS misconfiguration exploitation
|
||||
- WebSocket security testing
|
||||
- GraphQL specific attacks if applicable
|
||||
|
||||
PHASE 4: VULNERABILITY CHAINING
|
||||
Don't just find individual bugs - chain them:
|
||||
- Combine information disclosure with access control bypass
|
||||
- Chain SSRF to access internal services
|
||||
- Use low-severity findings to enable high-impact attacks
|
||||
- Look for multi-step attack paths that automated tools miss
|
||||
- Consider attacks that span multiple application components
|
||||
|
||||
CHAINING PRINCIPLES (MAX IMPACT):
|
||||
- Treat every finding as a pivot: ask "What does this unlock next?" until you reach maximum privilege / maximum data exposure / maximum control
|
||||
- Prefer end-to-end exploit paths over isolated bugs: initial foothold → pivot → privilege gain → sensitive action/data
|
||||
- Cross boundaries deliberately: user → admin, external → internal, unauthenticated → authenticated, read → write, single-tenant → cross-tenant
|
||||
- Validate chains by executing the full sequence using the available tools (proxy + browser for workflows, python for automation, terminal for supporting commands)
|
||||
- When a component agent finds a potential pivot, it must message/spawn the next focused agent to continue the chain in the next component/subpart
|
||||
|
||||
PHASE 5: PERSISTENT TESTING
|
||||
If initial attempts fail, don't give up:
|
||||
- Research specific technologies for known bypasses
|
||||
- Try alternative exploitation techniques
|
||||
- Look for edge cases and unusual functionality
|
||||
- Test with different client contexts
|
||||
- Revisit previously tested areas with new information
|
||||
- Consider timing-based and blind exploitation techniques
|
||||
|
||||
PHASE 6: THOROUGH REPORTING
|
||||
- Document EVERY confirmed vulnerability with full details
|
||||
- Include all severity levels - even low findings may enable chains
|
||||
- Provide complete reproduction steps and PoC
|
||||
- Document remediation recommendations
|
||||
- Note areas requiring additional review beyond current scope
|
||||
|
||||
MINDSET:
|
||||
- Relentless - this is about finding what others miss
|
||||
- Creative - think of unconventional attack vectors
|
||||
- Patient - real vulnerabilities often require deep investigation
|
||||
- Thorough - test every parameter, every endpoint, every edge case
|
||||
- Persistent - if one approach fails, try ten more
|
||||
- Holistic - understand how components interact to find systemic issues
|
||||
</scan_mode>
|
||||
63
strix/prompts/scan_modes/quick.jinja
Normal file
63
strix/prompts/scan_modes/quick.jinja
Normal file
@@ -0,0 +1,63 @@
|
||||
<scan_mode>
|
||||
QUICK SCAN MODE - Rapid Security Assessment
|
||||
|
||||
This mode is optimized for fast feedback. Focus on HIGH-IMPACT vulnerabilities with minimal overhead.
|
||||
|
||||
PHASE 1: RAPID ORIENTATION
|
||||
- If source code is available: Focus primarily on RECENT CHANGES (git diff, new commits, modified files)
|
||||
- Identify the most critical entry points: authentication endpoints, payment flows, admin interfaces, API endpoints handling sensitive data
|
||||
- Quickly understand the tech stack and frameworks in use
|
||||
- Skip exhaustive reconnaissance - use what's immediately visible
|
||||
|
||||
PHASE 2: TARGETED ATTACK SURFACE
|
||||
For whitebox (source code available):
|
||||
- Prioritize files changed in recent commits/PRs - these are most likely to contain fresh bugs
|
||||
- Look for security-sensitive patterns in diffs: auth checks, input handling, database queries, file operations
|
||||
- Trace user-controllable input in changed code paths
|
||||
- Check if security controls were modified or bypassed
|
||||
|
||||
For blackbox (no source code):
|
||||
- Focus on authentication and session management
|
||||
- Test the most critical user flows only
|
||||
- Check for obvious misconfigurations and exposed endpoints
|
||||
- Skip deep content discovery - test what's immediately accessible
|
||||
|
||||
PHASE 3: HIGH-IMPACT VULNERABILITY FOCUS
|
||||
Prioritize in this order:
|
||||
1. Authentication bypass and broken access control
|
||||
2. Remote code execution vectors
|
||||
3. SQL injection in critical endpoints
|
||||
4. Insecure direct object references (IDOR) in sensitive resources
|
||||
5. Server-side request forgery (SSRF)
|
||||
6. Hardcoded credentials or secrets in code
|
||||
|
||||
Skip lower-priority items:
|
||||
- Extensive subdomain enumeration
|
||||
- Full directory bruteforcing
|
||||
- Information disclosure that doesn't lead to exploitation
|
||||
- Theoretical vulnerabilities without PoC
|
||||
|
||||
PHASE 4: VALIDATION AND REPORTING
|
||||
- Validate only critical/high severity findings with minimal PoC
|
||||
- Report findings as you discover them - don't wait for completion
|
||||
- Focus on exploitability and business impact
|
||||
|
||||
QUICK CHAINING RULE:
|
||||
- If you find ANY strong primitive (auth weakness, access control gap, injection point, internal reachability), immediately attempt a single high-impact pivot to demonstrate real impact
|
||||
- Do not stop at a low-context “maybe”; turn it into a concrete exploit sequence (even if short) that reaches privileged action or sensitive data
|
||||
|
||||
OPERATIONAL GUIDELINES:
|
||||
- Use the browser tool for quick manual testing of critical flows
|
||||
- Use terminal for targeted scans with fast presets (e.g., nuclei with critical/high templates only)
|
||||
- Use proxy to inspect traffic on key endpoints
|
||||
- Skip extensive fuzzing - use targeted payloads only
|
||||
- Create subagents only for parallel high-priority tasks
|
||||
- If whitebox: file_edit tool to review specific suspicious code sections
|
||||
- Use notes tool to track critical findings only
|
||||
|
||||
MINDSET:
|
||||
- Think like a time-boxed bug bounty hunter going for quick wins
|
||||
- Prioritize breadth over depth on critical areas
|
||||
- If something looks exploitable, validate quickly and move on
|
||||
- Don't get stuck - if an attack vector isn't yielding results quickly, pivot
|
||||
</scan_mode>
|
||||
91
strix/prompts/scan_modes/standard.jinja
Normal file
91
strix/prompts/scan_modes/standard.jinja
Normal file
@@ -0,0 +1,91 @@
|
||||
<scan_mode>
|
||||
STANDARD SCAN MODE - Balanced Security Assessment
|
||||
|
||||
This mode provides thorough coverage with a structured methodology. Balance depth with efficiency.
|
||||
|
||||
PHASE 1: RECONNAISSANCE AND MAPPING
|
||||
Understanding the target is critical before exploitation. Never skip this phase.
|
||||
|
||||
For whitebox (source code available):
|
||||
- Map the entire codebase structure: directories, modules, entry points
|
||||
- Identify the application architecture (MVC, microservices, monolith)
|
||||
- Understand the routing: how URLs map to handlers/controllers
|
||||
- Identify all user input vectors: forms, APIs, file uploads, headers, cookies
|
||||
- Map authentication and authorization flows
|
||||
- Identify database interactions and ORM usage
|
||||
- Review dependency manifests for known vulnerable packages
|
||||
- Understand the data model and sensitive data locations
|
||||
|
||||
For blackbox (no source code):
|
||||
- Crawl the application thoroughly using browser tool - interact with every feature
|
||||
- Enumerate all endpoints, parameters, and functionality
|
||||
- Identify the technology stack through fingerprinting
|
||||
- Map user roles and access levels
|
||||
- Understand the business logic by using the application as intended
|
||||
- Document all forms, APIs, and data entry points
|
||||
- Use proxy tool to capture and analyze all traffic during exploration
|
||||
|
||||
PHASE 2: BUSINESS LOGIC UNDERSTANDING
|
||||
Before testing for vulnerabilities, understand what the application DOES:
|
||||
- What are the critical business flows? (payments, user registration, data access)
|
||||
- What actions should be restricted to specific roles?
|
||||
- What data should users NOT be able to access?
|
||||
- What state transitions exist? (order pending → paid → shipped)
|
||||
- Where does money, sensitive data, or privilege flow?
|
||||
|
||||
PHASE 3: SYSTEMATIC VULNERABILITY ASSESSMENT
|
||||
Test each attack surface methodically. Create focused subagents for different areas.
|
||||
|
||||
Entry Point Analysis:
|
||||
- Test all input fields for injection vulnerabilities
|
||||
- Check all API endpoints for authentication and authorization
|
||||
- Verify all file upload functionality for bypass
|
||||
- Test all search and filter functionality
|
||||
- Check redirect parameters and URL handling
|
||||
|
||||
Authentication and Session:
|
||||
- Test login for brute force protection
|
||||
- Check session token entropy and handling
|
||||
- Test password reset flows for weaknesses
|
||||
- Verify logout invalidates sessions
|
||||
- Test for authentication bypass techniques
|
||||
|
||||
Access Control:
|
||||
- For every privileged action, test as unprivileged user
|
||||
- Test horizontal access control (user A accessing user B's data)
|
||||
- Test vertical access control (user escalating to admin)
|
||||
- Check API endpoints mirror UI access controls
|
||||
- Test direct object references with different user contexts
|
||||
|
||||
Business Logic:
|
||||
- Attempt to skip steps in multi-step processes
|
||||
- Test for race conditions in critical operations
|
||||
- Try negative values, zero values, boundary conditions
|
||||
- Attempt to replay transactions
|
||||
- Test for price manipulation in e-commerce flows
|
||||
|
||||
PHASE 4: EXPLOITATION AND VALIDATION
|
||||
- Every finding must have a working proof-of-concept
|
||||
- Demonstrate actual impact, not theoretical risk
|
||||
- Chain vulnerabilities when possible to show maximum impact
|
||||
- Document the full attack path from initial access to impact
|
||||
- Use python tool for complex exploit development
|
||||
|
||||
CHAINING & MAX IMPACT MINDSET:
|
||||
- Always ask: "If I can do X, what does that enable me to do next?" Keep pivoting until you reach maximum privilege or maximum sensitive data access
|
||||
- Prefer complete end-to-end paths (entry point → pivot → privileged action/data) over isolated bug reports
|
||||
- Use the application as a real user would: exploit must survive the actual workflow and state transitions
|
||||
- When you discover a useful pivot (info leak, weak boundary, partial access), immediately pursue the next step rather than stopping at the first win
|
||||
|
||||
PHASE 5: COMPREHENSIVE REPORTING
|
||||
- Report all confirmed vulnerabilities with clear reproduction steps
|
||||
- Include severity based on actual exploitability and business impact
|
||||
- Provide remediation recommendations
|
||||
- Document any areas that need further investigation
|
||||
|
||||
MINDSET:
|
||||
- Methodical and systematic - cover the full attack surface
|
||||
- Document as you go - findings and areas tested
|
||||
- Validate everything - no assumptions about exploitability
|
||||
- Think about business impact, not just technical severity
|
||||
</scan_mode>
|
||||
@@ -203,7 +203,7 @@ class DockerRuntime(AbstractRuntime):
|
||||
all=True, filters={"label": f"strix-scan-id={scan_id}"}
|
||||
)
|
||||
if containers:
|
||||
container = cast("Container", containers[0])
|
||||
container = containers[0]
|
||||
if container.status != "running":
|
||||
container.start()
|
||||
time.sleep(2)
|
||||
|
||||
@@ -24,9 +24,13 @@ SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
|
||||
|
||||
HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))
|
||||
|
||||
DISABLE_BROWSER = os.getenv("STRIX_DISABLE_BROWSER", "false").lower() == "true"
|
||||
|
||||
if not SANDBOX_MODE:
|
||||
from .agents_graph import * # noqa: F403
|
||||
from .browser import * # noqa: F403
|
||||
|
||||
if not DISABLE_BROWSER:
|
||||
from .browser import * # noqa: F403
|
||||
from .file_edit import * # noqa: F403
|
||||
from .finish import * # noqa: F403
|
||||
from .notes import * # noqa: F403
|
||||
@@ -35,13 +39,14 @@ if not SANDBOX_MODE:
|
||||
from .reporting import * # noqa: F403
|
||||
from .terminal import * # noqa: F403
|
||||
from .thinking import * # noqa: F403
|
||||
from .todo import * # noqa: F403
|
||||
|
||||
if HAS_PERPLEXITY_API:
|
||||
from .web_search import * # noqa: F403
|
||||
else:
|
||||
from .browser import * # noqa: F403
|
||||
if not DISABLE_BROWSER:
|
||||
from .browser import * # noqa: F403
|
||||
from .file_edit import * # noqa: F403
|
||||
from .notes import * # noqa: F403
|
||||
from .proxy import * # noqa: F403
|
||||
from .python import * # noqa: F403
|
||||
from .terminal import * # noqa: F403
|
||||
|
||||
@@ -233,14 +233,14 @@ def create_agent(
|
||||
parent_agent = _agent_instances.get(parent_id)
|
||||
|
||||
timeout = None
|
||||
if (
|
||||
parent_agent
|
||||
and hasattr(parent_agent, "llm_config")
|
||||
and hasattr(parent_agent.llm_config, "timeout")
|
||||
):
|
||||
timeout = parent_agent.llm_config.timeout
|
||||
scan_mode = "deep"
|
||||
if parent_agent and hasattr(parent_agent, "llm_config"):
|
||||
if hasattr(parent_agent.llm_config, "timeout"):
|
||||
timeout = parent_agent.llm_config.timeout
|
||||
if hasattr(parent_agent.llm_config, "scan_mode"):
|
||||
scan_mode = parent_agent.llm_config.scan_mode
|
||||
|
||||
llm_config = LLMConfig(prompt_modules=module_list, timeout=timeout)
|
||||
llm_config = LLMConfig(prompt_modules=module_list, timeout=timeout, scan_mode=scan_mode)
|
||||
|
||||
agent_config = {
|
||||
"llm_config": llm_config,
|
||||
|
||||
@@ -1,8 +1,10 @@
|
||||
from typing import Any, Literal, NoReturn
|
||||
from typing import TYPE_CHECKING, Any, Literal, NoReturn
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
from .tab_manager import BrowserTabManager, get_browser_tab_manager
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .tab_manager import BrowserTabManager
|
||||
|
||||
|
||||
BrowserAction = Literal[
|
||||
@@ -71,7 +73,7 @@ def _validate_file_path(action_name: str, file_path: str | None) -> None:
|
||||
|
||||
|
||||
def _handle_navigation_actions(
|
||||
manager: BrowserTabManager,
|
||||
manager: "BrowserTabManager",
|
||||
action: str,
|
||||
url: str | None = None,
|
||||
tab_id: str | None = None,
|
||||
@@ -90,7 +92,7 @@ def _handle_navigation_actions(
|
||||
|
||||
|
||||
def _handle_interaction_actions(
|
||||
manager: BrowserTabManager,
|
||||
manager: "BrowserTabManager",
|
||||
action: str,
|
||||
coordinate: str | None = None,
|
||||
text: str | None = None,
|
||||
@@ -128,7 +130,7 @@ def _raise_unknown_action(action: str) -> NoReturn:
|
||||
|
||||
|
||||
def _handle_tab_actions(
|
||||
manager: BrowserTabManager,
|
||||
manager: "BrowserTabManager",
|
||||
action: str,
|
||||
url: str | None = None,
|
||||
tab_id: str | None = None,
|
||||
@@ -149,7 +151,7 @@ def _handle_tab_actions(
|
||||
|
||||
|
||||
def _handle_utility_actions(
|
||||
manager: BrowserTabManager,
|
||||
manager: "BrowserTabManager",
|
||||
action: str,
|
||||
duration: float | None = None,
|
||||
js_code: str | None = None,
|
||||
@@ -191,6 +193,8 @@ def browser_action(
|
||||
file_path: str | None = None,
|
||||
clear: bool = False,
|
||||
) -> dict[str, Any]:
|
||||
from .tab_manager import get_browser_tab_manager
|
||||
|
||||
manager = get_browser_tab_manager()
|
||||
|
||||
try:
|
||||
|
||||
@@ -17,6 +17,10 @@ from .registry import (
|
||||
)
|
||||
|
||||
|
||||
SANDBOX_EXECUTION_TIMEOUT = float(os.getenv("STRIX_SANDBOX_EXECUTION_TIMEOUT", "500"))
|
||||
SANDBOX_CONNECT_TIMEOUT = float(os.getenv("STRIX_SANDBOX_CONNECT_TIMEOUT", "10"))
|
||||
|
||||
|
||||
async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
|
||||
execute_in_sandbox = should_execute_in_sandbox(tool_name)
|
||||
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
|
||||
@@ -62,10 +66,15 @@ async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: A
|
||||
"Content-Type": "application/json",
|
||||
}
|
||||
|
||||
timeout = httpx.Timeout(
|
||||
timeout=SANDBOX_EXECUTION_TIMEOUT,
|
||||
connect=SANDBOX_CONNECT_TIMEOUT,
|
||||
)
|
||||
|
||||
async with httpx.AsyncClient(trust_env=False) as client:
|
||||
try:
|
||||
response = await client.post(
|
||||
request_url, json=request_data, headers=headers, timeout=None
|
||||
request_url, json=request_data, headers=headers, timeout=timeout
|
||||
)
|
||||
response.raise_for_status()
|
||||
response_data = response.json()
|
||||
|
||||
@@ -3,9 +3,6 @@ import re
|
||||
from pathlib import Path
|
||||
from typing import Any, cast
|
||||
|
||||
from openhands_aci import file_editor
|
||||
from openhands_aci.utils.shell import run_shell_cmd
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
|
||||
@@ -33,6 +30,8 @@ def str_replace_editor(
|
||||
new_str: str | None = None,
|
||||
insert_line: int | None = None,
|
||||
) -> dict[str, Any]:
|
||||
from openhands_aci import file_editor
|
||||
|
||||
try:
|
||||
path_obj = Path(path)
|
||||
if not path_obj.is_absolute():
|
||||
@@ -64,6 +63,8 @@ def list_files(
|
||||
path: str,
|
||||
recursive: bool = False,
|
||||
) -> dict[str, Any]:
|
||||
from openhands_aci.utils.shell import run_shell_cmd
|
||||
|
||||
try:
|
||||
path_obj = Path(path)
|
||||
if not path_obj.is_absolute():
|
||||
@@ -116,6 +117,8 @@ def search_files(
|
||||
regex: str,
|
||||
file_pattern: str = "*",
|
||||
) -> dict[str, Any]:
|
||||
from openhands_aci.utils.shell import run_shell_cmd
|
||||
|
||||
try:
|
||||
path_obj = Path(path)
|
||||
if not path_obj.is_absolute():
|
||||
|
||||
@@ -11,7 +11,6 @@ _notes_storage: dict[str, dict[str, Any]] = {}
|
||||
def _filter_notes(
|
||||
category: str | None = None,
|
||||
tags: list[str] | None = None,
|
||||
priority: str | None = None,
|
||||
search_query: str | None = None,
|
||||
) -> list[dict[str, Any]]:
|
||||
filtered_notes = []
|
||||
@@ -20,9 +19,6 @@ def _filter_notes(
|
||||
if category and note.get("category") != category:
|
||||
continue
|
||||
|
||||
if priority and note.get("priority") != priority:
|
||||
continue
|
||||
|
||||
if tags:
|
||||
note_tags = note.get("tags", [])
|
||||
if not any(tag in note_tags for tag in tags):
|
||||
@@ -43,13 +39,12 @@ def _filter_notes(
|
||||
return filtered_notes
|
||||
|
||||
|
||||
@register_tool
|
||||
@register_tool(sandbox_execution=False)
|
||||
def create_note(
|
||||
title: str,
|
||||
content: str,
|
||||
category: str = "general",
|
||||
tags: list[str] | None = None,
|
||||
priority: str = "normal",
|
||||
) -> dict[str, Any]:
|
||||
try:
|
||||
if not title or not title.strip():
|
||||
@@ -58,7 +53,7 @@ def create_note(
|
||||
if not content or not content.strip():
|
||||
return {"success": False, "error": "Content cannot be empty", "note_id": None}
|
||||
|
||||
valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
|
||||
valid_categories = ["general", "findings", "methodology", "questions", "plan"]
|
||||
if category not in valid_categories:
|
||||
return {
|
||||
"success": False,
|
||||
@@ -66,14 +61,6 @@ def create_note(
|
||||
"note_id": None,
|
||||
}
|
||||
|
||||
valid_priorities = ["low", "normal", "high", "urgent"]
|
||||
if priority not in valid_priorities:
|
||||
return {
|
||||
"success": False,
|
||||
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
|
||||
"note_id": None,
|
||||
}
|
||||
|
||||
note_id = str(uuid.uuid4())[:5]
|
||||
timestamp = datetime.now(UTC).isoformat()
|
||||
|
||||
@@ -82,7 +69,6 @@ def create_note(
|
||||
"content": content.strip(),
|
||||
"category": category,
|
||||
"tags": tags or [],
|
||||
"priority": priority,
|
||||
"created_at": timestamp,
|
||||
"updated_at": timestamp,
|
||||
}
|
||||
@@ -99,17 +85,14 @@ def create_note(
|
||||
}
|
||||
|
||||
|
||||
@register_tool
|
||||
@register_tool(sandbox_execution=False)
|
||||
def list_notes(
|
||||
category: str | None = None,
|
||||
tags: list[str] | None = None,
|
||||
priority: str | None = None,
|
||||
search: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
try:
|
||||
filtered_notes = _filter_notes(
|
||||
category=category, tags=tags, priority=priority, search_query=search
|
||||
)
|
||||
filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
@@ -126,13 +109,12 @@ def list_notes(
|
||||
}
|
||||
|
||||
|
||||
@register_tool
|
||||
@register_tool(sandbox_execution=False)
|
||||
def update_note(
|
||||
note_id: str,
|
||||
title: str | None = None,
|
||||
content: str | None = None,
|
||||
tags: list[str] | None = None,
|
||||
priority: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
try:
|
||||
if note_id not in _notes_storage:
|
||||
@@ -153,15 +135,6 @@ def update_note(
|
||||
if tags is not None:
|
||||
note["tags"] = tags
|
||||
|
||||
if priority is not None:
|
||||
valid_priorities = ["low", "normal", "high", "urgent"]
|
||||
if priority not in valid_priorities:
|
||||
return {
|
||||
"success": False,
|
||||
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
|
||||
}
|
||||
note["priority"] = priority
|
||||
|
||||
note["updated_at"] = datetime.now(UTC).isoformat()
|
||||
|
||||
return {
|
||||
@@ -173,7 +146,7 @@ def update_note(
|
||||
return {"success": False, "error": f"Failed to update note: {e}"}
|
||||
|
||||
|
||||
@register_tool
|
||||
@register_tool(sandbox_execution=False)
|
||||
def delete_note(note_id: str) -> dict[str, Any]:
|
||||
try:
|
||||
if note_id not in _notes_storage:
|
||||
|
||||
@@ -1,10 +1,9 @@
|
||||
<tools>
|
||||
<tool name="create_note">
|
||||
<description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
|
||||
the scan.</description>
|
||||
<details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
|
||||
rather than formal vulnerability reports or detailed findings. This is your personal notepad
|
||||
for keeping track of tasks, ideas, and things to remember or follow up on.</details>
|
||||
<description>Create a personal note for observations, findings, and research during the scan.</description>
|
||||
<details>Use this tool for documenting discoveries, observations, methodology notes, and questions.
|
||||
This is your personal notepad for recording information you want to remember or reference later.
|
||||
For tracking actionable tasks, use the todo tool instead.</details>
|
||||
<parameters>
|
||||
<parameter name="title" type="string" required="true">
|
||||
<description>Title of the note</description>
|
||||
@@ -13,49 +12,41 @@
|
||||
<description>Content of the note</description>
|
||||
</parameter>
|
||||
<parameter name="category" type="string" required="false">
|
||||
<description>Category to organize the note (default: "general", "findings", "methodology", "todo", "questions", "plan")</description>
|
||||
<description>Category to organize the note (default: "general", "findings", "methodology", "questions", "plan")</description>
|
||||
</parameter>
|
||||
<parameter name="tags" type="string" required="false">
|
||||
<description>Tags for categorization</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>Priority level of the note ("low", "normal", "high", "urgent")</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Create a TODO reminder
|
||||
<function=create_note>
|
||||
<parameter=title>TODO: Check SSL Certificate Details</parameter>
|
||||
<parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
|
||||
on the HTTPS service discovered on port 443. Also check for certificate
|
||||
transparency logs.</parameter>
|
||||
<parameter=category>todo</parameter>
|
||||
<parameter=tags>["ssl", "certificate", "followup"]</parameter>
|
||||
<parameter=priority>normal</parameter>
|
||||
</function>
|
||||
|
||||
# Planning note
|
||||
<function=create_note>
|
||||
<parameter=title>Scan Strategy Planning</parameter>
|
||||
<parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
|
||||
web apps for OWASP Top 10 3) Check database services for default creds
|
||||
4) Review any custom applications for business logic flaws</parameter>
|
||||
<parameter=category>plan</parameter>
|
||||
<parameter=tags>["planning", "strategy", "next_steps"]</parameter>
|
||||
</function>
|
||||
|
||||
# Side note for later investigation
|
||||
# Document an interesting finding
|
||||
<function=create_note>
|
||||
<parameter=title>Interesting Directory Found</parameter>
|
||||
<parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
|
||||
for now but worth checking if time permits. Directory listing seems
|
||||
disabled.</parameter>
|
||||
<parameter=content>Found /backup/ directory that might contain sensitive files. Directory listing
|
||||
seems disabled but worth investigating further.</parameter>
|
||||
<parameter=category>findings</parameter>
|
||||
<parameter=tags>["directory", "backup", "low_priority"]</parameter>
|
||||
<parameter=priority>low</parameter>
|
||||
<parameter=tags>["directory", "backup"]</parameter>
|
||||
</function>
|
||||
|
||||
# Methodology note
|
||||
<function=create_note>
|
||||
<parameter=title>Authentication Flow Analysis</parameter>
|
||||
<parameter=content>The application uses JWT tokens stored in localStorage. Token expiration is
|
||||
set to 24 hours. Observed that refresh token rotation is not implemented.</parameter>
|
||||
<parameter=category>methodology</parameter>
|
||||
<parameter=tags>["auth", "jwt", "session"]</parameter>
|
||||
</function>
|
||||
|
||||
# Research question
|
||||
<function=create_note>
|
||||
<parameter=title>Custom Header Investigation</parameter>
|
||||
<parameter=content>The API returns a custom X-Request-ID header. Need to research if this
|
||||
could be used for user tracking or has any security implications.</parameter>
|
||||
<parameter=category>questions</parameter>
|
||||
<parameter=tags>["headers", "research"]</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
@@ -84,9 +75,6 @@
|
||||
<parameter name="tags" type="string" required="false">
|
||||
<description>Filter by tags (returns notes with any of these tags)</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>Filter by priority level</description>
|
||||
</parameter>
|
||||
<parameter name="search" type="string" required="false">
|
||||
<description>Search query to find in note titles and content</description>
|
||||
</parameter>
|
||||
@@ -100,11 +88,6 @@
|
||||
<parameter=category>findings</parameter>
|
||||
</function>
|
||||
|
||||
# List high priority items
|
||||
<function=list_notes>
|
||||
<parameter=priority>high</parameter>
|
||||
</function>
|
||||
|
||||
# Search for SQL injection related notes
|
||||
<function=list_notes>
|
||||
<parameter=search>SQL injection</parameter>
|
||||
@@ -132,9 +115,6 @@
|
||||
<parameter name="tags" type="string" required="false">
|
||||
<description>New tags for the note</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>New priority level</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - success: Whether the note was updated successfully</description>
|
||||
@@ -143,7 +123,6 @@
|
||||
<function=update_note>
|
||||
<parameter=note_id>note_123</parameter>
|
||||
<parameter=content>Updated content with new findings...</parameter>
|
||||
<parameter=priority>urgent</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
@@ -2,8 +2,6 @@ from typing import Any, Literal
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
|
||||
RequestPart = Literal["request", "response"]
|
||||
|
||||
@@ -27,6 +25,8 @@ def list_requests(
|
||||
sort_order: Literal["asc", "desc"] = "desc",
|
||||
scope_id: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
manager = get_proxy_manager()
|
||||
return manager.list_requests(
|
||||
httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
|
||||
@@ -41,6 +41,8 @@ def view_request(
|
||||
page: int = 1,
|
||||
page_size: int = 50,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
manager = get_proxy_manager()
|
||||
return manager.view_request(request_id, part, search_pattern, page, page_size)
|
||||
|
||||
@@ -53,6 +55,8 @@ def send_request(
|
||||
body: str = "",
|
||||
timeout: int = 30,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
if headers is None:
|
||||
headers = {}
|
||||
manager = get_proxy_manager()
|
||||
@@ -64,6 +68,8 @@ def repeat_request(
|
||||
request_id: str,
|
||||
modifications: dict[str, Any] | None = None,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
if modifications is None:
|
||||
modifications = {}
|
||||
manager = get_proxy_manager()
|
||||
@@ -78,6 +84,8 @@ def scope_rules(
|
||||
scope_id: str | None = None,
|
||||
scope_name: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
manager = get_proxy_manager()
|
||||
return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)
|
||||
|
||||
@@ -89,6 +97,8 @@ def list_sitemap(
|
||||
depth: Literal["DIRECT", "ALL"] = "DIRECT",
|
||||
page: int = 1,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
manager = get_proxy_manager()
|
||||
return manager.list_sitemap(scope_id, parent_id, depth, page)
|
||||
|
||||
@@ -97,5 +107,7 @@ def list_sitemap(
|
||||
def view_sitemap_entry(
|
||||
entry_id: str,
|
||||
) -> dict[str, Any]:
|
||||
from .proxy_manager import get_proxy_manager
|
||||
|
||||
manager = get_proxy_manager()
|
||||
return manager.view_sitemap_entry(entry_id)
|
||||
|
||||
@@ -2,8 +2,6 @@ from typing import Any, Literal
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
from .python_manager import get_python_session_manager
|
||||
|
||||
|
||||
PythonAction = Literal["new_session", "execute", "close", "list_sessions"]
|
||||
|
||||
@@ -15,6 +13,8 @@ def python_action(
|
||||
timeout: int = 30,
|
||||
session_id: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
from .python_manager import get_python_session_manager
|
||||
|
||||
def _validate_code(action_name: str, code: str | None) -> None:
|
||||
if not code:
|
||||
raise ValueError(f"code parameter is required for {action_name} action")
|
||||
|
||||
@@ -2,8 +2,6 @@ from typing import Any
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
from .terminal_manager import get_terminal_manager
|
||||
|
||||
|
||||
@register_tool
|
||||
def terminal_execute(
|
||||
@@ -13,6 +11,8 @@ def terminal_execute(
|
||||
terminal_id: str | None = None,
|
||||
no_enter: bool = False,
|
||||
) -> dict[str, Any]:
|
||||
from .terminal_manager import get_terminal_manager
|
||||
|
||||
manager = get_terminal_manager()
|
||||
|
||||
try:
|
||||
|
||||
18
strix/tools/todo/__init__.py
Normal file
18
strix/tools/todo/__init__.py
Normal file
@@ -0,0 +1,18 @@
|
||||
from .todo_actions import (
|
||||
create_todo,
|
||||
delete_todo,
|
||||
list_todos,
|
||||
mark_todo_done,
|
||||
mark_todo_pending,
|
||||
update_todo,
|
||||
)
|
||||
|
||||
|
||||
__all__ = [
|
||||
"create_todo",
|
||||
"delete_todo",
|
||||
"list_todos",
|
||||
"mark_todo_done",
|
||||
"mark_todo_pending",
|
||||
"update_todo",
|
||||
]
|
||||
568
strix/tools/todo/todo_actions.py
Normal file
568
strix/tools/todo/todo_actions.py
Normal file
@@ -0,0 +1,568 @@
|
||||
import json
|
||||
import uuid
|
||||
from datetime import UTC, datetime
|
||||
from typing import Any
|
||||
|
||||
from strix.tools.registry import register_tool
|
||||
|
||||
|
||||
VALID_PRIORITIES = ["low", "normal", "high", "critical"]
|
||||
VALID_STATUSES = ["pending", "in_progress", "done"]
|
||||
|
||||
_todos_storage: dict[str, dict[str, dict[str, Any]]] = {}
|
||||
|
||||
|
||||
def _get_agent_todos(agent_id: str) -> dict[str, dict[str, Any]]:
    """Return the per-agent todo mapping, creating an empty one on first use."""
    # setdefault inserts {} only when agent_id is absent, then returns the
    # stored mapping either way — same effect as the explicit membership check.
    return _todos_storage.setdefault(agent_id, {})
|
||||
|
||||
|
||||
def _normalize_priority(priority: str | None, default: str = "normal") -> str:
    """Lower-case *priority*, falling back to *default* (then "normal") when empty.

    Raises:
        ValueError: if the resulting value is not one of VALID_PRIORITIES.
    """
    value = (priority or default or "normal").lower()
    if value in VALID_PRIORITIES:
        return value
    raise ValueError(f"Invalid priority. Must be one of: {', '.join(VALID_PRIORITIES)}")
|
||||
|
||||
|
||||
def _sorted_todos(agent_id: str) -> list[dict[str, Any]]:
    """Return the agent's todos (each annotated with its id), canonically sorted.

    Ordering: status (done, in_progress, pending), then priority
    (critical..low), then creation timestamp. Unknown values sort last.
    """
    rank_by_priority = {"critical": 0, "high": 1, "normal": 2, "low": 3}
    rank_by_status = {"done": 0, "in_progress": 1, "pending": 2}

    def sort_key(entry: dict[str, Any]) -> tuple[int, int, str]:
        return (
            rank_by_status.get(entry.get("status", "pending"), 99),
            rank_by_priority.get(entry.get("priority", "normal"), 99),
            entry.get("created_at", ""),
        )

    # Copy each todo and embed its id so callers get self-contained entries.
    annotated = [
        {**todo, "todo_id": todo_id}
        for todo_id, todo in _get_agent_todos(agent_id).items()
    ]
    return sorted(annotated, key=sort_key)
|
||||
|
||||
|
||||
def _normalize_todo_ids(raw_ids: Any) -> list[str]:
|
||||
if raw_ids is None:
|
||||
return []
|
||||
|
||||
if isinstance(raw_ids, str):
|
||||
stripped = raw_ids.strip()
|
||||
if not stripped:
|
||||
return []
|
||||
try:
|
||||
data = json.loads(stripped)
|
||||
except json.JSONDecodeError:
|
||||
data = stripped.split(",") if "," in stripped else [stripped]
|
||||
if isinstance(data, list):
|
||||
return [str(item).strip() for item in data if str(item).strip()]
|
||||
return [str(data).strip()]
|
||||
|
||||
if isinstance(raw_ids, list):
|
||||
return [str(item).strip() for item in raw_ids if str(item).strip()]
|
||||
|
||||
return [str(raw_ids).strip()]
|
||||
|
||||
|
||||
def _normalize_bulk_updates(raw_updates: Any) -> list[dict[str, Any]]:
|
||||
if raw_updates is None:
|
||||
return []
|
||||
|
||||
data = raw_updates
|
||||
if isinstance(raw_updates, str):
|
||||
stripped = raw_updates.strip()
|
||||
if not stripped:
|
||||
return []
|
||||
try:
|
||||
data = json.loads(stripped)
|
||||
except json.JSONDecodeError as e:
|
||||
raise ValueError("Updates must be valid JSON") from e
|
||||
|
||||
if isinstance(data, dict):
|
||||
data = [data]
|
||||
|
||||
if not isinstance(data, list):
|
||||
raise TypeError("Updates must be a list of update objects")
|
||||
|
||||
normalized: list[dict[str, Any]] = []
|
||||
for item in data:
|
||||
if not isinstance(item, dict):
|
||||
raise TypeError("Each update must be an object with todo_id")
|
||||
|
||||
todo_id = item.get("todo_id") or item.get("id")
|
||||
if not todo_id:
|
||||
raise ValueError("Each update must include 'todo_id'")
|
||||
|
||||
normalized.append(
|
||||
{
|
||||
"todo_id": str(todo_id).strip(),
|
||||
"title": item.get("title"),
|
||||
"description": item.get("description"),
|
||||
"priority": item.get("priority"),
|
||||
"status": item.get("status"),
|
||||
}
|
||||
)
|
||||
|
||||
return normalized
|
||||
|
||||
|
||||
def _normalize_bulk_todos(raw_todos: Any) -> list[dict[str, Any]]:
|
||||
if raw_todos is None:
|
||||
return []
|
||||
|
||||
data = raw_todos
|
||||
if isinstance(raw_todos, str):
|
||||
stripped = raw_todos.strip()
|
||||
if not stripped:
|
||||
return []
|
||||
try:
|
||||
data = json.loads(stripped)
|
||||
except json.JSONDecodeError:
|
||||
entries = [line.strip(" -*\t") for line in stripped.splitlines() if line.strip(" -*\t")]
|
||||
return [{"title": entry} for entry in entries]
|
||||
|
||||
if isinstance(data, dict):
|
||||
data = [data]
|
||||
|
||||
if not isinstance(data, list):
|
||||
raise TypeError("Todos must be provided as a list, dict, or JSON string")
|
||||
|
||||
normalized: list[dict[str, Any]] = []
|
||||
for item in data:
|
||||
if isinstance(item, str):
|
||||
title = item.strip()
|
||||
if title:
|
||||
normalized.append({"title": title})
|
||||
continue
|
||||
|
||||
if not isinstance(item, dict):
|
||||
raise TypeError("Each todo entry must be a string or object with a title")
|
||||
|
||||
title = item.get("title", "")
|
||||
if not isinstance(title, str) or not title.strip():
|
||||
raise ValueError("Each todo entry must include a non-empty 'title'")
|
||||
|
||||
normalized.append(
|
||||
{
|
||||
"title": title.strip(),
|
||||
"description": (item.get("description") or "").strip() or None,
|
||||
"priority": item.get("priority"),
|
||||
}
|
||||
)
|
||||
|
||||
return normalized
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def create_todo(
    agent_state: Any,
    title: str | None = None,
    description: str | None = None,
    priority: str = "normal",
    todos: Any | None = None,
) -> dict[str, Any]:
    """Create one or more todos for the calling agent.

    Args:
        agent_state: State object carrying the calling agent's ``agent_id``.
        title: Title for a single todo (combined with ``description``).
        description: Optional free-text detail for the single todo.
        priority: Default priority applied when an entry does not set its own.
        todos: Optional bulk input (list/dict/JSON string) of todo entries.

    Returns:
        On success: ``success``, ``created`` summaries, ``count``, and the
        full sorted ``todos`` list with ``total_count``. On validation
        failure: ``success=False`` with an ``error`` message.
    """
    try:
        agent_id = agent_state.agent_id
        default_priority = _normalize_priority(priority)

        tasks_to_create: list[dict[str, Any]] = []

        if todos is not None:
            tasks_to_create.extend(_normalize_bulk_todos(todos))

        if title and title.strip():
            tasks_to_create.append(
                {
                    "title": title.strip(),
                    "description": description.strip() if description else None,
                    "priority": default_priority,
                }
            )

        if not tasks_to_create:
            return {
                "success": False,
                "error": "Provide a title or 'todos' list to create.",
                "todo_id": None,
            }

        # Validate every priority BEFORE touching storage: previously an
        # invalid priority on a later bulk entry aborted mid-loop, leaving
        # earlier entries created despite the error response.
        for task in tasks_to_create:
            task["priority"] = _normalize_priority(task.get("priority"), default_priority)

        agent_todos = _get_agent_todos(agent_id)
        created: list[dict[str, Any]] = []

        for task in tasks_to_create:
            todo_id = str(uuid.uuid4())[:6]  # short, human-referenceable id
            timestamp = datetime.now(UTC).isoformat()

            agent_todos[todo_id] = {
                "title": task["title"],
                "description": task.get("description"),
                "priority": task["priority"],
                "status": "pending",
                "created_at": timestamp,
                "updated_at": timestamp,
                "completed_at": None,
            }
            created.append(
                {
                    "todo_id": todo_id,
                    "title": task["title"],
                    "priority": task["priority"],
                }
            )

    except (ValueError, TypeError) as e:
        return {"success": False, "error": f"Failed to create todo: {e}", "todo_id": None}
    else:
        todos_list = _sorted_todos(agent_id)
        return {
            "success": True,
            "created": created,
            "count": len(created),
            "todos": todos_list,
            "total_count": len(todos_list),
        }
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def list_todos(
    agent_state: Any,
    status: str | None = None,
    priority: str | None = None,
) -> dict[str, Any]:
    """List the calling agent's todos, optionally filtered by status/priority.

    Args:
        agent_state: State object carrying the calling agent's ``agent_id``.
        status: Optional status filter (case-insensitive).
        priority: Optional priority filter (case-insensitive).

    Returns:
        Dict with ``success``, the filtered ``todos`` (sorted by status,
        priority, then creation time), ``total_count``, and a per-status
        ``summary``.
    """
    try:
        agent_id = agent_state.agent_id

        status_filter = status.lower() if isinstance(status, str) else None
        priority_filter = priority.lower() if isinstance(priority, str) else None

        # _sorted_todos applies the canonical status/priority/created_at
        # ordering (previously duplicated here); filtering afterwards
        # preserves that order.
        todos_list = [
            todo
            for todo in _sorted_todos(agent_id)
            if (not status_filter or todo.get("status") == status_filter)
            and (not priority_filter or todo.get("priority") == priority_filter)
        ]

        summary_counts: dict[str, int] = {"pending": 0, "in_progress": 0, "done": 0}
        for todo in todos_list:
            status_value = todo.get("status", "pending")
            summary_counts[status_value] = summary_counts.get(status_value, 0) + 1

        return {
            "success": True,
            "todos": todos_list,
            "total_count": len(todos_list),
            "summary": summary_counts,
        }

    except (ValueError, TypeError) as e:
        return {
            "success": False,
            "error": f"Failed to list todos: {e}",
            "todos": [],
            "total_count": 0,
            "summary": {"pending": 0, "in_progress": 0, "done": 0},
        }
|
||||
|
||||
|
||||
def _apply_single_update(
    agent_todos: dict[str, dict[str, Any]],
    todo_id: str,
    title: str | None = None,
    description: str | None = None,
    priority: str | None = None,
    status: str | None = None,
) -> dict[str, Any] | None:
    """Apply one field-level update to the todo *todo_id* in *agent_todos*.

    Returns None on success, or an error dict ({"todo_id", "error"}) without
    modifying the todo. All fields are validated BEFORE any mutation —
    previously a bad priority/status was rejected only after title and
    description had already been written, leaving a half-applied update with
    no ``updated_at`` bump.
    """
    if todo_id not in agent_todos:
        return {"todo_id": todo_id, "error": f"Todo with ID '{todo_id}' not found"}

    todo = agent_todos[todo_id]

    # --- validate everything first --------------------------------------
    if title is not None and not title.strip():
        return {"todo_id": todo_id, "error": "Title cannot be empty"}

    normalized_priority: str | None = None
    if priority is not None:
        try:
            normalized_priority = _normalize_priority(
                priority, str(todo.get("priority", "normal"))
            )
        except ValueError as exc:
            return {"todo_id": todo_id, "error": str(exc)}

    normalized_status: str | None = None
    if status is not None:
        normalized_status = status.lower()
        if normalized_status not in VALID_STATUSES:
            return {
                "todo_id": todo_id,
                "error": f"Invalid status. Must be one of: {', '.join(VALID_STATUSES)}",
            }

    # --- apply (all inputs valid from here on) ---------------------------
    if title is not None:
        todo["title"] = title.strip()

    if description is not None:
        todo["description"] = description.strip() if description else None

    if normalized_priority is not None:
        todo["priority"] = normalized_priority

    if normalized_status is not None:
        todo["status"] = normalized_status
        # completed_at is only meaningful for finished todos; clear it when
        # a todo is re-opened.
        todo["completed_at"] = (
            datetime.now(UTC).isoformat() if normalized_status == "done" else None
        )

    todo["updated_at"] = datetime.now(UTC).isoformat()
    return None
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def update_todo(
    agent_state: Any,
    todo_id: str | None = None,
    title: str | None = None,
    description: str | None = None,
    priority: str | None = None,
    status: str | None = None,
    updates: Any | None = None,
) -> dict[str, Any]:
    """Update a single todo (scalar args) and/or several at once (*updates*).

    Returns a dict with ``success``, the ``updated`` ids, the full sorted
    todo list, and — when some updates failed — an ``errors`` list.
    """
    try:
        agent_id = agent_state.agent_id
        agent_todos = _get_agent_todos(agent_id)

        pending: list[dict[str, Any]] = []
        if updates is not None:
            pending.extend(_normalize_bulk_updates(updates))
        if todo_id is not None:
            pending.append(
                {
                    "todo_id": todo_id,
                    "title": title,
                    "description": description,
                    "priority": priority,
                    "status": status,
                }
            )

        if not pending:
            return {
                "success": False,
                "error": "Provide todo_id or 'updates' list to update.",
            }

        applied: list[str] = []
        failures: list[dict[str, Any]] = []

        for entry in pending:
            failure = _apply_single_update(
                agent_todos,
                entry["todo_id"],
                entry.get("title"),
                entry.get("description"),
                entry.get("priority"),
                entry.get("status"),
            )
            if failure is None:
                applied.append(entry["todo_id"])
            else:
                failures.append(failure)

        todos_list = _sorted_todos(agent_id)

        result: dict[str, Any] = {
            "success": not failures,
            "updated": applied,
            "updated_count": len(applied),
            "todos": todos_list,
            "total_count": len(todos_list),
        }
        if failures:
            result["errors"] = failures

    except (ValueError, TypeError) as e:
        return {"success": False, "error": str(e)}
    else:
        return result
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def mark_todo_done(
    agent_state: Any,
    todo_id: str | None = None,
    todo_ids: Any | None = None,
) -> dict[str, Any]:
    """Mark one or several of the calling agent's todos as completed.

    Accepts a single ``todo_id``, a bulk ``todo_ids`` payload, or both.
    Each matched todo gets status "done" and its completion/update
    timestamps set; unknown IDs are reported in ``errors``.
    """
    try:
        agent_id = agent_state.agent_id
        agent_todos = _get_agent_todos(agent_id)

        # Accept a bulk payload, a single id, or both.
        targets: list[str] = []
        if todo_ids is not None:
            targets.extend(_normalize_todo_ids(todo_ids))
        if todo_id is not None:
            targets.append(todo_id)

        if not targets:
            return {"success": False, "error": "Provide todo_id or todo_ids to mark as done."}

        # One shared timestamp so every item in the batch agrees.
        now_iso = datetime.now(UTC).isoformat()
        completed: list[str] = []
        failures: list[dict[str, Any]] = []
        for tid in targets:
            if tid not in agent_todos:
                failures.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
                continue
            record = agent_todos[tid]
            record["status"] = "done"
            record["completed_at"] = now_iso
            record["updated_at"] = now_iso
            completed.append(tid)

        snapshot = _sorted_todos(agent_id)
        result: dict[str, Any] = {
            "success": not failures,
            "marked_done": completed,
            "marked_count": len(completed),
            "todos": snapshot,
            "total_count": len(snapshot),
        }
        if failures:
            result["errors"] = failures

    except (ValueError, TypeError) as e:
        return {"success": False, "error": str(e)}
    else:
        return result
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def mark_todo_pending(
    agent_state: Any,
    todo_id: str | None = None,
    todo_ids: Any | None = None,
) -> dict[str, Any]:
    """Reopen one or several of the calling agent's todos.

    Accepts a single ``todo_id``, a bulk ``todo_ids`` payload, or both.
    Each matched todo is set back to "pending" and its completion
    timestamp cleared; unknown IDs are reported in ``errors``.
    """
    try:
        agent_id = agent_state.agent_id
        agent_todos = _get_agent_todos(agent_id)

        # Accept a bulk payload, a single id, or both.
        targets: list[str] = []
        if todo_ids is not None:
            targets.extend(_normalize_todo_ids(todo_ids))
        if todo_id is not None:
            targets.append(todo_id)

        if not targets:
            return {"success": False, "error": "Provide todo_id or todo_ids to mark as pending."}

        # One shared timestamp so every item in the batch agrees.
        now_iso = datetime.now(UTC).isoformat()
        reopened: list[str] = []
        failures: list[dict[str, Any]] = []
        for tid in targets:
            if tid not in agent_todos:
                failures.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
                continue
            record = agent_todos[tid]
            record["status"] = "pending"
            record["completed_at"] = None  # clear completion when reopening
            record["updated_at"] = now_iso
            reopened.append(tid)

        snapshot = _sorted_todos(agent_id)
        result: dict[str, Any] = {
            "success": not failures,
            "marked_pending": reopened,
            "marked_count": len(reopened),
            "todos": snapshot,
            "total_count": len(snapshot),
        }
        if failures:
            result["errors"] = failures

    except (ValueError, TypeError) as e:
        return {"success": False, "error": str(e)}
    else:
        return result
|
||||
|
||||
|
||||
@register_tool(sandbox_execution=False)
def delete_todo(
    agent_state: Any,
    todo_id: str | None = None,
    todo_ids: Any | None = None,
) -> dict[str, Any]:
    """Delete one or several of the calling agent's todos.

    Accepts a single ``todo_id``, a bulk ``todo_ids`` payload, or both.
    Unknown IDs are reported in ``errors`` and flip ``success`` to False;
    the remaining todos are returned sorted.
    """
    try:
        agent_id = agent_state.agent_id
        agent_todos = _get_agent_todos(agent_id)

        ids_to_delete: list[str] = []
        if todo_ids is not None:
            ids_to_delete.extend(_normalize_todo_ids(todo_ids))
        if todo_id is not None:
            ids_to_delete.append(todo_id)

        # De-duplicate while preserving order. Without this, an ID passed via
        # both todo_id and todo_ids (or repeated inside todo_ids) is deleted on
        # the first pass and then reported as a spurious "not found" error on
        # the second, wrongly marking the whole call as failed.
        ids_to_delete = list(dict.fromkeys(ids_to_delete))

        if not ids_to_delete:
            return {"success": False, "error": "Provide todo_id or todo_ids to delete."}

        deleted: list[str] = []
        errors: list[dict[str, Any]] = []

        for tid in ids_to_delete:
            if tid not in agent_todos:
                errors.append({"todo_id": tid, "error": f"Todo with ID '{tid}' not found"})
                continue

            del agent_todos[tid]
            deleted.append(tid)

        todos_list = _sorted_todos(agent_id)

        response: dict[str, Any] = {
            "success": len(errors) == 0,
            "deleted": deleted,
            "deleted_count": len(deleted),
            "todos": todos_list,
            "total_count": len(todos_list),
        }

        if errors:
            response["errors"] = errors

    except (ValueError, TypeError) as e:
        return {"success": False, "error": str(e)}
    else:
        return response
|
||||
225
strix/tools/todo/todo_actions_schema.xml
Normal file
225
strix/tools/todo/todo_actions_schema.xml
Normal file
@@ -0,0 +1,225 @@
|
||||
<tools>
|
||||
<important>
|
||||
The todo tool is available for organizing complex tasks when needed. Each subagent has their own
|
||||
separate todo list - your todos are private to you and do not interfere with other agents' todos.
|
||||
|
||||
WHEN TO USE TODOS:
|
||||
- Planning complex multi-step operations
|
||||
- Tracking multiple parallel workstreams
|
||||
- When you need to remember tasks to return to later
|
||||
- Organizing large-scope assessments with many components
|
||||
|
||||
WHEN NOT NEEDED:
|
||||
- Simple, straightforward tasks
|
||||
- Linear workflows where progress is obvious
|
||||
- Short tasks that can be completed quickly
|
||||
|
||||
If you do use todos, batch operations together to minimize tool calls.
|
||||
</important>
|
||||
|
||||
<tool name="create_todo">
|
||||
<description>Create a new todo item to track tasks, goals, and progress.</description>
|
||||
<details>Use this tool when you need to track multiple tasks or plan complex operations.
|
||||
Each subagent maintains their own independent todo list - your todos are yours alone.
|
||||
|
||||
Useful for breaking down complex tasks into smaller, manageable items when the workflow
|
||||
is non-trivial or when you need to track progress across multiple components.</details>
|
||||
<parameters>
|
||||
<parameter name="title" type="string" required="false">
|
||||
<description>Short, actionable title for the todo (e.g., "Test login endpoint for SQL injection")</description>
|
||||
</parameter>
|
||||
<parameter name="todos" type="string" required="false">
|
||||
<description>Create multiple todos at once. Provide a JSON array of {"title": "...", "description": "...", "priority": "..."} objects or a newline-separated bullet list.</description>
|
||||
</parameter>
|
||||
<parameter name="description" type="string" required="false">
|
||||
<description>Detailed description or notes about the task</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>Priority level: "low", "normal", "high", "critical" (default: "normal")</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - created: List of created todos with their IDs - todos: Full sorted todo list - success: Whether the operation succeeded</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Create a high priority todo
|
||||
<function=create_todo>
|
||||
<parameter=title>Test authentication bypass on /api/admin</parameter>
|
||||
<parameter=description>The admin endpoint seems to have weak authentication. Try JWT manipulation, session fixation, and privilege escalation.</parameter>
|
||||
<parameter=priority>high</parameter>
|
||||
</function>
|
||||
|
||||
# Create a simple todo
|
||||
<function=create_todo>
|
||||
<parameter=title>Enumerate all API endpoints</parameter>
|
||||
</function>
|
||||
|
||||
# Bulk create todos (JSON array)
|
||||
<function=create_todo>
|
||||
<parameter=todos>[{"title": "Map all admin routes", "priority": "high"}, {"title": "Check forgotten password flow"}]</parameter>
|
||||
</function>
|
||||
|
||||
# Bulk create todos (bullet list)
|
||||
<function=create_todo>
|
||||
<parameter=todos>
|
||||
- Capture baseline traffic in proxy
|
||||
- Enumerate S3 buckets for leaked assets
|
||||
- Compare responses for timing differences
|
||||
</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
<tool name="list_todos">
|
||||
<description>List all todos with optional filtering by status or priority.</description>
|
||||
<details>Use this when you need to check your current todos, get fresh IDs, or reprioritize.
|
||||
The list is sorted: done first, then in_progress, then pending. Within each status, sorted by priority (critical > high > normal > low).
|
||||
Each subagent has their own independent todo list.</details>
|
||||
<parameters>
|
||||
<parameter name="status" type="string" required="false">
|
||||
<description>Filter by status: "pending", "in_progress", "done"</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>Filter by priority: "low", "normal", "high", "critical"</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - todos: List of todo items - total_count: Total number of todos - summary: Count by status (pending, in_progress, done)</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# List all todos
|
||||
<function=list_todos>
|
||||
</function>
|
||||
|
||||
# List only pending todos
|
||||
<function=list_todos>
|
||||
<parameter=status>pending</parameter>
|
||||
</function>
|
||||
|
||||
# List high priority items
|
||||
<function=list_todos>
|
||||
<parameter=priority>high</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
<tool name="update_todo">
|
||||
<description>Update one or multiple todo items. Prefer bulk updates in a single call when updating multiple items.</description>
|
||||
<parameters>
|
||||
<parameter name="todo_id" type="string" required="false">
|
||||
<description>ID of a single todo to update (for simple updates)</description>
|
||||
</parameter>
|
||||
<parameter name="updates" type="string" required="false">
|
||||
<description>Bulk update multiple todos at once. JSON array of objects with todo_id and fields to update: [{"todo_id": "abc", "status": "done"}, {"todo_id": "def", "priority": "high"}].</description>
|
||||
</parameter>
|
||||
<parameter name="title" type="string" required="false">
|
||||
<description>New title (used with todo_id)</description>
|
||||
</parameter>
|
||||
<parameter name="description" type="string" required="false">
|
||||
<description>New description (used with todo_id)</description>
|
||||
</parameter>
|
||||
<parameter name="priority" type="string" required="false">
|
||||
<description>New priority: "low", "normal", "high", "critical" (used with todo_id)</description>
|
||||
</parameter>
|
||||
<parameter name="status" type="string" required="false">
|
||||
<description>New status: "pending", "in_progress", "done" (used with todo_id)</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - updated: List of updated todo IDs - updated_count: Number updated - todos: Full sorted todo list - errors: Any failed updates</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Single update
|
||||
<function=update_todo>
|
||||
<parameter=todo_id>abc123</parameter>
|
||||
<parameter=status>in_progress</parameter>
|
||||
</function>
|
||||
|
||||
# Bulk update - mark multiple todos with different statuses in ONE call
|
||||
<function=update_todo>
|
||||
<parameter=updates>[{"todo_id": "abc123", "status": "done"}, {"todo_id": "def456", "status": "in_progress"}, {"todo_id": "ghi789", "priority": "critical"}]</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
<tool name="mark_todo_done">
|
||||
<description>Mark one or multiple todos as completed in a single call.</description>
|
||||
<details>Mark todos as done after completing them. Group multiple completions into one call using todo_ids when possible.</details>
|
||||
<parameters>
|
||||
<parameter name="todo_id" type="string" required="false">
|
||||
<description>ID of a single todo to mark as done</description>
|
||||
</parameter>
|
||||
<parameter name="todo_ids" type="string" required="false">
|
||||
<description>Mark multiple todos done at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - marked_done: List of IDs marked done - marked_count: Number marked - todos: Full sorted list - errors: Any failures</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Mark single todo done
|
||||
<function=mark_todo_done>
|
||||
<parameter=todo_id>abc123</parameter>
|
||||
</function>
|
||||
|
||||
# Mark multiple todos done in ONE call
|
||||
<function=mark_todo_done>
|
||||
<parameter=todo_ids>["abc123", "def456", "ghi789"]</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
<tool name="mark_todo_pending">
|
||||
<description>Mark one or multiple todos as pending (reopen completed tasks).</description>
|
||||
<details>Use this to reopen tasks that were marked done but need more work. Supports bulk operations.</details>
|
||||
<parameters>
|
||||
<parameter name="todo_id" type="string" required="false">
|
||||
<description>ID of a single todo to mark as pending</description>
|
||||
</parameter>
|
||||
<parameter name="todo_ids" type="string" required="false">
|
||||
<description>Mark multiple todos pending at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - marked_pending: List of IDs marked pending - marked_count: Number marked - todos: Full sorted list - errors: Any failures</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Mark single todo pending
|
||||
<function=mark_todo_pending>
|
||||
<parameter=todo_id>abc123</parameter>
|
||||
</function>
|
||||
|
||||
# Mark multiple todos pending in ONE call
|
||||
<function=mark_todo_pending>
|
||||
<parameter=todo_ids>["abc123", "def456"]</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
|
||||
<tool name="delete_todo">
|
||||
<description>Delete one or multiple todos in a single call.</description>
|
||||
<details>Use this to remove todos that are no longer relevant. Supports bulk deletion to save tool calls.</details>
|
||||
<parameters>
|
||||
<parameter name="todo_id" type="string" required="false">
|
||||
<description>ID of a single todo to delete</description>
|
||||
</parameter>
|
||||
<parameter name="todo_ids" type="string" required="false">
|
||||
<description>Delete multiple todos at once. JSON array of IDs: ["abc123", "def456"] or comma-separated: "abc123, def456"</description>
|
||||
</parameter>
|
||||
</parameters>
|
||||
<returns type="Dict[str, Any]">
|
||||
<description>Response containing: - deleted: List of deleted IDs - deleted_count: Number deleted - todos: Remaining todos - errors: Any failures</description>
|
||||
</returns>
|
||||
<examples>
|
||||
# Delete single todo
|
||||
<function=delete_todo>
|
||||
<parameter=todo_id>abc123</parameter>
|
||||
</function>
|
||||
|
||||
# Delete multiple todos in ONE call
|
||||
<function=delete_todo>
|
||||
<parameter=todo_ids>["abc123", "def456", "ghi789"]</parameter>
|
||||
</function>
|
||||
</examples>
|
||||
</tool>
|
||||
</tools>
|
||||
1
tests/__init__.py
Normal file
1
tests/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
# Strix Test Suite
|
||||
1
tests/agents/__init__.py
Normal file
1
tests/agents/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.agents module."""
|
||||
1
tests/conftest.py
Normal file
1
tests/conftest.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Pytest configuration and shared fixtures for Strix tests."""
|
||||
1
tests/interface/__init__.py
Normal file
1
tests/interface/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.interface module."""
|
||||
1
tests/llm/__init__.py
Normal file
1
tests/llm/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.llm module."""
|
||||
1
tests/runtime/__init__.py
Normal file
1
tests/runtime/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.runtime module."""
|
||||
1
tests/telemetry/__init__.py
Normal file
1
tests/telemetry/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.telemetry module."""
|
||||
1
tests/tools/__init__.py
Normal file
1
tests/tools/__init__.py
Normal file
@@ -0,0 +1 @@
|
||||
"""Tests for strix.tools module."""
|
||||
34
tests/tools/conftest.py
Normal file
34
tests/tools/conftest.py
Normal file
@@ -0,0 +1,34 @@
|
||||
"""Fixtures for strix.tools tests."""
|
||||
|
||||
from collections.abc import Callable
|
||||
from typing import Any
|
||||
|
||||
import pytest
|
||||
|
||||
|
||||
@pytest.fixture
def sample_function_with_types() -> Callable[..., None]:
    """Provide a fully annotated stub function for argument-conversion tests."""

    # Parameter names and annotations are the fixture's contract: tests pass
    # string kwargs keyed on these names and expect type-driven conversion.
    def typed_stub(
        name: str,
        count: int,
        enabled: bool,
        ratio: float,
        items: list[Any],
        config: dict[str, Any],
        optional: str | None = None,
    ) -> None:
        pass

    return typed_stub
|
||||
|
||||
|
||||
@pytest.fixture
def sample_function_no_annotations() -> Callable[..., None]:
    """Provide an unannotated stub function for argument-conversion tests."""

    def untyped_stub(arg1, arg2, arg3):  # type: ignore[no-untyped-def]
        pass

    return untyped_stub
|
||||
271
tests/tools/test_argument_parser.py
Normal file
271
tests/tools/test_argument_parser.py
Normal file
@@ -0,0 +1,271 @@
|
||||
from collections.abc import Callable
|
||||
|
||||
import pytest
|
||||
|
||||
from strix.tools.argument_parser import (
|
||||
ArgumentConversionError,
|
||||
_convert_basic_types,
|
||||
_convert_to_bool,
|
||||
_convert_to_dict,
|
||||
_convert_to_list,
|
||||
convert_arguments,
|
||||
convert_string_to_type,
|
||||
)
|
||||
|
||||
|
||||
class TestConvertToBool:
    """Unit tests for _convert_to_bool string-to-boolean coercion."""

    @pytest.mark.parametrize(
        "value",
        ["true", "True", "TRUE", "1", "yes", "Yes", "YES", "on", "On", "ON"],
    )
    def test_truthy_values(self, value: str) -> None:
        """Every recognized truthy spelling maps to True."""
        assert _convert_to_bool(value) is True

    @pytest.mark.parametrize(
        "value",
        ["false", "False", "FALSE", "0", "no", "No", "NO", "off", "Off", "OFF"],
    )
    def test_falsy_values(self, value: str) -> None:
        """Every recognized falsy spelling maps to False."""
        assert _convert_to_bool(value) is False

    def test_non_standard_truthy_string(self) -> None:
        """Unrecognized non-empty strings fall back to plain truthiness."""
        assert _convert_to_bool("hello") is True
        assert _convert_to_bool("anything") is True

    def test_empty_string(self) -> None:
        """An empty string coerces to False."""
        assert _convert_to_bool("") is False
|
||||
|
||||
|
||||
class TestConvertToList:
    """Unit tests for _convert_to_list string-to-list coercion."""

    def test_json_array_string(self) -> None:
        """A JSON array of strings parses element-for-element."""
        assert _convert_to_list('["a", "b", "c"]') == ["a", "b", "c"]

    def test_json_array_with_numbers(self) -> None:
        """A JSON array of numbers keeps numeric element types."""
        assert _convert_to_list("[1, 2, 3]") == [1, 2, 3]

    def test_comma_separated_string(self) -> None:
        """Comma-separated input is split and whitespace-trimmed."""
        assert _convert_to_list("a, b, c") == ["a", "b", "c"]

    def test_comma_separated_no_spaces(self) -> None:
        """Splitting works without spaces after the commas."""
        assert _convert_to_list("x,y,z") == ["x", "y", "z"]

    def test_single_value(self) -> None:
        """A bare value becomes a one-element list."""
        assert _convert_to_list("single") == ["single"]

    def test_json_non_array_wraps_in_list(self) -> None:
        """A JSON scalar is wrapped rather than rejected."""
        assert _convert_to_list('"string"') == ["string"]

    def test_json_object_wraps_in_list(self) -> None:
        """A JSON object is wrapped rather than rejected."""
        assert _convert_to_list('{"key": "value"}') == [{"key": "value"}]

    def test_empty_json_array(self) -> None:
        """An empty JSON array yields an empty list."""
        assert _convert_to_list("[]") == []
|
||||
|
||||
|
||||
class TestConvertToDict:
    """Unit tests for _convert_to_dict string-to-dict coercion."""

    def test_valid_json_object(self) -> None:
        """A well-formed JSON object parses key-for-key."""
        assert _convert_to_dict('{"key": "value", "number": 42}') == {"key": "value", "number": 42}

    def test_empty_json_object(self) -> None:
        """An empty JSON object yields an empty dict."""
        assert _convert_to_dict("{}") == {}

    def test_invalid_json_returns_empty_dict(self) -> None:
        """Malformed input degrades to an empty dict instead of raising."""
        assert _convert_to_dict("not json") == {}

    def test_json_array_returns_empty_dict(self) -> None:
        """Valid JSON that is not an object also degrades to an empty dict."""
        assert _convert_to_dict("[1, 2, 3]") == {}

    def test_nested_json_object(self) -> None:
        """Nested objects are preserved as nested dicts."""
        assert _convert_to_dict('{"outer": {"inner": "value"}}') == {"outer": {"inner": "value"}}
|
||||
|
||||
|
||||
class TestConvertBasicTypes:
    """Unit tests for _convert_basic_types dispatch over target types."""

    def test_convert_to_int(self) -> None:
        """Positive and negative integer strings convert to int."""
        assert _convert_basic_types("-10", int) == -10
        assert _convert_basic_types("42", int) == 42

    def test_convert_to_float(self) -> None:
        """Positive and negative decimal strings convert to float."""
        assert _convert_basic_types("-2.5", float) == -2.5
        assert _convert_basic_types("3.14", float) == 3.14

    def test_convert_to_str(self) -> None:
        """A str target is a no-op passthrough."""
        assert _convert_basic_types("hello", str) == "hello"

    def test_convert_to_bool(self) -> None:
        """Boolean targets route through the bool coercion."""
        assert _convert_basic_types("true", bool) is True
        assert _convert_basic_types("false", bool) is False

    def test_convert_to_list_type(self) -> None:
        """List targets route through the list coercion."""
        assert _convert_basic_types("[1, 2, 3]", list) == [1, 2, 3]

    def test_convert_to_dict_type(self) -> None:
        """Dict targets route through the dict coercion."""
        assert _convert_basic_types('{"a": 1}', dict) == {"a": 1}

    def test_unknown_type_attempts_json(self) -> None:
        """An unrecognized target type falls back to JSON parsing."""
        assert _convert_basic_types('{"key": "value"}', object) == {"key": "value"}

    def test_unknown_type_returns_original(self) -> None:
        """Input that is not JSON either is handed back untouched."""
        assert _convert_basic_types("plain text", object) == "plain text"
|
||||
|
||||
|
||||
class TestConvertStringToType:
    """Unit tests for convert_string_to_type, including union handling."""

    def test_basic_type_conversion(self) -> None:
        """Plain int/float/bool targets convert directly."""
        assert convert_string_to_type("42", int) == 42
        assert convert_string_to_type("3.14", float) == 3.14
        assert convert_string_to_type("true", bool) is True

    def test_optional_type(self) -> None:
        """An optional target unwraps to its non-None member."""
        assert convert_string_to_type("42", int | None) == 42

    def test_union_type(self) -> None:
        """A multi-member union picks the matching member (int here)."""
        assert convert_string_to_type("42", int | str) == 42

    def test_union_type_with_none(self) -> None:
        """A str-or-None union passes strings through."""
        assert convert_string_to_type("hello", str | None) == "hello"

    def test_modern_union_syntax(self) -> None:
        """PEP 604 `int | None` syntax behaves like Optional[int]."""
        assert convert_string_to_type("100", int | None) == 100
|
||||
|
||||
|
||||
class TestConvertArguments:
    """Unit tests for convert_arguments over a function signature."""

    def test_converts_typed_arguments(
        self, sample_function_with_types: Callable[..., None]
    ) -> None:
        """String kwargs are coerced according to each parameter's annotation."""
        raw_args = {
            "name": "test",
            "count": "5",
            "enabled": "true",
            "ratio": "2.5",
            "items": "[1, 2, 3]",
            "config": '{"key": "value"}',
        }
        converted = convert_arguments(sample_function_with_types, raw_args)

        assert converted["name"] == "test"
        assert converted["count"] == 5
        assert converted["enabled"] is True
        assert converted["ratio"] == 2.5
        assert converted["items"] == [1, 2, 3]
        assert converted["config"] == {"key": "value"}

    def test_passes_through_none_values(
        self, sample_function_with_types: Callable[..., None]
    ) -> None:
        """None values skip conversion entirely."""
        converted = convert_arguments(sample_function_with_types, {"name": "test", "count": None})
        assert converted["count"] is None

    def test_passes_through_non_string_values(
        self, sample_function_with_types: Callable[..., None]
    ) -> None:
        """Already-typed (non-string) values skip conversion entirely."""
        converted = convert_arguments(sample_function_with_types, {"name": "test", "count": 42})
        assert converted["count"] == 42

    def test_unknown_parameter_passed_through(
        self, sample_function_with_types: Callable[..., None]
    ) -> None:
        """Kwargs absent from the signature survive unchanged."""
        converted = convert_arguments(
            sample_function_with_types, {"name": "test", "unknown_param": "value"}
        )
        assert converted["unknown_param"] == "value"

    def test_function_without_annotations(
        self, sample_function_no_annotations: Callable[..., None]
    ) -> None:
        """With no annotations, every value is left as the original string."""
        converted = convert_arguments(
            sample_function_no_annotations, {"arg1": "value1", "arg2": "42"}
        )
        assert converted["arg1"] == "value1"
        assert converted["arg2"] == "42"

    def test_raises_error_on_conversion_failure(
        self, sample_function_with_types: Callable[..., None]
    ) -> None:
        """An unconvertible value raises ArgumentConversionError naming the parameter."""
        with pytest.raises(ArgumentConversionError) as exc_info:
            convert_arguments(sample_function_with_types, {"count": "not_a_number"})
        assert exc_info.value.param_name == "count"
|
||||
|
||||
|
||||
class TestArgumentConversionError:
    """Unit tests for the ArgumentConversionError exception type."""

    def test_error_with_param_name(self) -> None:
        """The optional param_name is stored and the message is preserved."""
        err = ArgumentConversionError("Test error", param_name="test_param")
        assert str(err) == "Test error"
        assert err.param_name == "test_param"

    def test_error_without_param_name(self) -> None:
        """Omitting param_name leaves it as None; the message is preserved."""
        err = ArgumentConversionError("Test error")
        assert str(err) == "Test error"
        assert err.param_name is None
|
||||
Reference in New Issue
Block a user