146 Commits

Author SHA1 Message Date
0xallam
d7f712581d chore: Bump strix version to 0.6.0 2026-01-12 09:19:19 -08:00
0xallam
4818a854d6 feat: modernize TUI status bar with sweep animation
- Replace braille spinner with ping-pong sweep animation using colored squares
- Add smooth gradient fade with 8 color steps from dim to bright green
- Modernize keymap styling: keys in white, actions in dim, separated by ·
- Move "esc stop" to left side next to animation
- Change ctrl-c to ctrl-q for quit
- Simplify "Initializing Agent" to just "Initializing"
- Remove italic styling from status text
- Waiting state shows only "Send message to resume" hint
- Remove unused action verbs and related dead code

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 23:54:24 -08:00
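The ping-pong sweep described in this commit can be pictured as an index that walks across a row of squares and then back again. A minimal sketch, assuming a hypothetical `sweep_position` helper and an arbitrary row width (neither is taken from the Strix source):

```python
def sweep_position(frame: int, width: int) -> int:
    """Return the index of the lit square for a ping-pong sweep.

    Hypothetical helper for illustration; not the Strix implementation.
    """
    if width < 2:
        return 0
    period = 2 * (width - 1)  # one full trip to the right and back
    step = frame % period
    # The first half of the period moves right, the second half mirrors back.
    return step if step < width else period - step
```

With `width=4` the index walks 0, 1, 2, 3 and then back down, so the end squares are not shown twice in a row.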
0xallam
9bcb43e713 fix: correct GitHub repository URL in README
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:53:10 -08:00
0xallam
5672925736 docs: document config persistence in README
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
61c94189c6 fix: allow clearing saved config by setting empty env var
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
f539e5aafd fix: apply saved config at module level before strix imports
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
1ffeedcf55 fix: handle chmod failure on Windows gracefully
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
c059f47d01 refactor: add explicit STRIX_IMAGE validation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
7dab26cdd5 refactor: remove unused LLMRequestQueue constructor params
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
498032e279 refactor: replace type ignores with inline fallbacks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
b80bb165b9 refactor: use Config.get() in validate_environment()
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
fe456d57fe fix: set restrictive permissions on config file
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
13e804b7e3 refactor: remove STRIX_IMAGE constant, use Config.get() instead
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
0xallam
2e3dc0d276 fix: remove default for strix_llm, keep it required 2026-01-10 15:49:03 -08:00
0xallam
83efe3816f feat: add centralized Config class with auto-save to ~/.strix/cli-config.json
- Add Config class with all env var defaults in one place
- Auto-load saved config on startup (env vars take precedence)
- Auto-save config after successful LLM warm-up
- Replace scattered os.getenv() calls with Config.get()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 15:49:03 -08:00
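The lookup order in this commit (environment variable first, then the saved file) can be sketched as follows. This is a minimal illustration under stated assumptions: `ConfigSketch`, its `DEFAULTS` table, and the `save()` signature are hypothetical, and the real `Config` class may differ.

```python
from __future__ import annotations

import json
import os
from pathlib import Path


class ConfigSketch:
    """Hypothetical stand-in for the Config class named in the commit."""

    # Illustrative default only; the real default table is not shown in the log.
    DEFAULTS = {"STRIX_REASONING_EFFORT": "high"}

    def __init__(self, path: Path) -> None:
        self.path = path
        self.saved: dict[str, str] = {}
        if path.exists():
            self.saved = json.loads(path.read_text())

    def get(self, name: str) -> str | None:
        # Environment variables take precedence over saved values,
        # which in turn take precedence over built-in defaults.
        if name in os.environ:
            return os.environ[name]
        if name in self.saved:
            return self.saved[name]
        return self.DEFAULTS.get(name)

    def save(self, values: dict[str, str]) -> None:
        # The commit saves after a successful LLM warm-up; here the caller decides.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(values, indent=2))
        self.saved = dict(values)
```

Passing the file path in explicitly keeps the sketch testable without touching `~/.strix/cli-config.json`.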
0xallam
52aa763d47 fix: add missing 'low' value to reasoning effort options
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 20:17:46 -08:00
Ahmed Allam
d932602a6b Update args in strix/interface/main.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-01-09 20:00:01 -08:00
0xallam
6f4ca95338 feat: add STRIX_REASONING_EFFORT env var to control thinking effort
- Add configurable reasoning effort via environment variable
- Default to "high", but use "medium" for quick scan mode
- Document in README and interface error panel

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 20:00:01 -08:00
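The default selection described above might look roughly like this. It is a sketch only: `resolve_reasoning_effort` is a hypothetical name, and it assumes an explicitly set `STRIX_REASONING_EFFORT` overrides the per-mode default, which the commit message does not state.

```python
import os


def resolve_reasoning_effort(scan_mode: str) -> str:
    """Pick a reasoning effort; hypothetical helper, not the Strix function."""
    # Assumption: an explicit environment value always wins.
    explicit = os.environ.get("STRIX_REASONING_EFFORT")
    if explicit:
        return explicit
    # Quick scans use the cheaper setting; every other mode defaults to "high".
    return "medium" if scan_mode == "quick" else "high"
```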
0xallam
fb6f6295c5 docs: reformat recommended models as bulleted list 2026-01-09 16:49:16 -08:00
0xallam
f56f56a7f7 docs: add Gemini 3 Pro Preview to recommended models 2026-01-09 16:47:33 -08:00
0xallam
86a687ede8 fix: restrict result type check to dict or str 2026-01-09 16:44:05 -08:00
0xallam
7b7ea59a37 fix: handle string results in tool renderers
Previously, tool renderers assumed result was always a dict and would
crash with AttributeError when result was a string (e.g., error messages).
Now all renderers properly check for string results and display them.
2026-01-09 16:44:05 -08:00
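The failure mode here is calling a dict method on a string result. A minimal sketch of the guard, assuming a hypothetical `render_result` function and a hypothetical `"output"` key (the real renderers and result schema are not shown in this log):

```python
from typing import Any


def render_result(result: Any) -> str:
    """Return display text for a tool result; illustrative only."""
    if isinstance(result, str):
        # Error messages and similar arrive as plain strings: show them as-is.
        return result
    if isinstance(result, dict):
        # "output" is a hypothetical key used for this example.
        return str(result.get("output", ""))
    # Any other type is ignored rather than risking an AttributeError.
    return ""
```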
Daniel Sangorrin
226678f3f2 fix: add thinking blocks 2026-01-09 15:40:21 -08:00
Ahmed Allam
49421f50d5 Remove title from README 2026-01-10 02:35:20 +04:00
0xallam
b6b0778956 Simplify stats panel display format 2026-01-09 14:25:00 -08:00
0xallam
4a58226c9a Modernize vulnerability detail dialog styling 2026-01-09 14:25:00 -08:00
0xallam
94bb97143e Add PostHog integration for analytics and error debugging 2026-01-09 14:24:04 -08:00
dependabot[bot]
bcd6b8a715 chore(deps): bump pypdf from 6.4.0 to 6.6.0
Bumps [pypdf](https://github.com/py-pdf/pypdf) from 6.4.0 to 6.6.0.
- [Release notes](https://github.com/py-pdf/pypdf/releases)
- [Changelog](https://github.com/py-pdf/pypdf/blob/main/CHANGELOG.md)
- [Commits](https://github.com/py-pdf/pypdf/compare/6.4.0...6.6.0)

---
updated-dependencies:
- dependency-name: pypdf
  dependency-version: 6.6.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-09 12:28:41 -08:00
0xallam
c53a0f6b64 fix: reduce spacing between consecutive tool calls in TUI 2026-01-08 17:53:16 -08:00
0xallam
dc5043452e fix: use fixed per-request timeout for tool server health checks
The previous implementation divided total timeout by retries, making the
timeout behavior confusing and the actual wait time unpredictable. Now
uses a consistent 5-second timeout per request for clearer semantics.
2026-01-08 17:41:44 -08:00
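To illustrate the change: under the earlier scheme each request would get `total_timeout / retries` seconds, so the per-request wait shifts whenever either constant changes. With a fixed value every request waits the same amount. The sketch below assumes a hypothetical `wait_for_health` function with an injected `probe` callable, and the backoff schedule between attempts is also an assumption.

```python
import time
from typing import Callable

PER_REQUEST_TIMEOUT = 5.0  # seconds; the fixed value named in the commit


def wait_for_health(
    probe: Callable[[float], bool],
    retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe(timeout) up to `retries` times; hypothetical sketch."""
    for attempt in range(retries):
        try:
            # Every attempt gets the same timeout, regardless of `retries`.
            if probe(PER_REQUEST_TIMEOUT):
                return True
        except OSError:
            pass  # connection errors mean "not healthy yet"
        if attempt < retries - 1:
            # Assumed backoff between attempts: 1s, 2s, 4s, capped at 8s.
            sleep(min(2 ** attempt, 8))
    return False
```

Injecting `probe` and `sleep` keeps the loop independent of any particular HTTP client and lets it run instantly in a test.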
0xallam
13ba8746dd feat: add tool server health check and show error details in CLI
- Add _wait_for_tool_server_health() to verify tool server is responding after init
- Show error details in CLI mode when penetration test fails
- Simplify error message (remove technical URL details)
2026-01-08 17:41:44 -08:00
0xallam
a31ed36778 feat: add tool server health check during sandbox initialization
- Add _wait_for_tool_server_health() method with retry logic and exponential backoff
- Check tool server /health endpoint after container initialization
- Add async _verify_tool_server_health() for health check when reusing containers
- Raise SandboxInitializationError with helpful message if tool server is not responding
- Add TOOL_SERVER_HEALTH_TIMEOUT and TOOL_SERVER_HEALTH_RETRIES constants
2026-01-08 17:41:44 -08:00
0xallam
740fb3ed40 fix: add timeout handling for Docker operations and improve error messages
- Add SandboxInitializationError exception for sandbox/Docker failures
- Add 60-second timeout to Docker client initialization
- Add _exec_run_with_timeout() method using ThreadPoolExecutor for exec_run calls
- Catch ConnectionError and Timeout exceptions from requests library
- Add _handle_sandbox_error() and _handle_llm_error() methods in base_agent.py
- Handle sandbox_error_details tool in TUI for displaying errors
- Increase TUI truncation limits for better error visibility
- Update all Docker error messages with helpful hint:
  'Please ensure Docker Desktop is installed and running, and try running strix again.'
2026-01-08 17:41:44 -08:00
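The `ThreadPoolExecutor` approach mentioned for `exec_run` calls can be sketched generically. The names `run_with_timeout` and `OperationTimedOut` are hypothetical; the commit itself names `_exec_run_with_timeout()` and `SandboxInitializationError`, whose bodies are not shown here.

```python
import concurrent.futures
from typing import Callable, TypeVar

T = TypeVar("T")


class OperationTimedOut(RuntimeError):
    """Hypothetical error type for this example."""


def run_with_timeout(fn: Callable[[], T], timeout: float) -> T:
    """Run a blocking call in a worker thread and give up after `timeout` seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError as exc:
        raise OperationTimedOut(f"operation exceeded {timeout}s") from exc
    finally:
        # Do not wait for a stuck worker: the thread is abandoned, not killed.
        executor.shutdown(wait=False)
```

One design point worth noting: using the executor as a context manager would wait for the worker on exit and defeat the timeout, which is why the sketch calls `shutdown(wait=False)` instead.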
0xallam
c327ce621f Remove --run-name CLI argument 2026-01-08 15:16:25 -08:00
0xallam
e8662fbda9 Add background styling to finish and reporting tool renderers
- Wrap finish_scan and create_vulnerability_report tool output in Padding with dark grey background (#141414)
- Refactor TUI rendering to support heterogeneous renderables (Text, Padding, Group) instead of just Text
- Update _render_streaming_content and _render_tool_content_simple to return Any renderable type
- Handle interrupted messages by composing with Group instead of appending to Text
2026-01-08 15:09:10 -08:00
0xallam
cdf3cca3b7 fix(tui): hide cost in stats panel when zero 2026-01-08 12:21:18 -08:00
0xallam
0159d431ea fix(tui): rename 'Tokens' to 'Total Tokens' in stats display 2026-01-08 12:21:18 -08:00
0xallam
bf04b304e6 fix(tui): compare vulnerability content instead of just count for updates 2026-01-08 12:21:18 -08:00
0xallam
a1d7c0f810 fix(tui): use consistent severity colors between vulnerability components 2026-01-08 12:21:18 -08:00
0xallam
47e07c8a04 feat(tui): add vulnerability detail dialog with markdown copy support
- Add VulnerabilityDetailScreen modal with full vulnerability details
- Add Copy button that exports report as markdown to clipboard
- Add VulnerabilitiesPanel in sidebar showing found vulnerabilities
- Add clickable VulnerabilityItem widgets with severity-colored dots
- ESC key closes modal dialogs
- Remove emojis from TUI stats panel for cleaner display
- Add build_tui_stats_text() for minimal TUI-specific stats
2026-01-08 12:21:18 -08:00
0xallam
ea31e0cc9d fix(llm): suppress RuntimeWarnings for unawaited coroutines from asyncio 2026-01-07 20:09:46 -08:00
0xallam
9bb8475e2f refactor(cli): remove final statistics display from CLI output 2026-01-07 19:53:40 -08:00
0xallam
a09d2795e2 feat(reporting): improve vulnerability display and reporting format 2026-01-07 19:51:41 -08:00
0xallam
17ee6e6e6f chore: increase truncation limit to 8000 chars 2026-01-07 19:32:45 -08:00
0xallam
01ae348da8 feat(reporting): add LLM-based vulnerability deduplication
- Add dedupe.py with XML-based LLM deduplication using direct litellm calls
- Integrate deduplication check in create_vulnerability_report tool
- Add get_existing_vulnerabilities() method to tracer for fetching reports
- Update schema and system prompt with deduplication guidelines
2026-01-07 19:32:45 -08:00
dependabot[bot]
0e9cd9b2a4 chore(deps): bump urllib3 from 2.6.0 to 2.6.3
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.6.0 to 2.6.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.6.0...2.6.3)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-07 19:25:31 -08:00
0xallam
2ea5ff6695 feat(reporting): enhance vulnerability reporting with detailed fields and CVSS calculation 2026-01-07 17:50:32 -08:00
0xallam
06659d98ba feat: enable container access to host localhost services
Rewrite localhost/127.x.x.x/0.0.0.0 target URLs to use host.docker.internal,
allowing the container to reach services running on the host machine.

- Add extra_hosts mapping for host.docker.internal on Linux
- Add HOST_GATEWAY env var to container
- Add rewrite_localhost_targets() to transform localhost URLs
- Support full 127.0.0.0/8 loopback range and IPv6 ::1
2026-01-07 12:04:21 -08:00
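A rough sketch of the URL rewrite, using only the standard library. The function names below are hypothetical and the behavior is inferred from the commit message; the actual `rewrite_localhost_targets()` may handle more cases (for example credentials in the URL, which this sketch drops).

```python
import ipaddress
from urllib.parse import urlsplit, urlunsplit

HOST_GATEWAY = "host.docker.internal"


def is_host_local(host: str) -> bool:
    """True for localhost, 0.0.0.0, any 127.0.0.0/8 address, and ::1."""
    if host.lower() == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # an ordinary DNS name
    return addr.is_loopback or str(addr) == "0.0.0.0"


def rewrite_localhost_target_sketch(url: str) -> str:
    """Point a loopback URL at the Docker host gateway; other URLs pass through."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None or not is_host_local(host):
        return url
    # Keep the port if one was given; userinfo is dropped in this sketch.
    netloc = HOST_GATEWAY if parts.port is None else f"{HOST_GATEWAY}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```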
0xallam
7af1180a30 Refactor(skills): rename prompt modules to skills and update documentation 2026-01-06 17:50:15 -08:00
0xallam
f48def1f9e refactor(tui): remove flawed streaming update throttling
The length-based hash was prone to collisions and could miss
content changes. Simplified to always update during streaming.
2026-01-06 16:44:22 -08:00
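Why a length-based check misses changes can be shown in a few lines. The exact hash the TUI used is not shown in this log, so `length_fingerprint` below is an assumed stand-in that keys only on text length.

```python
def length_fingerprint(text: str) -> int:
    """Assumed stand-in for a length-based hash: only the length matters."""
    return len(text)


def needs_update(previous: str, current: str) -> bool:
    """Decide whether to redraw by comparing fingerprints (the flawed approach)."""
    return length_fingerprint(previous) != length_fingerprint(current)
```

Two different strings of equal length, such as `"scan 1/9"` and `"scan 2/9"`, produce the same fingerprint, so `needs_update` reports no change and the redraw is skipped. Always updating during streaming avoids this class of bug.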
0xallam
af8eeef4ac feat(tui): display agent vulnerability count in TUI 2026-01-06 16:44:22 -08:00
0xallam
16c9b05121 feat(tui): enhance spinner animations and update renderer styles 2026-01-06 16:44:22 -08:00
0xallam
6422bfa0b4 feat(tui): show tool output in terminal and python renderers
- Terminal renderer now displays command output with smart filtering
- Strips PS1 prompts, command echoes, and hardcoded status messages
- Python renderer now shows stdout/stderr from execution results
- Both renderers support line truncation (50 lines max, 200 chars/line)
- Removed smart coloring in favor of consistent dim styling
- Added proper error and exit code display
2026-01-06 16:44:22 -08:00
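The truncation limits in this commit (50 lines, 200 characters per line) could be applied along these lines. The function name and the wording of the truncation markers are assumptions for illustration.

```python
MAX_LINES = 50        # limit named in the commit
MAX_LINE_CHARS = 200  # limit named in the commit


def truncate_output(text: str) -> str:
    """Clip tool output to MAX_LINES lines of at most MAX_LINE_CHARS characters."""
    lines = text.splitlines()
    kept = [
        line if len(line) <= MAX_LINE_CHARS else line[:MAX_LINE_CHARS] + "..."
        for line in lines[:MAX_LINES]
    ]
    if len(lines) > MAX_LINES:
        # Marker text is hypothetical; it tells the reader how much was cut.
        kept.append(f"... ({len(lines) - MAX_LINES} more lines)")
    return "\n".join(kept)
```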
0xallam
dd7767c847 feat(tui): enhance streaming content handling and animation efficiency 2026-01-06 16:44:22 -08:00
0xallam
2777ae3fe8 refactor(llm): streamline reasoning effort handling and remove unused patterns 2026-01-06 16:44:22 -08:00
0xallam
45bb0ae8d8 fix(llm): update logging configuration for asyncio 2026-01-06 16:44:22 -08:00
0xallam
67cfe994be feat(tui): implement request and response content truncation for improved readability 2026-01-06 16:44:22 -08:00
0xallam
878d6ebf57 refactor(tui): improve agent node expansion handling and add tree node selection functionality 2026-01-06 16:44:22 -08:00
0xallam
48fb48dba3 feat(agent): implement user interruption handling in agent execution 2026-01-06 16:44:22 -08:00
0xallam
0954ac208f fix(llm): add streaming retry with exponential backoff
- Retry failed streams up to 3 times with exp backoff (8s min, 64s max)
- Reset chunks on failure and retry full request
- Use litellm._should_retry() for retryable error detection
- Switch to async acompletion() for streaming
- Refactor generate() into smaller focused methods
2026-01-06 16:44:22 -08:00
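The retry behavior listed above can be sketched without litellm. Assumptions in this example: "up to 3 times" means three retries after the first attempt, the delay doubles from 8s and is capped at 64s, and `ConnectionError` stands in for whatever errors `litellm._should_retry()` would classify as retryable. `stream_with_retry` and `backoff_delay` are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List

MAX_RETRIES = 3
MIN_DELAY = 8.0   # seconds
MAX_DELAY = 64.0  # seconds


def backoff_delay(retry_index: int) -> float:
    """Assumed schedule: 8s, 16s, 32s, ... capped at 64s."""
    return min(MAX_DELAY, MIN_DELAY * (2 ** retry_index))


async def stream_with_retry(
    open_stream: Callable[[], AsyncIterator[str]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> List[str]:
    """Collect a stream, restarting the whole request when it fails midway."""
    for retry_index in range(MAX_RETRIES + 1):
        chunks: List[str] = []  # reset partial output before every attempt
        try:
            async for chunk in open_stream():
                chunks.append(chunk)
            return chunks
        except ConnectionError:
            if retry_index == MAX_RETRIES:
                raise  # retries exhausted
            await sleep(backoff_delay(retry_index))
    raise AssertionError("unreachable")
```

Resetting `chunks` on each attempt matters because a stream that fails halfway has already delivered partial output, and the retried request starts from the beginning.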
0xallam
a6dcb7756e feat(tui): add real-time streaming LLM output with full content display
- Convert LiteLLM requests to streaming mode with stream_request()
- Add streaming parser to handle live LLM output segments
- Update TUI for real-time streaming content rendering
- Add tracer methods for streaming content tracking
- Clean function tags from streamed content to prevent display
- Remove all truncation from tool renderers for full content visibility
2026-01-06 16:44:22 -08:00
0xallam
a2142cc985 feat(tui): refactor TUI components for improved text rendering and styling
- Removed unused escape_markup function and integrated rich.text for better text handling.
- Updated various renderers to utilize Text for consistent styling and formatting.
- Enhanced chat and agent message displays with dynamic text features.
- Improved error handling and display for various tool components.
- Refined TUI styles for better visual consistency across components.
2026-01-06 16:44:22 -08:00
0xallam
7bcdedfb18 feat(tui): enhance splash screen and agent status display
- Reduced animation timer for splash screen to improve responsiveness.
- Added URL display to the splash screen.
- Improved start line animation with dynamic character styling.
- Updated agent status display to show "Initializing Agent" when no real activity is detected.
- Enhanced waiting and animated verb text with dynamic styling.
- Implemented sidebar visibility toggle based on window size.
- Updated live stats to include model information from agent configuration.
- Refined TUI styles for better visual consistency.
2026-01-06 16:44:22 -08:00
0xallam
e6ddcb1801 feat(tui): add multiline chat input with dynamic height
- Support Shift+Enter to insert newlines in chat input
- Chat input container expands dynamically up to 8 lines
- Enter key sends message as before
- Fix cursor line background to match unselected lines
2026-01-06 16:44:22 -08:00
dependabot[bot]
daba3d8b61 chore(deps): bump pynacl from 1.5.0 to 1.6.2
Bumps [pynacl](https://github.com/pyca/pynacl) from 1.5.0 to 1.6.2.
- [Changelog](https://github.com/pyca/pynacl/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/pynacl/compare/1.5.0...1.6.2)

---
updated-dependencies:
- dependency-name: pynacl
  dependency-version: 1.6.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-06 15:47:36 -08:00
dependabot[bot]
e6c1aae38d chore(deps): bump aiohttp from 3.12.15 to 3.13.3
---
updated-dependencies:
- dependency-name: aiohttp
  dependency-version: 3.13.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-01-05 18:06:30 -08:00
Hongchao Ma
1089aab89e libasound2 being a virtual package in newer Kali/Debian. Replace it with libasound2t64. 2026-01-05 12:06:31 -08:00
0xallam
706bb193c0 chore: update website links to strix.ai 2026-01-03 17:58:34 -08:00
0xallam
2ba1d0fe59 docs: add documentation links to README 2026-01-03 17:56:35 -08:00
Ahmed Allam
8b0bb521ba Update link in README 2026-01-03 08:28:03 +04:00
ahmed
a90082bc53 feat(prompts): enhance Next.js framework module with reconnaissance techniques
- Add route enumeration section with __BUILD_MANIFEST.sortedPages technique
- Add environment variable leakage detection (NEXT_PUBLIC_ prefix)
- Add data fetching over-exposure section for __NEXT_DATA__ inspection
- Add API route path normalization bypass techniques
2026-01-02 15:35:52 -08:00
Vincent550102
6fc592b4e8 fix: Convert dictionary views to lists for stable iteration over agents and tool executions. 2026-01-02 14:17:32 -08:00
Vincent550102
62cca3f149 fix: convert tool_executions.items() to list for stable iteration 2026-01-02 14:17:32 -08:00
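The two fixes above rely on a standard Python idiom: iterating over a live dictionary view raises `RuntimeError` if the dictionary changes size during the loop, while iterating over a `list(...)` copy does not. A minimal same-thread sketch, with a hypothetical `prune_finished` function and record shape (in Strix the mutation presumably comes from concurrent agent activity):

```python
from typing import Any, Dict, List


def prune_finished(tool_executions: Dict[str, Dict[str, Any]]) -> List[str]:
    """Remove completed entries while iterating; illustrative only."""
    removed = []
    # list(...) takes a snapshot, so deleting keys inside the loop is safe.
    for exec_id, info in list(tool_executions.items()):
        if info.get("status") == "completed":
            del tool_executions[exec_id]
            removed.append(exec_id)
    return removed
```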
Ahmed Allam
f25cf9b23d Remove PyPI Downloads badge from readme 2026-01-01 23:27:00 +04:00
dependabot[bot]
2472d590d5 chore(deps): bump filelock from 3.19.1 to 3.20.1
Bumps [filelock](https://github.com/tox-dev/py-filelock) from 3.19.1 to 3.20.1.
- [Release notes](https://github.com/tox-dev/py-filelock/releases)
- [Changelog](https://github.com/tox-dev/filelock/blob/main/docs/changelog.rst)
- [Commits](https://github.com/tox-dev/py-filelock/compare/3.19.1...3.20.1)

---
updated-dependencies:
- dependency-name: filelock
  dependency-version: 3.20.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-16 15:13:22 -08:00
0xallam
78b6c26652 enhance todo tool prompt 2025-12-15 10:26:59 -08:00
0xallam
d649a7c70b Update README.md 2025-12-15 10:11:08 -08:00
0xallam
d96852de55 chore: bump version to 0.5.0 2025-12-15 08:21:03 -08:00
0xallam
eb0c52b720 feat: add PyInstaller build for standalone binary distribution
- Add PyInstaller spec file and build script for creating standalone executables
- Add install.sh for curl | sh installation from GitHub releases
- Add GitHub Actions workflow for multi-platform builds (macOS, Linux, Windows)
- Move sandbox-only deps (playwright, ipython, libtmux, etc.) to optional extras
- Make google-cloud-aiplatform optional ([vertex] extra) to reduce binary size
- Use lazy imports in tool actions to avoid loading sandbox deps at startup
- Add -v/--version flag to CLI
- Add website and Discord links to completion message
- Binary size: ~97MB (down from ~120MB with all deps)
2025-12-15 08:21:03 -08:00
0xallam
2899021a21 chore(todo): encourage batched todo operations
Strengthen schema guidance to batch todo creation, status updates, and completions while reducing unnecessary list refreshes to cut tool-call volume.
2025-12-15 07:41:33 -08:00
Ahmed Allam
0fcd5c46b2 Fix badge in README.md 2025-12-15 19:39:47 +04:00
0xallam
dcf77b31fc chore(tools): raise sandbox execution timeout
Increase default sandbox tool execution timeout from 120s to 500s while keeping connect timeout unchanged.
2025-12-14 20:40:00 -08:00
0xallam
37c8cffbe3 feat(tools): add bulk operations support to todo tools
- update_todo: add `updates` param for bulk updates in one call
- mark_todo_done: add `todo_ids` param to mark multiple todos done
- mark_todo_pending: add `todo_ids` param to mark multiple pending
- delete_todo: add `todo_ids` param to delete multiple todos
- Increase todo renderer display limit from 10 to 25
- Maintains backward compatibility with single-ID usage
- Update prompts to keep todos short-horizon and dynamic
2025-12-14 20:31:33 -08:00
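One way the backward-compatible bulk parameter could work is to accept either a single ID or a list and merge them. The sketch below is hypothetical: the storage shape and the `todo_id` parameter name are assumptions, and only `todo_ids` is taken from the commit message.

```python
from typing import Dict, List, Optional


def mark_todo_done(
    todos: Dict[str, Dict[str, str]],
    todo_id: Optional[str] = None,
    todo_ids: Optional[List[str]] = None,
) -> List[str]:
    """Mark one or many todos done in a single call; returns the IDs updated."""
    targets = list(todo_ids or [])
    if todo_id is not None:
        targets.append(todo_id)  # single-ID callers keep working
    updated = []
    for tid in targets:
        if tid in todos:  # unknown IDs are skipped in this sketch
            todos[tid]["status"] = "done"
            updated.append(tid)
    return updated
```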
0xallam
c29f13fd69 feat: add --scan-mode CLI option with quick/standard/deep modes
Introduces scan mode selection to control testing depth and methodology:
- quick: optimized for CI/CD, focuses on recent changes and high-impact vulns
- standard: balanced coverage with systematic methodology
- deep: exhaustive testing with hierarchical agent swarm (now default)

Each mode has dedicated prompt modules with detailed pentesting guidelines
covering reconnaissance, mapping, business logic analysis, exploitation,
and vulnerability chaining strategies.

Closes #152
2025-12-14 19:13:08 -08:00
Rohit Martires
5c995628bf Feat: added support for non vision models STRIX_DISABLE_BROWSER flag (#188)
Co-authored-by: 0xallam <ahmed39652003@gmail.com>
2025-12-14 23:45:43 +04:00
Ahmed Allam
624f1ed77f feat(tui): add markdown rendering for agent messages (#197)
Add AgentMessageRenderer to render agent messages with basic markdown support:
- Headers (#, ##, ###, ####)
- Bold (**text**) and italic (*text*)
- Inline code and fenced code blocks
- Links [text](url) and strikethrough

Update system prompt to allow agents to use simple markdown formatting.
2025-12-14 22:53:07 +04:00
Ahmed Allam
2b926c733b feat(tools): add dedicated todo tool for agent task tracking (#196)
- Add new todo tool with create, list, update, mark_done, mark_pending, delete actions
- Each subagent has isolated todo storage keyed by agent_id
- Support bulk todo creation via JSON array or bullet list
- Add TUI renderers for all todo actions with status markers
- Update notes tool to remove priority and todo-related functionality
- Add task tracking guidance to StrixAgent system prompt
- Fix instruction file error handling in CLI
2025-12-14 22:16:02 +04:00
Ahmed Allam
a075ea1a0a feat(tui): add syntax highlighting for tool renderers (#195)
Add Pygments-based syntax highlighting with native hacker theme:
- Python renderer: Python code highlighting
- Browser renderer: JavaScript code highlighting
- Terminal renderer: Bash command highlighting
- File edit renderer: Auto-detect language from file extension, diff-style display
2025-12-14 04:39:28 +04:00
0xallam
5e3d14a1eb chore: add Python 3.13 and 3.14 classifiers 2025-12-13 11:20:30 -08:00
Ahmed Allam
e57b7238f6 Update README to remove duplicate demo image 2025-12-12 21:59:16 +04:00
Ahmed Allam
13fe87d428 Add DeepWiki docs for Strix 2025-12-12 21:58:28 +04:00
K0IN
3e5845a0e1 Update GitHub Actions checkout action version (#189) 2025-12-11 22:24:20 +04:00
Alexander De Battista Kvamme
9fedcf1551 Fix/ Long text instruction causes crash (#184) 2025-12-08 23:23:51 +04:00
0xallam
1edd8eda01 fix: lint errors and code style improvements 2025-12-07 17:54:32 +02:00
0xallam
d8cb21bea3 chore: bump version to 0.4.1 2025-12-07 15:13:45 +02:00
0xallam
bd8d927f34 fix: add timeout to sandbox tool execution HTTP calls
Replace timeout=None with configurable timeouts (120s execution, 10s connect)
to prevent hung sandbox connections from blocking indefinitely.

Configurable via STRIX_SANDBOX_EXECUTION_TIMEOUT and STRIX_SANDBOX_CONNECT_TIMEOUT
environment variables.
2025-12-07 17:07:25 +04:00
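Reading the two timeouts from the environment might look like the sketch below. The variable names and the 120s/10s defaults come from this commit message; the `sandbox_timeouts` helper and the `(connect, execution)` return shape are assumptions.

```python
import os
from typing import Tuple


def sandbox_timeouts() -> Tuple[float, float]:
    """Return (connect_timeout, execution_timeout) in seconds."""
    execution = float(os.environ.get("STRIX_SANDBOX_EXECUTION_TIMEOUT", "120"))
    connect = float(os.environ.get("STRIX_SANDBOX_CONNECT_TIMEOUT", "10"))
    return connect, execution
```

Keeping the connect timeout short and the execution timeout long lets an unreachable sandbox fail fast while still allowing slow tools to finish.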
0xallam
fc267564f5 chore: add google-cloud-aiplatform dependency
Adds support for Vertex AI models via the google-cloud-aiplatform SDK.
2025-12-07 04:11:37 +04:00
0xallam
37c9b4b0e0 fix: make LLM_API_KEY optional for all providers
Some providers like Vertex AI, AWS Bedrock, and local models don't
require an API key as they use different authentication mechanisms.
2025-12-07 02:07:28 +02:00
0xallam
208b31a570 fix: filter out image_url content for non-vision models 2025-12-07 02:13:02 +04:00
Ahmed Allam
a14cb41745 chore: Bump litellm version 2025-12-07 01:38:21 +04:00
0xallam
4297c8f6e4 fix: pass api_key directly to litellm completion calls 2025-12-07 01:38:21 +04:00
0xallam
286d53384a fix: set LITELLM_API_KEY env var for unified API key support 2025-12-07 01:38:21 +04:00
0xallam
ab40dbc33a fix: improve request queue reliability and reduce stuck requests 2025-12-06 20:44:48 +02:00
dependabot[bot]
b6cb1302ce chore(deps): bump urllib3 from 2.5.0 to 2.6.0
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-06 16:23:55 +04:00
Ahmed Allam
b74132b2dc Update README.md 2025-12-03 20:09:22 +00:00
Ahmed Allam
35dd9d0a8f refactor(tests): reorganize unit tests module structure 2025-12-04 00:02:14 +04:00
Ahmed Allam
6c5c0b0d1c chore: resolve linting errors in test modules 2025-12-04 00:02:14 +04:00
Jeong-Ryeol
65c3383ecc test: add initial unit tests for argument_parser module
Add comprehensive test suite for the argument_parser module including:
- Tests for _convert_to_bool with truthy/falsy values
- Tests for _convert_to_list with JSON and comma-separated inputs
- Tests for _convert_to_dict with valid/invalid JSON
- Tests for convert_string_to_type with various type annotations
- Tests for convert_arguments with typed functions
- Tests for ArgumentConversionError exception class

This establishes the foundation for the project's test infrastructure
with pytest configuration already in place.
2025-12-04 00:02:14 +04:00
Vincent Yang
919cb5e248 docs: add file-based instruction example (#165)
Co-authored-by: 0xallam <ahmed39652003@gmail.com>
2025-12-03 22:59:59 +04:00
Vincent Yang
c97ff94617 feat: Show Model Name in Live Stats Panel (#169)
Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>
2025-12-03 18:45:01 +00:00
dependabot[bot]
53c9da9213 chore(deps): bump cryptography from 43.0.3 to 44.0.1 (#163)
Bumps [cryptography](https://github.com/pyca/cryptography) from 43.0.3 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/43.0.3...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 44.0.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-02 21:44:35 +04:00
dependabot[bot]
1e189c1245 chore(deps): bump fonttools from 4.59.1 to 4.61.0 (#161)
Bumps [fonttools](https://github.com/fonttools/fonttools) from 4.59.1 to 4.61.0.
- [Release notes](https://github.com/fonttools/fonttools/releases)
- [Changelog](https://github.com/fonttools/fonttools/blob/main/NEWS.rst)
- [Commits](https://github.com/fonttools/fonttools/compare/4.59.1...4.61.0)

---
updated-dependencies:
- dependency-name: fonttools
  dependency-version: 4.61.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-02 19:23:56 +04:00
Ahmed Allam
62f804b8b5 Update link in README 2025-12-01 16:04:46 +04:00
Ahmed Allam
5ff10e9d20 Add acknowledgements in README 2025-11-29 19:27:30 +04:00
Ahmed Allam
9825fb46ec chore: Bump version for 0.4.0 release 2025-11-25 20:18:44 +04:00
Alexander De Battista Kvamme
c0e547928e Real-time display panel for agent stats (#134)
Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>
2025-11-25 12:06:20 +00:00
Trusthoodies
78d0148d58 Add open redirect, subdomain takeover, and info disclosure prompt modules (#132)
Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>
2025-11-25 10:32:55 +00:00
dependabot[bot]
eebb76de3b chore(deps): bump pypdf from 6.1.3 to 6.4.0
Bumps [pypdf](https://github.com/py-pdf/pypdf) from 6.1.3 to 6.4.0.
- [Release notes](https://github.com/py-pdf/pypdf/releases)
- [Changelog](https://github.com/py-pdf/pypdf/blob/main/CHANGELOG.md)
- [Commits](https://github.com/py-pdf/pypdf/compare/6.1.3...6.4.0)

---
updated-dependencies:
- dependency-name: pypdf
  dependency-version: 6.4.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-25 12:44:38 +04:00
Ahmed Allam
2ae1b3ddd1 Update README 2025-11-23 22:29:44 +04:00
Ahmed Allam
a11cd09a93 feat: support file-based instructions for detailed test configuration 2025-11-23 00:46:37 +04:00
Ahmed Allam
68ebdb2b6d feat: enhance run name generation to include target information 2025-11-22 22:54:07 +04:00
Ahmed Allam
5befb32318 feat: implement incremental pentest data persistence 2025-11-22 22:54:07 +04:00
cyberseall
86e6ed49bb feat(llm): make LLM request queue rate limits configurable and more conservative
Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>
2025-11-22 17:07:43 +00:00
Ahmed Allam
0c811845f1 docs: update README 2025-11-21 23:07:11 +04:00
Ahmed Allam
383d53c7a9 feat(agent): implement agent identity guideline and improve system prompt 2025-11-15 16:21:05 +04:00
Ahmed Allam
478bf5d4d3 refactor(llm): remove unused temperature parameter from LLMConfig 2025-11-15 12:44:40 +04:00
Ahmed Allam
d1f7741965 feat(llm): enhance model features handling with pattern matching 2025-11-15 12:43:43 +04:00
Ahmed Allam
821929cd3e fix(agent): increase waiting time threshold from 120 to 600 seconds 2025-11-15 12:39:46 +04:00
Ahmed Allam
5de16d2953 chore: Bump LiteLLM version 2025-11-15 12:37:22 +04:00
Ahmed Allam
6a2a62c121 chore: Fix formatting in README.md 2025-11-14 16:07:54 +00:00
Ahmed Allam
426dd27454 chore: Minor readme tweaks. Bump version for 0.3.4 release 2025-11-14 20:02:48 +04:00
Mark Percival
cedc65409e fix: link 2025-11-14 20:02:48 +04:00
Mark Percival
72d5a73386 Chore: Update README 2025-11-14 20:02:48 +04:00
Ahmed Allam
dab69af033 fix(runtime): correct DOCKER_HOST parsing for sandbox URL 2025-11-14 02:41:00 +04:00
Ahmed Allam
6abb53dc02 feat: support scanning IP addresses 2025-11-14 01:38:58 +04:00
Ahmed Allam
f1d2961779 Update README 2025-11-12 19:29:01 +04:00
purpl3horse
2b7a8e3ee7 Update README.md
Instruction argument was written in plural in the readme ( a typo )
2025-11-12 19:03:27 +04:00
Ahmed Allam
3e7466a533 chore: Bump version for 0.3.3 release 2025-11-12 18:58:03 +04:00
Ahmed Allam
1abfb360e4 feat: add configurable timeout for LLM requests 2025-11-12 18:58:03 +04:00
Ahmed Allam
795ed02955 docs: update README with recommended models 2025-11-12 15:01:15 +04:00
Alexei Macheret Artur
2cb0c31897 chore(deps): bump starlette from 0.46.2 to 0.49.1 (#75)
Bumps [starlette](https://github.com/Kludex/starlette) from 0.46.2 to 0.49.1.
- [Release notes](https://github.com/Kludex/starlette/releases)
- [Changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md)
- [Commits](https://github.com/Kludex/starlette/compare/0.46.2...0.49.1)

---
updated-dependencies:
- dependency-name: starlette
  dependency-version: 0.49.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 14:19:18 +04:00
m4ki3lf0
1c8780cf81 Update Readme
Co-authored-by: m4ki3lf0 <m4ki3lf0@git.com>
Co-authored-by: Ahmed Allam <ahmed39652003@gmail.com>
2025-11-10 09:49:37 +00:00
Ahmed Allam
b6d9d941cf Update README 2025-11-08 15:07:53 +04:00
Ahmed Allam
edd628bbc1 Chore: fix discord link in readme 2025-11-07 18:03:47 +04:00
Ahmed Allam
d76c7c55b2 Fix: update litellm dependency version 2025-11-05 12:40:44 +02:00
Ahmed Allam
b5ddba3867 docs: Update README 2025-11-05 01:21:48 +02:00
116 changed files with 10274 additions and 2642 deletions

BIN
.github/logo.png vendored Normal file

Binary file not shown. Size: 3.7 KiB

78
.github/workflows/build-release.yml vendored Normal file

@@ -0,0 +1,78 @@
name: Build & Release
on:
push:
tags:
- 'v*'
workflow_dispatch:
jobs:
build:
strategy:
fail-fast: false
matrix:
include:
- os: macos-latest
target: macos-arm64
- os: macos-15-intel
target: macos-x86_64
- os: ubuntu-latest
target: linux-x86_64
- os: windows-latest
target: windows-x86_64
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- uses: snok/install-poetry@v1
- name: Build
shell: bash
run: |
poetry install --with dev
poetry run pyinstaller strix.spec --noconfirm
VERSION=$(poetry version -s)
mkdir -p dist/release
if [[ "${{ runner.os }}" == "Windows" ]]; then
cp dist/strix.exe "dist/release/strix-${VERSION}-${{ matrix.target }}.exe"
(cd dist/release && 7z a "strix-${VERSION}-${{ matrix.target }}.zip" "strix-${VERSION}-${{ matrix.target }}.exe")
else
cp dist/strix "dist/release/strix-${VERSION}-${{ matrix.target }}"
chmod +x "dist/release/strix-${VERSION}-${{ matrix.target }}"
tar -C dist/release -czvf "dist/release/strix-${VERSION}-${{ matrix.target }}.tar.gz" "strix-${VERSION}-${{ matrix.target }}"
fi
- uses: actions/upload-artifact@v4
with:
name: strix-${{ matrix.target }}
path: |
dist/release/*.tar.gz
dist/release/*.zip
if-no-files-found: error
release:
needs: build
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- uses: actions/download-artifact@v4
with:
path: release
merge-multiple: true
- name: Create Release
uses: softprops/action-gh-release@v2
with:
prerelease: ${{ !startsWith(github.ref, 'refs/tags/') }}
generate_release_notes: true
files: release/*
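
Review note: the build step above names each artifact `strix-<version>-<target>` and archives it as `.zip` on Windows and `.tar.gz` elsewhere. The sketch below restates that convention in Python so it can be checked against the installer's expectations. `TARGETS` and `release_asset_name` are hypothetical names used only for illustration; they do not exist in the repository.

```python
# Hypothetical helper mirroring the artifact naming in build-release.yml.
# The target labels are copied from the workflow's build matrix.
TARGETS = ("macos-arm64", "macos-x86_64", "linux-x86_64", "windows-x86_64")


def release_asset_name(version: str, target: str) -> str:
    """Return the archive name the workflow uploads for one matrix target."""
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    # Windows builds are zipped; every other target is a gzip-compressed tarball.
    ext = ".zip" if target.startswith("windows") else ".tar.gz"
    return f"strix-{version}-{target}{ext}"
```

Keeping the naming rule in one place matters because `scripts/install.sh` rebuilds the same file name when it constructs the download URL.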

1
.gitignore vendored

@@ -79,6 +79,7 @@ logs/
tensorboard/
# Agent execution traces
strix_runs/
agent_runs/
# Misc

View File

@@ -39,14 +39,14 @@ Thank you for your interest in contributing to Strix! This guide will help you g
poetry run strix --target https://example.com
```
## 📚 Contributing Prompt Modules
## 📚 Contributing Skills
Prompt modules are specialized knowledge packages that enhance agent capabilities. See [strix/prompts/README.md](strix/prompts/README.md) for detailed guidelines.
Skills are specialized knowledge packages that enhance agent capabilities. See [strix/skills/README.md](strix/skills/README.md) for detailed guidelines.
### Quick Guide
1. **Choose the right category** (`/vulnerabilities`, `/frameworks`, `/technologies`, etc.)
2. **Create a** `.jinja` file with your prompts
2. **Create a** `.jinja` file with your skill content
3. **Include practical examples** - Working payloads, commands, or test cases
4. **Provide validation methods** - How to confirm findings and avoid false positives
5. **Submit via PR** with clear description
@@ -101,7 +101,7 @@ We welcome feature ideas! Please:
## 🤝 Community
- **Discord**: [Join our community](https://discord.gg/J48Fzuh7)
- **Discord**: [Join our community](https://discord.gg/YjKFvEZSdZ)
- **Issues**: [GitHub Issues](https://github.com/usestrix/strix/issues)
## ✨ Recognition
@@ -113,4 +113,4 @@ We value all contributions! Contributors will be:
---
**Questions?** Reach out on [Discord](https://discord.gg/J48Fzuh7) or create an issue. We're here to help!
**Questions?** Reach out on [Discord](https://discord.gg/YjKFvEZSdZ) or create an issue. We're here to help!

235
README.md

@@ -1,82 +1,125 @@
<div align="center">
<p align="center">
<a href="https://strix.ai/">
<img src=".github/logo.png" width="150" alt="Strix Logo">
</a>
</p>
# Strix
<h1 align="center">Strix</h1>
### Open-source AI hackers for your apps
[![Strix](https://img.shields.io/badge/Strix-usestrix.com-1a1a1a.svg)](https://usestrix.com)
[![Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![Discord](https://img.shields.io/badge/Discord-join-5865F2?logo=discord&logoColor=white)](https://discord.gg/J48Fzuh7)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/strix-agent?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLACK&left_text=Downloads)](https://pepy.tech/projects/strix-agent)
[![GitHub stars](https://img.shields.io/github/stars/usestrix/strix.svg?style=social&label=Star)](https://github.com/usestrix/strix)
</div>
<h2 align="center">Open-source AI Hackers to secure your Apps</h2>
<div align="center">
<img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px; box-shadow: 0 20px 40px rgba(0, 0, 0, 0.3), 0 0 0 1px rgba(255, 255, 255, 0.1), inset 0 1px 0 rgba(255, 255, 255, 0.2); transform: perspective(1000px) rotateX(2deg); transition: transform 0.3s ease;">
[![Python](https://img.shields.io/pypi/pyversions/strix-agent?color=3776AB)](https://pypi.org/project/strix-agent/)
[![PyPI](https://img.shields.io/pypi/v/strix-agent?color=10b981)](https://pypi.org/project/strix-agent/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![Docs](https://img.shields.io/badge/Docs-docs.strix.ai-10b981.svg)](https://docs.strix.ai)
[![GitHub Stars](https://img.shields.io/github/stars/usestrix/strix)](https://github.com/usestrix/strix)
[![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?&logo=discord&logoColor=white)](https://discord.gg/YjKFvEZSdZ)
[![Website](https://img.shields.io/badge/Website-strix.ai-2d3748.svg)](https://strix.ai)
<a href="https://trendshift.io/repositories/15362" target="_blank"><img src="https://trendshift.io/api/badge/repositories/15362" alt="usestrix%2Fstrix | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/usestrix/strix)
</div>
<br>
<div align="center">
<img src=".github/screenshot.png" alt="Strix Demo" width="800" style="border-radius: 16px;">
</div>
<br>
> [!TIP]
> **New!** Strix now integrates seamlessly with GitHub Actions and CI/CD pipelines. Automatically scan for vulnerabilities on every pull request and block insecure code before it reaches production!
---
## 🦉 Strix Overview
Strix are autonomous AI agents that act just like real hackers - they run your code dynamically, find vulnerabilities, and validate them through actual exploitation. Built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.
Strix are autonomous AI agents that act just like real hackers - they run your code dynamically, find vulnerabilities, and validate them through actual proof-of-concepts. Built for developers and security teams who need fast, accurate security testing without the overhead of manual pentesting or the false positives of static analysis tools.
- **Full hacker toolkit** out of the box
- **Teams of agents** that collaborate and scale
- **Real validation** via exploitation and PoC, not false positives
- **Developer-first** CLI with actionable reports
- **Auto-fix & reporting** to accelerate remediation
**Key Capabilities:**
- 🔧 **Full hacker toolkit** out of the box
- 🤝 **Teams of agents** that collaborate and scale
- **Real validation** with PoCs, not false positives
- 💻 **Developer-first** CLI with actionable reports
- 🔄 **Auto-fix & reporting** to accelerate remediation
## 🎯 Use Cases
- **Application Security Testing** - Detect and validate critical vulnerabilities in your applications
- **Rapid Penetration Testing** - Get penetration tests done in hours, not weeks, with compliance reports
- **Bug Bounty Automation** - Automate bug bounty research and generate PoCs for faster reporting
- **CI/CD Integration** - Run tests in CI/CD to block vulnerabilities before reaching production
---
### 🎯 Use Cases
## 🚀 Quick Start
- Detect and validate critical vulnerabilities in your applications.
- Get penetration tests done in hours, not weeks, with compliance reports.
- Automate bug bounty research and generate PoCs for faster reporting.
- Run tests in CI/CD to block vulnerabilities before reaching production.
---
### 🚀 Quick Start
Prerequisites:
**Prerequisites:**
- Docker (running)
- Python 3.12+
- An LLM provider key (or a local LLM)
- An LLM provider key (e.g. [get OpenAI API key](https://platform.openai.com/api-keys) or use a local LLM)
### Installation & First Scan
```bash
# Install
# Install Strix
curl -sSL https://strix.ai/install | bash
# Or via pipx
pipx install strix-agent
# Configure AI provider
# Configure your AI provider
export STRIX_LLM="openai/gpt-5"
export LLM_API_KEY="your-api-key"
# Run security assessment
# Run your first security assessment
strix --target ./app-directory
```
First run pulls the sandbox Docker image. Results are saved under `agent_runs/<run-name>`.
> [!NOTE]
> First run automatically pulls the sandbox Docker image. Results are saved to `strix_runs/<run-name>`
### ☁️ Cloud Hosted
## ☁️ Run Strix in Cloud
Want to skip the setup? Try our cloud-hosted version: **[usestrix.com](https://usestrix.com)**
Want to skip the local setup, API keys, and unpredictable LLM costs? Run the hosted cloud version of Strix at **[app.strix.ai](https://strix.ai)**.
Launch a scan in just a few minutes, with no setup or configuration required, and you'll get:
- **A full pentest report** with validated findings and clear remediation steps
- **Shareable dashboards** your team can use to track fixes over time
- **CI/CD and GitHub integrations** to block risky changes before production
- **Continuous monitoring** so new vulnerabilities are caught quickly
[**Run your first pentest now →**](https://strix.ai)
---
## ✨ Features
### 🛠️ Agentic Security Tools
- **🔌 Full HTTP Proxy** - Full request/response manipulation and analysis
- **🌐 Browser Automation** - Multi-tab browser for testing of XSS, CSRF, auth flows
- **💻 Terminal Environments** - Interactive shells for command execution and testing
- **🐍 Python Runtime** - Custom exploit development and validation
- **🔍 Reconnaissance** - Automated OSINT and attack surface mapping
- **📁 Code Analysis** - Static and dynamic analysis capabilities
- **📝 Knowledge Management** - Structured findings and attack documentation
Strix agents come equipped with a comprehensive security testing toolkit:
- **Full HTTP Proxy** - Full request/response manipulation and analysis
- **Browser Automation** - Multi-tab browser for testing of XSS, CSRF, auth flows
- **Terminal Environments** - Interactive shells for command execution and testing
- **Python Runtime** - Custom exploit development and validation
- **Reconnaissance** - Automated OSINT and attack surface mapping
- **Code Analysis** - Static and dynamic analysis capabilities
- **Knowledge Management** - Structured findings and attack documentation
### 🎯 Comprehensive Vulnerability Detection
Strix can identify and validate a wide range of security vulnerabilities:
- **Access Control** - IDOR, privilege escalation, auth bypass
- **Injection Attacks** - SQL, NoSQL, command injection
- **Server-Side** - SSRF, XXE, deserialization flaws
@@ -87,55 +130,51 @@ Want to skip the setup? Try our cloud-hosted version: **[usestrix.com](https://u
### 🕸️ Graph of Agents
Advanced multi-agent orchestration for comprehensive security testing:
- **Distributed Workflows** - Specialized agents for different attacks and assets
- **Scalable Testing** - Parallel execution for fast comprehensive coverage
- **Dynamic Coordination** - Agents collaborate and share discoveries
---
## 💻 Usage Examples
### Basic Usage
```bash
# Local codebase analysis
# Scan a local codebase
strix --target ./app-directory
# Repository security review
# Security review of a GitHub repository
strix --target https://github.com/org/repo
# Web application assessment
# Black-box web application assessment
strix --target https://your-app.com
# Multi-target white-box testing (source code + deployed app)
strix -t https://github.com/org/app -t https://your-app.com
# Test multiple environments simultaneously
strix -t https://dev.your-app.com -t https://staging.your-app.com -t https://prod.your-app.com
# Focused testing with instructions
strix --target api.your-app.com --instruction "Prioritize authentication and authorization testing"
# Testing with credentials
strix --target https://your-app.com --instruction "Test with credentials: testuser/testpass. Focus on privilege escalation and access control bypasses."
```
### ⚙️ Configuration
### Advanced Testing Scenarios
```bash
export STRIX_LLM="openai/gpt-5"
export LLM_API_KEY="your-api-key"
# Grey-box authenticated testing
strix --target https://your-app.com --instruction "Perform authenticated testing using credentials: user:pass"
# Optional
export LLM_API_BASE="your-api-base-url" # if using a local model, e.g. Ollama, LMStudio
export PERPLEXITY_API_KEY="your-api-key" # for search capabilities
# Multi-target testing (source code + deployed app)
strix -t https://github.com/org/app -t https://your-app.com
# Focused testing with custom instructions
strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"
# Provide detailed instructions through file (e.g., rules of engagement, scope, exclusions)
strix --target api.your-app.com --instruction-file ./instruction.md
```
[📚 View supported AI models](https://docs.litellm.ai/docs/providers)
### 🤖 Headless Mode
Run Strix programmatically without the interactive UI using the `-n/--non-interactive` flag, which suits servers and automated jobs. The CLI prints vulnerability findings in real time and the final report before exiting, and it exits with a non-zero code when vulnerabilities are found.
```bash
strix -n --target https://your-app.com --instruction "Focus on authentication and authorization vulnerabilities"
strix -n --target https://your-app.com
```
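
Review note: since headless mode signals findings through its exit code, an automated job can branch on that code. The following is a minimal sketch, assuming only the `-n` and `--target` flags shown in the README; `run_headless_scan` is a hypothetical wrapper, not part of Strix. The README does not say whether findings and crashes use different codes, so the sketch treats any non-zero exit as "not clean".

```python
import subprocess
import sys
from typing import Sequence


def run_headless_scan(command: Sequence[str]) -> bool:
    """Run a scan command and return True only when it exits with code 0."""
    completed = subprocess.run(list(command), check=False)
    if completed.returncode != 0:
        # Non-zero may mean findings or a failed run; the caller decides how to react.
        print(f"scan exited with code {completed.returncode}", file=sys.stderr)
    return completed.returncode == 0


# Example call (the target URL is a placeholder):
#   clean = run_headless_scan(["strix", "-n", "--target", "https://your-app.com"])
```

Passing the command as a list avoids shell quoting problems when the target or instruction text contains spaces.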
### 🔄 CI/CD (GitHub Actions)
@@ -152,63 +191,63 @@ jobs:
security-scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v6
- name: Install Strix
run: pipx install strix-agent
run: curl -sSL https://strix.ai/install | bash
- name: Run Strix
env:
STRIX_LLM: ${{ secrets.STRIX_LLM }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
run: strix -n -t ./
run: strix -n -t ./ --scan-mode quick
```
## 🏆 Enterprise Platform
### ⚙️ Configuration
Our managed platform provides:
```bash
export STRIX_LLM="openai/gpt-5"
export LLM_API_KEY="your-api-key"
- **📈 Executive Dashboards**
- **🧠 Custom Fine-Tuned Models**
- **⚙️ CI/CD Integration**
- **🔍 Large-Scale Scanning**
- **🔌 Third-Party Integrations**
- **🎯 Enterprise Support**
# Optional
export LLM_API_BASE="your-api-base-url" # if using a local model, e.g. Ollama, LMStudio
export PERPLEXITY_API_KEY="your-api-key" # for search capabilities
export STRIX_REASONING_EFFORT="high" # control thinking effort (default: high, quick scan: medium)
```
[**Get Enterprise Demo →**](https://usestrix.com)
> [!NOTE]
> Strix automatically saves your configuration to `~/.strix/cli-config.json`, so you don't have to re-enter it on every run.
## 🔒 Security Architecture
**Recommended models for best results:**
- **Container Isolation** - All testing in sandboxed Docker environments
- **Local Processing** - Testing runs locally, no data sent to external services
- [OpenAI GPT-5](https://openai.com/api/) — `openai/gpt-5`
- [Anthropic Claude Sonnet 4.5](https://claude.com/platform/api) — `anthropic/claude-sonnet-4-5`
- [Google Gemini 3 Pro Preview](https://cloud.google.com/vertex-ai) — `vertex_ai/gemini-3-pro-preview`
> [!WARNING]
> Only test systems you own or have permission to test. You are responsible for using Strix ethically and legally.
See the [LLM Providers documentation](https://docs.strix.ai/llm-providers/overview) for all supported providers including Vertex AI, Bedrock, Azure, and local models.
## 📚 Documentation
Full documentation is available at **[docs.strix.ai](https://docs.strix.ai)** — including detailed guides for usage, CI/CD integrations, skills, and advanced configuration.
## 🤝 Contributing
We welcome contributions from the community! There are several ways to contribute:
We welcome contributions of code, docs, and new skills - check out our [Contributing Guide](https://docs.strix.ai/contributing) to get started or open a [pull request](https://github.com/usestrix/strix/pulls)/[issue](https://github.com/usestrix/strix/issues).
### Code Contributions
See our [Contributing Guide](CONTRIBUTING.md) for details on:
- Setting up your development environment
- Running tests and quality checks
- Submitting pull requests
- Code style guidelines
## 👥 Join Our Community
### Prompt Modules Collection
Help expand our collection of specialized prompt modules for AI agents:
- Advanced testing techniques for vulnerabilities, frameworks, and technologies
- See [Prompt Modules Documentation](strix/prompts/README.md) for guidelines
- Submit via [pull requests](https://github.com/usestrix/strix/pulls) or [issues](https://github.com/usestrix/strix/issues)
Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/YjKFvEZSdZ)**
## 🌟 Support the Project
**Love Strix?** Give us a ⭐ on GitHub!
## 🙏 Acknowledgements
## 👥 Join Our Community
Strix builds on the incredible work of open-source projects like [LiteLLM](https://github.com/BerriAI/litellm), [Caido](https://github.com/caido/caido), [ProjectDiscovery](https://github.com/projectdiscovery), [Playwright](https://github.com/microsoft/playwright), and [Textual](https://github.com/Textualize/textual). Huge thanks to their maintainers!
Have questions? Found a bug? Want to contribute? **[Join our Discord!](https://discord.gg/J48Fzuh7)**
> [!WARNING]
> Only test apps you own or have permission to test. You are responsible for using Strix ethically and legally.
</div>

View File

@@ -40,10 +40,11 @@ RUN apt-get update && \
gdb \
tmux \
libnss3 libnspr4 libdbus-1-3 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 libatspi2.0-0 \
libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2 \
libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libxkbcommon0 libpango-1.0-0 libcairo2 libasound2t64 \
fonts-unifont fonts-noto-color-emoji fonts-freefont-ttf fonts-dejavu-core ttf-bitstream-vera \
libnss3-tools
RUN setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip $(which nmap)
USER pentester
@@ -158,7 +159,7 @@ RUN mkdir -p /workspace && chown -R pentester:pentester /workspace /app
COPY pyproject.toml poetry.lock ./
USER pentester
RUN poetry install --no-root --without dev
RUN poetry install --no-root --without dev --extras sandbox
RUN poetry run playwright install chromium
RUN /app/venv/bin/pip install -r /home/pentester/tools/jwt_tool/requirements.txt && \

2116
poetry.lock generated

File diff suppressed because it is too large.

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "strix-agent"
version = "0.3.1"
version = "0.6.0"
description = "Open-source AI Hackers for your apps"
authors = ["Strix <hi@usestrix.com>"]
readme = "README.md"
@@ -26,6 +26,8 @@ classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
]
packages = [
{ include = "strix", format = ["sdist", "wheel"] }
@@ -43,24 +45,34 @@ strix = "strix.interface.main:main"
[tool.poetry.dependencies]
python = "^3.12"
fastapi = "*"
uvicorn = "*"
litellm = { version = "~1.75.8", extras = ["proxy"] }
openai = ">=1.99.5,<1.100.0"
# Core CLI dependencies
litellm = { version = "~1.80.7", extras = ["proxy"] }
tenacity = "^9.0.0"
numpydoc = "^1.8.0"
pydantic = {extras = ["email"], version = "^2.11.3"}
ipython = "^9.3.0"
openhands-aci = "^0.3.0"
playwright = "^1.48.0"
rich = "*"
docker = "^7.1.0"
gql = {extras = ["requests"], version = "^3.5.3"}
textual = "^4.0.0"
xmltodict = "^0.13.0"
pyte = "^0.8.1"
requests = "^2.32.0"
libtmux = "^0.46.2"
cvss = "^3.2"
# Optional LLM provider dependencies
google-cloud-aiplatform = { version = ">=1.38", optional = true }
# Sandbox-only dependencies (only needed inside Docker container)
fastapi = { version = "*", optional = true }
uvicorn = { version = "*", optional = true }
ipython = { version = "^9.3.0", optional = true }
openhands-aci = { version = "^0.3.0", optional = true }
playwright = { version = "^1.48.0", optional = true }
gql = { version = "^3.5.3", extras = ["requests"], optional = true }
pyte = { version = "^0.8.1", optional = true }
libtmux = { version = "^0.46.2", optional = true }
numpydoc = { version = "^1.8.0", optional = true }
[tool.poetry.extras]
vertex = ["google-cloud-aiplatform"]
sandbox = ["fastapi", "uvicorn", "ipython", "openhands-aci", "playwright", "gql", "pyte", "libtmux", "numpydoc"]
[tool.poetry.group.dev.dependencies]
# Type checking and static analysis
@@ -81,6 +93,9 @@ pre-commit = "^4.2.0"
black = "^25.1.0"
isort = "^6.0.1"
# Build tools
pyinstaller = { version = "^6.17.0", python = ">=3.12,<3.15" }
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
@@ -129,9 +144,16 @@ module = [
"textual.*",
"pyte.*",
"libtmux.*",
"pytest.*",
"cvss.*",
]
ignore_missing_imports = true
# Relax strict rules for test files (pytest decorators are not fully typed)
[[tool.mypy.overrides]]
module = ["tests.*"]
disallow_untyped_decorators = false
# ============================================================================
# Ruff Configuration (Fast Python Linter & Formatter)
# ============================================================================
@@ -321,7 +343,6 @@ addopts = [
"--cov-report=term-missing",
"--cov-report=html",
"--cov-report=xml",
"--cov-fail-under=80"
]
testpaths = ["tests"]
python_files = ["test_*.py", "*_test.py"]
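
Review note: this change moves the sandbox-only packages behind an optional `sandbox` extra, so a plain CLI install no longer pulls them in (the Dockerfile hunk above installs them with `--extras sandbox`). The sketch below shows one way code could check which of those packages are importable before using them. `SANDBOX_MODULES` and `missing_modules` are hypothetical, and the import names are assumptions derived from the distribution names in the extra.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Assumed import names for the packages in the `sandbox` extra. Distribution
# and import names can differ (for example, ipython is imported as IPython).
SANDBOX_MODULES = (
    "fastapi", "uvicorn", "IPython", "openhands_aci", "playwright",
    "gql", "pyte", "libtmux", "numpydoc",
)


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(SANDBOX_MODULES)` outside the container would be expected to list most of these names, while inside the sandbox image it should return an empty list.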

98
scripts/build.sh Executable file

@@ -0,0 +1,98 @@
#!/bin/bash
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
echo -e "${BLUE}🦉 Strix Build Script${NC}"
echo "================================"
OS="$(uname -s)"
ARCH="$(uname -m)"
case "$OS" in
Linux*) OS_NAME="linux";;
Darwin*) OS_NAME="macos";;
MINGW*|MSYS*|CYGWIN*) OS_NAME="windows";;
*) OS_NAME="unknown";;
esac
case "$ARCH" in
x86_64|amd64) ARCH_NAME="x86_64";;
arm64|aarch64) ARCH_NAME="arm64";;
*) ARCH_NAME="$ARCH";;
esac
echo -e "${YELLOW}Platform:${NC} $OS_NAME-$ARCH_NAME"
cd "$PROJECT_ROOT"
if ! command -v poetry &> /dev/null; then
echo -e "${RED}Error: Poetry is not installed${NC}"
echo "Please install Poetry first: https://python-poetry.org/docs/#installation"
exit 1
fi
echo -e "\n${BLUE}Installing dependencies...${NC}"
poetry install --with dev
VERSION=$(poetry version -s)
echo -e "${YELLOW}Version:${NC} $VERSION"
echo -e "\n${BLUE}Cleaning previous builds...${NC}"
rm -rf build/ dist/
echo -e "\n${BLUE}Building binary with PyInstaller...${NC}"
poetry run pyinstaller strix.spec --noconfirm
RELEASE_DIR="dist/release"
mkdir -p "$RELEASE_DIR"
BINARY_NAME="strix-${VERSION}-${OS_NAME}-${ARCH_NAME}"
if [ "$OS_NAME" = "windows" ]; then
if [ ! -f "dist/strix.exe" ]; then
echo -e "${RED}Build failed: Binary not found${NC}"
exit 1
fi
BINARY_NAME="${BINARY_NAME}.exe"
cp "dist/strix.exe" "$RELEASE_DIR/$BINARY_NAME"
echo -e "\n${BLUE}Creating zip...${NC}"
ARCHIVE_NAME="${BINARY_NAME%.exe}.zip"
if command -v 7z &> /dev/null; then
7z a "$RELEASE_DIR/$ARCHIVE_NAME" "$RELEASE_DIR/$BINARY_NAME"
else
powershell -Command "Compress-Archive -Path '$RELEASE_DIR/$BINARY_NAME' -DestinationPath '$RELEASE_DIR/$ARCHIVE_NAME'"
fi
echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
else
if [ ! -f "dist/strix" ]; then
echo -e "${RED}Build failed: Binary not found${NC}"
exit 1
fi
cp "dist/strix" "$RELEASE_DIR/$BINARY_NAME"
chmod +x "$RELEASE_DIR/$BINARY_NAME"
echo -e "\n${BLUE}Creating tarball...${NC}"
ARCHIVE_NAME="${BINARY_NAME}.tar.gz"
tar -czvf "$RELEASE_DIR/$ARCHIVE_NAME" -C "$RELEASE_DIR" "$BINARY_NAME"
echo -e "${GREEN}Created:${NC} $RELEASE_DIR/$ARCHIVE_NAME"
fi
echo -e "\n${GREEN}Build successful!${NC}"
echo "================================"
echo -e "${YELLOW}Binary:${NC} $RELEASE_DIR/$BINARY_NAME"
SIZE=$(ls -lh "$RELEASE_DIR/$BINARY_NAME" | awk '{print $5}')
echo -e "${YELLOW}Size:${NC} $SIZE"
echo -e "\n${BLUE}Testing binary...${NC}"
"$RELEASE_DIR/$BINARY_NAME" --help > /dev/null 2>&1 && echo -e "${GREEN}Binary test passed!${NC}" || echo -e "${RED}Binary test failed${NC}"
echo -e "\n${GREEN}Done!${NC}"
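
Review note: the two `case` statements in `build.sh` normalize `uname -s` and `uname -m` output into the `<os>-<arch>` label used in the binary name. The sketch below expresses the same mapping as a pure Python function so its edge cases are easy to see. `normalize_platform` is a hypothetical name, and the function takes the raw strings as arguments instead of querying the system.

```python
def normalize_platform(system: str, machine: str) -> str:
    """Map uname-style values to the `<os>-<arch>` label used by build.sh."""
    # OS patterns follow the script's prefix matches (Linux*, Darwin*, ...).
    if system.startswith("Linux"):
        os_name = "linux"
    elif system.startswith("Darwin"):
        os_name = "macos"
    elif system.startswith(("MINGW", "MSYS", "CYGWIN")):
        os_name = "windows"
    else:
        os_name = "unknown"

    # Architectures are exact matches; unrecognized values pass through unchanged.
    if machine in ("x86_64", "amd64"):
        arch_name = "x86_64"
    elif machine in ("arm64", "aarch64"):
        arch_name = "arm64"
    else:
        arch_name = machine
    return f"{os_name}-{arch_name}"
```

Note that an unrecognized OS becomes `unknown` while an unrecognized architecture is kept verbatim, so a label such as `unknown-riscv64` is possible and would not match any release target.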

328
scripts/install.sh Executable file

@@ -0,0 +1,328 @@
#!/usr/bin/env bash
set -euo pipefail
APP=strix
REPO="usestrix/strix"
STRIX_IMAGE="ghcr.io/usestrix/strix-sandbox:0.1.10"
MUTED='\033[0;2m'
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
CYAN='\033[0;36m'
NC='\033[0m'
requested_version=${VERSION:-}
SKIP_DOWNLOAD=false
raw_os=$(uname -s)
os=$(echo "$raw_os" | tr '[:upper:]' '[:lower:]')
case "$raw_os" in
Darwin*) os="macos" ;;
Linux*) os="linux" ;;
MINGW*|MSYS*|CYGWIN*) os="windows" ;;
esac
arch=$(uname -m)
if [[ "$arch" == "aarch64" ]]; then
arch="arm64"
fi
if [[ "$arch" == "x86_64" ]]; then
arch="x86_64"
fi
if [ "$os" = "macos" ] && [ "$arch" = "x86_64" ]; then
rosetta_flag=$(sysctl -n sysctl.proc_translated 2>/dev/null || echo 0)
if [ "$rosetta_flag" = "1" ]; then
arch="arm64"
fi
fi
combo="$os-$arch"
case "$combo" in
linux-x86_64|macos-x86_64|macos-arm64|windows-x86_64)
;;
*)
echo -e "${RED}Unsupported OS/Arch: $os/$arch${NC}"
exit 1
;;
esac
archive_ext=".tar.gz"
if [ "$os" = "windows" ]; then
archive_ext=".zip"
fi
target="$os-$arch"
if [ "$os" = "linux" ]; then
if ! command -v tar >/dev/null 2>&1; then
echo -e "${RED}Error: 'tar' is required but not installed.${NC}"
exit 1
fi
fi
if [ "$os" = "windows" ]; then
if ! command -v unzip >/dev/null 2>&1; then
echo -e "${RED}Error: 'unzip' is required but not installed.${NC}"
exit 1
fi
fi
INSTALL_DIR=$HOME/.strix/bin
mkdir -p "$INSTALL_DIR"
if [ -z "$requested_version" ]; then
specific_version=$(curl -s "https://api.github.com/repos/$REPO/releases/latest" | sed -n 's/.*"tag_name": *"v\([^"]*\)".*/\1/p')
if [[ $? -ne 0 || -z "$specific_version" ]]; then
echo -e "${RED}Failed to fetch version information${NC}"
exit 1
fi
else
specific_version=$requested_version
fi
filename="$APP-${specific_version}-${target}${archive_ext}"
url="https://github.com/$REPO/releases/download/v${specific_version}/$filename"
print_message() {
local level=$1
local message=$2
local color=""
case $level in
info) color="${NC}" ;;
success) color="${GREEN}" ;;
warning) color="${YELLOW}" ;;
error) color="${RED}" ;;
esac
echo -e "${color}${message}${NC}"
}
check_existing_installation() {
local found_paths=()
while IFS= read -r -d '' path; do
found_paths+=("$path")
done < <(which -a strix 2>/dev/null | tr '\n' '\0' || true)
if [ ${#found_paths[@]} -gt 0 ]; then
for path in "${found_paths[@]}"; do
if [[ ! -e "$path" ]] || [[ "$path" == "$INSTALL_DIR/strix"* ]]; then
continue
fi
if [[ -n "$path" ]]; then
echo -e "${MUTED}Found existing strix at: ${NC}$path"
if [[ "$path" == *".local/bin"* ]]; then
echo -e "${MUTED}Removing old pipx installation...${NC}"
if command -v pipx >/dev/null 2>&1; then
pipx uninstall strix-agent 2>/dev/null || true
fi
rm -f "$path" 2>/dev/null || true
elif [[ -L "$path" || -f "$path" ]]; then
echo -e "${MUTED}Removing old installation...${NC}"
rm -f "$path" 2>/dev/null || true
fi
fi
done
fi
}
check_version() {
check_existing_installation
if [[ -x "$INSTALL_DIR/strix" ]]; then
installed_version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "")
if [[ "$installed_version" == "$specific_version" ]]; then
print_message info "${GREEN}✓ Strix ${NC}$specific_version${GREEN} already installed${NC}"
SKIP_DOWNLOAD=true
elif [[ -n "$installed_version" ]]; then
print_message info "${MUTED}Installed: ${NC}$installed_version ${MUTED}→ Upgrading to ${NC}$specific_version"
fi
fi
}
download_and_install() {
print_message info "\n${CYAN}🦉 Installing Strix${NC} ${MUTED}version: ${NC}$specific_version"
print_message info "${MUTED}Platform: ${NC}$target\n"
local tmp_dir=$(mktemp -d)
cd "$tmp_dir"
echo -e "${MUTED}Downloading...${NC}"
curl -# -L -o "$filename" "$url"
if [ ! -f "$filename" ]; then
echo -e "${RED}Download failed${NC}"
exit 1
fi
echo -e "${MUTED}Extracting...${NC}"
if [ "$os" = "windows" ]; then
unzip -q "$filename"
mv "strix-${specific_version}-${target}.exe" "$INSTALL_DIR/strix.exe"
else
tar -xzf "$filename"
mv "strix-${specific_version}-${target}" "$INSTALL_DIR/strix"
chmod 755 "$INSTALL_DIR/strix"
fi
cd - > /dev/null
rm -rf "$tmp_dir"
echo -e "${GREEN}✓ Strix installed to $INSTALL_DIR${NC}"
}
check_docker() {
echo ""
if ! command -v docker >/dev/null 2>&1; then
echo -e "${YELLOW}⚠ Docker not found${NC}"
echo -e "${MUTED}Strix requires Docker to run the security sandbox.${NC}"
echo -e "${MUTED}Please install Docker: ${NC}https://docs.docker.com/get-docker/"
echo ""
return 1
fi
if ! docker info >/dev/null 2>&1; then
echo -e "${YELLOW}⚠ Docker daemon not running${NC}"
echo -e "${MUTED}Please start Docker and run: ${NC}docker pull $STRIX_IMAGE"
echo ""
return 1
fi
echo -e "${MUTED}Checking for sandbox image...${NC}"
if docker image inspect "$STRIX_IMAGE" >/dev/null 2>&1; then
echo -e "${GREEN}✓ Sandbox image already available${NC}"
else
echo -e "${MUTED}Pulling sandbox image (this may take a few minutes)...${NC}"
if docker pull "$STRIX_IMAGE"; then
echo -e "${GREEN}✓ Sandbox image pulled successfully${NC}"
else
echo -e "${YELLOW}⚠ Failed to pull sandbox image${NC}"
echo -e "${MUTED}You can pull it manually later: ${NC}docker pull $STRIX_IMAGE"
fi
fi
return 0
}
add_to_path() {
local config_file=$1
local command=$2
if grep -Fxq "$command" "$config_file" 2>/dev/null; then
return 0
elif [[ -w $config_file ]]; then
echo -e "\n# strix" >> "$config_file"
echo "$command" >> "$config_file"
fi
}
setup_path() {
XDG_CONFIG_HOME=${XDG_CONFIG_HOME:-$HOME/.config}
current_shell=$(basename "$SHELL")
case $current_shell in
fish)
config_files="$HOME/.config/fish/config.fish"
;;
zsh)
config_files="$HOME/.zshrc $HOME/.zshenv"
;;
bash)
config_files="$HOME/.bashrc $HOME/.bash_profile $HOME/.profile"
;;
*)
config_files="$HOME/.bashrc $HOME/.profile"
;;
esac
config_file=""
for file in $config_files; do
if [[ -f $file ]]; then
config_file=$file
break
fi
done
if [[ -z $config_file ]]; then
config_file="$HOME/.bashrc"
touch "$config_file"
fi
if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
case $current_shell in
fish)
add_to_path "$config_file" "fish_add_path $INSTALL_DIR"
;;
*)
add_to_path "$config_file" "export PATH=\"$INSTALL_DIR:\$PATH\""
;;
esac
fi
if [ -n "${GITHUB_ACTIONS-}" ] && [ "${GITHUB_ACTIONS}" == "true" ]; then
echo "$INSTALL_DIR" >> "$GITHUB_PATH"
fi
}
verify_installation() {
export PATH="$INSTALL_DIR:$PATH"
local which_strix=$(which strix 2>/dev/null || echo "")
if [[ "$which_strix" != "$INSTALL_DIR/strix" && "$which_strix" != "$INSTALL_DIR/strix.exe" ]]; then
if [[ -n "$which_strix" ]]; then
echo -e "${YELLOW}⚠ Found conflicting strix at: ${NC}$which_strix"
echo -e "${MUTED}Attempting to remove...${NC}"
if rm -f "$which_strix" 2>/dev/null; then
echo -e "${GREEN}✓ Removed conflicting installation${NC}"
else
echo -e "${YELLOW}Could not remove automatically.${NC}"
echo -e "${MUTED}Please remove manually: ${NC}rm $which_strix"
fi
fi
fi
if [[ -x "$INSTALL_DIR/strix" ]]; then
local version=$("$INSTALL_DIR/strix" --version 2>/dev/null | awk '{print $2}' || echo "unknown")
echo -e "${GREEN}✓ Strix ${NC}$version${GREEN} ready${NC}"
fi
}
check_version
if [ "$SKIP_DOWNLOAD" = false ]; then
download_and_install
fi
setup_path
verify_installation
check_docker
echo ""
echo -e "${CYAN}"
echo " ███████╗████████╗██████╗ ██╗██╗ ██╗"
echo " ██╔════╝╚══██╔══╝██╔══██╗██║╚██╗██╔╝"
echo " ███████╗ ██║ ██████╔╝██║ ╚███╔╝ "
echo " ╚════██║ ██║ ██╔══██╗██║ ██╔██╗ "
echo " ███████║ ██║ ██║ ██║██║██╔╝ ██╗"
echo " ╚══════╝ ╚═╝ ╚═╝ ╚═╝╚═╝╚═╝ ╚═╝"
echo -e "${NC}"
echo -e "${MUTED} AI Penetration Testing Agent${NC}"
echo ""
echo -e "${MUTED}To get started:${NC}"
echo ""
echo -e " ${CYAN}1.${NC} Set your LLM provider:"
echo -e " ${MUTED}export STRIX_LLM='openai/gpt-5'${NC}"
echo -e " ${MUTED}export LLM_API_KEY='your-api-key'${NC}"
echo ""
echo -e " ${CYAN}2.${NC} Run a penetration test:"
echo -e " ${MUTED}strix --target https://example.com${NC}"
echo ""
echo -e "${MUTED}For more information visit ${NC}https://strix.ai"
echo -e "${MUTED}Join our community ${NC}https://discord.gg/YjKFvEZSdZ"
echo ""
if [[ ":$PATH:" != *":$INSTALL_DIR:"* ]]; then
echo -e "${YELLOW}${NC} Run ${MUTED}source ~/.$(basename $SHELL)rc${NC} or open a new terminal"
echo ""
fi
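
Review note: the installer resolves the latest version by pulling `tag_name` out of the GitHub releases response with `sed`, stripping the leading `v`, and then building the download URL from the version and target. The sketch below does the same with a JSON parser, which is less sensitive to whitespace in the response. `latest_version` and `download_url` are hypothetical helpers; the repository slug and URL layout are copied from `install.sh`.

```python
import json

REPO = "usestrix/strix"  # same value as REPO in install.sh


def latest_version(release_json: str) -> str:
    """Extract the version from a releases/latest payload, dropping the 'v' prefix."""
    tag = json.loads(release_json).get("tag_name", "")
    # Like the sed expression, accept only tags of the form v<version>.
    if not tag.startswith("v") or len(tag) == 1:
        raise ValueError("no usable tag_name in release payload")
    return tag[1:]


def download_url(version: str, target: str) -> str:
    """Build the release asset URL the installer downloads for one target."""
    ext = ".zip" if target.startswith("windows") else ".tar.gz"
    filename = f"strix-{version}-{target}{ext}"
    return f"https://github.com/{REPO}/releases/download/v{version}/{filename}"
```

Raising on a missing or malformed tag mirrors the script's "Failed to fetch version information" branch, which also treats an empty result as fatal.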

221
strix.spec Normal file

@@ -0,0 +1,221 @@
# -*- mode: python ; coding: utf-8 -*-
import sys
from pathlib import Path
from PyInstaller.utils.hooks import collect_data_files, collect_submodules
project_root = Path(SPECPATH)
strix_root = project_root / 'strix'
datas = []
for jinja_file in strix_root.rglob('*.jinja'):
rel_path = jinja_file.relative_to(project_root)
datas.append((str(jinja_file), str(rel_path.parent)))
for xml_file in strix_root.rglob('*.xml'):
rel_path = xml_file.relative_to(project_root)
datas.append((str(xml_file), str(rel_path.parent)))
for tcss_file in strix_root.rglob('*.tcss'):
rel_path = tcss_file.relative_to(project_root)
datas.append((str(tcss_file), str(rel_path.parent)))
datas += collect_data_files('textual')
datas += collect_data_files('tiktoken')
datas += collect_data_files('tiktoken_ext')
datas += collect_data_files('litellm')
hiddenimports = [
# Core dependencies
'litellm',
'litellm.llms',
'litellm.llms.openai',
'litellm.llms.anthropic',
'litellm.llms.vertex_ai',
'litellm.llms.bedrock',
'litellm.utils',
'litellm.caching',
# Textual TUI
'textual',
'textual.app',
'textual.widgets',
'textual.containers',
'textual.screen',
'textual.binding',
'textual.reactive',
'textual.css',
'textual._text_area_theme',
# Rich console
'rich',
'rich.console',
'rich.panel',
'rich.text',
'rich.markup',
'rich.style',
'rich.align',
'rich.live',
# Pydantic
'pydantic',
'pydantic.fields',
'pydantic_core',
'email_validator',
# Docker
'docker',
'docker.api',
'docker.models',
'docker.errors',
# HTTP/Networking
'httpx',
'httpcore',
'requests',
'urllib3',
'certifi',
# Jinja2 templating
'jinja2',
'jinja2.ext',
'markupsafe',
# XML parsing
'xmltodict',
# Tiktoken (for token counting)
'tiktoken',
'tiktoken_ext',
'tiktoken_ext.openai_public',
# Tenacity retry
'tenacity',
# Strix modules
'strix',
'strix.interface',
'strix.interface.main',
'strix.interface.cli',
'strix.interface.tui',
'strix.interface.utils',
'strix.interface.tool_components',
'strix.agents',
'strix.agents.base_agent',
'strix.agents.state',
'strix.agents.StrixAgent',
'strix.llm',
'strix.llm.llm',
'strix.llm.config',
'strix.llm.utils',
'strix.llm.request_queue',
'strix.llm.memory_compressor',
'strix.runtime',
'strix.runtime.runtime',
'strix.runtime.docker_runtime',
'strix.telemetry',
'strix.telemetry.tracer',
'strix.tools',
'strix.tools.registry',
'strix.tools.executor',
'strix.tools.argument_parser',
'strix.skills',
]
hiddenimports += collect_submodules('litellm')
hiddenimports += collect_submodules('textual')
hiddenimports += collect_submodules('rich')
hiddenimports += collect_submodules('pydantic')
excludes = [
# Sandbox-only packages
'playwright',
'playwright.sync_api',
'playwright.async_api',
'IPython',
'ipython',
'libtmux',
'pyte',
'openhands_aci',
'openhands-aci',
'gql',
'fastapi',
'uvicorn',
'numpydoc',
# Google Cloud / Vertex AI
'google.cloud',
'google.cloud.aiplatform',
'google.api_core',
'google.auth',
'google.oauth2',
'google.protobuf',
'grpc',
'grpcio',
'grpcio_status',
# Test frameworks
'pytest',
'pytest_asyncio',
'pytest_cov',
'pytest_mock',
# Development tools
'mypy',
'ruff',
'black',
'isort',
'pylint',
'pyright',
'bandit',
'pre_commit',
# Unnecessary for runtime
'tkinter',
'matplotlib',
'numpy',
'pandas',
'scipy',
'PIL',
'cv2',
]
a = Analysis(
['strix/interface/main.py'],
pathex=[str(project_root)],
binaries=[],
datas=datas,
hiddenimports=hiddenimports,
hookspath=[],
hooksconfig={},
runtime_hooks=[],
excludes=excludes,
noarchive=False,
optimize=0,
)
pyz = PYZ(a.pure)
exe = EXE(
pyz,
a.scripts,
a.binaries,
a.datas,
[],
name='strix',
debug=False,
bootloader_ignore_signals=False,
strip=False,
upx=False,
upx_exclude=[],
runtime_tmpdir=None,
console=True,
disable_windowed_traceback=False,
argv_emulation=False,
target_arch=None,
codesign_identity=None,
entitlements_file=None,
)
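
The three `rglob` loops at the top of the spec differ only in the file pattern. A minimal sketch of how they could be folded into one helper follows; the name `collect_package_files` is hypothetical and does not exist in the repository, and the destination rule (parent directory relative to the project root) is copied from the loops above.

```python
from pathlib import Path


def collect_package_files(
    package_root: Path, project_root: Path, patterns: tuple[str, ...]
) -> list[tuple[str, str]]:
    """Return PyInstaller-style (source, destination_dir) pairs.

    Hypothetical helper: mirrors the rglob loops in the spec, where the
    destination is the file's parent directory relative to the project root.
    """
    pairs: list[tuple[str, str]] = []
    for pattern in patterns:
        # sorted() only makes the result order deterministic
        for path in sorted(package_root.rglob(pattern)):
            rel_parent = path.relative_to(project_root).parent
            pairs.append((str(path), str(rel_parent)))
    return pairs
```

In the spec this would presumably be called as `datas = collect_package_files(strix_root, project_root, ('*.jinja', '*.xml', '*.tcss'))` before the `collect_data_files` lines.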


@@ -8,23 +8,24 @@ class StrixAgent(BaseAgent):
max_iterations = 300
def __init__(self, config: dict[str, Any]):
default_modules = []
default_skills = []
state = config.get("state")
if state is None or (hasattr(state, "parent_id") and state.parent_id is None):
default_modules = ["root_agent"]
default_skills = ["root_agent"]
self.default_llm_config = LLMConfig(prompt_modules=default_modules)
self.default_llm_config = LLMConfig(skills=default_skills)
super().__init__(config)
async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]:
async def execute_scan(self, scan_config: dict[str, Any]) -> dict[str, Any]: # noqa: PLR0912
user_instructions = scan_config.get("user_instructions", "")
targets = scan_config.get("targets", [])
repositories = []
local_code = []
urls = []
ip_addresses = []
for target in targets:
target_type = target["type"]
@@ -53,6 +54,8 @@ class StrixAgent(BaseAgent):
elif target_type == "web_application":
urls.append(details["target_url"])
elif target_type == "ip_address":
ip_addresses.append(details["target_ip"])
task_parts = []
@@ -74,6 +77,10 @@ class StrixAgent(BaseAgent):
task_parts.append("\n\nURLs:")
task_parts.extend(f"- {url}" for url in urls)
if ip_addresses:
task_parts.append("\n\nIP Addresses:")
task_parts.extend(f"- {ip}" for ip in ip_addresses)
task_description = " ".join(task_parts)
if user_instructions:
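
To make the new `ip_address` branch easier to follow, here is a rough standalone sketch of how URL and IP targets end up in the task description. The function name `build_target_sections` is hypothetical, and reading fields from a `details` key on each target is an assumption, since that line is outside the hunks shown.

```python
from typing import Any


def build_target_sections(targets: list[dict[str, Any]]) -> list[str]:
    """Group URL and IP targets and render them as task-description parts."""
    urls: list[str] = []
    ip_addresses: list[str] = []

    for target in targets:
        # Assumption: each target carries its fields under a "details" key.
        details = target["details"]
        if target["type"] == "web_application":
            urls.append(details["target_url"])
        elif target["type"] == "ip_address":
            ip_addresses.append(details["target_ip"])

    parts: list[str] = []
    if urls:
        parts.append("\n\nURLs:")
        parts.extend(f"- {url}" for url in urls)
    if ip_addresses:
        parts.append("\n\nIP Addresses:")
        parts.extend(f"- {ip}" for ip in ip_addresses)
    return parts
```

As in the diff, the caller would then join these parts into a single task description string.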


@@ -10,20 +10,24 @@ You follow all instructions and rules provided to you exactly as written in the
<communication_rules>
CLI OUTPUT:
- Never use markdown formatting - you are a CLI agent
- Output plain text only (no **bold**, `code`, [links], # headers)
- You may use simple markdown: **bold**, *italic*, `code`, ~~strikethrough~~, [links](url), and # headers
- Do NOT use complex markdown like bullet lists, numbered lists, or tables
- Use line breaks and indentation for structure
- NEVER use "Strix" or any identifiable names/markers in HTTP requests, payloads, user-agents, or any inputs
INTER-AGENT MESSAGES:
- NEVER echo inter_agent_message or agent_completion_report XML content that is sent to you in your output.
- Process these internally without displaying the XML
- NEVER echo agent_identity XML blocks; treat them as internal metadata for identity only. Do not include them in outputs or tool calls.
- Minimize inter-agent messaging: only message when essential for coordination or assistance; avoid routine status updates; batch non-urgent information; prefer parent/child completion flows and shared artifacts over messaging
AUTONOMOUS BEHAVIOR:
- Work autonomously by default
- You should NOT ask for user input or confirmation - you should always proceed with your task autonomously.
- Minimize user messaging: avoid redundancy and repetition; consolidate updates into a single concise message
- NEVER send an empty or blank message. If you have no content to output or need to wait (for user input, subagent results, or any other reason), you MUST call the wait_for_message tool (or another appropriate tool) instead of emitting an empty response.
- If there is nothing to execute and no user query to answer any more: do NOT send filler/repetitive text — either call wait_for_message or finish your work (subagents: agent_finish; root: finish_scan)
- While the agent loop is running, almost every output MUST be a tool call. Do NOT send plain text messages; act via tools. If idle, use wait_for_message; when done, use agent_finish (subagents) or finish_scan (root)
</communication_rules>
<execution_guidelines>
@@ -102,7 +106,6 @@ OPERATIONAL PRINCIPLES:
- Choose appropriate tools for each context
- Chain vulnerabilities for maximum impact
- Consider business logic and context in exploitation
- **OVERUSE THE THINK TOOL** - Use it CONSTANTLY. Every 1-2 messages MINIMUM, and after each tool call!
- NEVER skip think tool - it's your most important tool for reasoning and success
- WORK RELENTLESSLY - Don't stop until you've found something significant
- Try multiple approaches simultaneously - don't wait for one to fail
@@ -131,6 +134,7 @@ VALIDATION REQUIREMENTS:
- Keep going until you find something that matters
- A vulnerability is ONLY considered reported when a reporting agent uses create_vulnerability_report with full details. Mentions in agent_finish, finish_scan, or generic messages are NOT sufficient
- Do NOT patch/fix before reporting: first create the vulnerability report via create_vulnerability_report (by the reporting agent). Only after reporting is completed should fixing/patching proceed
- DEDUPLICATION: The create_vulnerability_report tool uses LLM-based deduplication. If it rejects your report as a duplicate, DO NOT attempt to re-submit the same vulnerability. Accept the rejection and move on to testing other areas. The vulnerability has already been reported by another agent
</execution_guidelines>
<vulnerability_focus>
@@ -210,10 +214,9 @@ SIMPLE WORKFLOW RULES:
4. **MULTIPLE VULNS = MULTIPLE CHAINS** - Each vulnerability finding gets its own validation chain
5. **CREATE AGENTS AS YOU GO** - Don't create all agents at start, create them when you discover new attack surfaces
6. **ONE JOB PER AGENT** - Each agent has ONE specific task only
7. **VIEW THE AGENT GRAPH BEFORE ACTING** - Always call view_agent_graph before creating or messaging agents to avoid duplicates and to target correctly
8. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
9. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
10. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
7. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
8. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
9. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
WHEN TO CREATE NEW AGENTS:
@@ -261,25 +264,25 @@ CRITICAL RULES:
- **ONE AGENT = ONE TASK** - Don't let agents do multiple unrelated jobs
- **SPAWN REACTIVELY** - Create new agents based on what you discover
- **ONLY REPORTING AGENTS** can use create_vulnerability_report tool
- **AGENT SPECIALIZATION MANDATORY** - Each agent must be highly specialized; prefer 1-3 prompt modules, up to 5 for complex contexts
- **AGENT SPECIALIZATION MANDATORY** - Each agent must be highly specialized; prefer 1-3 skills, up to 5 for complex contexts
- **NO GENERIC AGENTS** - Avoid creating broad, multi-purpose agents that dilute focus
AGENT SPECIALIZATION EXAMPLES:
GOOD SPECIALIZATION:
- "SQLi Validation Agent" with prompt_modules: sql_injection
- "XSS Discovery Agent" with prompt_modules: xss
- "Auth Testing Agent" with prompt_modules: authentication_jwt, business_logic
- "SSRF + XXE Agent" with prompt_modules: ssrf, xxe, rce (related attack vectors)
- "SQLi Validation Agent" with skills: sql_injection
- "XSS Discovery Agent" with skills: xss
- "Auth Testing Agent" with skills: authentication_jwt, business_logic
- "SSRF + XXE Agent" with skills: ssrf, xxe, rce (related attack vectors)
BAD SPECIALIZATION:
- "General Web Testing Agent" with prompt_modules: sql_injection, xss, csrf, ssrf, authentication_jwt (too broad)
- "Everything Agent" with prompt_modules: all available modules (completely unfocused)
- Any agent with more than 5 prompt modules (violates constraints)
- "General Web Testing Agent" with skills: sql_injection, xss, csrf, ssrf, authentication_jwt (too broad)
- "Everything Agent" with skills: all available skills (completely unfocused)
- Any agent with more than 5 skills (violates constraints)
FOCUS PRINCIPLES:
- Each agent should have deep expertise in 1-3 related vulnerability types
- Agents with single modules have the deepest specialization
- Agents with single skills have the deepest specialization
- Related vulnerabilities (like SSRF+XXE or Auth+Business Logic) can be combined
- Never create "kitchen sink" agents that try to do everything
@@ -304,10 +307,25 @@ Tool calls use XML format:
</function>
CRITICAL RULES:
0. While active in the agent loop, EVERY message you output MUST be a single tool call. Do not send plain text-only responses.
1. One tool call per message
2. Tool call must be last in message
3. End response after </function> tag. It's your stop word. Do not continue after it.
5. Thinking is NOT optional - it's required for reasoning and success
4. Use ONLY the exact XML format shown above. NEVER use JSON/YAML/INI or any other syntax for tools or parameters.
5. Tool names must match exactly the tool "name" defined (no module prefixes, dots, or variants).
- Correct: <function=think> ... </function>
- Incorrect: <thinking_tools.think> ... </function>
- Incorrect: <think> ... </think>
- Incorrect: {"think": {...}}
6. Parameters must use <parameter=param_name>value</parameter> exactly. Do NOT pass parameters as JSON or key:value lines. Do NOT add quotes/braces around values.
7. Do NOT wrap tool calls in markdown/code fences or add any text before or after the tool block.
Example (agent creation tool):
<function=create_agent>
<parameter=task>Perform targeted XSS testing on the search endpoint</parameter>
<parameter=name>XSS Discovery Agent</parameter>
<parameter=skills>xss</parameter>
</function>
SPRAYING EXECUTION NOTE:
- When performing large payload sprays or fuzzing, encapsulate the entire spraying loop inside a single python or terminal tool call (e.g., a Python script using asyncio/aiohttp). Do not issue one tool call per payload.
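
The spraying note above asks for the whole loop to live inside one script. A minimal sketch of that shape, using only `asyncio` so it stays self-contained: `spray`, `fake_send`, and `max_concurrency` are hypothetical names, and `fake_send` stands in for a real HTTP request such as an aiohttp call.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def spray(
    payloads: list[str],
    send: Callable[[str], Awaitable[int]],
    max_concurrency: int = 10,
) -> dict[str, int]:
    """Send every payload concurrently and map payload -> status code."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def send_one(payload: str) -> tuple[str, int]:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return payload, await send(payload)

    results = await asyncio.gather(*(send_one(p) for p in payloads))
    return dict(results)


async def fake_send(payload: str) -> int:
    """Stand-in for a real HTTP request; returns a made-up status code."""
    await asyncio.sleep(0)
    return 500 if "'" in payload else 200
```

Running `asyncio.run(spray(payloads, fake_send))` inside one python tool call covers the whole list, rather than issuing one tool call per payload.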
@@ -359,6 +377,7 @@ SPECIALIZED TOOLS:
PROXY & INTERCEPTION:
- Caido CLI - Modern web proxy (already running). Used with proxy tool or with python tool (functions already imported).
- NOTE: If you are seeing proxy errors when sending requests, it usually means you are not sending requests to a correct url/host/port.
- Ignore Caido proxy-generated 50x HTML error pages; these are proxy issues (might happen when requesting a wrong host or SSL/TLS issues, etc).
PROGRAMMING:
- Python 3, Poetry, Go, Node.js/npm
@@ -374,12 +393,12 @@ Directories:
Default user: pentester (sudo available)
</environment>
{% if loaded_module_names %}
{% if loaded_skill_names %}
<specialized_knowledge>
{# Dynamic prompt modules loaded based on agent specialization #}
{# Dynamic skills loaded based on agent specialization #}
{% for module_name in loaded_module_names %}
{{ get_module(module_name) }}
{% for skill_name in loaded_skill_names %}
{{ get_skill(skill_name) }}
{% endfor %}
</specialized_knowledge>


@@ -1,4 +1,5 @@
import asyncio
import contextlib
import logging
from pathlib import Path
from typing import TYPE_CHECKING, Any, Optional
@@ -15,6 +16,7 @@ from jinja2 import (
from strix.llm import LLM, LLMConfig, LLMRequestFailedError
from strix.llm.utils import clean_content
from strix.runtime import SandboxInitializationError
from strix.tools import process_tool_invocations
from .state import AgentState
@@ -75,6 +77,8 @@ class BaseAgent(metaclass=AgentMeta):
max_iterations=self.max_iterations,
)
with contextlib.suppress(Exception):
self.llm.set_agent_identity(self.agent_name, self.state.agent_id)
self._current_task: asyncio.Task[Any] | None = None
from strix.telemetry.tracer import get_global_tracer
@@ -142,18 +146,16 @@ class BaseAgent(metaclass=AgentMeta):
if self.state.parent_id is None and agents_graph_actions._root_agent_id is None:
agents_graph_actions._root_agent_id = self.state.agent_id
def cancel_current_execution(self) -> None:
if self._current_task and not self._current_task.done():
self._current_task.cancel()
self._current_task = None
async def agent_loop(self, task: str) -> dict[str, Any]: # noqa: PLR0912, PLR0915
await self._initialize_sandbox_and_state(task)
from strix.telemetry.tracer import get_global_tracer
tracer = get_global_tracer()
try:
await self._initialize_sandbox_and_state(task)
except SandboxInitializationError as e:
return self._handle_sandbox_error(e, tracer)
while True:
self._check_agent_messages(self.state)
@@ -201,7 +203,11 @@ class BaseAgent(metaclass=AgentMeta):
self.state.add_message("user", final_warning_msg)
try:
should_finish = await self._process_iteration(tracer)
iteration_task = asyncio.create_task(self._process_iteration(tracer))
self._current_task = iteration_task
should_finish = await iteration_task
self._current_task = None
if should_finish:
if self.non_interactive:
self.state.set_completed({"success": True})
@@ -212,43 +218,22 @@ class BaseAgent(metaclass=AgentMeta):
continue
except asyncio.CancelledError:
self._current_task = None
if tracer:
partial_content = tracer.finalize_streaming_as_interrupted(self.state.agent_id)
if partial_content and partial_content.strip():
self.state.add_message(
"assistant", f"{partial_content}\n\n[ABORTED BY USER]"
)
if self.non_interactive:
raise
await self._enter_waiting_state(tracer, error_occurred=False, was_cancelled=True)
continue
except LLMRequestFailedError as e:
error_msg = str(e)
error_details = getattr(e, "details", None)
self.state.add_error(error_msg)
if self.non_interactive:
self.state.set_completed({"success": False, "error": error_msg})
if tracer:
tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
if error_details:
tracer.log_tool_execution_start(
self.state.agent_id,
"llm_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(
tracer._next_execution_id - 1, "failed", error_details
)
return {"success": False, "error": error_msg}
self.state.enter_waiting_state(llm_failed=True)
if tracer:
tracer.update_agent_status(self.state.agent_id, "llm_failed", error_msg)
if error_details:
tracer.log_tool_execution_start(
self.state.agent_id,
"llm_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(
tracer._next_execution_id - 1, "failed", error_details
)
result = self._handle_llm_error(e, tracer)
if result is not None:
return result
continue
except (RuntimeError, ValueError, TypeError) as e:
@@ -266,7 +251,7 @@ class BaseAgent(metaclass=AgentMeta):
if self.state.has_waiting_timeout():
self.state.resume_from_waiting()
self.state.add_message("assistant", "Waiting timeout reached. Resuming execution.")
self.state.add_message("user", "Waiting timeout reached. Resuming execution.")
from strix.telemetry.tracer import get_global_tracer
@@ -331,16 +316,22 @@ class BaseAgent(metaclass=AgentMeta):
if not sandbox_mode and self.state.sandbox_id is None:
from strix.runtime import get_runtime
runtime = get_runtime()
sandbox_info = await runtime.create_sandbox(
self.state.agent_id, self.state.sandbox_token, self.local_sources
)
self.state.sandbox_id = sandbox_info["workspace_id"]
self.state.sandbox_token = sandbox_info["auth_token"]
self.state.sandbox_info = sandbox_info
try:
runtime = get_runtime()
sandbox_info = await runtime.create_sandbox(
self.state.agent_id, self.state.sandbox_token, self.local_sources
)
self.state.sandbox_id = sandbox_info["workspace_id"]
self.state.sandbox_token = sandbox_info["auth_token"]
self.state.sandbox_info = sandbox_info
if "agent_id" in sandbox_info:
self.state.sandbox_info["agent_id"] = sandbox_info["agent_id"]
if "agent_id" in sandbox_info:
self.state.sandbox_info["agent_id"] = sandbox_info["agent_id"]
except Exception as e:
from strix.telemetry import posthog
posthog.error("sandbox_init_error", str(e))
raise
if not self.state.task:
self.state.task = task
@@ -348,9 +339,16 @@ class BaseAgent(metaclass=AgentMeta):
self.state.add_message("user", task)
async def _process_iteration(self, tracer: Optional["Tracer"]) -> bool:
response = await self.llm.generate(self.state.get_conversation_history())
final_response = None
async for response in self.llm.generate(self.state.get_conversation_history()):
final_response = response
if tracer and response.content:
tracer.update_streaming_content(self.state.agent_id, response.content)
content_stripped = (response.content or "").strip()
if final_response is None:
return False
content_stripped = (final_response.content or "").strip()
if not content_stripped:
corrective_message = (
@@ -366,17 +364,19 @@ class BaseAgent(metaclass=AgentMeta):
self.state.add_message("user", corrective_message)
return False
self.state.add_message("assistant", response.content)
thinking_blocks = getattr(final_response, "thinking_blocks", None)
self.state.add_message("assistant", final_response.content, thinking_blocks=thinking_blocks)
if tracer:
tracer.clear_streaming_content(self.state.agent_id)
tracer.log_chat_message(
content=clean_content(response.content),
content=clean_content(final_response.content),
role="assistant",
agent_id=self.state.agent_id,
)
actions = (
response.tool_invocations
if hasattr(response, "tool_invocations") and response.tool_invocations
final_response.tool_invocations
if hasattr(final_response, "tool_invocations") and final_response.tool_invocations
else []
)
@@ -417,18 +417,6 @@ class BaseAgent(metaclass=AgentMeta):
return False
async def _handle_iteration_error(
self,
error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
tracer: Optional["Tracer"],
) -> bool:
error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
logger.exception(error_msg)
self.state.add_error(error_msg)
if tracer:
tracer.update_agent_status(self.state.agent_id, "error")
return True
def _check_agent_messages(self, state: AgentState) -> None: # noqa: PLR0912
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph, _agent_messages
@@ -513,3 +501,90 @@ class BaseAgent(metaclass=AgentMeta):
logger = logging.getLogger(__name__)
logger.warning(f"Error checking agent messages: {e}")
return
def _handle_sandbox_error(
self,
error: SandboxInitializationError,
tracer: Optional["Tracer"],
) -> dict[str, Any]:
error_msg = str(error.message)
error_details = error.details
self.state.add_error(error_msg)
if self.non_interactive:
self.state.set_completed({"success": False, "error": error_msg})
if tracer:
tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
if error_details:
exec_id = tracer.log_tool_execution_start(
self.state.agent_id,
"sandbox_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
return {"success": False, "error": error_msg, "details": error_details}
self.state.enter_waiting_state()
if tracer:
tracer.update_agent_status(self.state.agent_id, "sandbox_failed", error_msg)
if error_details:
exec_id = tracer.log_tool_execution_start(
self.state.agent_id,
"sandbox_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
return {"success": False, "error": error_msg, "details": error_details}
def _handle_llm_error(
self,
error: LLMRequestFailedError,
tracer: Optional["Tracer"],
) -> dict[str, Any] | None:
error_msg = str(error)
error_details = getattr(error, "details", None)
self.state.add_error(error_msg)
if self.non_interactive:
self.state.set_completed({"success": False, "error": error_msg})
if tracer:
tracer.update_agent_status(self.state.agent_id, "failed", error_msg)
if error_details:
exec_id = tracer.log_tool_execution_start(
self.state.agent_id,
"llm_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
return {"success": False, "error": error_msg}
self.state.enter_waiting_state(llm_failed=True)
if tracer:
tracer.update_agent_status(self.state.agent_id, "llm_failed", error_msg)
if error_details:
exec_id = tracer.log_tool_execution_start(
self.state.agent_id,
"llm_error_details",
{"error": error_msg, "details": error_details},
)
tracer.update_tool_execution(exec_id, "failed", {"details": error_details})
return None
async def _handle_iteration_error(
self,
error: RuntimeError | ValueError | TypeError | asyncio.CancelledError,
tracer: Optional["Tracer"],
) -> bool:
error_msg = f"Error in iteration {self.state.iteration}: {error!s}"
logger.exception(error_msg)
self.state.add_error(error_msg)
if tracer:
tracer.update_agent_status(self.state.agent_id, "error")
return True
def cancel_current_execution(self) -> None:
if self._current_task and not self._current_task.done():
self._current_task.cancel()
self._current_task = None
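
The diff wraps each iteration in an `asyncio` task so that `cancel_current_execution` can interrupt it. A minimal sketch of that pattern in isolation: `CancellableRunner` and `demo` are hypothetical names, and `asyncio.sleep` stands in for the real `_process_iteration` call.

```python
import asyncio
from typing import Any


class CancellableRunner:
    """Hypothetical stand-in for the agent's task-tracking logic."""

    def __init__(self) -> None:
        self._current_task: asyncio.Task[Any] | None = None

    async def run(self, seconds: float) -> str:
        # Wrap the work in a task so another caller can cancel it.
        task = asyncio.create_task(asyncio.sleep(seconds, result="finished"))
        self._current_task = task
        try:
            return await task
        except asyncio.CancelledError:
            return "cancelled"
        finally:
            self._current_task = None

    def cancel_current_execution(self) -> None:
        if self._current_task and not self._current_task.done():
            self._current_task.cancel()
            self._current_task = None


async def demo() -> str:
    runner = CancellableRunner()
    pending = asyncio.create_task(runner.run(10))
    await asyncio.sleep(0.01)  # let run() start its inner task
    runner.cancel_current_execution()
    return await pending
```

Only the inner task is cancelled here, so the surrounding loop can catch `CancelledError` and decide what to do next, which is roughly what the agent loop does when it enters its waiting state.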


@@ -43,8 +43,11 @@ class AgentState(BaseModel):
self.iteration += 1
self.last_updated = datetime.now(UTC).isoformat()
def add_message(self, role: str, content: Any) -> None:
self.messages.append({"role": role, "content": content})
def add_message(self, role: str, content: Any, thinking_blocks: list[dict[str, Any]] | None = None) -> None:
message = {"role": role, "content": content}
if thinking_blocks:
message["thinking_blocks"] = thinking_blocks
self.messages.append(message)
self.last_updated = datetime.now(UTC).isoformat()
def add_action(self, action: dict[str, Any]) -> None:
@@ -123,7 +126,7 @@ class AgentState(BaseModel):
return False
elapsed = (datetime.now(UTC) - self.waiting_start_time).total_seconds()
return elapsed > 120
return elapsed > 600
def has_empty_last_messages(self, count: int = 3) -> bool:
if len(self.messages) < count:

12
strix/config/__init__.py Normal file

@@ -0,0 +1,12 @@
from strix.config.config import (
Config,
apply_saved_config,
save_current_config,
)
__all__ = [
"Config",
"apply_saved_config",
"save_current_config",
]

131
strix/config/config.py Normal file

@@ -0,0 +1,131 @@
import contextlib
import json
import os
from pathlib import Path
from typing import Any
class Config:
"""Configuration Manager for Strix."""
# LLM Configuration
strix_llm = None
llm_api_key = None
llm_api_base = None
openai_api_base = None
litellm_base_url = None
ollama_api_base = None
strix_reasoning_effort = "high"
llm_timeout = "300"
llm_rate_limit_delay = "4.0"
llm_rate_limit_concurrent = "1"
# Tool & Feature Configuration
perplexity_api_key = None
strix_disable_browser = "false"
# Runtime Configuration
strix_image = "ghcr.io/usestrix/strix-sandbox:0.1.10"
strix_runtime_backend = "docker"
strix_sandbox_execution_timeout = "500"
strix_sandbox_connect_timeout = "10"
# Telemetry
strix_telemetry = "1"
@classmethod
def _tracked_names(cls) -> list[str]:
return [
k
for k, v in vars(cls).items()
if not k.startswith("_") and k[0].islower() and (v is None or isinstance(v, str))
]
@classmethod
def tracked_vars(cls) -> list[str]:
return [name.upper() for name in cls._tracked_names()]
@classmethod
def get(cls, name: str) -> str | None:
env_name = name.upper()
default = getattr(cls, name, None)
return os.getenv(env_name, default)
@classmethod
def config_dir(cls) -> Path:
return Path.home() / ".strix"
@classmethod
def config_file(cls) -> Path:
return cls.config_dir() / "cli-config.json"
@classmethod
def load(cls) -> dict[str, Any]:
path = cls.config_file()
if not path.exists():
return {}
try:
with path.open("r", encoding="utf-8") as f:
data: dict[str, Any] = json.load(f)
return data
except (json.JSONDecodeError, OSError):
return {}
@classmethod
def save(cls, config: dict[str, Any]) -> bool:
try:
cls.config_dir().mkdir(parents=True, exist_ok=True)
config_path = cls.config_file()
with config_path.open("w", encoding="utf-8") as f:
json.dump(config, f, indent=2)
except OSError:
return False
with contextlib.suppress(OSError):
config_path.chmod(0o600) # may fail on Windows
return True
@classmethod
def apply_saved(cls) -> dict[str, str]:
saved = cls.load()
env_vars = saved.get("env", {})
applied = {}
for var_name, var_value in env_vars.items():
if var_name in cls.tracked_vars() and not os.getenv(var_name):
os.environ[var_name] = var_value
applied[var_name] = var_value
return applied
@classmethod
def capture_current(cls) -> dict[str, Any]:
env_vars = {}
for var_name in cls.tracked_vars():
value = os.getenv(var_name)
if value:
env_vars[var_name] = value
return {"env": env_vars}
@classmethod
def save_current(cls) -> bool:
existing = cls.load().get("env", {})
merged = dict(existing)
for var_name in cls.tracked_vars():
value = os.getenv(var_name)
if value is None:
pass
elif value == "":
merged.pop(var_name, None)
else:
merged[var_name] = value
return cls.save({"env": merged})
def apply_saved_config() -> dict[str, str]:
return Config.apply_saved()
def save_current_config() -> bool:
return Config.save_current()
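
The rules in `apply_saved` and `save_current` are easier to see without the file and environment plumbing. The sketch below restates them as pure functions over plain dictionaries; `apply_saved` and `merge_for_save` here are hypothetical standalone functions, not the class methods, and `tracked` stands in for `Config.tracked_vars()`.

```python
def apply_saved(
    env: dict[str, str], saved: dict[str, str], tracked: set[str]
) -> dict[str, str]:
    """Copy saved values into env for tracked names that are unset or empty."""
    applied: dict[str, str] = {}
    for name, value in saved.items():
        if name in tracked and not env.get(name):
            env[name] = value
            applied[name] = value
    return applied


def merge_for_save(
    env: dict[str, str], saved: dict[str, str], tracked: set[str]
) -> dict[str, str]:
    """Compute the env mapping that would be written back to disk."""
    merged = dict(saved)
    for name in tracked:
        value = env.get(name)
        if value is None:
            continue  # unset: keep whatever was saved
        if value == "":
            merged.pop(name, None)  # empty string: clear the saved entry
        else:
            merged[name] = value  # non-empty: overwrite the saved entry
    return merged
```

A live, non-empty environment variable therefore always wins over the saved file, and an explicitly empty variable is the signal to drop a saved value.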


@@ -1,13 +1,14 @@
Screen {
background: #1a1a1a;
background: #000000;
color: #d4d4d4;
}
#splash_screen {
height: 100%;
width: 100%;
background: #1a1a1a;
background: #000000;
color: #22c55e;
align: center middle;
content-align: center middle;
text-align: center;
}
@@ -17,6 +18,7 @@ Screen {
height: auto;
background: transparent;
text-align: center;
content-align: center middle;
padding: 2;
}
@@ -24,7 +26,7 @@ Screen {
height: 100%;
padding: 0;
margin: 0;
background: #1a1a1a;
background: #000000;
}
#content_container {
@@ -33,31 +35,163 @@ Screen {
background: transparent;
}
#agents_tree {
width: 20%;
#sidebar {
width: 25%;
background: transparent;
border: round #262626;
margin-left: 1;
}
#sidebar.-hidden {
display: none;
}
#agents_tree {
height: 1fr;
background: transparent;
border: round #333333;
border-title-color: #a8a29e;
border-title-style: bold;
margin-left: 1;
padding: 1;
margin-bottom: 0;
}
#stats_display {
height: auto;
max-height: 15;
background: transparent;
padding: 0;
margin: 0;
}
#vulnerabilities_panel {
height: auto;
max-height: 12;
background: transparent;
padding: 0;
margin: 0;
border: round #333333;
overflow-y: auto;
scrollbar-background: #000000;
scrollbar-color: #333333;
scrollbar-corner-color: #000000;
scrollbar-size-vertical: 1;
}
#vulnerabilities_panel.hidden {
display: none;
}
.vuln-item {
height: auto;
width: 100%;
padding: 0 1;
background: transparent;
color: #d4d4d4;
}
.vuln-item:hover {
background: #1a1a1a;
color: #fafaf9;
}
VulnerabilityDetailScreen {
align: center middle;
background: #000000 80%;
}
#vuln_detail_dialog {
grid-size: 1;
grid-gutter: 1;
grid-rows: 1fr auto;
padding: 2 3;
width: 85%;
max-width: 110;
height: 85%;
max-height: 45;
border: solid #262626;
background: #0a0a0a;
}
#vuln_detail_scroll {
height: 1fr;
background: transparent;
scrollbar-background: #0a0a0a;
scrollbar-color: #404040;
scrollbar-corner-color: #0a0a0a;
scrollbar-size: 1 1;
padding-right: 1;
}
#vuln_detail_content {
width: 100%;
background: transparent;
padding: 0;
}
#vuln_detail_buttons {
width: 100%;
height: auto;
align: right middle;
padding-top: 1;
margin: 0;
border-top: solid #1a1a1a;
}
#copy_vuln_detail {
width: auto;
min-width: 12;
height: auto;
background: transparent;
color: #525252;
border: none;
text-style: none;
margin: 0 1;
padding: 0 2;
}
#close_vuln_detail {
width: auto;
min-width: 10;
height: auto;
background: transparent;
color: #a3a3a3;
border: none;
text-style: none;
margin: 0;
padding: 0 2;
}
#copy_vuln_detail:hover, #copy_vuln_detail:focus {
background: transparent;
color: #22c55e;
border: none;
}
#close_vuln_detail:hover, #close_vuln_detail:focus {
background: transparent;
color: #ffffff;
border: none;
}
#chat_area_container {
width: 80%;
width: 75%;
background: transparent;
}
#chat_area_container.-full-width {
width: 100%;
}
#chat_history {
height: 1fr;
background: transparent;
border: round #1a1a1a;
border: round #0a0a0a;
padding: 0;
margin-bottom: 0;
margin-right: 0;
scrollbar-background: #0f0f0f;
scrollbar-color: #262626;
scrollbar-corner-color: #0f0f0f;
scrollbar-background: #000000;
scrollbar-color: #1a1a1a;
scrollbar-corner-color: #000000;
scrollbar-size: 1 1;
}
@@ -79,7 +213,7 @@ Screen {
color: #a3a3a3;
text-align: left;
content-align: left middle;
text-style: italic;
text-style: none;
margin: 0;
padding: 0;
}
@@ -99,11 +233,11 @@ Screen {
#chat_input_container {
height: 3;
background: transparent;
border: round #525252;
border: round #333333;
margin-right: 0;
padding: 0;
layout: horizontal;
align-vertical: middle;
align-vertical: top;
}
#chat_input_container:focus-within {
@@ -120,7 +254,7 @@ Screen {
height: 100%;
padding: 0 0 0 1;
color: #737373;
content-align-vertical: middle;
content-align-vertical: top;
}
#chat_history:focus {
@@ -130,7 +264,7 @@ Screen {
#chat_input {
width: 1fr;
height: 100%;
background: #121212;
background: transparent;
border: none;
color: #d4d4d4;
padding: 0;
@@ -141,6 +275,14 @@ Screen {
border: none;
}
#chat_input .text-area--cursor-line {
background: transparent;
}
#chat_input:focus .text-area--cursor-line {
background: transparent;
}
#chat_input > .text-area--placeholder {
color: #525252;
text-style: italic;
@@ -184,39 +326,31 @@ Screen {
}
.tool-call {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
margin-top: 1;
margin-bottom: 0;
padding: 0 1;
background: #0a0a0a;
border: round #1a1a1a;
border-left: thick #f59e0b;
background: transparent;
border: none;
width: 100%;
}
.tool-call.status-completed {
border-left: thick #22c55e;
background: #0d1f12;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
background: transparent;
margin-top: 1;
margin-bottom: 0;
}
.tool-call.status-running {
border-left: thick #f59e0b;
background: #1f1611;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
background: transparent;
margin-top: 1;
margin-bottom: 0;
}
.tool-call.status-failed,
.tool-call.status-error {
border-left: thick #ef4444;
background: #1f0d0d;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
background: transparent;
margin-top: 1;
margin-bottom: 0;
}
.browser-tool,
@@ -228,209 +362,54 @@ Screen {
.notes-tool,
.thinking-tool,
.web-search-tool,
.finish-tool,
.reporting-tool,
.scan-info-tool,
.subagent-info-tool {
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool {
border-left: thick #06b6d4;
}
.browser-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.browser-tool.status-running {
border-left: thick #0891b2;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.terminal-tool {
border-left: thick #22c55e;
}
.terminal-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.terminal-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.python-tool {
border-left: thick #3b82f6;
}
.python-tool.status-completed {
border-left: thick #3b82f6;
background: transparent;
}
.python-tool.status-running {
border-left: thick #2563eb;
background: transparent;
}
.agents-graph-tool {
border-left: thick #fbbf24;
}
.agents-graph-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.agents-graph-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.file-edit-tool {
border-left: thick #10b981;
}
.file-edit-tool.status-completed {
border-left: thick #10b981;
background: transparent;
}
.file-edit-tool.status-running {
border-left: thick #059669;
background: transparent;
}
.proxy-tool {
border-left: thick #06b6d4;
}
.proxy-tool.status-completed {
border-left: thick #06b6d4;
background: transparent;
}
.proxy-tool.status-running {
border-left: thick #0891b2;
background: transparent;
}
.notes-tool {
border-left: thick #fbbf24;
}
.notes-tool.status-completed {
border-left: thick #fbbf24;
background: transparent;
}
.notes-tool.status-running {
border-left: thick #f59e0b;
background: transparent;
}
.thinking-tool {
border-left: thick #a855f7;
}
.thinking-tool.status-completed {
border-left: thick #a855f7;
background: transparent;
}
.thinking-tool.status-running {
border-left: thick #9333ea;
background: transparent;
}
.web-search-tool {
border-left: thick #22c55e;
}
.web-search-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.web-search-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.finish-tool {
border-left: thick #dc2626;
}
.finish-tool.status-completed {
border-left: thick #dc2626;
background: transparent;
}
.finish-tool.status-running {
border-left: thick #b91c1c;
margin-top: 1;
margin-bottom: 0;
background: transparent;
}
.finish-tool,
.reporting-tool {
border-left: thick #ea580c;
}
.reporting-tool.status-completed {
border-left: thick #ea580c;
background: transparent;
}
.reporting-tool.status-running {
border-left: thick #c2410c;
background: transparent;
}
.scan-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.scan-info-tool.status-completed {
border-left: thick #22c55e;
background: transparent;
}
.scan-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
}
.subagent-info-tool {
border-left: thick #22c55e;
background: transparent;
margin: 0 !important;
margin-top: 0 !important;
margin-bottom: 0 !important;
}
.subagent-info-tool.status-completed {
border-left: thick #22c55e;
margin-top: 1;
margin-bottom: 0;
background: transparent;
}
.browser-tool.status-completed,
.browser-tool.status-running,
.terminal-tool.status-completed,
.terminal-tool.status-running,
.python-tool.status-completed,
.python-tool.status-running,
.agents-graph-tool.status-completed,
.agents-graph-tool.status-running,
.file-edit-tool.status-completed,
.file-edit-tool.status-running,
.proxy-tool.status-completed,
.proxy-tool.status-running,
.notes-tool.status-completed,
.notes-tool.status-running,
.thinking-tool.status-completed,
.thinking-tool.status-running,
.web-search-tool.status-completed,
.web-search-tool.status-running,
.scan-info-tool.status-completed,
.scan-info-tool.status-running,
.subagent-info-tool.status-completed,
.subagent-info-tool.status-running {
border-left: thick #16a34a;
background: transparent;
margin-top: 1;
margin-bottom: 0;
}
.finish-tool.status-completed,
.finish-tool.status-running,
.reporting-tool.status-completed,
.reporting-tool.status-running {
background: transparent;
margin-top: 1;
margin-bottom: 0;
}
Tree {
@@ -448,7 +427,7 @@ Tree > .tree--label {
background: transparent;
padding: 0 1;
margin-bottom: 1;
border-bottom: solid #262626;
border-bottom: solid #1a1a1a;
text-align: center;
}
@@ -488,7 +467,7 @@ Tree > .tree--label {
}
Tree:focus {
border: round #262626;
border: round #1a1a1a;
}
Tree:focus > .tree--label {
@@ -532,7 +511,7 @@ StopAgentScreen {
width: 30;
height: auto;
border: round #a3a3a3;
background: #1a1a1a 98%;
background: #000000 98%;
}
#stop_agent_title {
@@ -594,8 +573,8 @@ QuitScreen {
padding: 1;
width: 24;
height: auto;
border: round #525252;
background: #1a1a1a 98%;
border: round #333333;
background: #000000 98%;
}
#quit_title {
@@ -658,7 +637,7 @@ HelpScreen {
width: 40;
height: auto;
border: round #22c55e;
background: #1a1a1a 98%;
background: #000000 98%;
}
#help_title {

@@ -1,9 +1,12 @@
import atexit
import signal
import sys
import threading
import time
from typing import Any
from rich.console import Console
from rich.live import Live
from rich.panel import Panel
from rich.text import Text
@@ -11,7 +14,10 @@ from strix.agents.StrixAgent import StrixAgent
from strix.llm.config import LLMConfig
from strix.telemetry.tracer import Tracer, set_global_tracer
from .utils import get_severity_color
from .utils import (
build_live_stats_text,
format_vulnerability_report,
)
async def run_cli(args: Any) -> None: # noqa: PLR0915
@@ -36,7 +42,7 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
results_text = Text()
results_text.append("📊 Results will be saved to: ", style="bold cyan")
results_text.append(f"agent_runs/{args.run_name}", style="bold white")
results_text.append(f"strix_runs/{args.run_name}", style="bold white")
note_text = Text()
note_text.append("\n\n", style="dim")
@@ -63,6 +69,8 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
console.print(startup_panel)
console.print()
scan_mode = getattr(args, "scan_mode", "deep")
scan_config = {
"scan_id": args.run_name,
"targets": args.targets_info,
@@ -70,7 +78,7 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
"run_name": args.run_name,
}
llm_config = LLMConfig()
llm_config = LLMConfig(scan_mode=scan_mode)
agent_config = {
"llm_config": llm_config,
"max_iterations": 300,
@@ -83,28 +91,14 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
tracer = Tracer(args.run_name)
tracer.set_scan_config(scan_config)
def display_vulnerability(report_id: str, title: str, content: str, severity: str) -> None:
severity_color = get_severity_color(severity.lower())
def display_vulnerability(report: dict[str, Any]) -> None:
report_id = report.get("id", "unknown")
vuln_text = Text()
vuln_text.append("🐞 ", style="bold red")
vuln_text.append("VULNERABILITY FOUND", style="bold red")
vuln_text.append("", style="dim white")
vuln_text.append(title, style="bold white")
severity_text = Text()
severity_text.append("Severity: ", style="dim white")
severity_text.append(severity.upper(), style=f"bold {severity_color}")
vuln_text = format_vulnerability_report(report)
vuln_panel = Panel(
Text.assemble(
vuln_text,
"\n\n",
severity_text,
"\n\n",
content,
),
title=f"[bold red]🔍 {report_id.upper()}",
vuln_text,
title=f"[bold red]{report_id.upper()}",
title_align="left",
border_style="red",
padding=(1, 2),
@@ -130,19 +124,59 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
set_global_tracer(tracer)
def create_live_status() -> Panel:
status_text = Text()
status_text.append("🦉 ", style="bold white")
status_text.append("Running penetration test...", style="bold #22c55e")
status_text.append("\n\n")
stats_text = build_live_stats_text(tracer, agent_config)
if stats_text:
status_text.append(stats_text)
return Panel(
status_text,
title="[bold #22c55e]🔍 Live Penetration Test Status",
title_align="center",
border_style="#22c55e",
padding=(1, 2),
)
try:
console.print()
with console.status("[bold cyan]Running penetration test...", spinner="dots") as status:
agent = StrixAgent(agent_config)
result = await agent.execute_scan(scan_config)
status.stop()
if isinstance(result, dict) and not result.get("success", True):
error_msg = result.get("error", "Unknown error")
console.print()
console.print(f"[bold red]❌ Penetration test failed:[/] {error_msg}")
console.print()
sys.exit(1)
with Live(
create_live_status(), console=console, refresh_per_second=2, transient=False
) as live:
stop_updates = threading.Event()
def update_status() -> None:
while not stop_updates.is_set():
try:
live.update(create_live_status())
time.sleep(2)
except Exception: # noqa: BLE001
break
update_thread = threading.Thread(target=update_status, daemon=True)
update_thread.start()
try:
agent = StrixAgent(agent_config)
result = await agent.execute_scan(scan_config)
if isinstance(result, dict) and not result.get("success", True):
error_msg = result.get("error", "Unknown error")
error_details = result.get("details")
console.print()
console.print(f"[bold red]❌ Penetration test failed:[/] {error_msg}")
if error_details:
console.print(f"[dim]{error_details}[/]")
console.print()
sys.exit(1)
finally:
stop_updates.set()
update_thread.join(timeout=1)
except Exception as e:
console.print(f"[bold red]Error during penetration test:[/] {e}")
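Note on the hunk above: the new `run_cli` refreshes the live panel from a daemon thread that loops until a `threading.Event` is set. Below is a minimal, stdlib-only sketch of that pattern, not the code from this commit. `run_with_periodic_updates`, `slow_scan`, and `ticks` are hypothetical names, and `Event.wait(interval)` stands in for the `time.sleep(2)` call so that the thread exits promptly once the work finishes.

```python
import threading
import time
from typing import Any, Callable


def run_with_periodic_updates(
    work: Callable[[], Any],
    update: Callable[[], None],
    interval: float = 2.0,
) -> Any:
    """Run work() while calling update() roughly every `interval` seconds."""
    stop = threading.Event()

    def loop() -> None:
        # wait() returns True as soon as stop is set, which ends the loop.
        while not stop.wait(interval):
            update()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    try:
        return work()
    finally:
        # Always stop the refresher, even if work() raises.
        stop.set()
        thread.join(timeout=1)


def slow_scan() -> dict:
    # Hypothetical stand-in for the awaited scan call.
    time.sleep(0.05)
    return {"success": True}


ticks: list = []
result = run_with_periodic_updates(
    slow_scan, lambda: ticks.append(time.monotonic()), interval=0.01
)
```

The `try`/`finally` mirrors the structure in the diff: the stop event is set and the thread joined whether the scan succeeds, fails, or raises.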

@@ -6,10 +6,10 @@ Strix Agent Interface
import argparse
import asyncio
import logging
import os
import shutil
import sys
from pathlib import Path
from typing import Any
import litellm
from docker.errors import DockerException
@@ -17,12 +17,16 @@ from rich.console import Console
from rich.panel import Panel
from rich.text import Text
from strix.interface.cli import run_cli
from strix.interface.tui import run_tui
from strix.interface.utils import (
from strix.config import Config, apply_saved_config, save_current_config
apply_saved_config()
from strix.interface.cli import run_cli # noqa: E402
from strix.interface.tui import run_tui # noqa: E402
from strix.interface.utils import ( # noqa: E402
assign_workspace_subdirs,
build_llm_stats_text,
build_stats_text,
build_final_stats_text,
check_docker_connection,
clone_repository,
collect_local_sources,
@@ -30,10 +34,12 @@ from strix.interface.utils import (
image_exists,
infer_target_type,
process_pull_line,
rewrite_localhost_targets,
validate_llm_response,
)
from strix.runtime.docker_runtime import STRIX_IMAGE
from strix.telemetry.tracer import get_global_tracer
from strix.runtime.docker_runtime import HOST_GATEWAY_HOSTNAME # noqa: E402
from strix.telemetry import posthog # noqa: E402
from strix.telemetry.tracer import get_global_tracer # noqa: E402
logging.getLogger().setLevel(logging.ERROR)
@@ -44,30 +50,30 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
missing_required_vars = []
missing_optional_vars = []
if not os.getenv("STRIX_LLM"):
if not Config.get("strix_llm"):
missing_required_vars.append("STRIX_LLM")
has_base_url = any(
[
os.getenv("LLM_API_BASE"),
os.getenv("OPENAI_API_BASE"),
os.getenv("LITELLM_BASE_URL"),
os.getenv("OLLAMA_API_BASE"),
Config.get("llm_api_base"),
Config.get("openai_api_base"),
Config.get("litellm_base_url"),
Config.get("ollama_api_base"),
]
)
if not os.getenv("LLM_API_KEY"):
if not has_base_url:
missing_required_vars.append("LLM_API_KEY")
else:
missing_optional_vars.append("LLM_API_KEY")
if not Config.get("llm_api_key"):
missing_optional_vars.append("LLM_API_KEY")
if not has_base_url:
missing_optional_vars.append("LLM_API_BASE")
if not os.getenv("PERPLEXITY_API_KEY"):
if not Config.get("perplexity_api_key"):
missing_optional_vars.append("PERPLEXITY_API_KEY")
if not Config.get("strix_reasoning_effort"):
missing_optional_vars.append("STRIX_REASONING_EFFORT")
if missing_required_vars:
error_text = Text()
error_text.append("", style="bold red")
@@ -93,13 +99,6 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
" - Model name to use with litellm (e.g., 'openai/gpt-5')\n",
style="white",
)
elif var == "LLM_API_KEY":
error_text.append("", style="white")
error_text.append("LLM_API_KEY", style="bold cyan")
error_text.append(
" - API key for the LLM provider (required for cloud providers)\n",
style="white",
)
if missing_optional_vars:
error_text.append("\nOptional environment variables:\n", style="white")
@@ -107,7 +106,11 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
if var == "LLM_API_KEY":
error_text.append("", style="white")
error_text.append("LLM_API_KEY", style="bold cyan")
error_text.append(" - API key for the LLM provider\n", style="white")
error_text.append(
" - API key for the LLM provider "
"(not needed for local models, Vertex AI, AWS, etc.)\n",
style="white",
)
elif var == "LLM_API_BASE":
error_text.append("", style="white")
error_text.append("LLM_API_BASE", style="bold cyan")
@@ -122,18 +125,24 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
" - API key for Perplexity AI web search (enables real-time research)\n",
style="white",
)
elif var == "STRIX_REASONING_EFFORT":
error_text.append("", style="white")
error_text.append("STRIX_REASONING_EFFORT", style="bold cyan")
error_text.append(
" - Reasoning effort level: none, minimal, low, medium, high, xhigh "
"(default: high)\n",
style="white",
)
error_text.append("\nExample setup:\n", style="white")
error_text.append("export STRIX_LLM='openai/gpt-5'\n", style="dim white")
if "LLM_API_KEY" in missing_required_vars:
error_text.append("export LLM_API_KEY='your-api-key-here'\n", style="dim white")
if missing_optional_vars:
for var in missing_optional_vars:
if var == "LLM_API_KEY":
error_text.append(
"export LLM_API_KEY='your-api-key-here' # optional with local models\n",
"export LLM_API_KEY='your-api-key-here' "
"# not needed for local models, Vertex AI, AWS, etc.\n",
style="dim white",
)
elif var == "LLM_API_BASE":
@@ -146,6 +155,11 @@ def validate_environment() -> None: # noqa: PLR0912, PLR0915
error_text.append(
"export PERPLEXITY_API_KEY='your-perplexity-key-here'\n", style="dim white"
)
elif var == "STRIX_REASONING_EFFORT":
error_text.append(
"export STRIX_REASONING_EFFORT='high'\n",
style="dim white",
)
panel = Panel(
error_text,
@@ -188,30 +202,33 @@ async def warm_up_llm() -> None:
console = Console()
try:
model_name = os.getenv("STRIX_LLM", "openai/gpt-5")
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
model_name = Config.get("strix_llm")
api_key = Config.get("llm_api_key")
api_base = (
os.getenv("LLM_API_BASE")
or os.getenv("OPENAI_API_BASE")
or os.getenv("LITELLM_BASE_URL")
or os.getenv("OLLAMA_API_BASE")
Config.get("llm_api_base")
or Config.get("openai_api_base")
or Config.get("litellm_base_url")
or Config.get("ollama_api_base")
)
if api_base:
litellm.api_base = api_base
test_messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Reply with just 'OK'."},
]
response = litellm.completion(
model=model_name,
messages=test_messages,
)
llm_timeout = int(Config.get("llm_timeout") or "300")
completion_kwargs: dict[str, Any] = {
"model": model_name,
"messages": test_messages,
"timeout": llm_timeout,
}
if api_key:
completion_kwargs["api_key"] = api_key
if api_base:
completion_kwargs["api_base"] = api_base
response = litellm.completion(**completion_kwargs)
validate_llm_response(response)
@@ -238,6 +255,15 @@ async def warm_up_llm() -> None:
sys.exit(1)
def get_version() -> str:
try:
from importlib.metadata import version
return version("strix-agent")
except Exception: # noqa: BLE001
return "unknown"
def parse_arguments() -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Strix Multi-Agent Cybersecurity Penetration Testing Tool",
@@ -257,22 +283,36 @@ Examples:
# Domain penetration test
strix --target example.com
# IP address penetration test
strix --target 192.168.1.42
# Multiple targets (e.g., white-box testing with source and deployed app)
strix --target https://github.com/user/repo --target https://example.com
strix --target ./my-project --target https://staging.example.com --target https://prod.example.com
# Custom instructions
# Custom instructions (inline)
strix --target example.com --instruction "Focus on authentication vulnerabilities"
# Custom instructions (from file)
strix --target example.com --instruction-file ./instructions.txt
strix --target https://app.com --instruction-file /path/to/detailed_instructions.md
""",
)
parser.add_argument(
"-v",
"--version",
action="version",
version=f"strix {get_version()}",
)
parser.add_argument(
"-t",
"--target",
type=str,
required=True,
action="append",
help="Target to test (URL, repository, local directory path, or domain name). "
help="Target to test (URL, repository, local directory path, domain name, or IP address). "
"Can be specified multiple times for multi-target scans.",
)
parser.add_argument(
@@ -283,13 +323,15 @@ Examples:
"testing approaches (e.g., 'Perform thorough authentication testing'), "
"test credentials (e.g., 'Use the following credentials to access the app: "
"admin:password123'), "
"or areas of interest (e.g., 'Check login API endpoint for security issues')",
"or areas of interest (e.g., 'Check login API endpoint for security issues').",
)
parser.add_argument(
"--run-name",
"--instruction-file",
type=str,
help="Custom name for this penetration test run",
help="Path to a file containing detailed custom instructions for the penetration test. "
"Use this option when you have lengthy or complex instructions saved in a file "
"(e.g., '--instruction-file ./detailed_instructions.txt').",
)
parser.add_argument(
@@ -302,8 +344,38 @@ Examples:
),
)
parser.add_argument(
"-m",
"--scan-mode",
type=str,
choices=["quick", "standard", "deep"],
default="deep",
help=(
"Scan mode: "
"'quick' for fast CI/CD checks, "
"'standard' for routine testing, "
"'deep' for thorough security reviews. "
"Default: deep."
),
)
args = parser.parse_args()
if args.instruction and args.instruction_file:
parser.error(
"Cannot specify both --instruction and --instruction-file. Use one or the other."
)
if args.instruction_file:
instruction_path = Path(args.instruction_file)
try:
with instruction_path.open(encoding="utf-8") as f:
args.instruction = f.read().strip()
if not args.instruction:
parser.error(f"Instruction file '{instruction_path}' is empty")
except Exception as e: # noqa: BLE001
parser.error(f"Failed to read instruction file '{instruction_path}': {e}")
args.targets_info = []
for target in args.target:
try:
@@ -321,6 +393,7 @@ Examples:
parser.error(f"Invalid target '{target}'")
assign_workspace_subdirs(args.targets_info)
rewrite_localhost_targets(args.targets_info, HOST_GATEWAY_HOSTNAME)
return args
@@ -347,8 +420,7 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
completion_text.append("", style="dim white")
completion_text.append("Penetration test interrupted by user", style="white")
stats_text = build_stats_text(tracer)
llm_stats_text = build_llm_stats_text(tracer)
stats_text = build_final_stats_text(tracer)
target_text = Text()
if len(args.targets_info) == 1:
@@ -368,9 +440,6 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
if stats_text.plain:
panel_parts.extend(["\n", stats_text])
if llm_stats_text.plain:
panel_parts.extend(["\n", llm_stats_text])
if scan_completed or has_vulnerabilities:
results_text = Text()
results_text.append("📊 Results Saved To: ", style="bold cyan")
@@ -392,17 +461,20 @@ def display_completion_message(args: argparse.Namespace, results_path: Path) ->
console.print("\n")
console.print(panel)
console.print()
console.print("[dim]🌐 Website:[/] [cyan]https://strix.ai[/]")
console.print("[dim]💬 Discord:[/] [cyan]https://discord.gg/YjKFvEZSdZ[/]")
console.print()
def pull_docker_image() -> None:
console = Console()
client = check_docker_connection()
if image_exists(client, STRIX_IMAGE):
if image_exists(client, Config.get("strix_image")): # type: ignore[arg-type]
return
console.print()
console.print(f"[bold cyan]🐳 Pulling Docker image:[/] {STRIX_IMAGE}")
console.print(f"[bold cyan]🐳 Pulling Docker image:[/] {Config.get('strix_image')}")
console.print("[dim yellow]This only happens on first run and may take a few minutes...[/]")
console.print()
@@ -411,7 +483,7 @@ def pull_docker_image() -> None:
layers_info: dict[str, str] = {}
last_update = ""
for line in client.api.pull(STRIX_IMAGE, stream=True, decode=True):
for line in client.api.pull(Config.get("strix_image"), stream=True, decode=True):
last_update = process_pull_line(line, layers_info, status, last_update)
except DockerException as e:
@@ -420,7 +492,7 @@ def pull_docker_image() -> None:
error_text.append("", style="bold red")
error_text.append("FAILED TO PULL IMAGE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append(f"Could not download: {STRIX_IMAGE}\n", style="white")
error_text.append(f"Could not download: {Config.get('strix_image')}\n", style="white")
error_text.append(str(e), style="dim red")
panel = Panel(
@@ -452,8 +524,9 @@ def main() -> None:
validate_environment()
asyncio.run(warm_up_llm())
if not args.run_name:
args.run_name = generate_run_name()
save_current_config()
args.run_name = generate_run_name(args.targets_info)
for target_info in args.targets_info:
if target_info["type"] == "repository":
@@ -464,12 +537,34 @@ def main() -> None:
args.local_sources = collect_local_sources(args.targets_info)
if args.non_interactive:
asyncio.run(run_cli(args))
else:
asyncio.run(run_tui(args))
is_whitebox = bool(args.local_sources)
results_path = Path("agent_runs") / args.run_name
posthog.start(
model=Config.get("strix_llm"),
scan_mode=args.scan_mode,
is_whitebox=is_whitebox,
interactive=not args.non_interactive,
has_instructions=bool(args.instruction),
)
exit_reason = "user_exit"
try:
if args.non_interactive:
asyncio.run(run_cli(args))
else:
asyncio.run(run_tui(args))
except KeyboardInterrupt:
exit_reason = "interrupted"
except Exception as e:
exit_reason = "error"
posthog.error("unhandled_exception", str(e))
raise
finally:
tracer = get_global_tracer()
if tracer:
posthog.end(tracer, exit_reason=exit_reason)
results_path = Path("strix_runs") / args.run_name
display_completion_message(args, results_path)
if args.non_interactive:
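Note on the `--instruction` / `--instruction-file` handling added to `parse_arguments` in this file: the diff checks for the conflict by hand and then reads the file. The sketch below shows one possible alternative, assuming argparse's built-in `add_mutually_exclusive_group` is acceptable. `build_parser` and `resolve_instruction` are hypothetical helpers, not functions from this commit.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="strix-sketch")
    # argparse rejects the combination itself, so no manual check is needed.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--instruction", type=str)
    group.add_argument("--instruction-file", type=str)
    return parser


def resolve_instruction(
    args: argparse.Namespace, parser: argparse.ArgumentParser
) -> Optional[str]:
    """Return the inline instruction, or the stripped contents of the file."""
    if not args.instruction_file:
        return args.instruction
    path = Path(args.instruction_file)
    try:
        text = path.read_text(encoding="utf-8").strip()
    except OSError as exc:
        # parser.error() prints the message and exits with status 2.
        parser.error(f"Failed to read instruction file '{path}': {exc}")
    if not text:
        parser.error(f"Instruction file '{path}' is empty")
    return text
```

Keeping the empty-file check outside the `try` block avoids relying on `parser.error` (which raises `SystemExit`) passing through an exception handler.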

@@ -0,0 +1,119 @@
import html
import re
from dataclasses import dataclass
from typing import Literal
_FUNCTION_TAG_PREFIX = "<function="
def _get_safe_content(content: str) -> tuple[str, str]:
if not content:
return "", ""
last_lt = content.rfind("<")
if last_lt == -1:
return content, ""
suffix = content[last_lt:]
target = _FUNCTION_TAG_PREFIX # "<function="
if target.startswith(suffix):
return content[:last_lt], suffix
return content, ""
@dataclass
class StreamSegment:
type: Literal["text", "tool"]
content: str
tool_name: str | None = None
args: dict[str, str] | None = None
is_complete: bool = False
def parse_streaming_content(content: str) -> list[StreamSegment]:
if not content:
return []
segments: list[StreamSegment] = []
func_pattern = r"<function=([^>]+)>"
func_matches = list(re.finditer(func_pattern, content))
if not func_matches:
safe_content, _ = _get_safe_content(content)
text = safe_content.strip()
if text:
segments.append(StreamSegment(type="text", content=text))
return segments
first_func_start = func_matches[0].start()
if first_func_start > 0:
text_before = content[:first_func_start].strip()
if text_before:
segments.append(StreamSegment(type="text", content=text_before))
for i, match in enumerate(func_matches):
tool_name = match.group(1)
func_start = match.end()
func_end_match = re.search(r"</function>", content[func_start:])
if func_end_match:
func_body = content[func_start : func_start + func_end_match.start()]
is_complete = True
end_pos = func_start + func_end_match.end()
else:
if i + 1 < len(func_matches):
next_func_start = func_matches[i + 1].start()
func_body = content[func_start:next_func_start]
else:
func_body = content[func_start:]
is_complete = False
end_pos = len(content)
args = _parse_streaming_params(func_body)
segments.append(
StreamSegment(
type="tool",
content=func_body,
tool_name=tool_name,
args=args,
is_complete=is_complete,
)
)
if is_complete and i + 1 < len(func_matches):
next_start = func_matches[i + 1].start()
text_between = content[end_pos:next_start].strip()
if text_between:
segments.append(StreamSegment(type="text", content=text_between))
return segments
def _parse_streaming_params(func_body: str) -> dict[str, str]:
args: dict[str, str] = {}
complete_pattern = r"<parameter=([^>]+)>(.*?)</parameter>"
complete_matches = list(re.finditer(complete_pattern, func_body, re.DOTALL))
complete_end_pos = 0
for match in complete_matches:
param_name = match.group(1)
param_value = html.unescape(match.group(2).strip())
args[param_name] = param_value
complete_end_pos = max(complete_end_pos, match.end())
remaining = func_body[complete_end_pos:]
incomplete_pattern = r"<parameter=([^>]+)>(.*)$"
incomplete_match = re.search(incomplete_pattern, remaining, re.DOTALL)
if incomplete_match:
param_name = incomplete_match.group(1)
param_value = html.unescape(incomplete_match.group(2).strip())
args[param_name] = param_value
return args
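Note on `_get_safe_content` in the new file above: while text is streaming, a trailing fragment such as `<fun` might be the start of a `<function=` tag, so it is held back instead of being shown as plain text. The following is a standalone restatement of that logic for illustration; `split_safe_content` is a hypothetical name used here so the snippet runs on its own.

```python
FUNCTION_TAG_PREFIX = "<function="


def split_safe_content(content: str) -> tuple:
    """Return (text safe to display, suffix held back for now)."""
    if not content:
        return "", ""
    last_lt = content.rfind("<")
    if last_lt == -1:
        return content, ""
    suffix = content[last_lt:]
    # Hold back only if the tail is a prefix of the tag opener.
    if FUNCTION_TAG_PREFIX.startswith(suffix):
        return content[:last_lt], suffix
    return content, ""


partial = split_safe_content("Scanning target <fun")
comparison = split_safe_content("a < b")
```

Here `partial` withholds `<fun` until more characters arrive, while `comparison` passes through unchanged because `< b` cannot grow into `<function=`.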

@@ -1,4 +1,5 @@
from . import (
agent_message_renderer,
agents_graph_renderer,
browser_renderer,
file_edit_renderer,
@@ -10,6 +11,7 @@ from . import (
scan_info_renderer,
terminal_renderer,
thinking_renderer,
todo_renderer,
user_message_renderer,
web_search_renderer,
)
@@ -20,6 +22,7 @@ from .registry import ToolTUIRegistry, get_tool_renderer, register_tool_renderer
__all__ = [
"BaseToolRenderer",
"ToolTUIRegistry",
"agent_message_renderer",
"agents_graph_renderer",
"browser_renderer",
"file_edit_renderer",
@@ -34,6 +37,7 @@ __all__ = [
"scan_info_renderer",
"terminal_renderer",
"thinking_renderer",
"todo_renderer",
"user_message_renderer",
"web_search_renderer",
]

@@ -0,0 +1,190 @@
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import get_lexer_by_name, guess_lexer
from pygments.styles import get_style_by_name
from pygments.util import ClassNotFound
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
_HEADER_STYLES = [
("###### ", 7, "bold #4ade80"),
("##### ", 6, "bold #22c55e"),
("#### ", 5, "bold #16a34a"),
("### ", 4, "bold #15803d"),
("## ", 3, "bold #22c55e"),
("# ", 2, "bold #4ade80"),
]
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
def _get_token_color(token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
def _highlight_code(code: str, language: str | None = None) -> Text:
text = Text()
try:
lexer = get_lexer_by_name(language) if language else guess_lexer(code)
except ClassNotFound:
text.append(code, style="#d4d4d4")
return text
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = _get_token_color(token_type)
text.append(token_value, style=color)
return text
def _try_parse_header(line: str) -> tuple[str, str] | None:
for prefix, strip_len, style in _HEADER_STYLES:
if line.startswith(prefix):
return (line[strip_len:], style)
return None
def _apply_markdown_styles(text: str) -> Text: # noqa: PLR0912
result = Text()
lines = text.split("\n")
in_code_block = False
code_block_lang: str | None = None
code_block_lines: list[str] = []
for i, line in enumerate(lines):
if i > 0 and not in_code_block:
result.append("\n")
if line.startswith("```"):
if not in_code_block:
in_code_block = True
code_block_lang = line[3:].strip() or None
code_block_lines = []
if i > 0:
result.append("\n")
else:
in_code_block = False
code_content = "\n".join(code_block_lines)
if code_content:
result.append_text(_highlight_code(code_content, code_block_lang))
code_block_lines = []
code_block_lang = None
continue
if in_code_block:
code_block_lines.append(line)
continue
header = _try_parse_header(line)
if header:
result.append(header[0], style=header[1])
elif line.startswith("> "):
result.append("", style="#22c55e")
result.append_text(_process_inline_formatting(line[2:]))
elif line.startswith(("- ", "* ")):
result.append("", style="#22c55e")
result.append_text(_process_inline_formatting(line[2:]))
elif len(line) > 2 and line[0].isdigit() and line[1:3] in (". ", ") "):
result.append(line[0] + ". ", style="#22c55e")
result.append_text(_process_inline_formatting(line[3:]))
elif line.strip() in ("---", "***", "___"):
result.append("" * 40, style="#22c55e")
else:
result.append_text(_process_inline_formatting(line))
if in_code_block and code_block_lines:
code_content = "\n".join(code_block_lines)
result.append_text(_highlight_code(code_content, code_block_lang))
return result
def _process_inline_formatting(line: str) -> Text:
result = Text()
i = 0
n = len(line)
while i < n:
if i + 1 < n and line[i : i + 2] in ("**", "__"):
marker = line[i : i + 2]
end = line.find(marker, i + 2)
if end != -1:
result.append(line[i + 2 : end], style="bold #4ade80")
i = end + 2
continue
if i + 1 < n and line[i : i + 2] == "~~":
end = line.find("~~", i + 2)
if end != -1:
result.append(line[i + 2 : end], style="strike #525252")
i = end + 2
continue
if line[i] == "`":
end = line.find("`", i + 1)
if end != -1:
result.append(line[i + 1 : end], style="bold #22c55e on #0a0a0a")
i = end + 1
continue
if line[i] in ("*", "_"):
marker = line[i]
if i + 1 < n and line[i + 1] != marker:
end = line.find(marker, i + 1)
if end != -1 and (end + 1 >= n or line[end + 1] != marker):
result.append(line[i + 1 : end], style="italic #86efac")
i = end + 1
continue
result.append(line[i])
i += 1
return result
@register_tool_renderer
class AgentMessageRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "agent_message"
css_classes: ClassVar[list[str]] = ["chat-message", "agent-message"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
content = tool_data.get("content", "")
if not content:
return Static(Text(), classes=" ".join(cls.css_classes))
styled_text = _apply_markdown_styles(content)
return Static(styled_text, classes=" ".join(cls.css_classes))
@classmethod
def render_simple(cls, content: str) -> Text:
if not content:
return Text()
from strix.llm.utils import clean_content
cleaned = clean_content(content)
if not cleaned:
return Text()
return _apply_markdown_styles(cleaned)
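Note on `_process_inline_formatting` in the new renderer above: it scans the line left to right, and a marker is treated as formatting only when its closing partner is found. Otherwise the character is emitted literally. The sketch below shows the same scanning idea reduced to `**bold**` and backtick code spans. It returns `(text, kind)` tuples instead of a `rich.text.Text` so it needs only the standard library. `split_inline` is a hypothetical name, not part of this commit.

```python
def split_inline(line: str) -> list:
    """Split a line into (text, kind) spans: 'plain', 'bold', or 'code'."""
    spans: list = []
    plain: list = []

    def flush() -> None:
        # Emit any accumulated plain characters as a single span.
        if plain:
            spans.append(("".join(plain), "plain"))
            plain.clear()

    i, n = 0, len(line)
    while i < n:
        if line.startswith("**", i):
            end = line.find("**", i + 2)
            if end != -1:
                flush()
                spans.append((line[i + 2 : end], "bold"))
                i = end + 2
                continue
        if line[i] == "`":
            end = line.find("`", i + 1)
            if end != -1:
                flush()
                spans.append((line[i + 1 : end], "code"))
                i = end + 1
                continue
        # No closing marker found: keep the character as literal text.
        plain.append(line[i])
        i += 1
    flush()
    return spans


spans = split_inline("run **nmap** via `sh`")
```

An unclosed marker such as `a **b` falls through to the literal branch, so partially streamed Markdown still displays without being swallowed.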

@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -12,11 +13,15 @@ class ViewAgentGraphRenderer(BaseToolRenderer):
css_classes: ClassVar[list[str]] = ["tool-call", "agents-graph-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
content_text = "🕸️ [bold #fbbf24]Viewing agents graph[/]"
def render(cls, tool_data: dict[str, Any]) -> Static:
status = tool_data.get("status", "unknown")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
text = Text()
text.append("", style="#a78bfa")
text.append("viewing agents graph", style="dim")
css_classes = cls.get_css_classes(status)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -27,20 +32,22 @@ class CreateAgentRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
task = args.get("task", "")
name = args.get("name", "Agent")
header = f"🤖 [bold #fbbf24]Creating {cls.escape_markup(name)}[/]"
text = Text()
text.append("", style="#a78bfa")
text.append("spawning ", style="dim")
text.append(name, style="bold #a78bfa")
if task:
task_display = task[:400] + "..." if len(task) > 400 else task
content_text = f"{header}\n [dim]{cls.escape_markup(task_display)}[/]"
else:
content_text = f"{header}\n [dim]Spawning agent...[/]"
text.append("\n ")
text.append(task, style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
css_classes = cls.get_css_classes(status)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -51,19 +58,24 @@ class SendMessageToAgentRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
message = args.get("message", "")
agent_id = args.get("agent_id", "")
header = "💬 [bold #fbbf24]Sending message[/]"
text = Text()
text.append("", style="#60a5fa")
if agent_id:
text.append(f"to {agent_id}", style="dim")
else:
text.append("sending message", style="dim")
if message:
message_display = message[:400] + "..." if len(message) > 400 else message
content_text = f"{header}\n [dim]{cls.escape_markup(message_display)}[/]"
else:
content_text = f"{header}\n [dim]Sending...[/]"
text.append("\n ")
text.append(message, style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
css_classes = cls.get_css_classes(status)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -79,25 +91,28 @@ class AgentFinishRenderer(BaseToolRenderer):
findings = args.get("findings", [])
success = args.get("success", True)
header = (
"🏁 [bold #fbbf24]Agent completed[/]" if success else "🏁 [bold #fbbf24]Agent failed[/]"
)
text = Text()
text.append("🏁 ")
if success:
text.append("Agent completed", style="bold #fbbf24")
else:
text.append("Agent failed", style="bold #fbbf24")
if result_summary:
content_parts = [f"{header}\n [bold]{cls.escape_markup(result_summary)}[/]"]
text.append("\n ")
text.append(result_summary, style="bold")
if findings and isinstance(findings, list):
finding_lines = [f"{finding}" for finding in findings]
content_parts.append(
f" [dim]{chr(10).join([cls.escape_markup(line) for line in finding_lines])}[/]"
)
content_text = "\n".join(content_parts)
for finding in findings:
text.append("\n")
text.append(str(finding), style="dim")
else:
content_text = f"{header}\n [dim]Completing task...[/]"
text.append("\n ")
text.append("Completing task...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -108,16 +123,17 @@ class WaitForMessageRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
reason = args.get("reason", "Waiting for messages from other agents or user input")
reason = args.get("reason", "")
header = "⏸️ [bold #fbbf24]Waiting for messages[/]"
text = Text()
text.append("", style="#6b7280")
text.append("waiting", style="dim")
if reason:
reason_display = reason[:400] + "..." if len(reason) > 400 else reason
content_text = f"{header}\n [dim]{cls.escape_markup(reason_display)}[/]"
else:
content_text = f"{header}\n [dim]Agent paused until message received...[/]"
text.append("\n ")
text.append(reason, style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
css_classes = cls.get_css_classes(status)
return Static(text, classes=css_classes)


@@ -1,13 +1,12 @@
from abc import ABC, abstractmethod
from typing import Any, ClassVar, cast
from typing import Any, ClassVar
from rich.markup import escape as rich_escape
from rich.text import Text
from textual.widgets import Static
class BaseToolRenderer(ABC):
tool_name: ClassVar[str] = ""
css_classes: ClassVar[list[str]] = ["tool-call"]
@classmethod
@@ -16,47 +15,80 @@ class BaseToolRenderer(ABC):
pass
@classmethod
def escape_markup(cls, text: str) -> str:
return cast("str", rich_escape(text))
def build_text(cls, tool_data: dict[str, Any]) -> Text: # noqa: ARG003
return Text()
@classmethod
def format_args(cls, args: dict[str, Any], max_length: int = 500) -> str:
if not args:
return ""
args_parts = []
for k, v in args.items():
str_v = str(v)
if len(str_v) > max_length:
str_v = str_v[: max_length - 3] + "..."
args_parts.append(f" [dim]{k}:[/] {cls.escape_markup(str_v)}")
return "\n".join(args_parts)
def create_static(cls, content: Text, status: str) -> Static:
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def format_result(cls, result: Any, max_length: int = 1000) -> str:
if result is None:
return ""
str_result = str(result).strip()
if not str_result:
return ""
if len(str_result) > max_length:
str_result = str_result[: max_length - 3] + "..."
return cls.escape_markup(str_result)
@classmethod
def get_status_icon(cls, status: str) -> str:
status_icons = {
"running": "[#f59e0b]●[/#f59e0b] In progress...",
"completed": "[#22c55e]✓[/#22c55e] Done",
"failed": "[#dc2626]✗[/#dc2626] Failed",
"error": "[#dc2626]✗[/#dc2626] Error",
def status_icon(cls, status: str) -> tuple[str, str]:
icons = {
"running": ("● In progress...", "#f59e0b"),
"completed": ("✓ Done", "#22c55e"),
"failed": ("✗ Failed", "#dc2626"),
"error": ("✗ Error", "#dc2626"),
}
return status_icons.get(status, "[dim]○[/dim] Unknown")
return icons.get(status, ("○ Unknown", "dim"))
@classmethod
def get_css_classes(cls, status: str) -> str:
base_classes = cls.css_classes.copy()
base_classes.append(f"status-{status}")
return " ".join(base_classes)
@classmethod
def text_with_style(cls, content: str, style: str | None = None) -> Text:
text = Text()
text.append(content, style=style)
return text
@classmethod
def text_icon_label(
cls,
icon: str,
label: str,
icon_style: str | None = None,
label_style: str | None = None,
) -> Text:
text = Text()
text.append(icon, style=icon_style)
text.append(" ")
text.append(label, style=label_style)
return text
@classmethod
def text_header(
cls,
icon: str,
title: str,
subtitle: str = "",
title_style: str = "bold",
subtitle_style: str = "dim",
) -> Text:
text = Text()
text.append(icon)
text.append(" ")
text.append(title, style=title_style)
if subtitle:
text.append(" ")
text.append(subtitle, style=subtitle_style)
return text
@classmethod
def text_key_value(
cls,
key: str,
value: str,
key_style: str = "dim",
value_style: str | None = None,
indent: int = 2,
) -> Text:
text = Text()
text.append(" " * indent)
text.append(key, style=key_style)
text.append(": ")
text.append(value, style=value_style)
return text
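
Note on the base-renderer hunk above: `status_icon` now returns a plain `(label, style)` tuple instead of a markup string, and `get_css_classes` appends a `status-<name>` class to a copy of the class list. A minimal stand-alone sketch of those two lookups follows. The free functions `status_icon` and `css_classes_for` are hypothetical names for illustration; in the commit both are classmethods on `BaseToolRenderer`.

```python
from typing import Dict, List, Tuple

# (label, style) pairs copied from the diff; the style is applied by the caller.
STATUS_ICONS: Dict[str, Tuple[str, str]] = {
    "running": ("● In progress...", "#f59e0b"),
    "completed": ("✓ Done", "#22c55e"),
    "failed": ("✗ Failed", "#dc2626"),
    "error": ("✗ Error", "#dc2626"),
}


def status_icon(status: str) -> Tuple[str, str]:
    """Return the label and style for a status, with a dim fallback."""
    return STATUS_ICONS.get(status, ("○ Unknown", "dim"))


def css_classes_for(base: List[str], status: str) -> str:
    """Join the base classes with a status class without mutating `base`."""
    return " ".join([*base, f"status-{status}"])
```

Returning data rather than markup means the label never has to be escaped before it is appended to a `Text`.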


@@ -1,120 +1,135 @@
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_style_by_name
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
@register_tool_renderer
class BrowserRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "browser_action"
css_classes: ClassVar[list[str]] = ["tool-call", "browser-tool"]
SIMPLE_ACTIONS: ClassVar[dict[str, str]] = {
"back": "going back in browser history",
"forward": "going forward in browser history",
"scroll_down": "scrolling down",
"scroll_up": "scrolling up",
"refresh": "refreshing browser tab",
"close_tab": "closing browser tab",
"switch_tab": "switching browser tab",
"list_tabs": "listing browser tabs",
"view_source": "viewing page source",
"get_console_logs": "getting console logs",
"screenshot": "taking screenshot of browser tab",
"wait": "waiting...",
"close": "closing browser",
}
@classmethod
def _get_token_color(cls, token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
@classmethod
def _highlight_js(cls, code: str) -> Text:
lexer = get_lexer_by_name("javascript")
text = Text()
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = cls._get_token_color(token_type)
text.append(token_value, style=color)
return text
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
action = args.get("action", "unknown")
content = cls._build_sleek_content(action, args)
content = cls._build_content(action, args)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(cls, action: str, args: dict[str, Any]) -> str:
browser_icon = "🌐"
def _build_url_action(cls, text: Text, label: str, url: str | None, suffix: str = "") -> None:
text.append(label, style="#06b6d4")
if url:
text.append(url, style="#06b6d4")
if suffix:
text.append(suffix, style="#06b6d4")
@classmethod
def _build_content(cls, action: str, args: dict[str, Any]) -> Text:
text = Text()
text.append("🌐 ")
if action in cls.SIMPLE_ACTIONS:
text.append(cls.SIMPLE_ACTIONS[action], style="#06b6d4")
return text
url = args.get("url")
text = args.get("text")
js_code = args.get("js_code")
key = args.get("key")
file_path = args.get("file_path")
if action in [
"launch",
"goto",
"new_tab",
"type",
"execute_js",
"click",
"double_click",
"hover",
"press_key",
"save_pdf",
]:
if action == "launch":
display_url = cls._format_url(url) if url else None
message = (
f"launching {display_url} on browser" if display_url else "launching browser"
)
elif action == "goto":
display_url = cls._format_url(url) if url else None
message = f"navigating to {display_url}" if display_url else "navigating"
elif action == "new_tab":
display_url = cls._format_url(url) if url else None
message = f"opening tab {display_url}" if display_url else "opening tab"
elif action == "type":
display_text = cls._format_text(text) if text else None
message = f"typing {display_text}" if display_text else "typing"
elif action == "execute_js":
display_js = cls._format_js(js_code) if js_code else None
message = (
f"executing javascript\n{display_js}" if display_js else "executing javascript"
)
elif action == "press_key":
display_key = cls.escape_markup(key) if key else None
message = f"pressing key {display_key}" if display_key else "pressing key"
elif action == "save_pdf":
display_path = cls.escape_markup(file_path) if file_path else None
message = f"saving PDF to {display_path}" if display_path else "saving PDF"
else:
action_words = {
"click": "clicking",
"double_click": "double clicking",
"hover": "hovering",
}
message = cls.escape_markup(action_words[action])
return f"{browser_icon} [#06b6d4]{message}[/]"
simple_actions = {
"back": "going back in browser history",
"forward": "going forward in browser history",
"scroll_down": "scrolling down",
"scroll_up": "scrolling up",
"refresh": "refreshing browser tab",
"close_tab": "closing browser tab",
"switch_tab": "switching browser tab",
"list_tabs": "listing browser tabs",
"view_source": "viewing page source",
"get_console_logs": "getting console logs",
"screenshot": "taking screenshot of browser tab",
"wait": "waiting...",
"close": "closing browser",
url_actions = {
"launch": ("launching ", " on browser" if url else "browser"),
"goto": ("navigating to ", ""),
"new_tab": ("opening tab ", ""),
}
if action in url_actions:
label, suffix = url_actions[action]
if action == "launch" and not url:
text.append("launching browser", style="#06b6d4")
else:
cls._build_url_action(text, label, url, suffix)
return text
if action in simple_actions:
return f"{browser_icon} [#06b6d4]{cls.escape_markup(simple_actions[action])}[/]"
click_actions = {
"click": "clicking",
"double_click": "double clicking",
"hover": "hovering",
}
if action in click_actions:
text.append(click_actions[action], style="#06b6d4")
return text
return f"{browser_icon} [#06b6d4]{cls.escape_markup(action)}[/]"
handlers: dict[str, tuple[str, str | None]] = {
"type": ("typing ", args.get("text")),
"press_key": ("pressing key ", args.get("key")),
"save_pdf": ("saving PDF to ", args.get("file_path")),
}
if action in handlers:
label, value = handlers[action]
text.append(label, style="#06b6d4")
if value:
text.append(str(value), style="#06b6d4")
return text
@classmethod
def _format_url(cls, url: str) -> str:
if len(url) > 300:
url = url[:297] + "..."
return cls.escape_markup(url)
if action == "execute_js":
text.append("executing javascript", style="#06b6d4")
js_code = args.get("js_code")
if js_code:
text.append("\n")
text.append_text(cls._highlight_js(js_code))
return text
@classmethod
def _format_text(cls, text: str) -> str:
if len(text) > 200:
text = text[:197] + "..."
return cls.escape_markup(text)
@classmethod
def _format_js(cls, js_code: str) -> str:
if len(js_code) > 200:
js_code = js_code[:197] + "..."
return f"[white]{cls.escape_markup(js_code)}[/white]"
text.append(action, style="#06b6d4")
return text
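
Note on `_get_token_color` above: when a token type has no color of its own, the lookup climbs to its parent type until it finds one or runs out of ancestors. The sketch below reproduces that walk with a small stand-in class so it runs without Pygments. `Tok` and `color_for` are hypothetical names, and the color value is only an example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Tok:
    """Stand-in for a token type: a name plus an optional parent type."""

    name: str
    parent: Optional["Tok"] = None


def color_for(token: Optional[Tok], colors: Dict[Tok, str]) -> Optional[str]:
    """Return the color of the nearest ancestor (or the token itself) that has one."""
    while token is not None:
        if token in colors:
            return colors[token]
        token = token.parent  # climb one level and retry
    return None


KEYWORD = Tok("Keyword")
KEYWORD_CONSTANT = Tok("Keyword.Constant", parent=KEYWORD)
COLORS = {KEYWORD: "#6ab825"}  # example value, not taken from the commit
```

Here `color_for(KEYWORD_CONSTANT, COLORS)` falls back to the color registered for `KEYWORD`, while a token with no colored ancestor yields `None` and is rendered unstyled.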


@@ -1,16 +1,56 @@
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
from pygments.styles import get_style_by_name
from pygments.util import ClassNotFound
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
def _get_lexer_for_file(path: str) -> Any:
try:
return get_lexer_for_filename(path)
except ClassNotFound:
return get_lexer_by_name("text")
@register_tool_renderer
class StrReplaceEditorRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "str_replace_editor"
css_classes: ClassVar[list[str]] = ["tool-call", "file-edit-tool"]
@classmethod
def _get_token_color(cls, token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
@classmethod
def _highlight_code(cls, code: str, path: str) -> Text:
lexer = _get_lexer_for_file(path)
text = Text()
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = cls._get_token_color(token_type)
text.append(token_value, style=color)
return text
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
@@ -18,28 +58,67 @@ class StrReplaceEditorRenderer(BaseToolRenderer):
command = args.get("command", "")
path = args.get("path", "")
old_str = args.get("old_str", "")
new_str = args.get("new_str", "")
file_text = args.get("file_text", "")
if command == "view":
header = "📖 [bold #10b981]Reading file[/]"
elif command == "str_replace":
header = "✏️ [bold #10b981]Editing file[/]"
elif command == "create":
header = "📝 [bold #10b981]Creating file[/]"
elif command == "insert":
header = "✏️ [bold #10b981]Inserting text[/]"
elif command == "undo_edit":
header = "↩️ [bold #10b981]Undoing edit[/]"
else:
header = "📄 [bold #10b981]File operation[/]"
text = Text()
if (result and isinstance(result, dict) and "content" in result) or path:
icons_and_labels = {
"view": ("📖 ", "Reading file", "#10b981"),
"str_replace": ("✏️ ", "Editing file", "#10b981"),
"create": ("📝 ", "Creating file", "#10b981"),
"insert": ("✏️ ", "Inserting text", "#10b981"),
"undo_edit": ("↩️ ", "Undoing edit", "#10b981"),
}
icon, label, color = icons_and_labels.get(command, ("📄 ", "File operation", "#10b981"))
text.append(icon)
text.append(label, style=f"bold {color}")
if path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
else:
content_text = f"{header} [dim]Processing...[/]"
text.append(" ")
text.append(path_display, style="dim")
if command == "str_replace" and (old_str or new_str):
if old_str:
highlighted_old = cls._highlight_code(old_str, path)
for line in highlighted_old.plain.split("\n"):
text.append("\n")
text.append("-", style="#ef4444")
text.append(" ")
text.append(line)
if new_str:
highlighted_new = cls._highlight_code(new_str, path)
for line in highlighted_new.plain.split("\n"):
text.append("\n")
text.append("+", style="#22c55e")
text.append(" ")
text.append(line)
elif command == "create" and file_text:
text.append("\n")
text.append_text(cls._highlight_code(file_text, path))
elif command == "insert" and new_str:
highlighted_new = cls._highlight_code(new_str, path)
for line in highlighted_new.plain.split("\n"):
text.append("\n")
text.append("+", style="#22c55e")
text.append(" ")
text.append(line)
elif isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif not (result and isinstance(result, dict) and "content" in result) and not path:
text.append(" ")
text.append("Processing...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -50,19 +129,21 @@ class ListFilesRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
header = "📂 [bold #10b981]Listing files[/]"
text = Text()
text.append("📂 ")
text.append("Listing files", style="bold #10b981")
text.append(" ")
if path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
text.append(path_display, style="dim")
else:
content_text = f"{header} [dim]Current directory[/]"
text.append("Current directory", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -73,27 +154,27 @@ class SearchFilesRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
path = args.get("path", "")
regex = args.get("regex", "")
header = "🔍 [bold purple]Searching files[/]"
text = Text()
text.append("🔍 ")
text.append("Searching files", style="bold purple")
text.append(" ")
if path and regex:
path_display = path[-30:] if len(path) > 30 else path
regex_display = regex[:30] if len(regex) > 30 else regex
content_text = (
f"{header} [dim]{cls.escape_markup(path_display)} for "
f"'{cls.escape_markup(regex_display)}'[/]"
)
text.append(path, style="dim")
text.append(" for '", style="dim")
text.append(regex, style="dim")
text.append("'", style="dim")
elif path:
path_display = path[-60:] if len(path) > 60 else path
content_text = f"{header} [dim]{cls.escape_markup(path_display)}[/]"
text.append(path, style="dim")
elif regex:
regex_display = regex[:60] if len(regex) > 60 else regex
content_text = f"{header} [dim]'{cls.escape_markup(regex_display)}'[/]"
text.append("'", style="dim")
text.append(regex, style="dim")
text.append("'", style="dim")
else:
content_text = f"{header} [dim]Searching...[/]"
text.append("Searching...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
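
Note on the `str_replace` and `insert` branches above: each line of `old_str` is emitted with a red `-` prefix and each line of `new_str` with a green `+` prefix. A minimal sketch of that row-building step, without Rich or Pygments, is shown below; `diff_rows` is a hypothetical helper name. As far as can be told from the hunk, the renderer splits `highlighted.plain`, so the diff rows themselves carry no token colors, only the colored prefix.

```python
from typing import List, Tuple


def diff_rows(old: str, new: str) -> List[Tuple[str, str]]:
    """Return (prefix, line) pairs: '-' for removed lines, '+' for added lines."""
    rows: List[Tuple[str, str]] = []
    if old:
        rows.extend(("-", line) for line in old.split("\n"))
    if new:
        rows.extend(("+", line) for line in new.split("\n"))
    return rows
```

An empty `old` or `new` contributes no rows, matching the `if old_str:` and `if new_str:` guards in the diff.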


@@ -1,11 +1,17 @@
from typing import Any, ClassVar
from rich.padding import Padding
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
FIELD_STYLE = "bold #4ade80"
BG_COLOR = "#141414"
@register_tool_renderer
class FinishScanRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "finish_scan"
@@ -15,17 +21,44 @@ class FinishScanRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
content = args.get("content", "")
success = args.get("success", True)
executive_summary = args.get("executive_summary", "")
methodology = args.get("methodology", "")
technical_analysis = args.get("technical_analysis", "")
recommendations = args.get("recommendations", "")
header = (
"🏁 [bold #dc2626]Finishing Scan[/]" if success else "🏁 [bold #dc2626]Scan Failed[/]"
)
text = Text()
text.append("🏁 ")
text.append("Finishing Scan", style="bold #dc2626")
if content:
content_text = f"{header}\n [bold]{cls.escape_markup(content)}[/]"
else:
content_text = f"{header}\n [dim]Generating final report...[/]"
if executive_summary:
text.append("\n\n")
text.append("Executive Summary", style=FIELD_STYLE)
text.append("\n")
text.append(executive_summary)
if methodology:
text.append("\n\n")
text.append("Methodology", style=FIELD_STYLE)
text.append("\n")
text.append(methodology)
if technical_analysis:
text.append("\n\n")
text.append("Technical Analysis", style=FIELD_STYLE)
text.append("\n")
text.append(technical_analysis)
if recommendations:
text.append("\n\n")
text.append("Recommendations", style=FIELD_STYLE)
text.append("\n")
text.append(recommendations)
if not (executive_summary or methodology or technical_analysis or recommendations):
text.append("\n ")
text.append("Generating final report...", style="dim")
padded = Padding(text, 2, style=f"on {BG_COLOR}")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(padded, classes=css_classes)


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -17,23 +18,28 @@ class CreateNoteRenderer(BaseToolRenderer):
title = args.get("title", "")
content = args.get("content", "")
category = args.get("category", "general")
header = "📝 [bold #fbbf24]Note[/]"
text = Text()
text.append("📝 ")
text.append("Note", style="bold #fbbf24")
text.append(" ")
text.append(f"({category})", style="dim")
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts = [f"{header}\n [bold]{cls.escape_markup(title_display)}[/]"]
text.append("\n ")
text.append(title.strip())
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
if content:
text.append("\n ")
text.append(content.strip(), style="dim")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Creating note...[/]"
if not title and not content:
text.append("\n ")
text.append("Capturing...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -43,11 +49,12 @@ class DeleteNoteRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "🗑️ [bold #fbbf24]Delete Note[/]"
content_text = f"{header}\n [dim]Deleting...[/]"
text = Text()
text.append("📝 ")
text.append("Note Removed", style="bold #94a3b8")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -59,28 +66,27 @@ class UpdateNoteRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
title = args.get("title", "")
content = args.get("content", "")
title = args.get("title")
content = args.get("content")
header = "✏️ [bold #fbbf24]Update Note[/]"
text = Text()
text.append("📝 ")
text.append("Note Updated", style="bold #fbbf24")
if title or content:
note_parts = [header]
if title:
text.append("\n ")
text.append(title)
if title:
title_display = title[:100] + "..." if len(title) > 100 else title
note_parts.append(f" [bold]{cls.escape_markup(title_display)}[/]")
if content:
text.append("\n ")
text.append(content.strip(), style="dim")
if content:
content_display = content[:200] + "..." if len(content) > 200 else content
note_parts.append(f" [dim]{cls.escape_markup(content_display)}[/]")
content_text = "\n".join(note_parts)
else:
content_text = f"{header}\n [dim]Updating...[/]"
if not title and not content:
text.append("\n ")
text.append("Updating...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -92,17 +98,36 @@ class ListNotesRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📋 [bold #fbbf24]Listing notes[/]"
text = Text()
text.append("📝 ")
text.append("Notes", style="bold #fbbf24")
if result and isinstance(result, dict) and "notes" in result:
notes = result["notes"]
if isinstance(notes, list):
count = len(notes)
content_text = f"{header}\n [dim]{count} notes found[/]"
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict) and result.get("success"):
count = result.get("total_count", 0)
notes = result.get("notes", []) or []
if count == 0:
text.append("\n ")
text.append("No notes", style="dim")
else:
content_text = f"{header}\n [dim]No notes found[/]"
for note in notes:
title = note.get("title", "").strip() or "(untitled)"
category = note.get("category", "general")
note_content = note.get("content", "").strip()
text.append("\n - ")
text.append(title)
text.append(f" ({category})", style="dim")
if note_content:
text.append("\n ")
text.append(note_content, style="dim")
else:
content_text = f"{header}\n [dim]Listing notes...[/]"
text.append("\n ")
text.append("Loading...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -18,38 +19,42 @@ class ListRequestsRenderer(BaseToolRenderer):
httpql_filter = args.get("httpql_filter")
header = "📋 [bold #06b6d4]Listing requests[/]"
text = Text()
text.append("📋 ")
text.append("Listing requests", style="bold #06b6d4")
if result and isinstance(result, dict) and "requests" in result:
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict) and "requests" in result:
requests = result["requests"]
if isinstance(requests, list) and requests:
request_lines = []
for req in requests[:3]:
for req in requests[:25]:
if isinstance(req, dict):
method = req.get("method", "?")
path = req.get("path", "?")
response = req.get("response") or {}
status = response.get("statusCode", "?")
line = f"{method} {path} {status}"
request_lines.append(line)
if len(requests) > 3:
request_lines.append(f"... +{len(requests) - 3} more")
escaped_lines = [cls.escape_markup(line) for line in request_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
text.append("\n ")
text.append(f"{method} {path}{status}", style="dim")
if len(requests) > 25:
text.append("\n ")
text.append(f"... +{len(requests) - 25} more", style="dim")
else:
content_text = f"{header}\n [dim]No requests found[/]"
text.append("\n ")
text.append("No requests found", style="dim")
elif httpql_filter:
filter_display = (
httpql_filter[:300] + "..." if len(httpql_filter) > 300 else httpql_filter
httpql_filter[:500] + "..." if len(httpql_filter) > 500 else httpql_filter
)
content_text = f"{header}\n [dim]{cls.escape_markup(filter_display)}[/]"
text.append("\n ")
text.append(filter_display, style="dim")
else:
content_text = f"{header}\n [dim]All requests[/]"
text.append("\n ")
text.append("All requests", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -64,34 +69,41 @@ class ViewRequestRenderer(BaseToolRenderer):
part = args.get("part", "request")
header = f"👀 [bold #06b6d4]Viewing {cls.escape_markup(part)}[/]"
text = Text()
text.append("👀 ")
text.append(f"Viewing {part}", style="bold #06b6d4")
if result and isinstance(result, dict):
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if "content" in result:
content = result["content"]
content_preview = content[:500] + "..." if len(content) > 500 else content
content_text = f"{header}\n [dim]{cls.escape_markup(content_preview)}[/]"
content_preview = content[:2000] + "..." if len(content) > 2000 else content
text.append("\n ")
text.append(content_preview, style="dim")
elif "matches" in result:
matches = result["matches"]
if isinstance(matches, list) and matches:
match_lines = [
match["match"]
for match in matches[:3]
if isinstance(match, dict) and "match" in match
]
if len(matches) > 3:
match_lines.append(f"... +{len(matches) - 3} more matches")
escaped_lines = [cls.escape_markup(line) for line in match_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
for match in matches[:25]:
if isinstance(match, dict) and "match" in match:
text.append("\n ")
text.append(match["match"], style="dim")
if len(matches) > 25:
text.append("\n ")
text.append(f"... +{len(matches) - 25} more matches", style="dim")
else:
content_text = f"{header}\n [dim]No matches found[/]"
text.append("\n ")
text.append("No matches found", style="dim")
else:
content_text = f"{header}\n [dim]Viewing content...[/]"
text.append("\n ")
text.append("Viewing content...", style="dim")
else:
content_text = f"{header}\n [dim]Loading...[/]"
text.append("\n ")
text.append("Loading...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -107,30 +119,39 @@ class SendRequestRenderer(BaseToolRenderer):
method = args.get("method", "GET")
url = args.get("url", "")
header = f"📤 [bold #06b6d4]Sending {cls.escape_markup(method)}[/]"
text = Text()
text.append("📤 ")
text.append(f"Sending {method}", style="bold #06b6d4")
if result and isinstance(result, dict):
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
text.append("\n ")
text.append(f"Status: {status_code}", style="dim")
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
response_body[:2000] + "..." if len(response_body) > 2000 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
text.append("\n ")
text.append(body_preview, style="dim")
else:
content_text = f"{header}\n [dim]Response received[/]"
text.append("\n ")
text.append("Response received", style="dim")
elif url:
url_display = url[:400] + "..." if len(url) > 400 else url
content_text = f"{header}\n [dim]{cls.escape_markup(url_display)}[/]"
url_display = url[:500] + "..." if len(url) > 500 else url
text.append("\n ")
text.append(url_display, style="dim")
else:
content_text = f"{header}\n [dim]Sending...[/]"
text.append("\n ")
text.append("Sending...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -145,31 +166,40 @@ class RepeatRequestRenderer(BaseToolRenderer):
modifications = args.get("modifications", {})
header = "🔄 [bold #06b6d4]Repeating request[/]"
text = Text()
text.append("🔄 ")
text.append("Repeating request", style="bold #06b6d4")
if result and isinstance(result, dict):
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
status_code = result.get("status_code")
response_body = result.get("body", "")
if status_code:
response_preview = f"Status: {status_code}"
text.append("\n ")
text.append(f"Status: {status_code}", style="dim")
if response_body:
body_preview = (
response_body[:300] + "..." if len(response_body) > 300 else response_body
response_body[:2000] + "..." if len(response_body) > 2000 else response_body
)
response_preview += f"\n{body_preview}"
content_text = f"{header}\n [dim]{cls.escape_markup(response_preview)}[/]"
text.append("\n ")
text.append(body_preview, style="dim")
else:
content_text = f"{header}\n [dim]Response received[/]"
text.append("\n ")
text.append("Response received", style="dim")
elif modifications:
mod_text = str(modifications)
mod_display = mod_text[:400] + "..." if len(mod_text) > 400 else mod_text
content_text = f"{header}\n [dim]{cls.escape_markup(mod_display)}[/]"
mod_str = str(modifications)
mod_display = mod_str[:500] + "..." if len(mod_str) > 500 else mod_str
text.append("\n ")
text.append(mod_display, style="dim")
else:
content_text = f"{header}\n [dim]No modifications[/]"
text.append("\n ")
text.append("No modifications", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -179,11 +209,14 @@ class ScopeRulesRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: ARG003
header = "⚙️ [bold #06b6d4]Updating proxy scope[/]"
content_text = f"{header}\n [dim]Configuring...[/]"
text = Text()
text.append("⚙️ ")
text.append("Updating proxy scope", style="bold #06b6d4")
text.append("\n ")
text.append("Configuring...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -195,31 +228,34 @@ class ListSitemapRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "🗺️ [bold #06b6d4]Listing sitemap[/]"
text = Text()
text.append("🗺️ ")
text.append("Listing sitemap", style="bold #06b6d4")
if result and isinstance(result, dict) and "entries" in result:
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict) and "entries" in result:
entries = result["entries"]
if isinstance(entries, list) and entries:
entry_lines = []
for entry in entries[:4]:
for entry in entries[:30]:
if isinstance(entry, dict):
label = entry.get("label", "?")
kind = entry.get("kind", "?")
line = f"{kind}: {label}"
entry_lines.append(line)
if len(entries) > 4:
entry_lines.append(f"... +{len(entries) - 4} more")
escaped_lines = [cls.escape_markup(line) for line in entry_lines]
content_text = f"{header}\n [dim]{chr(10).join(escaped_lines)}[/]"
text.append("\n ")
text.append(f"{kind}: {label}", style="dim")
if len(entries) > 30:
text.append("\n ")
text.append(f"... +{len(entries) - 30} more entries", style="dim")
else:
content_text = f"{header}\n [dim]No entries found[/]"
text.append("\n ")
text.append("No entries found", style="dim")
else:
content_text = f"{header}\n [dim]Loading...[/]"
text.append("\n ")
text.append("Loading...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
@register_tool_renderer
@@ -231,25 +267,30 @@ class ViewSitemapEntryRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
header = "📍 [bold #06b6d4]Viewing sitemap entry[/]"
text = Text()
text.append("📍 ")
text.append("Viewing sitemap entry", style="bold #06b6d4")
if result and isinstance(result, dict):
if "entry" in result:
entry = result["entry"]
if isinstance(entry, dict):
label = entry.get("label", "")
kind = entry.get("kind", "")
if label and kind:
entry_info = f"{kind}: {label}"
content_text = f"{header}\n [dim]{cls.escape_markup(entry_info)}[/]"
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict) and "entry" in result:
entry = result["entry"]
if isinstance(entry, dict):
label = entry.get("label", "")
kind = entry.get("kind", "")
if label and kind:
text.append("\n ")
text.append(f"{kind}: {label}", style="dim")
else:
content_text = f"{header}\n [dim]Entry details loaded[/]"
text.append("\n ")
text.append("Entry details loaded", style="dim")
else:
content_text = f"{header}\n [dim]Loading entry...[/]"
text.append("\n ")
text.append("Entry details loaded", style="dim")
else:
content_text = f"{header}\n [dim]Loading...[/]"
text.append("\n ")
text.append("Loading...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)
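
Note on the preview limits in the hunks above: the renderers now cut response bodies at 2000 characters, URLs and modification dumps at 500, and sitemap listings at 30 entries, each with a trailing marker. A minimal stdlib sketch of that pattern follows. The helper names `preview` and `summarize_entries` and the sample `kind`/`label` values are hypothetical and do not appear in the commit, which appends to a `rich.text.Text` instead of returning strings.

```python
def preview(value: str, limit: int) -> str:
    """Return value unchanged if it fits, else its first `limit` chars plus '...'."""
    return value[:limit] + "..." if len(value) > limit else value


def summarize_entries(entries: list, limit: int = 30) -> list[str]:
    """Format up to `limit` entries as 'kind: label' and note how many were omitted."""
    lines = [
        f"{entry.get('kind', '?')}: {entry.get('label', '?')}"
        for entry in entries[:limit]
        if isinstance(entry, dict)  # non-dict entries are skipped, as in the diff
    ]
    if len(entries) > limit:
        lines.append(f"... +{len(entries) - limit} more entries")
    return lines


if __name__ == "__main__":
    print(preview("x" * 2500, 2000)[-5:])
    # Hypothetical sitemap entries, for illustration only.
    sample = [{"kind": "host", "label": f"site-{i}"} for i in range(3)]
    print(summarize_entries(sample, limit=2))
```

The truncated string is `limit + 3` characters long because the ellipsis is appended after the cut rather than counted inside it.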


@@ -1,34 +1,156 @@
import re
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import PythonLexer
from pygments.styles import get_style_by_name
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
MAX_OUTPUT_LINES = 50
MAX_LINE_LENGTH = 200
STRIP_PATTERNS = [
r"\.\.\. \[(stdout|stderr|result|output|error) truncated at \d+k? chars\]",
]
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
@register_tool_renderer
class PythonRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "python_action"
css_classes: ClassVar[list[str]] = ["tool-call", "python-tool"]
@classmethod
def _get_token_color(cls, token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
@classmethod
def _highlight_python(cls, code: str) -> Text:
lexer = PythonLexer()
text = Text()
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = cls._get_token_color(token_type)
text.append(token_value, style=color)
return text
@classmethod
def _clean_output(cls, output: str) -> str:
cleaned = output
for pattern in STRIP_PATTERNS:
cleaned = re.sub(pattern, "", cleaned)
return cleaned.strip()
@classmethod
def _truncate_line(cls, line: str) -> str:
if len(line) > MAX_LINE_LENGTH:
return line[: MAX_LINE_LENGTH - 3] + "..."
return line
@classmethod
def _format_output(cls, output: str) -> Text:
text = Text()
lines = output.splitlines()
total_lines = len(lines)
head_count = MAX_OUTPUT_LINES // 2
tail_count = MAX_OUTPUT_LINES - head_count - 1
if total_lines <= MAX_OUTPUT_LINES:
display_lines = lines
truncated = False
hidden_count = 0
else:
display_lines = lines[:head_count]
truncated = True
hidden_count = total_lines - head_count - tail_count
for i, line in enumerate(display_lines):
truncated_line = cls._truncate_line(line)
text.append(" ")
text.append(truncated_line, style="dim")
if i < len(display_lines) - 1 or truncated:
text.append("\n")
if truncated:
text.append(f" ... {hidden_count} lines truncated ...", style="dim italic")
text.append("\n")
tail_lines = lines[-tail_count:]
for i, line in enumerate(tail_lines):
truncated_line = cls._truncate_line(line)
text.append(" ")
text.append(truncated_line, style="dim")
if i < len(tail_lines) - 1:
text.append("\n")
return text
@classmethod
def _append_output(cls, text: Text, result: dict[str, Any] | str) -> None:
if isinstance(result, str):
if result.strip():
text.append("\n")
text.append_text(cls._format_output(result))
return
stdout = result.get("stdout", "")
stderr = result.get("stderr", "")
stdout = cls._clean_output(stdout) if stdout else ""
stderr = cls._clean_output(stderr) if stderr else ""
if stdout:
text.append("\n")
formatted_output = cls._format_output(stdout)
text.append_text(formatted_output)
if stderr:
text.append("\n")
text.append(" stderr: ", style="bold #ef4444")
formatted_stderr = cls._format_output(stderr)
text.append_text(formatted_stderr)
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result")
action = args.get("action", "")
code = args.get("code", "")
header = "</> [bold #3b82f6]Python[/]"
text = Text()
text.append("</> ", style="dim")
if code and action in ["new_session", "execute"]:
code_display = code[:600] + "..." if len(code) > 600 else code
content_text = f"{header}\n [italic white]{cls.escape_markup(code_display)}[/]"
text.append_text(cls._highlight_python(code))
elif action == "close":
content_text = f"{header}\n [dim]Closing session...[/]"
text.append("Closing session...", style="dim")
elif action == "list_sessions":
content_text = f"{header}\n [dim]Listing sessions...[/]"
text.append("Listing sessions...", style="dim")
else:
content_text = f"{header}\n [dim]Running...[/]"
text.append("Running...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
if result and isinstance(result, dict | str):
cls._append_output(text, result)
css_classes = cls.get_css_classes(status)
return Static(text, classes=css_classes)
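
Note on `_format_output` above: when output exceeds `MAX_OUTPUT_LINES`, it keeps the first half, reserves one line for a truncation marker, and fills the rest from the tail. The sketch below reproduces that arithmetic with plain lists so the counts are easy to check. `head_tail` is a hypothetical name, and the guard for a zero-length tail is an addition for small limits that the commit, with its fixed limit of 50, does not need.

```python
MAX_OUTPUT_LINES = 50


def head_tail(lines: list[str], max_lines: int = MAX_OUTPUT_LINES) -> list[str]:
    """Keep the head and tail of `lines`, replacing the middle with one marker line."""
    if len(lines) <= max_lines:
        return list(lines)
    head_count = max_lines // 2
    tail_count = max_lines - head_count - 1  # one slot is reserved for the marker
    hidden = len(lines) - head_count - tail_count
    # lines[-0:] would return the whole list, so handle an empty tail explicitly.
    tail = lines[-tail_count:] if tail_count else []
    return [*lines[:head_count], f"... {hidden} lines truncated ...", *tail]


if __name__ == "__main__":
    shown = head_tail([str(i) for i in range(100)])
    print(len(shown), shown[24], shown[25], shown[26])
```

With 100 input lines this keeps lines 0 to 24, reports 51 hidden lines, and resumes at line 76, so the widget never shows more than 50 rows.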


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -47,26 +48,32 @@ def render_tool_widget(tool_data: dict[str, Any]) -> Static:
def _render_default_tool_widget(tool_data: dict[str, Any]) -> Static:
tool_name = BaseToolRenderer.escape_markup(tool_data.get("tool_name", "Unknown Tool"))
tool_name = tool_data.get("tool_name", "Unknown Tool")
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result")
status_text = BaseToolRenderer.get_status_icon(status)
text = Text()
header = f"→ Using tool [bold blue]{BaseToolRenderer.escape_markup(tool_name)}[/]"
content_parts = [header]
text.append("→ Using tool ", style="dim")
text.append(tool_name, style="bold blue")
text.append("\n")
args_str = BaseToolRenderer.format_args(args)
if args_str:
content_parts.append(args_str)
for k, v in list(args.items()):
str_v = str(v)
text.append(" ")
text.append(k, style="dim")
text.append(": ")
text.append(str_v)
text.append("\n")
if status in ["completed", "failed", "error"] and result is not None:
result_str = BaseToolRenderer.format_result(result)
if result_str:
content_parts.append(f"[bold]Result:[/] {result_str}")
result_str = str(result)
text.append("Result: ", style="bold")
text.append(result_str)
else:
content_parts.append(status_text)
icon, color = BaseToolRenderer.status_icon(status)
text.append(icon, style=color)
css_classes = BaseToolRenderer.get_css_classes(status)
return Static("\n".join(content_parts), classes=css_classes)
return Static(text, classes=css_classes)


@@ -1,53 +1,221 @@
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import PythonLexer
from pygments.styles import get_style_by_name
from rich.padding import Padding
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
FIELD_STYLE = "bold #4ade80"
BG_COLOR = "#141414"
@register_tool_renderer
class CreateVulnerabilityReportRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_vulnerability_report"
css_classes: ClassVar[list[str]] = ["tool-call", "reporting-tool"]
SEVERITY_COLORS: ClassVar[dict[str, str]] = {
"critical": "#dc2626",
"high": "#ea580c",
"medium": "#d97706",
"low": "#65a30d",
"info": "#0284c7",
}
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
def _get_token_color(cls, token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
@classmethod
def _highlight_python(cls, code: str) -> Text:
lexer = PythonLexer()
text = Text()
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = cls._get_token_color(token_type)
text.append(token_value, style=color)
return text
@classmethod
def _get_cvss_color(cls, cvss_score: float) -> str:
if cvss_score >= 9.0:
return "#dc2626"
if cvss_score >= 7.0:
return "#ea580c"
if cvss_score >= 4.0:
return "#d97706"
if cvss_score >= 0.1:
return "#65a30d"
return "#6b7280"
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static: # noqa: PLR0912, PLR0915
args = tool_data.get("args", {})
result = tool_data.get("result", {})
title = args.get("title", "")
severity = args.get("severity", "")
content = args.get("content", "")
description = args.get("description", "")
impact = args.get("impact", "")
target = args.get("target", "")
technical_analysis = args.get("technical_analysis", "")
poc_description = args.get("poc_description", "")
poc_script_code = args.get("poc_script_code", "")
remediation_steps = args.get("remediation_steps", "")
header = "🐞 [bold #ea580c]Vulnerability Report[/]"
attack_vector = args.get("attack_vector", "")
attack_complexity = args.get("attack_complexity", "")
privileges_required = args.get("privileges_required", "")
user_interaction = args.get("user_interaction", "")
scope = args.get("scope", "")
confidentiality = args.get("confidentiality", "")
integrity = args.get("integrity", "")
availability = args.get("availability", "")
endpoint = args.get("endpoint", "")
method = args.get("method", "")
cve = args.get("cve", "")
severity = ""
cvss_score = None
if isinstance(result, dict):
severity = result.get("severity", "")
cvss_score = result.get("cvss_score")
text = Text()
text.append("🐞 ")
text.append("Vulnerability Report", style="bold #ea580c")
if title:
content_parts = [f"{header}\n [bold]{cls.escape_markup(title)}[/]"]
text.append("\n\n")
text.append("Title: ", style=FIELD_STYLE)
text.append(title)
if severity:
severity_color = cls._get_severity_color(severity.lower())
content_parts.append(
f" [dim]Severity: [{severity_color}]"
f"{cls.escape_markup(severity.upper())}[/{severity_color}][/]"
)
if severity:
text.append("\n\n")
text.append("Severity: ", style=FIELD_STYLE)
severity_color = cls.SEVERITY_COLORS.get(severity.lower(), "#6b7280")
text.append(severity.upper(), style=f"bold {severity_color}")
if content:
content_parts.append(f" [dim]{cls.escape_markup(content)}[/]")
if cvss_score is not None:
text.append("\n\n")
text.append("CVSS Score: ", style=FIELD_STYLE)
cvss_color = cls._get_cvss_color(cvss_score)
text.append(str(cvss_score), style=f"bold {cvss_color}")
content_text = "\n".join(content_parts)
else:
content_text = f"{header}\n [dim]Creating report...[/]"
if target:
text.append("\n\n")
text.append("Target: ", style=FIELD_STYLE)
text.append(target)
if endpoint:
text.append("\n\n")
text.append("Endpoint: ", style=FIELD_STYLE)
text.append(endpoint)
if method:
text.append("\n\n")
text.append("Method: ", style=FIELD_STYLE)
text.append(method)
if cve:
text.append("\n\n")
text.append("CVE: ", style=FIELD_STYLE)
text.append(cve)
if any(
[
attack_vector,
attack_complexity,
privileges_required,
user_interaction,
scope,
confidentiality,
integrity,
availability,
]
):
text.append("\n\n")
cvss_parts = []
if attack_vector:
cvss_parts.append(f"AV:{attack_vector}")
if attack_complexity:
cvss_parts.append(f"AC:{attack_complexity}")
if privileges_required:
cvss_parts.append(f"PR:{privileges_required}")
if user_interaction:
cvss_parts.append(f"UI:{user_interaction}")
if scope:
cvss_parts.append(f"S:{scope}")
if confidentiality:
cvss_parts.append(f"C:{confidentiality}")
if integrity:
cvss_parts.append(f"I:{integrity}")
if availability:
cvss_parts.append(f"A:{availability}")
text.append("CVSS Vector: ", style=FIELD_STYLE)
text.append("/".join(cvss_parts), style="dim")
if description:
text.append("\n\n")
text.append("Description", style=FIELD_STYLE)
text.append("\n")
text.append(description)
if impact:
text.append("\n\n")
text.append("Impact", style=FIELD_STYLE)
text.append("\n")
text.append(impact)
if technical_analysis:
text.append("\n\n")
text.append("Technical Analysis", style=FIELD_STYLE)
text.append("\n")
text.append(technical_analysis)
if poc_description:
text.append("\n\n")
text.append("PoC Description", style=FIELD_STYLE)
text.append("\n")
text.append(poc_description)
if poc_script_code:
text.append("\n\n")
text.append("PoC Code", style=FIELD_STYLE)
text.append("\n")
text.append_text(cls._highlight_python(poc_script_code))
if remediation_steps:
text.append("\n\n")
text.append("Remediation", style=FIELD_STYLE)
text.append("\n")
text.append(remediation_steps)
if not title:
text.append("\n ")
text.append("Creating report...", style="dim")
padded = Padding(text, 2, style=f"on {BG_COLOR}")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
@classmethod
def _get_severity_color(cls, severity: str) -> str:
severity_colors = {
"critical": "#dc2626",
"high": "#ea580c",
"medium": "#d97706",
"low": "#65a30d",
"info": "#0284c7",
}
return severity_colors.get(severity, "#6b7280")
return Static(padded, classes=css_classes)
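
Note on the CVSS handling above: `_get_cvss_color` uses the same score bands as the CVSS v3 qualitative ratings (9.0 and up critical, 7.0 high, 4.0 medium, 0.1 low), and the vector line is assembled from whichever metric arguments are non-empty. A table-driven sketch of both steps is below. The names `CVSS_BANDS`, `VECTOR_FIELDS`, `cvss_color`, and `cvss_vector` are hypothetical, and the sample metric values `"N"` and `"L"` are assumptions about what the tool passes.

```python
# (lower bound, color) pairs, checked from the highest band down.
CVSS_BANDS = [(9.0, "#dc2626"), (7.0, "#ea580c"), (4.0, "#d97706"), (0.1, "#65a30d")]
FALLBACK_COLOR = "#6b7280"  # used for a score of 0.0

# (abbreviation, argument name) pairs in the order the renderer emits them.
VECTOR_FIELDS = [
    ("AV", "attack_vector"),
    ("AC", "attack_complexity"),
    ("PR", "privileges_required"),
    ("UI", "user_interaction"),
    ("S", "scope"),
    ("C", "confidentiality"),
    ("I", "integrity"),
    ("A", "availability"),
]


def cvss_color(score: float) -> str:
    """Return the color of the first band whose lower bound the score reaches."""
    for floor, color in CVSS_BANDS:
        if score >= floor:
            return color
    return FALLBACK_COLOR


def cvss_vector(args: dict[str, str]) -> str:
    """Join the non-empty metrics as 'ABBR:value' parts separated by '/'."""
    return "/".join(f"{abbr}:{args[key]}" for abbr, key in VECTOR_FIELDS if args.get(key))


if __name__ == "__main__":
    print(cvss_color(9.8), cvss_color(5.3))
    print(cvss_vector({"attack_vector": "N", "attack_complexity": "L", "scope": ""}))
```

Keeping the fields in a list would replace the eight `if` branches in `render` with one loop, at the cost of moving the abbreviations away from the code that reads the arguments.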


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -15,29 +16,28 @@ class ScanStartInfoRenderer(BaseToolRenderer):
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
targets = args.get("targets", [])
text = Text()
text.append("🚀 Starting penetration test")
if len(targets) == 1:
target_display = cls._build_single_target_display(targets[0])
content = f"🚀 Starting penetration test on {target_display}"
text.append(" on ")
text.append(cls._get_target_display(targets[0]))
elif len(targets) > 1:
content = f"🚀 Starting penetration test on {len(targets)} targets"
text.append(f" on {len(targets)} targets")
for target_info in targets:
target_display = cls._build_single_target_display(target_info)
content += f"\n{target_display}"
else:
content = "🚀 Starting penetration test"
text.append("\n")
text.append(cls._get_target_display(target_info))
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
return Static(text, classes=css_classes)
@classmethod
def _build_single_target_display(cls, target_info: dict[str, Any]) -> str:
def _get_target_display(cls, target_info: dict[str, Any]) -> str:
original = target_info.get("original")
if original:
return cls.escape_markup(str(original))
return str(original)
return "unknown target"
@@ -51,14 +51,17 @@ class SubagentStartInfoRenderer(BaseToolRenderer):
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
name = args.get("name", "Unknown Agent")
task = args.get("task", "")
name = str(args.get("name", "Unknown Agent"))
task = str(args.get("task", ""))
text = Text()
text.append("", style="#a78bfa")
text.append("subagent ", style="dim")
text.append(name, style="bold #a78bfa")
name = cls.escape_markup(str(name))
content = f"🤖 Spawned subagent {name}"
if task:
task = cls.escape_markup(str(task))
content += f"\n Task: {task}"
text.append("\n ")
text.append(task, style="dim")
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
return Static(text, classes=css_classes)


@@ -1,131 +1,311 @@
import re
from functools import cache
from typing import Any, ClassVar
from pygments.lexers import get_lexer_by_name
from pygments.styles import get_style_by_name
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
MAX_OUTPUT_LINES = 50
MAX_LINE_LENGTH = 200
STRIP_PATTERNS = [
(
r"\n?\[Command still running after [\d.]+s - showing output so far\.?"
r"\s*(?:Use C-c to interrupt if needed\.)?\]"
),
r"^\[Below is the output of the previous command\.\]\n?",
r"^No command is currently running\. Cannot send input\.$",
(
r"^A command is already running\. Use is_input=true to send input to it, "
r"or interrupt it first \(e\.g\., with C-c\)\.$"
),
]
@cache
def _get_style_colors() -> dict[Any, str]:
style = get_style_by_name("native")
return {token: f"#{style_def['color']}" for token, style_def in style if style_def["color"]}
@register_tool_renderer
class TerminalRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "terminal_execute"
css_classes: ClassVar[list[str]] = ["tool-call", "terminal-tool"]
CONTROL_SEQUENCES: ClassVar[set[str]] = {
"C-c",
"C-d",
"C-z",
"C-a",
"C-e",
"C-k",
"C-l",
"C-u",
"C-w",
"C-r",
"C-s",
"C-t",
"C-y",
"^c",
"^d",
"^z",
"^a",
"^e",
"^k",
"^l",
"^u",
"^w",
"^r",
"^s",
"^t",
"^y",
}
SPECIAL_KEYS: ClassVar[set[str]] = {
"Enter",
"Escape",
"Space",
"Tab",
"BTab",
"BSpace",
"DC",
"IC",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"PgUp",
"PgDn",
"PPage",
"NPage",
"F1",
"F2",
"F3",
"F4",
"F5",
"F6",
"F7",
"F8",
"F9",
"F10",
"F11",
"F12",
}
@classmethod
def _get_token_color(cls, token_type: Any) -> str | None:
colors = _get_style_colors()
while token_type:
if token_type in colors:
return colors[token_type]
token_type = token_type.parent
return None
@classmethod
def _highlight_bash(cls, code: str) -> Text:
lexer = get_lexer_by_name("bash")
text = Text()
for token_type, token_value in lexer.get_tokens(code):
if not token_value:
continue
color = cls._get_token_color(token_type)
text.append(token_value, style=color)
return text
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
status = tool_data.get("status", "unknown")
result = tool_data.get("result", {})
result = tool_data.get("result")
command = args.get("command", "")
is_input = args.get("is_input", False)
terminal_id = args.get("terminal_id", "default")
timeout = args.get("timeout")
content = cls._build_sleek_content(command, is_input, terminal_id, timeout, result)
content = cls._build_content(command, is_input, status, result)
css_classes = cls.get_css_classes(status)
return Static(content, classes=css_classes)
@classmethod
def _build_sleek_content(
cls,
command: str,
is_input: bool,
terminal_id: str, # noqa: ARG003
timeout: float | None, # noqa: ARG003
result: dict[str, Any], # noqa: ARG003
) -> str:
def _build_content(
cls, command: str, is_input: bool, status: str, result: dict[str, Any] | str | None
) -> Text:
text = Text()
terminal_icon = ">_"
if not command.strip():
return f"{terminal_icon} [dim]getting logs...[/]"
control_sequences = {
"C-c",
"C-d",
"C-z",
"C-a",
"C-e",
"C-k",
"C-l",
"C-u",
"C-w",
"C-r",
"C-s",
"C-t",
"C-y",
"^c",
"^d",
"^z",
"^a",
"^e",
"^k",
"^l",
"^u",
"^w",
"^r",
"^s",
"^t",
"^y",
}
special_keys = {
"Enter",
"Escape",
"Space",
"Tab",
"BTab",
"BSpace",
"DC",
"IC",
"Up",
"Down",
"Left",
"Right",
"Home",
"End",
"PageUp",
"PageDown",
"PgUp",
"PgDn",
"PPage",
"NPage",
"F1",
"F2",
"F3",
"F4",
"F5",
"F6",
"F7",
"F8",
"F9",
"F10",
"F11",
"F12",
}
text.append(terminal_icon, style="dim")
text.append(" ")
text.append("getting logs...", style="dim")
if result:
cls._append_output(text, result, status, command)
return text
is_special = (
command in control_sequences
or command in special_keys
command in cls.CONTROL_SEQUENCES
or command in cls.SPECIAL_KEYS
or command.startswith(("M-", "S-", "C-S-", "C-M-", "S-M-"))
)
text.append(terminal_icon, style="dim")
text.append(" ")
if is_special:
return f"{terminal_icon} [#ef4444]{cls.escape_markup(command)}[/]"
text.append(command, style="#ef4444")
elif is_input:
text.append(">>>", style="#3b82f6")
text.append(" ")
text.append_text(cls._format_command(command))
else:
text.append("$", style="#22c55e")
text.append(" ")
text.append_text(cls._format_command(command))
if is_input:
formatted_command = cls._format_command_display(command)
return f"{terminal_icon} [#3b82f6]>>>[/] [#22c55e]{formatted_command}[/]"
if result:
cls._append_output(text, result, status, command)
formatted_command = cls._format_command_display(command)
return f"{terminal_icon} [#22c55e]$ {formatted_command}[/]"
return text
@classmethod
def _format_command_display(cls, command: str) -> str:
if not command:
return ""
def _clean_output(cls, output: str, command: str = "") -> str:
cleaned = output
if len(command) > 400:
command = command[:397] + "..."
for pattern in STRIP_PATTERNS:
cleaned = re.sub(pattern, "", cleaned, flags=re.MULTILINE)
return cls.escape_markup(command)
if cleaned.strip():
lines = cleaned.splitlines()
filtered_lines: list[str] = []
for line in lines:
if not filtered_lines and not line.strip():
continue
if re.match(r"^\[STRIX_\d+\]\$\s*", line):
continue
if command and line.strip() == command.strip():
continue
if command and re.match(r"^[\$#>]\s*" + re.escape(command.strip()) + r"\s*$", line):
continue
filtered_lines.append(line)
while filtered_lines and re.match(r"^\[STRIX_\d+\]\$\s*", filtered_lines[-1]):
filtered_lines.pop()
cleaned = "\n".join(filtered_lines)
return cleaned.strip()
@classmethod
def _append_output(
cls, text: Text, result: dict[str, Any] | str, tool_status: str, command: str = ""
) -> None:
if isinstance(result, str):
if result.strip():
text.append("\n")
text.append_text(cls._format_output(result))
return
raw_output = result.get("content", "")
output = cls._clean_output(raw_output, command)
error = result.get("error")
exit_code = result.get("exit_code")
result_status = result.get("status", "")
if error and not cls._is_status_message(error):
text.append("\n")
text.append(" error: ", style="bold #ef4444")
text.append(cls._truncate_line(error), style="#ef4444")
return
if result_status == "running" or tool_status == "running":
if output and output.strip():
text.append("\n")
formatted_output = cls._format_output(output)
text.append_text(formatted_output)
return
if not output or not output.strip():
if exit_code is not None and exit_code != 0:
text.append("\n")
text.append(f" exit {exit_code}", style="dim #ef4444")
return
text.append("\n")
formatted_output = cls._format_output(output)
text.append_text(formatted_output)
if exit_code is not None and exit_code != 0:
text.append("\n")
text.append(f" exit {exit_code}", style="dim #ef4444")
@classmethod
def _is_status_message(cls, message: str) -> bool:
status_patterns = [
r"No command is currently running",
r"A command is already running",
r"Cannot send input",
r"Use is_input=true",
r"Use C-c to interrupt",
r"showing output so far",
]
return any(re.search(pattern, message) for pattern in status_patterns)
@classmethod
def _format_output(cls, output: str) -> Text:
text = Text()
lines = output.splitlines()
total_lines = len(lines)
head_count = MAX_OUTPUT_LINES // 2
tail_count = MAX_OUTPUT_LINES - head_count - 1
if total_lines <= MAX_OUTPUT_LINES:
display_lines = lines
truncated = False
hidden_count = 0
else:
display_lines = lines[:head_count]
truncated = True
hidden_count = total_lines - head_count - tail_count
for i, line in enumerate(display_lines):
truncated_line = cls._truncate_line(line)
text.append(" ")
text.append(truncated_line, style="dim")
if i < len(display_lines) - 1 or truncated:
text.append("\n")
if truncated:
text.append(f" ... {hidden_count} lines truncated ...", style="dim italic")
text.append("\n")
tail_lines = lines[-tail_count:]
for i, line in enumerate(tail_lines):
truncated_line = cls._truncate_line(line)
text.append(" ")
text.append(truncated_line, style="dim")
if i < len(tail_lines) - 1:
text.append("\n")
return text
@classmethod
def _truncate_line(cls, line: str) -> str:
clean_line = re.sub(r"\x1b\[[0-9;]*m", "", line)
if len(clean_line) > MAX_LINE_LENGTH:
return line[: MAX_LINE_LENGTH - 3] + "..."
return line
@classmethod
def _format_command(cls, command: str) -> Text:
return cls._highlight_bash(command)
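
Note on `_clean_output` above: besides the `STRIP_PATTERNS` substitutions, it filters the captured terminal text line by line, dropping leading blank lines, `[STRIX_n]$` prompt lines, and echoes of the command itself. The sketch below isolates only that line filter with the standard library. `clean_output` and `PROMPT_RE` are hypothetical names, and the sample transcript is made up for illustration.

```python
import re

# Prompt format taken from the regex in the diff, e.g. "[STRIX_0]$ ".
PROMPT_RE = re.compile(r"^\[STRIX_\d+\]\$\s*")


def clean_output(output: str, command: str = "") -> str:
    """Drop leading blanks, prompt lines, and echoed copies of `command`."""
    cmd = command.strip()
    echo_re = re.compile(r"^[\$#>]\s*" + re.escape(cmd) + r"\s*$") if cmd else None
    kept: list[str] = []
    for line in output.splitlines():
        if not kept and not line.strip():
            continue  # skip blank lines before the first kept line
        if PROMPT_RE.match(line):
            continue  # any line starting with the prompt is dropped
        if cmd and (line.strip() == cmd or echo_re.match(line)):
            continue  # the shell echoing the command back
        kept.append(line)
    return "\n".join(kept).strip()


if __name__ == "__main__":
    raw = "\n[STRIX_0]$ ls -la\nls -la\n$ ls -la\ntotal 8\nfile.txt\n[STRIX_0]$ "
    print(clean_output(raw, "ls -la"))
```

Because the prompt check runs on every line, the trailing-prompt `while` loop in the commit only matters for lines the main loop somehow kept. The sketch omits it for that reason.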


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -14,16 +15,17 @@ class ThinkRenderer(BaseToolRenderer):
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
args = tool_data.get("args", {})
thought = args.get("thought", "")
header = "🧠 [bold #a855f7]Thinking[/]"
text = Text()
text.append("🧠 ")
text.append("Thinking", style="bold #a855f7")
text.append("\n ")
if thought:
thought_display = thought[:600] + "..." if len(thought) > 600 else thought
content = f"{header}\n [italic dim]{cls.escape_markup(thought_display)}[/]"
text.append(thought, style="italic dim")
else:
content = f"{header}\n [italic dim]Thinking...[/]"
text.append("Thinking...", style="italic dim")
css_classes = cls.get_css_classes("completed")
return Static(content, classes=css_classes)
return Static(text, classes=css_classes)


@@ -0,0 +1,225 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
from .registry import register_tool_renderer
STATUS_MARKERS: dict[str, str] = {
"pending": "[ ]",
"in_progress": "[~]",
"done": "[•]",
}
def _format_todo_lines(text: Text, result: dict[str, Any]) -> None:
todos = result.get("todos")
if not isinstance(todos, list) or not todos:
text.append("\n ")
text.append("No todos", style="dim")
return
for todo in todos:
status = todo.get("status", "pending")
marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
title = todo.get("title", "").strip() or "(untitled)"
text.append("\n ")
text.append(marker)
text.append(" ")
if status == "done":
text.append(title, style="dim strike")
elif status == "in_progress":
text.append(title, style="italic")
else:
text.append(title)
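
Note on `_format_todo_lines` above: each todo becomes one line made of a status marker and a title, with unknown statuses falling back to the pending marker and blank titles shown as `(untitled)`. The sketch below mirrors that mapping with plain strings. `todo_lines` is a hypothetical helper, the sample todos are invented, and the per-status styling (`dim strike`, `italic`) from the commit is left out.

```python
STATUS_MARKERS = {
    "pending": "[ ]",
    "in_progress": "[~]",
    "done": "[•]",
}


def todo_lines(todos: list[dict[str, str]]) -> list[str]:
    """Return one 'marker title' line per todo, or a single 'No todos' line."""
    if not todos:
        return ["No todos"]
    lines = []
    for todo in todos:
        status = todo.get("status", "pending")
        # Unknown statuses are shown with the pending marker, as in the diff.
        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
        title = todo.get("title", "").strip() or "(untitled)"
        lines.append(f"{marker} {title}")
    return lines


if __name__ == "__main__":
    for line in todo_lines(
        [
            {"status": "done", "title": "Map endpoints"},
            {"status": "in_progress", "title": "Test login form"},
            {"status": "blocked", "title": "   "},
        ]
    ):
        print(line)
```
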
@register_tool_renderer
class CreateTodoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "create_todo"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todo", style="bold #a78bfa")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Failed to create todo")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Creating...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
@register_tool_renderer
class ListTodosRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "list_todos"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todos", style="bold #a78bfa")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Unable to list todos")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Loading...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
@register_tool_renderer
class UpdateTodoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "update_todo"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todo Updated", style="bold #a78bfa")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Failed to update todo")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Updating...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
@register_tool_renderer
class MarkTodoDoneRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "mark_todo_done"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todo Completed", style="bold #a78bfa")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Failed to mark todo done")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Marking done...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
@register_tool_renderer
class MarkTodoPendingRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "mark_todo_pending"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todo Reopened", style="bold #f59e0b")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Failed to reopen todo")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Reopening...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
@register_tool_renderer
class DeleteTodoRenderer(BaseToolRenderer):
tool_name: ClassVar[str] = "delete_todo"
css_classes: ClassVar[list[str]] = ["tool-call", "todo-tool"]
@classmethod
def render(cls, tool_data: dict[str, Any]) -> Static:
result = tool_data.get("result")
text = Text()
text.append("📋 ")
text.append("Todo Removed", style="bold #94a3b8")
if isinstance(result, str) and result.strip():
text.append("\n ")
text.append(result.strip(), style="dim")
elif result and isinstance(result, dict):
if result.get("success"):
_format_todo_lines(text, result)
else:
error = result.get("error", "Failed to remove todo")
text.append("\n ")
text.append(error, style="#ef4444")
else:
text.append("\n ")
text.append("Removing...", style="dim")
css_classes = cls.get_css_classes("completed")
return Static(text, classes=css_classes)
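
The six todo renderers above share one formatting path: a status-to-marker lookup with a fallback to the pending marker, and a placeholder for empty titles. A minimal sketch of that logic follows, using plain strings instead of `rich.text.Text` so it runs without third-party packages. The function name `format_todo_lines_plain` is an assumption made for illustration and does not appear in the diff.

```python
from typing import Any

STATUS_MARKERS: dict[str, str] = {
    "pending": "[ ]",
    "in_progress": "[~]",
    "done": "[•]",
}


def format_todo_lines_plain(result: dict[str, Any]) -> list[str]:
    """Hypothetical plain-text variant of _format_todo_lines (no styling)."""
    todos = result.get("todos")
    if not isinstance(todos, list) or not todos:
        return ["No todos"]

    lines = []
    for todo in todos:
        status = todo.get("status", "pending")
        # Unknown statuses fall back to the pending marker.
        marker = STATUS_MARKERS.get(status, STATUS_MARKERS["pending"])
        # Blank or whitespace-only titles get a placeholder.
        title = todo.get("title", "").strip() or "(untitled)"
        lines.append(f"{marker} {title}")
    return lines
```

In the actual renderers the same decisions additionally pick a style (`dim strike` for done, `italic` for in progress).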


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -12,32 +13,38 @@ class UserMessageRenderer(BaseToolRenderer):
css_classes: ClassVar[list[str]] = ["chat-message", "user-message"]
@classmethod
def render(cls, message_data: dict[str, Any]) -> Static:
content = message_data.get("content", "")
def render(cls, tool_data: dict[str, Any]) -> Static:
content = tool_data.get("content", "")
if not content:
return Static("", classes=cls.css_classes)
return Static(Text(), classes=" ".join(cls.css_classes))
if len(content) > 300:
content = content[:297] + "..."
styled_text = cls._format_user_message(content)
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
formatted_content = f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
css_classes = " ".join(cls.css_classes)
return Static(formatted_content, classes=css_classes)
return Static(styled_text, classes=" ".join(cls.css_classes))
@classmethod
def render_simple(cls, content: str) -> str:
def render_simple(cls, content: str) -> Text:
if not content:
return ""
return Text()
if len(content) > 300:
content = content[:297] + "..."
return cls._format_user_message(content)
@classmethod
def _format_user_message(cls, content: str) -> Text:
text = Text()
text.append("▍", style="#3b82f6")
text.append(" ")
text.append("You:", style="bold")
text.append("\n")
lines = content.split("\n")
bordered_lines = [f"[#3b82f6]▍[/#3b82f6] {line}" for line in lines]
bordered_content = "\n".join(bordered_lines)
return f"[#3b82f6]▍[/#3b82f6] [bold]You:[/]\n{bordered_content}"
for i, line in enumerate(lines):
if i > 0:
text.append("\n")
text.append("▍", style="#3b82f6")
text.append(" ")
text.append(line)
return text
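
The user-message change above keeps two rules: content longer than 300 characters is cut to 297 characters plus `...`, and every line is prefixed with a border glyph. A rough sketch of those rules on plain strings follows. The name `format_user_message_plain` and the `limit` parameter are assumptions made for illustration, and colour styling is left out.

```python
def format_user_message_plain(content: str, limit: int = 300) -> str:
    """Hypothetical plain-string version of the truncate-and-border logic."""
    if not content:
        return ""
    if len(content) > limit:
        # Reserve three characters for the ellipsis so the result is exactly `limit` long.
        content = content[: limit - 3] + "..."

    lines = ["▍ You:"]
    lines.extend(f"▍ {line}" for line in content.split("\n"))
    return "\n".join(lines)
```

Building a `Text` object instead of a markup string, as the diff does, avoids having to escape user input that happens to contain square brackets.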


@@ -1,5 +1,6 @@
from typing import Any, ClassVar
from rich.text import Text
from textual.widgets import Static
from .base_renderer import BaseToolRenderer
@@ -16,13 +17,13 @@ class WebSearchRenderer(BaseToolRenderer):
args = tool_data.get("args", {})
query = args.get("query", "")
header = "🌐 [bold #60a5fa]Searching the web...[/]"
text = Text()
text.append("🌐 ")
text.append("Searching the web...", style="bold #60a5fa")
if query:
query_display = query[:100] + "..." if len(query) > 100 else query
content_text = f"{header}\n [dim]{cls.escape_markup(query_display)}[/]"
else:
content_text = f"{header}"
text.append("\n ")
text.append(query_display, style="dim")
css_classes = cls.get_css_classes("completed")
return Static(content_text, classes=css_classes)
return Static(text, classes=css_classes)

File diff suppressed because it is too large


@@ -1,3 +1,4 @@
import ipaddress
import re
import secrets
import shutil
@@ -37,14 +38,168 @@ def get_severity_color(severity: str) -> str:
return severity_colors.get(severity, "#6b7280")
def build_stats_text(tracer: Any) -> Text:
stats_text = Text()
if not tracer:
return stats_text
def get_cvss_color(cvss_score: float) -> str:
if cvss_score >= 9.0:
return "#dc2626"
if cvss_score >= 7.0:
return "#ea580c"
if cvss_score >= 4.0:
return "#d97706"
if cvss_score >= 0.1:
return "#65a30d"
return "#6b7280"
def format_vulnerability_report(report: dict[str, Any]) -> Text: # noqa: PLR0912, PLR0915
"""Format a vulnerability report for CLI display with all rich fields."""
field_style = "bold #4ade80"
text = Text()
title = report.get("title", "")
if title:
text.append("Vulnerability Report", style="bold #ea580c")
text.append("\n\n")
text.append("Title: ", style=field_style)
text.append(title)
severity = report.get("severity", "")
if severity:
text.append("\n\n")
text.append("Severity: ", style=field_style)
severity_color = get_severity_color(severity.lower())
text.append(severity.upper(), style=f"bold {severity_color}")
cvss = report.get("cvss")
if cvss is not None:
text.append("\n\n")
text.append("CVSS Score: ", style=field_style)
cvss_color = get_cvss_color(cvss)
text.append(f"{cvss:.1f}", style=f"bold {cvss_color}")
target = report.get("target")
if target:
text.append("\n\n")
text.append("Target: ", style=field_style)
text.append(target)
endpoint = report.get("endpoint")
if endpoint:
text.append("\n\n")
text.append("Endpoint: ", style=field_style)
text.append(endpoint)
method = report.get("method")
if method:
text.append("\n\n")
text.append("Method: ", style=field_style)
text.append(method)
cve = report.get("cve")
if cve:
text.append("\n\n")
text.append("CVE: ", style=field_style)
text.append(cve)
cvss_breakdown = report.get("cvss_breakdown", {})
if cvss_breakdown:
text.append("\n\n")
cvss_parts = []
if cvss_breakdown.get("attack_vector"):
cvss_parts.append(f"AV:{cvss_breakdown['attack_vector']}")
if cvss_breakdown.get("attack_complexity"):
cvss_parts.append(f"AC:{cvss_breakdown['attack_complexity']}")
if cvss_breakdown.get("privileges_required"):
cvss_parts.append(f"PR:{cvss_breakdown['privileges_required']}")
if cvss_breakdown.get("user_interaction"):
cvss_parts.append(f"UI:{cvss_breakdown['user_interaction']}")
if cvss_breakdown.get("scope"):
cvss_parts.append(f"S:{cvss_breakdown['scope']}")
if cvss_breakdown.get("confidentiality"):
cvss_parts.append(f"C:{cvss_breakdown['confidentiality']}")
if cvss_breakdown.get("integrity"):
cvss_parts.append(f"I:{cvss_breakdown['integrity']}")
if cvss_breakdown.get("availability"):
cvss_parts.append(f"A:{cvss_breakdown['availability']}")
if cvss_parts:
text.append("CVSS Vector: ", style=field_style)
text.append("/".join(cvss_parts), style="dim")
description = report.get("description")
if description:
text.append("\n\n")
text.append("Description", style=field_style)
text.append("\n")
text.append(description)
impact = report.get("impact")
if impact:
text.append("\n\n")
text.append("Impact", style=field_style)
text.append("\n")
text.append(impact)
technical_analysis = report.get("technical_analysis")
if technical_analysis:
text.append("\n\n")
text.append("Technical Analysis", style=field_style)
text.append("\n")
text.append(technical_analysis)
poc_description = report.get("poc_description")
if poc_description:
text.append("\n\n")
text.append("PoC Description", style=field_style)
text.append("\n")
text.append(poc_description)
poc_script_code = report.get("poc_script_code")
if poc_script_code:
text.append("\n\n")
text.append("PoC Code", style=field_style)
text.append("\n")
text.append(poc_script_code, style="dim")
code_file = report.get("code_file")
if code_file:
text.append("\n\n")
text.append("Code File: ", style=field_style)
text.append(code_file)
code_before = report.get("code_before")
if code_before:
text.append("\n\n")
text.append("Code Before", style=field_style)
text.append("\n")
text.append(code_before, style="dim")
code_after = report.get("code_after")
if code_after:
text.append("\n\n")
text.append("Code After", style=field_style)
text.append("\n")
text.append(code_after, style="dim")
code_diff = report.get("code_diff")
if code_diff:
text.append("\n\n")
text.append("Code Diff", style=field_style)
text.append("\n")
text.append(code_diff, style="dim")
remediation_steps = report.get("remediation_steps")
if remediation_steps:
text.append("\n\n")
text.append("Remediation", style=field_style)
text.append("\n")
text.append(remediation_steps)
return text
def _build_vulnerability_stats(stats_text: Text, tracer: Any) -> None:
"""Build vulnerability section of stats text."""
vuln_count = len(tracer.vulnerability_reports)
tool_count = tracer.get_real_tool_count()
agent_count = len(tracer.agents)
if vuln_count > 0:
severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
@@ -80,68 +235,219 @@ def build_stats_text(tracer: Any) -> Text:
stats_text.append(" (No exploitable vulnerabilities detected)", style="dim green")
stats_text.append("\n")
def _build_llm_stats(stats_text: Text, total_stats: dict[str, Any]) -> None:
"""Build LLM usage section of stats text."""
if total_stats["requests"] > 0:
stats_text.append("\n")
stats_text.append("📥 Input Tokens: ", style="bold cyan")
stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
if total_stats["cached_tokens"] > 0:
stats_text.append("", style="dim white")
stats_text.append("⚡ Cached Tokens: ", style="bold green")
stats_text.append(format_token_count(total_stats["cached_tokens"]), style="bold white")
stats_text.append("", style="dim white")
stats_text.append("📤 Output Tokens: ", style="bold cyan")
stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
if total_stats["cost"] > 0:
stats_text.append("", style="dim white")
stats_text.append("💰 Total Cost: ", style="bold cyan")
stats_text.append(f"${total_stats['cost']:.4f}", style="bold yellow")
else:
stats_text.append("\n")
stats_text.append("💰 Total Cost: ", style="bold cyan")
stats_text.append("$0.0000 ", style="bold yellow")
stats_text.append("", style="bold white")
stats_text.append("📊 Tokens: ", style="bold cyan")
stats_text.append("0", style="bold white")
def build_final_stats_text(tracer: Any) -> Text:
"""Build stats text for final output with detailed messages and LLM usage."""
stats_text = Text()
if not tracer:
return stats_text
_build_vulnerability_stats(stats_text, tracer)
tool_count = tracer.get_real_tool_count()
agent_count = len(tracer.agents)
stats_text.append("🤖 Agents Used: ", style="bold cyan")
stats_text.append(str(agent_count), style="bold white")
stats_text.append("", style="dim white")
stats_text.append("🛠️ Tools Called: ", style="bold cyan")
stats_text.append(str(tool_count), style="bold white")
llm_stats = tracer.get_total_llm_stats()
_build_llm_stats(stats_text, llm_stats["total"])
return stats_text
def build_llm_stats_text(tracer: Any) -> Text:
llm_stats_text = Text()
def build_live_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
stats_text = Text()
if not tracer:
return llm_stats_text
return stats_text
if agent_config:
llm_config = agent_config["llm_config"]
model = getattr(llm_config, "model_name", "Unknown")
stats_text.append(f"🧠 Model: {model}")
stats_text.append("\n")
vuln_count = len(tracer.vulnerability_reports)
tool_count = tracer.get_real_tool_count()
agent_count = len(tracer.agents)
stats_text.append("🔍 Vulnerabilities: ", style="bold white")
stats_text.append(f"{vuln_count}", style="dim white")
stats_text.append("\n")
if vuln_count > 0:
severity_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
for report in tracer.vulnerability_reports:
severity = report.get("severity", "").lower()
if severity in severity_counts:
severity_counts[severity] += 1
severity_parts = []
for severity in ["critical", "high", "medium", "low", "info"]:
count = severity_counts[severity]
if count > 0:
severity_color = get_severity_color(severity)
severity_text = Text()
severity_text.append(f"{severity.upper()}: ", style=severity_color)
severity_text.append(str(count), style=f"bold {severity_color}")
severity_parts.append(severity_text)
for i, part in enumerate(severity_parts):
stats_text.append(part)
if i < len(severity_parts) - 1:
stats_text.append(" | ", style="dim white")
stats_text.append("\n")
stats_text.append("🤖 Agents: ", style="bold white")
stats_text.append(str(agent_count), style="dim white")
stats_text.append("", style="dim white")
stats_text.append("🛠️ Tools: ", style="bold white")
stats_text.append(str(tool_count), style="dim white")
llm_stats = tracer.get_total_llm_stats()
total_stats = llm_stats["total"]
if total_stats["requests"] > 0:
llm_stats_text.append("📥 Input Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["input_tokens"]), style="bold white")
stats_text.append("\n")
if total_stats["cached_tokens"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("⚡ Cached: ", style="bold green")
llm_stats_text.append(
format_token_count(total_stats["cached_tokens"]), style="bold green"
)
stats_text.append("📥 Input: ", style="bold white")
stats_text.append(format_token_count(total_stats["input_tokens"]), style="dim white")
llm_stats_text.append("", style="dim white")
llm_stats_text.append("📤 Output Tokens: ", style="bold cyan")
llm_stats_text.append(format_token_count(total_stats["output_tokens"]), style="bold white")
stats_text.append("", style="dim white")
stats_text.append(" ", style="bold white")
stats_text.append("Cached: ", style="bold white")
stats_text.append(format_token_count(total_stats["cached_tokens"]), style="dim white")
if total_stats["cost"] > 0:
llm_stats_text.append("", style="dim white")
llm_stats_text.append("💰 Total Cost: $", style="bold cyan")
llm_stats_text.append(f"{total_stats['cost']:.4f}", style="bold yellow")
stats_text.append("\n")
return llm_stats_text
stats_text.append("📤 Output: ", style="bold white")
stats_text.append(format_token_count(total_stats["output_tokens"]), style="dim white")
stats_text.append("", style="dim white")
stats_text.append("💰 Cost: ", style="bold white")
stats_text.append(f"${total_stats['cost']:.4f}", style="dim white")
return stats_text
def build_tui_stats_text(tracer: Any, agent_config: dict[str, Any] | None = None) -> Text:
stats_text = Text()
if not tracer:
return stats_text
if agent_config:
llm_config = agent_config["llm_config"]
model = getattr(llm_config, "model_name", "Unknown")
stats_text.append(model, style="dim")
llm_stats = tracer.get_total_llm_stats()
total_stats = llm_stats["total"]
total_tokens = total_stats["input_tokens"] + total_stats["output_tokens"]
if total_tokens > 0:
stats_text.append("\n")
stats_text.append(f"{format_token_count(total_tokens)} tokens", style="dim")
if total_stats["cost"] > 0:
stats_text.append("\n")
stats_text.append(f"${total_stats['cost']:.2f} spent", style="dim")
return stats_text
# Name generation utilities
def generate_run_name() -> str:
# fmt: off
adjectives = [
"stealthy", "sneaky", "crafty", "elite", "phantom", "shadow", "silent",
"rogue", "covert", "ninja", "ghost", "cyber", "digital", "binary",
"encrypted", "obfuscated", "masked", "cloaked", "invisible", "anonymous"
]
nouns = [
"exploit", "payload", "backdoor", "rootkit", "keylogger", "botnet", "trojan",
"worm", "virus", "packet", "buffer", "shell", "daemon", "spider", "crawler",
"scanner", "sniffer", "honeypot", "firewall", "breach"
]
# fmt: on
adj = secrets.choice(adjectives)
noun = secrets.choice(nouns)
number = secrets.randbelow(900) + 100
return f"{adj}-{noun}-{number}"
def _slugify_for_run_name(text: str, max_length: int = 32) -> str:
text = text.lower().strip()
text = re.sub(r"[^a-z0-9]+", "-", text)
text = text.strip("-")
if len(text) > max_length:
text = text[:max_length].rstrip("-")
return text or "pentest"
def _derive_target_label_for_run_name(targets_info: list[dict[str, Any]] | None) -> str: # noqa: PLR0911
if not targets_info:
return "pentest"
first = targets_info[0]
target_type = first.get("type")
details = first.get("details", {}) or {}
original = first.get("original", "") or ""
if target_type == "web_application":
url = details.get("target_url", original)
try:
parsed = urlparse(url)
return str(parsed.netloc or parsed.path or url)
except Exception: # noqa: BLE001
return str(url)
if target_type == "repository":
repo = details.get("target_repo", original)
parsed = urlparse(repo)
path = parsed.path or repo
name = path.rstrip("/").split("/")[-1] or path
if name.endswith(".git"):
name = name[:-4]
return str(name)
if target_type == "local_code":
path_str = details.get("target_path", original)
try:
return str(Path(path_str).name or path_str)
except Exception: # noqa: BLE001
return str(path_str)
if target_type == "ip_address":
return str(details.get("target_ip", original) or original)
return str(original or "pentest")
def generate_run_name(targets_info: list[dict[str, Any]] | None = None) -> str:
base_label = _derive_target_label_for_run_name(targets_info)
slug = _slugify_for_run_name(base_label)
random_suffix = secrets.token_hex(2)
return f"{slug}_{random_suffix}"
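
The new `generate_run_name` derives a label from the first target, slugifies it, and appends a short random suffix. A self-contained sketch of the slug and suffix steps follows. The function names here are assumptions made for illustration, and the target-label derivation is skipped.

```python
import re
import secrets


def slugify_label(text: str, max_length: int = 32) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', trim, and cap length."""
    text = text.lower().strip()
    text = re.sub(r"[^a-z0-9]+", "-", text)
    text = text.strip("-")
    if len(text) > max_length:
        text = text[:max_length].rstrip("-")
    # An empty slug falls back to a generic label.
    return text or "pentest"


def run_name_for(label: str) -> str:
    """Join the slug with four random hex characters (two random bytes)."""
    return f"{slugify_label(label)}_{secrets.token_hex(2)}"
```

For example, `slugify_label("https://Example.com/app")` gives `https-example-com-app`, and the random suffix differs on each call.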
# Target processing utilities
def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
def infer_target_type(target: str) -> tuple[str, dict[str, str]]: # noqa: PLR0911
if not target or not isinstance(target, str):
raise ValueError("Target must be a non-empty string")
@@ -167,6 +473,13 @@ def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
return "repository", {"target_repo": target}
return "web_application", {"target_url": target}
try:
ip_obj = ipaddress.ip_address(target)
except ValueError:
pass
else:
return "ip_address", {"target_ip": str(ip_obj)}
path = Path(target).expanduser()
try:
if path.exists():
@@ -191,7 +504,8 @@ def infer_target_type(target: str) -> tuple[str, dict[str, str]]:
"- A valid URL (http:// or https://)\n"
"- A Git repository URL (https://github.com/... or git@github.com:...)\n"
"- A local directory path\n"
"- A domain name (e.g., example.com)"
"- A domain name (e.g., example.com)\n"
"- An IP address (e.g., 192.168.1.10)"
)
@@ -274,6 +588,47 @@ def collect_local_sources(targets_info: list[dict[str, Any]]) -> list[dict[str,
return local_sources
def _is_localhost_host(host: str) -> bool:
host_lower = host.lower().strip("[]")
if host_lower in ("localhost", "0.0.0.0", "::1"): # nosec B104
return True
try:
ip = ipaddress.ip_address(host_lower)
if isinstance(ip, ipaddress.IPv4Address):
return ip.is_loopback # 127.0.0.0/8
if isinstance(ip, ipaddress.IPv6Address):
return ip.is_loopback # ::1
except ValueError:
pass
return False
def rewrite_localhost_targets(targets_info: list[dict[str, Any]], host_gateway: str) -> None:
from yarl import URL # type: ignore[import-not-found]
for target_info in targets_info:
target_type = target_info.get("type")
details = target_info.get("details", {})
if target_type == "web_application":
target_url = details.get("target_url", "")
try:
url = URL(target_url)
except (ValueError, TypeError):
continue
if url.host and _is_localhost_host(url.host):
details["target_url"] = str(url.with_host(host_gateway))
elif target_type == "ip_address":
target_ip = details.get("target_ip", "")
if target_ip and _is_localhost_host(target_ip):
details["target_ip"] = host_gateway
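
The localhost check above combines a short list of literal names with the standard library's loopback test. A condensed sketch of that check follows. The name `is_localhost_host` without the underscore is used only for illustration.

```python
import ipaddress


def is_localhost_host(host: str) -> bool:
    """Return True for localhost names and loopback IPv4/IPv6 addresses."""
    # Strip the brackets used around IPv6 literals in URLs, e.g. "[::1]".
    host_lower = host.lower().strip("[]")
    if host_lower in ("localhost", "0.0.0.0", "::1"):
        return True
    try:
        # is_loopback covers 127.0.0.0/8 for IPv4 and ::1 for IPv6.
        return ipaddress.ip_address(host_lower).is_loopback
    except ValueError:
        # Not an IP literal (for example a domain name).
        return False
```

`0.0.0.0` is listed explicitly because `ipaddress` does not treat it as a loopback address.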
# Repository utilities
def clone_repository(repo_url: str, run_name: str, dest_name: str | None = None) -> str:
console = Console()
@@ -364,9 +719,10 @@ def check_docker_connection() -> Any:
error_text.append("DOCKER NOT AVAILABLE", style="bold red")
error_text.append("\n\n", style="white")
error_text.append("Cannot connect to Docker daemon.\n", style="white")
error_text.append("Please ensure Docker is installed and running.\n\n", style="white")
error_text.append("Try running: ", style="dim white")
error_text.append("sudo systemctl start docker", style="dim cyan")
error_text.append(
"Please ensure Docker Desktop is installed and running, and try running strix again.\n",
style="white",
)
panel = Panel(
error_text,


@@ -1,3 +1,6 @@
import logging
import warnings
import litellm
from .config import LLMConfig
@@ -11,5 +14,6 @@ __all__ = [
]
litellm._logging._disable_debugging()
litellm.drop_params = True
logging.getLogger("asyncio").setLevel(logging.CRITICAL)
logging.getLogger("asyncio").propagate = False
warnings.filterwarnings("ignore", category=RuntimeWarning, module="asyncio")


@@ -1,19 +1,23 @@
import os
from strix.config import Config
class LLMConfig:
def __init__(
self,
model_name: str | None = None,
temperature: float = 0,
enable_prompt_caching: bool = True,
prompt_modules: list[str] | None = None,
skills: list[str] | None = None,
timeout: int | None = None,
scan_mode: str = "deep",
):
self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
self.model_name = model_name or Config.get("strix_llm")
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
self.temperature = max(0.0, min(1.0, temperature))
self.enable_prompt_caching = enable_prompt_caching
self.prompt_modules = prompt_modules or []
self.skills = skills or []
self.timeout = timeout or int(Config.get("llm_timeout") or "300")
self.scan_mode = scan_mode if scan_mode in ["quick", "standard", "deep"] else "deep"
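
`LLMConfig` normalises two of its inputs instead of rejecting them: temperature is clamped to the range 0.0 to 1.0, and an unrecognised scan mode falls back to `deep`. A small sketch of those two rules in isolation follows. The helper names are assumptions made for illustration and are not part of the class.

```python
VALID_SCAN_MODES = ("quick", "standard", "deep")


def clamp_temperature(temperature: float) -> float:
    """Clamp a temperature into the inclusive range [0.0, 1.0]."""
    return max(0.0, min(1.0, temperature))


def normalize_scan_mode(scan_mode: str) -> str:
    """Return the scan mode unchanged if known, otherwise fall back to 'deep'."""
    return scan_mode if scan_mode in VALID_SCAN_MODES else "deep"
```

Silently correcting values keeps construction from failing on a typo, at the cost of hiding the mistake from the caller.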

strix/llm/dedupe.py Normal file

@@ -0,0 +1,218 @@
import json
import logging
import re
from typing import Any
import litellm
from strix.config import Config
logger = logging.getLogger(__name__)
DEDUPE_SYSTEM_PROMPT = """You are an expert vulnerability report deduplication judge.
Your task is to determine if a candidate vulnerability report describes the SAME vulnerability
as any existing report.
CRITICAL DEDUPLICATION RULES:
1. SAME VULNERABILITY means:
- Same root cause (e.g., "missing input validation" not just "SQL injection")
- Same affected component/endpoint/file (exact match or clear overlap)
- Same exploitation method or attack vector
- Would be fixed by the same code change/patch
2. NOT DUPLICATES if:
- Different endpoints even with same vulnerability type (e.g., SQLi in /login vs /search)
- Different parameters in same endpoint (e.g., XSS in 'name' vs 'comment' field)
- Different root causes (e.g., stored XSS vs reflected XSS in same field)
- Different severity levels due to different impact
- One is authenticated, other is unauthenticated
3. ARE DUPLICATES even if:
- Titles are worded differently
- Descriptions have different level of detail
- PoC uses different payloads but exploits same issue
- One report is more thorough than another
- Minor variations in technical analysis
COMPARISON GUIDELINES:
- Focus on the technical root cause, not surface-level similarities
- Same vulnerability type (SQLi, XSS) doesn't mean duplicate - location matters
- Consider the fix: would fixing one also fix the other?
- When uncertain, lean towards NOT duplicate
FIELDS TO ANALYZE:
- title, description: General vulnerability info
- target, endpoint, method: Exact location of vulnerability
- technical_analysis: Root cause details
- poc_description: How it's exploited
- impact: What damage it can cause
YOU MUST RESPOND WITH EXACTLY THIS XML FORMAT AND NOTHING ELSE:
<dedupe_result>
<is_duplicate>true</is_duplicate>
<duplicate_id>vuln-0001</duplicate_id>
<confidence>0.95</confidence>
<reason>Both reports describe SQL injection in /api/login via the username parameter</reason>
</dedupe_result>
OR if not a duplicate:
<dedupe_result>
<is_duplicate>false</is_duplicate>
<duplicate_id></duplicate_id>
<confidence>0.90</confidence>
<reason>Different endpoints: candidate is /api/search, existing is /api/login</reason>
</dedupe_result>
RULES:
- is_duplicate MUST be exactly "true" or "false" (lowercase)
- duplicate_id MUST be the exact ID from existing reports or empty if not duplicate
- confidence MUST be a decimal (your confidence level in the decision)
- reason MUST be a specific explanation mentioning endpoint/parameter/root cause
- DO NOT include any text outside the <dedupe_result> tags"""
def _prepare_report_for_comparison(report: dict[str, Any]) -> dict[str, Any]:
relevant_fields = [
"id",
"title",
"description",
"impact",
"target",
"technical_analysis",
"poc_description",
"endpoint",
"method",
]
cleaned = {}
for field in relevant_fields:
if report.get(field):
value = report[field]
if isinstance(value, str) and len(value) > 8000:
value = value[:8000] + "...[truncated]"
cleaned[field] = value
return cleaned
def _extract_xml_field(content: str, field: str) -> str:
pattern = rf"<{field}>(.*?)</{field}>"
match = re.search(pattern, content, re.DOTALL | re.IGNORECASE)
if match:
return match.group(1).strip()
return ""
def _parse_dedupe_response(content: str) -> dict[str, Any]:
result_match = re.search(
r"<dedupe_result>(.*?)</dedupe_result>", content, re.DOTALL | re.IGNORECASE
)
if not result_match:
logger.warning(f"No <dedupe_result> block found in response: {content[:500]}")
raise ValueError("No <dedupe_result> block found in response")
result_content = result_match.group(1)
is_duplicate_str = _extract_xml_field(result_content, "is_duplicate")
duplicate_id = _extract_xml_field(result_content, "duplicate_id")
confidence_str = _extract_xml_field(result_content, "confidence")
reason = _extract_xml_field(result_content, "reason")
is_duplicate = is_duplicate_str.lower() == "true"
try:
confidence = float(confidence_str) if confidence_str else 0.0
except ValueError:
confidence = 0.0
return {
"is_duplicate": is_duplicate,
"duplicate_id": duplicate_id[:64] if duplicate_id else "",
"confidence": confidence,
"reason": reason[:500] if reason else "",
}
def check_duplicate(
candidate: dict[str, Any], existing_reports: list[dict[str, Any]]
) -> dict[str, Any]:
if not existing_reports:
return {
"is_duplicate": False,
"duplicate_id": "",
"confidence": 1.0,
"reason": "No existing reports to compare against",
}
try:
candidate_cleaned = _prepare_report_for_comparison(candidate)
existing_cleaned = [_prepare_report_for_comparison(r) for r in existing_reports]
comparison_data = {"candidate": candidate_cleaned, "existing_reports": existing_cleaned}
model_name = Config.get("strix_llm")
api_key = Config.get("llm_api_key")
api_base = (
Config.get("llm_api_base")
or Config.get("openai_api_base")
or Config.get("litellm_base_url")
or Config.get("ollama_api_base")
)
messages = [
{"role": "system", "content": DEDUPE_SYSTEM_PROMPT},
{
"role": "user",
"content": (
f"Compare this candidate vulnerability against existing reports:\n\n"
f"{json.dumps(comparison_data, indent=2)}\n\n"
f"Respond with ONLY the <dedupe_result> XML block."
),
},
]
completion_kwargs: dict[str, Any] = {
"model": model_name,
"messages": messages,
"timeout": 120,
"temperature": 0,
}
if api_key:
completion_kwargs["api_key"] = api_key
if api_base:
completion_kwargs["api_base"] = api_base
response = litellm.completion(**completion_kwargs)
content = response.choices[0].message.content
if not content:
return {
"is_duplicate": False,
"duplicate_id": "",
"confidence": 0.0,
"reason": "Empty response from LLM",
}
result = _parse_dedupe_response(content)
logger.info(
f"Deduplication check: is_duplicate={result['is_duplicate']}, "
f"confidence={result['confidence']}, reason={result['reason'][:100]}"
)
except Exception as e:
logger.exception("Error during vulnerability deduplication check")
return {
"is_duplicate": False,
"duplicate_id": "",
"confidence": 0.0,
"reason": f"Deduplication check failed: {e}",
"error": str(e),
}
else:
return result
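
The parsing half of `dedupe.py` can be tried without an LLM call. The sketch below mirrors the regex-based extraction shown above on a hand-written sample response. The sample text and the function names without leading underscores are assumptions made for illustration, and the length caps on `duplicate_id` and `reason` are left out.

```python
import re
from typing import Any


def extract_xml_field(content: str, field: str) -> str:
    """Return the stripped text of the first <field>...</field> pair, or ''."""
    match = re.search(rf"<{field}>(.*?)</{field}>", content, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else ""


def parse_dedupe_response(content: str) -> dict[str, Any]:
    """Parse the <dedupe_result> block, ignoring any text around it."""
    block = re.search(
        r"<dedupe_result>(.*?)</dedupe_result>", content, re.DOTALL | re.IGNORECASE
    )
    if not block:
        raise ValueError("No <dedupe_result> block found in response")
    body = block.group(1)

    confidence_str = extract_xml_field(body, "confidence")
    try:
        confidence = float(confidence_str) if confidence_str else 0.0
    except ValueError:
        # A non-numeric confidence is treated as zero rather than an error.
        confidence = 0.0

    return {
        "is_duplicate": extract_xml_field(body, "is_duplicate").lower() == "true",
        "duplicate_id": extract_xml_field(body, "duplicate_id"),
        "confidence": confidence,
        "reason": extract_xml_field(body, "reason"),
    }


# Hypothetical model output with stray text before the XML block.
SAMPLE_RESPONSE = """Sure, here is the result:
<dedupe_result>
<is_duplicate>true</is_duplicate>
<duplicate_id>vuln-0001</duplicate_id>
<confidence>0.95</confidence>
<reason>Same SQL injection in /api/login via the username parameter</reason>
</dedupe_result>"""

parsed = parse_dedupe_response(SAMPLE_RESPONSE)
```

Anything other than the literal `true` in `is_duplicate` is read as not a duplicate, which matches the prompt's instruction to lean towards "not duplicate" when uncertain.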


@@ -1,5 +1,6 @@
import asyncio
import logging
import os
from collections.abc import AsyncIterator
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
@@ -11,31 +12,48 @@ from jinja2 import (
FileSystemLoader,
select_autoescape,
)
from litellm import ModelResponse, completion_cost
from litellm.utils import supports_prompt_caching
from litellm import completion_cost, stream_chunk_builder, supports_reasoning
from litellm.utils import supports_prompt_caching, supports_vision
from strix.config import Config
from strix.llm.config import LLMConfig
from strix.llm.memory_compressor import MemoryCompressor
from strix.llm.request_queue import get_global_queue
from strix.llm.utils import _truncate_to_first_function, parse_tool_invocations
from strix.prompts import load_prompt_modules
from strix.skills import load_skills
from strix.tools import get_tools_prompt
MAX_RETRIES = 5
RETRY_MULTIPLIER = 8
RETRY_MIN = 8
RETRY_MAX = 64
def _should_retry(exception: Exception) -> bool:
status_code = None
if hasattr(exception, "status_code"):
status_code = exception.status_code
elif hasattr(exception, "response") and hasattr(exception.response, "status_code"):
status_code = exception.response.status_code
if status_code is not None:
return bool(litellm._should_retry(status_code))
return True
logger = logging.getLogger(__name__)
api_key = os.getenv("LLM_API_KEY")
if api_key:
litellm.api_key = api_key
litellm.drop_params = True
litellm.modify_params = True
api_base = (
os.getenv("LLM_API_BASE")
or os.getenv("OPENAI_API_BASE")
or os.getenv("LITELLM_BASE_URL")
or os.getenv("OLLAMA_API_BASE")
_LLM_API_KEY = Config.get("llm_api_key")
_LLM_API_BASE = (
Config.get("llm_api_base")
or Config.get("openai_api_base")
or Config.get("litellm_base_url")
or Config.get("ollama_api_base")
)
if api_base:
litellm.api_base = api_base
_STRIX_REASONING_EFFORT = Config.get("strix_reasoning_effort")
class LLMRequestFailedError(Exception):
@@ -45,40 +63,6 @@ class LLMRequestFailedError(Exception):
self.details = details
MODELS_WITHOUT_STOP_WORDS = [
"gpt-5",
"gpt-5-mini",
"gpt-5-nano",
"o1-mini",
"o1-preview",
"o1",
"o1-2024-12-17",
"o3",
"o3-2025-04-16",
"o3-mini-2025-01-31",
"o3-mini",
"o4-mini",
"o4-mini-2025-04-16",
"grok-4-0709",
]
REASONING_EFFORT_SUPPORTED_MODELS = [
"gpt-5",
"gpt-5-mini",
"gpt-5-nano",
"o1-2024-12-17",
"o1",
"o3",
"o3-2025-04-16",
"o3-mini-2025-01-31",
"o3-mini",
"o4-mini",
"o4-mini-2025-04-16",
"gemini-2.5-flash",
"gemini-2.5-pro",
]
class StepRole(str, Enum):
AGENT = "agent"
USER = "user"
@@ -92,6 +76,7 @@ class LLMResponse:
scan_id: str | None = None
step_number: int = 1
role: StepRole = StepRole.AGENT
thinking_blocks: list[dict[str, Any]] | None = None # For reasoning models.
@dataclass
@@ -117,38 +102,52 @@ class RequestStats:
class LLM:
def __init__(self, config: LLMConfig, agent_name: str | None = None):
def __init__(
self, config: LLMConfig, agent_name: str | None = None, agent_id: str | None = None
):
self.config = config
self.agent_name = agent_name
self.agent_id = agent_id
self._total_stats = RequestStats()
self._last_request_stats = RequestStats()
self.memory_compressor = MemoryCompressor()
if _STRIX_REASONING_EFFORT:
self._reasoning_effort = _STRIX_REASONING_EFFORT
elif self.config.scan_mode == "quick":
self._reasoning_effort = "medium"
else:
self._reasoning_effort = "high"
self.memory_compressor = MemoryCompressor(
model_name=self.config.model_name,
timeout=self.config.timeout,
)
if agent_name:
prompt_dir = Path(__file__).parent.parent / "agents" / agent_name
prompts_dir = Path(__file__).parent.parent / "prompts"
skills_dir = Path(__file__).parent.parent / "skills"
loader = FileSystemLoader([prompt_dir, prompts_dir])
loader = FileSystemLoader([prompt_dir, skills_dir])
self.jinja_env = Environment(
loader=loader,
autoescape=select_autoescape(enabled_extensions=(), default_for_string=False),
)
try:
prompt_module_content = load_prompt_modules(
self.config.prompt_modules or [], self.jinja_env
)
skills_to_load = list(self.config.skills or [])
skills_to_load.append(f"scan_modes/{self.config.scan_mode}")
def get_module(name: str) -> str:
return prompt_module_content.get(name, "")
skill_content = load_skills(skills_to_load, self.jinja_env)
self.jinja_env.globals["get_module"] = get_module
def get_skill(name: str) -> str:
return skill_content.get(name, "")
self.jinja_env.globals["get_skill"] = get_skill
self.system_prompt = self.jinja_env.get_template("system_prompt.jinja").render(
get_tools_prompt=get_tools_prompt,
loaded_module_names=list(prompt_module_content.keys()),
**prompt_module_content,
loaded_skill_names=list(skill_content.keys()),
**skill_content,
)
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load system prompt for {agent_name}: {e}")
@@ -156,6 +155,31 @@ class LLM:
else:
self.system_prompt = "You are a helpful AI assistant."
def set_agent_identity(self, agent_name: str | None, agent_id: str | None) -> None:
if agent_name:
self.agent_name = agent_name
if agent_id:
self.agent_id = agent_id
def _build_identity_message(self) -> dict[str, Any] | None:
if not (self.agent_name and str(self.agent_name).strip()):
return None
identity_name = self.agent_name
identity_id = self.agent_id
content = (
"\n\n"
"<agent_identity>\n"
"<meta>Internal metadata: do not echo or reference; "
"not part of history or tool calls.</meta>\n"
"<note>You are now assuming the role of this agent. "
"Act strictly as this agent and maintain self-identity for this step. "
"Now go answer the next needed step!</note>\n"
f"<agent_name>{identity_name}</agent_name>\n"
f"<agent_id>{identity_id}</agent_id>\n"
"</agent_identity>\n\n"
)
return {"role": "user", "content": content}
def _add_cache_control_to_content(
self, content: str | list[dict[str, Any]]
) -> str | list[dict[str, Any]]:
@@ -223,94 +247,143 @@ class LLM:
return cached_messages
async def generate( # noqa: PLR0912, PLR0915
self,
conversation_history: list[dict[str, Any]],
scan_id: str | None = None,
step_number: int = 1,
) -> LLMResponse:
def _prepare_messages(self, conversation_history: list[dict[str, Any]]) -> list[dict[str, Any]]:
messages = [{"role": "system", "content": self.system_prompt}]
identity_message = self._build_identity_message()
if identity_message:
messages.append(identity_message)
compressed_history = list(self.memory_compressor.compress_history(conversation_history))
conversation_history.clear()
conversation_history.extend(compressed_history)
messages.extend(compressed_history)
cached_messages = self._prepare_cached_messages(messages)
return self._prepare_cached_messages(messages)
try:
response = await self._make_request(cached_messages)
self._update_usage_stats(response)
async def _stream_and_accumulate(
self,
messages: list[dict[str, Any]],
scan_id: str | None,
step_number: int,
) -> AsyncIterator[LLMResponse]:
accumulated_content = ""
chunks: list[Any] = []
content = ""
async for chunk in self._stream_request(messages):
chunks.append(chunk)
delta = self._extract_chunk_delta(chunk)
if delta:
accumulated_content += delta
if "</function>" in accumulated_content:
function_end = accumulated_content.find("</function>") + len("</function>")
accumulated_content = accumulated_content[:function_end]
yield LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content=accumulated_content,
tool_invocations=None,
)
if chunks:
complete_response = stream_chunk_builder(chunks)
self._update_usage_stats(complete_response)
accumulated_content = _truncate_to_first_function(accumulated_content)
if "</function>" in accumulated_content:
function_end = accumulated_content.find("</function>") + len("</function>")
accumulated_content = accumulated_content[:function_end]
tool_invocations = parse_tool_invocations(accumulated_content)
# Extract thinking blocks from the complete response if available
thinking_blocks = None
if chunks and self._should_include_reasoning_effort():
complete_response = stream_chunk_builder(chunks)
if (
response.choices
and hasattr(response.choices[0], "message")
and response.choices[0].message
hasattr(complete_response, "choices")
and complete_response.choices
and hasattr(complete_response.choices[0], "message")
):
content = getattr(response.choices[0].message, "content", "") or ""
message = complete_response.choices[0].message
if hasattr(message, "thinking_blocks") and message.thinking_blocks:
thinking_blocks = message.thinking_blocks
content = _truncate_to_first_function(content)
yield LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content=accumulated_content,
tool_invocations=tool_invocations if tool_invocations else None,
thinking_blocks=thinking_blocks,
)
if "</function>" in content:
function_end_index = content.find("</function>") + len("</function>")
content = content[:function_end_index]
def _raise_llm_error(self, e: Exception) -> None:
error_map: list[tuple[type, str]] = [
(litellm.RateLimitError, "Rate limit exceeded"),
(litellm.AuthenticationError, "Invalid API key"),
(litellm.NotFoundError, "Model not found"),
(litellm.ContextWindowExceededError, "Context too long"),
(litellm.ContentPolicyViolationError, "Content policy violation"),
(litellm.ServiceUnavailableError, "Service unavailable"),
(litellm.Timeout, "Request timed out"),
(litellm.UnprocessableEntityError, "Unprocessable entity"),
(litellm.InternalServerError, "Internal server error"),
(litellm.APIConnectionError, "Connection error"),
(litellm.UnsupportedParamsError, "Unsupported parameters"),
(litellm.BudgetExceededError, "Budget exceeded"),
(litellm.APIResponseValidationError, "Response validation error"),
(litellm.JSONSchemaValidationError, "JSON schema validation error"),
(litellm.InvalidRequestError, "Invalid request"),
(litellm.BadRequestError, "Bad request"),
(litellm.APIError, "API error"),
(litellm.OpenAIError, "OpenAI error"),
]
tool_invocations = parse_tool_invocations(content)
from strix.telemetry import posthog
return LLMResponse(
scan_id=scan_id,
step_number=step_number,
role=StepRole.AGENT,
content=content,
tool_invocations=tool_invocations if tool_invocations else None,
)
for error_type, message in error_map:
if isinstance(e, error_type):
posthog.error(f"llm_{error_type.__name__}", message)
raise LLMRequestFailedError(f"LLM request failed: {message}", str(e)) from e
except litellm.RateLimitError as e:
raise LLMRequestFailedError("LLM request failed: Rate limit exceeded", str(e)) from e
except litellm.AuthenticationError as e:
raise LLMRequestFailedError("LLM request failed: Invalid API key", str(e)) from e
except litellm.NotFoundError as e:
raise LLMRequestFailedError("LLM request failed: Model not found", str(e)) from e
except litellm.ContextWindowExceededError as e:
raise LLMRequestFailedError("LLM request failed: Context too long", str(e)) from e
except litellm.ContentPolicyViolationError as e:
raise LLMRequestFailedError(
"LLM request failed: Content policy violation", str(e)
) from e
except litellm.ServiceUnavailableError as e:
raise LLMRequestFailedError("LLM request failed: Service unavailable", str(e)) from e
except litellm.Timeout as e:
raise LLMRequestFailedError("LLM request failed: Request timed out", str(e)) from e
except litellm.UnprocessableEntityError as e:
raise LLMRequestFailedError("LLM request failed: Unprocessable entity", str(e)) from e
except litellm.InternalServerError as e:
raise LLMRequestFailedError("LLM request failed: Internal server error", str(e)) from e
except litellm.APIConnectionError as e:
raise LLMRequestFailedError("LLM request failed: Connection error", str(e)) from e
except litellm.UnsupportedParamsError as e:
raise LLMRequestFailedError("LLM request failed: Unsupported parameters", str(e)) from e
except litellm.BudgetExceededError as e:
raise LLMRequestFailedError("LLM request failed: Budget exceeded", str(e)) from e
except litellm.APIResponseValidationError as e:
raise LLMRequestFailedError(
"LLM request failed: Response validation error", str(e)
) from e
except litellm.JSONSchemaValidationError as e:
raise LLMRequestFailedError(
"LLM request failed: JSON schema validation error", str(e)
) from e
except litellm.InvalidRequestError as e:
raise LLMRequestFailedError("LLM request failed: Invalid request", str(e)) from e
except litellm.BadRequestError as e:
raise LLMRequestFailedError("LLM request failed: Bad request", str(e)) from e
except litellm.APIError as e:
raise LLMRequestFailedError("LLM request failed: API error", str(e)) from e
except litellm.OpenAIError as e:
raise LLMRequestFailedError("LLM request failed: OpenAI error", str(e)) from e
except Exception as e:
raise LLMRequestFailedError(f"LLM request failed: {type(e).__name__}", str(e)) from e
posthog.error("llm_unknown_error", type(e).__name__)
raise LLMRequestFailedError(f"LLM request failed: {type(e).__name__}", str(e)) from e
async def generate(
self,
conversation_history: list[dict[str, Any]],
scan_id: str | None = None,
step_number: int = 1,
) -> AsyncIterator[LLMResponse]:
messages = self._prepare_messages(conversation_history)
last_error: Exception | None = None
for attempt in range(MAX_RETRIES):
try:
async for response in self._stream_and_accumulate(messages, scan_id, step_number):
yield response
return # noqa: TRY300
except Exception as e: # noqa: BLE001
last_error = e
if not _should_retry(e) or attempt == MAX_RETRIES - 1:
break
wait_time = min(RETRY_MAX, RETRY_MULTIPLIER * (2**attempt))
wait_time = max(RETRY_MIN, wait_time)
await asyncio.sleep(wait_time)
if last_error:
self._raise_llm_error(last_error)
def _extract_chunk_delta(self, chunk: Any) -> str:
if chunk.choices and hasattr(chunk.choices[0], "delta"):
delta = chunk.choices[0].delta
return getattr(delta, "content", "") or ""
return ""
@property
def usage_stats(self) -> dict[str, dict[str, int | float]]:
@@ -325,58 +398,93 @@ class LLM:
"supported": supports_prompt_caching(self.config.model_name),
}
def _should_include_stop_param(self) -> bool:
if not self.config.model_name:
return True
actual_model_name = self.config.model_name.split("/")[-1].lower()
model_name_lower = self.config.model_name.lower()
return not any(
actual_model_name == unsupported_model.lower()
or model_name_lower == unsupported_model.lower()
for unsupported_model in MODELS_WITHOUT_STOP_WORDS
)
def _should_include_reasoning_effort(self) -> bool:
if not self.config.model_name:
return False
try:
return bool(supports_reasoning(model=self.config.model_name))
except Exception: # noqa: BLE001
return False
actual_model_name = self.config.model_name.split("/")[-1].lower()
model_name_lower = self.config.model_name.lower()
def _model_supports_vision(self) -> bool:
if not self.config.model_name:
return False
try:
return bool(supports_vision(model=self.config.model_name))
except Exception: # noqa: BLE001
return False
return any(
actual_model_name == supported_model.lower()
or model_name_lower == supported_model.lower()
for supported_model in REASONING_EFFORT_SUPPORTED_MODELS
)
def _filter_images_from_messages(self, messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
filtered_messages = []
for msg in messages:
content = msg.get("content")
updated_msg = msg
if isinstance(content, list):
filtered_content = []
for item in content:
if isinstance(item, dict):
if item.get("type") == "image_url":
filtered_content.append(
{
"type": "text",
"text": "[Screenshot removed - model does not support "
"vision. Use view_source or execute_js instead.]",
}
)
else:
filtered_content.append(item)
else:
filtered_content.append(item)
if filtered_content:
text_parts = [
item.get("text", "") if isinstance(item, dict) else str(item)
for item in filtered_content
]
all_text = all(
isinstance(item, dict) and item.get("type") == "text"
for item in filtered_content
)
if all_text:
updated_msg = {**msg, "content": "\n".join(text_parts)}
else:
updated_msg = {**msg, "content": filtered_content}
else:
updated_msg = {**msg, "content": ""}
filtered_messages.append(updated_msg)
return filtered_messages
async def _make_request(
async def _stream_request(
self,
messages: list[dict[str, Any]],
) -> ModelResponse:
) -> AsyncIterator[Any]:
if not self._model_supports_vision():
messages = self._filter_images_from_messages(messages)
completion_args: dict[str, Any] = {
"model": self.config.model_name,
"messages": messages,
"temperature": self.config.temperature,
"timeout": 180,
"timeout": self.config.timeout,
"stream_options": {"include_usage": True},
}
if self._should_include_stop_param():
completion_args["stop"] = ["</function>"]
if _LLM_API_KEY:
completion_args["api_key"] = _LLM_API_KEY
if _LLM_API_BASE:
completion_args["api_base"] = _LLM_API_BASE
completion_args["stop"] = ["</function>"]
if self._should_include_reasoning_effort():
completion_args["reasoning_effort"] = "high"
completion_args["reasoning_effort"] = self._reasoning_effort
queue = get_global_queue()
response = await queue.make_request(completion_args)
self._total_stats.requests += 1
self._last_request_stats = RequestStats(requests=1)
return response
async for chunk in queue.stream_request(completion_args):
yield chunk
def _update_usage_stats(self, response: ModelResponse) -> None:
def _update_usage_stats(self, response: Any) -> None:
try:
if hasattr(response, "usage") and response.usage:
input_tokens = getattr(response.usage, "prompt_tokens", 0)


@@ -1,9 +1,10 @@
import logging
import os
from typing import Any
import litellm
from strix.config import Config
logger = logging.getLogger(__name__)
@@ -85,6 +86,7 @@ def _extract_message_text(msg: dict[str, Any]) -> str:
def _summarize_messages(
messages: list[dict[str, Any]],
model: str,
timeout: int = 600,
) -> dict[str, Any]:
if not messages:
empty_summary = "<context_summary message_count='0'>{text}</context_summary>"
@@ -106,7 +108,7 @@ def _summarize_messages(
completion_args = {
"model": model,
"messages": [{"role": "user", "content": prompt}],
"timeout": 180,
"timeout": timeout,
}
response = litellm.completion(**completion_args)
@@ -146,9 +148,11 @@ class MemoryCompressor:
self,
max_images: int = 3,
model_name: str | None = None,
timeout: int = 600,
):
self.max_images = max_images
self.model_name = model_name or os.getenv("STRIX_LLM", "openai/gpt-5")
self.model_name = model_name or Config.get("strix_llm")
self.timeout = timeout
if not self.model_name:
raise ValueError("STRIX_LLM environment variable must be set and not empty")
@@ -202,7 +206,7 @@ class MemoryCompressor:
chunk_size = 10
for i in range(0, len(old_msgs), chunk_size):
chunk = old_msgs[i : i + chunk_size]
summary = _summarize_messages(chunk, model_name)
summary = _summarize_messages(chunk, model_name, self.timeout)
if summary:
compressed.append(summary)


@@ -1,39 +1,26 @@
import asyncio
import logging
import threading
import time
from collections.abc import AsyncIterator
from typing import Any
import litellm
from litellm import ModelResponse, completion
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential
from litellm import acompletion
from litellm.types.utils import ModelResponseStream
logger = logging.getLogger(__name__)
def should_retry_exception(exception: Exception) -> bool:
status_code = None
if hasattr(exception, "status_code"):
status_code = exception.status_code
elif hasattr(exception, "response") and hasattr(exception.response, "status_code"):
status_code = exception.response.status_code
if status_code is not None:
return bool(litellm._should_retry(status_code))
return True
from strix.config import Config
class LLMRequestQueue:
def __init__(self, max_concurrent: int = 6, delay_between_requests: float = 1.0):
self.max_concurrent = max_concurrent
self.delay_between_requests = delay_between_requests
self._semaphore = threading.BoundedSemaphore(max_concurrent)
def __init__(self) -> None:
self.delay_between_requests = float(Config.get("llm_rate_limit_delay") or "4.0")
self.max_concurrent = int(Config.get("llm_rate_limit_concurrent") or "1")
self._semaphore = threading.BoundedSemaphore(self.max_concurrent)
self._last_request_time = 0.0
self._lock = threading.Lock()
async def make_request(self, completion_args: dict[str, Any]) -> ModelResponse:
async def stream_request(
self, completion_args: dict[str, Any]
) -> AsyncIterator[ModelResponseStream]:
try:
while not self._semaphore.acquire(timeout=0.2):
await asyncio.sleep(0.1)
@@ -47,25 +34,18 @@ class LLMRequestQueue:
if sleep_needed > 0:
await asyncio.sleep(sleep_needed)
return await self._reliable_request(completion_args)
async for chunk in self._stream_request(completion_args):
yield chunk
finally:
self._semaphore.release()
@retry( # type: ignore[misc]
stop=stop_after_attempt(5),
wait=wait_exponential(multiplier=2, min=1, max=30),
retry=retry_if_exception(should_retry_exception),
reraise=True,
)
async def _reliable_request(self, completion_args: dict[str, Any]) -> ModelResponse:
response = completion(**completion_args, stream=False)
if isinstance(response, ModelResponse):
return response
self._raise_unexpected_response()
raise RuntimeError("Unreachable code")
async def _stream_request(
self, completion_args: dict[str, Any]
) -> AsyncIterator[ModelResponseStream]:
response = await acompletion(**completion_args, stream=True)
def _raise_unexpected_response(self) -> None:
raise RuntimeError("Unexpected response type")
async for chunk in response:
yield chunk
_global_queue: LLMRequestQueue | None = None


@@ -47,10 +47,14 @@ def parse_tool_invocations(content: str) -> list[dict[str, Any]] | None:
def _fix_stopword(content: str) -> str:
if "<function=" in content and content.count("<function=") == 1:
if (
"<function=" in content
and content.count("<function=") == 1
and "</function>" not in content
):
if content.endswith("</"):
content = content.rstrip() + "function>"
elif not content.rstrip().endswith("</function>"):
else:
content = content + "\n</function>"
return content
@@ -75,6 +79,12 @@ def clean_content(content: str) -> str:
tool_pattern = r"<function=[^>]+>.*?</function>"
cleaned = re.sub(tool_pattern, "", content, flags=re.DOTALL)
incomplete_tool_pattern = r"<function=[^>]+>.*$"
cleaned = re.sub(incomplete_tool_pattern, "", cleaned, flags=re.DOTALL)
partial_tag_pattern = r"<f(?:u(?:n(?:c(?:t(?:i(?:o(?:n(?:=(?:[^>]*)?)?)?)?)?)?)?)?)?$"
cleaned = re.sub(partial_tag_pattern, "", cleaned)
hidden_xml_patterns = [
r"<inter_agent_message>.*?</inter_agent_message>",
r"<agent_completion_report>.*?</agent_completion_report>",


@@ -1,64 +0,0 @@
# 📚 Strix Prompt Modules
## 🎯 Overview
Prompt modules are specialized knowledge packages that enhance Strix agents with deep expertise in specific vulnerability types, technologies, and testing methodologies. Each module provides advanced techniques, practical examples, and validation methods that go beyond baseline security knowledge.
---
## 🏗️ Architecture
### How Prompts Work
When an agent is created, it can load up to 5 specialized prompt modules relevant to the specific subtask and context at hand:
```python
# Agent creation with specialized modules
create_agent(
    task="Test authentication mechanisms in API",
    name="Auth Specialist",
    prompt_modules="authentication_jwt,business_logic"
)
```
The modules are dynamically injected into the agent's system prompt, giving it expertise tailored to the vulnerability types or technologies the task requires.
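
As a rough illustration of the loading step described above, the sketch below parses a comma-separated `prompt_modules` string, enforces the five-module cap, and appends each module's text to a base system prompt. The helper names (`parse_module_names`, `inject_modules`), the in-memory `MODULE_LIBRARY`, and its contents are hypothetical stand-ins, not Strix's actual API; the real loader renders Jinja templates from disk.

```python
MAX_MODULES = 5  # cap stated in this README

# Hypothetical in-memory stand-in for the module files on disk.
MODULE_LIBRARY = {
    "authentication_jwt": "<jwt_guide>Check signature validation and algorithm handling.</jwt_guide>",
    "business_logic": "<logic_guide>Probe workflow steps for missing state checks.</logic_guide>",
}


def parse_module_names(raw: str) -> list[str]:
    """Split a comma-separated module string and enforce the module cap."""
    names = [name.strip() for name in raw.split(",") if name.strip()]
    if len(names) > MAX_MODULES:
        raise ValueError(f"at most {MAX_MODULES} modules allowed, got {len(names)}")
    return names


def inject_modules(base_prompt: str, names: list[str]) -> str:
    """Append the text of each known module to the base system prompt."""
    sections = [MODULE_LIBRARY[name] for name in names if name in MODULE_LIBRARY]
    return "\n\n".join([base_prompt, *sections])


names = parse_module_names("authentication_jwt,business_logic")
system_prompt = inject_modules("You are a security testing agent.", names)
print(system_prompt)
```

Unknown module names are silently skipped in this sketch; a stricter variant could raise instead, depending on how much feedback the caller needs.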
---
## 📁 Module Categories
| Category | Purpose |
|----------|---------|
| **`/vulnerabilities`** | Advanced testing techniques for core vulnerability classes like authentication bypasses, business logic flaws, and race conditions |
| **`/frameworks`** | Specific testing methods for popular frameworks such as Django, Express, FastAPI, and Next.js |
| **`/technologies`** | Specialized techniques for third-party services such as Supabase, Firebase, Auth0, and payment gateways |
| **`/protocols`** | Protocol-specific testing patterns for GraphQL, WebSocket, OAuth, and other communication standards |
| **`/cloud`** | Cloud provider security testing for AWS, Azure, GCP, and Kubernetes environments |
| **`/reconnaissance`** | Advanced information gathering and enumeration techniques for comprehensive attack surface mapping |
| **`/custom`** | Community-contributed modules for specialized or industry-specific testing scenarios |
---
## 🎨 Creating New Modules
### What Should a Module Contain?
A good prompt module is a structured knowledge package that typically includes:
- **Advanced techniques** - Non-obvious methods specific to the task and domain
- **Practical examples** - Working payloads, commands, or test cases with variations
- **Validation methods** - How to confirm findings and avoid false positives
- **Context-specific insights** - Environment and version nuances, configuration-dependent behavior, and edge cases
Modules use XML-style tags for structure and focus on deep, specialized knowledge that significantly enhances agent capabilities for that specific context.
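
To make the XML-style structure concrete, here is a minimal sketch of what a module skeleton might look like, together with a small check that every tag is closed. The tag names (`example_guide`, `techniques`, `examples`, `validation`) and the `unbalanced_tags` helper are hypothetical and chosen only to mirror the list above; they are not taken from an existing Strix module.

```python
import re

# Hypothetical module skeleton; the tag names are illustrative only.
EXAMPLE_MODULE = """\
<example_guide>
<techniques>Non-obvious methods for the target domain.</techniques>
<examples>Payloads or test cases, with variations.</examples>
<validation>Steps to confirm a finding and rule out false positives.</validation>
</example_guide>
"""


def unbalanced_tags(text: str) -> list[str]:
    """Return the tag names whose open and close counts differ."""
    opened = re.findall(r"<([a-z_]+)>", text)
    closed = re.findall(r"</([a-z_]+)>", text)
    names = sorted(set(opened) | set(closed))
    return [name for name in names if opened.count(name) != closed.count(name)]


print(unbalanced_tags(EXAMPLE_MODULE))
```

A check like this only counts tags; it does not verify nesting order, so it is a quick sanity pass rather than a full parser.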
---
## 🤝 Contributing
Community contributions are welcome. Submit new modules via [pull requests](https://github.com/usestrix/strix/pulls) or propose them in [GitHub issues](https://github.com/usestrix/strix/issues) to help expand the collection and improve extensibility for Strix agents.
---
> [!NOTE]
> **Work in Progress** - We're actively expanding the prompt module collection with specialized techniques and new categories.


@@ -1,109 +0,0 @@
from pathlib import Path
from jinja2 import Environment
def get_available_prompt_modules() -> dict[str, list[str]]:
    modules_dir = Path(__file__).parent
    available_modules = {}
    for category_dir in modules_dir.iterdir():
        if category_dir.is_dir() and not category_dir.name.startswith("__"):
            category_name = category_dir.name
            modules = []
            for file_path in category_dir.glob("*.jinja"):
                module_name = file_path.stem
                modules.append(module_name)
            if modules:
                available_modules[category_name] = sorted(modules)
    return available_modules
def get_all_module_names() -> set[str]:
    all_modules = set()
    for category_modules in get_available_prompt_modules().values():
        all_modules.update(category_modules)
    return all_modules
def validate_module_names(module_names: list[str]) -> dict[str, list[str]]:
    available_modules = get_all_module_names()
    valid_modules = []
    invalid_modules = []
    for module_name in module_names:
        if module_name in available_modules:
            valid_modules.append(module_name)
        else:
            invalid_modules.append(module_name)
    return {"valid": valid_modules, "invalid": invalid_modules}
def generate_modules_description() -> str:
    available_modules = get_available_prompt_modules()
    if not available_modules:
        return "No prompt modules available"
    all_module_names = get_all_module_names()
    if not all_module_names:
        return "No prompt modules available"
    sorted_modules = sorted(all_module_names)
    modules_str = ", ".join(sorted_modules)
    description = (
        f"List of prompt modules to load for this agent (max 5). Available modules: {modules_str}. "
    )
    example_modules = sorted_modules[:2]
    if example_modules:
        example = f"Example: {', '.join(example_modules)} for specialized agent"
        description += example
    return description
def load_prompt_modules(module_names: list[str], jinja_env: Environment) -> dict[str, str]:
    import logging
    logger = logging.getLogger(__name__)
    module_content = {}
    prompts_dir = Path(__file__).parent
    available_modules = get_available_prompt_modules()
    for module_name in module_names:
        try:
            module_path = None
            if "/" in module_name:
                module_path = f"{module_name}.jinja"
            else:
                for category, modules in available_modules.items():
                    if module_name in modules:
                        module_path = f"{category}/{module_name}.jinja"
                        break
                if not module_path:
                    root_candidate = f"{module_name}.jinja"
                    if (prompts_dir / root_candidate).exists():
                        module_path = root_candidate
            if module_path and (prompts_dir / module_path).exists():
                template = jinja_env.get_template(module_path)
                var_name = module_name.split("/")[-1]
                module_content[var_name] = template.render()
                logger.info(f"Loaded prompt module: {module_name} -> {var_name}")
            else:
                logger.warning(f"Prompt module not found: {module_name}")
        except (FileNotFoundError, OSError, ValueError) as e:
            logger.warning(f"Failed to load prompt module {module_name}: {e}")
    return module_content


@@ -1,10 +1,19 @@
import os
from strix.config import Config
from .runtime import AbstractRuntime
class SandboxInitializationError(Exception):
"""Raised when sandbox initialization fails (e.g., Docker issues)."""
def __init__(self, message: str, details: str | None = None):
super().__init__(message)
self.message = message
self.details = details
def get_runtime() -> AbstractRuntime:
runtime_backend = os.getenv("STRIX_RUNTIME_BACKEND", "docker")
runtime_backend = Config.get("strix_runtime_backend")
if runtime_backend == "docker":
from .docker_runtime import DockerRuntime
@@ -16,4 +25,4 @@ def get_runtime() -> AbstractRuntime:
)
__all__ = ["AbstractRuntime", "get_runtime"]
__all__ = ["AbstractRuntime", "SandboxInitializationError", "get_runtime"]


@@ -4,27 +4,49 @@ import os
import secrets
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeoutError
from pathlib import Path
from typing import cast
from typing import Any, cast
import docker
from docker.errors import DockerException, ImageNotFound, NotFound
from docker.models.containers import Container
from requests.exceptions import ConnectionError as RequestsConnectionError
from requests.exceptions import Timeout as RequestsTimeout
from strix.config import Config
from . import SandboxInitializationError
from .runtime import AbstractRuntime, SandboxInfo
STRIX_IMAGE = os.getenv("STRIX_IMAGE", "ghcr.io/usestrix/strix-sandbox:0.1.10")
HOST_GATEWAY_HOSTNAME = "host.docker.internal"
DOCKER_TIMEOUT = 60 # seconds
TOOL_SERVER_HEALTH_REQUEST_TIMEOUT = 5 # seconds per health check request
TOOL_SERVER_HEALTH_RETRIES = 10 # number of retries for health check
logger = logging.getLogger(__name__)
class DockerRuntime(AbstractRuntime):
def __init__(self) -> None:
try:
self.client = docker.from_env()
except DockerException as e:
self.client = docker.from_env(timeout=DOCKER_TIMEOUT)
except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
logger.exception("Failed to connect to Docker daemon")
raise RuntimeError("Docker is not available or not configured correctly.") from e
if isinstance(e, RequestsConnectionError | RequestsTimeout):
raise SandboxInitializationError(
"Docker daemon unresponsive",
f"Connection timed out after {DOCKER_TIMEOUT} seconds. "
"Please ensure Docker Desktop is installed and running, "
"and try running strix again.",
) from e
raise SandboxInitializationError(
"Docker is not available",
"Docker is not available or not configured correctly. "
"Please ensure Docker Desktop is installed and running, "
"and try running strix again.",
) from e
self._scan_container: Container | None = None
self._tool_server_port: int | None = None
@@ -38,6 +60,23 @@ class DockerRuntime(AbstractRuntime):
s.bind(("", 0))
return cast("int", s.getsockname()[1])
def _exec_run_with_timeout(
self, container: Container, cmd: str, timeout: int = DOCKER_TIMEOUT, **kwargs: Any
) -> Any:
with ThreadPoolExecutor(max_workers=1) as executor:
future = executor.submit(container.exec_run, cmd, **kwargs)
try:
return future.result(timeout=timeout)
except FuturesTimeoutError:
logger.exception(f"exec_run timed out after {timeout}s: {cmd[:100]}...")
raise SandboxInitializationError(
"Container command timed out",
f"Command timed out after {timeout} seconds. "
"Docker may be overloaded or unresponsive. "
"Please ensure Docker Desktop is installed and running, "
"and try running strix again.",
) from None
def _get_scan_id(self, agent_id: str) -> str:
try:
from strix.telemetry.tracer import get_global_tracer
@@ -80,10 +119,13 @@ class DockerRuntime(AbstractRuntime):
def _create_container_with_retry(self, scan_id: str, max_retries: int = 3) -> Container:
last_exception = None
container_name = f"strix-scan-{scan_id}"
image_name = Config.get("strix_image")
if not image_name:
raise ValueError("STRIX_IMAGE must be configured")
for attempt in range(max_retries):
try:
self._verify_image_available(STRIX_IMAGE)
self._verify_image_available(image_name)
try:
existing_container = self.client.containers.get(container_name)
@@ -105,7 +147,7 @@ class DockerRuntime(AbstractRuntime):
self._tool_server_token = tool_server_token
container = self.client.containers.run(
STRIX_IMAGE,
image_name,
command="sleep infinity",
detach=True,
name=container_name,
@@ -121,7 +163,9 @@ class DockerRuntime(AbstractRuntime):
"CAIDO_PORT": str(caido_port),
"TOOL_SERVER_PORT": str(tool_server_port),
"TOOL_SERVER_TOKEN": tool_server_token,
"HOST_GATEWAY": HOST_GATEWAY_HOSTNAME,
},
extra_hosts=self._get_extra_hosts(),
tty=True,
)
@@ -131,7 +175,7 @@ class DockerRuntime(AbstractRuntime):
self._initialize_container(
container, caido_port, tool_server_port, tool_server_token
)
except DockerException as e:
except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
last_exception = e
if attempt == max_retries - 1:
logger.exception(f"Failed to create container after {max_retries} attempts")
@@ -147,8 +191,19 @@ class DockerRuntime(AbstractRuntime):
else:
return container
raise RuntimeError(
f"Failed to create Docker container after {max_retries} attempts: {last_exception}"
if isinstance(last_exception, RequestsConnectionError | RequestsTimeout):
raise SandboxInitializationError(
"Failed to create sandbox container",
f"Docker daemon unresponsive after {max_retries} attempts "
f"(timed out after {DOCKER_TIMEOUT}s). "
"Please ensure Docker Desktop is installed and running, "
"and try running strix again.",
) from last_exception
raise SandboxInitializationError(
"Failed to create sandbox container",
f"Container creation failed after {max_retries} attempts: {last_exception}. "
"Please ensure Docker Desktop is installed and running, "
"and try running strix again.",
) from last_exception
def _get_or_create_scan_container(self, scan_id: str) -> Container: # noqa: PLR0912
@@ -193,7 +248,7 @@ class DockerRuntime(AbstractRuntime):
except NotFound:
pass
except DockerException as e:
except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
logger.warning(f"Failed to get container by name {container_name}: {e}")
else:
return container
@@ -203,7 +258,7 @@ class DockerRuntime(AbstractRuntime):
all=True, filters={"label": f"strix-scan-id={scan_id}"}
)
if containers:
container = cast("Container", containers[0])
container = containers[0]
if container.status != "running":
container.start()
time.sleep(2)
@@ -217,7 +272,7 @@ class DockerRuntime(AbstractRuntime):
logger.info(f"Found existing container by label for scan {scan_id}")
return container
except DockerException as e:
except (DockerException, RequestsConnectionError, RequestsTimeout) as e:
logger.warning("Failed to find existing container by label for scan %s: %s", scan_id, e)
logger.info("Creating new Docker container for scan %s", scan_id)
@@ -227,15 +282,18 @@ class DockerRuntime(AbstractRuntime):
self, container: Container, caido_port: int, tool_server_port: int, tool_server_token: str
) -> None:
logger.info("Initializing Caido proxy on port %s", caido_port)
result = container.exec_run(
self._exec_run_with_timeout(
container,
f"bash -c 'export CAIDO_PORT={caido_port} && /usr/local/bin/docker-entrypoint.sh true'",
detach=False,
)
time.sleep(5)
result = container.exec_run(
"bash -c 'source /etc/profile.d/proxy.sh && echo $CAIDO_API_TOKEN'", user="pentester"
result = self._exec_run_with_timeout(
container,
"bash -c 'source /etc/profile.d/proxy.sh && echo $CAIDO_API_TOKEN'",
user="pentester",
)
caido_token = result.output.decode().strip() if result.exit_code == 0 else ""
@@ -248,7 +306,57 @@ class DockerRuntime(AbstractRuntime):
user="pentester",
)
time.sleep(5)
time.sleep(2)
host = self._resolve_docker_host()
health_url = f"http://{host}:{tool_server_port}/health"
self._wait_for_tool_server_health(health_url)
def _wait_for_tool_server_health(
self,
health_url: str,
max_retries: int = TOOL_SERVER_HEALTH_RETRIES,
request_timeout: int = TOOL_SERVER_HEALTH_REQUEST_TIMEOUT,
) -> None:
import httpx
logger.info(f"Waiting for tool server health at {health_url}")
for attempt in range(max_retries):
try:
with httpx.Client(trust_env=False, timeout=request_timeout) as client:
response = client.get(health_url)
response.raise_for_status()
health_data = response.json()
if health_data.get("status") == "healthy":
logger.info(
f"Tool server is healthy after {attempt + 1} attempt(s): {health_data}"
)
return
logger.warning(f"Tool server returned unexpected status: {health_data}")
except httpx.ConnectError:
logger.debug(
f"Tool server not ready (attempt {attempt + 1}/{max_retries}): "
f"Connection refused"
)
except httpx.TimeoutException:
logger.debug(
f"Tool server not ready (attempt {attempt + 1}/{max_retries}): "
f"Request timed out"
)
except (httpx.RequestError, httpx.HTTPStatusError) as e:
logger.debug(f"Tool server not ready (attempt {attempt + 1}/{max_retries}): {e}")
sleep_time = min(2**attempt * 0.5, 5)
time.sleep(sleep_time)
raise SandboxInitializationError(
"Tool server failed to start",
"Please ensure Docker Desktop is installed and running, and try running strix again.",
)
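For reference, the delay between health checks above follows `min(2**attempt * 0.5, 5)`. A minimal sketch of the resulting schedule, assuming a retry count of 6 purely for illustration (the real value comes from `TOOL_SERVER_HEALTH_RETRIES`, which this hunk does not show):

```python
def backoff_schedule(max_retries: int) -> list:
    """Delays in seconds slept after each failed attempt: min(2**attempt * 0.5, 5)."""
    return [min(2**attempt * 0.5, 5) for attempt in range(max_retries)]


# max_retries=6 is an assumed value, not taken from the diff.
delays = backoff_schedule(6)
print(delays)       # per-attempt delays, capped at 5 seconds
print(sum(delays))  # upper bound on time spent sleeping, excluding request time
```

The delay doubles from half a second and is capped at five seconds, so a slow container start costs a few extra polls rather than a fixed long wait.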
def _copy_local_directory_to_container(
self, container: Container, local_path: str, target_name: str | None = None
@@ -358,11 +466,7 @@ class DockerRuntime(AbstractRuntime):
container = self.client.containers.get(container_id)
container.reload()
host = "127.0.0.1"
if "DOCKER_HOST" in os.environ:
docker_host = os.environ["DOCKER_HOST"]
if "://" in docker_host:
host = docker_host.split("://")[1].split(":")[0]
host = self._resolve_docker_host()
except NotFound:
raise ValueError(f"Container {container_id} not found.") from None
@@ -371,6 +475,23 @@ class DockerRuntime(AbstractRuntime):
else:
return f"http://{host}:{port}"
def _resolve_docker_host(self) -> str:
docker_host = os.getenv("DOCKER_HOST", "")
if not docker_host:
return "127.0.0.1"
from urllib.parse import urlparse
parsed = urlparse(docker_host)
if parsed.scheme in ("tcp", "http", "https") and parsed.hostname:
return parsed.hostname
return "127.0.0.1"
def _get_extra_hosts(self) -> dict[str, str]:
return {HOST_GATEWAY_HOSTNAME: "host-gateway"}
async def destroy_sandbox(self, container_id: str) -> None:
logger.info("Destroying scan container %s", container_id)
try:
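
The `_resolve_docker_host` helper introduced above can be exercised on its own. The sketch below is a standalone copy of its logic; the optional `docker_host` parameter is an addition for testability and is not part of the diff:

```python
import os
from typing import Optional
from urllib.parse import urlparse


def resolve_docker_host(docker_host: Optional[str] = None) -> str:
    """Standalone mirror of _resolve_docker_host; the parameter is hypothetical."""
    if docker_host is None:
        docker_host = os.getenv("DOCKER_HOST", "")
    if not docker_host:
        return "127.0.0.1"
    parsed = urlparse(docker_host)
    # Only network schemes carry a usable hostname; everything else falls back.
    if parsed.scheme in ("tcp", "http", "https") and parsed.hostname:
        return parsed.hostname
    return "127.0.0.1"


for value in ("tcp://192.168.1.50:2375", "unix:///var/run/docker.sock", ""):
    print(repr(value), "->", resolve_docker_host(value))
```

The `unix://` case is where the behavior changes: the removed `split("://")[1].split(":")[0]` logic would have returned the socket path `/var/run/docker.sock` as the host, whereas the new helper falls back to `127.0.0.1`.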

64
strix/skills/README.md Normal file

@@ -0,0 +1,64 @@
# 📚 Strix Skills
## 🎯 Overview
Skills are specialized knowledge packages that enhance Strix agents with deep expertise in specific vulnerability types, technologies, and testing methodologies. Each skill provides advanced techniques, practical examples, and validation methods that go beyond baseline security knowledge.
---
## 🏗️ Architecture
### How Skills Work
When an agent is created, it can load up to 5 specialized skills relevant to the specific subtask and context at hand:
```python
# Agent creation with specialized skills
create_agent(
task="Test authentication mechanisms in API",
name="Auth Specialist",
skills="authentication_jwt,business_logic"
)
```
The skills are dynamically injected into the agent's system prompt, allowing it to operate with deep expertise tailored to the specific vulnerability types or technologies required for the task at hand.
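
A rough sketch of that flow, under stated assumptions: `parse_skills`, `build_system_prompt`, and the `<skill>` wrapper tag are hypothetical names for illustration, not Strix APIs, and the real loader renders Jinja templates rather than plain strings. How Strix reacts to more than five skill names is not documented here; raising an error is an assumption.

```python
MAX_SKILLS = 5  # matches the "up to 5" limit described above


def parse_skills(skills: str) -> list[str]:
    """Split a comma-separated skills argument and enforce the limit."""
    names = [name.strip() for name in skills.split(",") if name.strip()]
    if len(names) > MAX_SKILLS:
        raise ValueError(f"At most {MAX_SKILLS} skills can be loaded, got {len(names)}")
    return names


def build_system_prompt(base_prompt: str, skill_content: dict[str, str]) -> str:
    """Append each loaded skill to the base prompt (hypothetical layout)."""
    sections = [
        f'<skill name="{name}">\n{body}\n</skill>'
        for name, body in skill_content.items()
    ]
    return "\n\n".join([base_prompt, *sections])


names = parse_skills("authentication_jwt,business_logic")
prompt = build_system_prompt(
    "You are a security testing agent.",
    {name: f"(rendered content of {name})" for name in names},
)
print(prompt)
```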
---
## 📁 Skill Categories
| Category | Purpose |
|----------|---------|
| **`/vulnerabilities`** | Advanced testing techniques for core vulnerability classes like authentication bypasses, business logic flaws, and race conditions |
| **`/frameworks`** | Specific testing methods for popular frameworks such as Django, Express, FastAPI, and Next.js |
| **`/technologies`** | Specialized techniques for third-party services such as Supabase, Firebase, Auth0, and payment gateways |
| **`/protocols`** | Protocol-specific testing patterns for GraphQL, WebSocket, OAuth, and other communication standards |
| **`/cloud`** | Cloud provider security testing for AWS, Azure, GCP, and Kubernetes environments |
| **`/reconnaissance`** | Advanced information gathering and enumeration techniques for comprehensive attack surface mapping |
| **`/custom`** | Community-contributed skills for specialized or industry-specific testing scenarios |
---
## 🎨 Creating New Skills
### What Should a Skill Contain?
A good skill is a structured knowledge package that typically includes:
- **Advanced techniques** - Non-obvious methods specific to the task and domain
- **Practical examples** - Working payloads, commands, or test cases with variations
- **Validation methods** - How to confirm findings and avoid false positives
- **Context-specific insights** - Environment and version nuances, configuration-dependent behavior, and edge cases
Skills use XML-style tags for structure and focus on deep, specialized knowledge that significantly enhances agent capabilities for that specific context.
---
## 🤝 Contributing
Community contributions are more than welcome — contribute new skills via [pull requests](https://github.com/usestrix/strix/pulls) or [GitHub issues](https://github.com/usestrix/strix/issues) to help expand the collection and improve extensibility for Strix agents.
---
> [!NOTE]
> **Work in Progress** - We're actively expanding the skills collection with specialized techniques and new categories.

107
strix/skills/__init__.py Normal file

@@ -0,0 +1,107 @@
from pathlib import Path
from jinja2 import Environment
def get_available_skills() -> dict[str, list[str]]:
skills_dir = Path(__file__).parent
available_skills = {}
for category_dir in skills_dir.iterdir():
if category_dir.is_dir() and not category_dir.name.startswith("__"):
category_name = category_dir.name
skills = []
for file_path in category_dir.glob("*.jinja"):
skill_name = file_path.stem
skills.append(skill_name)
if skills:
available_skills[category_name] = sorted(skills)
return available_skills
def get_all_skill_names() -> set[str]:
all_skills = set()
for category_skills in get_available_skills().values():
all_skills.update(category_skills)
return all_skills
def validate_skill_names(skill_names: list[str]) -> dict[str, list[str]]:
available_skills = get_all_skill_names()
valid_skills = []
invalid_skills = []
for skill_name in skill_names:
if skill_name in available_skills:
valid_skills.append(skill_name)
else:
invalid_skills.append(skill_name)
return {"valid": valid_skills, "invalid": invalid_skills}
def generate_skills_description() -> str:
available_skills = get_available_skills()
if not available_skills:
return "No skills available"
all_skill_names = get_all_skill_names()
if not all_skill_names:
return "No skills available"
sorted_skills = sorted(all_skill_names)
skills_str = ", ".join(sorted_skills)
description = f"List of skills to load for this agent (max 5). Available skills: {skills_str}. "
example_skills = sorted_skills[:2]
if example_skills:
example = f"Example: {', '.join(example_skills)} for specialized agent"
description += example
return description
def load_skills(skill_names: list[str], jinja_env: Environment) -> dict[str, str]:
import logging
logger = logging.getLogger(__name__)
skill_content = {}
skills_dir = Path(__file__).parent
available_skills = get_available_skills()
for skill_name in skill_names:
try:
skill_path = None
if "/" in skill_name:
skill_path = f"{skill_name}.jinja"
else:
for category, skills in available_skills.items():
if skill_name in skills:
skill_path = f"{category}/{skill_name}.jinja"
break
if not skill_path:
root_candidate = f"{skill_name}.jinja"
if (skills_dir / root_candidate).exists():
skill_path = root_candidate
if skill_path and (skills_dir / skill_path).exists():
template = jinja_env.get_template(skill_path)
var_name = skill_name.split("/")[-1]
skill_content[var_name] = template.render()
logger.info(f"Loaded skill: {skill_name} -> {var_name}")
else:
logger.warning(f"Skill not found: {skill_name}")
except (FileNotFoundError, OSError, ValueError) as e:
logger.warning(f"Failed to load skill {skill_name}: {e}")
return skill_content
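
The directory walk in `get_available_skills` can be tried against a throwaway tree. The sketch below repeats the same filtering (skip `__`-prefixed directories, collect `*.jinja` stems, drop empty categories) in a hypothetical `discover_skills` function that takes the root as a parameter; the file names are made up for illustration.

```python
import tempfile
from pathlib import Path


def discover_skills(skills_dir: Path) -> dict[str, list[str]]:
    """Same walk as get_available_skills, but against an arbitrary root."""
    available: dict[str, list[str]] = {}
    for category_dir in skills_dir.iterdir():
        if category_dir.is_dir() and not category_dir.name.startswith("__"):
            skills = sorted(p.stem for p in category_dir.glob("*.jinja"))
            if skills:
                available[category_dir.name] = skills
    return available


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout: two real categories, one ignored, one empty.
    for rel in (
        "vulnerabilities/idor.jinja",
        "vulnerabilities/xss.jinja",
        "frameworks/nextjs.jinja",
        "__pycache__/ignored.jinja",
    ):
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("<guide/>")
    (root / "empty_category").mkdir()
    found = discover_skills(root)

print(found)
```

Because skill names are flattened into one set by `get_all_skill_names`, two categories that each contain a file with the same stem would collapse into a single name.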


@@ -28,7 +28,7 @@ AGENT TYPES YOU CAN CREATE:
COORDINATION GUIDELINES:
- Ensure clear task boundaries and success criteria
- Terminate redundant agents when objectives overlap
- Use message passing for agent communication
- Use message passing only when essential (requests/answers or critical handoffs); avoid routine status messages and prefer batched updates
</agent_management>
<final_responsibilities>


@@ -31,6 +31,18 @@
</high_value_targets>
<advanced_techniques>
<route_enumeration>
- __BUILD_MANIFEST.sortedPages: Execute `console.log(__BUILD_MANIFEST.sortedPages.join('\n'))` in browser console to instantly reveal all registered routes (Pages Router and static App Router paths compiled at build time)
- __NEXT_DATA__: Inspect `<script id="__NEXT_DATA__">` for server-side props, pageProps, buildId, and dynamic route params on the current page; reveals data flow and prop structure
- Source maps exposure: Check `/_next/static/` for exposed .map files revealing full route structure, server action IDs, API endpoints, and internal function names
- Client bundle mining: Search main-*.js and page chunks for route definitions; grep for 'pathname:', 'href:', '__next_route__', 'serverActions', and API endpoint strings
- Static chunk enumeration: Probe `/_next/static/chunks/pages/` and `/_next/static/chunks/app/` for build artifacts; filenames map directly to routes (e.g., `admin.js` → `/admin`)
- Build manifest fetch: GET `/_next/static/<buildId>/_buildManifest.js` and `/_next/static/<buildId>/_ssgManifest.js` for complete route and static generation metadata
- Sitemap/robots leakage: Check `/sitemap.xml`, `/robots.txt`, and `/sitemap-*.xml` for unintended exposure of admin/internal/preview paths
- Server action discovery: Inspect Network tab for POST requests with `Next-Action` header; extract action IDs from response streams and client hydration data
- Environment variable leakage: Execute `Object.keys(process.env).filter(k => k.startsWith('NEXT_PUBLIC_'))` in console to list public env vars; grep bundles for 'API_KEY', 'SECRET', 'TOKEN', 'PASSWORD' to find accidentally leaked credentials
</route_enumeration>
<middleware_bypass>
- Test for CVE-class middleware bypass via `x-middleware-subrequest` crafting and `x-nextjs-data` probing. Look for 307 + `x-middleware-rewrite`/`x-nextjs-redirect` headers and attempt bypass on protected routes.
- Attempt direct route access on Node vs Edge runtimes; confirm protection parity.
@@ -80,6 +92,14 @@
- Identify `dangerouslySetInnerHTML`, Markdown renderers, and user-controlled href/src attributes. Validate CSP/Trusted Types coverage for SSR/CSR/hydration.
- Attack hydration boundaries: server vs client render mismatches can enable gadget-based XSS.
</client_and_dom>
<data_fetching_over_exposure>
- getServerSideProps/getStaticProps leakage: Execute `JSON.parse(document.getElementById('__NEXT_DATA__').textContent).props.pageProps` in console to inspect all server-fetched data; look for sensitive fields (emails, tokens, internal IDs, full user objects) passed to client but not rendered in UI
- Over-fetched database queries: Check if pageProps include entire user records, relations, or admin-only fields when only username is displayed; common when using ORM select-all patterns
- API response pass-through: Verify if API responses are sanitized before passing to props; developers often forward entire responses including metadata, cursors, or debug info
- Environment-dependent data: Test if staging/dev accidentally exposes more fields in props than production due to inconsistent serialization logic
- Nested object inspection: Drill into nested props objects; look for `_metadata`, `_internal`, `__typename` (GraphQL), or framework-added fields containing sensitive context
</data_fetching_over_exposure>
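The `__NEXT_DATA__` inspection above can also be done offline on a saved response. A minimal sketch, assuming a hand-picked list of sensitive key hints and a made-up sample page (both are illustrative, not part of this skill):

```python
import json
import re

# Assumed hint list; extend it per target.
SENSITIVE_HINTS = ("email", "token", "secret", "password", "apikey", "api_key")


def extract_next_data(html: str) -> dict:
    """Return the parsed __NEXT_DATA__ JSON, or an empty dict if the tag is absent."""
    match = re.search(
        r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.DOTALL
    )
    return json.loads(match.group(1)) if match else {}


def sensitive_paths(value, prefix=""):
    """Yield dotted paths of keys whose names contain a sensitive-looking hint."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            if any(hint in key.lower() for hint in SENSITIVE_HINTS):
                yield path
            yield from sensitive_paths(child, path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from sensitive_paths(child, f"{prefix}[{index}]")


# Made-up page for illustration.
sample_html = (
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"user": {"name": "a", "email": "a@example.com", '
    '"resetToken": "abc"}}}, "buildId": "dev"}'
    "</script>"
)
page_props = extract_next_data(sample_html).get("props", {}).get("pageProps", {})
flagged = list(sensitive_paths(page_props))
print(flagged)
```

Key-name matching only surfaces candidates; each flagged path still needs the cross-user and DOM checks described under validation.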
</advanced_techniques>
<bypass_techniques>
@@ -87,6 +107,8 @@
- Method override/tunneling: `_method`, `X-HTTP-Method-Override`, GET on endpoints unexpectedly accepting writes.
- Case/param aliasing and query duplication affecting middleware vs handler parsing.
- Cache key confusion at CDN/proxy (lack of Vary on auth cookies/headers) to leak personalized SSR/ISR content.
- API route path normalization: Test `/api/users` vs `/api/users/` vs `/api//users` vs `/api/./users`; middleware may normalize differently than route handlers, allowing protection bypass. Try double slashes, trailing slashes, and dot segments.
- Parameter pollution: Send duplicate query params (`?id=1&id=2`) or array notation (`?filter[]=a&filter[]=b`) to exploit parsing differences between middleware (which may check first value) and handler (which may use last or array).
</bypass_techniques>
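The path normalization variants listed above can be generated mechanically. A small sketch, assuming a hypothetical `path_variants` helper that only perturbs the final segment:

```python
def path_variants(path: str) -> list:
    """Generate trailing-slash, double-slash, and dot-segment variants of a path."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return ["/"]
    base = "/" + "/".join(segments)
    head = "/" + "/".join(segments[:-1]) if len(segments) > 1 else ""
    last = segments[-1]
    return [
        base,                # canonical form
        base + "/",          # trailing slash
        f"{head}//{last}",   # doubled slash before the last segment
        f"{head}/./{last}",  # dot segment before the last segment
    ]


print(path_variants("/api/users"))
```

Many HTTP clients collapse dot segments before sending, so these variants may need a client or proxy that transmits the path untouched (for example, curl's `--path-as-is`).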
<special_contexts>
@@ -107,6 +129,10 @@
3. Demonstrate server action invocation outside UI with insufficient authorization checks.
4. Show middleware bypass (where applicable) with explicit headers and resulting protected content.
5. Include runtime parity checks (Edge vs Node) proving inconsistent enforcement.
6. For route enumeration: verify discovered routes return 200/403 (deployed) not 404 (build artifacts); test with authenticated vs unauthenticated requests.
7. For leaked credentials: test API keys with minimal read-only calls; filter placeholders (YOUR_API_KEY, demo-token); confirm keys match provider patterns (sk_live_*, pk_prod_*).
8. For __NEXT_DATA__ over-exposure: test cross-user (User A's props should not contain User B's PII); verify exposed fields are not in DOM; validate token validity with API calls.
9. For path normalization bypasses: show differential responses (403 vs 200 for path variants); redirects (307/308) don't count—only direct access bypasses matter.
</validation>
<pro_tips>


@@ -0,0 +1,145 @@
<scan_mode>
DEEP SCAN MODE - Exhaustive Security Assessment
This mode is for thorough security reviews where finding vulnerabilities is critical.
PHASE 1: EXHAUSTIVE RECONNAISSANCE AND MAPPING
Spend significant effort understanding the target before exploitation.
For whitebox (source code available):
- Map EVERY file, module, and code path in the repository
- Trace all entry points from HTTP handlers to database queries
- Identify all authentication mechanisms and their implementations
- Map all authorization checks and understand the access control model
- Identify all external service integrations and API calls
- Analyze all configuration files for secrets and misconfigurations
- Review all database schemas and understand data relationships
- Map all background jobs, cron tasks, and async processing
- Identify all serialization/deserialization points
- Review all file handling operations (upload, download, processing)
- Understand the deployment model and infrastructure assumptions
- Check all dependency versions against known CVE databases
For blackbox (no source code):
- Exhaustive subdomain enumeration using multiple sources and tools
- Full port scanning to identify all services
- Complete content discovery with multiple wordlists
- Technology fingerprinting on all discovered assets
- API endpoint discovery through documentation, JavaScript analysis, and fuzzing
- Identify all parameters including hidden and rarely-used ones
- Map all user roles by testing with different account types
- Understand rate limiting, WAF rules, and security controls in place
- Document the complete application architecture as understood from outside
EXECUTION STRATEGY - HIERARCHICAL AGENT SWARM:
After Phase 1 (Recon & Mapping) is complete:
1. Divide the application into major components/parts (e.g., Auth System, Payment Gateway, User Profile, Admin Panel)
2. Spawn a specialized subagent for EACH major component
3. Each component agent must then:
- Further subdivide its scope into subparts (e.g., Login Form, Registration API, Password Reset)
- Spawn sub-subagents for each distinct subpart
4. At the lowest level (specific functionality), spawn specialized agents for EACH potential vulnerability type:
- "Auth System" → "Login Form" → "SQLi Agent", "XSS Agent", "Auth Bypass Agent"
- This creates a massive parallel swarm covering every angle
- Do NOT overload a single agent with multiple vulnerability types
- Scale horizontally to maximum capacity
PHASE 2: DEEP BUSINESS LOGIC ANALYSIS
Understand the application deeply enough to find logic flaws:
- CREATE A FULL STORYBOARD of all user flows and state transitions
- Document every step of the business logic in a structured flow diagram
- Use the application extensively as every type of user to map the full lifecycle of data
- Document all state machines and workflows (e.g. Order Created -> Paid -> Shipped)
- Identify trust boundaries between components
- Map all integrations with third-party services
- Understand what invariants the application tries to maintain
- Identify all points where roles, privileges, or sensitive data changes hands
- Look for implicit assumptions in the business logic
- Consider multi-step attacks that abuse normal functionality
PHASE 3: COMPREHENSIVE ATTACK SURFACE TESTING
Test EVERY input vector with EVERY applicable technique.
Input Handling - Test all parameters, headers, cookies with:
- Multiple injection payloads (SQL, NoSQL, LDAP, XPath, Command, Template)
- Various encodings and bypass techniques (double encoding, unicode, null bytes)
- Boundary conditions and type confusion
- Large payloads and buffer-related issues
Authentication and Session:
- Exhaustive brute force protection testing
- Session fixation, hijacking, and prediction attacks
- JWT/token manipulation if applicable
- OAuth flow abuse scenarios
- Password reset flow vulnerabilities (token leakage, reuse, timing)
- Multi-factor authentication bypass techniques
- Account enumeration through all possible channels
Access Control:
- Test EVERY endpoint for horizontal and vertical access control
- Parameter tampering on all object references
- Forced browsing to all discovered resources
- HTTP method tampering
- Test access control after session changes (logout, role change)
File Operations:
- Exhaustive file upload bypass testing (extension, content-type, magic bytes)
- Path traversal on all file parameters
- Server-side request forgery through file inclusion
- XXE through all XML parsing points
Business Logic:
- Race conditions on all state-changing operations
- Workflow bypass attempts on every multi-step process
- Price/quantity manipulation in all transactions
- Parallel execution attacks
- Time-of-check to time-of-use vulnerabilities
Advanced Attacks:
- HTTP request smuggling if multiple proxies/servers sit in the request path
- Cache poisoning and cache deception
- Subdomain takeover on all subdomains
- Prototype pollution in JavaScript applications
- CORS misconfiguration exploitation
- WebSocket security testing
- GraphQL specific attacks if applicable
PHASE 4: VULNERABILITY CHAINING
Don't just find individual bugs - chain them:
- Combine information disclosure with access control bypass
- Chain SSRF to access internal services
- Use low-severity findings to enable high-impact attacks
- Look for multi-step attack paths that automated tools miss
- Consider attacks that span multiple application components
CHAINING PRINCIPLES (MAX IMPACT):
- Treat every finding as a pivot: ask "What does this unlock next?" until you reach maximum privilege / maximum data exposure / maximum control
- Prefer end-to-end exploit paths over isolated bugs: initial foothold → pivot → privilege gain → sensitive action/data
- Cross boundaries deliberately: user → admin, external → internal, unauthenticated → authenticated, read → write, single-tenant → cross-tenant
- Validate chains by executing the full sequence using the available tools (proxy + browser for workflows, python for automation, terminal for supporting commands)
- When a component agent finds a potential pivot, it must message/spawn the next focused agent to continue the chain in the next component/subpart
PHASE 5: PERSISTENT TESTING
If initial attempts fail, don't give up:
- Research specific technologies for known bypasses
- Try alternative exploitation techniques
- Look for edge cases and unusual functionality
- Test with different client contexts
- Revisit previously tested areas with new information
- Consider timing-based and blind exploitation techniques
PHASE 6: THOROUGH REPORTING
- Document EVERY confirmed vulnerability with full details
- Include all severity levels - even low findings may enable chains
- Provide complete reproduction steps and PoC
- Document remediation recommendations
- Note areas requiring additional review beyond current scope
MINDSET:
- Relentless - this is about finding what others miss
- Creative - think of unconventional attack vectors
- Patient - real vulnerabilities often require deep investigation
- Thorough - test every parameter, every endpoint, every edge case
- Persistent - if one approach fails, try ten more
- Holistic - understand how components interact to find systemic issues
</scan_mode>


@@ -0,0 +1,63 @@
<scan_mode>
QUICK SCAN MODE - Rapid Security Assessment
This mode is optimized for fast feedback. Focus on HIGH-IMPACT vulnerabilities with minimal overhead.
PHASE 1: RAPID ORIENTATION
- If source code is available: Focus primarily on RECENT CHANGES (git diff, new commits, modified files)
- Identify the most critical entry points: authentication endpoints, payment flows, admin interfaces, API endpoints handling sensitive data
- Quickly understand the tech stack and frameworks in use
- Skip exhaustive reconnaissance - use what's immediately visible
PHASE 2: TARGETED ATTACK SURFACE
For whitebox (source code available):
- Prioritize files changed in recent commits/PRs - these are most likely to contain fresh bugs
- Look for security-sensitive patterns in diffs: auth checks, input handling, database queries, file operations
- Trace user-controllable input in changed code paths
- Check if security controls were modified or bypassed
For blackbox (no source code):
- Focus on authentication and session management
- Test the most critical user flows only
- Check for obvious misconfigurations and exposed endpoints
- Skip deep content discovery - test what's immediately accessible
PHASE 3: HIGH-IMPACT VULNERABILITY FOCUS
Prioritize in this order:
1. Authentication bypass and broken access control
2. Remote code execution vectors
3. SQL injection in critical endpoints
4. Insecure direct object references (IDOR) in sensitive resources
5. Server-side request forgery (SSRF)
6. Hardcoded credentials or secrets in code
Skip lower-priority items:
- Extensive subdomain enumeration
- Full directory bruteforcing
- Information disclosure that doesn't lead to exploitation
- Theoretical vulnerabilities without PoC
PHASE 4: VALIDATION AND REPORTING
- Validate only critical/high severity findings with minimal PoC
- Report findings as you discover them - don't wait for completion
- Focus on exploitability and business impact
QUICK CHAINING RULE:
- If you find ANY strong primitive (auth weakness, access control gap, injection point, internal reachability), immediately attempt a single high-impact pivot to demonstrate real impact
- Do not stop at a low-context “maybe”; turn it into a concrete exploit sequence (even if short) that reaches privileged action or sensitive data
OPERATIONAL GUIDELINES:
- Use the browser tool for quick manual testing of critical flows
- Use terminal for targeted scans with fast presets (e.g., nuclei with critical/high templates only)
- Use proxy to inspect traffic on key endpoints
- Skip extensive fuzzing - use targeted payloads only
- Create subagents only for parallel high-priority tasks
- If whitebox: use the file_edit tool to review specific suspicious code sections
- Use notes tool to track critical findings only
MINDSET:
- Think like a time-boxed bug bounty hunter going for quick wins
- Prioritize breadth over depth on critical areas
- If something looks exploitable, validate quickly and move on
- Don't get stuck - if an attack vector isn't yielding results quickly, pivot
</scan_mode>


@@ -0,0 +1,91 @@
<scan_mode>
STANDARD SCAN MODE - Balanced Security Assessment
This mode provides thorough coverage with a structured methodology. Balance depth with efficiency.
PHASE 1: RECONNAISSANCE AND MAPPING
Understanding the target is critical before exploitation. Never skip this phase.
For whitebox (source code available):
- Map the entire codebase structure: directories, modules, entry points
- Identify the application architecture (MVC, microservices, monolith)
- Understand the routing: how URLs map to handlers/controllers
- Identify all user input vectors: forms, APIs, file uploads, headers, cookies
- Map authentication and authorization flows
- Identify database interactions and ORM usage
- Review dependency manifests for known vulnerable packages
- Understand the data model and sensitive data locations
For blackbox (no source code):
- Crawl the application thoroughly using browser tool - interact with every feature
- Enumerate all endpoints, parameters, and functionality
- Identify the technology stack through fingerprinting
- Map user roles and access levels
- Understand the business logic by using the application as intended
- Document all forms, APIs, and data entry points
- Use proxy tool to capture and analyze all traffic during exploration
PHASE 2: BUSINESS LOGIC UNDERSTANDING
Before testing for vulnerabilities, understand what the application DOES:
- What are the critical business flows? (payments, user registration, data access)
- What actions should be restricted to specific roles?
- What data should users NOT be able to access?
- What state transitions exist? (order pending → paid → shipped)
- Where does money, sensitive data, or privilege flow?
PHASE 3: SYSTEMATIC VULNERABILITY ASSESSMENT
Test each attack surface methodically. Create focused subagents for different areas.
Entry Point Analysis:
- Test all input fields for injection vulnerabilities
- Check all API endpoints for authentication and authorization
- Test all file upload functionality for validation bypasses
- Test all search and filter functionality
- Check redirect parameters and URL handling
Authentication and Session:
- Test login for brute force protection
- Check session token entropy and handling
- Test password reset flows for weaknesses
- Verify logout invalidates sessions
- Test for authentication bypass techniques
Access Control:
- For every privileged action, test as unprivileged user
- Test horizontal access control (user A accessing user B's data)
- Test vertical access control (user escalating to admin)
- Check that API endpoints mirror UI access controls
- Test direct object references with different user contexts
Business Logic:
- Attempt to skip steps in multi-step processes
- Test for race conditions in critical operations
- Try negative values, zero values, boundary conditions
- Attempt to replay transactions
- Test for price manipulation in e-commerce flows
PHASE 4: EXPLOITATION AND VALIDATION
- Every finding must have a working proof-of-concept
- Demonstrate actual impact, not theoretical risk
- Chain vulnerabilities when possible to show maximum impact
- Document the full attack path from initial access to impact
- Use python tool for complex exploit development
CHAINING & MAX IMPACT MINDSET:
- Always ask: "If I can do X, what does that enable me to do next?" Keep pivoting until you reach maximum privilege or maximum sensitive data access
- Prefer complete end-to-end paths (entry point → pivot → privileged action/data) over isolated bug reports
- Use the application as a real user would: the exploit must survive the actual workflow and state transitions
- When you discover a useful pivot (info leak, weak boundary, partial access), immediately pursue the next step rather than stopping at the first win
PHASE 5: COMPREHENSIVE REPORTING
- Report all confirmed vulnerabilities with clear reproduction steps
- Include severity based on actual exploitability and business impact
- Provide remediation recommendations
- Document any areas that need further investigation
MINDSET:
- Methodical and systematic - cover the full attack surface
- Document as you go - findings and areas tested
- Validate everything - no assumptions about exploitability
- Think about business impact, not just technical severity
</scan_mode>


@@ -0,0 +1,222 @@
<information_disclosure_vulnerability_guide>
<title>INFORMATION DISCLOSURE</title>
<critical>Information leaks accelerate exploitation by revealing code, configuration, identifiers, and trust boundaries. Treat every response byte, artifact, and header as potential intelligence. Minimize, normalize, and scope disclosure across all channels.</critical>
<scope>
- Errors and exception pages: stack traces, file paths, SQL, framework versions
- Debug/dev tooling reachable in prod: debuggers, profilers, feature flags
- DVCS/build artifacts and temp/backup files: .git, .svn, .hg, .bak, .swp, archives
- Configuration and secrets: .env, phpinfo, appsettings.json, Docker/K8s manifests
- API schemas and introspection: OpenAPI/Swagger, GraphQL introspection, gRPC reflection
- Client bundles and source maps: webpack/Vite maps, embedded env, __NEXT_DATA__, static JSON
- Headers and response metadata: Server/X-Powered-By, tracing, ETag, Accept-Ranges, Server-Timing
- Storage/export surfaces: public buckets, signed URLs, export/download endpoints
- Observability/admin: /metrics, /actuator, /health, tracing UIs (Jaeger, Zipkin), Kibana, Admin UIs
- Directory listings and indexing: autoindex, sitemap/robots revealing hidden routes
- Cross-origin signals: CORS misconfig, Referrer-Policy leakage, Expose-Headers
- File/document metadata: EXIF, PDF/Office properties
</scope>
<methodology>
1. Build a channel map: Web, API, GraphQL, WebSocket, gRPC, mobile, background jobs, exports, CDN.
2. Establish a diff harness: compare owner vs non-owner vs anonymous across transports; normalize on status/body length/ETag/headers.
3. Trigger controlled failures: send malformed types, boundary values, missing params, and alternate content-types to elicit error detail and stack traces.
4. Enumerate artifacts: DVCS folders, backups, config endpoints, source maps, client bundles, API docs, observability routes.
5. Correlate disclosures to impact: versions→CVE, paths→LFI/RCE, keys→cloud access, schemas→auth bypass, IDs→IDOR.
</methodology>
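The diff harness in step 2 can be sketched as follows. This is a minimal illustration, assuming responses have already been captured as status, headers, and body; the `fingerprint` and `diff_principals` helpers and the `TRACKED_HEADERS` list are hypothetical names for this sketch, not part of Strix.

```python
import hashlib
from typing import Mapping

# Headers compared across principals; this list is an assumption for the sketch.
TRACKED_HEADERS = ("etag", "content-type", "cache-control")


def fingerprint(status: int, headers: Mapping[str, str], body: bytes) -> dict:
    """Reduce one captured response to the fields worth diffing."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "status": status,
        "length": len(body),
        "digest": hashlib.sha256(body).hexdigest(),
        "headers": tuple((h, lowered.get(h)) for h in TRACKED_HEADERS),
    }


def diff_principals(captures: Mapping[str, dict]) -> dict:
    """Return only the fields whose values differ between principals."""
    differing = {}
    for field in ("status", "length", "digest", "headers"):
        values = {who: fp[field] for who, fp in captures.items()}
        if len(set(values.values())) > 1:
            differing[field] = values
    return differing
```

Feeding the owner, non-owner, and anonymous captures of the same resource into `diff_principals` leaves only the fields that vary by identity, which is where review effort should go.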
<surfaces>
<errors_and_exceptions>
- SQL/ORM errors: reveal table/column names, DBMS, query fragments
- Stack traces: absolute paths, class/method names, framework versions, developer emails
- Template engine probes: {% raw %}{{7*7}}, ${7*7}{% endraw %} identify templating stack and code paths
- JSON/XML parsers: type mismatches and coercion logs leak internal model names
</errors_and_exceptions>
<debug_and_env_modes>
- Debug pages and flags: Django DEBUG, Laravel Telescope, Rails error pages, Flask/Werkzeug debugger, ASP.NET customErrors Off
- Profiler endpoints: /debug/pprof, /actuator, /_profiler, custom /debug APIs
- Feature/config toggles exposed in JS or headers; admin/staff banners in HTML
</debug_and_env_modes>
<dvcs_and_backups>
- DVCS: /.git/ (HEAD, config, index, objects), .svn/entries, .hg/store → reconstruct source and secrets
- Backups/temp: .bak/.old/~/.swp/.swo/.tmp/.orig, db dumps, zipped deployments under /backup/, /old/, /archive/
- Build artifacts: dist artifacts containing .map, env prints, internal URLs
</dvcs_and_backups>
<configs_and_secrets>
- Classic: web.config, appsettings.json, settings.py, config.php, phpinfo.php
- Containers/cloud: Dockerfile, docker-compose.yml, Kubernetes manifests, service account tokens, cloud credentials files
- Credentials and connection strings; internal hosts and ports; JWT secrets
</configs_and_secrets>
<api_schemas_and_introspection>
- OpenAPI/Swagger: /swagger, /api-docs, /openapi.json — enumerate hidden/privileged operations
- GraphQL: introspection enabled; field suggestions; error disclosure via invalid fields; persisted queries catalogs
- gRPC: server reflection exposing services/messages; proto download via reflection
</api_schemas_and_introspection>
<client_bundles_and_maps>
- Source maps (.map) reveal original sources, comments, and internal logic
- Client env leakage: NEXT_PUBLIC_/VITE_/REACT_APP_ variables; runtime config; embedded secrets accidentally shipped
- Next.js data: __NEXT_DATA__ and pre-fetched JSON under /_next/data can include internal IDs, flags, or PII
- Static JSON/CSV feeds used by the UI that bypass server-side auth filtering
</client_bundles_and_maps>
<headers_and_response_metadata>
- Fingerprinting: Server, X-Powered-By, X-AspNet-Version
- Tracing: X-Request-Id, traceparent, Server-Timing, debug headers
- Caching oracles: ETag/If-None-Match, Last-Modified/If-Modified-Since, Accept-Ranges/Range (partial content reveals)
- Content sniffing and MIME metadata that implies backend components
</headers_and_response_metadata>
<storage_and_exports>
- Public object storage: S3/GCS/Azure blobs with world-readable ACLs or guessable keys
- Signed URLs: long-lived, weakly scoped, re-usable across tenants; metadata leaks in headers
- Export/report endpoints returning foreign data sets or unfiltered fields
</storage_and_exports>
<observability_and_admin>
- Metrics: Prometheus /metrics exposing internal hostnames, process args, SQL, credentials by mistake
- Health/config: /actuator/health, /actuator/env, Spring Boot info endpoints
- Tracing UIs and dashboards: Jaeger/Zipkin/Kibana/Grafana exposed without auth
</observability_and_admin>
<directory_and_indexing>
- Autoindex on /uploads/, /files/, /logs/, /tmp/, /assets/
- Robots/sitemap reveal hidden paths, admin panels, export feeds
</directory_and_indexing>
<cross_origin_signals>
- Referrer leakage: missing or lax Referrer-Policy leading to path/query/token leaks to third parties
- CORS: overly permissive Access-Control-Allow-Origin/Expose-Headers revealing data cross-origin; preflight error shapes
</cross_origin_signals>
<file_metadata>
- EXIF, PDF/Office properties: authors, paths, software versions, timestamps, embedded objects
</file_metadata>
</surfaces>
<advanced_techniques>
<differential_oracles>
- Compare owner vs non-owner vs anonymous for the same resource and track: status, length, ETag, Last-Modified, Cache-Control
- HEAD vs GET: header-only differences can confirm existence or type without content
- Conditional requests: 304 vs 200 behaviors leak existence/state; binary search content size via Range requests
</differential_oracles>
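The Range-based size search mentioned above can be sketched as a plain binary search. The `probe` callable is an assumption: it stands for whatever request logic reports whether a range starting at a given offset is satisfiable (for example 206 versus 416). Servers that ignore Range and always answer 200 do not provide this oracle.

```python
from typing import Callable


def find_content_size(probe: Callable[[int], bool], upper_bound: int) -> int:
    """Binary-search a resource's size using a range-satisfiability oracle.

    probe(offset) must return True when a range starting at `offset` is
    satisfiable (the offset lies inside the resource) and False otherwise.
    The true size is assumed to be at most `upper_bound`.
    """
    lo, hi = 0, upper_bound  # invariant: the size lies in [lo, hi]
    while hi > lo:
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid + 1  # offset mid exists, so the size is at least mid + 1
        else:
            hi = mid  # offset mid is past the end, so the size is at most mid
    return lo
```

The search needs roughly log2(upper_bound) requests, which keeps the proof small.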
<cdn_and_cache_keys>
- Identity-agnostic caches: CDN/proxy keys missing Authorization/tenant headers → cross-user cached responses
- Vary misconfiguration: user-agent/language vary without auth vary leaks alternate content
- 206 partial content + stale caches leak object fragments
</cdn_and_cache_keys>
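A quick header-level heuristic for the Vary check above might look like this. It is only a sketch: the function name and the default identity header list are assumptions, and real CDN cache keys are configured out of band, so a hit here is a lead to verify rather than a finding.

```python
from typing import Mapping, Sequence


def missing_identity_vary(
    headers: Mapping[str, str],
    identity_headers: Sequence[str] = ("authorization", "cookie"),
) -> list:
    """List identity headers absent from Vary on a response shared caches may store."""
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    directives = [d.strip() for d in lowered.get("cache-control", "").split(",")]
    if "private" in directives or "no-store" in directives:
        return []  # shared caches should not store this response
    vary = [v.strip() for v in lowered.get("vary", "").split(",")]
    if "*" in vary:
        return []  # Vary: * prevents reuse without revalidation
    return [h for h in identity_headers if h not in vary]
```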
<cross_channel_mirroring>
- Inconsistent hardening between REST, GraphQL, WebSocket, and gRPC; one channel leaks schema or fields hidden in others
- SSR vs CSR: server-rendered pages omit fields while JSON API includes them; compare responses
</cross_channel_mirroring>
<introspection_and_reflection>
- GraphQL: disabled introspection still leaks via errors, field suggestions, and client bundles containing schema
- gRPC reflection: list services/messages and infer internal resource names and flows
</introspection_and_reflection>
<cloud_specific>
- S3/GCS/Azure: anonymous listing disabled but object reads allowed; metadata headers leak owner/project identifiers
- Pre-signed URLs: audience not bound; observe key scope and lifetime in URL params
</cloud_specific>
</advanced_techniques>
<usefulness_assessment>
- Actionable signals:
- Secrets/keys/tokens that grant new access (DB creds, cloud keys, JWT signing/refresh, signed URL secrets)
- Versions with a reachable, unpatched CVE on an exposed path
- Cross-tenant identifiers/data or per-user fields that differ by principal
- File paths, service hosts, or internal URLs that enable LFI/SSRF/RCE pivots
- Cache/CDN differentials (Vary/ETag/Range) that expose other users' content
- Schema/introspection revealing hidden operations or fields that return sensitive data
- Likely benign or intended:
- Public docs or non-sensitive metadata explicitly documented as public
- Generic server names without precise versions or exploit path
- Redacted/sanitized fields with stable length/ETag across principals
- Per-user data visible only to the owner and consistent with privacy policy
</usefulness_assessment>
<triage_rubric>
- Critical: Credentials/keys; signed URL secrets; config dumps; unrestricted admin/observability panels
- High: Versions with reachable CVEs; cross-tenant data; caches serving cross-user content; schema enabling auth bypass
- Medium: Internal paths/hosts enabling LFI/SSRF pivots; source maps revealing hidden endpoints/IDs
- Low: Generic headers, marketing versions, intended documentation without exploit path
- Guidance: Always attempt a minimal, reversible proof for Critical/High; if no safe chain exists, document precise blocker and downgrade
</triage_rubric>
<escalation_playbook>
- If DVCS/backups/configs → extract secrets; test least-privileged read; rotate after coordinated disclosure
- If versions → map to CVE; verify exposure; execute minimal PoC under strict scope
- If schema/introspection → call hidden/privileged fields with non-owner tokens; confirm auth gaps
- If source maps/client JSON → mine endpoints/IDs/flags; pivot to IDOR/listing; validate filtering
- If cache/CDN keys → demonstrate cross-user cache leak via Vary/ETag/Range; escalate to broken access control
- If paths/hosts → target LFI/SSRF with harmless reads (e.g., /etc/hostname, metadata headers); avoid destructive actions
- If observability/admin → enumerate read-only info first; prove data scope breach; avoid write/exec operations
</escalation_playbook>
<exploitation_chains>
<credential_extraction>
- DVCS/config dumps exposing secrets (DB, SMTP, JWT, cloud)
- Keys → cloud control plane access; rotate and verify scope
</credential_extraction>
<version_to_cve>
1. Derive precise component versions from headers/errors/bundles.
2. Map to known CVEs and confirm reachability.
3. Execute minimal proof targeting disclosed component.
</version_to_cve>
<path_disclosure_to_lfi>
1. Paths from stack traces/templates reveal filesystem layout.
2. Use LFI/traversal to fetch config/keys.
3. Prove controlled access without altering state.
</path_disclosure_to_lfi>
<schema_to_auth_bypass>
1. Schema reveals hidden fields/endpoints.
2. Attempt requests with those fields; confirm missing authorization or field filtering.
</schema_to_auth_bypass>
</exploitation_chains>
<validation>
1. Provide raw evidence (headers/body/artifact) and explain exact data revealed.
2. Determine intent: cross-check docs/UX; classify per triage rubric (Critical/High/Medium/Low).
3. Attempt minimal, reversible exploitation or present a concrete step-by-step chain (what to try next and why).
4. Show reproducibility and minimal request set; include cross-channel confirmation where applicable.
5. Bound scope (user, tenant, environment) and data sensitivity classification.
</validation>
<false_positives>
- Intentional public docs or non-sensitive metadata with no exploit path
- Generic errors with no actionable details
- Redacted fields that do not change differential oracles (length/ETag stable)
- Version banners with no exposed vulnerable surface and no chain
- Owner-visible-only details that do not cross identity/tenant boundaries
</false_positives>
<impact>
- Accelerated exploitation of RCE/LFI/SSRF via precise versions and paths
- Credential/secret exposure leading to persistent external compromise
- Cross-tenant data disclosure through exports, caches, or mis-scoped signed URLs
- Privacy/regulatory violations and business intelligence leakage
</impact>
<pro_tips>
1. Start with artifacts (DVCS, backups, maps) before payloads; artifacts yield the fastest wins.
2. Normalize responses and diff by digest to reduce noise when comparing roles.
3. Hunt source maps and client data JSON; they often carry internal IDs and flags.
4. Probe caches/CDNs for identity-unaware keys; verify Vary includes Authorization/tenant.
5. Treat introspection and reflection as configuration findings across GraphQL/gRPC; validate per environment.
6. Mine observability endpoints last; they are noisy but high-yield in misconfigured setups.
7. Chain quickly to a concrete risk and stop—proof should be minimal and reversible.
</pro_tips>
<remember>Information disclosure is an amplifier. Convert leaks into precise, minimal exploits or clear architectural risks.</remember>
</information_disclosure_vulnerability_guide>


@@ -0,0 +1,177 @@
<open_redirect_vulnerability_guide>
<title>OPEN REDIRECT</title>
<critical>Open redirects enable phishing, OAuth/OIDC code and token theft, and allowlist bypass in server-side fetchers that follow redirects. Treat every redirect target as untrusted: canonicalize and enforce exact allowlists per scheme, host, and path.</critical>
<scope>
- Server-driven redirects (HTTP 3xx Location) and client-driven redirects (window.location, meta refresh, SPA routers)
- OAuth/OIDC/SAML flows using redirect_uri, post_logout_redirect_uri, RelayState, returnTo/continue/next
- Multi-hop chains where only the first hop is validated
- Allowlist/canonicalization bypasses across URL parsers and reverse proxies
</scope>
<methodology>
1. Inventory all redirect surfaces: login/logout, password reset, SSO/OAuth flows, payment gateways, email links, invite/verification, unsubscribe, language/locale switches, /out or /r redirectors.
2. Build a test matrix of scheme×host×path variants and encoding/unicode forms. Compare server-side validation vs browser navigation results.
3. Exercise multi-hop: trusted-domain → redirector → external. Verify if validation applies pre- or post-redirect.
4. Prove impact: credential phishing, OAuth code interception, internal egress (if a server fetcher follows redirects).
</methodology>
<discovery_techniques>
<injection_points>
- Params: redirect, url, next, return_to, returnUrl, continue, goto, target, callback, out, dest, back, to, r, u
- OAuth/OIDC/SAML: redirect_uri, post_logout_redirect_uri, RelayState, state (if used to compute final destination)
- SPA: router.push/replace, location.assign/href, meta refresh, window.open
- Headers influencing construction: Host, X-Forwarded-Host/Proto, Referer; and server-side Location echo
</injection_points>
<parser_differentials>
<userinfo>
https://trusted.com@evil.com → many validators parse host as trusted.com, browser navigates to evil.com
Variants: trusted.com%40evil.com, a%40evil.com%40trusted.com
</userinfo>
<backslash_and_slashes>
https://trusted.com\\evil.com, https://trusted.com\\@evil.com, ///evil.com, /\\evil.com
Windows/backends may normalize \\ to /; browsers differ on interpretation of extra leading slashes
</backslash_and_slashes>
<whitespace_and_ctrl>
http%09://evil.com, http%0A://evil.com, trusted.com%09evil.com
Control/whitespace around the scheme/host can split parsers
</whitespace_and_ctrl>
<fragment_and_query>
trusted.com#@evil.com, trusted.com?//@evil.com, ?next=//evil.com#@trusted.com
Validators often stop at # while the browser parses after it
</fragment_and_query>
<unicode_and_idna>
Punycode/IDN: truѕted.com (Cyrillic), trusted.com。evil.com (full-width dot), trailing dot trusted.com.
Test with mixed Unicode normalization and IDNA conversion
</unicode_and_idna>
</parser_differentials>
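The userinfo differential can be reproduced with the Python standard library. This is a small sketch: `naive_is_trusted` is a deliberately flawed, hypothetical validator, and `urllib.parse` stands in for "a real URL parser" (it is not a WHATWG parser, so browser behavior should still be confirmed separately).

```python
from urllib.parse import urlsplit


def naive_is_trusted(url: str) -> bool:
    """Flawed check: only looks at how the string starts."""
    return url.startswith("https://trusted.com")


def parsed_host(url: str) -> str:
    """Host the URL actually points at, ignoring any userinfo before '@'."""
    return urlsplit(url).hostname or ""


payload = "https://trusted.com@evil.com/login"
print(naive_is_trusted(payload))  # the prefix check accepts the payload
print(parsed_host(payload))       # the parser reports the real destination
```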
<encoding_bypasses>
- Double encoding: %2f%2fevil.com, %252f%252fevil.com
- Mixed case and scheme smuggling: hTtPs://evil.com, http:evil.com
- IP variants: decimal 2130706433, octal 0177.0.0.1, hex 0x7f.1, IPv6 [::ffff:127.0.0.1]
- User-controlled path bases: /out?url=/\\evil.com
</encoding_bypasses>
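Two of the bypasses above, repeated percent-encoding and numeric IP forms, can be normalized as in the sketch below. The helper names are hypothetical, and octal and hex dotted forms are deliberately not handled here; they need a separate, parser-specific treatment.

```python
import ipaddress
from urllib.parse import unquote


def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


def canonical_ip(host: str) -> str:
    """Return the plain form of decimal-integer and IPv4-mapped IPv6 literals."""
    if host.isdigit():
        return str(ipaddress.ip_address(int(host)))
    addr = ipaddress.ip_address(host.strip("[]"))
    mapped = getattr(addr, "ipv4_mapped", None)  # only IPv6 addresses have this
    return str(mapped or addr)
```

Comparing a validator's view of the raw input against `fully_decode` and `canonical_ip` output is a cheap way to spot checks that run before normalization.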
</discovery_techniques>
<allowlist_evasion>
<common_mistakes>
- Substring/regex contains checks: allows trusted.com.evil.com, or path matches leaking external
- Wildcards: *.trusted.com also matches attacker.trusted.com.evil.net
- Missing scheme pinning: data:, javascript:, file:, gopher: accepted
- Case/IDN drift between validator and browser
</common_mistakes>
<robust_validation>
- Canonicalize with a single modern URL parser (WHATWG URL) and compare exact scheme, hostname (post-IDNA), and an explicit allowlist with optional exact path prefixes
- Require absolute HTTPS; reject protocol-relative // and unknown schemes
- Normalize and compare the destination as submitted, without following redirects; if the server does follow redirects, re-validate the destination at every hop server-side
</robust_validation>
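The validation rules above can be sketched as a single function. This is a minimal illustration under stated assumptions: `ALLOWED_HOSTS` is a hypothetical allowlist, and Python's `urllib.parse` is used as the single parser even though it is not a WHATWG implementation, so risky characters are rejected before parsing instead of being normalized.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for this sketch: exact hostnames only, no wildcards.
ALLOWED_HOSTS = frozenset(("trusted.com", "app.trusted.com"))


def _has_risky_chars(value: str) -> bool:
    """Backslashes, whitespace, and control characters split parsers; reject them."""
    return any(ch == "\\" or ch.isspace() or not ch.isprintable() for ch in value)


def is_safe_redirect(target: str) -> bool:
    """Accept only absolute HTTPS URLs whose exact host is allowlisted."""
    if _has_risky_chars(target):
        return False
    try:
        parts = urlsplit(target)
        port = parts.port
        host = (parts.hostname or "").encode("idna").decode("ascii")
    except (ValueError, UnicodeError):
        return False
    if parts.scheme != "https" or port not in (None, 443):
        return False
    if parts.username is not None or parts.password is not None:
        return False  # userinfo is never needed in a redirect target
    return host in ALLOWED_HOSTS
```

Protocol-relative and scheme-less inputs fail the `https` check, and lookalike hosts fail the exact-match lookup after IDNA encoding.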
</allowlist_evasion>
<oauth_oidc_saml>
<redirect_uri_abuse>
- Using an open redirect on a trusted domain for redirect_uri enables code interception
- Weak prefix/suffix checks: https://trusted.com → https://trusted.com.evil.com; /callback → /callback@evil.com
- Path traversal/canonicalization: /oauth/../../@evil.com
- post_logout_redirect_uri often less strictly validated; test both
- state must be unguessable and bound to client/session; do not recompute final destination from state without validation
</redirect_uri_abuse>
<defense_notes>
- Pre-register exact redirect_uri values per client (no wildcards). Enforce exact scheme/host/port/path match
- For public native apps, follow RFC guidance (loopback 127.0.0.1 with exact port handling); disallow open web redirectors
- SAML RelayState should be validated against an allowlist or ignored for absolute URLs
</defense_notes>
</oauth_oidc_saml>
<client_side_vectors>
<javascript_redirects>
- location.href/assign/replace using user input; ensure targets are normalized and restricted to same-origin or allowlist
- meta refresh content=0;url=USER_INPUT; browsers treat javascript:/data: differently; still dangerous in client-controlled redirects
- SPA routers: router.push(searchParams.get('next')); enforce same-origin and strip schemes
</javascript_redirects>
</client_side_vectors>
<reverse_proxies_and_gateways>
- Host/X-Forwarded-* may change absolute URL construction; validate against server-derived canonical origin, not client headers
- CDNs that follow redirects for link checking or prefetching can leak tokens when chained with open redirects
</reverse_proxies_and_gateways>
<ssrf_chaining>
- Some server-side fetchers (web previewers, link unfurlers, validators) follow 3xx; combine with an open redirect on an allowlisted domain to pivot to internal targets (169.254.169.254, localhost, cluster addresses)
- Confirm by observing distinct error/timing for internal vs external, or OAST callbacks when reachable
</ssrf_chaining>
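When reviewing a recorded redirect chain, a per-hop check for internal IP literals can be sketched as below. The function names are hypothetical, and the sketch only inspects literal addresses and `localhost`; hostnames that resolve to internal addresses would need DNS resolution, which is out of scope here.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_literal(url: str) -> bool:
    """True when the URL's host is localhost or an IP literal in a non-public range."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; resolving it is not attempted in this sketch
    return addr.is_loopback or addr.is_link_local or addr.is_private


def first_internal_hop(chain):
    """Return the first hop in a redirect chain that targets an internal literal."""
    for hop in chain:
        if is_internal_literal(hop):
            return hop
    return None
```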
<framework_notes>
<server_side>
- Rails: redirect_to params[:url] without URI parsing; test array params and protocol-relative
- Django: HttpResponseRedirect(request.GET['next']) without url_has_allowed_host_and_scheme (formerly is_safe_url); relies on ALLOWED_HOSTS + scheme checks
- Spring: return "redirect:" + param; ensure UriComponentsBuilder normalization and allowlist
- Express: res.redirect(req.query.url); use a safe redirect helper enforcing relative paths or a vetted allowlist
</server_side>
<client_side>
- React/Next.js/Vue/Angular routing based on URLSearchParams; ensure same-origin policy and disallow external schemes in client code
</client_side>
</framework_notes>
<exploitation_scenarios>
<oauth_code_interception>
1. Set redirect_uri to https://trusted.example/out?url=https://attacker.tld/cb
2. IdP sends code to trusted.example which redirects to attacker.tld
3. Exchange code for tokens; demonstrate account access
</oauth_code_interception>
<phishing_flow>
1. Send link on trusted domain: /login?next=https://attacker.tld/fake
2. Victim authenticates; browser navigates to attacker page
3. Capture credentials/tokens via cloned UI or injected JS
</phishing_flow>
<internal_evasion>
1. Server-side link unfurler fetches https://trusted.example/out?u=http://169.254.169.254/latest/meta-data
2. Redirect follows to metadata; confirm via timing/headers or controlled endpoints
</internal_evasion>
</exploitation_scenarios>
<validation>
1. Produce a minimal URL that navigates to an external domain via the vulnerable surface; include the full address bar capture.
2. Show bypass of the stated validation (regex/allowlist) using canonicalization variants.
3. Test multi-hop: prove only first hop is validated and second hop escapes constraints.
4. For OAuth/SAML, demonstrate code/RelayState delivery to an attacker-controlled endpoint with role-separated evidence.
</validation>
<false_positives>
- Redirects constrained to relative same-origin paths with robust normalization
- Exact pre-registered OAuth redirect_uri with strict verifier
- Validators using a single canonical parser and comparing post-IDNA host and scheme
- User prompts that show the exact final destination before navigating and refuse unknown schemes
</false_positives>
<impact>
- Credential and token theft via phishing and OAuth/OIDC interception
- Internal data exposure when server fetchers follow redirects (previewers/unfurlers)
- Policy bypass where allowlists are enforced only on the first hop
- Cross-application trust erosion and brand abuse
</impact>
<pro_tips>
1. Always compare server-side canonicalization to real browser navigation; differences reveal bypasses.
2. Try userinfo, protocol-relative, Unicode/IDN, and IP numeric variants early; they catch many weak validators.
3. In OAuth, prioritize post_logout_redirect_uri and less-discussed flows; they're often looser.
4. Exercise multi-hop across distinct subdomains and paths; validators commonly check only hop 1.
5. For SSRF chaining, target services known to follow redirects and log their outbound requests.
6. Favor allowlists of exact origins plus optional path prefixes; never substring/regex contains checks.
7. Keep a curated suite of redirect payloads per runtime (Java, Node, Python, Go) reflecting each parser's quirks.
</pro_tips>
<remember>Redirection is safe only when the final destination is constrained after canonicalization. Enforce exact origins, verify per hop, and treat client-provided destinations as untrusted across every stack.</remember>
</open_redirect_vulnerability_guide>


@@ -0,0 +1,155 @@
<subdomain_takeover_guide>
<title>SUBDOMAIN TAKEOVER</title>
<critical>Subdomain takeover lets an attacker serve content from a trusted subdomain by claiming resources referenced by dangling DNS (CNAME/A/ALIAS/NS) or mis-bound provider configurations. Consequences include phishing on a trusted origin, cookie and CORS pivot, OAuth redirect abuse, and CDN cache poisoning.</critical>
<scope>
- Dangling CNAME/A/ALIAS to third-party services (hosting, storage, serverless, CDN)
- Orphaned NS delegations (child zones with abandoned/expired nameservers)
- Decommissioned SaaS integrations (support, docs, marketing, forms) referenced via CNAME
- CDN “alternate domain” mappings (CloudFront/Fastly/Azure CDN) lacking ownership verification
- Storage and static hosting endpoints (S3/Blob/GCS buckets, GitHub/GitLab Pages)
</scope>
<methodology>
1. Enumerate subdomains comprehensively (web, API, mobile, legacy): aggregate CT logs, passive DNS, and org inventory. De-duplicate and normalize.
2. Resolve DNS for all RR types: A/AAAA, CNAME, NS, MX, TXT. Keep CNAME chains; record terminal CNAME targets and provider hints.
3. HTTP/TLS probe: capture status, body, length, canonical error text, Server/alt-svc headers, certificate SANs, and CDN headers (Via, X-Served-By).
4. Fingerprint providers: map known “unclaimed/missing resource” signatures to candidate services. Maintain a living dictionary.
5. Attempt claim (only with authorization): create the missing resource on the provider with the exact required name; bind the custom domain if the provider allows.
6. Validate control: serve a minimal unique payload; confirm over HTTPS; optionally obtain a DV certificate (CT log evidence) within legal scope.
</methodology>
<discovery_techniques>
<enumeration_pipeline>
- Subdomain inventory: combine CT (crt.sh APIs), passive DNS sources, in-house asset lists, IaC/terraform outputs, mobile app assets, and historical DNS
- Resolver sweep: use IPv4/IPv6-aware resolvers; track NXDOMAIN vs SERVFAIL vs provider-branded 4xx/5xx responses
- Record graph: build a CNAME graph and collapse chains to identify external endpoints (e.g., myapp.example.com → foo.azurewebsites.net)
</enumeration_pipeline>
<dns_indicators>
- CNAME targets ending in provider domains: github.io, amazonaws.com, cloudfront.net, azurewebsites.net, blob.core.windows.net, fastly.net, vercel.app, netlify.app, herokudns.com, trafficmanager.net, azureedge.net, akamaized.net
- Orphaned NS: subzone delegated to nameservers on a domain that has expired or no longer hosts authoritative servers; or to nonexistent NS hosts
- MX to third-party mail providers with decommissioned domains (risk: mail subdomain control or delivery manipulation)
- TXT/verification artifacts (asuid, _dnsauth, _github-pages-challenge) suggesting previous external bindings
</dns_indicators>
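Matching terminal CNAME targets against provider domains can be sketched as a label-boundary suffix check. The tuple below copies a few suffixes from the indicator list above; the `provider_for` helper is a hypothetical name, and a match only marks a candidate for further probing.

```python
from typing import Optional

# A subset of the provider suffixes listed above, kept short for the sketch.
PROVIDER_SUFFIXES = (
    "github.io",
    "amazonaws.com",
    "cloudfront.net",
    "azurewebsites.net",
    "herokudns.com",
    "fastly.net",
)


def provider_for(cname_target: str) -> Optional[str]:
    """Return the provider suffix a CNAME target falls under, if any."""
    name = cname_target.rstrip(".").lower()
    for suffix in PROVIDER_SUFFIXES:
        # Require a label boundary so "evilgithub.io" does not match "github.io".
        if name == suffix or name.endswith("." + suffix):
            return suffix
    return None
```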
<http_fingerprints>
- Service-specific unclaimed messages (examples, not exhaustive):
- GitHub Pages: “There isn’t a GitHub Pages site here.”
- Fastly: “Fastly error: unknown domain”
- Heroku: “No such app” or “There’s nothing here, yet.”
- S3 static site: “NoSuchBucket” / “The specified bucket does not exist”
- CloudFront (alt domain not configured): 403/400 with “The request could not be satisfied” and no matching distribution
- Azure App Service: default 404 for azurewebsites.net unless custom-domain verified (look for asuid TXT requirement)
- Shopify: “Sorry, this shop is currently unavailable”
- TLS clues: certificate CN/SAN referencing provider default host instead of the custom subdomain indicates potential mis-binding
</http_fingerprints>
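A fingerprint dictionary for the messages above can be sketched with tolerant regular expressions. The patterns are assumptions derived from the example strings in this list; provider wording changes, so they need regular review, and a match is only a candidate that still has to be proven claimable.

```python
import re

# Candidate signatures based on the examples above (illustrative, not exhaustive).
FINGERPRINTS = (
    ("github_pages", re.compile(r"there isn.?t a github pages site here", re.I)),
    ("fastly", re.compile(r"fastly error: unknown domain", re.I)),
    ("heroku", re.compile(r"no such app", re.I)),
    ("s3", re.compile(r"NoSuchBucket|the specified bucket does not exist", re.I)),
)


def candidate_providers(body: str) -> list:
    """Return provider labels whose unclaimed-resource signature appears in the body."""
    return [name for name, pattern in FINGERPRINTS if pattern.search(body)]
```

Using `.?` for the apostrophe keeps the GitHub Pages pattern stable across straight and typographic quote variants.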
</discovery_techniques>
<exploitation_techniques>
<claim_third_party_resource>
- Create the resource with the exact required name:
- Storage/hosting: S3 bucket “sub.example.com” (website endpoint) or bucket named after the CNAME target if provider dictates
- Pages hosting: create repo/site and add the custom domain (when provider does not enforce prior domain verification)
- Serverless/app hosting: create app/site matching the target hostname, then add custom domain mapping
- Bind the custom domain: some providers require TXT verification (modern hardened path), others historically allowed binding without proof
</claim_third_party_resource>
<cdn_alternate_domains>
- Add the victim subdomain as an alternate domain on your CDN distribution if the provider does not enforce domain ownership checks
- Upload a TLS cert via provider or use managed cert issuance if allowed; confirm 200 on the subdomain with your content
</cdn_alternate_domains>
<ns_delegation_takeover>
- If a child zone (e.g., zone.example.com) is delegated to nameservers under an expired domain (ns1.abandoned.tld), register abandoned.tld and host authoritative NS; publish records to control all hosts under the delegated subzone
- Validate with SOA/NS queries and serve a verification token; then add A/CNAME/MX/TXT as needed
</ns_delegation_takeover>
<mail_surface>
- If MX points to a decommissioned provider that allowed inbox creation without domain re-verification (historically), a takeover could enable email receipt for that subdomain; modern providers generally require explicit TXT ownership
</mail_surface>
</exploitation_techniques>
<advanced_techniques>
<blind_and_cache_channels>
- CDN edge behavior: 404/421 vs 403 differentials reveal whether an alt name is partially configured; probe with Host header manipulation
- Cache poisoning: once taken over, exploit cache keys and Vary headers to persist malicious responses at the edge
</blind_and_cache_channels>
<ct_and_tls>
- Use CT logs to detect unexpected certificate issuance for your subdomain; for PoC, issue a DV cert post-takeover (within scope) to produce verifiable evidence
</ct_and_tls>
<oauth_and_trust_chains>
- If the subdomain is whitelisted as an OAuth redirect/callback or in CSP/script-src, a takeover elevates impact to account takeover or script injection on trusted origins
</oauth_and_trust_chains>
<provider_edges>
- Many providers hardened domain binding (TXT verification) but legacy projects or specific products remain weak; verify per-product behavior (CDN vs app hosting vs storage)
- Multi-tenant providers sometimes accept custom domains at the edge even when backend resource is missing; leverage timing and registration windows
</provider_edges>
</advanced_techniques>
<bypass_techniques>
<verification_gaps>
- Look for providers that accept domain binding prior to TXT verification, or where verification is optional for trial/legacy tiers
- Race windows: re-claim resource names immediately after victim deletion while DNS still points to provider
</verification_gaps>
<wildcards_and_fallbacks>
- Wildcard CNAMEs to providers may expose unbounded subdomains; test random hosts to identify service-wide unclaimed behavior
- Fallback origins: CDNs configured with multiple origins may expose unknown-domain responses from a default origin that is claimable
</wildcards_and_fallbacks>
</bypass_techniques>
<special_contexts>
<storage_and_static>
- S3/GCS/Azure Blob static sites: bucket naming constraints dictate whether a bucket can match hostname; website vs API endpoints differ in claimability and fingerprints
</storage_and_static>
<serverless_and_hosting>
- GitHub/GitLab Pages, Netlify, Vercel, Azure Static Web Apps: domain binding flows vary; most require TXT now, but historical projects or specific paths may not
</serverless_and_hosting>
<cdn_and_edge>
- CloudFront/Fastly/Azure CDN/Akamai: alternate domain verification differs; some products historically allowed alt-domain claims without proof
</cdn_and_edge>
<dns_delegations>
- Child-zone NS delegations outrank parent records; control of delegated NS yields full control of all hosts below that label
</dns_delegations>
</special_contexts>
<validation>
1. Before: record DNS chain, HTTP response (status/body length/fingerprint), and TLS details.
2. After claim: serve unique content and verify over HTTPS at the target subdomain.
3. Optional: issue a DV certificate (legal scope) and reference CT entry as durable evidence.
4. Demonstrate impact chains (CSP/script-src trust, OAuth redirect acceptance, cookie Domain scoping) with minimal PoCs.
</validation>
<false_positives>
- “Unknown domain” pages that are not claimable due to enforced TXT/ownership checks
- Provider-branded default pages for valid, owned resources (not a takeover) versus “unclaimed resource” states
- Soft 404s from your own infrastructure or catch-all vhosts
</false_positives>
<impact>
- Content injection under trusted subdomain: phishing, malware delivery, brand damage
- Cookie and CORS pivot: if parent site sets Domain-scoped cookies or allows subdomain origins in CORS/Trusted Types/CSP
- OAuth/SSO abuse via whitelisted redirect URIs
- Email delivery manipulation for subdomain (MX/DMARC/SPF interactions in edge cases)
</impact>
<pro_tips>
1. Build a pipeline: enumerate (subfinder/amass) → resolve (dnsx) → probe (httpx) → fingerprint (nuclei/custom) → verify claims.
2. Maintain a current fingerprint corpus; provider messages change frequently—prefer regex families over exact strings.
3. Prefer minimal PoCs: static “ownership proof” page and, where allowed, DV cert issuance for auditability.
4. Monitor CT for unexpected certs on your subdomains; alert and investigate.
5. Eliminate dangling DNS in decommission workflows first; deletion of the app/service must remove or block the DNS target.
6. For NS delegations, treat any expired nameserver domain as critical; reassign or remove delegation immediately.
7. Use CAA to limit certificate issuance while you triage; it reduces the blast radius for taken-over hosts.
</pro_tips>
<remember>Subdomain safety is lifecycle safety: if DNS points at anything, you must own and verify the thing on every provider and product path. Remove or verify—there is no safe middle.</remember>
</subdomain_takeover_guide>

38
strix/telemetry/README.md Normal file

@@ -0,0 +1,38 @@
### Overview
To help make Strix better for everyone, we collect anonymized data that helps us understand how to improve our AI security agent, guide the addition of new features, and fix common errors and bugs. This feedback loop is crucial for improving Strix's capabilities and user experience.
We use [PostHog](https://posthog.com), an open-source analytics platform, for data collection and analysis. Our telemetry implementation is fully transparent - you can review the [source code](https://github.com/usestrix/strix/blob/main/strix/telemetry/posthog.py) to see exactly what we track.
### Telemetry Policy
Privacy is our priority. All collected data is anonymized by default. Each session gets a random UUID that is not persisted or tied to you. Your code, scan targets, vulnerability details, and findings always remain private and are never collected.
### What We Track
We collect only very **basic** usage data including:
**Session Errors:** Duration and error types (not messages or stack traces)\
**System Context:** OS type, architecture, Strix version\
**Scan Context:** Scan mode (quick/standard/deep), scan type (whitebox/blackbox)\
**Model Usage:** Which LLM model is being used (not prompts or responses)\
**Aggregate Metrics:** Vulnerability counts by severity, agent/tool counts, token usage and cost estimates
For complete transparency, you can inspect our [telemetry implementation](https://github.com/usestrix/strix/blob/main/strix/telemetry/posthog.py) to see the exact events we track.
### What We **Never** Collect
- IP addresses, usernames, or any identifying information
- Scan targets, file paths, target URLs, or domains
- Vulnerability details, descriptions, or code
- LLM requests and responses
### How to Opt Out
Telemetry in Strix is entirely **optional**:
```bash
export STRIX_TELEMETRY=0
```
You can set this environment variable before running Strix to disable **all** telemetry.


@@ -1,4 +1,10 @@
+from . import posthog
 from .tracer import Tracer, get_global_tracer, set_global_tracer
-__all__ = ["Tracer", "get_global_tracer", "set_global_tracer"]
+__all__ = [
+    "Tracer",
+    "get_global_tracer",
+    "posthog",
+    "set_global_tracer",
+]

strix/telemetry/posthog.py (new file, 137 lines)

@@ -0,0 +1,137 @@
import json
import platform
import sys
import urllib.request
from pathlib import Path
from typing import TYPE_CHECKING, Any
from uuid import uuid4

from strix.config import Config


if TYPE_CHECKING:
    from strix.telemetry.tracer import Tracer

_POSTHOG_PUBLIC_API_KEY = "phc_7rO3XRuNT5sgSKAl6HDIrWdSGh1COzxw0vxVIAR6vVZ"
_POSTHOG_HOST = "https://us.i.posthog.com"

_SESSION_ID = uuid4().hex[:16]


def _is_enabled() -> bool:
    return (Config.get("strix_telemetry") or "1").lower() not in ("0", "false", "no", "off")


def _is_first_run() -> bool:
    marker = Path.home() / ".strix" / ".seen"
    if marker.exists():
        return False
    try:
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.touch()
    except Exception:  # noqa: BLE001, S110
        pass  # nosec B110
    return True


def _get_version() -> str:
    try:
        from importlib.metadata import version

        return version("strix-agent")
    except Exception:  # noqa: BLE001
        return "unknown"


def _send(event: str, properties: dict[str, Any]) -> None:
    if not _is_enabled():
        return
    try:
        payload = {
            "api_key": _POSTHOG_PUBLIC_API_KEY,
            "event": event,
            "distinct_id": _SESSION_ID,
            "properties": properties,
        }
        req = urllib.request.Request(  # noqa: S310
            f"{_POSTHOG_HOST}/capture/",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=10):  # noqa: S310  # nosec B310
            pass
    except Exception:  # noqa: BLE001, S110
        pass  # nosec B110


def _base_props() -> dict[str, Any]:
    return {
        "os": platform.system().lower(),
        "arch": platform.machine(),
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "strix_version": _get_version(),
    }


def start(
    model: str | None,
    scan_mode: str | None,
    is_whitebox: bool,
    interactive: bool,
    has_instructions: bool,
) -> None:
    _send(
        "scan_started",
        {
            **_base_props(),
            "model": model or "unknown",
            "scan_mode": scan_mode or "unknown",
            "scan_type": "whitebox" if is_whitebox else "blackbox",
            "interactive": interactive,
            "has_instructions": has_instructions,
            "first_run": _is_first_run(),
        },
    )


def finding(severity: str) -> None:
    _send(
        "finding_reported",
        {
            **_base_props(),
            "severity": severity.lower(),
        },
    )


def end(tracer: "Tracer", exit_reason: str = "completed") -> None:
    vulnerabilities_counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
    for v in tracer.vulnerability_reports:
        sev = v.get("severity", "info").lower()
        if sev in vulnerabilities_counts:
            vulnerabilities_counts[sev] += 1

    llm = tracer.get_total_llm_stats()
    total = llm.get("total", {})

    _send(
        "scan_ended",
        {
            **_base_props(),
            "exit_reason": exit_reason,
            "duration_seconds": round(tracer._calculate_duration()),
            "vulnerabilities_total": len(tracer.vulnerability_reports),
            **{f"vulnerabilities_{k}": v for k, v in vulnerabilities_counts.items()},
            "agent_count": len(tracer.agents),
            "tool_count": tracer.get_real_tool_count(),
            "llm_tokens": llm.get("total_tokens", 0),
            "llm_cost": total.get("cost", 0.0),
        },
    )


def error(error_type: str, error_msg: str | None = None) -> None:
    props = {**_base_props(), "error_type": error_type}
    if error_msg:
        props["error_msg"] = error_msg
    _send("error", props)
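The severity bucketing inside `end` can be exercised on its own. The sketch below is a standalone version under the assumption that each report is a plain dict with an optional `severity` key; `count_by_severity` is a hypothetical helper name, not part of the module.

```python
def count_by_severity(reports):
    """Count reports per known severity; unknown labels are left out of the buckets."""
    counts = {"critical": 0, "high": 0, "medium": 0, "low": 0, "info": 0}
    for report in reports:
        # A report without a severity is treated as "info".
        severity = report.get("severity", "info").lower()
        if severity in counts:
            counts[severity] += 1
    return counts


reports = [{"severity": "High"}, {"severity": "low"}, {}, {"severity": "bogus"}]
counts = count_by_severity(reports)
print(counts, "total:", len(reports))
```

One consequence of this design is that a report with an unrecognized severity still contributes to the total (which is `len(reports)`) but to none of the per-severity counters, so the buckets may sum to less than the total.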


@@ -4,6 +4,8 @@ from pathlib import Path
from typing import TYPE_CHECKING, Any, Optional
from uuid import uuid4
from strix.telemetry import posthog
if TYPE_CHECKING:
from collections.abc import Callable
@@ -33,6 +35,8 @@ class Tracer:
self.agents: dict[str, dict[str, Any]] = {}
self.tool_executions: dict[int, dict[str, Any]] = {}
self.chat_messages: list[dict[str, Any]] = []
self.streaming_content: dict[str, str] = {}
self.interrupted_content: dict[str, str] = {}
self.vulnerability_reports: list[dict[str, Any]] = []
self.final_scan_result: str | None = None
@@ -50,8 +54,9 @@ class Tracer:
self._run_dir: Path | None = None
self._next_execution_id = 1
self._next_message_id = 1
self._saved_vuln_ids: set[str] = set()
self.vulnerability_found_callback: Callable[[str, str, str, str], None] | None = None
self.vulnerability_found_callback: Callable[[dict[str, Any]], None] | None = None
def set_run_name(self, run_name: str) -> None:
self.run_name = run_name
@@ -59,7 +64,7 @@ class Tracer:
def get_run_dir(self) -> Path:
if self._run_dir is None:
runs_dir = Path.cwd() / "agent_runs"
runs_dir = Path.cwd() / "strix_runs"
runs_dir.mkdir(exist_ok=True)
run_dir_name = self.run_name if self.run_name else self.run_id
@@ -68,46 +73,118 @@ class Tracer:
return self._run_dir
def add_vulnerability_report(
def add_vulnerability_report( # noqa: PLR0912
self,
title: str,
content: str,
severity: str,
description: str | None = None,
impact: str | None = None,
target: str | None = None,
technical_analysis: str | None = None,
poc_description: str | None = None,
poc_script_code: str | None = None,
remediation_steps: str | None = None,
cvss: float | None = None,
cvss_breakdown: dict[str, str] | None = None,
endpoint: str | None = None,
method: str | None = None,
cve: str | None = None,
code_file: str | None = None,
code_before: str | None = None,
code_after: str | None = None,
code_diff: str | None = None,
) -> str:
report_id = f"vuln-{len(self.vulnerability_reports) + 1:04d}"
report = {
report: dict[str, Any] = {
"id": report_id,
"title": title.strip(),
"content": content.strip(),
"severity": severity.lower().strip(),
"timestamp": datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S UTC"),
}
if description:
report["description"] = description.strip()
if impact:
report["impact"] = impact.strip()
if target:
report["target"] = target.strip()
if technical_analysis:
report["technical_analysis"] = technical_analysis.strip()
if poc_description:
report["poc_description"] = poc_description.strip()
if poc_script_code:
report["poc_script_code"] = poc_script_code.strip()
if remediation_steps:
report["remediation_steps"] = remediation_steps.strip()
if cvss is not None:
report["cvss"] = cvss
if cvss_breakdown:
report["cvss_breakdown"] = cvss_breakdown
if endpoint:
report["endpoint"] = endpoint.strip()
if method:
report["method"] = method.strip()
if cve:
report["cve"] = cve.strip()
if code_file:
report["code_file"] = code_file.strip()
if code_before:
report["code_before"] = code_before.strip()
if code_after:
report["code_after"] = code_after.strip()
if code_diff:
report["code_diff"] = code_diff.strip()
self.vulnerability_reports.append(report)
logger.info(f"Added vulnerability report: {report_id} - {title}")
posthog.finding(severity)
if self.vulnerability_found_callback:
self.vulnerability_found_callback(
report_id, title.strip(), content.strip(), severity.lower().strip()
)
self.vulnerability_found_callback(report)
self.save_run_data()
return report_id
def set_final_scan_result(
self,
content: str,
success: bool = True,
) -> None:
self.final_scan_result = content.strip()
def get_existing_vulnerabilities(self) -> list[dict[str, Any]]:
return list(self.vulnerability_reports)
def update_scan_final_fields(
self,
executive_summary: str,
methodology: str,
technical_analysis: str,
recommendations: str,
) -> None:
self.scan_results = {
"scan_completed": True,
"content": content,
"success": success,
"executive_summary": executive_summary.strip(),
"methodology": methodology.strip(),
"technical_analysis": technical_analysis.strip(),
"recommendations": recommendations.strip(),
"success": True,
}
logger.info(f"Set final scan result: success={success}")
self.final_scan_result = f"""# Executive Summary
{executive_summary.strip()}
# Methodology
{methodology.strip()}
# Technical Analysis
{technical_analysis.strip()}
# Recommendations
{recommendations.strip()}
"""
logger.info("Updated scan final fields")
self.save_run_data(mark_complete=True)
posthog.end(self, exit_reason="finished_by_tool")
def log_agent_creation(
self, agent_id: str, name: str, task: str, parent_id: str | None = None
@@ -197,11 +274,13 @@ class Tracer:
"max_iterations": config.get("max_iterations", 200),
}
)
self.get_run_dir()
def save_run_data(self) -> None:
def save_run_data(self, mark_complete: bool = False) -> None: # noqa: PLR0912, PLR0915
try:
run_dir = self.get_run_dir()
self.end_time = datetime.now(UTC).isoformat()
if mark_complete:
self.end_time = datetime.now(UTC).isoformat()
if self.final_scan_result:
penetration_test_report_file = run_dir / "penetration_test_report.md"
@@ -219,21 +298,76 @@ class Tracer:
vuln_dir = run_dir / "vulnerabilities"
vuln_dir.mkdir(exist_ok=True)
new_reports = [
report
for report in self.vulnerability_reports
if report["id"] not in self._saved_vuln_ids
]
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
sorted_reports = sorted(
self.vulnerability_reports,
key=lambda x: (severity_order.get(x["severity"], 5), x["timestamp"]),
)
for report in sorted_reports:
for report in new_reports:
vuln_file = vuln_dir / f"{report['id']}.md"
with vuln_file.open("w", encoding="utf-8") as f:
f.write(f"# {report['title']}\n\n")
f.write(f"**ID:** {report['id']}\n")
f.write(f"**Severity:** {report['severity'].upper()}\n")
f.write(f"**Found:** {report['timestamp']}\n\n")
f.write("## Description\n\n")
f.write(f"{report['content']}\n")
f.write(f"# {report.get('title', 'Untitled Vulnerability')}\n\n")
f.write(f"**ID:** {report.get('id', 'unknown')}\n")
f.write(f"**Severity:** {report.get('severity', 'unknown').upper()}\n")
f.write(f"**Found:** {report.get('timestamp', 'unknown')}\n")
metadata_fields: list[tuple[str, Any]] = [
("Target", report.get("target")),
("Endpoint", report.get("endpoint")),
("Method", report.get("method")),
("CVE", report.get("cve")),
]
cvss_score = report.get("cvss")
if cvss_score is not None:
metadata_fields.append(("CVSS", cvss_score))
for label, value in metadata_fields:
if value:
f.write(f"**{label}:** {value}\n")
f.write("\n## Description\n\n")
desc = report.get("description") or "No description provided."
f.write(f"{desc}\n\n")
if report.get("impact"):
f.write("## Impact\n\n")
f.write(f"{report['impact']}\n\n")
if report.get("technical_analysis"):
f.write("## Technical Analysis\n\n")
f.write(f"{report['technical_analysis']}\n\n")
if report.get("poc_description") or report.get("poc_script_code"):
f.write("## Proof of Concept\n\n")
if report.get("poc_description"):
f.write(f"{report['poc_description']}\n\n")
if report.get("poc_script_code"):
f.write("```\n")
f.write(f"{report['poc_script_code']}\n")
f.write("```\n\n")
if report.get("code_file") or report.get("code_diff"):
f.write("## Code Analysis\n\n")
if report.get("code_file"):
f.write(f"**File:** {report['code_file']}\n\n")
if report.get("code_diff"):
f.write("**Changes:**\n")
f.write("```diff\n")
f.write(f"{report['code_diff']}\n")
f.write("```\n\n")
if report.get("remediation_steps"):
f.write("## Remediation\n\n")
f.write(f"{report['remediation_steps']}\n\n")
self._saved_vuln_ids.add(report["id"])
vuln_csv_file = run_dir / "vulnerabilities.csv"
with vuln_csv_file.open("w", encoding="utf-8", newline="") as f:
@@ -254,10 +388,11 @@ class Tracer:
}
)
logger.info(
f"Saved {len(self.vulnerability_reports)} vulnerability reports to: {vuln_dir}"
)
logger.info(f"Saved vulnerability index to: {vuln_csv_file}")
if new_reports:
logger.info(
f"Saved {len(new_reports)} new vulnerability report(s) to: {vuln_dir}"
)
logger.info(f"Updated vulnerability index: {vuln_csv_file}")
logger.info(f"📊 Essential scan data saved to: {run_dir}")
@@ -277,14 +412,14 @@ class Tracer:
def get_agent_tools(self, agent_id: str) -> list[dict[str, Any]]:
return [
exec_data
for exec_data in self.tool_executions.values()
for exec_data in list(self.tool_executions.values())
if exec_data.get("agent_id") == agent_id
]
def get_real_tool_count(self) -> int:
return sum(
1
for exec_data in self.tool_executions.values()
for exec_data in list(self.tool_executions.values())
if exec_data.get("tool_name") not in ["scan_start_info", "subagent_start_info"]
)
@@ -319,5 +454,28 @@ class Tracer:
"total_tokens": total_stats["input_tokens"] + total_stats["output_tokens"],
}
def update_streaming_content(self, agent_id: str, content: str) -> None:
self.streaming_content[agent_id] = content
def clear_streaming_content(self, agent_id: str) -> None:
self.streaming_content.pop(agent_id, None)
def get_streaming_content(self, agent_id: str) -> str | None:
return self.streaming_content.get(agent_id)
def finalize_streaming_as_interrupted(self, agent_id: str) -> str | None:
content = self.streaming_content.pop(agent_id, None)
if content and content.strip():
self.interrupted_content[agent_id] = content
self.log_chat_message(
content=content,
role="assistant",
agent_id=agent_id,
metadata={"interrupted": True},
)
return content
return self.interrupted_content.pop(agent_id, None)
def cleanup(self) -> None:
self.save_run_data()
self.save_run_data(mark_complete=True)
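The switch from rewriting every report to writing only unsaved ones relies on a set of already-saved IDs (`_saved_vuln_ids` in the diff above). A minimal sketch of that pattern, assuming a simplified report shape and a hypothetical `ReportWriter` class rather than the real `Tracer`:

```python
import tempfile
from pathlib import Path


class ReportWriter:
    """Hypothetical stand-in for the tracer's incremental report saving."""

    def __init__(self, out_dir):
        self.out_dir = Path(out_dir)
        self.reports = []
        self._saved_ids = set()

    def add(self, title, severity):
        report_id = f"vuln-{len(self.reports) + 1:04d}"
        self.reports.append({"id": report_id, "title": title, "severity": severity})
        return report_id

    def save(self):
        """Write only reports not yet on disk; return how many were written."""
        new_reports = [r for r in self.reports if r["id"] not in self._saved_ids]
        for report in new_reports:
            path = self.out_dir / f"{report['id']}.md"
            body = f"# {report['title']}\n\n**Severity:** {report['severity'].upper()}\n"
            path.write_text(body, encoding="utf-8")
            self._saved_ids.add(report["id"])
        return len(new_reports)


with tempfile.TemporaryDirectory() as tmp:
    writer = ReportWriter(tmp)
    writer.add("SQL injection in login form", "high")
    first = writer.save()
    writer.add("Reflected XSS in search", "medium")
    second = writer.save()
    third = writer.save()  # nothing new, so nothing is rewritten
    written = sorted(p.name for p in Path(tmp).iterdir())

print(first, second, third, written)
```

Because `save` is called after every new report, tracking saved IDs keeps each call proportional to the number of new reports instead of the whole list.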


@@ -1,5 +1,7 @@
import os
from strix.config import Config
from .executor import (
execute_tool,
execute_tool_invocation,
@@ -22,11 +24,15 @@ from .registry import (
SANDBOX_MODE = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
HAS_PERPLEXITY_API = bool(os.getenv("PERPLEXITY_API_KEY"))
HAS_PERPLEXITY_API = bool(Config.get("perplexity_api_key"))
DISABLE_BROWSER = (Config.get("strix_disable_browser") or "false").lower() == "true"
if not SANDBOX_MODE:
from .agents_graph import * # noqa: F403
from .browser import * # noqa: F403
if not DISABLE_BROWSER:
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .finish import * # noqa: F403
from .notes import * # noqa: F403
@@ -35,13 +41,14 @@ if not SANDBOX_MODE:
from .reporting import * # noqa: F403
from .terminal import * # noqa: F403
from .thinking import * # noqa: F403
from .todo import * # noqa: F403
if HAS_PERPLEXITY_API:
from .web_search import * # noqa: F403
else:
from .browser import * # noqa: F403
if not DISABLE_BROWSER:
from .browser import * # noqa: F403
from .file_edit import * # noqa: F403
from .notes import * # noqa: F403
from .proxy import * # noqa: F403
from .python import * # noqa: F403
from .terminal import * # noqa: F403


@@ -190,36 +190,35 @@ def create_agent(
task: str,
name: str,
inherit_context: bool = True,
prompt_modules: str | None = None,
skills: str | None = None,
) -> dict[str, Any]:
try:
parent_id = agent_state.agent_id
module_list = []
if prompt_modules:
module_list = [m.strip() for m in prompt_modules.split(",") if m.strip()]
skill_list = []
if skills:
skill_list = [s.strip() for s in skills.split(",") if s.strip()]
if len(module_list) > 5:
if len(skill_list) > 5:
return {
"success": False,
"error": (
"Cannot specify more than 5 prompt modules for an agent "
"(use comma-separated format)"
"Cannot specify more than 5 skills for an agent (use comma-separated format)"
),
"agent_id": None,
}
if module_list:
from strix.prompts import get_all_module_names, validate_module_names
if skill_list:
from strix.skills import get_all_skill_names, validate_skill_names
validation = validate_module_names(module_list)
validation = validate_skill_names(skill_list)
if validation["invalid"]:
available_modules = list(get_all_module_names())
available_skills = list(get_all_skill_names())
return {
"success": False,
"error": (
f"Invalid prompt modules: {validation['invalid']}. "
f"Available modules: {', '.join(available_modules)}"
f"Invalid skills: {validation['invalid']}. "
f"Available skills: {', '.join(available_skills)}"
),
"agent_id": None,
}
@@ -230,9 +229,18 @@ def create_agent(
state = AgentState(task=task, agent_name=name, parent_id=parent_id, max_iterations=300)
llm_config = LLMConfig(prompt_modules=module_list)
parent_agent = _agent_instances.get(parent_id)
timeout = None
scan_mode = "deep"
if parent_agent and hasattr(parent_agent, "llm_config"):
if hasattr(parent_agent.llm_config, "timeout"):
timeout = parent_agent.llm_config.timeout
if hasattr(parent_agent.llm_config, "scan_mode"):
scan_mode = parent_agent.llm_config.scan_mode
llm_config = LLMConfig(skills=skill_list, timeout=timeout, scan_mode=scan_mode)
agent_config = {
"llm_config": llm_config,
"state": state,

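The `skills` handling in `create_agent` (split on commas, cap at five, reject unknown names) can be sketched in isolation. The `available` set below is a made-up example; the real code delegates to `validate_skill_names` and `get_all_skill_names` from `strix.skills`, whose internals are not shown in this diff.

```python
MAX_SKILLS = 5


def parse_skills(skills, available):
    """Parse a comma-separated skills string and validate it against known names."""
    skill_list = []
    if skills:
        # Empty fragments (e.g. from a trailing comma) are dropped.
        skill_list = [s.strip() for s in skills.split(",") if s.strip()]

    if len(skill_list) > MAX_SKILLS:
        return {
            "success": False,
            "error": "Cannot specify more than 5 skills for an agent (use comma-separated format)",
            "skills": None,
        }

    invalid = [s for s in skill_list if s not in available]
    if invalid:
        return {
            "success": False,
            "error": f"Invalid skills: {invalid}. Available skills: {', '.join(sorted(available))}",
            "skills": None,
        }
    return {"success": True, "error": None, "skills": skill_list}


AVAILABLE = {"sql_injection", "xss", "ssrf", "xxe", "rce", "business_logic"}
ok = parse_skills("ssrf, xxe , rce,", AVAILABLE)
bad = parse_skills("xss, teleportation", AVAILABLE)
too_many = parse_skills("a,b,c,d,e,f", AVAILABLE)
print(ok["skills"], bad["success"], too_many["success"])
```

As in the diff, the count limit is checked before name validation, so an over-long list is rejected without reporting which names are invalid.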

@@ -79,57 +79,58 @@ Only create a new agent if no existing agent is handling the specific task.</des
<parameter name="inherit_context" type="boolean" required="false">
<description>Whether the new agent should inherit parent's conversation history and context</description>
</parameter>
<parameter name="prompt_modules" type="string" required="false">
<description>Comma-separated list of prompt modules to use for the agent (MAXIMUM 5 modules allowed). Most agents should have at least one module in order to be useful. Agents should be highly specialized - use 1-3 related modules; up to 5 for complex contexts. {{DYNAMIC_MODULES_DESCRIPTION}}</description>
<parameter name="skills" type="string" required="false">
<description>Comma-separated list of skills to use for the agent (MAXIMUM 5 skills allowed). Most agents should have at least one skill in order to be useful. Agents should be highly specialized - use 1-3 related skills; up to 5 for complex contexts. {{DYNAMIC_SKILLS_DESCRIPTION}}</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - agent_id: Unique identifier for the created agent - success: Whether the agent was created successfully - message: Status message - agent_info: Details about the created agent</description>
</returns>
<examples>
# REQUIRED: Check agent graph again before creating another agent
<function=view_agent_graph>
</function>
# After confirming no SQL testing agent exists, create agent for vulnerability validation
<function=create_agent>
<parameter=task>Validate and exploit the suspected SQL injection vulnerability found in
the login form. Confirm exploitability and document proof of concept.</parameter>
<parameter=name>SQLi Validator</parameter>
<parameter=prompt_modules>sql_injection</parameter>
<parameter=skills>sql_injection</parameter>
</function>
<function=create_agent>
<parameter=task>Test authentication mechanisms, JWT implementation, and session management
for security vulnerabilities and bypass techniques.</parameter>
<parameter=name>Auth Specialist</parameter>
<parameter=prompt_modules>authentication_jwt, business_logic</parameter>
<parameter=skills>authentication_jwt, business_logic</parameter>
</function>
# Example of single-module specialization (most focused)
# Example of single-skill specialization (most focused)
<function=create_agent>
<parameter=task>Perform comprehensive XSS testing including reflected, stored, and DOM-based
variants across all identified input points.</parameter>
<parameter=name>XSS Specialist</parameter>
<parameter=prompt_modules>xss</parameter>
<parameter=skills>xss</parameter>
</function>
# Example of up to 5 related modules (borderline acceptable)
# Example of up to 5 related skills (borderline acceptable)
<function=create_agent>
<parameter=task>Test for server-side vulnerabilities including SSRF, XXE, and potential
RCE vectors in file upload and XML processing endpoints.</parameter>
<parameter=name>Server-Side Attack Specialist</parameter>
<parameter=prompt_modules>ssrf, xxe, rce</parameter>
<parameter=skills>ssrf, xxe, rce</parameter>
</function>
</examples>
</tool>
<tool name="send_message_to_agent">
<description>Send a message to another agent in the graph for coordination and communication.</description>
<details>This enables agents to communicate with each other during execution for:
<details>This enables agents to communicate with each other during execution, but should be used only when essential:
- Sharing discovered information or findings
- Asking questions or requesting assistance
- Providing instructions or coordination
- Reporting status or results</details>
- Reporting status or results
Best practices:
- Avoid routine status updates; batch non-urgent information
- Prefer parent/child completion flows (agent_finish)
- Do not message when the context is already known</details>
<parameters>
<parameter name="target_agent_id" type="string" required="true">
<description>ID of the agent to send the message to</description>


@@ -1,8 +1,10 @@
from typing import Any, Literal, NoReturn
from typing import TYPE_CHECKING, Any, Literal, NoReturn
from strix.tools.registry import register_tool
from .tab_manager import BrowserTabManager, get_browser_tab_manager
if TYPE_CHECKING:
from .tab_manager import BrowserTabManager
BrowserAction = Literal[
@@ -71,7 +73,7 @@ def _validate_file_path(action_name: str, file_path: str | None) -> None:
def _handle_navigation_actions(
manager: BrowserTabManager,
manager: "BrowserTabManager",
action: str,
url: str | None = None,
tab_id: str | None = None,
@@ -90,7 +92,7 @@ def _handle_navigation_actions(
def _handle_interaction_actions(
manager: BrowserTabManager,
manager: "BrowserTabManager",
action: str,
coordinate: str | None = None,
text: str | None = None,
@@ -128,7 +130,7 @@ def _raise_unknown_action(action: str) -> NoReturn:
def _handle_tab_actions(
manager: BrowserTabManager,
manager: "BrowserTabManager",
action: str,
url: str | None = None,
tab_id: str | None = None,
@@ -149,7 +151,7 @@ def _handle_tab_actions(
def _handle_utility_actions(
manager: BrowserTabManager,
manager: "BrowserTabManager",
action: str,
duration: float | None = None,
js_code: str | None = None,
@@ -191,6 +193,8 @@ def browser_action(
file_path: str | None = None,
clear: bool = False,
) -> dict[str, Any]:
from .tab_manager import get_browser_tab_manager
manager = get_browser_tab_manager()
try:


@@ -4,6 +4,8 @@ from typing import Any
import httpx
from strix.config import Config
if os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "false":
from strix.runtime import get_runtime
@@ -17,6 +19,10 @@ from .registry import (
)
SANDBOX_EXECUTION_TIMEOUT = float(Config.get("strix_sandbox_execution_timeout") or "500")
SANDBOX_CONNECT_TIMEOUT = float(Config.get("strix_sandbox_connect_timeout") or "10")
async def execute_tool(tool_name: str, agent_state: Any | None = None, **kwargs: Any) -> Any:
execute_in_sandbox = should_execute_in_sandbox(tool_name)
sandbox_mode = os.getenv("STRIX_SANDBOX_MODE", "false").lower() == "true"
@@ -62,10 +68,15 @@ async def _execute_tool_in_sandbox(tool_name: str, agent_state: Any, **kwargs: A
"Content-Type": "application/json",
}
timeout = httpx.Timeout(
timeout=SANDBOX_EXECUTION_TIMEOUT,
connect=SANDBOX_CONNECT_TIMEOUT,
)
async with httpx.AsyncClient(trust_env=False) as client:
try:
response = await client.post(
request_url, json=request_data, headers=headers, timeout=None
request_url, json=request_data, headers=headers, timeout=timeout
)
response.raise_for_status()
response_data = response.json()


@@ -3,9 +3,6 @@ import re
from pathlib import Path
from typing import Any, cast
from openhands_aci import file_editor
from openhands_aci.utils.shell import run_shell_cmd
from strix.tools.registry import register_tool
@@ -33,6 +30,8 @@ def str_replace_editor(
new_str: str | None = None,
insert_line: int | None = None,
) -> dict[str, Any]:
from openhands_aci import file_editor
try:
path_obj = Path(path)
if not path_obj.is_absolute():
@@ -64,6 +63,8 @@ def list_files(
path: str,
recursive: bool = False,
) -> dict[str, Any]:
from openhands_aci.utils.shell import run_shell_cmd
try:
path_obj = Path(path)
if not path_obj.is_absolute():
@@ -116,6 +117,8 @@ def search_files(
regex: str,
file_pattern: str = "*",
) -> dict[str, Any]:
from openhands_aci.utils.shell import run_shell_cmd
try:
path_obj = Path(path)
if not path_obj.is_absolute():


@@ -4,49 +4,40 @@ from strix.tools.registry import register_tool
def _validate_root_agent(agent_state: Any) -> dict[str, Any] | None:
if (
agent_state is not None
and hasattr(agent_state, "parent_id")
and agent_state.parent_id is not None
):
if agent_state and hasattr(agent_state, "parent_id") and agent_state.parent_id is not None:
return {
"success": False,
"message": (
"This tool can only be used by the root/main agent. "
"Subagents must use agent_finish instead."
),
"error": "finish_scan_wrong_agent",
"message": "This tool can only be used by the root/main agent",
"suggestion": "If you are a subagent, use agent_finish from agents_graph tool instead",
}
return None
def _validate_content(content: str) -> dict[str, Any] | None:
if not content or not content.strip():
return {"success": False, "message": "Content cannot be empty"}
return None
def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
try:
from strix.tools.agents_graph.agents_graph_actions import _agent_graph
current_agent_id = None
if agent_state and hasattr(agent_state, "agent_id"):
if agent_state and agent_state.agent_id:
current_agent_id = agent_state.agent_id
else:
return None
running_agents = []
active_agents = []
stopping_agents = []
for agent_id, node in _agent_graph.get("nodes", {}).items():
for agent_id, node in _agent_graph["nodes"].items():
if agent_id == current_agent_id:
continue
status = node.get("status", "")
status = node.get("status", "unknown")
if status == "running":
running_agents.append(
active_agents.append(
{
"id": agent_id,
"name": node.get("name", "Unknown"),
"task": node.get("task", "No task description"),
"task": node.get("task", "Unknown task")[:300],
"status": status,
}
)
elif status == "stopping":
@@ -54,121 +45,105 @@ def _check_active_agents(agent_state: Any = None) -> dict[str, Any] | None:
{
"id": agent_id,
"name": node.get("name", "Unknown"),
"task": node.get("task", "Unknown task")[:300],
"status": status,
}
)
if running_agents or stopping_agents:
message_parts = ["Cannot finish scan while other agents are still active:"]
if running_agents:
message_parts.append("\n\nRunning agents:")
message_parts.extend(
[
f" - {agent['name']} ({agent['id']}): {agent['task']}"
for agent in running_agents
]
)
if stopping_agents:
message_parts.append("\n\nStopping agents:")
message_parts.extend(
[f" - {agent['name']} ({agent['id']})" for agent in stopping_agents]
)
message_parts.extend(
[
"\n\nSuggested actions:",
"1. Use wait_for_message to wait for all agents to complete",
"2. Send messages to agents asking them to finish if urgent",
"3. Use view_agent_graph to monitor agent status",
]
)
return {
if active_agents or stopping_agents:
response: dict[str, Any] = {
"success": False,
"message": "\n".join(message_parts),
"active_agents": {
"running": len(running_agents),
"stopping": len(stopping_agents),
"details": {
"running": running_agents,
"stopping": stopping_agents,
},
},
"error": "agents_still_active",
"message": "Cannot finish scan: agents are still active",
}
if active_agents:
response["active_agents"] = active_agents
if stopping_agents:
response["stopping_agents"] = stopping_agents
response["suggestions"] = [
"Use wait_for_message to wait for all agents to complete",
"Use send_message_to_agent if you need agents to complete immediately",
"Check agent_status to see current agent states",
]
response["total_active"] = len(active_agents) + len(stopping_agents)
return response
except ImportError:
pass
except Exception:
import logging
logging.warning("Could not check agent graph status - agents_graph module unavailable")
logging.exception("Error checking active agents")
return None
def _finalize_with_tracer(content: str, success: bool) -> dict[str, Any]:
@register_tool(sandbox_execution=False)
def finish_scan(
executive_summary: str,
methodology: str,
technical_analysis: str,
recommendations: str,
agent_state: Any = None,
) -> dict[str, Any]:
validation_error = _validate_root_agent(agent_state)
if validation_error:
return validation_error
active_agents_error = _check_active_agents(agent_state)
if active_agents_error:
return active_agents_error
validation_errors = []
if not executive_summary or not executive_summary.strip():
validation_errors.append("Executive summary cannot be empty")
if not methodology or not methodology.strip():
validation_errors.append("Methodology cannot be empty")
if not technical_analysis or not technical_analysis.strip():
validation_errors.append("Technical analysis cannot be empty")
if not recommendations or not recommendations.strip():
validation_errors.append("Recommendations cannot be empty")
if validation_errors:
return {"success": False, "message": "Validation failed", "errors": validation_errors}
try:
from strix.telemetry.tracer import get_global_tracer
tracer = get_global_tracer()
if tracer:
tracer.set_final_scan_result(
content=content.strip(),
success=success,
tracer.update_scan_final_fields(
executive_summary=executive_summary.strip(),
methodology=methodology.strip(),
technical_analysis=technical_analysis.strip(),
recommendations=recommendations.strip(),
)
vulnerability_count = len(tracer.vulnerability_reports)
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully"
if success
else "Scan completed with errors",
"vulnerabilities_found": len(tracer.vulnerability_reports),
"message": "Scan completed successfully",
"vulnerabilities_found": vulnerability_count,
}
import logging
logging.warning("Global tracer not available - final scan result not stored")
logging.warning("Current tracer not available - scan results not stored")
return { # noqa: TRY300
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer unavailable",
}
except ImportError:
except (ImportError, AttributeError) as e:
return {"success": False, "message": f"Failed to complete scan: {e!s}"}
else:
return {
"success": True,
"scan_completed": True,
"message": "Scan completed successfully (not persisted)"
if success
else "Scan completed with errors (not persisted)",
"warning": "Final result could not be persisted - tracer module unavailable",
"message": "Scan completed (not persisted)",
"warning": "Results could not be persisted - tracer unavailable",
}
@register_tool(sandbox_execution=False)
def finish_scan(
content: str,
success: bool = True,
agent_state: Any = None,
) -> dict[str, Any]:
try:
validation_error = _validate_root_agent(agent_state)
if validation_error:
return validation_error
validation_error = _validate_content(content)
if validation_error:
return validation_error
active_agents_error = _check_active_agents(agent_state)
if active_agents_error:
return active_agents_error
return _finalize_with_tracer(content, success)
except (ValueError, TypeError, KeyError) as e:
return {"success": False, "message": f"Failed to complete scan: {e!s}"}
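The new `finish_scan` collects every empty field into one error list rather than failing on the first. A compact sketch of that validation, assuming a loop over field names in place of the four explicit checks in the diff (`validate_report_fields` is a hypothetical helper):

```python
REQUIRED_FIELDS = ("executive_summary", "methodology", "technical_analysis", "recommendations")


def validate_report_fields(**fields):
    """Return an error dict listing every empty required field, or None if all are set."""
    errors = []
    for name in REQUIRED_FIELDS:
        value = fields.get(name)
        # Missing, empty, and whitespace-only values are all rejected.
        if not value or not value.strip():
            label = name.replace("_", " ").capitalize()
            errors.append(f"{label} cannot be empty")
    if errors:
        return {"success": False, "message": "Validation failed", "errors": errors}
    return None


result = validate_report_fields(
    executive_summary="Two high-risk issues were identified.",
    methodology="   ",
    technical_analysis="See individual vulnerability reports.",
    recommendations="",
)
print(result["errors"])
```

Returning all failures at once lets the calling agent fix every missing section in a single retry.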


@@ -1,6 +1,6 @@
<tools>
<tool name="finish_scan">
<description>Complete the main security scan and generate final report.
<description>Complete the security scan by providing the final assessment fields as full penetration test report.
IMPORTANT: This tool can ONLY be used by the root/main agent.
Subagents must use agent_finish from agents_graph tool instead.
@@ -8,11 +8,20 @@ Subagents must use agent_finish from agents_graph tool instead.
IMPORTANT: This tool will NOT allow finishing if any agents are still running or stopping.
You must wait for all agents to complete before using this tool.
This tool MUST be called at the very end of the security assessment to:
- Verify all agents have completed their tasks
- Generate the final comprehensive scan report
- Mark the entire scan as completed
- Stop the agent from running
This tool directly updates the scan report data:
- executive_summary
- methodology
- technical_analysis
- recommendations
All fields are REQUIRED and map directly to the final report.
This must be the last tool called in the scan. It will:
1. Verify you are the root agent
2. Check all subagents have completed
3. Update the scan with your provided fields
4. Mark the scan as completed
5. Stop agent execution
Use this tool when:
- You are the main/root agent conducting the security assessment
@@ -23,23 +32,39 @@ Use this tool when:
IMPORTANT: Calling this tool multiple times will OVERWRITE any previous scan report.
Make sure you include ALL findings and details in a single comprehensive report.
If agents are still running, this tool will:
If agents are still running, the tool will:
- Show you which agents are still active
- Suggest using wait_for_message to wait for completion
- Suggest messaging agents if immediate completion is needed
Put ALL details in the content - methodology, tools used, vulnerability counts, key findings, recommendations,
compliance notes, risk assessments, next steps, etc. Be comprehensive and include everything relevant.</description>
NOTE: Make sure the vulnerabilities found were reported with create_vulnerability_report tool, otherwise they will not be tracked and you will not be rewarded.
But make sure to not report the same vulnerability multiple times.
Professional, customer-facing penetration test report rules (PDF-ready):
- Do NOT include internal or system details: never mention local/absolute paths (e.g., "/workspace"), internal tools, agents, orchestrators, sandboxes, models, system prompts/instructions, connection/tooling issues, or tester environment details.
- Tone and style: formal, objective, third-person, concise. No internal checklists or engineering runbooks. Content must read as a polished client deliverable.
- Structure across fields should align to standard pentest reports:
- Executive summary: business impact, risk posture, notable criticals, remediation theme.
- Methodology: industry-standard methods (e.g., OWASP, OSSTMM, NIST), scope, constraints—no internal execution notes.
- Technical analysis: consolidated findings overview referencing created vulnerability reports; avoid raw logs.
- Recommendations: prioritized, actionable, aligned to risk and best practices.
</description>
<parameters>
<parameter name="content" type="string" required="true">
<description>Complete scan report including executive summary, methodology, findings, vulnerability details, recommendations, compliance notes, risk assessment, and conclusions. Include everything relevant to the assessment.</description>
<parameter name="executive_summary" type="string" required="true">
<description>High-level summary for executives: key findings, overall security posture, critical risks, business impact</description>
</parameter>
<parameter name="success" type="boolean" required="false">
<description>Whether the scan completed successfully without critical errors</description>
<parameter name="methodology" type="string" required="true">
<description>Testing methodology: approach, tools used, scope, techniques employed</description>
</parameter>
<parameter name="technical_analysis" type="string" required="true">
<description>Detailed technical findings and security assessment results over the scan</description>
</parameter>
<parameter name="recommendations" type="string" required="true">
<description>Actionable security recommendations and remediation priorities</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing success status and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
<description>Response containing success status, vulnerability count, and completion message. If agents are still running, returns details about active agents and suggested actions.</description>
</returns>
</tool>
</tools>
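The schema above replaces the single `content` parameter with four required string fields. A minimal sketch of a check for those fields follows. The field names come from the schema; `validate_report_fields` and its return shape are hypothetical and not taken from the strix codebase:

```python
from __future__ import annotations

from typing import Any

# Field names as listed in the finish_scan schema.
REQUIRED_FIELDS = (
    "executive_summary",
    "methodology",
    "technical_analysis",
    "recommendations",
)


def validate_report_fields(fields: dict[str, Any]) -> dict[str, Any] | None:
    """Return an error dict if any required field is missing or blank, else None."""
    missing = [
        name
        for name in REQUIRED_FIELDS
        if not isinstance(fields.get(name), str) or not fields[name].strip()
    ]
    if missing:
        return {
            "success": False,
            "message": f"Missing required fields: {', '.join(missing)}",
        }
    return None
```

A payload that omits `methodology` and leaves `recommendations` blank would be rejected with both names listed in schema order.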


@@ -11,7 +11,6 @@ _notes_storage: dict[str, dict[str, Any]] = {}
def _filter_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search_query: str | None = None,
) -> list[dict[str, Any]]:
filtered_notes = []
@@ -20,9 +19,6 @@ def _filter_notes(
if category and note.get("category") != category:
continue
if priority and note.get("priority") != priority:
continue
if tags:
note_tags = note.get("tags", [])
if not any(tag in note_tags for tag in tags):
@@ -43,13 +39,12 @@ def _filter_notes(
return filtered_notes
@register_tool
@register_tool(sandbox_execution=False)
def create_note(
title: str,
content: str,
category: str = "general",
tags: list[str] | None = None,
priority: str = "normal",
) -> dict[str, Any]:
try:
if not title or not title.strip():
@@ -58,7 +53,7 @@ def create_note(
if not content or not content.strip():
return {"success": False, "error": "Content cannot be empty", "note_id": None}
valid_categories = ["general", "findings", "methodology", "todo", "questions", "plan"]
valid_categories = ["general", "findings", "methodology", "questions", "plan"]
if category not in valid_categories:
return {
"success": False,
@@ -66,14 +61,6 @@ def create_note(
"note_id": None,
}
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
"note_id": None,
}
note_id = str(uuid.uuid4())[:5]
timestamp = datetime.now(UTC).isoformat()
@@ -82,7 +69,6 @@ def create_note(
"content": content.strip(),
"category": category,
"tags": tags or [],
"priority": priority,
"created_at": timestamp,
"updated_at": timestamp,
}
@@ -99,17 +85,14 @@ def create_note(
}
@register_tool
@register_tool(sandbox_execution=False)
def list_notes(
category: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
search: str | None = None,
) -> dict[str, Any]:
try:
filtered_notes = _filter_notes(
category=category, tags=tags, priority=priority, search_query=search
)
filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)
return {
"success": True,
@@ -126,13 +109,12 @@ def list_notes(
}
@register_tool
@register_tool(sandbox_execution=False)
def update_note(
note_id: str,
title: str | None = None,
content: str | None = None,
tags: list[str] | None = None,
priority: str | None = None,
) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
@@ -153,15 +135,6 @@ def update_note(
if tags is not None:
note["tags"] = tags
if priority is not None:
valid_priorities = ["low", "normal", "high", "urgent"]
if priority not in valid_priorities:
return {
"success": False,
"error": f"Invalid priority. Must be one of: {', '.join(valid_priorities)}",
}
note["priority"] = priority
note["updated_at"] = datetime.now(UTC).isoformat()
return {
@@ -173,7 +146,7 @@ def update_note(
return {"success": False, "error": f"Failed to update note: {e}"}
@register_tool
@register_tool(sandbox_execution=False)
def delete_note(note_id: str) -> dict[str, Any]:
try:
if note_id not in _notes_storage:
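With `priority` removed, notes are filtered by category, tags, and search text only. The sketch below shows that reduced filter as a self-contained function. It takes the storage dict as an argument, and the case-insensitive substring search over title and content is an assumption, since the diff does not show that part of `_filter_notes`:

```python
from __future__ import annotations

from typing import Any


def filter_notes(
    notes: dict[str, dict[str, Any]],
    category: str | None = None,
    tags: list[str] | None = None,
    search_query: str | None = None,
) -> list[dict[str, Any]]:
    """Return notes matching every filter that was supplied."""
    matched = []
    for note_id, note in notes.items():
        if category and note.get("category") != category:
            continue
        # Tag filter matches when the note carries ANY of the requested tags.
        if tags and not any(tag in note.get("tags", []) for tag in tags):
            continue
        if search_query:
            # Assumed behavior: case-insensitive substring match on title + content.
            haystack = f"{note.get('title', '')} {note.get('content', '')}".lower()
            if search_query.lower() not in haystack:
                continue
        matched.append({"note_id": note_id, **note})
    return matched
```

Filters combine with AND semantics: a note must pass each supplied filter, and omitting every filter returns all notes.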


@@ -1,10 +1,9 @@
<tools>
<tool name="create_note">
<description>Create a personal note for TODOs, side notes, plans, and organizational purposes during
the scan.</description>
<details>Use this tool for quick reminders, action items, planning thoughts, and organizational notes
rather than formal vulnerability reports or detailed findings. This is your personal notepad
for keeping track of tasks, ideas, and things to remember or follow up on.</details>
<description>Create a personal note for observations, findings, and research during the scan.</description>
<details>Use this tool for documenting discoveries, observations, methodology notes, and questions.
This is your personal notepad for recording information you want to remember or reference later.
For tracking actionable tasks, use the todo tool instead.</details>
<parameters>
<parameter name="title" type="string" required="true">
<description>Title of the note</description>
@@ -13,49 +12,41 @@
<description>Content of the note</description>
</parameter>
<parameter name="category" type="string" required="false">
<description>Category to organize the note (default: "general", "findings", "methodology", "todo", "questions", "plan")</description>
<description>Category to organize the note (default: "general", "findings", "methodology", "questions", "plan")</description>
</parameter>
<parameter name="tags" type="string" required="false">
<description>Tags for categorization</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Priority level of the note ("low", "normal", "high", "urgent")</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - note_id: ID of the created note - success: Whether the note was created successfully</description>
</returns>
<examples>
# Create a TODO reminder
<function=create_note>
<parameter=title>TODO: Check SSL Certificate Details</parameter>
<parameter=content>Remember to verify SSL certificate validity and check for weak ciphers
on the HTTPS service discovered on port 443. Also check for certificate
transparency logs.</parameter>
<parameter=category>todo</parameter>
<parameter=tags>["ssl", "certificate", "followup"]</parameter>
<parameter=priority>normal</parameter>
</function>
# Planning note
<function=create_note>
<parameter=title>Scan Strategy Planning</parameter>
<parameter=content>Plan for next phase: 1) Complete subdomain enumeration 2) Test discovered
web apps for OWASP Top 10 3) Check database services for default creds
4) Review any custom applications for business logic flaws</parameter>
<parameter=category>plan</parameter>
<parameter=tags>["planning", "strategy", "next_steps"]</parameter>
</function>
# Side note for later investigation
# Document an interesting finding
<function=create_note>
<parameter=title>Interesting Directory Found</parameter>
<parameter=content>Found /backup/ directory that might contain sensitive files. Low priority
for now but worth checking if time permits. Directory listing seems
disabled.</parameter>
<parameter=content>Found /backup/ directory that might contain sensitive files. Directory listing
seems disabled but worth investigating further.</parameter>
<parameter=category>findings</parameter>
<parameter=tags>["directory", "backup", "low_priority"]</parameter>
<parameter=priority>low</parameter>
<parameter=tags>["directory", "backup"]</parameter>
</function>
# Methodology note
<function=create_note>
<parameter=title>Authentication Flow Analysis</parameter>
<parameter=content>The application uses JWT tokens stored in localStorage. Token expiration is
set to 24 hours. Observed that refresh token rotation is not implemented.</parameter>
<parameter=category>methodology</parameter>
<parameter=tags>["auth", "jwt", "session"]</parameter>
</function>
# Research question
<function=create_note>
<parameter=title>Custom Header Investigation</parameter>
<parameter=content>The API returns a custom X-Request-ID header. Need to research if this
could be used for user tracking or has any security implications.</parameter>
<parameter=category>questions</parameter>
<parameter=tags>["headers", "research"]</parameter>
</function>
</examples>
</tool>
@@ -84,9 +75,6 @@
<parameter name="tags" type="string" required="false">
<description>Filter by tags (returns notes with any of these tags)</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>Filter by priority level</description>
</parameter>
<parameter name="search" type="string" required="false">
<description>Search query to find in note titles and content</description>
</parameter>
@@ -100,11 +88,6 @@
<parameter=category>findings</parameter>
</function>
# List high priority items
<function=list_notes>
<parameter=priority>high</parameter>
</function>
# Search for SQL injection related notes
<function=list_notes>
<parameter=search>SQL injection</parameter>
@@ -132,9 +115,6 @@
<parameter name="tags" type="string" required="false">
<description>New tags for the note</description>
</parameter>
<parameter name="priority" type="string" required="false">
<description>New priority level</description>
</parameter>
</parameters>
<returns type="Dict[str, Any]">
<description>Response containing: - success: Whether the note was updated successfully</description>
@@ -143,7 +123,6 @@
<function=update_note>
<parameter=note_id>note_123</parameter>
<parameter=content>Updated content with new findings...</parameter>
<parameter=priority>urgent</parameter>
</function>
</examples>
</tool>


@@ -2,8 +2,6 @@ from typing import Any, Literal
from strix.tools.registry import register_tool
from .proxy_manager import get_proxy_manager
RequestPart = Literal["request", "response"]
@@ -27,6 +25,8 @@ def list_requests(
sort_order: Literal["asc", "desc"] = "desc",
scope_id: str | None = None,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
manager = get_proxy_manager()
return manager.list_requests(
httpql_filter, start_page, end_page, page_size, sort_by, sort_order, scope_id
@@ -41,6 +41,8 @@ def view_request(
page: int = 1,
page_size: int = 50,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
manager = get_proxy_manager()
return manager.view_request(request_id, part, search_pattern, page, page_size)
@@ -53,6 +55,8 @@ def send_request(
body: str = "",
timeout: int = 30,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
if headers is None:
headers = {}
manager = get_proxy_manager()
@@ -64,6 +68,8 @@ def repeat_request(
request_id: str,
modifications: dict[str, Any] | None = None,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
if modifications is None:
modifications = {}
manager = get_proxy_manager()
@@ -78,6 +84,8 @@ def scope_rules(
scope_id: str | None = None,
scope_name: str | None = None,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
manager = get_proxy_manager()
return manager.scope_rules(action, allowlist, denylist, scope_id, scope_name)
@@ -89,6 +97,8 @@ def list_sitemap(
depth: Literal["DIRECT", "ALL"] = "DIRECT",
page: int = 1,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
manager = get_proxy_manager()
return manager.list_sitemap(scope_id, parent_id, depth, page)
@@ -97,5 +107,7 @@ def list_sitemap(
def view_sitemap_entry(
entry_id: str,
) -> dict[str, Any]:
from .proxy_manager import get_proxy_manager
manager = get_proxy_manager()
return manager.view_sitemap_entry(entry_id)
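The hunks above move the `get_proxy_manager` import from module level into each tool function, so importing the tool module no longer loads the manager module. The diff does not state the motivation. The sketch below shows only the general deferred-import pattern, using standard-library modules; `rows_to_csv` is a hypothetical function:

```python
from typing import Any


def rows_to_csv(rows: list[dict[str, Any]]) -> str:
    """Hypothetical tool function that needs csv/io only when it actually runs."""
    # Deferred imports: resolved on first call, then served from sys.modules.
    import csv
    import io

    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The trade-off is that a failing import surfaces when the function is called rather than when the module is loaded.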


@@ -2,8 +2,6 @@ from typing import Any, Literal
from strix.tools.registry import register_tool
from .python_manager import get_python_session_manager
PythonAction = Literal["new_session", "execute", "close", "list_sessions"]
@@ -15,6 +13,8 @@ def python_action(
timeout: int = 30,
session_id: str | None = None,
) -> dict[str, Any]:
from .python_manager import get_python_session_manager
def _validate_code(action_name: str, code: str | None) -> None:
if not code:
raise ValueError(f"code parameter is required for {action_name} action")


@@ -23,17 +23,17 @@ class ImplementedInClientSideOnlyError(Exception):
def _process_dynamic_content(content: str) -> str:
if "{{DYNAMIC_MODULES_DESCRIPTION}}" in content:
if "{{DYNAMIC_SKILLS_DESCRIPTION}}" in content:
try:
from strix.prompts import generate_modules_description
from strix.skills import generate_skills_description
modules_description = generate_modules_description()
content = content.replace("{{DYNAMIC_MODULES_DESCRIPTION}}", modules_description)
skills_description = generate_skills_description()
content = content.replace("{{DYNAMIC_SKILLS_DESCRIPTION}}", skills_description)
except ImportError:
logger.warning("Could not import prompts utilities for dynamic schema generation")
logger.warning("Could not import skills utilities for dynamic schema generation")
content = content.replace(
"{{DYNAMIC_MODULES_DESCRIPTION}}",
"List of prompt modules to load for this agent (max 5). Module discovery failed.",
"{{DYNAMIC_SKILLS_DESCRIPTION}}",
"List of skills to load for this agent (max 5). Skill discovery failed.",
)
return content
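After the rename, the `{{DYNAMIC_SKILLS_DESCRIPTION}}` placeholder is filled from `strix.skills`, and a fixed fallback string is used if that import fails. The sketch below reproduces the substitute-or-fall-back logic in a self-contained form. The `provider_module` parameter and the use of `importlib` are assumptions added so the example runs without strix installed:

```python
import importlib
import logging

logger = logging.getLogger(__name__)

PLACEHOLDER = "{{DYNAMIC_SKILLS_DESCRIPTION}}"
FALLBACK = "List of skills to load for this agent (max 5). Skill discovery failed."


def process_dynamic_content(content: str, provider_module: str = "strix.skills") -> str:
    """Replace the placeholder with a generated description, or a fallback on ImportError."""
    if PLACEHOLDER not in content:
        return content
    try:
        # Assumption: the provider module exposes generate_skills_description().
        module = importlib.import_module(provider_module)
        description = module.generate_skills_description()
    except ImportError:
        logger.warning("Could not import skills utilities for dynamic schema generation")
        description = FALLBACK
    return content.replace(PLACEHOLDER, description)
```

Content without the placeholder is returned untouched, so the import is attempted only when a substitution is needed.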

Some files were not shown because too many files have changed in this diff.