- Use singleton browser with isolated BrowserContext per agent instead of
separate Chromium processes per agent
- Add cleanup logic for stale browser/playwright on reconnect
- Add resource management instructions to browser schema (close tabs/browser when done)
- Suppress Kali login message in Dockerfile
- Replace multiprocessing/threading with single asyncio task per agent
- Add task cancellation: new request cancels previous for same agent
- Add per-agent state isolation via ContextVar for Terminal, Browser, Python managers
- Add posthog telemetry for tool execution errors (timeout, http, sandbox)
- Fix proxy manager singleton pattern
- Increase client timeout buffer over server timeout
- Add context.py to Dockerfile
- Move tool server startup from Python to entrypoint script
- Hardcode Caido port (48080) in entrypoint, remove from Python
- Use /app/venv/bin/python directly instead of poetry run
- Fix env var passing through sudo with sudo -E and explicit vars
- Add Caido process monitoring and logging during startup
- Add retry logic with exponential backoff for token fetch
- Add tool server process validation before declaring ready
- Simplify docker_runtime.py (489 -> 310 lines)
- DRY up container state recovery into _recover_container_state()
- Add container creation retry logic (3 attempts)
- Fix GraphQL health check URL (/graphql/ with trailing slash)
Add cancelled flag to prevent timed-out thread's finally block from
overwriting stdout/stderr when a subsequent execution has already
started capturing output.
- Add ThreadPoolExecutor in agent_worker for parallel request execution
- Add request_id correlation to prevent response mismatch between concurrent requests
- Add background listener thread per agent to dispatch responses to correct futures
- Add --timeout argument for hard request timeout (default: 120s from config)
- Remove signal handlers from terminal_manager, python_manager, tab_manager (use atexit only)
- Replace SIGALRM timeout in python_instance with threading-based timeout
This fixes requests getting queued behind slow operations and timeouts.
The str() of httpx.RequestError was often empty, making error messages
unhelpful. Now includes the exception type (e.g., ConnectError) for
better debugging.
- Add professional, realistic multiline examples to all tool schemas
- finish_scan: Complete pentest report with SSRF/access control findings
- create_vulnerability_report: Full SSRF writeup with cloud metadata PoC
- file_edit, notes, thinking: Realistic security testing examples
- Remove XML terminology from system prompt and tool descriptions
- All examples use real newlines (not literal \n) to demonstrate correct usage
- Add Config class with all env var defaults in one place
- Auto-load saved config on startup (env vars take precedence)
- Auto-save config after successful LLM warm-up
- Replace scattered os.getenv() calls with Config.get()
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add dedupe.py with XML-based LLM deduplication using direct litellm calls
- Integrate deduplication check in create_vulnerability_report tool
- Add get_existing_vulnerabilities() method to tracer for fetching reports
- Update schema and system prompt with deduplication guidelines
- Add PyInstaller spec file and build script for creating standalone executables
- Add install.sh for curl | sh installation from GitHub releases
- Add GitHub Actions workflow for multi-platform builds (macOS, Linux, Windows)
- Move sandbox-only deps (playwright, ipython, libtmux, etc.) to optional extras
- Make google-cloud-aiplatform optional ([vertex] extra) to reduce binary size
- Use lazy imports in tool actions to avoid loading sandbox deps at startup
- Add -v/--version flag to CLI
- Add website and Discord links to completion message
- Binary size: ~97MB (down from ~120MB with all deps)
- update_todo: add `updates` param for bulk updates in one call
- mark_todo_done: add `todo_ids` param to mark multiple todos done
- mark_todo_pending: add `todo_ids` param to mark multiple pending
- delete_todo: add `todo_ids` param to delete multiple todos
- Increase todo renderer display limit from 10 to 25
- Maintains backward compatibility with single-ID usage
- Update prompts to keep todos short-horizon and dynamic
Introduces scan mode selection to control testing depth and methodology:
- quick: optimized for CI/CD, focuses on recent changes and high-impact vulns
- standard: balanced coverage with systematic methodology
- deep: exhaustive testing with hierarchical agent swarm (now default)
Each mode has dedicated prompt modules with detailed pentesting guidelines
covering reconnaissance, mapping, business logic analysis, exploitation,
and vulnerability chaining strategies.
Closes#152
- Add new todo tool with create, list, update, mark_done, mark_pending, delete actions
- Each subagent has isolated todo storage keyed by agent_id
- Support bulk todo creation via JSON array or bullet list
- Add TUI renderers for all todo actions with status markers
- Update notes tool to remove priority and todo-related functionality
- Add task tracking guidance to StrixAgent system prompt
- Fix instruction file error handling in CLI
Replace timeout=None with configurable timeouts (120s execution, 10s connect)
to prevent hung sandbox connections from blocking indefinitely.
Configurable via STRIX_SANDBOX_EXECUTION_TIMEOUT and STRIX_SANDBOX_CONNECT_TIMEOUT
environment variables.