STR-39: expand source-aware whitebox workflows and wiki memory

This commit is contained in:
bearsyankees
2026-03-19 19:33:16 -06:00
parent afb85c21b1
commit f65a97f6b2
24 changed files with 768 additions and 104 deletions

View File

@@ -156,6 +156,9 @@ strix --target https://your-app.com --instruction "Perform authenticated testing
 # Multi-target testing (source code + deployed app)
 strix -t https://github.com/org/app -t https://your-app.com

+# White-box source-aware scan (local repository)
+strix --target ./app-directory --scan-mode standard

 # Focused testing with custom instructions
 strix --target api.your-app.com --instruction "Focus on business logic flaws and IDOR vulnerabilities"

View File

@@ -97,7 +97,9 @@ RUN mkdir -p /home/pentester/.npm-global
 RUN npm install -g retire@latest && \
     npm install -g eslint@latest && \
-    npm install -g js-beautify@latest
+    npm install -g js-beautify@latest && \
+    npm install -g @ast-grep/cli@latest && \
+    npm install -g tree-sitter-cli@latest
 WORKDIR /home/pentester/tools

 RUN git clone https://github.com/aravind0x7/JS-Snooper.git && \
@@ -110,6 +112,7 @@ RUN git clone https://github.com/aravind0x7/JS-Snooper.git && \
 USER root
 RUN curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin
+RUN curl -sSfL https://raw.githubusercontent.com/gitleaks/gitleaks/master/install.sh | sh -s -- -b /usr/local/bin
 RUN apt-get update && apt-get install -y zaproxy

View File

@@ -51,7 +51,7 @@ Configure Strix using environment variables or a config file.
 ## Docker Configuration

-<ParamField path="STRIX_IMAGE" default="ghcr.io/usestrix/strix-sandbox:0.1.11" type="string">
+<ParamField path="STRIX_IMAGE" default="ghcr.io/usestrix/strix-sandbox:0.1.12" type="string">
   Docker image to use for the sandbox container.
 </ParamField>

View File

@@ -45,13 +45,21 @@ Strix runs inside a Kali Linux-based Docker container with a comprehensive set o
 | [js-beautify](https://github.com/beautifier/js-beautify) | JavaScript deobfuscation |
 | [JSHint](https://jshint.com) | JavaScript code quality tool |

+## Source-Aware Analysis
+
+| Tool | Description |
+| ------------------------------------------------------- | --------------------------------------------- |
+| [Semgrep](https://github.com/semgrep/semgrep) | Fast SAST and custom rule matching |
+| [ast-grep](https://ast-grep.github.io) | Structural AST/CST-aware code search (`sg`) |
+| [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) | Syntax tree parsing and symbol extraction |
+| [Bandit](https://bandit.readthedocs.io) | Python security linter |

 ## Secret Detection

 | Tool | Description |
 | ----------------------------------------------------------- | ------------------------------------- |
 | [TruffleHog](https://github.com/trufflesecurity/trufflehog) | Find secrets in code and history |
-| [Semgrep](https://github.com/semgrep/semgrep) | Static analysis for security patterns |
-| [Bandit](https://bandit.readthedocs.io) | Python security linter |
+| [Gitleaks](https://github.com/gitleaks/gitleaks) | Detect hardcoded secrets in repositories |

 ## Authentication Testing
@@ -64,7 +72,7 @@ Strix runs inside a Kali Linux-based Docker container with a comprehensive set o
 | Tool | Description |
 | -------------------------- | ---------------------------------------------- |
-| [Trivy](https://trivy.dev) | Container and dependency vulnerability scanner |
+| [Trivy](https://trivy.dev) | Filesystem/container scanning for vulns, misconfigurations, secrets, and licenses |

 ## HTTP Proxy

View File

@@ -32,14 +32,18 @@ sqlmap -u "https://example.com/page?id=1"
 ### Code Analysis

 ```bash
-# Search for secrets
-trufflehog filesystem ./
-
-# Static analysis
+# Fast SAST triage
 semgrep --config auto ./src

-# Grep for patterns
-grep -r "password" ./
+# Structural AST search
+sg scan ./src
+
+# Secret detection
+gitleaks detect --source ./
+trufflehog filesystem ./
+
+# Supply-chain and misconfiguration checks
+trivy fs ./
 ```

 ### Custom Scripts

View File

@@ -31,6 +31,8 @@ Balanced testing for routine security reviews. Best for:
 **Duration**: 30 minutes to 1 hour

+**White-box behavior**: Uses source-aware mapping and static triage to prioritize dynamic exploit validation paths.

 ## Deep

 ```bash
@@ -44,6 +46,8 @@ Thorough penetration testing. Best for:
 **Duration**: 1-4 hours depending on target complexity

+**White-box behavior**: Runs broad source-aware triage (`semgrep`, AST structural search, secrets, supply-chain checks) and then systematically validates top candidates dynamically.

 <Note>
 Deep mode is the default. It explores edge cases, chained vulnerabilities, and complex attack paths.
 </Note>

View File

@@ -77,11 +77,12 @@ BLACK-BOX TESTING (domain/subdomain only):
 WHITE-BOX TESTING (code provided):
 - MUST perform BOTH static AND dynamic analysis
-- Static: Review code for vulnerabilities
-- Dynamic: Run the application and test live
-- NEVER rely solely on static code analysis - always test dynamically
-- You MUST begin at the very first step by running the code and testing live.
-- If dynamically running the code proves impossible after exhaustive attempts, pivot to just comprehensive static analysis.
+- Static: Use source-aware triage first to map risk quickly (`semgrep`, `ast-grep`, Tree-sitter tooling, `gitleaks`, `trufflehog`, `trivy fs`). Then review code for vulnerabilities
+- Shared memory: Use notes as shared working memory; check existing `wiki` notes first (`list_notes`), then update one repo wiki note instead of creating duplicates
+- Dynamic: Run the application and test live to validate exploitability
+- NEVER rely solely on static code analysis when dynamic validation is possible
+- Begin with fast source triage and dynamic run preparation in parallel; use static findings to prioritize live testing.
+- If dynamically running the code proves impossible after exhaustive attempts, pivot to comprehensive static analysis.
 - Try to infer how to run the code based on its structure and content.
 - FIX discovered vulnerabilities in code in same file.
 - Test patches to confirm vulnerability removal.
@@ -369,8 +370,12 @@ JAVASCRIPT ANALYSIS:
 CODE ANALYSIS:
 - semgrep - Static analysis/SAST
+- ast-grep (sg) - Structural AST/CST-aware code search
+- tree-sitter - Syntax-aware parsing and symbol extraction support
 - bandit - Python security linter
 - trufflehog - Secret detection in code
+- gitleaks - Secret detection in repository content/history
+- trivy fs - Filesystem vulnerability/misconfiguration/license/secret scanning

 SPECIALIZED TOOLS:
 - jwt_tool - JWT token manipulation

View File

@@ -37,7 +37,7 @@ class Config:
     strix_disable_browser = "false"

     # Runtime Configuration
-    strix_image = "ghcr.io/usestrix/strix-sandbox:0.1.11"
+    strix_image = "ghcr.io/usestrix/strix-sandbox:0.1.12"
     strix_runtime_backend = "docker"
     strix_sandbox_execution_timeout = "120"
     strix_sandbox_connect_timeout = "10"

View File

@@ -75,7 +75,10 @@ async def run_cli(args: Any) -> None: # noqa: PLR0915
"diff_scope": getattr(args, "diff_scope", {"active": False}), "diff_scope": getattr(args, "diff_scope", {"active": False}),
} }
llm_config = LLMConfig(scan_mode=scan_mode) llm_config = LLMConfig(
scan_mode=scan_mode,
is_whitebox=bool(getattr(args, "local_sources", [])),
)
agent_config = { agent_config = {
"llm_config": llm_config, "llm_config": llm_config,
"max_iterations": 300, "max_iterations": 300,

View File

@@ -748,7 +748,9 @@ class StrixTUIApp(App): # type: ignore[misc]
     def _build_agent_config(self, args: argparse.Namespace) -> dict[str, Any]:
         scan_mode = getattr(args, "scan_mode", "deep")
-        llm_config = LLMConfig(scan_mode=scan_mode)
+        llm_config = LLMConfig(
+            scan_mode=scan_mode, is_whitebox=bool(getattr(args, "local_sources", []))
+        )
         config = {
             "llm_config": llm_config,

View File

@@ -9,6 +9,7 @@ class LLMConfig:
         skills: list[str] | None = None,
         timeout: int | None = None,
         scan_mode: str = "deep",
+        is_whitebox: bool = False,
     ):
         self.model_name = model_name or Config.get("strix_llm")
@@ -21,3 +22,4 @@ class LLMConfig:
         self.timeout = timeout or int(Config.get("llm_timeout") or "300")
         self.scan_mode = scan_mode if scan_mode in ["quick", "standard", "deep"] else "deep"
+        self.is_whitebox = is_whitebox

View File

@@ -90,6 +90,8 @@ class LLM:
             *list(self.config.skills or []),
             f"scan_modes/{self.config.scan_mode}",
         ]
+        if self.config.is_whitebox:
+            skills_to_load.append("coordination/source_aware_whitebox")

         skill_content = load_skills(skills_to_load)
         env.globals["get_skill"] = lambda name: skill_content.get(name, "")

View File

@@ -37,6 +37,10 @@ The skills are dynamically injected into the agent's system prompt, allowing it
 | **`/reconnaissance`** | Advanced information gathering and enumeration techniques for comprehensive attack surface mapping |
 | **`/custom`** | Community-contributed skills for specialized or industry-specific testing scenarios |

+Notable source-aware skills:
+
+- `source_aware_whitebox` (coordination): white-box orchestration playbook
+- `source_aware_sast` (custom): semgrep/AST/secrets/supply-chain static triage workflow

 ---

 ## 🎨 Creating New Skills

View File

@@ -0,0 +1,58 @@
---
name: source-aware-whitebox
description: Coordination playbook for source-aware white-box testing with static triage and dynamic validation
---
# Source-Aware White-Box Coordination
Use this coordination playbook when repository source code is available.
## Objective
Increase white-box coverage by combining source-aware triage with dynamic validation. Source-aware tooling is recommended by default, but not mandatory when context suggests a better path.
## Recommended Workflow
1. Build a quick source map before deep exploitation.
2. Run first-pass static triage to rank high-risk paths.
3. Use triage outputs to prioritize dynamic PoC validation.
4. Keep findings evidence-driven: no report without validation.
5. Keep shared wiki memory current so all agents can reuse context.
## Source-Aware Triage Stack
- `semgrep`: fast security-first triage and custom pattern scans
- `ast-grep` (`sg`): structural pattern hunting and targeted repo mapping
- `tree-sitter`: syntax-aware parsing support for symbol and route extraction
- `gitleaks` + `trufflehog`: complementary secret detection (working tree and history coverage)
- `trivy fs`: dependency, misconfiguration, license, and secret checks
## Agent Delegation Guidance
- Keep child agents specialized by vulnerability/component as usual.
- For source-heavy subtasks, prefer creating child agents with `source_aware_sast` skill.
- Use source findings to shape payloads and endpoint selection for dynamic testing.
## Wiki Note Requirement (Source Map)
When source is present, maintain one wiki note per repository and keep it current.
Operational rules:
- At task start, call `list_notes` with `category=wiki` and reuse existing repo wiki if present.
- If no repo wiki exists, create one with `create_note` and `category=wiki`.
- Update the same wiki via `update_note`; avoid creating duplicate wiki notes for the same repo.
- Child agents should read wiki notes first, then extend with new evidence from their scope.
Recommended sections:
- Architecture overview
- Entrypoints and routing
- AuthN/AuthZ model
- High-risk sinks and trust boundaries
- Static scanner summary
- Dynamic validation follow-ups
## Validation Guardrails
- Static findings are hypotheses until validated.
- Dynamic exploitation evidence is still required before vulnerability reporting.
- Keep scanner output concise, deduplicated, and mapped to concrete code locations.

View File

@@ -0,0 +1,100 @@
---
name: source-aware-sast
description: Practical source-aware SAST and AST playbook for semgrep, ast-grep, gitleaks, and trivy fs
---
# Source-Aware SAST Playbook
Use this skill for source-heavy analysis where static and structural signals should guide dynamic testing.
## Fast Start
Run tools from repo root and store outputs in a dedicated artifact directory:
```bash
mkdir -p /workspace/.strix-source-aware
```
Before scanning, check shared wiki memory:
```text
1) list_notes(category="wiki")
2) Reuse matching repo wiki note if present
3) create_note(category="wiki") only if missing
```
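
The reuse-or-create flow above can be sketched in plain Python. This is an illustrative helper, not part of the toolset: the `notes`/`note_id` response fields follow this repo's note tools, and the title-matching heuristic is an assumption.

```python
# Illustrative sketch: reuse the single repo wiki note when present,
# create it only when missing. The injected `list_notes`/`create_note`
# callables stand in for the real note tools; field names are assumed.
def get_or_create_repo_wiki(repo_name, list_notes, create_note):
    existing = list_notes(category="wiki")
    for note in existing.get("notes", []):
        if repo_name.lower() in str(note.get("title", "")).lower():
            return note["note_id"]  # reuse instead of duplicating
    created = create_note(
        title=f"{repo_name} source map",
        content="## Architecture\n\n## Entrypoints\n",
        category="wiki",
    )
    return created["note_id"]
```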
## Semgrep First Pass
Use Semgrep as the default static triage pass:
```bash
semgrep --config auto --json --output /workspace/.strix-source-aware/semgrep.json .
```
If diff scope is active, restrict to changed files first, then expand only when needed.
## AST-Grep Structural Mapping
Use `sg` for structure-aware code hunting:
```bash
sg scan --json . > /workspace/.strix-source-aware/ast-grep.json
```
Target high-value patterns such as:
- missing auth checks near route handlers
- dynamic command/query construction
- unsafe deserialization or template execution paths
- file and path operations influenced by user input
## Tree-Sitter Assisted Repo Mapping
Use tree-sitter CLI for syntax-aware parsing when grep-level mapping is noisy:
```bash
tree-sitter parse -q <file>
```
Use outputs to improve route/symbol/sink maps for subsequent targeted scans.
## Secret and Supply Chain Coverage
Detect hardcoded credentials:
```bash
gitleaks detect --source . --report-format json --report-path /workspace/.strix-source-aware/gitleaks.json
trufflehog filesystem --json . > /workspace/.strix-source-aware/trufflehog.json
```
Run repository-wide dependency and config checks:
```bash
trivy fs --format json --output /workspace/.strix-source-aware/trivy-fs.json .
```
## Converting Static Signals Into Exploits
1. Rank candidates by impact and exploitability.
2. Trace source-to-sink flow for top candidates.
3. Build dynamic PoCs that reproduce the suspected issue.
4. Report only after dynamic validation succeeds.
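
Step 1 can be mechanized over the semgrep artifact from the first pass. A minimal sketch, assuming semgrep's standard JSON layout (`results[].extra.severity`); the rank weights are an illustrative choice:

```python
import json

# Order semgrep findings for triage: ERROR first, then WARNING, then INFO;
# unknown severities sink to the bottom.
SEVERITY_RANK = {"ERROR": 0, "WARNING": 1, "INFO": 2}

def rank_semgrep_findings(report_text: str) -> list[dict]:
    results = json.loads(report_text).get("results", [])
    return sorted(
        results,
        key=lambda r: SEVERITY_RANK.get(r.get("extra", {}).get("severity"), 3),
    )
```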
## Wiki Update Template
Keep one wiki note per repository and update these sections:
```text
## Architecture
## Entrypoints
## AuthN/AuthZ
## High-Risk Sinks
## Static Findings Summary
## Dynamic Validation Follow-Ups
```
## Anti-Patterns
- Do not treat scanner output as final truth.
- Do not spend full cycles on low-signal pattern matches.
- Do not report source-only findings without validation evidence.
- Do not create multiple wiki notes for the same repository when one already exists.

View File

@@ -15,6 +15,9 @@ Thorough understanding before exploitation. Test every parameter, every endpoint
 **Whitebox (source available)**

 - Map every file, module, and code path in the repository
+- Load and maintain shared `wiki` notes from the start (`list_notes(category="wiki")`), then continuously update one repo note
+- Start with broad source-aware triage (`semgrep`, `ast-grep`, `gitleaks`, `trufflehog`, `trivy fs`) and use outputs to drive deep review
+- Use syntax-aware parsing (Tree-sitter tooling) to improve symbol, route, and sink extraction quality
 - Trace all entry points from HTTP handlers to database queries
 - Document all authentication mechanisms and implementations
 - Map authorization checks and access control model
@@ -25,7 +28,7 @@ Thorough understanding before exploitation. Test every parameter, every endpoint
 - Identify all serialization/deserialization points
 - Review file handling: upload, download, processing
 - Understand the deployment model and infrastructure assumptions
-- Check all dependency versions against CVE databases
+- Check all dependency versions and repository risks against CVE/misconfiguration data

 **Blackbox (no source)**

 - Exhaustive subdomain enumeration with multiple sources and tools

View File

@@ -15,6 +15,9 @@ Optimize for fast feedback on critical security issues. Skip exhaustive enumerat
 **Whitebox (source available)**

 - Focus on recent changes: git diffs, new commits, modified files—these are most likely to contain fresh bugs
+- Read existing `wiki` notes first (`list_notes(category="wiki")`) to avoid remapping from scratch
+- Run a fast static triage on changed files first (`semgrep`, then targeted `sg` queries)
+- Run quick secret and dependency checks (`gitleaks`, `trufflehog`, `trivy fs`) scoped to changed areas when possible
 - Identify security-sensitive patterns in changed code: auth checks, input handling, database queries, file operations
 - Trace user input through modified code paths
 - Check if security controls were modified or bypassed

View File

@@ -15,11 +15,14 @@ Systematic testing across the full attack surface. Understand the application be
 **Whitebox (source available)**

 - Map codebase structure: modules, entry points, routing
+- Start by loading existing `wiki` notes (`list_notes(category="wiki")`) and update one shared repo note as mapping evolves
+- Run `semgrep` first-pass triage to prioritize risky flows before deep manual review
+- Use `ast-grep`/Tree-sitter-assisted structural search for route, sink, and trust-boundary mapping
 - Identify architecture pattern (MVC, microservices, monolith)
 - Trace input vectors: forms, APIs, file uploads, headers, cookies
 - Review authentication and authorization flows
 - Analyze database interactions and ORM usage
-- Check dependencies for known CVEs
+- Check dependencies and repo risks with `trivy fs`, `gitleaks`, and `trufflehog`
 - Understand the data model and sensitive data locations

 **Blackbox (no source)**

View File

@@ -39,6 +39,13 @@ def _run_agent_in_thread(
         if inherited_messages
         else "started with a fresh context"
     )

+    wiki_memory_instruction = ""
+    if getattr(getattr(agent, "llm_config", None), "is_whitebox", False):
+        wiki_memory_instruction = (
+            '\n - White-box memory: call list_notes(category="wiki") early, '
+            "reuse existing repo wiki notes, and update the same note instead of "
+            "creating duplicates"
+        )
     task_xml = f"""<agent_delegation>
 <identity>
@@ -64,6 +71,7 @@ def _run_agent_in_thread(
 - All agents share /workspace directory and proxy history for better collaboration
 - You can see files created by other agents and proxy traffic from previous work
 - Build upon previous work but focus on your specific delegated task
+{wiki_memory_instruction}
 </instructions>
 </agent_delegation>"""
@@ -233,13 +241,21 @@ def create_agent(
     timeout = None
     scan_mode = "deep"
+    is_whitebox = False
     if parent_agent and hasattr(parent_agent, "llm_config"):
         if hasattr(parent_agent.llm_config, "timeout"):
             timeout = parent_agent.llm_config.timeout
         if hasattr(parent_agent.llm_config, "scan_mode"):
             scan_mode = parent_agent.llm_config.scan_mode
+        if hasattr(parent_agent.llm_config, "is_whitebox"):
+            is_whitebox = parent_agent.llm_config.is_whitebox
-    llm_config = LLMConfig(skills=skill_list, timeout=timeout, scan_mode=scan_mode)
+    llm_config = LLMConfig(
+        skills=skill_list,
+        timeout=timeout,
+        scan_mode=scan_mode,
+        is_whitebox=is_whitebox,
+    )

     agent_config = {
         "llm_config": llm_config,

View File

@@ -1,11 +1,175 @@
+import json
+import threading
 import uuid
 from datetime import UTC, datetime
+from pathlib import Path
 from typing import Any

 from strix.tools.registry import register_tool

 _notes_storage: dict[str, dict[str, Any]] = {}
+_VALID_NOTE_CATEGORIES = ["general", "findings", "methodology", "questions", "plan", "wiki"]
+_notes_lock = threading.RLock()
+_loaded_notes_run_dir: str | None = None
+def _get_run_dir() -> Path | None:
+    try:
+        from strix.telemetry.tracer import get_global_tracer
+
+        tracer = get_global_tracer()
+        if not tracer:
+            return None
+        return tracer.get_run_dir()
+    except (ImportError, OSError, RuntimeError):
+        return None
+
+
+def _get_notes_jsonl_path() -> Path | None:
+    run_dir = _get_run_dir()
+    if not run_dir:
+        return None
+    notes_dir = run_dir / "notes"
+    notes_dir.mkdir(parents=True, exist_ok=True)
+    return notes_dir / "notes.jsonl"
+
+
+def _append_note_event(op: str, note_id: str, note: dict[str, Any] | None = None) -> None:
+    notes_path = _get_notes_jsonl_path()
+    if not notes_path:
+        return
+    event: dict[str, Any] = {
+        "timestamp": datetime.now(UTC).isoformat(),
+        "op": op,
+        "note_id": note_id,
+    }
+    if note is not None:
+        event["note"] = note
+    with notes_path.open("a", encoding="utf-8") as f:
+        f.write(f"{json.dumps(event, ensure_ascii=True)}\n")
+def _load_notes_from_jsonl(notes_path: Path) -> dict[str, dict[str, Any]]:
+    hydrated: dict[str, dict[str, Any]] = {}
+    if not notes_path.exists():
+        return hydrated
+    with notes_path.open(encoding="utf-8") as f:
+        for raw_line in f:
+            line = raw_line.strip()
+            if not line:
+                continue
+            try:
+                event = json.loads(line)
+            except json.JSONDecodeError:
+                continue
+            op = str(event.get("op", "")).strip().lower()
+            note_id = str(event.get("note_id", "")).strip()
+            if not note_id or op not in {"create", "update", "delete"}:
+                continue
+            if op == "delete":
+                hydrated.pop(note_id, None)
+                continue
+            note = event.get("note")
+            if not isinstance(note, dict):
+                continue
+            existing = hydrated.get(note_id, {})
+            existing.update(note)
+            hydrated[note_id] = existing
+    return hydrated
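# Editor's aside, not part of this module: the replay fold above can be
# exercised standalone. A minimal sketch showing that later events win and
# deletes drop notes:
def _replay_demo(lines: list[str]) -> dict[str, dict]:
    import json as _json

    notes: dict[str, dict] = {}
    for line in lines:
        event = _json.loads(line)
        op, note_id = event["op"], event["note_id"]
        if op == "delete":
            notes.pop(note_id, None)
        elif op in {"create", "update"}:
            merged = notes.get(note_id, {})
            merged.update(event.get("note", {}))
            notes[note_id] = merged
    return notes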
+def _ensure_notes_loaded() -> None:
+    global _loaded_notes_run_dir  # noqa: PLW0603
+    run_dir = _get_run_dir()
+    run_dir_key = str(run_dir.resolve()) if run_dir else "__no_run_dir__"
+    if _loaded_notes_run_dir == run_dir_key:
+        return
+    _notes_storage.clear()
+    notes_path = _get_notes_jsonl_path()
+    if notes_path:
+        _notes_storage.update(_load_notes_from_jsonl(notes_path))
+    for note_id, note in _notes_storage.items():
+        if note.get("category") == "wiki":
+            _persist_wiki_note(note_id, note)
+    _loaded_notes_run_dir = run_dir_key
+def _sanitize_wiki_title(title: str) -> str:
+    cleaned = "".join(ch.lower() if ch.isalnum() else "-" for ch in title.strip())
+    slug = "-".join(part for part in cleaned.split("-") if part)
+    return slug or "wiki-note"
+
+
+def _get_wiki_directory() -> Path | None:
+    try:
+        run_dir = _get_run_dir()
+        if not run_dir:
+            return None
+        wiki_dir = run_dir / "wiki"
+        wiki_dir.mkdir(parents=True, exist_ok=True)
+    except OSError:
+        return None
+    else:
+        return wiki_dir
+
+
+def _get_wiki_note_path(note_id: str, note: dict[str, Any]) -> Path | None:
+    wiki_dir = _get_wiki_directory()
+    if not wiki_dir:
+        return None
+    wiki_filename = note.get("wiki_filename")
+    if not isinstance(wiki_filename, str) or not wiki_filename.strip():
+        title = note.get("title", "wiki-note")
+        wiki_filename = f"{note_id}-{_sanitize_wiki_title(str(title))}.md"
+        note["wiki_filename"] = wiki_filename
+    return wiki_dir / wiki_filename
+
+
+def _persist_wiki_note(note_id: str, note: dict[str, Any]) -> None:
+    wiki_path = _get_wiki_note_path(note_id, note)
+    if not wiki_path:
+        return
+    tags = note.get("tags", [])
+    tags_line = ", ".join(str(tag) for tag in tags) if isinstance(tags, list) and tags else "none"
+    content = (
+        f"# {note.get('title', 'Wiki Note')}\n\n"
+        f"**Note ID:** {note_id}\n"
+        f"**Created:** {note.get('created_at', '')}\n"
+        f"**Updated:** {note.get('updated_at', '')}\n"
+        f"**Tags:** {tags_line}\n\n"
+        "## Content\n\n"
+        f"{note.get('content', '')}\n"
+    )
+    wiki_path.write_text(content, encoding="utf-8")
+
+
+def _remove_wiki_note(note_id: str, note: dict[str, Any]) -> None:
+    wiki_path = _get_wiki_note_path(note_id, note)
+    if not wiki_path:
+        return
+    if wiki_path.exists():
+        wiki_path.unlink()
 def _filter_notes(
@@ -13,6 +177,7 @@ def _filter_notes(
     tags: list[str] | None = None,
     search_query: str | None = None,
 ) -> list[dict[str, Any]]:
+    _ensure_notes_loaded()
     filtered_notes = []

     for note_id, note in _notes_storage.items():
@@ -40,49 +205,66 @@ def _filter_notes(
 @register_tool(sandbox_execution=False)
-def create_note(
+def create_note(  # noqa: PLR0911
     title: str,
     content: str,
     category: str = "general",
     tags: list[str] | None = None,
 ) -> dict[str, Any]:
-    try:
-        if not title or not title.strip():
-            return {"success": False, "error": "Title cannot be empty", "note_id": None}
-
-        if not content or not content.strip():
-            return {"success": False, "error": "Content cannot be empty", "note_id": None}
-
-        valid_categories = ["general", "findings", "methodology", "questions", "plan"]
-        if category not in valid_categories:
-            return {
-                "success": False,
-                "error": f"Invalid category. Must be one of: {', '.join(valid_categories)}",
-                "note_id": None,
-            }
-
-        note_id = str(uuid.uuid4())[:5]
-        timestamp = datetime.now(UTC).isoformat()
-
-        note = {
-            "title": title.strip(),
-            "content": content.strip(),
-            "category": category,
-            "tags": tags or [],
-            "created_at": timestamp,
-            "updated_at": timestamp,
-        }
-
-        _notes_storage[note_id] = note
-
-    except (ValueError, TypeError) as e:
-        return {"success": False, "error": f"Failed to create note: {e}", "note_id": None}
-    else:
-        return {
-            "success": True,
-            "note_id": note_id,
-            "message": f"Note '{title}' created successfully",
-        }
+    with _notes_lock:
+        try:
+            _ensure_notes_loaded()
+            if not title or not title.strip():
+                return {"success": False, "error": "Title cannot be empty", "note_id": None}
+
+            if not content or not content.strip():
+                return {"success": False, "error": "Content cannot be empty", "note_id": None}
+
+            if category not in _VALID_NOTE_CATEGORIES:
+                return {
+                    "success": False,
+                    "error": (
+                        f"Invalid category. Must be one of: {', '.join(_VALID_NOTE_CATEGORIES)}"
+                    ),
+                    "note_id": None,
+                }
+
+            note_id = ""
+            for _ in range(20):
+                candidate = str(uuid.uuid4())[:5]
+                if candidate not in _notes_storage:
+                    note_id = candidate
+                    break
+            if not note_id:
+                return {"success": False, "error": "Failed to allocate note ID", "note_id": None}
+
+            timestamp = datetime.now(UTC).isoformat()
+
+            note = {
+                "title": title.strip(),
+                "content": content.strip(),
+                "category": category,
+                "tags": tags or [],
+                "created_at": timestamp,
+                "updated_at": timestamp,
+            }
+
+            _notes_storage[note_id] = note
+            _append_note_event("create", note_id, note)
+            if category == "wiki":
+                _persist_wiki_note(note_id, note)
+
+        except (ValueError, TypeError) as e:
+            return {"success": False, "error": f"Failed to create note: {e}", "note_id": None}
+        except OSError as e:
+            return {"success": False, "error": f"Failed to persist wiki note: {e}", "note_id": None}
+        else:
+            return {
+                "success": True,
+                "note_id": note_id,
+                "message": f"Note '{title}' created successfully",
+            }
@register_tool(sandbox_execution=False) @register_tool(sandbox_execution=False)
@@ -91,22 +273,23 @@ def list_notes(
     tags: list[str] | None = None,
     search: str | None = None,
 ) -> dict[str, Any]:
-    try:
-        filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)
-
-        return {
-            "success": True,
-            "notes": filtered_notes,
-            "total_count": len(filtered_notes),
-        }
-    except (ValueError, TypeError) as e:
-        return {
-            "success": False,
-            "error": f"Failed to list notes: {e}",
-            "notes": [],
-            "total_count": 0,
-        }
+    with _notes_lock:
+        try:
+            filtered_notes = _filter_notes(category=category, tags=tags, search_query=search)
+
+            return {
+                "success": True,
+                "notes": filtered_notes,
+                "total_count": len(filtered_notes),
+            }
+        except (ValueError, TypeError) as e:
+            return {
+                "success": False,
+                "error": f"Failed to list notes: {e}",
+                "notes": [],
+                "total_count": 0,
+            }
@register_tool(sandbox_execution=False) @register_tool(sandbox_execution=False)
@@ -116,49 +299,66 @@ def update_note(
     content: str | None = None,
     tags: list[str] | None = None,
 ) -> dict[str, Any]:
-    try:
-        if note_id not in _notes_storage:
-            return {"success": False, "error": f"Note with ID '{note_id}' not found"}
-
-        note = _notes_storage[note_id]
-
-        if title is not None:
-            if not title.strip():
-                return {"success": False, "error": "Title cannot be empty"}
-            note["title"] = title.strip()
-
-        if content is not None:
-            if not content.strip():
-                return {"success": False, "error": "Content cannot be empty"}
-            note["content"] = content.strip()
-
-        if tags is not None:
-            note["tags"] = tags
-
-        note["updated_at"] = datetime.now(UTC).isoformat()
-
-        return {
-            "success": True,
-            "message": f"Note '{note['title']}' updated successfully",
-        }
-    except (ValueError, TypeError) as e:
-        return {"success": False, "error": f"Failed to update note: {e}"}
+    with _notes_lock:
+        try:
+            _ensure_notes_loaded()
+            if note_id not in _notes_storage:
+                return {"success": False, "error": f"Note with ID '{note_id}' not found"}
+
+            note = _notes_storage[note_id]
+
+            if title is not None:
+                if not title.strip():
+                    return {"success": False, "error": "Title cannot be empty"}
+                note["title"] = title.strip()
+
+            if content is not None:
+                if not content.strip():
+                    return {"success": False, "error": "Content cannot be empty"}
+                note["content"] = content.strip()
+
+            if tags is not None:
+                note["tags"] = tags
+
+            note["updated_at"] = datetime.now(UTC).isoformat()
+            _append_note_event("update", note_id, note)
+            if note.get("category") == "wiki":
+                _persist_wiki_note(note_id, note)
+
+            return {
+                "success": True,
+                "message": f"Note '{note['title']}' updated successfully",
+            }
+        except (ValueError, TypeError) as e:
+            return {"success": False, "error": f"Failed to update note: {e}"}
+        except OSError as e:
+            return {"success": False, "error": f"Failed to persist wiki note: {e}"}
 @register_tool(sandbox_execution=False)
 def delete_note(note_id: str) -> dict[str, Any]:
-    try:
-        if note_id not in _notes_storage:
-            return {"success": False, "error": f"Note with ID '{note_id}' not found"}
-
-        note_title = _notes_storage[note_id]["title"]
-        del _notes_storage[note_id]
-    except (ValueError, TypeError) as e:
-        return {"success": False, "error": f"Failed to delete note: {e}"}
-    else:
-        return {
-            "success": True,
-            "message": f"Note '{note_title}' deleted successfully",
-        }
+    with _notes_lock:
+        try:
+            _ensure_notes_loaded()
+            if note_id not in _notes_storage:
+                return {"success": False, "error": f"Note with ID '{note_id}' not found"}
+
+            note = _notes_storage[note_id]
+            note_title = note["title"]
+            if note.get("category") == "wiki":
+                _remove_wiki_note(note_id, note)
+            del _notes_storage[note_id]
+            _append_note_event("delete", note_id)
+        except (ValueError, TypeError) as e:
+            return {"success": False, "error": f"Failed to delete note: {e}"}
+        except OSError as e:
+            return {"success": False, "error": f"Failed to delete wiki note: {e}"}
+        else:
+            return {
+                "success": True,
+                "message": f"Note '{note_title}' deleted successfully",
+            }

View File

@@ -2,7 +2,8 @@
 <tool name="create_note">
     <description>Create a personal note for observations, findings, and research during the scan.</description>
     <details>Use this tool for documenting discoveries, observations, methodology notes, and questions.
-This is your personal notepad for recording information you want to remember or reference later.
+This is your personal and shared run memory for recording information you want to remember or reference later.
+Use category "wiki" for repository source maps shared across agents in the same run.
 For tracking actionable tasks, use the todo tool instead.</details>
     <parameters>
         <parameter name="title" type="string" required="true">
@@ -12,7 +13,7 @@
         <description>Content of the note</description>
         </parameter>
         <parameter name="category" type="string" required="false">
-            <description>Category to organize the note (default: "general", "findings", "methodology", "questions", "plan")</description>
+            <description>Category to organize the note (default: "general", "findings", "methodology", "questions", "plan", "wiki")</description>
         </parameter>
         <parameter name="tags" type="string" required="false">
             <description>Tags for categorization</description>
@@ -122,6 +123,11 @@ The /api/internal/* endpoints are high priority as they appear to lack authentic
 <function=list_notes>
 <parameter=search>admin</parameter>
 <parameter=category>findings</parameter>
 </function>
+
+# Load shared repository wiki notes
+<function=list_notes>
+<parameter=category>wiki</parameter>
+</function>
 </examples>
 </tool>

View File

@@ -0,0 +1,28 @@
from strix.llm.config import LLMConfig
from strix.llm.llm import LLM


def test_llm_config_whitebox_defaults_to_false(monkeypatch) -> None:
    monkeypatch.setenv("STRIX_LLM", "openai/gpt-5")
    config = LLMConfig()
    assert config.is_whitebox is False


def test_llm_config_whitebox_can_be_enabled(monkeypatch) -> None:
    monkeypatch.setenv("STRIX_LLM", "openai/gpt-5")
    config = LLMConfig(is_whitebox=True)
    assert config.is_whitebox is True


def test_whitebox_prompt_loads_source_aware_coordination_skill(monkeypatch) -> None:
    monkeypatch.setenv("STRIX_LLM", "openai/gpt-5")

    whitebox_llm = LLM(LLMConfig(scan_mode="quick", is_whitebox=True), agent_name="StrixAgent")
    assert "<source_aware_whitebox>" in whitebox_llm.system_prompt
    assert "Begin with fast source triage" in whitebox_llm.system_prompt
    assert "You MUST begin at the very first step by running the code and testing live." not in (
        whitebox_llm.system_prompt
    )

    non_whitebox_llm = LLM(LLMConfig(scan_mode="quick", is_whitebox=False), agent_name="StrixAgent")
    assert "<source_aware_whitebox>" not in non_whitebox_llm.system_prompt

View File

@@ -0,0 +1,107 @@
from types import SimpleNamespace

import strix.agents as agents_module
from strix.llm.config import LLMConfig
from strix.tools.agents_graph import agents_graph_actions


def test_create_agent_inherits_parent_whitebox_flag(monkeypatch) -> None:
    monkeypatch.setenv("STRIX_LLM", "openai/gpt-5")
    agents_graph_actions._agent_graph["nodes"].clear()
    agents_graph_actions._agent_graph["edges"].clear()
    agents_graph_actions._agent_messages.clear()
    agents_graph_actions._running_agents.clear()
    agents_graph_actions._agent_instances.clear()
    agents_graph_actions._agent_states.clear()

    parent_id = "parent-agent"
    parent_llm = LLMConfig(timeout=123, scan_mode="standard", is_whitebox=True)
    agents_graph_actions._agent_instances[parent_id] = SimpleNamespace(
        llm_config=parent_llm,
        non_interactive=True,
    )

    captured_config: dict[str, object] = {}

    class FakeStrixAgent:
        def __init__(self, config: dict[str, object]):
            captured_config["agent_config"] = config

    class FakeThread:
        def __init__(self, target, args, daemon, name):
            self.target = target
            self.args = args
            self.daemon = daemon
            self.name = name

        def start(self) -> None:
            return None

    monkeypatch.setattr(agents_module, "StrixAgent", FakeStrixAgent)
    monkeypatch.setattr(agents_graph_actions.threading, "Thread", FakeThread)

    agent_state = SimpleNamespace(
        agent_id=parent_id,
        get_conversation_history=list,
    )

    result = agents_graph_actions.create_agent(
        agent_state=agent_state,
        task="source-aware child task",
        name="SourceAwareChild",
        inherit_context=False,
    )

    assert result["success"] is True
    llm_config = captured_config["agent_config"]["llm_config"]
    assert isinstance(llm_config, LLMConfig)
    assert llm_config.timeout == 123
    assert llm_config.scan_mode == "standard"
    assert llm_config.is_whitebox is True


def test_delegation_prompt_includes_wiki_memory_instruction_in_whitebox(monkeypatch) -> None:
    monkeypatch.setenv("STRIX_LLM", "openai/gpt-5")
    agents_graph_actions._agent_graph["nodes"].clear()
    agents_graph_actions._agent_graph["edges"].clear()
    agents_graph_actions._agent_messages.clear()
    agents_graph_actions._running_agents.clear()
    agents_graph_actions._agent_instances.clear()
    agents_graph_actions._agent_states.clear()

    parent_id = "parent-1"
    child_id = "child-1"
    agents_graph_actions._agent_graph["nodes"][parent_id] = {"name": "Parent", "status": "running"}
    agents_graph_actions._agent_graph["nodes"][child_id] = {"name": "Child", "status": "running"}

    class FakeState:
        def __init__(self) -> None:
            self.agent_id = child_id
            self.agent_name = "Child"
            self.parent_id = parent_id
            self.task = "analyze source risks"
            self.stop_requested = False
            self.messages: list[tuple[str, str]] = []

        def add_message(self, role: str, content: str) -> None:
            self.messages.append((role, content))

        def model_dump(self) -> dict[str, str]:
            return {"agent_id": self.agent_id}

    class FakeAgent:
        def __init__(self) -> None:
            self.llm_config = LLMConfig(is_whitebox=True)

        async def agent_loop(self, _task: str) -> dict[str, bool]:
            return {"ok": True}

    state = FakeState()
    agent = FakeAgent()

    result = agents_graph_actions._run_agent_in_thread(agent, state, inherited_messages=[])

    assert result["result"] == {"ok": True}
    task_messages = [msg for role, msg in state.messages if role == "user"]
    assert task_messages
    assert 'list_notes(category="wiki")' in task_messages[-1]
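
The last assertion implies that `_run_agent_in_thread` prepends shared-memory guidance to a delegated task when the run is whitebox. A hypothetical sketch of that message construction — the helper name and wording are illustrative; only the `list_notes(category="wiki")` hint itself is asserted by the test:

```python
def build_child_task_message(task: str, is_whitebox: bool) -> str:
    """Prefix delegated tasks with shared wiki-memory guidance on whitebox runs."""
    if is_whitebox:
        return (
            "Before starting, load the shared repository source map with "
            'list_notes(category="wiki").\n'
            f"Task: {task}"
        )
    return f"Task: {task}"
```

Injecting the hint into the delegation message (rather than the system prompt) means every child agent sees it exactly once, right next to its own task description.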

View File

@@ -0,0 +1,100 @@
from pathlib import Path

from strix.telemetry.tracer import Tracer, get_global_tracer, set_global_tracer
from strix.tools.notes import notes_actions


def _reset_notes_state() -> None:
    notes_actions._notes_storage.clear()
    notes_actions._loaded_notes_run_dir = None


def test_wiki_notes_are_persisted_and_removed(tmp_path: Path, monkeypatch) -> None:
    monkeypatch.chdir(tmp_path)
    _reset_notes_state()
    previous_tracer = get_global_tracer()
    tracer = Tracer("wiki-test-run")
    set_global_tracer(tracer)

    try:
        created = notes_actions.create_note(
            title="Repo Map",
            content="## Architecture\n- monolith",
            category="wiki",
            tags=["source-map"],
        )
        assert created["success"] is True
        note_id = created["note_id"]
        assert isinstance(note_id, str)

        note = notes_actions._notes_storage[note_id]
        wiki_filename = note.get("wiki_filename")
        assert isinstance(wiki_filename, str)
        wiki_path = tmp_path / "strix_runs" / "wiki-test-run" / "wiki" / wiki_filename
        assert wiki_path.exists()
        assert "## Architecture" in wiki_path.read_text(encoding="utf-8")

        updated = notes_actions.update_note(
            note_id=note_id,
            content="## Architecture\n- service-oriented",
        )
        assert updated["success"] is True
        assert "service-oriented" in wiki_path.read_text(encoding="utf-8")

        deleted = notes_actions.delete_note(note_id=note_id)
        assert deleted["success"] is True
        assert wiki_path.exists() is False
    finally:
        _reset_notes_state()
        set_global_tracer(previous_tracer)  # type: ignore[arg-type]


def test_notes_jsonl_replay_survives_memory_reset(tmp_path: Path, monkeypatch) -> None:
    monkeypatch.chdir(tmp_path)
    _reset_notes_state()
    previous_tracer = get_global_tracer()
    tracer = Tracer("notes-replay-run")
    set_global_tracer(tracer)

    try:
        created = notes_actions.create_note(
            title="Auth findings",
            content="initial finding",
            category="findings",
            tags=["auth"],
        )
        assert created["success"] is True
        note_id = created["note_id"]
        assert isinstance(note_id, str)

        notes_path = tmp_path / "strix_runs" / "notes-replay-run" / "notes" / "notes.jsonl"
        assert notes_path.exists() is True

        _reset_notes_state()
        listed = notes_actions.list_notes(category="findings")
        assert listed["success"] is True
        assert listed["total_count"] == 1
        assert listed["notes"][0]["note_id"] == note_id

        updated = notes_actions.update_note(note_id=note_id, content="updated finding")
        assert updated["success"] is True

        _reset_notes_state()
        listed_after_update = notes_actions.list_notes(search="updated finding")
        assert listed_after_update["success"] is True
        assert listed_after_update["total_count"] == 1
        assert listed_after_update["notes"][0]["note_id"] == note_id

        deleted = notes_actions.delete_note(note_id=note_id)
        assert deleted["success"] is True

        _reset_notes_state()
        listed_after_delete = notes_actions.list_notes(category="findings")
        assert listed_after_delete["success"] is True
        assert listed_after_delete["total_count"] == 0
    finally:
        _reset_notes_state()
        set_global_tracer(previous_tracer)  # type: ignore[arg-type]