Overhaul Feynman harness: streamline agents, prompts, and extensions

Remove legacy chains, skills, and config modules.
Add citation agent, SYSTEM.md, modular research-tools extension, and web-access layer.
Add ralph-wiggum to Pi package stack for long-running loops.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 14:59:30 -07:00
parent d23e679331
commit 406d50b3ff
60 changed files with 2994 additions and 3191 deletions
--- a/prompts/autoresearch.md
+++ b/prompts/autoresearch.md
@@ -1,19 +1,32 @@
 ---
-description: Turn a research idea into a paper-oriented end-to-end run with literature, hypotheses, experiments when possible, and a draft artifact.
+description: Autonomous experiment loop — try ideas, measure results, keep what works, discard what doesn't, repeat.
 ---
-Run an autoresearch workflow for: $@
+Start an autoresearch optimization loop for: $@

-Requirements:
- Prefer the project `auto` chain or the `planner` + `researcher` + `verifier` + `writer` subagents when the task is broad enough to benefit from decomposition.
- If the run is likely to take a while, or the user wants it detached, launch the subagent workflow in background with `clarify: false, async: true` and report how to inspect status.
- Start by clarifying the research objective, scope, and target contribution.
- Search for the strongest relevant primary sources first.
- If the topic is current, product-oriented, market-facing, or asks about latest developments, start with `web_search` and `fetch_content`.
- Use `alpha_search` for academic background or paper-centric parts of the topic, but do not rely on it alone for current topics.
- Build a compact evidence table before committing to a paper narrative.
- If experiments are feasible in the current environment, design and run the smallest experiment that materially reduces uncertainty.
- If experiments are not feasible, produce a paper-style draft that is explicit about missing validation and limitations.
- Produce one final durable markdown artifact for the user-facing result.
- If the result is a paper-style draft, save it to `papers/`; otherwise save it to `outputs/`.
- Do not create extra user-facing intermediate markdown files unless the user explicitly asks for them.
- End with a `Sources` section containing direct URLs for every source used.
+This command uses pi-autoresearch. Enter autoresearch mode and begin the autonomous experiment loop.
+
+## Behavior
+
+- If `autoresearch.md` and `autoresearch.jsonl` already exist in the project, resume the existing session with the user's input as additional context.
+- Otherwise, gather the optimization target from the user:
+  - What to optimize (test speed, bundle size, training loss, build time, etc.)
+  - The benchmark command to run
+  - The metric name, unit, and direction (lower/higher is better)
+  - Files in scope for changes
+- Then initialize the session: create `autoresearch.md`, `autoresearch.sh`, run the baseline, and start looping.
+
+## Loop
+
+Each iteration: edit → commit → `run_experiment` → `log_experiment` → keep or revert → repeat. Do not stop unless interrupted or `maxIterations` is reached.
+
+## Key tools
+
+- `init_experiment` — one-time session config (name, metric, unit, direction)
+- `run_experiment` — run the benchmark command, capture output and wall-clock time
+- `log_experiment` — record result, auto-commit, update dashboard
+
+## Subcommands
+
+- `/autoresearch <text>` — start or resume the loop
+- `/autoresearch off` — stop the loop, keep data
+- `/autoresearch clear` — delete all state and start fresh
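
The keep-or-revert decision described in the new prompt's Loop section can be sketched roughly as below. This is a minimal illustration under stated assumptions, not pi-autoresearch's implementation: `run_loop`, `is_improvement`, the `measure` callback, and the log record fields are hypothetical names invented for the example. Only the ideas of a baseline, a metric direction (lower/higher is better), and an iteration cap come from the prompt itself.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def is_improvement(value: float, best: float, direction: str) -> bool:
    """Return True when `value` beats `best` for the given direction."""
    if direction not in ("lower", "higher"):
        raise ValueError("direction must be 'lower' or 'higher'")
    return value < best if direction == "lower" else value > best


def run_loop(
    baseline: float,
    candidates: Iterable[str],
    measure: Callable[[str], float],
    direction: str,
    max_iterations: int,
) -> Tuple[float, List[Dict[str, object]]]:
    """Try each candidate change, keep it only if the metric improves.

    `measure` is a hypothetical stand-in for running the benchmark on a
    candidate change; a real session would edit files, commit, and run
    a benchmark command instead of calling a Python function.
    """
    best = baseline
    log: List[Dict[str, object]] = []
    for iteration, candidate in enumerate(candidates):
        if iteration >= max_iterations:
            break  # stop once the iteration cap is reached
        value = measure(candidate)
        kept = is_improvement(value, best, direction)
        if kept:
            best = value  # keep the change; it becomes the new reference
        # when not kept, a real loop would revert the change here
        log.append({"candidate": candidate, "value": value, "kept": kept})
    return best, log
```

With a baseline of 11.0, direction `"lower"`, and candidates that measure 10.0, 12.0, and 8.0 in that order, the sketch keeps the first and third candidates and reverts the second, because each result is compared against the best value kept so far rather than against the original baseline.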