| description | args | section | topLevelCli |
|---|---|---|---|
| Autonomous experiment loop — try ideas, measure results, keep what works, discard what doesn't, repeat. | <idea> | Research Workflows | true |
Start an autoresearch optimization loop for: $@
This command uses `pi-autoresearch`.
## Step 1: Gather
If `autoresearch.md` and `autoresearch.jsonl` already exist, ask the user whether to resume or start fresh.
Otherwise, collect the following from the user before doing anything else (a sketch of the resulting config follows this list):
- What to optimize (test speed, bundle size, training loss, build time, etc.)
- The benchmark command to run
- The metric name, unit, and direction (lower/higher is better)
- Files in scope for changes
- Maximum number of iterations (default: 20)
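For concreteness, the gathered answers might map onto a session config shaped like this. A minimal sketch in Python; the `SessionConfig` name and field layout are illustrative, not pi-autoresearch's actual schema:

```python
from dataclasses import dataclass, field
from typing import Literal

@dataclass
class SessionConfig:
    """Illustrative shape of the answers gathered in Step 1."""
    target: str                            # what to optimize, e.g. "test speed"
    benchmark_cmd: str                     # command that produces the metric
    metric: str                            # metric name, e.g. "wall_time"
    unit: str                              # e.g. "s", "KiB"
    direction: Literal["lower", "higher"]  # which direction is better
    files_in_scope: list[str] = field(default_factory=list)
    max_iterations: int = 20               # default from the prompt above

config = SessionConfig(
    target="test speed",
    benchmark_cmd="pytest -q",
    metric="wall_time",
    unit="s",
    direction="lower",
    files_in_scope=["src/", "tests/"],
)
```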
## Step 2: Environment
Ask the user where to run:
- Local — run in the current working directory
- New git branch — create a branch so main stays clean
- Virtual environment — create an isolated venv/conda env first
- Docker — run experiment code inside an isolated Docker container (see the sketch after this step)
- Cloud — delegate to a remote Agent Computer machine via `/delegate`
Do not proceed without a clear answer.
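If the user picks Docker, the benchmark can run inside a throwaway container while the agent itself stays on the host. A minimal sketch, assuming the project only needs its working directory mounted and a stock `python:3.12` image; both are assumptions, not requirements of the skill:

```python
import os
import subprocess

def run_in_docker(benchmark_cmd: str, image: str = "python:3.12") -> str:
    """Run the benchmark in a disposable container; mount the project
    read-write so edits made on the host are what gets measured."""
    result = subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{os.getcwd()}:/work", "-w", "/work",
         image, "sh", "-c", benchmark_cmd],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Example: output = run_in_docker("pytest -q")
```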
## Step 3: Confirm
Present the full plan to the user before starting:
- Optimization target: [metric] ([direction])
- Benchmark command: [command]
- Files in scope: [files]
- Environment: [chosen environment]
- Max iterations: [N]
Ask the user to confirm. Do not start the loop without explicit approval.
## Step 4: Run
Initialize the session: create `autoresearch.md` and `autoresearch.sh`, run the baseline, and start looping.
Each iteration: edit → commit → `run_experiment` → `log_experiment` → keep or revert → repeat. Do not stop unless interrupted or `maxIterations` is reached.
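The keep-or-revert decision depends on the metric direction gathered in Step 1. A skeleton of the loop, with the experiment and git actions passed in as callables since their real implementations belong to the tools described under Key tools below; the names and signatures here are illustrative:

```python
from typing import Callable

def autoresearch_loop(
    try_idea: Callable[[], None],         # edit files in scope, then commit
    run_experiment: Callable[[], float],  # run benchmark, return metric value
    revert: Callable[[], None],           # undo the last edit, e.g. git revert
    direction: str = "lower",
    max_iterations: int = 20,
) -> float:
    """Skeleton of Step 4: measure a baseline, then iterate,
    keeping improvements and reverting regressions."""
    best = run_experiment()               # baseline before any edits
    for _ in range(max_iterations):
        try_idea()
        value = run_experiment()
        improved = value < best if direction == "lower" else value > best
        if improved:
            best = value                  # keep what works
        else:
            revert()                      # discard what doesn't
    return best
```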
## Key tools
- `init_experiment` — one-time session config (name, metric, unit, direction)
- `run_experiment` — run the benchmark command, capture output and wall-clock time (sketched below)
- `log_experiment` — record result, auto-commit, update dashboard
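A plausible shape for the two measurement tools. This is a sketch only; the real tools' signatures, and the `autoresearch.jsonl` record format, are assumptions:

```python
import json
import subprocess
import time

def run_experiment(cmd: str) -> tuple[str, float]:
    """Run the benchmark command; return its output and wall-clock seconds."""
    start = time.monotonic()
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.stdout, time.monotonic() - start

def log_experiment(value: float, note: str,
                   path: str = "autoresearch.jsonl") -> None:
    """Append one JSON record per experiment, then auto-commit the state."""
    record = {"ts": time.time(), "value": value, "note": note}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", f"autoresearch: {note}"], check=True)
```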
## Subcommands
- `/autoresearch <text>` — start or resume the loop
- `/autoresearch off` — stop the loop, keep data
- `/autoresearch clear` — delete all state and start fresh