Files
feynman/prompts/replicate.md
Advait Paliwal 7024a86024 Replace Pi tool registrations with skills and CLI integration
- Remove all manually registered Pi tools (alpha_search, alpha_get_paper,
  alpha_ask_paper, alpha_annotate_paper, alpha_list_annotations,
  alpha_read_code, session_search, preview_file) and their wrappers
  (alpha.ts, preview.ts, session-search.ts, alpha-tools.test.ts)
- Add Pi skill files for alpha-research, session-search, preview,
  modal-compute, and runpod-compute in skills/
- Sync skills to ~/.feynman/agent/skills/ on startup via syncBundledAssets
- Add node_modules/.bin to Pi subprocess PATH so alpha CLI is accessible
- Add /outputs extension command to browse research artifacts via dialog
- Add Modal and RunPod as execution environments in /replicate and
  /autoresearch prompts
- Remove redundant /alpha-login /alpha-logout /alpha-status REPL commands
  (feynman alpha CLI still works)
- Update README, researcher agent, metadata, and website docs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 00:38:45 -07:00

25 lines
2.3 KiB
Markdown

---
description: Plan or execute a replication workflow for a paper, claim, or benchmark.
args: <paper>
section: Research Workflows
topLevelCli: true
---
Design a replication plan for: $@
## Workflow
1. **Extract** — Use the `researcher` subagent to pull implementation details from the target paper and any linked code. If `CHANGELOG.md` exists, read the most recent relevant entries before planning or resuming.
2. **Plan** — Determine what code, datasets, metrics, and environment are needed. Be explicit about what is verified, what is inferred, what is still missing, and which checks or test oracles will be used to decide whether the replication succeeded.
3. **Environment** — Before running anything, ask the user where to execute:
- **Local** — run in the current working directory
- **Virtual environment** — create an isolated venv/conda env first
- **Docker** — run experiment code inside an isolated Docker container
- **Modal** — run on Modal's serverless GPU infrastructure. Write a Modal-decorated Python script and execute with `modal run <script.py>`. Best for burst GPU jobs that don't need persistent state. Requires `modal` CLI (`pip install modal && modal setup`).
- **RunPod** — provision a GPU pod on RunPod and SSH in for execution. Use `runpodctl` to create pods, transfer files, and manage lifecycle. Best for long-running experiments or when you need SSH access and persistent storage. Requires `runpodctl` CLI and `RUNPOD_API_KEY`.
- **Plan only** — produce the replication plan without executing
4. **Execute** — If the user chose an execution environment, implement and run the replication steps there. Save notes, scripts, raw outputs, and results to disk in a reproducible layout. Do not call the outcome replicated unless the planned checks actually passed.
5. **Log** — For multi-step or resumable replication work, append concise entries to `CHANGELOG.md` after meaningful progress, failed attempts, major verification outcomes, and before stopping. Record the active objective, what changed, what was checked, and the next step.
6. **Report** — End with a `Sources` section containing paper and repository URLs.
Do not install packages, run training, or execute experiments without confirming the execution environment first.