Rename .pi to .feynman, rename citation agent to verifier, add website, skills, and docs

- Rename project config dir from .pi/ to .feynman/ (Pi supports this via piConfig.configDir)
- Rename citation agent to verifier across all prompts, agents, skills, and docs
- Add website with homepage and 24 doc pages (Astro + Tailwind)
- Add skills for all workflows (deep-research, lit, review, audit, replicate, compare, draft, autoresearch, watch, jobs, session-log, agentcomputer)
- Add Pi-native prompt frontmatter (args, section, topLevelCli) and read at runtime
- Remove sync-docs generation layer — docs are standalone
- Remove metadata/prompts.mjs and metadata/packages.mjs — not needed at runtime
- Rewrite README and homepage copy
- Add environment selection to /replicate before executing
- Add prompts/delegate.md and AGENTS.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Advait Paliwal
2026-03-23 17:35:35 -07:00
parent 406d50b3ff
commit f5570b4e5a
98 changed files with 9886 additions and 298 deletions


@@ -1,10 +1,13 @@
---
description: Compare a paper's claims against its public codebase and identify mismatches, omissions, and reproducibility risks.
+args: <item>
+section: Research Workflows
+topLevelCli: true
---
Audit the paper and codebase for: $@
Requirements:
-- Use the `researcher` subagent for evidence gathering and the `citation` subagent to verify sources and add inline citations when the audit is non-trivial.
+- Use the `researcher` subagent for evidence gathering and the `verifier` subagent to verify sources and add inline citations when the audit is non-trivial.
- Compare claimed methods, defaults, metrics, and data handling against the actual code.
- Call out missing code, mismatches, ambiguous defaults, and reproduction risks.
- Save exactly one audit artifact to `outputs/` as markdown.


@@ -1,5 +1,8 @@
---
description: Autonomous experiment loop — try ideas, measure results, keep what works, discard what doesn't, repeat.
+args: <idea>
+section: Research Workflows
+topLevelCli: true
---
Start an autoresearch optimization loop for: $@


@@ -1,10 +1,13 @@
---
description: Compare multiple sources on a topic and produce a source-grounded matrix of agreements, disagreements, and confidence.
+args: <topic>
+section: Research Workflows
+topLevelCli: true
---
Compare sources for: $@
Requirements:
-- Use the `researcher` subagent to gather source material when the comparison set is broad, and the `citation` subagent to verify sources and add inline citations to the final matrix.
+- Use the `researcher` subagent to gather source material when the comparison set is broad, and the `verifier` subagent to verify sources and add inline citations to the final matrix.
- Build a comparison matrix covering: source, key claim, evidence type, caveats, confidence.
- Distinguish agreement, disagreement, and uncertainty clearly.
- Save exactly one comparison to `outputs/` as markdown.


@@ -1,9 +1,12 @@
---
description: Run a thorough, source-heavy investigation on a topic and produce a durable research brief with inline citations.
+args: <topic>
+section: Research Workflows
+topLevelCli: true
---
Run a deep research workflow for: $@
-You are the Lead Researcher. You plan, delegate, evaluate, loop, write, and cite. Internal orchestration is invisible to the user unless they ask.
+You are the Lead Researcher. You plan, delegate, evaluate, verify, write, and cite. Internal orchestration is invisible to the user unless they ask.
## 1. Plan
@@ -12,8 +15,30 @@ Analyze the research question using extended thinking. Develop a research strate
- Evidence types needed (papers, web, code, data, docs)
- Sub-questions disjoint enough to parallelize
- Source types and time periods that matter
- Acceptance criteria: what evidence would make the answer "sufficient"
-Save the plan immediately with `memory_remember` (type: `fact`, key: `deepresearch.plan`). Context windows get truncated on long runs — the plan must survive.
+Write the plan to `outputs/.plans/deepresearch-plan.md` as a self-contained artifact:
+```markdown
+# Research Plan: [topic]
+## Questions
+1. ...
+## Strategy
+- Researcher allocations and dimensions
+- Expected rounds
+## Acceptance Criteria
+- [ ] All key questions answered with ≥2 independent sources
+- [ ] Contradictions identified and addressed
+- [ ] No single-source claims on critical findings
+## Decision Log
+(Updated as the workflow progresses)
+```
+Also save the plan with `memory_remember` (type: `fact`, key: `deepresearch.plan`) so it survives context truncation.
## 2. Scale decision
@@ -57,7 +82,9 @@ After researchers return, read their output files and critically assess:
- Are there contradictions needing resolution?
- Is any key angle missing entirely?
-If gaps are significant, spawn another targeted batch of researchers. No fixed cap on rounds — iterate until evidence is sufficient or sources are exhausted. Update the stored plan with `memory_remember` as it evolves.
+If gaps are significant, spawn another targeted batch of researchers. No fixed cap on rounds — iterate until evidence is sufficient or sources are exhausted.
+Update the plan artifact (`outputs/.plans/deepresearch-plan.md`) decision log after each round.
Most topics need 1-2 rounds. Stop when additional rounds would not materially change conclusions.
@@ -84,22 +111,51 @@ Save this draft to a temp file (e.g., `draft.md` in the chain artifacts dir or a
## 6. Cite
-Spawn the `citation` agent to post-process YOUR draft. The citation agent adds inline citations, verifies every source URL, and produces the final output:
+Spawn the `verifier` agent to post-process YOUR draft. The verifier agent adds inline citations, verifies every source URL, and produces the final output:
```
-{ agent: "citation", task: "Add inline citations to draft.md using the research files as source material. Verify every URL.", output: "brief.md" }
+{ agent: "verifier", task: "Add inline citations to draft.md using the research files as source material. Verify every URL.", output: "brief.md" }
```
-The citation agent does not rewrite the report — it only anchors claims to sources and builds the numbered Sources section.
+The verifier agent does not rewrite the report — it only anchors claims to sources and builds the numbered Sources section.
-## 7. Deliver
+## 7. Verify
-Copy the final cited output to the appropriate folder:
+Spawn the `reviewer` agent against the cited draft. The reviewer checks for:
+- Unsupported claims that slipped past citation
+- Logical gaps or contradictions between sections
+- Single-source claims on critical findings
+- Overstated confidence relative to evidence quality
+```
+{ agent: "reviewer", task: "Verify brief.md — flag any claims that lack sufficient source backing, identify logical gaps, and check that confidence levels match evidence strength. This is a verification pass, not a peer review.", output: "verification.md" }
+```
+If the reviewer flags FATAL issues, fix them in the brief before delivering. MAJOR issues get noted in the Open Questions section. MINOR issues are accepted.
+## 8. Deliver
+Copy the final cited and verified output to the appropriate folder:
- Paper-style drafts → `papers/`
- Everything else → `outputs/`
Use a descriptive filename based on the topic.
+Write a provenance record alongside the main artifact as `<filename>.provenance.md`:
+```markdown
+# Provenance: [topic]
+- **Date:** [date]
+- **Rounds:** [number of researcher rounds]
+- **Sources consulted:** [total unique sources across all research files]
+- **Sources accepted:** [sources that survived citation verification]
+- **Sources rejected:** [dead links, unverifiable, or removed]
+- **Verification:** [PASS / PASS WITH NOTES — summary of reviewer findings]
+- **Plan:** outputs/.plans/deepresearch-plan.md
+- **Research files:** [list of intermediate research-*.md files]
+```
## Background execution
If the user wants unattended execution or the sweep will clearly take a while:

prompts/delegate.md (new file)

@@ -0,0 +1,21 @@
---
description: Delegate a research task to a remote Agent Computer machine for cloud execution.
args: <task>
section: Internal
---
Delegate the following task to a remote Agent Computer machine: $@
## Workflow
1. **Check CLI** — Verify `computer` or `aicomputer` is installed and authenticated. If not, install with `npm install -g aicomputer` and run `computer login`.
2. **Pick a machine** — Run `computer ls --json` and choose an appropriate machine. If none are running, tell the user to create one with `computer create`.
3. **Pick an agent** — Run `computer agent agents <machine> --json` and choose an installed agent with credentials (prefer Claude).
4. **Create a session** — Use `computer agent sessions new <machine> --agent claude --name research --json`.
5. **Send the task** — Translate the user's research task into a self-contained prompt and send it via `computer agent prompt`. The prompt must include:
- The full research objective
- Where to write outputs (default: `/workspace/outputs/`)
- What artifact to produce when done (summary file)
- Any tools or data sources to use
6. **Monitor** — Use `computer agent watch <machine> --session <session_id>` to stream progress. Report status to the user at meaningful milestones.
7. **Retrieve results** — When the remote agent finishes, pull the summary back with `computer agent prompt <machine> "cat /workspace/outputs/summary.md" --session <session_id>`. Present results to the user.
8. **Clean up** — Close the session with `computer agent close <machine> --session <session_id>` unless the user wants to continue.
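The self-contained prompt sent in step 5 can follow a shape like the sketch below. The objective and tool list are hypothetical; the paths are the defaults named in step 5:

```markdown
# Research task (hypothetical example)
Objective: survey recent work on <topic> and synthesize the main findings.
- Write all intermediate outputs to /workspace/outputs/
- When finished, write a single summary artifact to /workspace/outputs/summary.md
- Tools and data sources: web search and paper search; assume no local datasets
```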


@@ -1,10 +1,13 @@
---
description: Turn research findings into a polished paper-style draft with equations, sections, and explicit claims.
+args: <topic>
+section: Research Workflows
+topLevelCli: true
---
Write a paper-style draft for: $@
Requirements:
-- Use the `writer` subagent when the draft should be produced from already-collected notes, then use the `citation` subagent to add inline citations and verify sources.
+- Use the `writer` subagent when the draft should be produced from already-collected notes, then use the `verifier` subagent to add inline citations and verify sources.
- Include at minimum: title, abstract, problem statement, related work, method or synthesis, evidence or experiments, limitations, conclusion.
- Use clean Markdown with LaTeX where equations materially help.
- Save exactly one draft to `papers/` as markdown.


@@ -1,5 +1,7 @@
---
description: Inspect active background research work, including running processes and scheduled follow-ups.
+section: Project & Session
+topLevelCli: true
---
Inspect active background work for this project.


@@ -1,11 +1,15 @@
---
description: Run a literature review on a topic using paper search and primary-source synthesis.
+args: <topic>
+section: Research Workflows
+topLevelCli: true
---
Investigate the following topic as a literature review: $@
-Requirements:
-- Use the `researcher` subagent when the sweep is wide enough to benefit from delegated paper triage before synthesis.
-- Separate consensus, disagreements, and open questions.
-- When useful, propose concrete next experiments or follow-up reading.
-- Save exactly one literature review to `outputs/` as markdown.
-- End with a `Sources` section containing direct URLs for every source used.
+## Workflow
+1. **Gather** — Use the `researcher` subagent when the sweep is wide enough to benefit from delegated paper triage before synthesis. For narrow topics, search directly.
+2. **Synthesize** — Separate consensus, disagreements, and open questions. When useful, propose concrete next experiments or follow-up reading.
+3. **Cite** — Spawn the `verifier` agent to add inline citations and verify every source URL in the draft.
+4. **Verify** — Spawn the `reviewer` agent to check the cited draft for unsupported claims, logical gaps, and single-source critical findings. Fix FATAL issues before delivering. Note MAJOR issues in Open Questions.
+5. **Deliver** — Save exactly one literature review to `outputs/` as markdown. Write a provenance record alongside it as `<filename>.provenance.md` listing: date, sources consulted vs. accepted vs. rejected, verification status, and intermediate research files used.
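The provenance record in step 5 can mirror the one produced by the deep-research workflow. A minimal sketch, in which every count, date, and filename is a hypothetical placeholder:

```markdown
# Provenance: [topic]
- **Date:** 2026-03-23
- **Sources consulted:** 14
- **Sources accepted:** 11
- **Sources rejected:** 3 (dead links)
- **Verification:** PASS WITH NOTES (MAJOR issues moved to Open Questions)
- **Research files:** research-1.md, research-2.md
```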


@@ -1,5 +1,7 @@
---
description: Write a durable session log with completed work, findings, open questions, and next steps.
+section: Project & Session
+topLevelCli: true
---
Write a session log for the current research work.


@@ -1,12 +1,21 @@
---
description: Plan or execute a replication workflow for a paper, claim, or benchmark.
+args: <paper>
+section: Research Workflows
+topLevelCli: true
---
Design a replication plan for: $@
-Requirements:
-- Use the `researcher` subagent to extract implementation details from the target paper and any linked code.
-- Determine what code, datasets, metrics, and environment are needed.
-- If enough information is available locally, implement and run the replication steps.
-- Save notes, scripts, and results to disk in a reproducible layout.
-- Be explicit about what is verified, what is inferred, and what is still missing.
-- End with a `Sources` section containing paper and repository URLs.
+## Workflow
+1. **Extract** — Use the `researcher` subagent to pull implementation details from the target paper and any linked code.
+2. **Plan** — Determine what code, datasets, metrics, and environment are needed. Be explicit about what is verified, what is inferred, and what is still missing.
+3. **Environment** — Before running anything, ask the user where to execute:
+   - **Local** — run in the current working directory
+   - **Virtual environment** — create an isolated venv/conda env first
+   - **Cloud** — delegate to a remote Agent Computer machine via `/delegate`
+   - **Plan only** — produce the replication plan without executing
+4. **Execute** — If the user chose an execution environment, implement and run the replication steps there. Save notes, scripts, and results to disk in a reproducible layout.
+5. **Report** — End with a `Sources` section containing paper and repository URLs.
+Do not install packages, run training, or execute experiments without confirming the execution environment first.
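For the virtual-environment option in step 3, the setup can look like this minimal sketch. The env name is arbitrary, and the `requirements.txt` check is an assumption about the target repo's layout:

```shell
# Create an isolated environment before touching the replication code
python3 -m venv .venv-replication
. .venv-replication/bin/activate
# Install the target repo's dependencies only if it ships a requirements file
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
echo "environment ready: $VIRTUAL_ENV"
```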


@@ -1,5 +1,8 @@
---
description: Simulate an AI research peer review with likely objections, severity, and a concrete revision plan.
+args: <artifact>
+section: Research Workflows
+topLevelCli: true
---
Review this AI research artifact: $@


@@ -1,5 +1,8 @@
---
description: Set up a recurring or deferred research watch on a topic, company, paper area, or product surface.
+args: <topic>
+section: Research Workflows
+topLevelCli: true
---
Create a research watch for: $@