Rename .pi to .feynman, rename citation agent to verifier, add website, skills, and docs
- Rename project config dir from .pi/ to .feynman/ (Pi supports this via piConfig.configDir)
- Rename citation agent to verifier across all prompts, agents, skills, and docs
- Add website with homepage and 24 doc pages (Astro + Tailwind)
- Add skills for all workflows (deep-research, lit, review, audit, replicate, compare, draft, autoresearch, watch, jobs, session-log, agentcomputer)
- Add Pi-native prompt frontmatter (args, section, topLevelCli) and read at runtime
- Remove sync-docs generation layer — docs are standalone
- Remove metadata/prompts.mjs and metadata/packages.mjs — not needed at runtime
- Rewrite README and homepage copy
- Add environment selection to /replicate before executing
- Add prompts/delegate.md and AGENTS.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
website/src/content/docs/workflows/audit.md (new file, 39 lines)

---
title: Code Audit
description: Compare paper claims against public codebases
section: Workflows
order: 4
---

## Usage

```
/audit <item>
```

## What it does

Compares claims made in a paper against its public codebase. Surfaces mismatches, missing experiments, and reproducibility risks.

## What it checks

- Do the reported hyperparameters match the code?
- Are all claimed experiments present in the repository?
- Does the training loop match the described methodology?
- Are there undocumented preprocessing steps?
- Do evaluation metrics match the paper's claims?
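The first check above can be pictured as a simple claim-vs-config diff (a minimal sketch; the function name and the example values are hypothetical illustrations, not Feynman's actual implementation):

```python
def diff_hyperparams(paper: dict, code: dict) -> list[str]:
    """Return human-readable mismatches between paper claims and repo config."""
    mismatches = []
    for key, claimed in paper.items():
        if key not in code:
            mismatches.append(f"{key}: claimed {claimed}, missing from code")
        elif code[key] != claimed:
            mismatches.append(f"{key}: paper says {claimed}, code uses {code[key]}")
    return mismatches

# Hypothetical example: the repo silently uses a different learning rate
# and never defines the warmup the paper describes.
paper_claims = {"lr": 3e-4, "batch_size": 256, "warmup_steps": 1000}
repo_config = {"lr": 1e-4, "batch_size": 256}
for line in diff_hyperparams(paper_claims, repo_config):
    print(line)
```

A claim with no counterpart in the code is reported separately from a claim with a conflicting value, since the two imply different reproducibility risks.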

## Example

```
/audit 2401.12345
```

## Output

An audit report with:

- Claim-by-claim verification
- Identified mismatches
- Missing components
- Reproducibility risk assessment
||||
website/src/content/docs/workflows/autoresearch.md (new file, 44 lines)

---
title: Autoresearch
description: Autonomous experiment optimization loop
section: Workflows
order: 8
---

## Usage

```
/autoresearch <idea>
```

## What it does

Runs an autonomous experiment loop:

1. **Edit** — Modify code or configuration
2. **Commit** — Save the change
3. **Benchmark** — Run evaluation
4. **Evaluate** — Compare against baseline
5. **Keep or revert** — Persist improvements, roll back regressions
6. **Repeat** — Continue until the target is hit
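One iteration of this loop can be sketched as pure control flow (a minimal illustration; `edit`, `commit`, `benchmark`, and `revert` are hypothetical callables standing in for the real code edits, git commits, and evaluation runs):

```python
def autoresearch_step(baseline, edit, commit, benchmark, revert):
    """One edit/commit/benchmark cycle: keep improvements, revert regressions."""
    edit()                 # 1. modify code or configuration
    commit()               # 2. save the change
    score = benchmark()    # 3. run evaluation
    if score > baseline:   # 4. compare against the current baseline
        return score       # 5a. keep: this becomes the new baseline
    revert()               # 5b. regression: roll the change back
    return baseline        # 6. the caller repeats until the target is hit
```

The loop's only persistent state is the current baseline score, which is what makes the keep-or-revert decision cheap to repeat.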

## Tracking

Metrics are tracked in:

- `autoresearch.md` — Human-readable progress log
- `autoresearch.jsonl` — Machine-readable metrics over time

## Controls

```
/autoresearch <idea>   # start or resume
/autoresearch off      # stop, keep data
/autoresearch clear    # delete all state, start fresh
```

## Example

```
/autoresearch optimize the learning rate schedule for better convergence
```
website/src/content/docs/workflows/compare.md (new file, 29 lines)

---
title: Source Comparison
description: Compare multiple sources with agreement/disagreement matrix
section: Workflows
order: 6
---

## Usage

```
/compare <topic>
```

## What it does

Compares multiple sources on a topic. Builds an agreement/disagreement matrix showing where sources align and where they conflict.
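The matrix puts claims on one axis and sources on the other (a minimal sketch; the data structure and the sample stances are illustrative, not the workflow's actual output format):

```python
def agreement_matrix(stances):
    """stances: {claim: {source: 'agree' | 'disagree'}} -> summary rows."""
    rows = []
    for claim, by_source in stances.items():
        verdicts = set(by_source.values())
        # All sources aligned -> consensus; any split -> contested.
        status = "consensus" if len(verdicts) == 1 else "contested"
        rows.append((claim, status, by_source))
    return rows

# Hypothetical stances from three sources (A, B, C) on two claims.
matrix = agreement_matrix({
    "RLHF improves helpfulness": {"A": "agree", "B": "agree", "C": "agree"},
    "Bigger models always calibrate better": {"A": "agree", "B": "disagree", "C": "disagree"},
})
```

Contested rows are where the synthesis step focuses, since they mark the claims that need an evidence-strength assessment.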

## Example

```
/compare approaches to constitutional AI training
```

## Output

- Source-by-source breakdown
- Agreement/disagreement matrix
- Synthesis of key differences
- Assessment of which positions have stronger evidence
website/src/content/docs/workflows/deep-research.md (new file, 40 lines)

---
title: Deep Research
description: Thorough source-heavy investigation with parallel agents
section: Workflows
order: 1
---

## Usage

```
/deepresearch <topic>
```

## What it does

Deep research runs a thorough, source-heavy investigation. It plans the research scope, delegates to parallel researcher agents, synthesizes findings, and adds inline citations.

The workflow follows these steps:

1. **Plan** — Clarify the research question and identify search strategy
2. **Delegate** — Spawn parallel researcher agents to gather evidence from different source types (papers, web, repos)
3. **Synthesize** — Merge findings, resolve contradictions, identify gaps
4. **Cite** — Add inline citations and verify all source URLs
5. **Deliver** — Write a durable research brief to `outputs/`
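The delegate step is a fan-out/fan-in over source types (a minimal sketch; `research` is a hypothetical stand-in for a real researcher subagent, not Feynman's actual delegation API):

```python
from concurrent.futures import ThreadPoolExecutor

def research(source_type, topic):
    """Hypothetical researcher: returns findings for one source type."""
    return [f"[{source_type}] finding about {topic}"]

def delegate(topic, source_types=("papers", "web", "repos")):
    # Fan out: one researcher per source type, run in parallel.
    with ThreadPoolExecutor(max_workers=len(source_types)) as pool:
        batches = pool.map(lambda s: research(s, topic), source_types)
    # Fan in: merge every batch into one list for the synthesize step.
    return [finding for batch in batches for finding in batch]
```

`Executor.map` returns results in input order, so the merged findings stay grouped by source type even though the researchers ran concurrently.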

## Example

```
/deepresearch transformer scaling laws and their implications for compute-optimal training
```

## Output

Produces a structured research brief with:

- Executive summary
- Key findings organized by theme
- Evidence tables with source links
- Open questions and suggested next steps
- Numbered sources section with direct URLs
website/src/content/docs/workflows/draft.md (new file, 37 lines)

---
title: Draft Writing
description: Paper-style draft generation from research findings
section: Workflows
order: 7
---

## Usage

```
/draft <topic>
```

## What it does

Produces a paper-style draft with structured sections. Writes to `papers/`.

## Structure

The generated draft includes:

- Title
- Abstract
- Introduction / Background
- Method or Approach
- Evidence and Analysis
- Limitations
- Conclusion
- Sources

## Example

```
/draft survey of differentiable physics simulators
```

The writer agent works only from supplied evidence — it never fabricates content. If evidence is insufficient, it explicitly notes the gaps.
website/src/content/docs/workflows/literature-review.md (new file, 31 lines)

---
title: Literature Review
description: Map consensus, disagreements, and open questions
section: Workflows
order: 2
---

## Usage

```
/lit <topic>
```

## What it does

Runs a structured literature review that searches across academic papers and web sources. Explicitly separates consensus findings from disagreements and open questions.

## Example

```
/lit multimodal reasoning benchmarks for large language models
```

## Output

A structured review covering:

- **Consensus** — What the field agrees on
- **Disagreements** — Where sources conflict
- **Open questions** — What remains unresolved
- **Sources** — Direct links to all referenced papers and articles
website/src/content/docs/workflows/replication.md (new file, 42 lines)

---
title: Replication
description: Plan replications of papers and claims
section: Workflows
order: 5
---

## Usage

```
/replicate <paper or claim>
```

## What it does

Extracts key implementation details from a paper, identifies what's needed to replicate the results, and asks where to run before executing anything.

Before running code, Feynman asks you to choose an execution environment:

- **Local** — run in the current working directory
- **Virtual environment** — create an isolated venv/conda env first
- **Cloud** — delegate to a remote Agent Computer machine
- **Plan only** — produce the replication plan without executing
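For the virtual-environment option, the isolation step looks roughly like this (an illustrative sketch; the env name and the commented-out install are examples, not the commands Feynman actually runs):

```shell
# Create an isolated interpreter + site-packages so the replication's
# dependencies never touch the host environment.
python3 -m venv .replication-env
. .replication-env/bin/activate
# pip install -r requirements.txt           # deps extracted from the paper's repo
python -c "import sys; print(sys.prefix)"   # prints the env path, confirming isolation
```

Everything installed after activation lands inside `.replication-env/`, so a failed replication can be discarded by deleting that one directory.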

## Example

```
/replicate "chain-of-thought prompting improves math reasoning"
```

## Output

A replication plan covering:

- Key claims to verify
- Required resources (compute, data, models)
- Implementation details extracted from the paper
- Potential pitfalls and underspecified details
- Step-by-step replication procedure
- Success criteria

If an execution environment is selected, also produces runnable scripts and captured results.
website/src/content/docs/workflows/review.md (new file, 49 lines)

---
title: Peer Review
description: Simulated peer review with severity-graded feedback
section: Workflows
order: 3
---

## Usage

```
/review <artifact>
```

## What it does

Simulates a tough-but-fair peer review for AI research artifacts. Evaluates novelty, empirical rigor, baselines, ablations, and reproducibility.

The reviewer agent identifies:

- Weak baselines
- Missing ablations
- Evaluation mismatches
- Benchmark leakage
- Under-specified implementation details

## Severity levels

Feedback is graded by severity:

- **FATAL** — Fundamental issues that invalidate the claims
- **MAJOR** — Significant problems that need addressing
- **MINOR** — Small improvements or clarifications

## Example

```
/review outputs/scaling-laws-brief.md
```

## Output

Structured review with:

- Summary of the work
- Strengths
- Weaknesses (severity-graded)
- Questions for the authors
- Verdict (accept / revise / reject)
- Revision plan
website/src/content/docs/workflows/watch.md (new file, 29 lines)

---
title: Watch
description: Recurring research monitoring
section: Workflows
order: 9
---

## Usage

```
/watch <topic>
```

## What it does

Schedules a recurring research watch. Sets a baseline of current knowledge and defines what constitutes a meaningful change worth reporting.

## Example

```
/watch new papers on test-time compute scaling
```

## How it works

1. Feynman establishes a baseline by surveying current sources
2. Defines change signals (new papers, updated results, new repos)
3. Schedules periodic checks via `pi-schedule-prompt`
4. Reports only when meaningful changes are detected
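The report-only-on-change behavior in step 4 amounts to a diff against the stored baseline (a minimal sketch; representing the baseline as a set of known paper IDs is an assumption for illustration, not the actual storage format):

```python
def check_watch(baseline, current):
    """Return newly seen items worth reporting, or None if nothing changed."""
    new_items = current - baseline   # change signal: anything absent from the baseline
    if new_items:
        return sorted(new_items)     # a meaningful change: report it
    return None                      # no change: stay silent until the next check

# Hypothetical arXiv IDs: one new paper appeared since the baseline survey.
report = check_watch({"2401.00001", "2402.00002"},
                     {"2401.00001", "2402.00002", "2403.00003"})
```

After a report, the new items would be folded into the baseline so the same change is never reported twice.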