Initial Feynman research agent scaffold

This commit is contained in:
Advait Paliwal
2026-03-20 11:05:58 -07:00
commit 1fe1ce04a5
25 changed files with 5079 additions and 0 deletions


@@ -0,0 +1,48 @@
---
name: experiment-design
description: Use this when the task is to turn a vague research idea into a testable experiment, define metrics, choose baselines, or plan ablations.
---
# Experiment Design
## When To Use
Use this skill when the user has:
- a hypothesis to test
- a method to evaluate
- an unclear benchmark plan
- a need for baselines, ablations, or metrics
## Procedure
1. Restate the research question as a falsifiable claim.
2. Define:
- independent variables
- dependent variables
- success metrics
- baselines
- constraints
3. Search for prior work first with `alpha_search` so you do not reinvent an obviously flawed setup.
4. Use `alpha_get_paper` and `alpha_ask_paper` on the strongest references.
5. Prefer the smallest experiment that can meaningfully reduce uncertainty.
6. List confounders and failure modes up front.
7. If implementation is requested, create the scripts, configs, and logging plan.
8. Write the plan to disk before running expensive work.
## Pitfalls
- Avoid experiments with no baseline.
- Avoid metrics that do not connect to the claim.
- Avoid ablations that change multiple variables at once.
- Avoid broad plans that cannot be executed with the current environment.
## Deliverable
Produce:
- hypothesis
- setup
- baselines
- metrics
- ablations
- risks
- next action
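## Example
The deliverable above can be sketched as a small plan object written to disk before any expensive run (step 8). This is a minimal illustration, not a required schema; every field value below is a placeholder.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ExperimentPlan:
    """Illustrative container for the deliverable fields above."""
    hypothesis: str
    setup: str
    baselines: list = field(default_factory=list)
    metrics: list = field(default_factory=list)
    ablations: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    next_action: str = ""

plan = ExperimentPlan(
    hypothesis="Method X improves accuracy over baseline Y on dataset Z",
    setup="single-GPU fine-tuning, 3 seeds, fixed data split",
    baselines=["Y with paper defaults", "majority-class baseline"],
    metrics=["accuracy", "macro-F1"],
    ablations=["X without component A (one variable at a time)"],
    risks=["data leakage via preprocessing", "seed variance"],
    next_action="implement the smallest runnable config",
)

# Persist the plan before running expensive work (step 8).
with open("experiment_plan.json", "w") as f:
    json.dump(asdict(plan), f, indent=2)
```

Writing the plan down first makes it cheap to review the baselines and confounders before any compute is spent.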


@@ -0,0 +1,52 @@
---
name: literature-review
description: Use this when the task is to survey prior work, compare papers, synthesize a field, or build a reading list grounded in primary sources.
---
# Literature Review
## When To Use
Use this skill when the user wants:
- a research overview
- a paper shortlist
- a comparison of methods
- a synthesis of consensus and disagreement
- a source-backed brief on a topic
## Procedure
1. Search broadly first with `alpha_search`.
2. Pick the strongest candidates by direct relevance, recency, citations, and venue quality.
3. Inspect the top papers with `alpha_get_paper` before making concrete claims.
4. Use `alpha_ask_paper` for missing methodological or experimental details.
5. Build a compact evidence table:
- title
- year
- authors
- venue
- claim or contribution
- important caveats
6. Distinguish:
- what multiple sources agree on
- where methods or findings differ
- what remains unresolved
7. If the user wants a durable artifact, write a markdown brief to disk.
8. If you discover an important gotcha about a paper, save it with `alpha_annotate_paper`.
## Pitfalls
- Do not summarize a field from titles alone.
- Do not flatten disagreements into fake consensus.
- Do not treat recent preprints as established facts without saying so.
- Do not cite secondary commentary when a primary source is available.
## Output Shape
Prefer this structure:
- question
- strongest papers
- major findings
- disagreements or caveats
- open questions
- recommended next reading or experiments
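## Example
The evidence table from step 5 can be rendered as a markdown brief (step 7). A minimal sketch; both entries are invented placeholders, and real rows come from `alpha_get_paper` and `alpha_ask_paper`.

```python
# Evidence rows: one dict per inspected paper, placeholder values only.
rows = [
    {"title": "Paper A", "year": 2023, "authors": "Smith et al.",
     "venue": "NeurIPS", "claim": "Method A beats B on benchmark C",
     "caveats": "single dataset, no seed variance reported"},
    {"title": "Paper B", "year": 2024, "authors": "Lee et al.",
     "venue": "arXiv preprint", "claim": "Contradicts A at larger scale",
     "caveats": "preprint, not yet peer reviewed"},
]

COLUMNS = ["title", "year", "authors", "venue", "claim", "caveats"]

def to_markdown(rows, columns):
    """Render the evidence table as markdown for a durable brief."""
    lines = ["| " + " | ".join(columns) + " |",
             "| " + " | ".join("---" for _ in columns) + " |"]
    for r in rows:
        lines.append("| " + " | ".join(str(r[c]) for c in columns) + " |")
    return "\n".join(lines)

print(to_markdown(rows, COLUMNS))
```

Keeping caveats as a first-class column makes it harder to flatten disagreements into fake consensus.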


@@ -0,0 +1,50 @@
---
name: paper-code-audit
description: Use this when the task is to compare a paper against its repository, verify whether claims are implemented, or assess reproducibility risk.
---
# Paper Code Audit
## When To Use
Use this skill for:
- paper-versus-code verification
- implementation gap analysis
- reproducibility audits
- checking whether public code matches reported results
## Procedure
1. Locate the paper with `alpha_search`.
2. Load the paper with `alpha_get_paper`.
3. Extract implementation-relevant details using `alpha_ask_paper`:
- datasets
- preprocessing
- model architecture
- hyperparameters
- evaluation protocol
4. If the paper links a repository, inspect it using `alpha_read_code`.
5. Compare paper claims against code realities:
- are all components present
- do defaults match the paper
- are metrics/eval scripts exposed
- are hidden assumptions required
6. Record concrete mismatches, not vibes.
7. Save the audit in `outputs/`.
8. If you find a durable gotcha, save it with `alpha_annotate_paper`.
## Pitfalls
- Do not infer repository behavior without opening the relevant files.
- Do not assume README claims reflect the actual defaults.
- Do not mark something as missing if it exists under another name without checking.
## Deliverable
Include:
- paper summary
- repository coverage
- confirmed matches
- mismatches or omissions
- reproducibility risks
- recommended next actions
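## Example
Step 5 can be sketched as a diff between paper-reported settings and repository defaults. The dicts below are illustrative; in practice their values come from `alpha_ask_paper` and from reading the actual config files with `alpha_read_code`.

```python
# Paper-reported hyperparameters versus the repository's defaults (placeholders).
paper = {"lr": 3e-4, "batch_size": 256, "warmup_steps": 1000, "label_smoothing": 0.1}
code_defaults = {"lr": 1e-4, "batch_size": 256, "warmup_steps": 1000}

def audit(paper, code):
    """Return concrete mismatches and omissions, not vibes."""
    findings = []
    for key, reported in paper.items():
        if key not in code:
            findings.append(f"MISSING: {key} reported as {reported}, absent from code")
        elif code[key] != reported:
            findings.append(f"MISMATCH: {key} paper={reported} code={code[key]}")
    return findings

for line in audit(paper, code_defaults):
    print(line)
```

Each finding names the setting and both values, so the deliverable can list confirmed matches and mismatches line by line.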


@@ -0,0 +1,45 @@
---
name: paper-writing
description: Use this when the task is to turn research notes, experiments, or a literature review into a polished paper-style writeup with Markdown and LaTeX.
---
# Paper Writing
## When To Use
Use this skill for:
- research reports that should read like a paper
- internal memos with equations or formal structure
- polished writeups of experiments or literature reviews
- converting rough notes into a coherent draft
## Procedure
1. Make sure the underlying claims are already grounded in sources, experiments, or explicit caveats.
2. Build the draft around a proper research structure:
- title
- abstract
- introduction or problem statement
- related work
- approach, synthesis, or methodology
- evidence, experiments, or case studies
- limitations
- conclusion
3. Use Markdown by default.
4. Use LaTeX only where equations or notation genuinely improve clarity.
5. Keep claims falsifiable and scoped.
6. Save polished drafts to `papers/`.
## Pitfalls
- Do not use LaTeX for decoration.
- Do not make a draft look more certain than the evidence supports.
- Do not hide missing citations or weak evidence; flag them.
## Deliverable
A readable paper-style draft with:
- explicit structure
- traceable claims
- equations only where useful
- limitations stated plainly
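## Example
The structure from step 2 can be stamped out as a markdown skeleton so the draft starts with explicit sections. The section names below are one plausible instantiation, not a mandated outline.

```python
# One possible section order matching step 2; rename or drop sections per task.
SECTIONS = [
    "Abstract", "Introduction", "Related Work",
    "Approach", "Experiments", "Limitations", "Conclusion",
]

def skeleton(title):
    """Emit an empty paper-style markdown draft with explicit structure."""
    parts = [f"# {title}", ""]
    for name in SECTIONS:
        parts += [f"## {name}", "", "TODO: traceable claims only.", ""]
    return "\n".join(parts)

draft = skeleton("Working Title")
print(draft)
```

Starting from an explicit skeleton makes missing citations and weak evidence visible as unfilled TODOs instead of hidden gaps.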


@@ -0,0 +1,49 @@
---
name: reading-list
description: Use this when the user wants a curated reading sequence, paper shortlist, or tiered set of papers for learning or project onboarding.
---
# Reading List
## When To Use
Use this skill for:
- getting up to speed on a topic
- onboarding into a research area
- choosing which papers to read first
- constructing a project-specific reading order
## Procedure
1. Start with `alpha_search` in `all` mode.
2. Inspect the strongest candidates with `alpha_get_paper`.
3. Use `alpha_ask_paper` for fit questions like:
- what problem does this really solve
- what assumptions does it rely on
- what prior work does it build on
4. Classify papers into roles:
- foundational
- key recent advances
- evaluation or benchmark references
- critiques or limitations
- likely replication targets
5. Order the list intentionally:
- start with orientation
- move to strongest methods
- finish with edges, critiques, or adjacent work
6. Write the final list as a durable markdown artifact in `outputs/`.
## Pitfalls
- Do not sort purely by citations.
- Do not over-index on recency when fundamentals matter.
- Do not include papers you have not inspected at all.
## Deliverable
For each paper include:
- title
- year
- why it matters
- when to read it in the sequence
- one caveat or limitation
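## Example
Steps 4 and 5 amount to tagging each paper with a role and sorting by an explicit reading order rather than raw citation count. A minimal sketch; the titles and counts are placeholders.

```python
# Explicit reading order by role (step 5), not by citations (first pitfall).
ROLE_ORDER = {
    "foundational": 0,
    "key recent advance": 1,
    "evaluation reference": 2,
    "critique": 3,
    "replication target": 4,
}

papers = [
    {"title": "Benchmark Suite D", "role": "evaluation reference", "citations": 900},
    {"title": "Classic Survey A", "role": "foundational", "citations": 5000},
    {"title": "Known Failure Modes C", "role": "critique", "citations": 120},
    {"title": "Recent Method B", "role": "key recent advance", "citations": 300},
]

reading_order = sorted(papers, key=lambda p: ROLE_ORDER[p["role"]])
for i, p in enumerate(reading_order, 1):
    print(f"{i}. {p['title']} ({p['role']})")
```

Note that the critique lands near the end despite its low citation count, and the survey leads despite not being recent.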


@@ -0,0 +1,52 @@
---
name: replication
description: Use this when the task is to reproduce a paper result, benchmark a claim, rebuild an experiment, or evaluate whether a published result holds in practice.
---
# Replication
## When To Use
Use this skill for:
- paper reproduction
- benchmark recreation
- ablation reruns
- claim verification through code and experiments
## Procedure
1. Identify the canonical source paper and inspect it with `alpha_get_paper`.
2. Extract the exact target:
- task
- dataset
- model or method
- metrics
- hardware or runtime assumptions
3. Use `alpha_ask_paper` to pull out the exact details missing from the report.
4. If the paper has a public repository, inspect it with `alpha_read_code`.
5. Search the local workspace for existing code, notebooks, configs, and datasets.
6. Write down the missing pieces explicitly before running anything.
7. If the environment is sufficient, implement the minimal runnable reproduction path.
8. Run the experiment with built-in file and shell tools.
9. Save:
- commands used
- configs
- raw outputs
- summarized results
10. Compare observed results with the paper and explain gaps.
11. If the paper had a practical gotcha, attach it with `alpha_annotate_paper`.
## Pitfalls
- Do not claim replication succeeded if key conditions were missing.
- Do not compare different metrics as if they were equivalent.
- Do not ignore dataset or preprocessing mismatch.
- Do not hide failed runs; record them and explain them.
## Verification
A good replication outcome includes:
- the exact command path
- the data or config used
- the observed metrics
- a clear statement of match, partial match, or mismatch
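## Example
Step 10 and the verification criteria above can be sketched as a tolerance check that states match, partial match, or mismatch explicitly. The metric values and tolerance are illustrative; in practice choose the tolerance per metric and justify it.

```python
# Reported numbers come from the paper; observed numbers from your run (placeholders).
reported = {"accuracy": 0.912, "f1": 0.887}
observed = {"accuracy": 0.905, "f1": 0.841}
TOLERANCE = 0.01  # acceptable absolute gap; illustrative only

def verdict(reported, observed, tol):
    """Classify the replication outcome and list metrics within tolerance."""
    matched = [m for m in reported if abs(reported[m] - observed[m]) <= tol]
    if len(matched) == len(reported):
        return "match", matched
    if matched:
        return "partial match", matched
    return "mismatch", matched

status, matched = verdict(reported, observed, TOLERANCE)
print(f"{status}: within tolerance on {matched or 'no metrics'}")
```

A partial match like this one is exactly the case to explain rather than hide: record which metric diverged and hypothesize why.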