Initial Feynman research agent scaffold
48
skills/research/experiment-design/SKILL.md
Normal file
@@ -0,0 +1,48 @@
---
name: experiment-design
description: Use this when the task is to turn a vague research idea into a testable experiment, define metrics, choose baselines, or plan ablations.
---

# Experiment Design

## When To Use

Use this skill when the user has:
- a hypothesis to test
- a method to evaluate
- an unclear benchmark plan
- a need for baselines, ablations, or metrics

## Procedure

1. Restate the research question as a falsifiable claim.
2. Define:
   - independent variables
   - dependent variables
   - success metrics
   - baselines
   - constraints
3. Search for prior work first with `alpha_search` so you do not reinvent an obviously flawed setup.
4. Use `alpha_get_paper` and `alpha_ask_paper` on the strongest references.
5. Prefer the smallest experiment that can meaningfully reduce uncertainty.
6. List confounders and failure modes up front.
7. If implementation is requested, create the scripts, configs, and logging plan.
8. Write the plan to disk before running expensive work.

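The procedure above asks for a plan that is fully defined and written to disk before anything expensive runs. A minimal sketch of that idea follows. The `ExperimentPlan` class, its field names, and the JSON layout are illustrative assumptions, not part of this skill or of any `alpha_*` tool.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ExperimentPlan:
    """Hypothetical container for the items named in steps 1, 2, and 6."""

    claim: str  # the research question restated as a falsifiable claim
    independent_variables: list[str] = field(default_factory=list)
    dependent_variables: list[str] = field(default_factory=list)
    success_metrics: list[str] = field(default_factory=list)
    baselines: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    confounders: list[str] = field(default_factory=list)


def missing_fields(plan: ExperimentPlan) -> list[str]:
    """Return the names of plan fields that are still empty."""
    return [name for name, value in asdict(plan).items() if not value]


def write_plan(plan: ExperimentPlan, path: Path) -> Path:
    """Write the plan as JSON, refusing to write while any field is empty."""
    gaps = missing_fields(plan)
    if gaps:
        raise ValueError(f"plan is incomplete: {', '.join(gaps)}")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(plan), indent=2))
    return path
```

Refusing to write an incomplete plan is one possible design choice; a looser variant could write the file anyway and list the gaps inside it.
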
## Pitfalls

- Avoid experiments with no baseline.
- Avoid metrics that do not connect to the claim.
- Avoid ablations that change multiple variables at once.
- Avoid broad plans that cannot be executed with the current environment.

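The ablation pitfall above can be checked mechanically when configs are flat dictionaries. The sketch below is one way to do that; the function names and the flat-dict config shape are assumptions made for illustration.

```python
def changed_keys(base: dict, variant: dict) -> set[str]:
    """Keys whose values differ between two flat configs.

    A key present in only one of the two configs counts as changed.
    """
    absent = object()  # sentinel so that a stored None is not confused with "missing"
    return {
        key
        for key in base.keys() | variant.keys()
        if base.get(key, absent) != variant.get(key, absent)
    }


def check_ablations(base: dict, ablations: dict[str, dict]) -> dict[str, set[str]]:
    """Return the ablations that do not change exactly one key, with their diffs."""
    problems = {}
    for name, config in ablations.items():
        diff = changed_keys(base, config)
        if len(diff) != 1:
            problems[name] = diff
    return problems
```

An empty result means every ablation differs from the base config in exactly one setting. An ablation with an empty diff is also reported, since it is a duplicate of the base run.
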
## Deliverable

Produce:
- hypothesis
- setup
- baselines
- metrics
- ablations
- risks
- next action
52
skills/research/literature-review/SKILL.md
Normal file
@@ -0,0 +1,52 @@
---
name: literature-review
description: Use this when the task is to survey prior work, compare papers, synthesize a field, or build a reading list grounded in primary sources.
---

# Literature Review

## When To Use

Use this skill when the user wants:
- a research overview
- a paper shortlist
- a comparison of methods
- a synthesis of consensus and disagreement
- a source-backed brief on a topic

## Procedure

1. Search broadly first with `alpha_search`.
2. Pick the strongest candidates by direct relevance, recency, citations, and venue quality.
3. Inspect the top papers with `alpha_get_paper` before making concrete claims.
4. Use `alpha_ask_paper` for missing methodological or experimental details.
5. Build a compact evidence table:
   - title
   - year
   - authors
   - venue
   - claim or contribution
   - important caveats
6. Distinguish:
   - what multiple sources agree on
   - where methods or findings differ
   - what remains unresolved
7. If the user wants a durable artifact, write a markdown brief to disk.
8. If you discover an important gotcha about a paper, save it with `alpha_annotate_paper`.

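The evidence table from step 5 can be kept as structured rows and rendered to Markdown only when the brief is written. The following is a rough sketch; `EvidenceRow`, the column labels, and the sort-by-year choice are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvidenceRow:
    """One row of the evidence table; fields mirror the list in step 5."""

    title: str
    year: int
    authors: str
    venue: str
    contribution: str
    caveats: str


COLUMNS = ["Title", "Year", "Authors", "Venue", "Claim or contribution", "Important caveats"]


def _cell(value) -> str:
    """Escape pipes and flatten newlines so a value stays inside one table cell."""
    return str(value).replace("|", "\\|").replace("\n", " ").strip()


def to_markdown(rows: list[EvidenceRow]) -> str:
    """Render rows as a Markdown table, oldest paper first."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "|" + "|".join([" --- "] * len(COLUMNS)) + "|",
    ]
    for row in sorted(rows, key=lambda r: r.year):
        values = [row.title, row.year, row.authors, row.venue, row.contribution, row.caveats]
        lines.append("| " + " | ".join(_cell(v) for v in values) + " |")
    return "\n".join(lines)
```

Keeping caveats as a required field makes it harder to drop them silently when the table is compressed into prose later.
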
## Pitfalls

- Do not summarize a field from titles alone.
- Do not flatten disagreements into fake consensus.
- Do not treat recent preprints as established facts without saying so.
- Do not cite secondary commentary when a primary source is available.

## Output Shape

Prefer this structure:
- question
- strongest papers
- major findings
- disagreements or caveats
- open questions
- recommended next reading or experiments
50
skills/research/paper-code-audit/SKILL.md
Normal file
@@ -0,0 +1,50 @@
---
name: paper-code-audit
description: Use this when the task is to compare a paper against its repository, verify whether claims are implemented, or assess reproducibility risk.
---

# Paper Code Audit

## When To Use

Use this skill for:
- paper-versus-code verification
- implementation gap analysis
- reproducibility audits
- checking whether public code matches reported results

## Procedure

1. Locate the paper with `alpha_search`.
2. Load the paper with `alpha_get_paper`.
3. Extract implementation-relevant details using `alpha_ask_paper`:
   - datasets
   - preprocessing
   - model architecture
   - hyperparameters
   - evaluation protocol
4. If the paper links a repository, inspect it using `alpha_read_code`.
5. Compare paper claims against code realities:
   - are all components present
   - do defaults match the paper
   - are metrics/eval scripts exposed
   - are hidden assumptions required
6. Record concrete mismatches, not vibes.
7. Save the audit in `outputs/`.
8. If you find a durable gotcha, save it with `alpha_annotate_paper`.

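Steps 5 and 6 come down to a key-by-key comparison between what the paper states and what the code defaults to. A minimal sketch of that comparison follows, assuming both sides have already been collected by hand into flat dictionaries. The `Mismatch` record and `compare_settings` function are hypothetical names, not part of any `alpha_*` tool.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Mismatch:
    """One concrete difference between the paper and the repository."""

    key: str
    paper: Any
    code: Any
    note: str


def compare_settings(paper: dict[str, Any], code: dict[str, Any]) -> list[Mismatch]:
    """List every key whose value differs or that appears on one side only."""
    absent = object()  # sentinel: distinguishes "missing" from a stored None
    findings = []
    for key in sorted(paper.keys() | code.keys()):
        p, c = paper.get(key, absent), code.get(key, absent)
        if p is absent:
            findings.append(Mismatch(key, None, c, "in code, not stated in paper"))
        elif c is absent:
            # Only means "not found under this key"; it may exist under another name.
            findings.append(Mismatch(key, p, None, "in paper, not found in code"))
        elif p != c:
            findings.append(Mismatch(key, p, c, "values differ"))
    return findings
```

A "not found in code" finding is a prompt to go and open the relevant files, not a conclusion in itself.
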
## Pitfalls

- Do not infer repository behavior without opening the relevant files.
- Do not assume README claims reflect the actual defaults.
- Do not mark something as missing if it exists under another name without checking.

## Deliverable

Include:
- paper summary
- repository coverage
- confirmed matches
- mismatches or omissions
- reproducibility risks
- recommended next actions
45
skills/research/paper-writing/SKILL.md
Normal file
@@ -0,0 +1,45 @@
---
name: paper-writing
description: Use this when the task is to turn research notes, experiments, or a literature review into a polished paper-style writeup with Markdown and LaTeX.
---

# Paper Writing

## When To Use

Use this skill for:
- research reports that should read like a paper
- internal memos with equations or formal structure
- polished writeups of experiments or literature reviews
- converting rough notes into a coherent draft

## Procedure

1. Make sure the underlying claims are already grounded in sources, experiments, or explicit caveats.
2. Build the draft around a proper research structure:
   - title
   - abstract
   - introduction or problem statement
   - related work
   - approach, synthesis, or methodology
   - evidence, experiments, or case studies
   - limitations
   - conclusion
3. Use Markdown by default.
4. Use LaTeX only where equations or notation genuinely improve clarity.
5. Keep claims falsifiable and scoped.
6. Save polished drafts to `papers/`.

## Pitfalls

- Do not use LaTeX for decoration.
- Do not make a draft look more certain than the evidence supports.
- Do not hide missing citations or weak evidence; flag them.

## Deliverable

A readable paper-style draft with:
- explicit structure
- traceable claims
- equations only where useful
- limitations stated plainly
49
skills/research/reading-list/SKILL.md
Normal file
@@ -0,0 +1,49 @@
---
name: reading-list
description: Use this when the user wants a curated reading sequence, paper shortlist, or tiered set of papers for learning or project onboarding.
---

# Reading List

## When To Use

Use this skill for:
- getting up to speed on a topic
- onboarding into a research area
- choosing which papers to read first
- constructing a project-specific reading order

## Procedure

1. Start with `alpha_search` in `all` mode.
2. Inspect the strongest candidates with `alpha_get_paper`.
3. Use `alpha_ask_paper` for fit questions like:
   - what problem does this really solve
   - what assumptions does it rely on
   - what prior work does it build on
4. Classify papers into roles:
   - foundational
   - key recent advances
   - evaluation or benchmark references
   - critiques or limitations
   - likely replication targets
5. Order the list intentionally:
   - start with orientation
   - move to strongest methods
   - finish with edges, critiques, or adjacent work
6. Write the final list as a durable markdown artifact in `outputs/`.

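Steps 4 and 5 can be treated as a classify-then-sort pass. The sketch below shows one way to do it. The mapping from roles to reading order is an assumption (the skill names the roles and the ordering principle but does not tie them together), and `Entry` and `build_sequence` are hypothetical names.

```python
from dataclasses import dataclass

# Roles from step 4, listed in an assumed reading order that follows step 5:
# orientation first, then methods and evaluation, then critiques and edges.
ROLE_ORDER = [
    "foundational",
    "key recent advances",
    "evaluation or benchmark references",
    "critiques or limitations",
    "likely replication targets",
]


@dataclass
class Entry:
    title: str
    year: int
    role: str
    why: str
    caveat: str
    inspected: bool = False  # set to True only after the paper was actually opened


def build_sequence(entries: list[Entry]) -> list[Entry]:
    """Drop uninspected papers, then order by role and, within a role, by year."""
    unknown = sorted({e.role for e in entries if e.role not in ROLE_ORDER})
    if unknown:
        raise ValueError(f"unknown roles: {', '.join(unknown)}")
    kept = [e for e in entries if e.inspected]
    return sorted(kept, key=lambda e: (ROLE_ORDER.index(e.role), e.year))
```

Sorting by role before year keeps a recent critique from jumping ahead of an older foundational paper, and citation counts play no part in the order.
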
## Pitfalls

- Do not sort purely by citations.
- Do not over-index on recency when fundamentals matter.
- Do not include papers you have not inspected at all.

## Deliverable

For each paper include:
- title
- year
- why it matters
- when to read it in the sequence
- one caveat or limitation
52
skills/research/replication/SKILL.md
Normal file
@@ -0,0 +1,52 @@
---
name: replication
description: Use this when the task is to reproduce a paper result, benchmark a claim, rebuild an experiment, or evaluate whether a published result holds in practice.
---

# Replication

## When To Use

Use this skill for:
- paper reproduction
- benchmark recreation
- ablation reruns
- claim verification through code and experiments

## Procedure

1. Identify the canonical source paper and inspect it with `alpha_get_paper`.
2. Extract the exact target:
   - task
   - dataset
   - model or method
   - metrics
   - hardware or runtime assumptions
3. Use `alpha_ask_paper` to pull out the exact details missing from the report.
4. If the paper has a public repository, inspect it with `alpha_read_code`.
5. Search the local workspace for existing code, notebooks, configs, and datasets.
6. Write down the missing pieces explicitly before running anything.
7. If the environment is sufficient, implement the minimal runnable reproduction path.
8. Run the experiment with built-in file and shell tools.
9. Save:
   - commands used
   - configs
   - raw outputs
   - summarized results
10. Compare observed results with the paper and explain gaps.
11. If the paper had a practical gotcha, attach it with `alpha_annotate_paper`.

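Steps 8 and 9 pair naturally: run the command and persist what was run alongside what came out. The helper below is a small sketch of that pairing using only the Python standard library. The function name `run_and_record` and the file names `run.json`, `stdout.txt`, and `stderr.txt` are assumptions, not conventions defined by this skill.

```python
import json
import shlex
import subprocess
import time
from pathlib import Path


def run_and_record(command: list[str], config: dict, out_dir: Path) -> dict:
    """Run a command and save the command line, config, and raw outputs.

    Failed runs are recorded too: a non-zero exit code is stored, not raised.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    started = time.time()
    result = subprocess.run(command, capture_output=True, text=True)
    record = {
        "command": shlex.join(command),  # copy-pasteable form of the exact command
        "config": config,
        "returncode": result.returncode,
        "seconds": round(time.time() - started, 3),
    }
    (out_dir / "stdout.txt").write_text(result.stdout)
    (out_dir / "stderr.txt").write_text(result.stderr)
    (out_dir / "run.json").write_text(json.dumps(record, indent=2))
    return record
```

Using one output directory per run keeps a failed attempt from being overwritten by a later successful one.
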
## Pitfalls

- Do not claim replication succeeded if key conditions were missing.
- Do not compare different metrics as if they were equivalent.
- Do not ignore dataset or preprocessing mismatch.
- Do not hide failed runs; record them and explain them.

## Verification

A good replication outcome includes:
- the exact command path
- the data or config used
- the observed metrics
- a clear statement of match, partial match, or mismatch
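The final statement of match, partial match, or mismatch can be derived from the reported and observed metrics once a tolerance is agreed. The sketch below assumes a relative tolerance of 2% and compares metrics by name only. Both the threshold and the three-way rule are illustrative assumptions, not criteria taken from this skill.

```python
import math


def verdict(reported: dict[str, float], observed: dict[str, float], rel_tol: float = 0.02) -> str:
    """Return 'match', 'partial match', or 'mismatch'.

    A metric counts as reproduced only if it was observed under the same name
    and lies within rel_tol of the reported value. With a reported value of
    exactly 0, math.isclose requires an exact match unless abs_tol is added.
    """
    if not reported:
        raise ValueError("no reported metrics to compare against")
    hits = 0
    for name, target in reported.items():
        value = observed.get(name)
        if value is not None and math.isclose(value, target, rel_tol=rel_tol):
            hits += 1
    if hits == len(reported):
        return "match"
    return "partial match" if hits else "mismatch"
```

A verdict of "match" from this function still says nothing about whether the dataset, preprocessing, and other key conditions were the same; those need to be stated separately.
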