Initial Feynman research agent scaffold
This commit is contained in:
52
skills/research/replication/SKILL.md
Normal file
52
skills/research/replication/SKILL.md
Normal file
@@ -0,0 +1,52 @@
|
||||
---
|
||||
name: replication
|
||||
description: Use this when the task is to reproduce a paper result, benchmark a claim, rebuild an experiment, or evaluate whether a published result holds in practice.
|
||||
---
|
||||
|
||||
# Replication
|
||||
|
||||
## When To Use
|
||||
|
||||
Use this skill for:
|
||||
- paper reproduction
|
||||
- benchmark recreation
|
||||
- ablation reruns
|
||||
- claim verification through code and experiments
|
||||
|
||||
## Procedure
|
||||
|
||||
1. Identify the canonical source paper and inspect it with `alpha_get_paper`.
|
||||
2. Extract the exact target:
|
||||
- task
|
||||
- dataset
|
||||
- model or method
|
||||
- metrics
|
||||
- hardware or runtime assumptions
|
||||
3. Use `alpha_ask_paper` to pull out the exact details missing from the report.
|
||||
4. If the paper has a public repository, inspect it with `alpha_read_code`.
|
||||
5. Search the local workspace for existing code, notebooks, configs, and datasets.
|
||||
6. Write down the missing pieces explicitly before running anything.
|
||||
7. If the environment is sufficient, implement the minimal runnable reproduction path.
|
||||
8. Run the experiment with built-in file and shell tools.
|
||||
9. Save:
|
||||
- commands used
|
||||
- configs
|
||||
- raw outputs
|
||||
- summarized results
|
||||
10. Compare observed results with the paper and explain gaps.
|
||||
11. If the paper had a practical gotcha, attach it with `alpha_annotate_paper`.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- Do not claim replication succeeded if key conditions were missing.
|
||||
- Do not compare different metrics as if they were equivalent.
|
||||
- Do not ignore dataset or preprocessing mismatch.
|
||||
- Do not hide failed runs; record them and explain them.
|
||||
|
||||
## Verification
|
||||
|
||||
A good replication outcome includes:
|
||||
- the exact command path
|
||||
- the data or config used
|
||||
- the observed metrics
|
||||
- a clear statement of match, partial match, or mismatch
|
||||
Reference in New Issue
Block a user