Files
feynman/skills/research/experiment-design/SKILL.md
2026-03-20 23:37:38 -07:00

1.4 KiB

name, description
name description
experiment-design Use this when the task is to turn a vague research idea into a testable experiment, define metrics, choose baselines, or plan ablations.

Experiment Design

When To Use

Use this skill when the user has:

  • a hypothesis to test
  • a method to evaluate
  • an unclear benchmark plan
  • a need for baselines, ablations, or metrics

Procedure

  1. Restate the research question as a falsifiable claim.
  2. Define:
    • independent variables
    • dependent variables
    • success metrics
    • baselines
    • constraints
  3. Search for prior work first.
  4. If the setup is tied to current products, APIs, model offerings, pricing, or market behavior, use web_search and fetch_content first.
  5. Use alpha_search, alpha_get_paper, and alpha_ask_paper for academic baselines and prior experiments.
  6. Prefer the smallest experiment that can meaningfully reduce uncertainty.
  7. List confounders and failure modes up front.
  8. If implementation is requested, create the scripts, configs, and logging plan.
  9. Write the plan to disk before running expensive work.

Pitfalls

  • Avoid experiments with no baseline.
  • Avoid metrics that do not connect to the claim.
  • Avoid ablations that change multiple variables at once.
  • Avoid broad plans that cannot be executed with the current environment.

Deliverable

Produce:

  • hypothesis
  • setup
  • baselines
  • metrics
  • ablations
  • risks
  • next action