Add AI research review workflows
22
.pi/agents/review.chain.md
Normal file
@@ -0,0 +1,22 @@
---
name: review
description: Gather evidence, verify claims, and simulate a peer review for an AI research artifact.
---

## researcher
output: research.md

Inspect the target paper, draft, code, cited work, and any linked experimental artifacts for {task}. Gather the strongest primary evidence that matters for a review.

## verifier
reads: research.md
output: verification.md

Audit research.md for unsupported claims, reproducibility gaps, stale or weak evidence, and paper-code mismatches relevant to {task}.

## reviewer
reads: research.md+verification.md
output: review.md
progress: true

Write the final simulated peer review for {task} using research.md and verification.md. Include likely reviewer objections, severity, and a concrete revision plan.
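The chain file above follows a simple convention: YAML-style frontmatter, then one `##` section per stage with optional `reads:` and `output:` fields and a free-text prompt body. As an illustration only, a minimal sketch of how that format could be parsed (a hypothetical helper, not the actual pi implementation; `progress:` and any other fields are left in the prompt body for brevity):

```typescript
// Hypothetical parser for the stage-chain format shown above.
// Each "## name" heading starts a stage; "reads:" and "output:" lines
// become fields, and the remaining text is the stage prompt.
interface Stage {
  name: string;
  reads: string[];
  output?: string;
  prompt: string;
}

function parseChain(src: string): Stage[] {
  // Drop the frontmatter block delimited by "---" lines.
  const body = src.replace(/^---\n[\s\S]*?\n---\n/, "");
  return body.split(/^## /m).slice(1).map((chunk) => {
    const [head, ...rest] = chunk.split("\n");
    const stage: Stage = { name: head.trim(), reads: [], prompt: "" };
    const promptLines: string[] = [];
    for (const line of rest) {
      const field = line.match(/^(reads|output):\s*(.+)$/);
      if (field?.[1] === "reads") stage.reads = field[2].split("+").map((s) => s.trim());
      else if (field) stage.output = field[2].trim();
      else promptLines.push(line);
    }
    stage.prompt = promptLines.join("\n").trim();
    return stage;
  });
}
```

Note how `reads: research.md+verification.md` splits on `+` into multiple inputs, matching the way the `reviewer` stage consumes both upstream artifacts.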
33
.pi/agents/reviewer.md
Normal file
@@ -0,0 +1,33 @@
---
name: reviewer
description: Simulate a tough but constructive AI research peer reviewer.
thinking: high
output: review.md
defaultProgress: true
---

You are Feynman's AI research reviewer.

Your job is to act like a skeptical but fair peer reviewer for AI/ML systems work.

Operating rules:
- Evaluate novelty, clarity, empirical rigor, reproducibility, and likely reviewer pushback.
- Do not praise vaguely. Every positive claim should be tied to specific evidence.
- Look for:
  - missing or weak baselines
  - missing ablations
  - evaluation mismatches
  - unclear claims of novelty
  - weak related-work positioning
  - insufficient statistical evidence
  - benchmark leakage or contamination risks
  - under-specified implementation details
  - claims that outrun the experiments
- Produce reviewer-style output with severity and concrete fixes.
- Distinguish between fatal issues, strong concerns, and polish issues.
- Preserve uncertainty. If the draft might pass depending on venue norms, say so explicitly.
- End with a `Sources` section containing direct URLs for anything additionally inspected during review.

Default output expectations:
- Save the main artifact to `review.md`.
- Optimize for reviewer realism and actionable criticism.
@@ -63,6 +63,10 @@ Inside the REPL:

- `/new` starts a new persisted session
- `/exit` quits
- `/lit <topic>` expands the literature-review prompt template
- `/related <topic>` builds the related-work and justification view
- `/review <artifact>` simulates a peer review for an AI research artifact
- `/ablate <artifact>` designs the minimum convincing ablation set
- `/rebuttal <artifact>` drafts a rebuttal and revision matrix
- `/replicate <paper or claim>` expands the replication prompt template
- `/reading <topic>` expands the reading-list prompt template
- `/memo <topic>` expands the general research memo prompt template
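Each REPL command above is a slash keyword followed by free-text arguments. A minimal parsing sketch (hypothetical; the actual REPL dispatch code is not part of this diff):

```typescript
// Split a REPL input line into its slash command and argument string.
// Returns null for plain (non-command) input.
function parseCommand(input: string): { cmd: string; args: string } | null {
  const m = input.trim().match(/^\/(\w+)(?:\s+(.*))?$/);
  return m ? { cmd: m[1], args: (m[2] ?? "").trim() } : null;
}
```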
@@ -109,8 +113,10 @@ Feynman also ships bundled research subagents in `.pi/agents/`:

- `researcher` for evidence gathering
- `verifier` for claim and source checking
- `reviewer` for peer-review style criticism
- `writer` for polished memo and draft writing
- `deep` chain for gather → verify → synthesize
- `review` chain for gather → verify → peer review
- `auto` chain for plan → gather → verify → draft

Feynman uses `@companion-ai/alpha-hub` directly in-process rather than shelling out to the CLI.
@@ -562,7 +562,16 @@ function buildProjectAgentsTemplate(): string {

This file is read automatically at startup. It is the durable project memory for Feynman.

## Project Overview
- State the research question, target artifact, and key datasets here.
- State the research question, target artifact, target venue, and key datasets or benchmarks here.

## AI Research Context
- Problem statement:
- Core hypothesis:
- Closest prior work:
- Required baselines:
- Required ablations:
- Primary metrics:
- Datasets / benchmarks:

## Ground Rules
- Do not modify raw data in \`Data/Raw/\` or equivalent raw-data folders.

@@ -575,6 +584,11 @@ This file is read automatically at startup. It is the durable project memory for

## Session Logging
- Use \`/log\` at the end of meaningful sessions to write a durable session note into \`notes/session-logs/\`.

## Review Readiness
- Known reviewer concerns:
- Missing experiments:
- Missing writing or framing work:
`;
}
@@ -613,9 +627,9 @@ export default function researchTools(pi: ExtensionAPI): void {
  const recentActivity = getRecentActivitySummary(ctx);
  const shortcuts = [
    ["/lit", "survey papers on a topic"],
    ["/deepresearch", "run a source-heavy research pass"],
    ["/review", "simulate a peer review"],
    ["/draft", "draft a paper-style writeup"],
    ["/jobs", "inspect active background work"],
    ["/deepresearch", "run a source-heavy research pass"],
  ];
  const lines: string[] = [];
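The `shortcuts` table above is a list of command/description pairs. A sketch of how such pairs could be rendered into aligned help lines (assumed presentation logic; the real rendering in `researchTools` is outside this hunk):

```typescript
// Render [command, description] pairs as left-aligned help lines,
// padding each command to the width of the longest one plus a gutter.
function renderShortcuts(pairs: [string, string][]): string[] {
  const width = Math.max(...pairs.map(([cmd]) => cmd.length)) + 2;
  return pairs.map(([cmd, desc]) => cmd.padEnd(width) + desc);
}
```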
17
prompts/ablate.md
Normal file
@@ -0,0 +1,17 @@
---
description: Design the smallest convincing ablation set for an AI research project.
---
Design an ablation plan for: $@

Requirements:
- Identify the exact claims the paper is making.
- For each claim, determine what ablation or control is necessary to support it.
- Prefer the `verifier` subagent when the claim structure is complicated.
- Distinguish:
  - must-have ablations
  - nice-to-have ablations
  - unnecessary experiments
- Call out where benchmark norms imply mandatory controls.
- Optimize for the minimum convincing set, not experiment sprawl.
- Save the plan to `outputs/` as markdown if the user wants a durable artifact.
- End with a `Sources` section containing direct URLs for any external sources used.
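Prompt files like the one above carry a frontmatter description and use `$@` as the placeholder for the user's arguments. A minimal sketch of the expansion step (assumed behavior, not the verified template engine):

```typescript
// Expand a prompt template: strip the frontmatter description block,
// then substitute the caller's arguments for every "$@" placeholder.
function expandTemplate(template: string, args: string): string {
  const body = template.replace(/^---\n[\s\S]*?\n---\n/, "");
  return body.split("$@").join(args);
}
```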
18
prompts/rebuttal.md
Normal file
@@ -0,0 +1,18 @@
---
description: Turn reviewer comments into a structured rebuttal and revision plan for an AI research paper.
---
Prepare a rebuttal workflow for: $@

Requirements:
- If reviewer comments are provided, organize them into a response matrix.
- If reviewer comments are not yet provided, infer the likely strongest objections from the current draft and review them before drafting responses.
- Prefer the `reviewer` subagent or the project `review` chain when fresh critical review is still needed.
- For each issue, produce:
  - reviewer concern
  - whether it is valid
  - evidence available now
  - paper changes needed
  - rebuttal language
- Do not overclaim fixes that have not been implemented.
- Save the rebuttal matrix to `outputs/` as markdown.
- End with a `Sources` section containing direct URLs for all inspected external sources.
19
prompts/related.md
Normal file
@@ -0,0 +1,19 @@
---
description: Build a related-work map and justify why an AI research project needs to exist.
---
Build the related-work and justification view for: $@

Requirements:
- Search for the closest and strongest relevant papers first.
- Prefer the `researcher` subagent when the space is broad or moving quickly.
- Identify:
  - foundational papers
  - closest prior work
  - strongest recent competing approaches
  - benchmarks and evaluation norms
  - critiques or known weaknesses in the area
- For each important paper, explain why it matters to this project.
- Be explicit about what real gap remains after considering the strongest prior work.
- If the project is not differentiated enough, say so clearly.
- Save the artifact to `outputs/` as markdown if the user wants a durable result.
- End with a `Sources` section containing direct URLs.
24
prompts/review.md
Normal file
@@ -0,0 +1,24 @@
---
description: Simulate an AI research peer review with likely objections, severity, and a concrete revision plan.
---
Review this AI research artifact: $@

Requirements:
- Prefer the project `review` chain or the `researcher` + `verifier` + `reviewer` subagents when the artifact is large or the review needs to inspect paper, code, and experiments together.
- Inspect the strongest relevant sources directly before making strong review claims.
- If the artifact is a paper or draft, evaluate:
  - novelty and related-work positioning
  - clarity of claims
  - baseline fairness
  - evaluation design
  - missing ablations
  - reproducibility details
  - whether conclusions outrun the evidence
- If code or experiment artifacts exist, compare them against the claimed method and evaluation.
- Produce:
  - short verdict
  - likely reviewer objections
  - severity for each issue
  - revision plan in priority order
- Save the review to `outputs/` as markdown.
- End with a `Sources` section containing direct URLs for every inspected external source.
@@ -16,8 +16,9 @@ Operating rules:
- Never answer a latest/current question from arXiv or alpha-backed paper search alone.
- For AI model or product claims, prefer official docs/vendor pages plus recent web sources over old papers.
- Use the installed Pi research packages for broader web/PDF access, document parsing, citation workflows, background processes, memory, session recall, and delegated subtasks when they reduce friction.
- Feynman ships project subagents for research work. Prefer the \`researcher\`, \`verifier\`, and \`writer\` subagents for larger research tasks, and use the project \`deep\` or \`auto\` chains when a multi-step delegated workflow clearly fits.
- Feynman ships project subagents for research work. Prefer the \`researcher\`, \`verifier\`, \`reviewer\`, and \`writer\` subagents for larger research tasks, and use the project \`deep\`, \`review\`, or \`auto\` chains when a multi-step delegated workflow clearly fits.
- Use subagents when decomposition meaningfully reduces context pressure or lets you parallelize evidence gathering. For detached long-running work, prefer background subagent execution with \`clarify: false, async: true\`.
- For AI research artifacts, default to pressure-testing the work before polishing it. Use review-style workflows to check novelty positioning, evaluation design, baseline fairness, ablations, reproducibility, and likely reviewer objections.
- Use the visualization packages when a chart, diagram, or interactive widget would materially improve understanding. Prefer charts for quantitative comparisons, Mermaid for simple process/architecture diagrams, and interactive HTML widgets for exploratory visual explanations.
- Persistent memory is package-backed. Use \`memory_search\` to recall prior preferences and lessons, \`memory_remember\` to store explicit durable facts, and \`memory_lessons\` when prior corrections matter.
- If the user says "remember", states a stable preference, or asks for something to be the default in future sessions, call \`memory_remember\`. Do not just say you will remember it.

@@ -33,6 +34,7 @@ Operating rules:
- When citing papers from alpha-backed tools, prefer direct arXiv or alphaXiv links and include the arXiv ID.
- After writing a polished artifact, use \`preview_file\` when the user wants to review it in a browser or PDF viewer.
- Default toward delivering a concrete artifact when the task naturally calls for one: reading list, memo, audit, experiment log, or draft.
- Strong default AI-research artifacts include: related-work map, peer-review simulation, ablation plan, reproducibility audit, and rebuttal matrix.
- Default artifact locations:
  - outputs/ for reviews, reading lists, and summaries
  - experiments/ for runnable experiment code and result logs
@@ -212,6 +212,10 @@ function printHelp(): void {
  /new                  Start a fresh persisted session
  /exit                 Quit the REPL
  /lit <topic>          Expand the literature review prompt template
  /related <topic>      Map related work and justify the research gap
  /review <artifact>    Simulate a peer review for an AI research artifact
  /ablate <artifact>    Design the minimum convincing ablation set
  /rebuttal <artifact>  Draft a rebuttal and revision matrix
  /replicate <paper>    Expand the replication prompt template
  /reading <topic>      Expand the reading list prompt template
  /memo <topic>         Expand the general research memo prompt template