fix: tighten workflow prompts and search defaults

Advait Paliwal
2026-04-14 09:30:15 -07:00
parent af6486312d
commit 0995f5cc22
11 changed files with 26 additions and 17 deletions

@@ -31,7 +31,7 @@ The installer downloads a standalone native bundle with its own Node.js runtime.
To upgrade the standalone app later, rerun the installer. `feynman update` only refreshes installed Pi packages inside Feynman's environment; it does not replace the standalone runtime bundle itself.
-To uninstall the standalone app, remove the launcher and runtime bundle, then optionally remove `~/.feynman` if you also want to delete settings, auth, and sessions. See the installation guide for platform-specific paths.
+To uninstall the standalone app, remove the launcher and runtime bundle, then optionally remove `~/.feynman` if you also want to delete settings, sessions, and installed package state. If you also want to delete alphaXiv login state, remove `~/.ahub`. See the installation guide for platform-specific paths.
Local models are supported through the custom-provider flow. For Ollama, run `feynman setup`, choose `Custom provider (baseUrl + API key)`, use `openai-completions`, and point it at `http://localhost:11434/v1`.

@@ -9,7 +9,7 @@ Audit the paper and codebase for: $@
Derive a short slug from the audit target (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.
Requirements:
-- Before starting, outline the audit plan: which paper, which repo, which claims to check. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the audit plan: which paper, which repo, which claims to check. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
- Use the `researcher` subagent for evidence gathering and the `verifier` subagent to verify sources and add inline citations when the audit is non-trivial.
- Compare claimed methods, defaults, metrics, and data handling against the actual code.
- Call out missing code, mismatches, ambiguous defaults, and reproduction risks.
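The slug rule in this prompt (lowercase, hyphens, no filler words, at most five words) is carried out by the model rather than by code. As a rough illustration only, a deterministic version might look like the sketch below. The `deriveSlug` and `planPath` helpers and the filler-word list are hypothetical; the prompt does not define which words count as filler.

```javascript
// Hypothetical filler-word list; the workflow prompt does not specify one.
const FILLER_WORDS = new Set(["a", "an", "the", "of", "for", "and", "in", "on", "to", "with"]);

// Sketch of the slug rule: lowercase, hyphen-separated, filler words dropped,
// truncated to at most `maxWords` words.
function deriveSlug(target, maxWords = 5) {
  return target
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, " ") // treat punctuation as a word separator
    .split(/[\s-]+/)
    .filter((word) => word.length > 0 && !FILLER_WORDS.has(word))
    .slice(0, maxWords)
    .join("-");
}

// The plan location named in the prompt: outputs/.plans/<slug>.md
function planPath(slug) {
  return `outputs/.plans/${slug}.md`;
}
```

Under these assumptions, `deriveSlug("Audit of the Attention Is All You Need paper")` yields `audit-attention-is-all-you`, and `planPath` maps that slug to `outputs/.plans/audit-attention-is-all-you.md`.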

@@ -9,7 +9,7 @@ Compare sources for: $@
Derive a short slug from the comparison topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.
Requirements:
-- Before starting, outline the comparison plan: which sources to compare, which dimensions to evaluate, expected output structure. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the comparison plan: which sources to compare, which dimensions to evaluate, expected output structure. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
- Use the `researcher` subagent to gather source material when the comparison set is broad, and the `verifier` subagent to verify sources and add inline citations to the final matrix.
- Build a comparison matrix covering: source, key claim, evidence type, caveats, confidence.
- Generate charts with `pi-charts` when the comparison involves quantitative metrics. Use Mermaid for method or architecture comparisons.

@@ -51,7 +51,7 @@ If `CHANGELOG.md` exists, read the most recent relevant entries before finalizin
Also save the plan with `memory_remember` (type: `fact`, key: `deepresearch.<slug>.plan`) so it survives context truncation.
-Present the plan to the user, then continue automatically. Do not block the workflow waiting for approval. If the user actively asks for changes, revise the plan first before proceeding.
+Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting in the terminal, give them a brief chance to request plan changes before proceeding.
## 2. Scale decision

@@ -9,7 +9,7 @@ Write a paper-style draft for: $@
Derive a short slug from the topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.
Requirements:
-- Before writing, outline the draft structure: proposed title, sections, key claims to make, source material to draw from, and a verification log for the critical claims, figures, and calculations. Write the outline to `outputs/.plans/<slug>.md`. Present the outline to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before writing, outline the draft structure: proposed title, sections, key claims to make, source material to draw from, and a verification log for the critical claims, figures, and calculations. Write the outline to `outputs/.plans/<slug>.md`. Present the outline to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
- Use the `writer` subagent when the draft should be produced from already-collected notes, then use the `verifier` subagent to add inline citations and verify sources.
- Include at minimum: title, abstract, problem statement, related work, method or synthesis, evidence or experiments, limitations, conclusion.
- Use clean Markdown with LaTeX where equations materially help.

@@ -10,7 +10,7 @@ Derive a short slug from the topic (lowercase, hyphens, no filler words, ≤5 wo
## Workflow
-1. **Plan** — Outline the scope: key questions, source types to search (papers, web, repos), time period, expected sections, and a small task ledger plus verification log. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+1. **Plan** — Outline the scope: key questions, source types to search (papers, web, repos), time period, expected sections, and a small task ledger plus verification log. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
2. **Gather** — Use the `researcher` subagent when the sweep is wide enough to benefit from delegated paper triage before synthesis. For narrow topics, search directly. Researcher outputs go to `<slug>-research-*.md`. Do not silently skip assigned questions; mark them `done`, `blocked`, or `superseded`.
3. **Synthesize** — Separate consensus, disagreements, and open questions. When useful, propose concrete next experiments or follow-up reading. Generate charts with `pi-charts` for quantitative comparisons across papers and Mermaid diagrams for taxonomies or method pipelines. Before finishing the draft, sweep every strong claim against the verification log and downgrade anything that is inferred or single-source critical.
4. **Cite** — Spawn the `verifier` agent to add inline citations and verify every source URL in the draft.

@@ -9,7 +9,7 @@ Review this AI research artifact: $@
Derive a short slug from the artifact name (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.
Requirements:
-- Before starting, outline what will be reviewed, the review criteria (novelty, empirical rigor, baselines, reproducibility, etc.), and any verification-specific checks needed for claims, figures, and reported metrics. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline what will be reviewed, the review criteria (novelty, empirical rigor, baselines, reproducibility, etc.), and any verification-specific checks needed for claims, figures, and reported metrics. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
- Spawn a `researcher` subagent to gather evidence on the artifact — inspect the paper, code, cited work, and any linked experimental artifacts. Save to `<slug>-research.md`.
- Spawn a `reviewer` subagent with `<slug>-research.md` to produce the final peer review with inline annotations.
- For small or simple artifacts where evidence gathering is overkill, run the `reviewer` subagent directly instead.

@@ -9,7 +9,7 @@ Create a research watch for: $@
Derive a short slug from the watch topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.
Requirements:
-- Before starting, outline the watch plan: what to monitor, what signals matter, what counts as a meaningful change, and the check frequency. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the watch plan: what to monitor, what signals matter, what counts as a meaningful change, and the check frequency. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
- Start with a baseline sweep of the topic.
- Use `schedule_prompt` to create the recurring or delayed follow-up instead of merely promising to check later.
- Save exactly one baseline artifact to `outputs/<slug>-baseline.md`.

@@ -24,14 +24,16 @@ export function patchPiWebAccessSource(relativePath, source) {
   }
   if (relativePath === "index.ts") {
-    if (patched.includes('return "summary-review";')) {
-      patched = patched.replace('return "summary-review";', 'return "none";');
+    const workflowDefaultOriginal = 'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);';
+    const workflowDefaultPatched = 'const workflow = resolveWorkflow(params.workflow ?? configWorkflow ?? "none", ctx?.hasUI !== false);';
+    if (patched.includes(workflowDefaultOriginal)) {
+      patched = patched.replace(workflowDefaultOriginal, workflowDefaultPatched);
       changed = true;
     }
     if (patched.includes('summary-review = open curator with auto summary draft (default)')) {
       patched = patched.replace(
         'summary-review = open curator with auto summary draft (default)',
-        'summary-review = open curator with auto summary draft',
+        'summary-review = open curator with auto summary draft (opt-in)',
       );
       changed = true;
     }
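
The hunk above patches the upstream `index.ts` by literal string replacement, guarded by an `includes` check so that a second run leaves the source untouched. A minimal standalone sketch of that pattern follows. `applyLiteralPatches` and `WORKFLOW_PATCHES` are hypothetical names introduced here for illustration and are not part of the commit; only the two string pairs are taken from the diff.

```javascript
// Hypothetical helper: apply literal find/replace pairs to a source string.
// Each pair is applied only when the original text is present, so running the
// helper on already-patched text reports `changed: false`.
function applyLiteralPatches(source, replacements) {
  let patched = source;
  let changed = false;
  for (const [original, replacement] of replacements) {
    if (patched.includes(original)) {
      // String.prototype.replace with a string pattern replaces the first match only.
      patched = patched.replace(original, replacement);
      changed = true;
    }
  }
  return { patched, changed };
}

// The two replacements this commit applies to index.ts, copied from the diff.
const WORKFLOW_PATCHES = [
  [
    'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);',
    'const workflow = resolveWorkflow(params.workflow ?? configWorkflow ?? "none", ctx?.hasUI !== false);',
  ],
  [
    "summary-review = open curator with auto summary draft (default)",
    "summary-review = open curator with auto summary draft (opt-in)",
  ],
];
```

Unlike the earlier replacement of `return "summary-review";`, patching the call site leaves `resolveWorkflow` itself intact, which appears to be why an explicit `summary-review` request still works after this commit.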

@@ -33,13 +33,15 @@ test("patchPiWebAccessSource updates index.ts directory handling", () => {
   assert.match(patched, /const dir = dirname\(WEB_SEARCH_CONFIG_PATH\);/);
 });
-test("patchPiWebAccessSource defaults workflow to none for index.ts", () => {
+test("patchPiWebAccessSource defaults workflow to none for index.ts without disabling explicit summary-review", () => {
   const input = [
     'function resolveWorkflow(input: unknown, hasUI: boolean): WebSearchWorkflow {',
     '\tif (!hasUI) return "none";',
     '\tif (typeof input === "string" && input.trim().toLowerCase() === "none") return "none";',
     '\treturn "summary-review";',
     '}',
+    'const configWorkflow = loadConfigForExtensionInit().workflow;',
+    'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);',
     'workflow: Type.Optional(',
     '\tStringEnum(["none", "summary-review"], {',
     '\t\tdescription: "Search workflow mode: none = no curator, summary-review = open curator with auto summary draft (default)",',
@@ -50,8 +52,9 @@ test("patchPiWebAccessSource defaults workflow to none for index.ts", () => {
   const patched = patchPiWebAccessSource("index.ts", input);
-  assert.match(patched, /return "none";/);
-  assert.doesNotMatch(patched, /summary-review = open curator with auto summary draft \(default\)/);
+  assert.match(patched, /params\.workflow \?\? configWorkflow \?\? "none"/);
+  assert.match(patched, /return "summary-review";/);
+  assert.match(patched, /summary-review = open curator with auto summary draft \(opt-in\)/);
 });
 test("patchPiWebAccessSource is idempotent", () => {
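
To see what the patched call site changes at runtime, the sketch below transcribes the `resolveWorkflow` body from the test fixture above into plain JavaScript and wraps it in a hypothetical `pickWorkflow` helper that mirrors the patched expression. The fixture is a reduced stand-in for the upstream function, so this is an approximation of the real behavior, not a copy of it.

```javascript
// Transcribed from the test fixture, with TypeScript annotations removed.
function resolveWorkflow(input, hasUI) {
  if (!hasUI) return "none";
  if (typeof input === "string" && input.trim().toLowerCase() === "none") return "none";
  return "summary-review";
}

// Hypothetical wrapper mirroring the patched call site:
//   resolveWorkflow(params.workflow ?? configWorkflow ?? "none", hasUI)
function pickWorkflow(paramsWorkflow, configWorkflow, hasUI) {
  return resolveWorkflow(paramsWorkflow ?? configWorkflow ?? "none", hasUI);
}

// Unpatched behavior for comparison: no fallback, so an unset workflow
// falls through to "summary-review".
function pickWorkflowUnpatched(paramsWorkflow, configWorkflow, hasUI) {
  return resolveWorkflow(paramsWorkflow ?? configWorkflow, hasUI);
}
```

With neither a parameter nor a config value set, `pickWorkflow(undefined, undefined, true)` returns `"none"`, whereas `pickWorkflowUnpatched(undefined, undefined, true)` returns `"summary-review"`. An explicit `"summary-review"` from either source is still honored when a UI is available.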

@@ -35,23 +35,27 @@ To update the standalone Feynman app on macOS, Linux, or Windows, rerun the inst
## Uninstalling
-Feynman does not currently ship a dedicated `uninstall` command. Remove the standalone launcher and runtime bundle directly, then optionally remove the Feynman home directory if you also want to delete settings, auth, and sessions.
+Feynman does not currently ship a dedicated `uninstall` command. Remove the standalone launcher and runtime bundle directly, then optionally remove the Feynman home directory if you also want to delete settings, sessions, and installed package state. If you also want to clear alphaXiv login state, remove `~/.ahub`.
On macOS or Linux:
```bash
rm -f ~/.local/bin/feynman
rm -rf ~/.local/share/feynman
-# optional: remove settings, auth, sessions, and installed package state
+# optional: remove settings, sessions, and installed package state
rm -rf ~/.feynman
+# optional: remove alphaXiv auth state
+rm -rf ~/.ahub
```
On Windows PowerShell:
```powershell
Remove-Item "$env:LOCALAPPDATA\\Programs\\feynman" -Recurse -Force
-# optional: remove settings, auth, sessions, and installed package state
+# optional: remove settings, sessions, and installed package state
Remove-Item "$HOME\\.feynman" -Recurse -Force
+# optional: remove alphaXiv auth state
+Remove-Item "$HOME\\.ahub" -Recurse -Force
```
If you added the launcher directory to `PATH` manually, remove that entry as well.