fix: tighten workflow prompts and search defaults

2026-04-14 09:30:15 -07:00
parent af6486312d
commit 0995f5cc22
11 changed files with 26 additions and 17 deletions
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ The installer downloads a standalone native bundle with its own Node.js runtime.

 To upgrade the standalone app later, rerun the installer. `feynman update` only refreshes installed Pi packages inside Feynman's environment; it does not replace the standalone runtime bundle itself.

-To uninstall the standalone app, remove the launcher and runtime bundle, then optionally remove `~/.feynman` if you also want to delete settings, auth, and sessions. See the installation guide for platform-specific paths.
+To uninstall the standalone app, remove the launcher and runtime bundle, then optionally remove `~/.feynman` if you also want to delete settings, sessions, and installed package state. If you also want to delete alphaXiv login state, remove `~/.ahub`. See the installation guide for platform-specific paths.

 Local models are supported through the custom-provider flow. For Ollama, run `feynman setup`, choose `Custom provider (baseUrl + API key)`, use `openai-completions`, and point it at `http://localhost:11434/v1`.

--- a/prompts/audit.md
+++ b/prompts/audit.md
@@ -9,7 +9,7 @@ Audit the paper and codebase for: $@
 Derive a short slug from the audit target (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.

 Requirements:
- Before starting, outline the audit plan: which paper, which repo, which claims to check. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the audit plan: which paper, which repo, which claims to check. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 - Use the `researcher` subagent for evidence gathering and the `verifier` subagent to verify sources and add inline citations when the audit is non-trivial.
 - Compare claimed methods, defaults, metrics, and data handling against the actual code.
 - Call out missing code, mismatches, ambiguous defaults, and reproduction risks.
--- a/prompts/compare.md
+++ b/prompts/compare.md
@@ -9,7 +9,7 @@ Compare sources for: $@
 Derive a short slug from the comparison topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.

 Requirements:
- Before starting, outline the comparison plan: which sources to compare, which dimensions to evaluate, expected output structure. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the comparison plan: which sources to compare, which dimensions to evaluate, expected output structure. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 - Use the `researcher` subagent to gather source material when the comparison set is broad, and the `verifier` subagent to verify sources and add inline citations to the final matrix.
 - Build a comparison matrix covering: source, key claim, evidence type, caveats, confidence.
 - Generate charts with `pi-charts` when the comparison involves quantitative metrics. Use Mermaid for method or architecture comparisons.
--- a/prompts/deepresearch.md
+++ b/prompts/deepresearch.md
@@ -51,7 +51,7 @@ If `CHANGELOG.md` exists, read the most recent relevant entries before finalizin

 Also save the plan with `memory_remember` (type: `fact`, key: `deepresearch.<slug>.plan`) so it survives context truncation.

-Present the plan to the user, then continue automatically. Do not block the workflow waiting for approval. If the user actively asks for changes, revise the plan first before proceeding.
+Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting in the terminal, give them a brief chance to request plan changes before proceeding.

 ## 2. Scale decision

--- a/prompts/draft.md
+++ b/prompts/draft.md
@@ -9,7 +9,7 @@ Write a paper-style draft for: $@
 Derive a short slug from the topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.

 Requirements:
- Before writing, outline the draft structure: proposed title, sections, key claims to make, source material to draw from, and a verification log for the critical claims, figures, and calculations. Write the outline to `outputs/.plans/<slug>.md`. Present the outline to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before writing, outline the draft structure: proposed title, sections, key claims to make, source material to draw from, and a verification log for the critical claims, figures, and calculations. Write the outline to `outputs/.plans/<slug>.md`. Present the outline to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 - Use the `writer` subagent when the draft should be produced from already-collected notes, then use the `verifier` subagent to add inline citations and verify sources.
 - Include at minimum: title, abstract, problem statement, related work, method or synthesis, evidence or experiments, limitations, conclusion.
 - Use clean Markdown with LaTeX where equations materially help.
--- a/prompts/lit.md
+++ b/prompts/lit.md
@@ -10,7 +10,7 @@ Derive a short slug from the topic (lowercase, hyphens, no filler words, ≤5 wo

 ## Workflow

-1. **Plan** — Outline the scope: key questions, source types to search (papers, web, repos), time period, expected sections, and a small task ledger plus verification log. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+1. **Plan** — Outline the scope: key questions, source types to search (papers, web, repos), time period, expected sections, and a small task ledger plus verification log. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 2. **Gather** — Use the `researcher` subagent when the sweep is wide enough to benefit from delegated paper triage before synthesis. For narrow topics, search directly. Researcher outputs go to `<slug>-research-*.md`. Do not silently skip assigned questions; mark them `done`, `blocked`, or `superseded`.
 3. **Synthesize** — Separate consensus, disagreements, and open questions. When useful, propose concrete next experiments or follow-up reading. Generate charts with `pi-charts` for quantitative comparisons across papers and Mermaid diagrams for taxonomies or method pipelines. Before finishing the draft, sweep every strong claim against the verification log and downgrade anything that is inferred or single-source critical.
 4. **Cite** — Spawn the `verifier` agent to add inline citations and verify every source URL in the draft.
--- a/prompts/review.md
+++ b/prompts/review.md
@@ -9,7 +9,7 @@ Review this AI research artifact: $@
 Derive a short slug from the artifact name (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.

 Requirements:
- Before starting, outline what will be reviewed, the review criteria (novelty, empirical rigor, baselines, reproducibility, etc.), and any verification-specific checks needed for claims, figures, and reported metrics. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline what will be reviewed, the review criteria (novelty, empirical rigor, baselines, reproducibility, etc.), and any verification-specific checks needed for claims, figures, and reported metrics. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 - Spawn a `researcher` subagent to gather evidence on the artifact — inspect the paper, code, cited work, and any linked experimental artifacts. Save to `<slug>-research.md`.
 - Spawn a `reviewer` subagent with `<slug>-research.md` to produce the final peer review with inline annotations.
 - For small or simple artifacts where evidence gathering is overkill, run the `reviewer` subagent directly instead.
--- a/prompts/watch.md
+++ b/prompts/watch.md
@@ -9,7 +9,7 @@ Create a research watch for: $@
 Derive a short slug from the watch topic (lowercase, hyphens, no filler words, ≤5 words). Use this slug for all files in this run.

 Requirements:
- Before starting, outline the watch plan: what to monitor, what signals matter, what counts as a meaningful change, and the check frequency. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user, then continue automatically. Do not block the workflow waiting for confirmation.
+- Before starting, outline the watch plan: what to monitor, what signals matter, what counts as a meaningful change, and the check frequency. Write the plan to `outputs/.plans/<slug>.md`. Present the plan to the user. If this is an unattended or one-shot run, continue automatically. If the user is actively interacting, give them a brief chance to request changes before proceeding.
 - Start with a baseline sweep of the topic.
 - Use `schedule_prompt` to create the recurring or delayed follow-up instead of merely promising to check later.
 - Save exactly one baseline artifact to `outputs/<slug>-baseline.md`.
--- a/scripts/lib/pi-web-access-patch.mjs
+++ b/scripts/lib/pi-web-access-patch.mjs
@@ -24,14 +24,16 @@ export function patchPiWebAccessSource(relativePath, source) {
 	}

 	if (relativePath === "index.ts") {
-		if (patched.includes('return "summary-review";')) {
-			patched = patched.replace('return "summary-review";', 'return "none";');
+		const workflowDefaultOriginal = 'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);';
+		const workflowDefaultPatched = 'const workflow = resolveWorkflow(params.workflow ?? configWorkflow ?? "none", ctx?.hasUI !== false);';
+		if (patched.includes(workflowDefaultOriginal)) {
+			patched = patched.replace(workflowDefaultOriginal, workflowDefaultPatched);
 			changed = true;
 		}
 		if (patched.includes('summary-review = open curator with auto summary draft (default)')) {
 			patched = patched.replace(
 				'summary-review = open curator with auto summary draft (default)',
-				'summary-review = open curator with auto summary draft',
+				'summary-review = open curator with auto summary draft (opt-in)',
 			);
 			changed = true;
 		}
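
The guarded replace in the hunk above can be sketched in isolation. This is a minimal illustration, not the project's implementation: `patchWorkflowDefault` and the constant names are hypothetical, and only the two source strings are taken from the diff. It shows why checking `includes` before `replace` makes a second run a no-op, since the patched line no longer contains the original text.

```typescript
// Hypothetical, simplified stand-in for the index.ts branch of patchPiWebAccessSource.
// The two string constants are copied from the diff; everything else is illustrative.
const WORKFLOW_ORIGINAL =
	'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);';
const WORKFLOW_PATCHED =
	'const workflow = resolveWorkflow(params.workflow ?? configWorkflow ?? "none", ctx?.hasUI !== false);';

function patchWorkflowDefault(source: string): { patched: string; changed: boolean } {
	// Guard first: if the upstream line is absent (already patched, or upstream changed),
	// return the input untouched so that a second run is a no-op.
	if (!source.includes(WORKFLOW_ORIGINAL)) {
		return { patched: source, changed: false };
	}
	// String.prototype.replace with a string pattern rewrites only the first occurrence.
	return { patched: source.replace(WORKFLOW_ORIGINAL, WORKFLOW_PATCHED), changed: true };
}

const first = patchWorkflowDefault(`// upstream\n${WORKFLOW_ORIGINAL}\n`);
const second = patchWorkflowDefault(first.patched);
console.log(first.changed, second.changed); // true false
```

Matching the whole call-site line, rather than a short fragment such as `return "summary-review";`, keeps the patch from touching unrelated code, at the cost of silently doing nothing if upstream reformats that line.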
--- a/tests/pi-web-access-patch.test.ts
+++ b/tests/pi-web-access-patch.test.ts
@@ -33,13 +33,15 @@ test("patchPiWebAccessSource updates index.ts directory handling", () => {
 	assert.match(patched, /const dir = dirname\(WEB_SEARCH_CONFIG_PATH\);/);
 });

-test("patchPiWebAccessSource defaults workflow to none for index.ts", () => {
+test("patchPiWebAccessSource defaults workflow to none for index.ts without disabling explicit summary-review", () => {
 	const input = [
 		'function resolveWorkflow(input: unknown, hasUI: boolean): WebSearchWorkflow {',
 		'\tif (!hasUI) return "none";',
 		'\tif (typeof input === "string" && input.trim().toLowerCase() === "none") return "none";',
 		'\treturn "summary-review";',
 		'}',
+		'const configWorkflow = loadConfigForExtensionInit().workflow;',
+		'const workflow = resolveWorkflow(params.workflow ?? configWorkflow, ctx?.hasUI !== false);',
 		'workflow: Type.Optional(',
 		'\tStringEnum(["none", "summary-review"], {',
 		'\t\tdescription: "Search workflow mode: none = no curator, summary-review = open curator with auto summary draft (default)",',
@@ -50,8 +52,9 @@ test("patchPiWebAccessSource defaults workflow to none for index.ts", () => {

 	const patched = patchPiWebAccessSource("index.ts", input);

-	assert.match(patched, /return "none";/);
-	assert.doesNotMatch(patched, /summary-review = open curator with auto summary draft \(default\)/);
+	assert.match(patched, /params\.workflow \?\? configWorkflow \?\? "none"/);
+	assert.match(patched, /return "summary-review";/);
+	assert.match(patched, /summary-review = open curator with auto summary draft \(opt-in\)/);
 });

 test("patchPiWebAccessSource is idempotent", () => {
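
The behavior this test pins down can be illustrated with a small sketch. `resolveWorkflow` below mirrors the fixture in the test input; `pickWorkflow` is a hypothetical wrapper around the patched call site, added only so the fallback chain can be run on its own. Assuming the real `index.ts` matches the fixture, an unset workflow now resolves to `"none"`, while an explicit `"summary-review"` from either the call parameters or the config still takes effect when a UI is available.

```typescript
// Sketch only: mirrors the resolveWorkflow fixture used in the test input.
type WebSearchWorkflow = "none" | "summary-review";

function resolveWorkflow(input: unknown, hasUI: boolean): WebSearchWorkflow {
	if (!hasUI) return "none";
	if (typeof input === "string" && input.trim().toLowerCase() === "none") return "none";
	return "summary-review";
}

// Hypothetical helper wrapping the patched call site:
// explicit parameter first, then config, then the new "none" default.
function pickWorkflow(
	paramWorkflow: string | undefined,
	configWorkflow: string | undefined,
	hasUI: boolean,
): WebSearchWorkflow {
	return resolveWorkflow(paramWorkflow ?? configWorkflow ?? "none", hasUI);
}

console.log(pickWorkflow(undefined, undefined, true)); // "none"
console.log(pickWorkflow("summary-review", undefined, true)); // "summary-review"
console.log(pickWorkflow(undefined, "summary-review", true)); // "summary-review"
console.log(pickWorkflow("summary-review", undefined, false)); // "none"
```

The earlier patch rewrote the final `return` inside `resolveWorkflow`, which made every path return `"none"`; moving the default to the call site leaves the function body, and therefore the opt-in path, intact.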
--- a/website/src/content/docs/getting-started/installation.md
+++ b/website/src/content/docs/getting-started/installation.md
@@ -35,23 +35,27 @@ To update the standalone Feynman app on macOS, Linux, or Windows, rerun the inst

 ## Uninstalling

-Feynman does not currently ship a dedicated `uninstall` command. Remove the standalone launcher and runtime bundle directly, then optionally remove the Feynman home directory if you also want to delete settings, auth, and sessions.
+Feynman does not currently ship a dedicated `uninstall` command. Remove the standalone launcher and runtime bundle directly, then optionally remove the Feynman home directory if you also want to delete settings, sessions, and installed package state. If you also want to clear alphaXiv login state, remove `~/.ahub`.

 On macOS or Linux:

 ```bash
 rm -f ~/.local/bin/feynman
 rm -rf ~/.local/share/feynman
-# optional: remove settings, auth, sessions, and installed package state
+# optional: remove settings, sessions, and installed package state
 rm -rf ~/.feynman
+# optional: remove alphaXiv auth state
+rm -rf ~/.ahub
 ```

 On Windows PowerShell:

 ```powershell
 Remove-Item "$env:LOCALAPPDATA\\Programs\\feynman" -Recurse -Force
-# optional: remove settings, auth, sessions, and installed package state
+# optional: remove settings, sessions, and installed package state
 Remove-Item "$HOME\\.feynman" -Recurse -Force
+# optional: remove alphaXiv auth state
+Remove-Item "$HOME\\.ahub" -Recurse -Force
 ```

 If you added the launcher directory to `PATH` manually, remove that entry as well.