Finalize workflow and prompt updates

2026-03-24 11:59:50 -07:00
parent d7afde8fc0
commit 1c90128605
7 changed files with 40 additions and 2 deletions
--- a/prompts/autoresearch.md
+++ b/prompts/autoresearch.md
@@ -11,6 +11,7 @@ This command uses pi-autoresearch.
 ## Step 1: Gather

 If `autoresearch.md` and `autoresearch.jsonl` already exist, ask the user if they want to resume or start fresh.
+If `CHANGELOG.md` exists, read the most recent relevant entries before resuming.

 Otherwise, collect the following from the user before doing anything else:
 - What to optimize (test speed, bundle size, training loss, build time, etc.)
@@ -48,6 +49,7 @@ Ask the user to confirm. Do not start the loop without explicit approval.
 Initialize the session: create `autoresearch.md`, `autoresearch.sh`, run the baseline, and start looping.

 Each iteration: edit → commit → `run_experiment` → `log_experiment` → keep or revert → repeat. Do not stop unless interrupted or `maxIterations` is reached.
+After the baseline and after meaningful iteration milestones, append a concise entry to `CHANGELOG.md` summarizing what changed, what metric result was observed, what failed, and the next step.

 ## Key tools

--- a/prompts/deepresearch.md
+++ b/prompts/deepresearch.md
@@ -18,6 +18,7 @@ Analyze the research question using extended thinking. Develop a research strate
 - Acceptance criteria: what evidence would make the answer "sufficient"

 Derive a short slug from the topic (lowercase, hyphens, no filler words, ≤5 words — e.g. "cloud-sandbox-pricing" not "deepresearch-plan"). Write the plan to `outputs/.plans/<slug>.md` as a self-contained artifact. Use this same slug for all artifacts in this run.
+If `CHANGELOG.md` exists, read the most recent relevant entries before finalizing the plan. If the workflow runs for more than one round, or is long enough that it may need to be resumed later, append concise entries to `CHANGELOG.md` after meaningful progress and before stopping.

 ```markdown
 # Research Plan: [topic]
@@ -100,6 +101,7 @@ After researchers return, read their output files and critically assess:
 If gaps are significant, spawn another targeted batch of researchers. No fixed cap on rounds — iterate until evidence is sufficient or sources are exhausted.

 Update the plan artifact (`outputs/.plans/<slug>.md`) task ledger, verification log, and decision log after each round.
+When the work spans multiple rounds, also append a concise chronological entry to `CHANGELOG.md` covering what changed, what was verified, what remains blocked, and the next recommended step.

 Most topics need 1-2 rounds. Stop when additional rounds would not materially change conclusions.

--- a/prompts/replicate.md
+++ b/prompts/replicate.md
@@ -8,7 +8,7 @@ Design a replication plan for: $@

 ## Workflow

-1. **Extract** — Use the `researcher` subagent to pull implementation details from the target paper and any linked code.
+1. **Extract** — Use the `researcher` subagent to pull implementation details from the target paper and any linked code. If `CHANGELOG.md` exists, read the most recent relevant entries before planning or resuming.
 2. **Plan** — Determine what code, datasets, metrics, and environment are needed. Be explicit about what is verified, what is inferred, what is still missing, and which checks or test oracles will be used to decide whether the replication succeeded.
 3. **Environment** — Before running anything, ask the user where to execute:
   - **Local** — run in the current working directory
@@ -16,6 +16,7 @@ Design a replication plan for: $@
   - **Docker** — run experiment code inside an isolated Docker container
   - **Plan only** — produce the replication plan without executing
 4. **Execute** — If the user chose an execution environment, implement and run the replication steps there. Save notes, scripts, raw outputs, and results to disk in a reproducible layout. Do not call the outcome replicated unless the planned checks actually passed.
-5. **Report** — End with a `Sources` section containing paper and repository URLs.
+5. **Log** — For multi-step or resumable replication work, append concise entries to `CHANGELOG.md` after meaningful progress, failed attempts, major verification outcomes, and before stopping. Record the active objective, what changed, what was checked, and the next step.
+6. **Report** — End with a `Sources` section containing paper and repository URLs.

 Do not install packages, run training, or execute experiments without confirming the execution environment first.
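
The three prompts above share one convention: read the most recent `CHANGELOG.md` entries before planning or resuming, and append a concise entry after meaningful progress. The commit does not define an entry format, so the following is only a minimal sketch of how that convention could be exercised. The `## <timestamp>: <objective>` heading layout, the field labels, the UTC timestamp, and the helper names `format_entry`, `append_entry`, and `recent_entries` are all assumptions made for illustration, not part of the prompts.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_entry(objective, changed, checked, next_step, when=None):
    """Render one entry as a small markdown section (hypothetical layout)."""
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y-%m-%d %H:%M UTC")
    return (
        f"## {stamp}: {objective}\n"
        f"- Changed: {changed}\n"
        f"- Checked: {checked}\n"
        f"- Next: {next_step}\n"
    )


def append_entry(path, entry):
    """Append an entry to the changelog, creating the file if it is missing."""
    path = Path(path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    # Keep one blank line between entries; add a newline first if the
    # existing file does not already end with one.
    if not existing:
        separator = ""
    elif existing.endswith("\n"):
        separator = "\n"
    else:
        separator = "\n\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(separator + entry)


def recent_entries(path, limit=3):
    """Return up to `limit` of the latest entries, oldest first."""
    path = Path(path)
    if limit <= 0 or not path.exists():
        return []
    entries, current = [], None
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            # A level-2 heading starts a new entry (assumed convention).
            current = [line]
            entries.append(current)
        elif current is not None:
            current.append(line)
    return ["\n".join(lines).strip() for lines in entries[-limit:]]
```

Under these assumptions, a resuming session would call `recent_entries("CHANGELOG.md")` before planning, and call `append_entry("CHANGELOG.md", format_entry(...))` after the baseline, after a failed attempt, or before stopping. Appending rather than rewriting keeps the log chronological, which matches the "concise chronological entry" wording in `deepresearch.md`.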