diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9c99d9c..421517c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -295,6 +295,15 @@ Use this file to track chronology, not release notes. Keep entries short, factua
 - Blockers: The duplicate-id warning remains as a build warning only, not a failing correctness gate.
 - Next: If desired, isolate the Astro duplicate-id warning separately with a minimal reproduction rather than mixing it into runtime/CLI maintenance.
 
+### 2026-04-14 10:55 PDT — summarize-workflow-restore
+
+- Objective: Restore the useful summarization workflow that had been closed in PR `#69` without being merged.
+- Changed: Added `prompts/summarize.md` as a top-level CLI workflow so `feynman summarize ` is available again; kept the RLM-based tiering approach from the original proposal and aligned Tier 3 confirmation behavior with the repo's unattended-run conventions.
+- Verified: Confirmed `feynman summarize ` appears in CLI help; ran `node bin/feynman.js summarize /tmp/feynman-summary-smoke.txt` against a local smoke file and verified it produced `outputs/feynman-summary-smoke-summary.md` plus the raw fetched note artifact under `outputs/.notes/`.
+- Failed / learned: None in the restored Tier 1 path; broader Tier 2/Tier 3 behavior still depends on runtime/model/tool availability, just like the other prompt-driven workflows.
+- Blockers: None for the prompt restoration itself.
+- Next: If desired, add dedicated docs for `summarize` and decide whether to reopen PR `#69` for historical continuity or leave it closed as superseded by the landed equivalent on `main`.
+
 ### 2026-04-12 13:20 PDT — capital-france (citation verification brief)
 
 - Objective: Verify citations in the capital-of-France draft and produce a cited verifier brief.
diff --git a/prompts/summarize.md b/prompts/summarize.md
new file mode 100644
index 0000000..e4d7c9c
--- /dev/null
+++ b/prompts/summarize.md
@@ -0,0 +1,165 @@
+---
+description: Summarize any URL, local file, or PDF using the RLM pattern — source stored on disk, never injected raw into context.
+args:
+section: Research Workflows
+topLevelCli: true
+---
+Summarize the following source: $@
+
+Derive a short slug from the source filename or URL domain (lowercase, hyphens, no filler words, ≤5 words — e.g. `attention-is-all-you-need`). Use this slug for all files in this run.
+
+## Why this uses the RLM pattern
+
+Standard summarization injects the full document into context. Above ~15k tokens, early content degrades as the window fills (context rot). This workflow keeps the document on disk as an external variable and reads only bounded windows — so context pressure is proportional to the window size, not the document size.
+
+Tier 1 (< 8k chars) is a deliberate exception: direct injection is safe at ~2k tokens and windowed reading would add unnecessary friction.
+
+---
+
+## Step 1 — Fetch, validate, measure
+
+Run all guards before any tier logic. A failure here is cheap; a failure mid-Tier-3 is not.
+
+- **GitHub repo URL** (`https://github.com/owner/repo` — exactly 4 slashes): fetch the raw README instead. Try `https://raw.githubusercontent.com/{owner}/{repo}/main/README.md`, then `/master/README.md`. A repo HTML page is not the document the user wants to summarize.
+- **Remote URL**: fetch to disk with `curl -sL -o outputs/.notes/-raw.txt `. Do NOT use fetch_content — its return value enters context directly, bypassing the RLM external-variable principle.
+- **Local file or PDF**: copy or extract to `outputs/.notes/-raw.txt`. For PDFs, extract text via `pdftotext` or equivalent before measuring.
+- **Empty or failed fetch**: if the file is < 50 bytes after fetching, stop and surface the error to the user — do not proceed to tier selection.
+- **Binary content**: if the file is > 1 KB but contains < 100 readable text characters, stop and tell the user the content appears binary or unextracted.
+- **Existing output**: if `outputs/-summary.md` already exists, ask the user whether to overwrite or use a different slug. Do not proceed until confirmed.
+
+Measure decoded text characters, not bytes (byte counts overcount multi-byte UTF-8 text). Log: `[summarize] source= slug= chars=`
+
+---
+
+## Step 2 — Choose tier
+
+| Chars | Tier | Strategy |
+|---|---|---|
+| < 8 000 | 1 | Direct read — full content enters context (safe at ~2k tokens) |
+| 8 000 – 60 000 | 2 | RLM-lite — windowed bash extraction, progressive notes to disk |
+| > 60 000 | 3 | Full RLM — bash chunking + parallel researcher subagents |
+
+Log: `[summarize] tier= chars=`
+
+---
+
+## Tier 1 — Direct read
+
+Read `outputs/.notes/-raw.txt` in full. Summarize directly using the output format. Write to `outputs/-summary.md`.
+
+---
+
+## Tier 2 — RLM-lite windowed read
+
+The document stays on disk. Extract 6 000-char windows with a short Python snippet run via bash:
+
+```python
+# WHY read-and-discard instead of f.seek: text-mode seek() takes opaque byte cookies, not char offsets, and the read tool uses line offsets.
+# For exact char-boundary windowing across arbitrary text, run this via bash; n is the zero-based window index.
+with open("outputs/.notes/-raw.txt", encoding="utf-8") as f:
+    f.read(n * 6000)  # skip the first n windows, counted in decoded characters
+    window = f.read(6000)
+```
+
+For each window:
+1. Extract key claims and evidence.
+2. Append to `outputs/.notes/-notes.md` before reading the next window. This is the checkpoint: if the session is interrupted, processed windows survive.
+3. Log: `[summarize] window / done`
+
+Synthesize `outputs/.notes/-notes.md` into `outputs/-summary.md`.
+
+---
+
+## Tier 3 — Full RLM parallel chunks
+
+Each chunk gets a fresh researcher subagent context window — context rot is impossible because no subagent sees more than 6 000 chars.
+
+WHY 500-char overlap: academic papers contain multi-sentence arguments that span chunk boundaries. 500 chars (~80 words) ensures that any cross-boundary claim up to 500 chars long appears fully in at least one adjacent chunk.
+
+### 3a. Chunk the document
+
+```python
+import os
+os.makedirs("outputs/.notes", exist_ok=True)
+
+with open("outputs/.notes/-raw.txt", encoding="utf-8") as f:
+    text = f.read()
+
+chunk_size, overlap = 6000, 500
+chunks, i = [], 0
+while i < len(text):
+    chunks.append(text[i : i + chunk_size])
+    i += chunk_size - overlap
+
+for n, chunk in enumerate(chunks):
+    # Zero-pad index so files sort correctly (chunk-002 before chunk-010)
+    with open(f"outputs/.notes/-chunk-{n:03d}.txt", "w", encoding="utf-8") as f:
+        f.write(chunk)
+
+print(f"[summarize] chunks={len(chunks)} chunk_size={chunk_size} overlap={overlap}")
+```
+
+### 3b. Confirm before spawning
+
+If this is an unattended or one-shot run, continue automatically. Otherwise tell the user: "Source is ~ chars -> chunks -> researcher subagents. This may take several minutes. Proceed?" Wait for confirmation before launching Tier 3.
+
+### 3c. Dispatch researcher subagents
+
+```json
+{
+  "tasks": [{
+    "agent": "researcher",
+    "task": "Read ONLY `outputs/.notes/-chunk-NNN.txt`. Extract: (1) key claims, (2) methodology or technical approach, (3) cited evidence. Do NOT use web_search or fetch external URLs — this is single-source summarization. If a claim appears to start or end mid-sentence at the file boundary, mark it BOUNDARY PARTIAL. Write to `outputs/.notes/-summary-chunk-NNN.md`.",
+    "output": "outputs/.notes/-summary-chunk-NNN.md"
+  }],
+  "concurrency": 4,
+  "failFast": false
+}
+```
+
+### 3d. Aggregate
+
+After all subagents return, verify every expected `outputs/.notes/-summary-chunk-NNN.md` exists. Note any missing chunk indices — they will appear in the Coverage gaps section of the output. Do not abort on partial coverage; a partial summary with gaps noted is more useful than no summary.
+
+When synthesizing:
+- **Deduplicate**: a claim in multiple chunks is one claim — keep the most complete formulation.
+- **Resolve boundary conflicts**: for adjacent-chunk contradictions, prefer the version with more supporting context.
+- **Remove BOUNDARY PARTIAL markers** where a complete version exists in a neighbouring chunk.
+
+Write to `outputs/-summary.md`.
+
+---
+
+## Output format
+
+All tiers produce the same artifact at `outputs/-summary.md`:
+
+```markdown
+# Summary: [document title or source filename]
+
+**Source:** [URL or file path]
+**Date:** [YYYY-MM-DD]
+**Tier:** [1 / 2 (N windows) / 3 (N chunks)]
+
+## Key Claims
+[3-7 most important assertions, each as a bullet]
+
+## Methodology
+[Approach, dataset, evaluation, baselines — omit for non-research documents]
+
+## Limitations
+[What the source explicitly flags as weak, incomplete, or out of scope]
+
+## Verdict
+[One paragraph: what this document establishes, its credibility, who should read it]
+
+## Sources
+1. [Title or filename] — [URL or file path]
+
+## Coverage gaps *(Tier 3 only — omit if all chunks succeeded)*
+[Missing chunk indices and their approximate character ranges]
+```
+
+Before you stop, verify on disk that `outputs/-summary.md` exists.
+
+Sources contains only the single source confirmed reachable in Step 1. No verifier subagent is needed — there are no URLs constructed from memory to verify.
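
Reviewer note: the Step 2 thresholds and the Tier 2/Tier 3 window arithmetic in `prompts/summarize.md` can be sanity-checked with a small sketch. This is only an illustration under stated assumptions, not part of the change: the helper names `choose_tier`, `window_count`, and `chunk_starts` are hypothetical and do not exist in the prompt or the CLI. The constants (8 000, 60 000, 6 000, 500) are taken from the prompt text.

```python
# Hypothetical helpers (not part of prompts/summarize.md or the CLI);
# they only mirror the numbers stated in the prompt.

def choose_tier(chars: int) -> int:
    """Map a decoded character count to a tier, following the Step 2 table."""
    if chars < 8_000:
        return 1
    if chars <= 60_000:
        return 2
    return 3


def window_count(chars: int, size: int = 6_000) -> int:
    """Number of non-overlapping Tier 2 windows needed to cover the text."""
    return -(-chars // size)  # ceiling division


def chunk_starts(chars: int, chunk_size: int = 6_000, overlap: int = 500) -> list[int]:
    """Start offsets produced by the Tier 3 loop (step = chunk_size - overlap)."""
    return list(range(0, chars, chunk_size - overlap))


if __name__ == "__main__":
    for n in (7_999, 8_000, 60_000, 60_001):
        print(n, "-> tier", choose_tier(n))
    print("tier 3 chunks for 60 001 chars:", len(chunk_starts(60_001)))
```

Under these assumptions, a source just over the Tier 3 threshold (60 001 chars) is split into 11 chunks, which is the subagent count the 3b confirmation message would report.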