Deduplicate fabricated-results guardrails
@@ -17,7 +17,7 @@ You receive a draft document and the research files it was built from. Your job
 4. **Remove unsourced claims** — if a factual claim in the draft cannot be traced to any source in the research files, either find a source for it or remove it. Do not leave unsourced factual claims.
 5. **Verify meaning, not just topic overlap.** A citation is valid only if the source actually supports the specific number, quote, or conclusion attached to it.
 6. **Refuse fake certainty.** Do not use words like `verified`, `confirmed`, or `reproduced` unless the draft already contains or the research files provide the underlying evidence.
-7. **Never invent or keep fabricated results.** If any image, figure, chart, table, benchmark, score, dataset, sample size, ablation, or experimental result lacks explicit provenance, remove it or replace it with a clearly labeled TODO. Never keep a made-up result because it “looks plausible.”
+7. **Enforce the system prompt's provenance rule.** Unsupported results, figures, charts, tables, benchmarks, and quantitative claims must be removed or converted to TODOs.
 
 ## Citation rules
 
@@ -41,7 +41,7 @@ For code-backed or quantitative claims:
 - Treat captions such as “illustrative,” “simulated,” “representative,” or “example” as insufficient unless the user explicitly requested synthetic/example data. Otherwise remove the visual and mark the missing experiment.
 - Do not preserve polished summaries that outrun the raw evidence.
 
-## Fabrication audit
+## Result provenance audit
 
 Before saving the final document, scan for:
 - numeric scores or percentages,
@@ -15,7 +15,7 @@ You are Feynman's writing subagent.
 3. **Be explicit about gaps.** If the research files have unresolved questions or conflicting evidence, surface them — do not paper over them.
 4. **Do not promote draft text into fact.** If a result is tentative, inferred, or awaiting verification, label it that way in the prose.
 5. **No aesthetic laundering.** Do not make plots, tables, or summaries look cleaner than the underlying evidence justifies.
-6. **Never fabricate results.** Do not invent experimental scores, datasets, sample sizes, ablations, benchmark tables, charts, image captions, or figures. If evidence is missing, write `No results are available yet` or `TODO: run experiment` rather than producing plausible-looking data.
+6. **Follow the system prompt's provenance rule.** Missing results become gaps or TODOs, never plausible-looking data.
 
 ## Output structure
 
@@ -50,7 +50,7 @@ Unresolved issues, disagreements between sources, gaps in evidence.
 - Do NOT add inline citations — the verifier agent handles that as a separate post-processing step.
 - Do NOT add a Sources section — the verifier agent builds that.
 - Before finishing, do a claim sweep: every strong factual statement in the draft should have an obvious source home in the research files.
-- Before finishing, do a fake-result sweep: remove or replace any numeric result, figure, chart, benchmark, table, or image that lacks explicit provenance.
+- Before finishing, do a result-provenance sweep for numeric results, figures, charts, benchmarks, tables, and images.
 
 ## Output contract
 - Save the main artifact to the specified output path (default: `draft.md`).
@@ -39,14 +39,20 @@ test("research writing prompts forbid fabricated results and unproven figures",
 
   for (const [label, content] of [
     ["system prompt", systemPrompt],
-    ["writer prompt", writerPrompt],
-    ["verifier prompt", verifierPrompt],
   ] as const) {
     assert.match(content, /Never (invent|fabricate)/i, `${label} must explicitly forbid invented or fabricated results`);
     assert.match(content, /(figure|chart|image|table)/i, `${label} must cover visual/table provenance`);
     assert.match(content, /(provenance|source|artifact|script|raw)/i, `${label} must require traceable support`);
   }
 
+  for (const [label, content] of [
+    ["writer prompt", writerPrompt],
+    ["verifier prompt", verifierPrompt],
+    ["draft prompt", draftPrompt],
+  ] as const) {
+    assert.match(content, /system prompt.*provenance rule/i, `${label} must point back to the system provenance rule`);
+  }
+
   assert.match(draftPrompt, /system prompt's provenance rules/i);
   assert.match(draftPrompt, /placeholder or proposed experimental plan/i);
   assert.match(draftPrompt, /source-backed quantitative data/i);
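The loop added in the test hunk above requires the writer, verifier, and draft prompts to match `/system prompt.*provenance rule/i`. As a minimal sketch of what that pattern accepts and rejects, the snippet below runs it against wording taken from this commit. `pointsToSystemProvenanceRule` is a hypothetical helper name used only for illustration; the commit itself inlines the check with `assert.match`.

```typescript
// Same pattern the new test loop uses for delegated prompts.
const POINTER_PATTERN = /system prompt.*provenance rule/i;

// Hypothetical helper (not part of the commit): true when a prompt
// points back to the system prompt's provenance rule.
function pointsToSystemProvenanceRule(prompt: string): boolean {
  return POINTER_PATTERN.test(prompt);
}

// Replacement wording from the verifier and writer prompts in this commit.
const verifierRule =
  "7. **Enforce the system prompt's provenance rule.** Unsupported results, figures, charts, tables, benchmarks, and quantitative claims must be removed or converted to TODOs.";
const writerRule =
  "6. **Follow the system prompt's provenance rule.** Missing results become gaps or TODOs, never plausible-looking data.";

// Start of the removed verifier wording: it mentions provenance,
// but never refers to the system prompt.
const oldVerifierRule =
  "7. **Never invent or keep fabricated results.** If any image, figure, chart, table, benchmark, score, dataset, sample size, ablation, or experimental result lacks explicit provenance, remove it or replace it with a clearly labeled TODO.";

// Illustrative input (an assumption, not from the commit): the two
// phrases are split across a line break.
const wrappedRule = "Follow the system prompt.\nIts provenance rule applies here.";

console.log(pointsToSystemProvenanceRule(verifierRule)); // true
console.log(pointsToSystemProvenanceRule(writerRule)); // true
console.log(pointsToSystemProvenanceRule(oldVerifierRule)); // false
console.log(pointsToSystemProvenanceRule(wrappedRule)); // false
```

Because `.` does not match line terminators unless the `s` flag is set, both phrases have to sit on the same line of the prompt. A prompt that is later re-wrapped could fail this check even though its meaning is unchanged.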
@@ -35,7 +35,7 @@ When working from existing session context (after a deep research or literature
 
 The writer pays attention to academic conventions: claims are attributed to their sources with inline citations, methodology sections describe procedures precisely, and limitations are discussed honestly. The draft includes placeholder sections for any content the writer cannot generate from available sources, clearly marking what needs human input.
 
-The draft workflow must not invent experimental results, scores, figures, images, tables, or benchmark data. When no source material or raw artifact supports a result, Feynman should leave a clearly labeled placeholder such as `No experimental results are available yet` or `TODO: run experiment` instead of producing plausible-looking data.
+Drafts follow Feynman's system-wide provenance rules: unsupported results, figures, images, tables, or benchmark data should become clearly labeled gaps or TODOs, not plausible-looking claims.
 
 ## Output format
 