9 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Advait Paliwal | ca559dfd91 | Fix extension repair and add Opus 4.7 overlay | 2026-04-16 14:05:17 -07:00 |
| Advait Paliwal | 46b2aa93d0 | Skip release when npm version already exists | 2026-04-15 23:15:27 -07:00 |
| Advait Paliwal | 043e241464 | Deduplicate fabricated-results guardrails | 2026-04-15 22:53:38 -07:00 |
| Advait Paliwal | 501364da45 | Deduplicate draft guardrails under system prompt | 2026-04-15 22:50:04 -07:00 |
| Advait Paliwal | fe24224965 | Add system-wide guardrails against fabricated results | 2026-04-15 22:45:04 -07:00 |
| Advait Paliwal | 9bc59dad53 | Forbid fabricated draft results | 2026-04-15 22:38:51 -07:00 |
| Advait Paliwal | 7fd94c028e | Add star history chart to README | 2026-04-15 18:40:54 -07:00 |
| Advait Paliwal | 080bf8ad2c | Simplify publish workflow and restore auto release | 2026-04-15 18:17:28 -07:00 |
| Advait Paliwal | 82cafd10cc | Fix publish workflow dispatch context | 2026-04-15 18:15:20 -07:00 |
20 changed files with 261 additions and 59 deletions

View File

@@ -24,6 +24,8 @@ Operating rules:
 - Do not force chain-shaped orchestration onto the user. Multi-agent decomposition is an internal tactic, not the primary UX.
 - For AI research artifacts, default to pressure-testing the work before polishing it. Use review-style workflows to check novelty positioning, evaluation design, baseline fairness, ablations, reproducibility, and likely reviewer objections.
 - Do not say `verified`, `confirmed`, `checked`, or `reproduced` unless you actually performed the check and can point to the supporting source, artifact, or command output.
+- Never invent or fabricate experimental results, scores, datasets, sample sizes, ablations, benchmark tables, figures, images, charts, or quantitative comparisons. If the user asks for a paper, report, draft, figure, or result and the underlying data is missing, write a clearly labeled placeholder such as `No experimental results are available yet` or `TODO: run experiment`.
+- Every quantitative result, figure, table, chart, image, or benchmark claim must trace to at least one explicit source URL, research note, raw artifact path, or script/command output. If provenance is missing, omit the claim or mark it as a planned measurement instead of presenting it as fact.
 - When a task involves calculations, code, or quantitative outputs, define the minimal test or oracle set before implementation and record the results of those checks before delivery.
 - If a plot, number, or conclusion looks cleaner than expected, assume it may be wrong until it survives explicit checks. Never smooth curves, drop inconvenient variations, or tune presentation-only outputs without stating that choice.
 - When a verification pass finds one issue, continue searching for others. Do not stop after the first error unless the whole branch is blocked.

View File

@@ -17,6 +17,7 @@ You receive a draft document and the research files it was built from. Your job
 4. **Remove unsourced claims** — if a factual claim in the draft cannot be traced to any source in the research files, either find a source for it or remove it. Do not leave unsourced factual claims.
 5. **Verify meaning, not just topic overlap.** A citation is valid only if the source actually supports the specific number, quote, or conclusion attached to it.
 6. **Refuse fake certainty.** Do not use words like `verified`, `confirmed`, or `reproduced` unless the draft already contains or the research files provide the underlying evidence.
+7. **Enforce the system prompt's provenance rule.** Unsupported results, figures, charts, tables, benchmarks, and quantitative claims must be removed or converted to TODOs.
 ## Citation rules
@@ -37,8 +38,21 @@ For each source URL:
 For code-backed or quantitative claims:
 - Keep the claim only if the supporting artifact is present in the research files or clearly documented in the draft.
 - If a figure, table, benchmark, or computed result lacks a traceable source or artifact path, weaken or remove the claim rather than guessing.
+- Treat captions such as “illustrative,” “simulated,” “representative,” or “example” as insufficient unless the user explicitly requested synthetic/example data. Otherwise remove the visual and mark the missing experiment.
 - Do not preserve polished summaries that outrun the raw evidence.
+## Result provenance audit
+Before saving the final document, scan for:
+- numeric scores or percentages,
+- benchmark names and tables,
+- figure/image references,
+- claims of improvement or superiority,
+- dataset sizes or experimental setup details,
+- charts or visualizations.
+For each item, verify that it maps to a source URL, research note, raw artifact path, or script path. If not, remove it or replace it with a TODO. Add a short `Removed Unsupported Claims` section only when you remove material.
 ## Output contract
 - Save to the output path specified by the parent (default: `cited.md`).
 - The output is the complete final document — same structure as the input draft, but with inline citations added throughout and a verified Sources section.
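The first item of the result provenance audit (numeric scores or percentages) could be pre-screened mechanically before the verifier reads the draft. The following is a minimal sketch under stated assumptions, not part of this change set: `findUnsupportedClaims`, both regular expressions, and the list of artifact extensions are hypothetical, and the heuristic only covers numbers, not benchmark names, figures, or superiority claims.

```typescript
// Hypothetical pre-screen: flag draft lines that contain a percentage or decimal
// number but no recognizable provenance marker on the same line.
const NUMERIC_CLAIM = /\d+(\.\d+)?\s*%|\b\d+\.\d+\b/;

// Assumed provenance markers: a URL, a bracketed citation like [3],
// or a path ending in a common artifact extension.
const PROVENANCE_MARKER = /https?:\/\/\S+|\[\d+\]|\b[\w./-]+\.(csv|json|log|py|ts|md)\b/;

function findUnsupportedClaims(draft: string): string[] {
  return draft
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => NUMERIC_CLAIM.test(line) && !PROVENANCE_MARKER.test(line));
}

const sampleDraft = [
  "Accuracy improved to 92.5% on the held-out set.",
  "Accuracy is 92.5% (results/eval.json).",
  "No experimental results are available yet.",
].join("\n");

// Only the first line is reported: it has a number and no source on the line.
console.log(findUnsupportedClaims(sampleDraft));
```

A line-level check like this will produce false positives (version strings, dates), so it is best treated as a list of candidates for the audit rather than as a removal rule.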

View File

@@ -15,6 +15,7 @@ You are Feynman's writing subagent.
 3. **Be explicit about gaps.** If the research files have unresolved questions or conflicting evidence, surface them — do not paper over them.
 4. **Do not promote draft text into fact.** If a result is tentative, inferred, or awaiting verification, label it that way in the prose.
 5. **No aesthetic laundering.** Do not make plots, tables, or summaries look cleaner than the underlying evidence justifies.
+6. **Follow the system prompt's provenance rule.** Missing results become gaps or TODOs, never plausible-looking data.
 ## Output structure
@@ -36,9 +37,10 @@ Unresolved issues, disagreements between sources, gaps in evidence.
 ## Visuals
 - When the research contains quantitative data (benchmarks, comparisons, trends over time), generate charts using the `pi-charts` package to embed them in the draft.
-- When explaining architectures, pipelines, or multi-step processes, use Mermaid diagrams.
-- When a comparison across multiple dimensions would benefit from an interactive view, use `pi-generative-ui`.
-- Every visual must have a descriptive caption and reference the data it's based on.
+- Do not create charts from invented or example data. If values are missing, describe the planned measurement instead.
+- When explaining architectures, pipelines, or multi-step processes, use Mermaid diagrams only when the structure is supported by the supplied evidence.
+- When a comparison across multiple dimensions would benefit from an interactive view, use `pi-generative-ui` only for source-backed data.
+- Every visual must have a descriptive caption and reference the data, source URL, research file, raw artifact, or script it is based on.
 - Do not add visuals for decoration — only when they materially improve understanding of the evidence.
 ## Operating rules
@@ -48,6 +50,7 @@ Unresolved issues, disagreements between sources, gaps in evidence.
 - Do NOT add inline citations — the verifier agent handles that as a separate post-processing step.
 - Do NOT add a Sources section — the verifier agent builds that.
 - Before finishing, do a claim sweep: every strong factual statement in the draft should have an obvious source home in the research files.
+- Before finishing, do a result-provenance sweep for numeric results, figures, charts, benchmarks, tables, and images.
 ## Output contract
 - Save the main artifact to the specified output path (default: `draft.md`).

View File

@@ -5,62 +5,64 @@ env:
 on:
   push:
-    tags:
-      - "v*"
+    branches: [main]
   workflow_dispatch:
-    inputs:
-      tag:
-        description: Existing git tag to publish and release (for example: v0.2.18)
-        required: true
-        type: string
 jobs:
-  verify:
+  version-check:
     runs-on: ubuntu-latest
     permissions:
       contents: read
     outputs:
-      tag: ${{ steps.meta.outputs.tag }}
-      version: ${{ steps.meta.outputs.version }}
+      version: ${{ steps.version.outputs.version }}
+      should_release: ${{ steps.version.outputs.should_release }}
     steps:
-      - name: Resolve release metadata
-        id: meta
+      - uses: actions/checkout@v6
+      - uses: actions/setup-node@v6
+        with:
+          node-version: 24
+          registry-url: "https://registry.npmjs.org"
+      - id: version
         shell: bash
         env:
-          INPUT_TAG: ${{ inputs.tag }}
-          REF_NAME: ${{ github.ref_name }}
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
         run: |
-          TAG="${INPUT_TAG:-$REF_NAME}"
-          VERSION="${TAG#v}"
-          echo "tag=$TAG" >> "$GITHUB_OUTPUT"
-          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
+          LOCAL=$(node -p "require('./package.json').version")
+          echo "version=$LOCAL" >> "$GITHUB_OUTPUT"
+          PUBLISHED=$(npm view @companion-ai/feynman version 2>/dev/null || true)
+          if [ "$PUBLISHED" = "$LOCAL" ] || gh release view "v$LOCAL" >/dev/null 2>&1; then
+            echo "should_release=false" >> "$GITHUB_OUTPUT"
+          else
+            echo "should_release=true" >> "$GITHUB_OUTPUT"
+          fi
+  verify:
+    needs: version-check
+    if: needs.version-check.outputs.should_release == 'true'
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    steps:
       - uses: actions/checkout@v6
-        with:
-          ref: refs/tags/${{ steps.meta.outputs.tag }}
       - uses: actions/setup-node@v6
         with:
           node-version: 24
           registry-url: "https://registry.npmjs.org"
       - run: npm ci
-      - name: Verify package version matches tag
-        shell: bash
-        run: |
-          ACTUAL="$(node -p "require('./package.json').version")"
-          EXPECTED="${{ steps.meta.outputs.version }}"
-          test "$ACTUAL" = "$EXPECTED"
       - run: npm test
       - run: npm pack
   publish-npm:
-    needs: verify
+    needs:
+      - version-check
+      - verify
+    if: needs.version-check.outputs.should_release == 'true' && needs.verify.result == 'success'
     runs-on: ubuntu-latest
     permissions:
       contents: read
       id-token: write
     steps:
       - uses: actions/checkout@v6
-        with:
-          ref: refs/tags/${{ needs.verify.outputs.tag }}
       - uses: actions/setup-node@v6
         with:
           node-version: 24
@@ -69,7 +71,8 @@ jobs:
       - run: npm publish --provenance --access public
   build-native-bundles:
-    needs: verify
+    needs: version-check
+    if: needs.version-check.outputs.should_release == 'true'
     strategy:
       fail-fast: false
       matrix:
@@ -87,8 +90,6 @@ jobs:
       contents: read
     steps:
       - uses: actions/checkout@v6
-        with:
-          ref: refs/tags/${{ needs.verify.outputs.tag }}
       - uses: actions/setup-node@v6
         with:
           node-version: 24
@@ -121,8 +122,10 @@ jobs:
   release-github:
     needs:
+      - version-check
       - publish-npm
       - build-native-bundles
+    if: needs.version-check.outputs.should_release == 'true' && needs.publish-npm.result == 'success' && needs.build-native-bundles.result == 'success'
     runs-on: ubuntu-latest
     permissions:
       contents: write
@@ -136,17 +139,18 @@ jobs:
         env:
           GH_REPO: ${{ github.repository }}
           GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          TAG: ${{ needs.verify.outputs.tag }}
+          VERSION: ${{ needs.version-check.outputs.version }}
         run: |
-          if gh release view "$TAG" >/dev/null 2>&1; then
-            gh release upload "$TAG" release-assets/* --clobber
-            gh release edit "$TAG" \
-              --title "$TAG" \
+          if gh release view "v$VERSION" >/dev/null 2>&1; then
+            gh release upload "v$VERSION" release-assets/* --clobber
+            gh release edit "v$VERSION" \
+              --title "v$VERSION" \
               --notes "Standalone Feynman bundles for native installation." \
               --draft=false \
               --latest
           else
-            gh release create "$TAG" release-assets/* \
-              --title "$TAG" \
-              --notes "Standalone Feynman bundles for native installation."
+            gh release create "v$VERSION" release-assets/* \
+              --title "v$VERSION" \
+              --notes "Standalone Feynman bundles for native installation." \
+              --target "$GITHUB_SHA"
           fi
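The gate computed by the `version-check` job reduces to one predicate: skip the release when the local `package.json` version is already what npm reports, or when a `v<version>` GitHub release already exists. A minimal TypeScript sketch of that decision follows, assuming the two lookups have already been performed; `ReleaseProbe` and `shouldRelease` are hypothetical names used for illustration and do not appear in the repository.

```typescript
// Hypothetical inputs: the results of the lookups the workflow performs in bash.
interface ReleaseProbe {
  localVersion: string;            // "version" field read from package.json
  publishedVersion: string | null; // output of `npm view <pkg> version`, null if the lookup failed
  githubReleaseExists: boolean;    // whether `gh release view v<version>` succeeded
}

// Mirrors the shell condition: release only when neither npm nor GitHub
// already has this exact version.
function shouldRelease(probe: ReleaseProbe): boolean {
  const alreadyOnNpm = probe.publishedVersion === probe.localVersion;
  return !(alreadyOnNpm || probe.githubReleaseExists);
}

// A version bump that has not been published anywhere yet.
console.log(shouldRelease({ localVersion: "0.2.21", publishedVersion: "0.2.18", githubReleaseExists: false }));

// The same version is already on npm, so every downstream job is skipped.
console.log(shouldRelease({ localVersion: "0.2.21", publishedVersion: "0.2.21", githubReleaseExists: false }));
```

Because every downstream job repeats the `should_release == 'true'` condition, a push to `main` that does not bump the version runs only the `version-check` job.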

View File

@@ -25,7 +25,7 @@ curl -fsSL https://feynman.is/install | bash
 irm https://feynman.is/install.ps1 | iex
 ```
-The one-line installer fetches the latest tagged release. To pin a version, pass it explicitly, for example `curl -fsSL https://feynman.is/install | bash -s -- 0.2.18`.
+The one-line installer fetches the latest tagged release. To pin a version, pass it explicitly, for example `curl -fsSL https://feynman.is/install | bash -s -- 0.2.21`.
 The installer downloads a standalone native bundle with its own Node.js runtime.
@@ -142,6 +142,18 @@ Built on [Pi](https://github.com/badlogic/pi-mono) for the agent runtime, [alpha
 ---
+### Star History
+<a href="https://www.star-history.com/?repos=getcompanion-ai%2Ffeynman&type=date&legend=top-left">
+<picture>
+<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/chart?repos=getcompanion-ai/feynman&type=date&theme=dark&legend=top-left" />
+<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/chart?repos=getcompanion-ai/feynman&type=date&legend=top-left" />
+<img alt="Star History Chart" src="https://api.star-history.com/chart?repos=getcompanion-ai/feynman&type=date&legend=top-left" />
+</picture>
+</a>
+---
 ### Contributing
 See [CONTRIBUTING.md](CONTRIBUTING.md) for the full contributor guide.

4
package-lock.json generated
View File

@@ -1,12 +1,12 @@
 {
   "name": "@companion-ai/feynman",
-  "version": "0.2.18",
+  "version": "0.2.21",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "@companion-ai/feynman",
-      "version": "0.2.18",
+      "version": "0.2.21",
       "hasInstallScript": true,
       "license": "MIT",
       "dependencies": {

View File

@@ -1,6 +1,6 @@
 {
   "name": "@companion-ai/feynman",
-  "version": "0.2.18",
+  "version": "0.2.21",
   "description": "Research-first CLI agent built on Pi and alphaXiv",
   "license": "MIT",
   "type": "module",

View File

@@ -13,7 +13,8 @@ Requirements:
 - Use the `writer` subagent when the draft should be produced from already-collected notes, then use the `verifier` subagent to add inline citations and verify sources.
 - Include at minimum: title, abstract, problem statement, related work, method or synthesis, evidence or experiments, limitations, conclusion.
 - Use clean Markdown with LaTeX where equations materially help.
-- Generate charts with `pi-charts` for quantitative data, benchmarks, and comparisons. Use Mermaid for architectures and pipelines. Every figure needs a caption.
+- Follow the system prompt's provenance rules for all results, figures, charts, images, tables, benchmarks, and quantitative comparisons. If evidence is missing, leave a placeholder or proposed experimental plan instead of claiming an outcome.
+- Generate charts with `pi-charts` only for source-backed quantitative data, benchmarks, and comparisons. Use Mermaid for architectures and pipelines only when the structure is supported by sources. Every figure needs a provenance-bearing caption.
 - Before delivery, sweep the draft for any claim that sounds stronger than its support. Mark tentative results as tentative and remove unsupported numerics instead of letting the verifier discover them later.
 - Save exactly one draft to `papers/<slug>.md`.
 - End with a `Sources` appendix with direct URLs for all primary references.

View File

@@ -110,7 +110,7 @@ This usually means the release exists, but not all platform bundles were uploade
 Workarounds:
 - try again after the release finishes publishing
 - pass the latest published version explicitly, e.g.:
-& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.18
+& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.21
 "@
 }

View File

@@ -261,7 +261,7 @@ This usually means the release exists, but not all platform bundles were uploade
 Workarounds:
 - try again after the release finishes publishing
 - pass the latest published version explicitly, e.g.:
-curl -fsSL https://feynman.is/install | bash -s -- 0.2.18
+curl -fsSL https://feynman.is/install | bash -s -- 0.2.21
 EOF
 exit 1
 fi

View File

@@ -260,6 +260,23 @@ function ensureParentDir(path) {
   mkdirSync(dirname(path), { recursive: true });
 }
+function packageDependencyExists(packagePath, globalNodeModulesRoot, dependency) {
+  return existsSync(resolve(packagePath, "node_modules", dependency)) ||
+    existsSync(resolve(globalNodeModulesRoot, dependency));
+}
+function installedPackageLooksUsable(packagePath, globalNodeModulesRoot) {
+  if (!existsSync(resolve(packagePath, "package.json"))) return false;
+  try {
+    const pkg = JSON.parse(readFileSync(resolve(packagePath, "package.json"), "utf8"));
+    return Object.keys(pkg.dependencies ?? {}).every((dependency) =>
+      packageDependencyExists(packagePath, globalNodeModulesRoot, dependency)
+    );
+  } catch {
+    return false;
+  }
+}
 function linkPointsTo(linkPath, targetPath) {
   try {
     if (!lstatSync(linkPath).isSymbolicLink()) return false;
@@ -281,6 +298,8 @@ function ensureBundledPackageLinks(packageSpecs) {
     try {
       if (lstatSync(targetPath).isSymbolicLink()) {
         rmSync(targetPath, { force: true });
+      } else if (!installedPackageLooksUsable(targetPath, globalNodeModulesRoot)) {
+        rmSync(targetPath, { recursive: true, force: true });
       }
     } catch {}
     if (existsSync(targetPath)) continue;

View File

@@ -1,11 +1,41 @@
 import { dirname, resolve } from "node:path";
 import { AuthStorage, ModelRegistry } from "@mariozechner/pi-coding-agent";
+import { getModels } from "@mariozechner/pi-ai";
+import { anthropicOAuthProvider } from "@mariozechner/pi-ai/oauth";
 export function getModelsJsonPath(authPath: string): string {
   return resolve(dirname(authPath), "models.json");
 }
-export function createModelRegistry(authPath: string): ModelRegistry {
-  return ModelRegistry.create(AuthStorage.create(authPath), getModelsJsonPath(authPath));
+function registerFeynmanModelOverlays(modelRegistry: ModelRegistry): void {
+  const anthropicModels = getModels("anthropic");
+  if (anthropicModels.some((model) => model.id === "claude-opus-4-7")) {
+    return;
+  }
+  const opus46 = anthropicModels.find((model) => model.id === "claude-opus-4-6");
+  if (!opus46) {
+    return;
+  }
+  modelRegistry.registerProvider("anthropic", {
+    baseUrl: "https://api.anthropic.com",
+    api: "anthropic-messages",
+    oauth: anthropicOAuthProvider,
+    models: [
+      ...anthropicModels,
+      {
+        ...opus46,
+        id: "claude-opus-4-7",
+        name: "Claude Opus 4.7",
+      },
+    ],
+  });
+}
+export function createModelRegistry(authPath: string): ModelRegistry {
+  const registry = ModelRegistry.create(AuthStorage.create(authPath), getModelsJsonPath(authPath));
+  registerFeynmanModelOverlays(registry);
+  return registry;
 }
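The overlay above follows a clone-and-rename pattern: if the upstream catalog already lists the new model, do nothing; otherwise copy an existing entry and change only its `id` and `name`. The sketch below isolates that pattern on plain objects so it can be run without the Pi packages. `ModelInfo`, `withOverlay`, and the `contextWindow` value are hypothetical placeholders, not the real Pi types or model metadata.

```typescript
// Hypothetical minimal model shape; the real Pi model type carries more fields.
interface ModelInfo {
  id: string;
  name: string;
  contextWindow: number;
}

// Returns the catalog with the overlay appended as a clone of `baseId`, or the
// catalog unchanged when the overlay id already exists or the base is missing.
function withOverlay(
  catalog: ModelInfo[],
  baseId: string,
  overlay: Pick<ModelInfo, "id" | "name">,
): ModelInfo[] {
  if (catalog.some((model) => model.id === overlay.id)) return catalog;
  const base = catalog.find((model) => model.id === baseId);
  if (!base) return catalog;
  return [...catalog, { ...base, ...overlay }];
}

// Placeholder catalog; the contextWindow number is made up for the example.
const upstream: ModelInfo[] = [{ id: "claude-opus-4-6", name: "Claude Opus 4.6", contextWindow: 1000 }];

const patched = withOverlay(upstream, "claude-opus-4-6", { id: "claude-opus-4-7", name: "Claude Opus 4.7" });
console.log(patched.map((model) => model.id));
```

The early return on an existing id is what makes the overlay safe to keep after the upstream catalog catches up: at that point the function becomes a no-op.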

View File

@@ -1,5 +1,5 @@
 import { spawn } from "node:child_process";
-import { cpSync, existsSync, lstatSync, mkdirSync, readlinkSync, rmSync, symlinkSync, writeFileSync } from "node:fs";
+import { cpSync, existsSync, lstatSync, mkdirSync, readFileSync, readlinkSync, rmSync, symlinkSync, writeFileSync } from "node:fs";
 import { fileURLToPath } from "node:url";
 import { dirname, join, resolve } from "node:path";
@@ -423,6 +423,47 @@ function linkDirectory(linkPath: string, targetPath: string): void {
   }
 }
+function packageNameToPath(root: string, packageName: string): string {
+  return resolve(root, packageName);
+}
+function packageDependencyExists(packagePath: string, globalNodeModulesRoot: string, dependency: string): boolean {
+  return existsSync(packageNameToPath(resolve(packagePath, "node_modules"), dependency)) ||
+    existsSync(packageNameToPath(globalNodeModulesRoot, dependency));
+}
+function installedPackageLooksUsable(packagePath: string, globalNodeModulesRoot: string): boolean {
+  if (!existsSync(resolve(packagePath, "package.json"))) {
+    return false;
+  }
+  try {
+    const pkg = JSON.parse(readFileSync(resolve(packagePath, "package.json"), "utf8")) as {
+      dependencies?: Record<string, string>;
+    };
+    const dependencies = Object.keys(pkg.dependencies ?? {});
+    return dependencies.every((dependency) => packageDependencyExists(packagePath, globalNodeModulesRoot, dependency));
+  } catch {
+    return false;
+  }
+}
+function replaceBrokenPackageWithBundledCopy(targetPath: string, bundledPackagePath: string, globalNodeModulesRoot: string): boolean {
+  if (!existsSync(targetPath)) {
+    return false;
+  }
+  if (pathsMatchSymlinkTarget(targetPath, bundledPackagePath)) {
+    return false;
+  }
+  if (installedPackageLooksUsable(targetPath, globalNodeModulesRoot)) {
+    return false;
+  }
+  rmSync(targetPath, { recursive: true, force: true });
+  linkDirectory(targetPath, bundledPackagePath);
+  return true;
+}
 export function seedBundledWorkspacePackages(
   agentDir: string,
   appRoot: string,
@@ -446,6 +487,10 @@ export function seedBundledWorkspacePackages(
     if (!existsSync(bundledPackagePath)) continue;
     const targetPath = resolve(globalNodeModulesRoot, parsed.name);
+    if (replaceBrokenPackageWithBundledCopy(targetPath, bundledPackagePath, globalNodeModulesRoot)) {
+      seeded.push(source);
+      continue;
+    }
     if (!existsSync(targetPath)) {
       linkDirectory(targetPath, bundledPackagePath);
       seeded.push(source);
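The usability check treats an installed package as broken when its `package.json` is missing or when any declared dependency can be found neither nested under the package nor hoisted to the global `node_modules` root. The following sketch replays that rule against an in-memory set of paths instead of the real filesystem, which makes the three cases easy to see. `looksUsable`, `PathSet`, and the example paths are hypothetical, and paths are joined with plain string concatenation as a simplification of `resolve`.

```typescript
// Hypothetical in-memory stand-in for the filesystem: the set of paths that exist.
type PathSet = ReadonlySet<string>;

function looksUsable(
  packagePath: string,
  globalRoot: string,
  dependencies: string[],
  existing: PathSet,
): boolean {
  if (!existing.has(`${packagePath}/package.json`)) return false;
  // A dependency counts as present if it is nested under the package
  // or hoisted to the global node_modules root.
  return dependencies.every(
    (dep) => existing.has(`${packagePath}/node_modules/${dep}`) || existing.has(`${globalRoot}/${dep}`),
  );
}

const root = "/home/user/npm-global/lib/node_modules"; // example path, not from the source
const pkg = `${root}/pi-markdown-preview`;

// Dependency hoisted to the global root: usable.
console.log(looksUsable(pkg, root, ["puppeteer-core"], new Set([`${pkg}/package.json`, `${root}/puppeteer-core`])));

// Dependency missing everywhere: broken, so the bundled copy would replace it.
console.log(looksUsable(pkg, root, ["puppeteer-core"], new Set([`${pkg}/package.json`])));
```

The check is existence-only: it does not compare installed versions against the declared ranges, so a present but incompatible dependency would still count as usable under this rule.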

View File

@@ -30,3 +30,30 @@ test("bundled prompts and skills do not contain blocked promotional product cont
     }
   }
 });
+test("research writing prompts forbid fabricated results and unproven figures", () => {
+  const draftPrompt = readFileSync(join(repoRoot, "prompts", "draft.md"), "utf8");
+  const systemPrompt = readFileSync(join(repoRoot, ".feynman", "SYSTEM.md"), "utf8");
+  const writerPrompt = readFileSync(join(repoRoot, ".feynman", "agents", "writer.md"), "utf8");
+  const verifierPrompt = readFileSync(join(repoRoot, ".feynman", "agents", "verifier.md"), "utf8");
+  for (const [label, content] of [
+    ["system prompt", systemPrompt],
+  ] as const) {
+    assert.match(content, /Never (invent|fabricate)/i, `${label} must explicitly forbid invented or fabricated results`);
+    assert.match(content, /(figure|chart|image|table)/i, `${label} must cover visual/table provenance`);
+    assert.match(content, /(provenance|source|artifact|script|raw)/i, `${label} must require traceable support`);
+  }
+  for (const [label, content] of [
+    ["writer prompt", writerPrompt],
+    ["verifier prompt", verifierPrompt],
+    ["draft prompt", draftPrompt],
+  ] as const) {
+    assert.match(content, /system prompt.*provenance rule/i, `${label} must point back to the system provenance rule`);
+  }
+  assert.match(draftPrompt, /system prompt's provenance rules/i);
+  assert.match(draftPrompt, /placeholder or proposed experimental plan/i);
+  assert.match(draftPrompt, /source-backed quantitative data/i);
+});

View File

@@ -7,6 +7,7 @@ import { join } from "node:path";
 import { resolveInitialPrompt, shouldRunInteractiveSetup } from "../src/cli.js";
 import { buildModelStatusSnapshotFromRecords, chooseRecommendedModel } from "../src/model/catalog.js";
 import { resolveModelProviderForCommand, setDefaultModelSpec } from "../src/model/commands.js";
+import { createModelRegistry } from "../src/model/registry.js";
 function createAuthPath(contents: Record<string, unknown>): string {
   const root = mkdtempSync(join(tmpdir(), "feynman-auth-"));
@@ -26,6 +27,17 @@ test("chooseRecommendedModel prefers the strongest authenticated research model"
   assert.equal(recommendation?.spec, "anthropic/claude-opus-4-6");
 });
+test("createModelRegistry overlays new Anthropic Opus model before upstream Pi updates", () => {
+  const authPath = createAuthPath({
+    anthropic: { type: "api_key", key: "anthropic-test-key" },
+  });
+  const registry = createModelRegistry(authPath);
+  assert.ok(registry.find("anthropic", "claude-opus-4-7"));
+  assert.equal(registry.getAvailable().some((model) => model.provider === "anthropic" && model.id === "claude-opus-4-7"), true);
+});
 test("setDefaultModelSpec accepts a unique bare model id from authenticated models", () => {
   const authPath = createAuthPath({
     openai: { type: "api_key", key: "openai-test-key" },

View File

@@ -6,13 +6,17 @@ import { join, resolve } from "node:path";
 import { installPackageSources, seedBundledWorkspacePackages, updateConfiguredPackages } from "../src/pi/package-ops.js";
-function createBundledWorkspace(appRoot: string, packageNames: string[]): void {
+function createBundledWorkspace(
+  appRoot: string,
+  packageNames: string[],
+  dependenciesByPackage: Record<string, Record<string, string>> = {},
+): void {
   for (const packageName of packageNames) {
     const packageDir = resolve(appRoot, ".feynman", "npm", "node_modules", packageName);
     mkdirSync(packageDir, { recursive: true });
     writeFileSync(
       join(packageDir, "package.json"),
-      JSON.stringify({ name: packageName, version: "1.0.0" }, null, 2) + "\n",
+      JSON.stringify({ name: packageName, version: "1.0.0", dependencies: dependenciesByPackage[packageName] }, null, 2) + "\n",
       "utf8",
     );
   }
@@ -76,6 +80,33 @@ test("seedBundledWorkspacePackages preserves existing installed packages", () =>
   assert.equal(lstatSync(existingPackageDir).isSymbolicLink(), false);
 });
+test("seedBundledWorkspacePackages repairs broken existing bundled packages", () => {
+  const appRoot = mkdtempSync(join(tmpdir(), "feynman-bundle-"));
+  const homeRoot = mkdtempSync(join(tmpdir(), "feynman-home-"));
+  const agentDir = resolve(homeRoot, "agent");
+  const existingPackageDir = resolve(homeRoot, "npm-global", "lib", "node_modules", "pi-markdown-preview");
+  mkdirSync(agentDir, { recursive: true });
+  createBundledWorkspace(appRoot, ["pi-markdown-preview", "puppeteer-core"], {
+    "pi-markdown-preview": { "puppeteer-core": "^24.0.0" },
+  });
+  mkdirSync(existingPackageDir, { recursive: true });
+  writeFileSync(
+    resolve(existingPackageDir, "package.json"),
+    JSON.stringify({ name: "pi-markdown-preview", version: "broken", dependencies: { "puppeteer-core": "^24.0.0" } }) + "\n",
+    "utf8",
+  );
+  const seeded = seedBundledWorkspacePackages(agentDir, appRoot, ["npm:pi-markdown-preview"]);
+  assert.deepEqual(seeded, ["npm:pi-markdown-preview"]);
+  assert.equal(lstatSync(existingPackageDir).isSymbolicLink(), true);
+  assert.equal(
+    readFileSync(resolve(existingPackageDir, "package.json"), "utf8").includes('"version": "1.0.0"'),
+    true,
+  );
+});
 test("installPackageSources filters noisy npm chatter but preserves meaningful output", async () => {
   const root = mkdtempSync(join(tmpdir(), "feynman-package-ops-"));
   const workingDir = resolve(root, "project");

View File

@@ -261,7 +261,7 @@ This usually means the release exists, but not all platform bundles were uploade
 Workarounds:
 - try again after the release finishes publishing
 - pass the latest published version explicitly, e.g.:
-curl -fsSL https://feynman.is/install | bash -s -- 0.2.18
+curl -fsSL https://feynman.is/install | bash -s -- 0.2.21
 EOF
 exit 1
 fi


@@ -110,7 +110,7 @@ This usually means the release exists, but not all platform bundles were uploade
 Workarounds:
 - try again after the release finishes publishing
 - pass the latest published version explicitly, e.g.:
-& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.18
+& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.21
 "@
 }

View File

@@ -117,13 +117,13 @@ These installers download the bundled `skills/` and `prompts/` trees plus the re
 The one-line installer already targets the latest tagged release. To pin an exact version, pass it explicitly:
 ```bash
-curl -fsSL https://feynman.is/install | bash -s -- 0.2.18
+curl -fsSL https://feynman.is/install | bash -s -- 0.2.21
 ```
 On Windows:
 ```powershell
-& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.18
+& ([scriptblock]::Create((irm https://feynman.is/install.ps1))) -Version 0.2.21
 ```
 ## Post-install setup

View File

@@ -35,6 +35,8 @@ When working from existing session context (after a deep research or literature
 The writer pays attention to academic conventions: claims are attributed to their sources with inline citations, methodology sections describe procedures precisely, and limitations are discussed honestly. The draft includes placeholder sections for any content the writer cannot generate from available sources, clearly marking what needs human input.
+Drafts follow Feynman's system-wide provenance rules: unsupported results, figures, images, tables, or benchmark data should become clearly labeled gaps or TODOs, not plausible-looking claims.
 ## Output format
 The draft follows standard academic structure: