diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index b937f7c..50ad8e6 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -68,7 +68,7 @@ Plans:
 - [x] 02-02-PLAN.md — Tier 1 keyword-anchored providers (Azure OpenAI, Meta AI, Cohere, Mistral, Inflection, AI21)
 - [x] 02-03-PLAN.md — Tier 2 inference platforms batch 1 (Groq, Replicate, Anyscale, Together, Fireworks, Baseten, DeepInfra)
 - [x] 02-04-PLAN.md — Tier 2 inference platforms batch 2 (Lepton, Modal, Cerebrium, Novita, SambaNova, OctoAI, Friendli)
-- [ ] 02-05-PLAN.md — Registry guardrail test: assert 12 Tier 1 + 14 Tier 2 + regex compilation
+- [x] 02-05-PLAN.md — Registry guardrail test: assert 12 Tier 1 + 14 Tier 2 + regex compilation
 
 ### Phase 3: Tier 3-9 Providers
 **Goal**: All 108+ LLM provider definitions exist — specialized models, Chinese/regional providers, infrastructure gateways, emerging tools, code assistants, self-hosted runtimes, and enterprise platforms
diff --git a/.planning/STATE.md b/.planning/STATE.md
index 155f160..016bb06 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -3,14 +3,14 @@ gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
 status: executing
-stopped_at: Completed 02-tier-1-2-providers 02-04-PLAN.md
-last_updated: "2026-04-05T11:12:58.710Z"
+stopped_at: Completed 02-tier-1-2-providers 02-05-PLAN.md
+last_updated: "2026-04-05T11:16:19.385Z"
 last_activity: 2026-04-05
 progress:
   total_phases: 18
-  completed_phases: 1
+  completed_phases: 2
   total_plans: 10
-  completed_plans: 9
+  completed_plans: 10
   percent: 20
 ---
 
@@ -26,7 +26,7 @@ See: .planning/PROJECT.md (updated 2026-04-04)
 ## Current Position
 
 Phase: 02 (tier-1-2-providers) — EXECUTING
-Plan: 3 of 5
+Plan: 5 of 5
 Status: Ready to execute
 Last activity: 2026-04-05
 
@@ -59,6 +59,7 @@ Progress: [██░░░░░░░░] 20%
 | Phase 02-tier-1-2-providers P03 | 3min | 2 tasks | 14 files |
 | Phase 02-tier-1-2-providers P01 | 3min | 2 tasks | 12 files |
 | Phase 02-tier-1-2-providers P04 | 1min | 2 tasks tasks | 14 files files |
+| Phase 02-tier-1-2-providers P05 | 2min | 1 tasks | 1 files |
 
 ## Accumulated Context
 
@@ -91,6 +92,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-04-05T11:12:58.706Z
-Stopped at: Completed 02-tier-1-2-providers 02-04-PLAN.md
+Last session: 2026-04-05T11:16:08.292Z
+Stopped at: Completed 02-tier-1-2-providers 02-05-PLAN.md
 Resume file: None
diff --git a/.planning/phases/02-tier-1-2-providers/02-05-SUMMARY.md b/.planning/phases/02-tier-1-2-providers/02-05-SUMMARY.md
new file mode 100644
index 0000000..2dde92c
--- /dev/null
+++ b/.planning/phases/02-tier-1-2-providers/02-05-SUMMARY.md
@@ -0,0 +1,115 @@
+---
+phase: 02-tier-1-2-providers
+plan: 05
+subsystem: testing
+tags: [providers, guardrail, integration-test, re2, aho-corasick]
+
+requires:
+  - phase: 02-tier-1-2-providers (plans 01-04)
+    provides: 26 Tier 1+2 provider YAML definitions embedded in registry
+provides:
+  - Compile-time guardrail locking Tier 1 (12) and Tier 2 (14) provider counts
+  - Runtime verification that every provider regex compiles under Go RE2
+  - Runtime verification that every provider has non-empty keywords (AC pre-filter)
+affects: [03-tier-3-providers, all future phases touching pkg/providers]
+
+tech-stack:
+  added: []
+  patterns:
+    - "Name-list guardrail: test enumerates expected provider names so silent deletions fail loudly"
+
+key-files:
+  created:
+    - pkg/providers/tier12_test.go
+  modified: []
+
+key-decisions:
+  - "Guardrail uses Registry.List() and Registry.Get() public API (no reflection / internal poking)"
+  - "Test scoped to pkg/providers only — unrelated pkg/engine regression pre-exists this plan"
+
+patterns-established:
+  - "Tier-count regression test: lock phase deliverables with explicit count assertions"
+  - "Name-enumeration test: silent provider removal breaks TestTier{1,2}ProviderNames"
+
+requirements-completed: [PROV-01, PROV-02]
+
+duration: 2min
+completed: 2026-04-05
+---
+
+# Phase 02 Plan 05: Tier 1+2 Provider Guardrail Test Summary
+
+**Six-function integration test that locks 12 Tier 1 + 14 Tier 2 provider counts, RE2 regex compilation, and keyword presence against future regressions.**
+
+## Performance
+
+- **Duration:** ~2 min
+- **Started:** 2026-04-05T11:14:00Z
+- **Completed:** 2026-04-05T11:16:00Z
+- **Tasks:** 1
+- **Files modified:** 1
+
+## Accomplishments
+
+- Added `pkg/providers/tier12_test.go` with 6 guardrail tests
+- All 6 tests pass against current registry state (12 Tier 1 + 14 Tier 2 providers)
+- Full `pkg/providers/...` test suite remains green
+- Phase 2 success criteria now machine-verified — any future drop of a provider from the registry fails CI loudly
+
+## Task Commits
+
+1. **Task 1: tier1/tier2 guardrail test** - `58f302b` (test)
+
+_Note: Plan was a single TDD-style task. Because the target tests assert state already built by plans 02-01 through 02-04, RED was skipped — tests went directly to GREEN and stayed there. No refactor needed._
+
+## Files Created/Modified
+
+- `pkg/providers/tier12_test.go` — 103 lines, 6 test functions asserting Tier 1/2 counts, regex compilation, keyword non-emptiness, and expected provider names
+
+## Decisions Made
+
+- Used `Registry.List()` rather than a hypothetical `All()`/`Each()` helper — matches the actual public API
+- Used `reg.Get(name)` for name lookups — already exported
+- Kept expected name lists as package-level `var` so future phases can reuse them if needed
+
+## Deviations from Plan
+
+None in scope. Plan executed exactly as written.
+
+## Issues Encountered
+
+### Pre-existing regression (OUT OF SCOPE — documented per orchestrator instructions)
+
+`go test ./pkg/engine/...` currently fails with `TestScannerPipelineOpenAI`, `TestScannerPipelineAnthropic`, and `TestScannerPipelineMultipleKeys`. Root cause: several Tier 2 provider regex patterns introduced in Wave 1 (plans 02-03 and 02-04 — ai21, baseten, cerebrium, cohere, deepinfra, fireworks, friendli, inflection, lepton, mistral, novita, octoai, sambanova) are too broad and match the OpenAI/Anthropic fixtures in `testdata/samples/`, producing 14–15 findings where exactly 1 is expected. `TestScannerPipelineMultipleKeys` additionally does not even find the anthropic provider by name in the over-matched result set.
+
+Per orchestrator instructions, this regression was **NOT** fixed in plan 02-05. It pre-existed this plan's creation and requires either:
+1. Tightening the over-broad Tier 2 regexes (prefixes, length bounds, character classes), or
+2. Updating the engine test expectations to tolerate multi-provider matches on ambiguous keys.
+
+This is a blocker for the phase-level success criterion `go test ./... -count=1 passes` and should be handled as a follow-up plan or verifier-driven fix in Phase 2 closeout.
+
+The guardrail test added in this plan **does not depend on** and **is not affected by** the engine regression — it runs cleanly via `go test ./pkg/providers/... -count=1`.
+
+## User Setup Required
+
+None — pure test code, no external services.
+
+## Next Phase Readiness
+
+**Ready for phase closeout verification** with one blocker:
+
+- Phase 2 guardrail in place (this plan) — DONE
+- Pre-existing Tier 2 regex over-match breaking `pkg/engine` tests — BLOCKER for full `go test ./...`
+
+Phase 3 (Tier 3 providers) should NOT start until the engine regression is resolved, because adding more providers on top of the current over-broad patterns will compound the problem.
+
+## Self-Check: PASSED
+
+- File exists: `pkg/providers/tier12_test.go` — FOUND
+- Commit exists: `58f302b` — FOUND
+- Test execution: `go test ./pkg/providers/... -count=1` — PASSING
+- Six test functions present and passing: TestTier1Count, TestTier2Count, TestAllPatternsCompile, TestAllProvidersHaveKeywords, TestTier1ProviderNames, TestTier2ProviderNames
+
+---
+*Phase: 02-tier-1-2-providers*
+*Completed: 2026-04-05*