docs(02-05): complete tier1/tier2 guardrail test plan
Adds guardrail summary and advances phase 02 state. Notes pre-existing Tier 2 regex over-match regression in pkg/engine as a phase-2 blocker to be handled in a follow-up plan.
@@ -68,7 +68,7 @@ Plans:
 - [x] 02-02-PLAN.md — Tier 1 keyword-anchored providers (Azure OpenAI, Meta AI, Cohere, Mistral, Inflection, AI21)
 - [x] 02-03-PLAN.md — Tier 2 inference platforms batch 1 (Groq, Replicate, Anyscale, Together, Fireworks, Baseten, DeepInfra)
 - [x] 02-04-PLAN.md — Tier 2 inference platforms batch 2 (Lepton, Modal, Cerebrium, Novita, SambaNova, OctoAI, Friendli)
-- [ ] 02-05-PLAN.md — Registry guardrail test: assert 12 Tier 1 + 14 Tier 2 + regex compilation
+- [x] 02-05-PLAN.md — Registry guardrail test: assert 12 Tier 1 + 14 Tier 2 + regex compilation

 ### Phase 3: Tier 3-9 Providers
 **Goal**: All 108+ LLM provider definitions exist — specialized models, Chinese/regional providers, infrastructure gateways, emerging tools, code assistants, self-hosted runtimes, and enterprise platforms

@@ -3,14 +3,14 @@ gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
 status: executing
-stopped_at: Completed 02-tier-1-2-providers 02-04-PLAN.md
-last_updated: "2026-04-05T11:12:58.710Z"
+stopped_at: Completed 02-tier-1-2-providers 02-05-PLAN.md
+last_updated: "2026-04-05T11:16:19.385Z"
 last_activity: 2026-04-05
 progress:
   total_phases: 18
-  completed_phases: 1
+  completed_phases: 2
   total_plans: 10
-  completed_plans: 9
+  completed_plans: 10
   percent: 20
 ---

@@ -26,7 +26,7 @@ See: .planning/PROJECT.md (updated 2026-04-04)
 ## Current Position

 Phase: 02 (tier-1-2-providers) — EXECUTING
-Plan: 3 of 5
+Plan: 5 of 5
 Status: Ready to execute
 Last activity: 2026-04-05

@@ -59,6 +59,7 @@ Progress: [██░░░░░░░░] 20%
 | Phase 02-tier-1-2-providers P03 | 3min | 2 tasks | 14 files |
 | Phase 02-tier-1-2-providers P01 | 3min | 2 tasks | 12 files |
 | Phase 02-tier-1-2-providers P04 | 1min | 2 tasks | 14 files |
+| Phase 02-tier-1-2-providers P05 | 2min | 1 task | 1 file |

 ## Accumulated Context

@@ -91,6 +92,6 @@ None yet.

 ## Session Continuity

-Last session: 2026-04-05T11:12:58.706Z
-Stopped at: Completed 02-tier-1-2-providers 02-04-PLAN.md
+Last session: 2026-04-05T11:16:08.292Z
+Stopped at: Completed 02-tier-1-2-providers 02-05-PLAN.md
 Resume file: None

115
.planning/phases/02-tier-1-2-providers/02-05-SUMMARY.md
Normal file
@@ -0,0 +1,115 @@
---
phase: 02-tier-1-2-providers
plan: 05
subsystem: testing
tags: [providers, guardrail, integration-test, re2, aho-corasick]

requires:
  - phase: 02-tier-1-2-providers (plans 01-04)
    provides: 26 Tier 1+2 provider YAML definitions embedded in registry
provides:
  - Test-time guardrail locking Tier 1 (12) and Tier 2 (14) provider counts
  - Runtime verification that every provider regex compiles under Go RE2
  - Runtime verification that every provider has non-empty keywords (AC pre-filter)
affects: [03-tier-3-providers, all future phases touching pkg/providers]

tech-stack:
  added: []
  patterns:
    - "Name-list guardrail: test enumerates expected provider names so silent deletions fail loudly"

key-files:
  created:
    - pkg/providers/tier12_test.go
  modified: []

key-decisions:
  - "Guardrail uses Registry.List() and Registry.Get() public API (no reflection / internal poking)"
  - "Test scoped to pkg/providers only — unrelated pkg/engine regression pre-exists this plan"

patterns-established:
  - "Tier-count regression test: lock phase deliverables with explicit count assertions"
  - "Name-enumeration test: silent provider removal breaks TestTier{1,2}ProviderNames"

requirements-completed: [PROV-01, PROV-02]

duration: 2min
completed: 2026-04-05
---

# Phase 02 Plan 05: Tier 1+2 Provider Guardrail Test Summary

**Six-function integration test that locks 12 Tier 1 + 14 Tier 2 provider counts, RE2 regex compilation, and keyword presence against future regressions.**

## Performance

- **Duration:** ~2 min
- **Started:** 2026-04-05T11:14:00Z
- **Completed:** 2026-04-05T11:16:00Z
- **Tasks:** 1
- **Files modified:** 1

## Accomplishments

- Added `pkg/providers/tier12_test.go` with 6 guardrail tests
- All 6 tests pass against current registry state (12 Tier 1 + 14 Tier 2 providers)
- Full `pkg/providers/...` test suite remains green
- Phase 2 success criteria now machine-verified — any future drop of a provider from the registry fails CI loudly

## Task Commits

1. **Task 1: tier1/tier2 guardrail test** - `58f302b` (test)

_Note: Plan was a single TDD-style task. Because the target tests assert state already built by plans 02-01 through 02-04, RED was skipped — tests went directly to GREEN and stayed there. No refactor needed._

## Files Created/Modified

- `pkg/providers/tier12_test.go` — 103 lines, 6 test functions asserting Tier 1/2 counts, regex compilation, keyword non-emptiness, and expected provider names
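
The regex-compilation and keyword checks described above could look roughly like the following. This is a minimal sketch, not the contents of `tier12_test.go`: the `Provider` struct, its field names, and the `checkProvider` helper are assumptions made for illustration, and the `exk_` prefix is invented. The only behavior taken as fact is that Go's `regexp` package implements RE2 syntax and therefore rejects constructs such as lookahead.

```go
package main

import (
	"fmt"
	"regexp"
)

// Provider is a hypothetical stand-in for a registry entry; the real
// struct in pkg/providers may use different fields.
type Provider struct {
	Name     string
	Keywords []string
	Patterns []string
}

// checkProvider returns one message per problem found: an empty keyword
// list, or a pattern that Go's RE2-based regexp package refuses to compile.
func checkProvider(p Provider) []string {
	var problems []string
	if len(p.Keywords) == 0 {
		problems = append(problems, fmt.Sprintf("%s: no keywords", p.Name))
	}
	for _, pat := range p.Patterns {
		if _, err := regexp.Compile(pat); err != nil {
			problems = append(problems, fmt.Sprintf("%s: pattern %q: %v", p.Name, pat, err))
		}
	}
	return problems
}

func main() {
	providers := []Provider{
		{Name: "example-ok", Keywords: []string{"exk_"}, Patterns: []string{`exk_[A-Za-z0-9]{32}`}},
		// Lookahead is accepted by PCRE-style engines but not by RE2,
		// so regexp.Compile returns an error for this pattern.
		{Name: "example-bad", Patterns: []string{`(?=exk_)[A-Za-z0-9_]{36}`}},
	}
	for _, p := range providers {
		for _, msg := range checkProvider(p) {
			fmt.Println(msg)
		}
	}
}
```

In a real test the same loop would run over the registry's provider list and call `t.Errorf` for each problem instead of printing.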

## Decisions Made

- Used `Registry.List()` rather than a hypothetical `All()`/`Each()` helper — matches the actual public API
- Used `reg.Get(name)` for name lookups — already exported
- Kept expected name lists as package-level `var` so future phases can reuse them if needed
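
As an illustration of the decisions above, a count-plus-name guardrail built only on `List()` and `Get()` might be shaped like this. The `Registry` and `Provider` types here are hypothetical stand-ins, the return signature of `Get` is an assumption, and `expectedTier1` is a two-name subset chosen for brevity rather than the real list of 12.

```go
package main

import (
	"fmt"
	"sort"
)

// Provider and Registry are hypothetical stand-ins for the types in
// pkg/providers; only the method names List and Get come from the summary.
type Provider struct {
	Name string
	Tier int
}

type Registry struct{ byName map[string]Provider }

func NewRegistry(ps ...Provider) *Registry {
	r := &Registry{byName: make(map[string]Provider)}
	for _, p := range ps {
		r.byName[p.Name] = p
	}
	return r
}

// List returns every provider, sorted by name for stable output.
func (r *Registry) List() []Provider {
	out := make([]Provider, 0, len(r.byName))
	for _, p := range r.byName {
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// Get looks a provider up by name (assumed signature).
func (r *Registry) Get(name string) (Provider, bool) {
	p, ok := r.byName[name]
	return p, ok
}

// expectedTier1 is a package-level list so other tests can reuse it.
// Illustrative subset only.
var expectedTier1 = []string{"cohere", "mistral"}

// checkTier reports a count mismatch and every expected name that is
// missing or filed under the wrong tier.
func checkTier(reg *Registry, tier int, expected []string) []string {
	var problems []string
	count := 0
	for _, p := range reg.List() {
		if p.Tier == tier {
			count++
		}
	}
	if count != len(expected) {
		problems = append(problems, fmt.Sprintf("tier %d: got %d providers, want %d", tier, count, len(expected)))
	}
	for _, name := range expected {
		p, ok := reg.Get(name)
		if !ok {
			problems = append(problems, fmt.Sprintf("tier %d: provider %q missing", tier, name))
		} else if p.Tier != tier {
			problems = append(problems, fmt.Sprintf("provider %q is tier %d, want %d", name, p.Tier, tier))
		}
	}
	return problems
}

func main() {
	reg := NewRegistry(Provider{Name: "cohere", Tier: 1}, Provider{Name: "groq", Tier: 2})
	for _, msg := range checkTier(reg, 1, expectedTier1) {
		fmt.Println(msg)
	}
}
```

Checking names as well as counts matters because a count alone would still pass if one provider were deleted and an unrelated one added in the same tier.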

## Deviations from Plan

None in scope. Plan executed exactly as written.

## Issues Encountered

### Pre-existing regression (OUT OF SCOPE — documented per orchestrator instructions)

`go test ./pkg/engine/...` currently fails in `TestScannerPipelineOpenAI`, `TestScannerPipelineAnthropic`, and `TestScannerPipelineMultipleKeys`. Root cause: several provider regex patterns introduced in Wave 1 are too broad and match the OpenAI/Anthropic fixtures in `testdata/samples/`, producing 14–15 findings where exactly 1 is expected. Nine of the offending patterns are Tier 2 (plans 02-03 and 02-04: baseten, cerebrium, deepinfra, fireworks, friendli, lepton, novita, octoai, sambanova); the other four (ai21, cohere, inflection, mistral) appear in the phase plan list under plan 02-02, the Tier 1 keyword-anchored providers. `TestScannerPipelineMultipleKeys` additionally fails to find the anthropic provider by name in the over-matched result set.

Per orchestrator instructions, this regression was **NOT** fixed in plan 02-05. It pre-existed this plan's creation and requires either:
1. Tightening the over-broad provider regexes (prefixes, length bounds, character classes), or
2. Updating the engine test expectations to tolerate multi-provider matches on ambiguous keys.

This is a blocker for the phase-level success criterion `go test ./... -count=1 passes` and should be handled as a follow-up plan or verifier-driven fix in Phase 2 closeout.

The guardrail test added in this plan **does not depend on** and **is not affected by** the engine regression — it runs cleanly via `go test ./pkg/providers/... -count=1`.

## User Setup Required

None — pure test code, no external services.

## Next Phase Readiness

**Ready for phase closeout verification** with one blocker:

- Phase 2 guardrail in place (this plan) — DONE
- Pre-existing Tier 2 regex over-match breaking `pkg/engine` tests — BLOCKER for full `go test ./...`

Phase 3 (Tier 3-9 providers) should NOT start until the engine regression is resolved, because adding more providers on top of the current over-broad patterns will compound the problem.

## Self-Check: PASSED

- File exists: `pkg/providers/tier12_test.go` — FOUND
- Commit exists: `58f302b` — FOUND
- Test execution: `go test ./pkg/providers/... -count=1` — PASSING
- Six test functions present and passing: TestTier1Count, TestTier2Count, TestAllPatternsCompile, TestAllProvidersHaveKeywords, TestTier1ProviderNames, TestTier2ProviderNames

---
*Phase: 02-tier-1-2-providers*
*Completed: 2026-04-05*