docs(01-foundation): create phase 1 plan — 5 plans across 3 execution waves
Wave 0: module init + test scaffolding (01-01)
Wave 1: provider registry (01-02) + storage layer (01-03) in parallel
Wave 2: scan engine pipeline (01-04, depends on 01-02)
Wave 3: CLI wiring + integration checkpoint (01-05, depends on all)

Covers all 16 Phase 1 requirements: CORE-01 through CORE-07, STOR-01 through STOR-03, CLI-01 through CLI-05, PROV-10.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
663
.planning/phases/01-foundation/01-02-PLAN.md
Normal file
@@ -0,0 +1,663 @@
|
||||
---
phase: 01-foundation
plan: 02
type: execute
wave: 1
depends_on: [01-01]
files_modified:
  - providers/openai.yaml
  - providers/anthropic.yaml
  - providers/huggingface.yaml
  - pkg/providers/definitions/openai.yaml
  - pkg/providers/definitions/anthropic.yaml
  - pkg/providers/definitions/huggingface.yaml
  - pkg/providers/schema.go
  - pkg/providers/loader.go
  - pkg/providers/registry.go
  - pkg/providers/registry_test.go
autonomous: true
requirements: [CORE-02, CORE-03, CORE-06, PROV-10]

must_haves:
  truths:
    - "Provider YAML files are embedded at compile time — no filesystem access at runtime"
    - "Registry loads all YAML files from embed.FS and returns a slice of Provider structs"
    - "Provider schema validation rejects YAML missing format_version or last_verified"
    - "Aho-Corasick automaton is built from all provider keywords at registry init"
    - "keyhunter providers list command lists providers (tested via registry methods)"
  artifacts:
    - path: "providers/openai.yaml"
      provides: "Reference provider definition with all schema fields"
      contains: "format_version"
    - path: "pkg/providers/schema.go"
      provides: "Provider, Pattern, VerifySpec Go structs with UnmarshalYAML validation"
      exports: ["Provider", "Pattern", "VerifySpec"]
    - path: "pkg/providers/registry.go"
      provides: "Registry struct with List, Get, Stats, AC methods"
      exports: ["Registry", "NewRegistry"]
    - path: "pkg/providers/loader.go"
      provides: "embed.FS declaration and fs.WalkDir loading logic"
      contains: "go:embed"
  key_links:
    - from: "pkg/providers/loader.go"
      to: "pkg/providers/definitions/*.yaml"
      via: "//go:embed directive"
      pattern: "go:embed.*definitions"
    - from: "pkg/providers/registry.go"
      to: "github.com/petar-dambovaliev/aho-corasick"
      via: "AC automaton build at NewRegistry()"
      pattern: "ahocorasick"
    - from: "pkg/providers/schema.go"
      to: "format_version and last_verified YAML fields"
      via: "UnmarshalYAML validation"
      pattern: "UnmarshalYAML"
---
|
||||
|
||||
<objective>
|
||||
Build the provider registry: YAML schema structs with validation, embed.FS loader, in-memory registry with List/Get/Stats/AC methods, and three reference provider YAML definitions. The Aho-Corasick automaton is built from all provider keywords at registry initialization.
|
||||
|
||||
Purpose: Every downstream subsystem (scan engine, CLI providers command, verification engine) depends on the Registry interface. This plan establishes the stable contract they build against.
|
||||
Output: providers/*.yaml, pkg/providers/{schema,loader,registry}.go, registry_test.go (stubs filled).
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/01-foundation/01-RESEARCH.md
|
||||
@.planning/phases/01-foundation/01-01-SUMMARY.md
|
||||
|
||||
<interfaces>
|
||||
<!-- Provider YAML schema (from ARCHITECTURE.md and RESEARCH.md) -->
|
||||
Full provider YAML structure:
|
||||
```yaml
format_version: 1
name: openai
display_name: OpenAI
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-proj-"
  - "openai"
patterns:
  - regex: 'sk-proj-[A-Za-z0-9_\-]{48,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.openai.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
|
||||
|
||||
<!-- Go struct mapping -->
|
||||
Provider struct fields:
|
||||
FormatVersion int (yaml:"format_version" — must be >= 1)
|
||||
Name string (yaml:"name")
|
||||
DisplayName string (yaml:"display_name")
|
||||
Tier int (yaml:"tier")
|
||||
LastVerified string (yaml:"last_verified" — must be non-empty)
|
||||
Keywords []string (yaml:"keywords")
|
||||
Patterns []Pattern (yaml:"patterns")
|
||||
Verify VerifySpec (yaml:"verify")
|
||||
|
||||
Pattern struct fields:
|
||||
Regex string (yaml:"regex")
|
||||
EntropyMin float64 (yaml:"entropy_min")
|
||||
Confidence string (yaml:"confidence" — "high", "medium", "low")
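
How `regex` and `entropy_min` could work together is sketched below. This is a minimal illustration, not the scan engine itself: it assumes `entropy_min` is compared against Shannon entropy in bits per byte (the plan does not name the formula), and `shannonEntropy` and `findCandidates` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
)

// Pattern mirrors the fields listed above (stand-in for the real struct).
type Pattern struct {
	Regex      string
	EntropyMin float64
	Confidence string
}

// shannonEntropy returns the Shannon entropy of s in bits per byte.
// Assumption: this is the measure entropy_min is checked against.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	counts := make(map[byte]int)
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// findCandidates returns the regex matches in text whose entropy is at
// least the pattern's EntropyMin.
func findCandidates(p Pattern, text string) ([]string, error) {
	re, err := regexp.Compile(p.Regex)
	if err != nil {
		return nil, fmt.Errorf("compiling %q: %w", p.Regex, err)
	}
	var out []string
	for _, m := range re.FindAllString(text, -1) {
		if shannonEntropy(m) >= p.EntropyMin {
			out = append(out, m)
		}
	}
	return out, nil
}

func main() {
	p := Pattern{Regex: `hf_[A-Za-z0-9]{34,}`, EntropyMin: 3.5, Confidence: "high"}
	// Matches the regex but is highly repetitive, so the entropy gate drops it.
	repetitive := "hf_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
	hits, err := findCandidates(p, "token="+repetitive)
	fmt.Println(len(hits), err)
}
```

The entropy gate is what lets a broad regex stay useful: placeholder strings that fit the shape of a key are filtered out before they reach verification.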
|
||||
|
||||
VerifySpec struct fields:
|
||||
Method string (yaml:"method")
|
||||
URL string (yaml:"url")
|
||||
Headers map[string]string (yaml:"headers")
|
||||
ValidStatus []int (yaml:"valid_status")
|
||||
InvalidStatus []int (yaml:"invalid_status")
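
The verification engine itself is out of scope for this plan, but the following sketch shows how a `VerifySpec` might later be turned into an HTTP request. It assumes `{KEY}` is a literal placeholder replaced in header values only; `buildRequest` and `classify` are hypothetical names, and no request is actually sent.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// VerifySpec mirrors the fields listed above (stand-in for the real struct).
type VerifySpec struct {
	Method        string
	URL           string
	Headers       map[string]string
	ValidStatus   []int
	InvalidStatus []int
}

// buildRequest creates the verification request, substituting the
// {KEY} placeholder in every header value.
func buildRequest(spec VerifySpec, key string) (*http.Request, error) {
	req, err := http.NewRequest(spec.Method, spec.URL, nil)
	if err != nil {
		return nil, fmt.Errorf("building verify request: %w", err)
	}
	for name, tmpl := range spec.Headers {
		req.Header.Set(name, strings.ReplaceAll(tmpl, "{KEY}", key))
	}
	return req, nil
}

// classify maps a response status code to "valid", "invalid", or
// "unknown" (for codes in neither list, such as rate limiting).
func classify(spec VerifySpec, status int) string {
	for _, s := range spec.ValidStatus {
		if s == status {
			return "valid"
		}
	}
	for _, s := range spec.InvalidStatus {
		if s == status {
			return "invalid"
		}
	}
	return "unknown"
}

func main() {
	spec := VerifySpec{
		Method:        "GET",
		URL:           "https://api.openai.com/v1/models",
		Headers:       map[string]string{"Authorization": "Bearer {KEY}"},
		ValidStatus:   []int{200},
		InvalidStatus: []int{401, 403},
	}
	req, err := buildRequest(spec, "sk-proj-EXAMPLE")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Host, req.Header.Get("Authorization"))
	fmt.Println(classify(spec, 200), classify(spec, 401), classify(spec, 429))
}
```

Keeping separate valid and invalid lists leaves room for a third outcome, so a transient status is not mistaken for a revoked key.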
|
||||
|
||||
<!-- Registry methods needed by downstream plans -->
|
||||
type Registry struct { ... }
|
||||
func NewRegistry() (*Registry, error)
|
||||
func (r *Registry) List() []Provider
|
||||
func (r *Registry) Get(name string) (Provider, bool)
|
||||
func (r *Registry) Stats() RegistryStats // {Total int, ByTier map[int]int, ByConfidence map[string]int}
|
||||
func (r *Registry) AC() ahocorasick.AhoCorasick // pre-built automaton
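
To illustrate why the registry exposes a keyword automaton, here is a sketch of the pre-filter idea: only providers with a keyword present in a chunk of text need their regexes run. The sketch deliberately uses a naive `strings.Contains` scan as a stand-in for Aho-Corasick so that it needs only the standard library; `candidateProviders` is a hypothetical helper, and the real scan engine (plan 01-04) may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider carries only the fields the pre-filter needs.
type Provider struct {
	Name     string
	Keywords []string
}

// candidateProviders returns the names of providers that have at least one
// keyword in chunk. This is a naive substring scan, NOT Aho-Corasick: the
// automaton answers the same question in a single pass over the text.
func candidateProviders(all []Provider, chunk string) []string {
	var names []string
	for _, p := range all {
		for _, kw := range p.Keywords {
			if strings.Contains(chunk, kw) {
				names = append(names, p.Name)
				break // one keyword hit is enough for this provider
			}
		}
	}
	return names
}

func main() {
	all := []Provider{
		{Name: "openai", Keywords: []string{"sk-proj-", "openai"}},
		{Name: "anthropic", Keywords: []string{"sk-ant-api03-", "anthropic"}},
		{Name: "huggingface", Keywords: []string{"hf_", "huggingface"}},
	}
	fmt.Println(candidateProviders(all, "export OPENAI_API_KEY=sk-proj-abc"))
	fmt.Println(candidateProviders(all, "hello world nothing here"))
}
```

Matching is case-sensitive here, so `OPENAI_API_KEY` is caught by the `sk-proj-` keyword rather than by `openai`.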
|
||||
|
||||
<!-- embed path convention -->
|
||||
go:embed patterns are resolved relative to the directory of the SOURCE FILE that contains the directive, not the module root, and they cannot contain "." or ".." path elements.

loader.go is at pkg/providers/loader.go and the providers/ directory is at the project root, so

//go:embed ../../providers/*.yaml

does not compile, even though RESEARCH.md shows that form.

Rejected alternatives:
- A providers_embed.go at the PROJECT ROOT (same dir as go.mod) in package main: breaks package separation.
- Symlinking providers/ into pkg/providers/: less clean than keeping real files inside the package.

Resolution: keep the embedded YAML files inside the providers package at pkg/providers/definitions/*.yaml and use:

//go:embed definitions/*.yaml

Paths inside the embed.FS are then "definitions/openai.yaml" etc.
files_modified must list the pkg/providers/definitions/*.yaml files.
|
||||
</interfaces>
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto" tdd="true">
|
||||
<name>Task 1: Provider YAML schema structs with validation</name>
|
||||
<files>pkg/providers/schema.go, providers/openai.yaml, providers/anthropic.yaml, providers/huggingface.yaml</files>
|
||||
<read_first>
|
||||
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 1: Provider Registry, Provider YAML schema section, PROV-10 row in requirements table)
|
||||
- /home/salva/Documents/apikey/.planning/research/ARCHITECTURE.md (Provider Registry component, YAML schema example)
|
||||
</read_first>
|
||||
<behavior>
|
||||
- Test 1: Provider with format_version=0 → UnmarshalYAML returns error "format_version must be >= 1"
|
||||
- Test 2: Provider with empty last_verified → UnmarshalYAML returns error "last_verified is required"
|
||||
- Test 3: Valid provider YAML → UnmarshalYAML succeeds, Provider.Name == "openai"
|
||||
- Test 4: Provider with no patterns → loaded successfully (patterns list can be empty for schema-only providers)
|
||||
- Test 5: Pattern.Confidence not in {"high","medium","low"} → error "confidence must be high, medium, or low"
|
||||
</behavior>
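
The validation rules behind these tests can be sketched independently of YAML decoding. This is a simplified illustration: `validate` is a hypothetical free function (the plan puts the checks inside `UnmarshalYAML`), the structs carry only the validated fields, and an empty confidence is accepted as "unset", which is an assumption taken from the schema.go code in the action below.

```go
package main

import "fmt"

// Pattern and Provider carry only the fields that are validated.
type Pattern struct {
	Confidence string
}

type Provider struct {
	FormatVersion int
	Name          string
	LastVerified  string
	Patterns      []Pattern
}

// validate applies the schema rules in the order the tests list them.
func validate(p Provider) error {
	if p.FormatVersion < 1 {
		return fmt.Errorf("provider %q: format_version must be >= 1 (got %d)", p.Name, p.FormatVersion)
	}
	if p.LastVerified == "" {
		return fmt.Errorf("provider %q: last_verified is required", p.Name)
	}
	for _, pat := range p.Patterns {
		switch pat.Confidence {
		case "high", "medium", "low", "":
			// accepted; "" is treated as unset (assumption, see lead-in)
		default:
			return fmt.Errorf("provider %q: pattern confidence %q must be high, medium, or low", p.Name, pat.Confidence)
		}
	}
	return nil
}

func main() {
	cases := []Provider{
		{FormatVersion: 0, Name: "no-version", LastVerified: "2026-04-04"},
		{FormatVersion: 1, Name: "no-date"},
		{FormatVersion: 1, Name: "openai", LastVerified: "2026-04-04"},
		{FormatVersion: 1, Name: "bad-conf", LastVerified: "2026-04-04",
			Patterns: []Pattern{{Confidence: "certain"}}},
	}
	for _, c := range cases {
		fmt.Printf("%s: %v\n", c.Name, validate(c))
	}
}
```

Checking `format_version` first means a file written for an unknown schema version fails early, before any field-level rule is applied.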
|
||||
<action>
|
||||
Create pkg/providers/schema.go:
|
||||
|
||||
```go
package providers

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

// Provider represents a single API key provider definition loaded from YAML.
type Provider struct {
	FormatVersion int        `yaml:"format_version"`
	Name          string     `yaml:"name"`
	DisplayName   string     `yaml:"display_name"`
	Tier          int        `yaml:"tier"`
	LastVerified  string     `yaml:"last_verified"`
	Keywords      []string   `yaml:"keywords"`
	Patterns      []Pattern  `yaml:"patterns"`
	Verify        VerifySpec `yaml:"verify"`
}

// Pattern defines a single regex pattern for API key detection.
type Pattern struct {
	Regex      string  `yaml:"regex"`
	EntropyMin float64 `yaml:"entropy_min"`
	Confidence string  `yaml:"confidence"`
}

// VerifySpec defines how to verify a key is live (used by Phase 5 verification engine).
type VerifySpec struct {
	Method        string            `yaml:"method"`
	URL           string            `yaml:"url"`
	Headers       map[string]string `yaml:"headers"`
	ValidStatus   []int             `yaml:"valid_status"`
	InvalidStatus []int             `yaml:"invalid_status"`
}

// RegistryStats holds aggregate statistics about loaded providers.
type RegistryStats struct {
	Total        int
	ByTier       map[int]int
	ByConfidence map[string]int
}

// UnmarshalYAML implements yaml.Unmarshaler with schema validation (satisfies PROV-10).
func (p *Provider) UnmarshalYAML(value *yaml.Node) error {
	// Decode into a locally defined type that has the same fields but no
	// UnmarshalYAML method; decoding into Provider itself would recurse forever.
	type ProviderAlias Provider
	var alias ProviderAlias
	if err := value.Decode(&alias); err != nil {
		return err
	}
	if alias.FormatVersion < 1 {
		return fmt.Errorf("provider %q: format_version must be >= 1 (got %d)", alias.Name, alias.FormatVersion)
	}
	if alias.LastVerified == "" {
		return fmt.Errorf("provider %q: last_verified is required", alias.Name)
	}
	// An empty confidence is accepted and treated as unset.
	validConfidences := map[string]bool{"high": true, "medium": true, "low": true, "": true}
	for _, pat := range alias.Patterns {
		if !validConfidences[pat.Confidence] {
			return fmt.Errorf("provider %q: pattern confidence %q must be high, medium, or low", alias.Name, pat.Confidence)
		}
	}
	*p = Provider(alias)
	return nil
}
```
|
||||
|
||||
Create the three reference YAML provider definitions. These are SCHEMA EXAMPLES for Phase 1; full pattern libraries come in Phase 2-3.
|
||||
|
||||
**providers/openai.yaml:**
|
||||
```yaml
format_version: 1
name: openai
display_name: OpenAI
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-proj-"
  - "openai"
patterns:
  - regex: 'sk-proj-[A-Za-z0-9_\-]{48,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.openai.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
|
||||
|
||||
**providers/anthropic.yaml:**
|
||||
```yaml
format_version: 1
name: anthropic
display_name: Anthropic
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-ant-api03-"
  - "anthropic"
patterns:
  - regex: 'sk-ant-api03-[A-Za-z0-9_\-]{93,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.anthropic.com/v1/models
  headers:
    x-api-key: "{KEY}"
    anthropic-version: "2023-06-01"
  valid_status: [200]
  invalid_status: [401, 403]
```
|
||||
|
||||
**providers/huggingface.yaml:**
|
||||
```yaml
format_version: 1
name: huggingface
display_name: HuggingFace
tier: 3
last_verified: "2026-04-04"
keywords:
  - "hf_"
  - "huggingface"
patterns:
  - regex: 'hf_[A-Za-z0-9]{34,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://huggingface.co/api/whoami-v2
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go build ./pkg/providers/... && go test ./pkg/providers/... -run TestProviderSchemaValidation -v 2>&1 | head -30</automated>
|
||||
</verify>
|
||||
<acceptance_criteria>
|
||||
- `go build ./pkg/providers/...` exits 0
|
||||
- providers/openai.yaml contains `format_version: 1` and `last_verified`
|
||||
- providers/anthropic.yaml contains `format_version: 1` and `last_verified`
|
||||
- providers/huggingface.yaml contains `format_version: 1` and `last_verified`
|
||||
- pkg/providers/schema.go exports: Provider, Pattern, VerifySpec, RegistryStats
|
||||
- Provider.UnmarshalYAML returns error when format_version < 1
|
||||
- Provider.UnmarshalYAML returns error when last_verified is empty
|
||||
- `grep -q 'UnmarshalYAML' pkg/providers/schema.go` exits 0
|
||||
</acceptance_criteria>
|
||||
<done>Provider schema structs exist with validation. Three reference YAML files exist with all required fields.</done>
|
||||
</task>
|
||||
|
||||
<task type="auto" tdd="true">
|
||||
<name>Task 2: Embed loader, registry with Aho-Corasick, and filled test stubs</name>
|
||||
<files>pkg/providers/loader.go, pkg/providers/registry.go, pkg/providers/registry_test.go</files>
|
||||
<read_first>
|
||||
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 1: Provider Registry with Compile-Time Embed — exact code example)
|
||||
- /home/salva/Documents/apikey/pkg/providers/schema.go (types just created in Task 1)
|
||||
</read_first>
|
||||
<behavior>
|
||||
- Test 1: NewRegistry() loads 3 providers from embedded YAML → registry.List() returns slice of length 3
|
||||
- Test 2: registry.Get("openai") → returns Provider with Name=="openai", bool==true
|
||||
- Test 3: registry.Get("nonexistent") → returns zero Provider, bool==false
|
||||
- Test 4: registry.Stats().Total == 3 and Stats().ByTier[1] == 2 (openai + anthropic are tier 1)
|
||||
- Test 5: AC automaton built — registry.AC().FindAll("sk-proj-abc") returns non-empty slice
|
||||
- Test 6: AC automaton does NOT match — registry.AC().FindAll("hello world") returns empty slice
|
||||
</behavior>
|
||||
<action>
|
||||
IMPORTANT NOTE ON EMBED PATHS: go:embed patterns cannot contain "..".
Since loader.go is at pkg/providers/loader.go, it CANNOT embed ../../providers/*.yaml.

Solution: embed a subdirectory of the package that owns the directive. Place the provider YAML files at pkg/providers/definitions/*.yaml and use:

//go:embed definitions/*.yaml

For Phase 1, keep BOTH locations:

pkg/providers/definitions/openai.yaml (embedded copy, compiled into the binary)
providers/openai.yaml (user-facing "source of truth" created in Task 1, kept for reference)

Copy the three YAML files from providers/ to pkg/providers/definitions/ as the last step of this task so that both directories have identical content.
|
||||
|
||||
Create **pkg/providers/loader.go**:
|
||||
```go
package providers

import (
	"embed"
	"fmt"
	"io/fs"
	"path/filepath"

	"gopkg.in/yaml.v3"
)

//go:embed definitions/*.yaml
var definitionsFS embed.FS

// loadProviders reads all YAML files from the embedded definitions FS.
func loadProviders() ([]Provider, error) {
	var providers []Provider
	err := fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".yaml" {
			return nil
		}
		data, err := definitionsFS.ReadFile(path)
		if err != nil {
			return fmt.Errorf("reading provider file %s: %w", path, err)
		}
		var p Provider
		if err := yaml.Unmarshal(data, &p); err != nil {
			return fmt.Errorf("parsing provider %s: %w", path, err)
		}
		providers = append(providers, p)
		return nil
	})
	return providers, err
}
```
|
||||
|
||||
Create **pkg/providers/registry.go**:
|
||||
```go
package providers

import (
	"fmt"

	ahocorasick "github.com/petar-dambovaliev/aho-corasick"
)

// Registry is the in-memory store of all loaded provider definitions.
// It is initialized once at startup and is safe for concurrent reads.
type Registry struct {
	providers []Provider
	index     map[string]int          // name -> slice index
	ac        ahocorasick.AhoCorasick // pre-built automaton for keyword pre-filter
}

// NewRegistry loads all embedded provider YAML files, validates them, builds the
// Aho-Corasick automaton from all provider keywords, and returns the Registry.
func NewRegistry() (*Registry, error) {
	providers, err := loadProviders()
	if err != nil {
		return nil, fmt.Errorf("loading providers: %w", err)
	}

	index := make(map[string]int, len(providers))
	var keywords []string
	for i, p := range providers {
		index[p.Name] = i
		keywords = append(keywords, p.Keywords...)
	}

	builder := ahocorasick.NewAhoCorasickBuilder(ahocorasick.Opts{DFA: true})
	ac := builder.Build(keywords)

	return &Registry{
		providers: providers,
		index:     index,
		ac:        ac,
	}, nil
}

// List returns all loaded providers.
func (r *Registry) List() []Provider {
	return r.providers
}

// Get returns a provider by name and a boolean indicating whether it was found.
func (r *Registry) Get(name string) (Provider, bool) {
	idx, ok := r.index[name]
	if !ok {
		return Provider{}, false
	}
	return r.providers[idx], true
}

// Stats returns aggregate statistics about the loaded providers.
func (r *Registry) Stats() RegistryStats {
	stats := RegistryStats{
		Total:        len(r.providers),
		ByTier:       make(map[int]int),
		ByConfidence: make(map[string]int),
	}
	for _, p := range r.providers {
		stats.ByTier[p.Tier]++
		for _, pat := range p.Patterns {
			stats.ByConfidence[pat.Confidence]++
		}
	}
	return stats
}

// AC returns the pre-built Aho-Corasick automaton for keyword pre-filtering.
func (r *Registry) AC() ahocorasick.AhoCorasick {
	return r.ac
}
```
|
||||
|
||||
Note: registry.go must import "fmt", which NewRegistry uses to wrap the loader error.
|
||||
|
||||
Then copy the three YAML files into the embed location:
|
||||
```bash
|
||||
mkdir -p /home/salva/Documents/apikey/pkg/providers/definitions
|
||||
cp /home/salva/Documents/apikey/providers/openai.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
|
||||
cp /home/salva/Documents/apikey/providers/anthropic.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
|
||||
cp /home/salva/Documents/apikey/providers/huggingface.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
|
||||
```
|
||||
|
||||
Finally, fill in **pkg/providers/registry_test.go** (replacing the stubs from Plan 01). The file must import `gopkg.in/yaml.v3`, which TestProviderSchemaValidation uses to exercise UnmarshalYAML directly:
|
||||
|
||||
```go
package providers_test

import (
	"testing"

	"github.com/salvacybersec/keyhunter/pkg/providers"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"gopkg.in/yaml.v3"
)

func TestRegistryLoad(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)
	assert.GreaterOrEqual(t, len(reg.List()), 3, "expected at least 3 providers")
}

func TestRegistryGet(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	p, ok := reg.Get("openai")
	assert.True(t, ok)
	assert.Equal(t, "openai", p.Name)
	assert.Equal(t, 1, p.Tier)

	_, notOk := reg.Get("nonexistent-provider")
	assert.False(t, notOk)
}

func TestRegistryStats(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	stats := reg.Stats()
	assert.GreaterOrEqual(t, stats.Total, 3)
	assert.GreaterOrEqual(t, stats.ByTier[1], 2)
}

func TestAhoCorasickBuild(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	ac := reg.AC()
	matches := ac.FindAll("export OPENAI_API_KEY=sk-proj-abc")
	assert.NotEmpty(t, matches)

	noMatches := ac.FindAll("hello world nothing here")
	assert.Empty(t, noMatches)
}

func TestProviderSchemaValidation(t *testing.T) {
	invalid := []byte("format_version: 0\nname: invalid\nlast_verified: \"\"\n")
	var p providers.Provider
	err := yaml.Unmarshal(invalid, &p)
	// require stops the test here if err is nil, so err.Error() below cannot panic.
	require.Error(t, err)
	assert.Contains(t, err.Error(), "format_version")
}
```
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go test ./pkg/providers/... -v -count=1 2>&1 | tail -20</automated>
|
||||
</verify>
|
||||
<acceptance_criteria>
|
||||
- `go test ./pkg/providers/... -v` exits 0 with all 5 tests PASS (not SKIP)
|
||||
- TestRegistryLoad passes with >= 3 providers
|
||||
- TestRegistryGet passes — "openai" found, "nonexistent" not found
|
||||
- TestRegistryStats passes — Total >= 3
|
||||
- TestAhoCorasickBuild passes — "sk-proj-" match found, "hello world" empty
|
||||
- TestProviderSchemaValidation passes — error on format_version=0
|
||||
- `grep -r 'go:embed' pkg/providers/loader.go` exits 0
|
||||
- pkg/providers/definitions/ directory exists with 3 YAML files
|
||||
</acceptance_criteria>
|
||||
<done>Registry loads providers from embedded YAML, builds Aho-Corasick automaton, exposes List/Get/Stats/AC. All 5 tests pass.</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
After both tasks:
|
||||
- `go test ./pkg/providers/... -v -count=1` exits 0 with 5 tests PASS
|
||||
- `go build ./...` still exits 0
|
||||
- `grep -q 'format_version' providers/openai.yaml providers/anthropic.yaml providers/huggingface.yaml` exits 0
|
||||
- `grep -q 'go:embed' pkg/providers/loader.go` exits 0
|
||||
- pkg/providers/definitions/ has 3 YAML files (same content as providers/)
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
- 3 reference provider YAML files exist in providers/ and pkg/providers/definitions/ with format_version and last_verified
|
||||
- Provider schema validates format_version >= 1 and non-empty last_verified (PROV-10)
|
||||
- Registry loads providers from embed.FS at compile time (CORE-02)
|
||||
- Aho-Corasick automaton built from all keywords at NewRegistry() (CORE-06)
|
||||
- Registry exposes List(), Get(), Stats(), AC() (CORE-03)
|
||||
- 5 provider tests all pass
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/01-foundation/01-02-SUMMARY.md` following the summary template.
|
||||
</output>
|
||||