| phase | verified | status | score | gaps |
|---|---|---|---|---|
| 01-foundation | 2026-04-05T12:00:00Z | gaps_found | 5/5 success criteria verified, 1 requirement partially covered | |
Phase 1: Foundation Verification Report
Phase Goal: The provider registry schema, encrypted storage layer, and CLI skeleton exist and function correctly -- all downstream subsystems have stable interfaces to build against
Verified: 2026-04-05
Status: gaps_found (minor -- all 5 success criteria pass; gaps are incomplete CLI flags and a planned deferral)
Re-verification: No -- initial verification
Goal Achievement
Observable Truths (Success Criteria)
| # | Truth | Status | Evidence |
|---|---|---|---|
| 1 | keyhunter scan ./somefile runs three-stage pipeline (AC pre-filter, regex, entropy) and returns findings with provider names | VERIFIED | go run . scan ./testdata/samples/openai_key.txt outputs finding with provider "openai". Engine uses KeywordFilter (AC), Detect (regex+entropy), ants pool. All 12 engine tests pass. |
| 2 | Findings persisted to SQLite with key value AES-256 encrypted -- plaintext never in DB | VERIFIED | TestSaveFindingEncrypted asserts raw BLOB does not contain plaintext. grep on DB file confirms no plaintext. Salt stored in settings table, not hardcoded. |
| 3 | keyhunter config init creates ~/.keyhunter.yaml; config set <key> <value> persists | VERIFIED | go run . config init creates file. go run . config set workers 16 persists value. File contents confirmed. |
| 4 | keyhunter providers list and providers info <name> return provider metadata from YAML | VERIFIED | providers list shows 3 providers with name, tier, patterns, keywords. providers info openai shows full details including regex and verify URL. |
| 5 | Provider YAML schema includes format_version and last_verified validated at load time | VERIFIED | openai.yaml has format_version: 1 and last_verified: "2026-04-04". TestProviderSchemaValidation confirms format_version=0 is rejected. UnmarshalYAML in schema.go validates both fields. |
Score: 5/5 success criteria verified
Required Artifacts
| Artifact | Expected | Status | Details |
|---|---|---|---|
| go.mod | Module with all Phase 1 deps | VERIFIED | Module github.com/salvacybersec/keyhunter, cobra v1.10.2, viper v1.21.0, ants v2.12.0, sqlite v1.48.1, aho-corasick, lipgloss, testify |
| main.go | Entry point | VERIFIED | Calls cmd.Execute(), 7 lines |
| cmd/root.go | Cobra root with all commands | VERIFIED | 11 commands registered (scan, verify, import, recon, keys, serve, dorks, providers, config, hook, schedule) |
| cmd/scan.go | Scan command wiring engine + storage + output | VERIFIED | Wires engine.NewEngine, sources.NewFileSource, storage.Open, SaveFinding, loadOrCreateEncKey with per-installation salt |
| cmd/providers.go | providers list/info/stats | VERIFIED | Three subcommands using Registry.List(), Get(), Stats() |
| cmd/config.go | config init/set/get | VERIFIED | Uses viper.WriteConfigAs for init, viper.Set + WriteConfig for set |
| cmd/stubs.go | 8 stub commands for future phases | VERIFIED | verify, import, recon, keys, serve, dorks, hook, schedule |
| pkg/providers/schema.go | Provider/Pattern/VerifySpec structs with validation | VERIFIED | UnmarshalYAML validates format_version >= 1, last_verified non-empty, confidence values |
| pkg/providers/loader.go | embed.FS loader | VERIFIED | //go:embed definitions/*.yaml with fs.WalkDir loading |
| pkg/providers/registry.go | Registry with List/Get/Stats/AC | VERIFIED | All 4 methods implemented, AC built from keywords at NewRegistry() |
| pkg/providers/definitions/*.yaml | 3 provider YAML files | VERIFIED | openai, anthropic, huggingface with all schema fields |
| pkg/storage/encrypt.go | AES-256-GCM Encrypt/Decrypt | VERIFIED | Random nonce prepended, GCM authenticated encryption |
| pkg/storage/crypto.go | Argon2id DeriveKey/NewSalt | VERIFIED | RFC 9106 params (time=1, memory=64MB, threads=4, keyLen=32) |
| pkg/storage/db.go | SQLite DB with WAL and embedded schema | VERIFIED | //go:embed schema.sql, WAL mode, foreign keys enabled |
| pkg/storage/findings.go | SaveFinding/ListFindings with transparent encryption | VERIFIED | Encrypt before INSERT, Decrypt after SELECT, MaskKey for display |
| pkg/storage/settings.go | GetSetting/SetSetting for salt storage | VERIFIED | UPSERT pattern, used by loadOrCreateEncKey |
| pkg/storage/schema.sql | CREATE TABLE findings, scans, settings | VERIFIED | All 3 tables plus indexes |
| pkg/engine/engine.go | Engine with Scan() three-stage pipeline | VERIFIED | chunksChan -> KeywordFilter -> ants pool detectors -> resultsChan |
| pkg/engine/entropy.go | Shannon entropy function | VERIFIED | math.Log2 implementation, tested with known values |
| pkg/engine/filter.go | KeywordFilter with AC | VERIFIED | AC.FindAll on each chunk |
| pkg/engine/detector.go | Detect with regex + entropy | VERIFIED | Iterates providers, compiles regex, checks entropy threshold |
| pkg/engine/sources/file.go | FileSource with overlapping chunks | VERIFIED | os.ReadFile with 4096 byte chunks and 256 byte overlap |
| pkg/types/chunk.go | Shared Chunk type | VERIFIED | Breaks circular import engine <-> sources |
| pkg/config/config.go | Config struct with Load() | VERIFIED | Provides defaults for Workers, DBPath, Passphrase |
| pkg/output/table.go | lipgloss terminal table | VERIFIED | PrintFindings renders provider, key, confidence, source, line |
| testdata/samples/*.txt | 4 test fixture files | VERIFIED | openai_key, anthropic_key, multiple_keys, no_keys |
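The pkg/storage/encrypt.go row describes AES-256-GCM with a random nonce prepended to the ciphertext. The sketch below shows that layout using only the Go standard library. The function names `encrypt` and `decrypt` are illustrative and the repository's actual signatures may differ; the demo key is random, whereas the report says the real key is derived with Argon2id.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// encrypt returns nonce || ciphertext || tag. A 32-byte key selects AES-256.
func encrypt(plaintext, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst makes Seal append the ciphertext and tag to it.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits the prepended nonce off blob, then authenticates and decrypts.
func decrypt(blob, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	ns := gcm.NonceSize()
	if len(blob) < ns {
		return nil, errors.New("ciphertext shorter than nonce")
	}
	return gcm.Open(nil, blob[:ns], blob[ns:], nil)
}

func main() {
	// Demo only: a random key instead of one derived from a passphrase.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	secret := []byte("example-secret-value")

	blob, err := encrypt(secret, key)
	if err != nil {
		panic(err)
	}
	fmt.Println("plaintext visible in blob:", bytes.Contains(blob, secret))

	plain, err := decrypt(blob, key)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip ok:", bytes.Equal(plain, secret))
}
```

Because GCM is authenticated, `decrypt` returns an error if any byte of the stored blob has been altered, which is a useful property for a findings database that may be copied between machines.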
Key Link Verification
| From | To | Via | Status | Details |
|---|---|---|---|---|
| cmd/scan.go | pkg/engine/engine.go | engine.NewEngine(reg).Scan() | WIRED | Line 59-60: eng := engine.NewEngine(reg); ch, err := eng.Scan() |
| cmd/scan.go | pkg/storage/db.go | storage.Open() + SaveFinding | WIRED | Line 79: db, err := storage.Open(dbPath); Line 109: db.SaveFinding |
| cmd/scan.go | pkg/storage/crypto.go | loadOrCreateEncKey -> DeriveKey | WIRED | Line 85: loadOrCreateEncKey uses GetSetting/SetSetting + DeriveKey |
| cmd/root.go | viper | viper.SetConfigFile in initConfig | WIRED | Line 49: viper.SetConfigFile(cfgFile) |
| cmd/providers.go | pkg/providers/registry.go | Registry.List/Get/Stats | WIRED | Lines 23,49,74: NewRegistry() + method calls |
| pkg/engine/engine.go | pkg/providers/registry.go | Engine holds Registry, uses AC() | WIRED | Line 55: KeywordFilter(e.registry.AC(), ...) |
| pkg/engine/filter.go | aho-corasick | AC.FindAll() | WIRED | Line 11: ac.FindAll(string(chunk.Data)) |
| pkg/engine/detector.go | pkg/engine/entropy.go | Shannon() called for entropy check | WIRED | Line referenced: Shannon(match) < pat.EntropyMin |
| pkg/engine/engine.go | ants/v2 | ants.NewPool for workers | WIRED | Line 59: pool, err := ants.NewPool(workers) |
| pkg/storage/findings.go | pkg/storage/encrypt.go | Encrypt before INSERT, Decrypt after SELECT | WIRED | SaveFinding line: Encrypt([]byte(f.KeyValue), encKey); ListFindings: Decrypt(encrypted, encKey) |
| pkg/storage/db.go | pkg/storage/schema.sql | go:embed + Exec | WIRED | Line: //go:embed schema.sql; sqlDB.Exec(string(schemaSQLBytes)) |
| pkg/storage/crypto.go | golang.org/x/crypto/argon2 | argon2.IDKey call | WIRED | argon2.IDKey(passphrase, salt, ...) |
| pkg/providers/loader.go | definitions/*.yaml | go:embed directive | WIRED | //go:embed definitions/*.yaml |
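The engine wiring above (chunksChan -> KeywordFilter -> worker pool -> resultsChan) can be sketched with plain goroutines and channels. This is an assumption-laden illustration, not the project's code: a `sync.WaitGroup` with a fixed number of goroutines stands in for the ants pool, `strings.Contains` stands in for the Aho-Corasick filter, and the `scan` function and its output format are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// scan pushes chunks through a keyword filter and a fixed set of worker
// goroutines, then returns the collected results in sorted order.
func scan(chunks []string, keyword string, workers int) []string {
	chunksChan := make(chan string)
	filtered := make(chan string)
	results := make(chan string)

	// Producer: feed chunks into the pipeline.
	go func() {
		defer close(chunksChan)
		for _, c := range chunks {
			chunksChan <- c
		}
	}()

	// Stage 1: forward only chunks that contain the keyword.
	go func() {
		defer close(filtered)
		for c := range chunksChan {
			if strings.Contains(c, keyword) {
				filtered <- c
			}
		}
	}()

	// Stage 2: workers consume filtered chunks concurrently.
	// A real detector would run regex and entropy checks here.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range filtered {
				results <- "candidate: " + c
			}
		}()
	}

	// Close results once every worker has drained its input.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // worker scheduling makes arrival order nondeterministic
	return out
}

func main() {
	chunks := []string{"a tok-1", "nothing here", "b tok-2"}
	for _, r := range scan(chunks, "tok-", 4) {
		fmt.Println(r)
	}
}
```

Closing each channel from the goroutine that writes to it keeps shutdown simple: the consumer loops end on their own, and the caller never needs a separate done signal.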
Data-Flow Trace (Level 4)
| Artifact | Data Variable | Source | Produces Real Data | Status |
|---|---|---|---|---|
| cmd/scan.go | findings []engine.Finding | engine.Scan() channel | Yes -- reads real files, runs AC+regex pipeline | FLOWING |
| cmd/scan.go | DB persistence | storage.SaveFinding | Yes -- encrypted INSERT into SQLite | FLOWING |
| cmd/providers.go | reg.List() | providers.NewRegistry() | Yes -- loads YAML that was embedded at compile time | FLOWING |
| cmd/config.go | viper config | viper.WriteConfigAs | Yes -- creates real YAML file on disk | FLOWING |
Behavioral Spot-Checks
| Behavior | Command | Result | Status |
|---|---|---|---|
| Scan finds OpenAI key | go run . scan ./testdata/samples/openai_key.txt | 1 finding: openai, sk-proj-...1234, high, line 2 | PASS |
| Providers list shows 3 | go run . providers list | 3 providers: anthropic, huggingface, openai | PASS |
| Provider info shows details | go run . providers info openai | Full metadata including regex and verify URL | PASS |
| Config init creates file | go run . config init | ~/.keyhunter.yaml created with defaults | PASS |
| Config set persists | go run . config set workers 16 | Value appears in ~/.keyhunter.yaml | PASS |
| Help shows all 11 commands | go run . --help | scan, verify, import, recon, keys, serve, dorks, providers, config, hook, schedule | PASS |
| DB has no plaintext keys | grep sk-proj-ABCDEF ~/.keyhunter/keyhunter.db | 0 matches | PASS |
| Salt in settings table | sqlite3 ~/.keyhunter/keyhunter.db "SELECT * FROM settings" | encryption.salt with 32-char hex value | PASS |
| All tests pass | go test ./... -count=1 | engine 12/12, providers 5/5, storage 7/7 PASS | PASS |
| Build clean | go build ./... | Exit 0, no errors | PASS |
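The masked value in the scan spot-check (sk-proj-...1234) suggests a prefix-plus-suffix display format. The sketch below is one way to produce that shape. It is a guess at the behavior, not the repository's MaskKey: the eight-character head, four-character tail, and the full masking of short keys are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey keeps the first 8 and last 4 bytes of key and elides the middle.
// Keys too short to mask safely are replaced entirely with asterisks.
// Byte slicing assumes ASCII keys, which holds for typical API tokens.
func maskKey(key string) string {
	const head, tail = 8, 4
	if len(key) <= head+tail {
		return strings.Repeat("*", len(key))
	}
	return key[:head] + "..." + key[len(key)-tail:]
}

func main() {
	fmt.Println(maskKey("sk-proj-ABCDEF1234")) // prints "sk-proj-...1234"
	fmt.Println(maskKey("short"))              // prints "*****"
}
```

Masking short keys completely avoids the case where the head and tail together would reveal most or all of the value.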
Requirements Coverage
| Requirement | Source Plan | Description | Status | Evidence |
|---|---|---|---|---|
| CORE-01 | 01-04 | Scanner engine: keyword pre-filter + regex pipeline | SATISFIED | Three-stage pipeline in engine.go, all pipeline tests pass |
| CORE-02 | 01-02 | Provider YAML embedded at compile time via Go embed | SATISFIED | //go:embed definitions/*.yaml in loader.go |
| CORE-03 | 01-02 | Provider registry with pattern, keyword, confidence metadata | SATISFIED | Registry.List/Get/Stats/AC all working, 3 providers loaded |
| CORE-04 | 01-04 | Shannon entropy analysis for secondary signal | SATISFIED | Shannon() in entropy.go, used in detector.go with threshold check |
| CORE-05 | 01-04 | Worker pool with configurable count | SATISFIED | ants.NewPool(workers) in engine.go, --workers flag in scan.go |
| CORE-06 | 01-02, 01-04 | Aho-Corasick pre-filter before regex | SATISFIED | AC built at NewRegistry(), used in KeywordFilter stage |
| CORE-07 | 01-04 | mmap-based large file reading | DEFERRED | Explicitly deferred to Phase 4 in plan 01-04. FileSource uses os.ReadFile(). |
| STOR-01 | 01-03 | SQLite database for persisting scan results | SATISFIED | DB.Open with WAL mode, schema.sql embedded, findings/scans/settings tables |
| STOR-02 | 01-03 | AES-256 encryption for stored keys | SATISFIED | AES-256-GCM in encrypt.go, verified by test + raw DB grep |
| STOR-03 | 01-03 | Argon2 key derivation from passphrase | SATISFIED | DeriveKey with Argon2id RFC 9106 params in crypto.go |
| CLI-01 | 01-05 | 11 Cobra commands | SATISFIED | All 11 visible in --help output |
| CLI-02 | 01-05 | config init creates ~/.keyhunter.yaml | SATISFIED | Behavioral check confirms file creation |
| CLI-03 | 01-05 | config set | SATISFIED | Behavioral check confirms persistence |
| CLI-04 | 01-05 | providers list/info/stats | SATISFIED | All 3 subcommands working with real data |
| CLI-05 | 01-05 | Scan flags: --providers, --category, --confidence, --exclude, --verify, --workers, --output, --unmask, --notify | PARTIAL | Has: --exclude, --verify, --workers, --output, --unmask. Missing: --providers, --category, --confidence, --notify |
| PROV-10 | 01-02 | Provider YAML format_version and last_verified validated | SATISFIED | UnmarshalYAML validates both fields, test confirms rejection of invalid values |
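The CORE-07 row notes that FileSource currently reads whole files and splits them into 4096-byte chunks with a 256-byte overlap. A minimal sketch of that splitting follows, under the assumption that each chunk starts `size - overlap` bytes after the previous one; the `chunk` function is hypothetical and not taken from the repository.

```go
package main

import "fmt"

// chunk splits data into windows of at most size bytes, where each window
// repeats the last overlap bytes of the previous one. The returned slices
// share memory with data. Invalid parameters yield nil.
func chunk(data []byte, size, overlap int) [][]byte {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	var out [][]byte
	step := size - overlap
	for start := 0; start < len(data); start += step {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		out = append(out, data[start:end])
		if end == len(data) {
			break // the final window already reaches the end of data
		}
	}
	return out
}

func main() {
	data := make([]byte, 10000)
	for i, c := range chunk(data, 4096, 256) {
		fmt.Printf("chunk %d: %d bytes\n", i, len(c))
	}
}
```

The overlap exists so that a key straddling a chunk boundary still appears whole in one chunk, provided the key is no longer than the overlap. The same windowing would carry over to an mmap-backed reader, since it only needs a byte slice.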
Anti-Patterns Found
| File | Line | Pattern | Severity | Impact |
|---|---|---|---|---|
| cmd/stubs.go | 12 | "not implemented in this phase" messages | Info | Expected -- 8 stub commands for future phases, correctly deferred |
No blocker anti-patterns found. No TODO/FIXME/PLACEHOLDER comments in production code.
Human Verification Required
1. Visual Table Output Quality
Test: Run keyhunter scan ./testdata/samples/multiple_keys.txt in a terminal
Expected: Table output is properly aligned with lipgloss styling, no broken Unicode characters
Why human: Terminal rendering and visual alignment cannot be verified programmatically
2. Config File Formatting
Test: Inspect ~/.keyhunter.yaml after config init then config set workers 16
Expected: Clean YAML formatting, no duplicate keys, readable by a human
Why human: YAML formatting quality is subjective. Note that config set workers 16 creates a top-level workers key that is separate from scan.workers, which may be confusing
Gaps Summary
All 5 success criteria from the ROADMAP are fully verified. The phase goal -- "provider registry schema, encrypted storage layer, and CLI skeleton exist and function correctly" -- is achieved. All downstream subsystems have stable interfaces to build against.
Three minor gaps exist:
- CLI-05 partial coverage: 4 of 9 scan flags (--providers, --category, --confidence, --notify) are missing. These are filtering and notification flags that depend on features from later phases (provider filtering needs more providers in Phases 2-3, --notify needs Telegram in Phase 17). The 5 implemented flags (--exclude, --verify, --workers, --output, --unmask) are the ones relevant to Phase 1 functionality.
- CORE-07 deferred: mmap-based large file reading was explicitly deferred to Phase 4 in the plan. FileSource uses os.ReadFile(), which is correct for test fixtures but will not scale to large files.
- REQUIREMENTS.md stale: STOR-01/02/03 checkboxes are unchecked despite complete implementation.
None of these gaps block downstream development. The phase goal is achieved.
Verified: 2026-04-05
Verifier: Claude (gsd-verifier)