docs(01-foundation): create phase 1 plan — 5 plans across 4 execution waves

Wave 0: module init + test scaffolding (01-01)
Wave 1: provider registry (01-02) + storage layer (01-03) in parallel
Wave 2: scan engine pipeline (01-04, depends on 01-02)
Wave 3: CLI wiring + integration checkpoint (01-05, depends on all)

Covers all 16 Phase 1 requirements: CORE-01 through CORE-07, STOR-01 through STOR-03, CLI-01 through CLI-05, PROV-10.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -43,7 +43,14 @@ Decimal phases appear between their surrounding integers in numeric order.
3. `keyhunter config init` creates `~/.keyhunter.yaml` and `keyhunter config set <key> <value>` persists values
4. `keyhunter providers list` and `keyhunter providers info <name>` return provider metadata from YAML definitions
5. Provider YAML schema includes `format_version` and `last_verified` fields validated at load time

**Plans**: 5 plans

Plans:

- [ ] 01-01-PLAN.md — Go module init, dependency installation, test scaffolding and testdata fixtures
- [ ] 01-02-PLAN.md — Provider registry: YAML schema, embed loader, Aho-Corasick automaton, Registry struct
- [ ] 01-03-PLAN.md — Storage layer: AES-256-GCM encryption, Argon2id key derivation, SQLite + Finding CRUD
- [ ] 01-04-PLAN.md — Scan engine pipeline: keyword pre-filter, regex+entropy detector, FileSource, ants worker pool
- [ ] 01-05-PLAN.md — CLI wiring: scan, providers list/info/stats, config init/set/get, output table
### Phase 2: Tier 1-2 Providers
**Goal**: The 26 highest-value LLM provider YAML definitions exist with accurate regex patterns, keyword lists, confidence levels, and verify endpoints — covering OpenAI, Anthropic, Google AI, AWS Bedrock, Azure OpenAI and all major inference platforms

@@ -248,7 +255,7 @@ Phases execute in numeric order: 1 → 2 → 3 → ... → 18

| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Foundation | 0/5 | Planning complete | - |
| 2. Tier 1-2 Providers | 0/? | Not started | - |
| 3. Tier 3-9 Providers | 0/? | Not started | - |
| 4. Input Sources | 0/? | Not started | - |
359 .planning/phases/01-foundation/01-01-PLAN.md Normal file
@@ -0,0 +1,359 @@
---
phase: 01-foundation
plan: 01
type: execute
wave: 0
depends_on: []
files_modified:
  - go.mod
  - go.sum
  - main.go
  - testdata/samples/openai_key.txt
  - testdata/samples/anthropic_key.txt
  - testdata/samples/multiple_keys.txt
  - testdata/samples/no_keys.txt
  - pkg/providers/registry_test.go
  - pkg/storage/db_test.go
  - pkg/engine/scanner_test.go
autonomous: true
requirements: [CORE-01, CORE-02, CORE-03, CORE-04, CORE-05, CORE-06, CORE-07, STOR-01, STOR-02, STOR-03, CLI-01]

must_haves:
  truths:
    - "go.mod exists with all Phase 1 dependencies at pinned versions"
    - "go build ./... succeeds with zero errors on a fresh checkout"
    - "go test ./... -short runs without compilation errors (tests may fail — stubs are fine)"
    - "testdata/ contains files with known key patterns for scanner integration tests"
  artifacts:
    - path: "go.mod"
      provides: "Module declaration with all Phase 1 dependencies"
      contains: "module github.com/salvacybersec/keyhunter"
    - path: "main.go"
      provides: "Binary entry point under 30 lines"
      contains: "func main()"
    - path: "testdata/samples/openai_key.txt"
      provides: "Sample file with synthetic OpenAI key for scanner tests"
    - path: "pkg/providers/registry_test.go"
      provides: "Test stubs for provider loading and registry"
    - path: "pkg/storage/db_test.go"
      provides: "Test stubs for SQLite + encryption roundtrip"
    - path: "pkg/engine/scanner_test.go"
      provides: "Test stubs for pipeline stages"
  key_links:
    - from: "go.mod"
      to: "petar-dambovaliev/aho-corasick"
      via: "require directive"
      pattern: "petar-dambovaliev/aho-corasick"
    - from: "go.mod"
      to: "modernc.org/sqlite"
      via: "require directive"
      pattern: "modernc.org/sqlite"
---

<objective>
Initialize the Go module, install all Phase 1 dependencies at pinned versions, create the minimal main.go entry point, and lay down test scaffolding with testdata fixtures that every subsequent plan's tests depend on.

Purpose: All subsequent plans require a compiling module and test infrastructure to exist before they can add production code and make tests green. Wave 0 satisfies this bootstrap requirement.
Output: go.mod, go.sum, main.go, pkg/*/test stubs, testdata/ fixtures.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/01-foundation/01-RESEARCH.md
@.planning/phases/01-foundation/01-VALIDATION.md

<interfaces>
<!-- Module path used throughout the project -->
Module: github.com/salvacybersec/keyhunter

<!-- Pinned versions from RESEARCH.md -->
Dependencies to install:
  github.com/spf13/cobra@v1.10.2
  github.com/spf13/viper@v1.21.0
  modernc.org/sqlite@latest
  gopkg.in/yaml.v3@v3.0.1
  github.com/petar-dambovaliev/aho-corasick@latest
  github.com/panjf2000/ants/v2@v2.12.0
  golang.org/x/crypto@latest
  golang.org/x/time@latest
  github.com/charmbracelet/lipgloss@latest
  github.com/stretchr/testify@latest

<!-- Go version -->
go 1.22

<!-- Directory structure to scaffold (from RESEARCH.md) -->
keyhunter/
  main.go
  cmd/
    root.go       (created in Plan 05)
    scan.go       (created in Plan 05)
    providers.go  (created in Plan 05)
    config.go     (created in Plan 05)
  pkg/
    providers/    (created in Plan 02)
    engine/       (created in Plan 04)
    storage/      (created in Plan 03)
    config/       (created in Plan 05)
    output/       (created in Plan 05)
  providers/      (created in Plan 02)
  testdata/
    samples/
</interfaces>
</context>

<tasks>

<task type="auto" tdd="false">
<name>Task 1: Initialize Go module and install Phase 1 dependencies</name>
<files>go.mod, go.sum</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Standard Stack section — exact library versions)
- /home/salva/Documents/apikey/CLAUDE.md (Technology Stack table — version constraints)
</read_first>
<action>
Run the following commands in the project root (/home/salva/Documents/apikey):

```bash
go mod init github.com/salvacybersec/keyhunter
go get github.com/spf13/cobra@v1.10.2
go get github.com/spf13/viper@v1.21.0
go get modernc.org/sqlite@latest
go get gopkg.in/yaml.v3@v3.0.1
go get github.com/petar-dambovaliev/aho-corasick@latest
go get github.com/panjf2000/ants/v2@v2.12.0
go get golang.org/x/crypto@latest
go get golang.org/x/time@latest
go get github.com/charmbracelet/lipgloss@latest
go get github.com/stretchr/testify@latest
go mod tidy
```

Verify the resulting go.mod contains:
- `module github.com/salvacybersec/keyhunter`
- `go 1.22` (or 1.22.x)
- `github.com/spf13/cobra v1.10.2`
- `github.com/spf13/viper v1.21.0`
- `github.com/petar-dambovaliev/aho-corasick` (any version)
- `github.com/panjf2000/ants/v2 v2.12.0`
- `modernc.org/sqlite` (any v1.35.x)
- `github.com/charmbracelet/lipgloss` (any version)

Do NOT add: chi, templ, telego, gocron — these are Phase 17-18 only.
Do NOT use CGO_ENABLED=1 or mattn/go-sqlite3.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && grep -q 'module github.com/salvacybersec/keyhunter' go.mod && grep -q 'cobra v1.10.2' go.mod && grep -q 'modernc.org/sqlite' go.mod && echo "go.mod OK"</automated>
</verify>
<acceptance_criteria>
- go.mod contains `module github.com/salvacybersec/keyhunter`
- go.mod contains `github.com/spf13/cobra v1.10.2` (exact)
- go.mod contains `github.com/spf13/viper v1.21.0` (exact)
- go.mod contains `github.com/panjf2000/ants/v2 v2.12.0` (exact)
- go.mod contains `modernc.org/sqlite` (v1.35.x)
- go.mod contains `github.com/petar-dambovaliev/aho-corasick`
- go.mod contains `golang.org/x/crypto`
- go.mod contains `github.com/charmbracelet/lipgloss`
- go.sum exists and is non-empty
- `go mod verify` exits 0
</acceptance_criteria>
<done>go.mod and go.sum committed with all Phase 1 dependencies at correct versions</done>
</task>

<task type="auto" tdd="false">
<name>Task 2: Create main.go entry point and test scaffolding</name>
<files>
main.go,
testdata/samples/openai_key.txt,
testdata/samples/anthropic_key.txt,
testdata/samples/multiple_keys.txt,
testdata/samples/no_keys.txt,
pkg/providers/registry_test.go,
pkg/storage/db_test.go,
pkg/engine/scanner_test.go
</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-VALIDATION.md (Wave 0 Requirements and Per-Task Verification Map)
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Architecture Patterns, project structure diagram)
</read_first>
<action>
Create the following files:

**main.go** (must be under 30 lines):
```go
package main

import "github.com/salvacybersec/keyhunter/cmd"

func main() {
    cmd.Execute()
}
```

**testdata/samples/openai_key.txt** — file containing a synthetic (non-real) OpenAI-style key for scanner integration tests:
```
# Test file: synthetic OpenAI key pattern
OPENAI_API_KEY=sk-proj-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqr1234
```

**testdata/samples/anthropic_key.txt** — file containing a synthetic Anthropic-style key (long enough to satisfy the `sk-ant-api03-` pattern's 93-character minimum):
```
# Test file: synthetic Anthropic key pattern
export ANTHROPIC_API_KEY="sk-ant-api03-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ01234-AB"
```

**testdata/samples/multiple_keys.txt** — file with both key types:
```
# Multiple providers in one file
OPENAI_API_KEY=sk-proj-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqr5678
ANTHROPIC_API_KEY=sk-ant-api03-XYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv-XYZAB
```

**testdata/samples/no_keys.txt** — file with no keys (negative test case):
```
# This file contains no API keys
# Used to verify false-positive rate is zero for clean files
Hello world
```

**pkg/providers/registry_test.go** — test stubs (will be filled by Plan 02):
```go
package providers_test

import (
    "testing"
)

// TestRegistryLoad verifies that provider YAML files are loaded from embed.FS.
// Stub: will be implemented when registry.go exists (Plan 02).
func TestRegistryLoad(t *testing.T) {
    t.Skip("stub — implement after registry.go exists")
}

// TestProviderSchemaValidation verifies format_version and last_verified are required.
// Stub: will be implemented when schema.go validation exists (Plan 02).
func TestProviderSchemaValidation(t *testing.T) {
    t.Skip("stub — implement after schema.go validation exists")
}

// TestAhoCorasickBuild verifies the Aho-Corasick automaton builds from provider keywords.
// Stub: will be implemented when the registry builds the automaton (Plan 02).
func TestAhoCorasickBuild(t *testing.T) {
    t.Skip("stub — implement after registry AC build exists")
}
```

**pkg/storage/db_test.go** — test stubs (will be filled by Plan 03):
```go
package storage_test

import (
    "testing"
)

// TestDBOpen verifies SQLite database opens and creates schema.
// Stub: will be implemented when db.go exists (Plan 03).
func TestDBOpen(t *testing.T) {
    t.Skip("stub — implement after db.go exists")
}

// TestEncryptDecryptRoundtrip verifies AES-256-GCM encrypt/decrypt roundtrip.
// Stub: will be implemented when encrypt.go exists (Plan 03).
func TestEncryptDecryptRoundtrip(t *testing.T) {
    t.Skip("stub — implement after encrypt.go exists")
}

// TestArgon2KeyDerivation verifies Argon2id produces 32-byte key deterministically.
// Stub: will be implemented when crypto.go exists (Plan 03).
func TestArgon2KeyDerivation(t *testing.T) {
    t.Skip("stub — implement after crypto.go exists")
}
```

**pkg/engine/scanner_test.go** — test stubs (will be filled by Plan 04):
```go
package engine_test

import (
    "testing"
)

// TestShannonEntropy verifies the entropy function returns expected values.
// Stub: will be implemented when entropy.go exists (Plan 04).
func TestShannonEntropy(t *testing.T) {
    t.Skip("stub — implement after entropy.go exists")
}

// TestKeywordPreFilter verifies Aho-Corasick pre-filter rejects files without keywords.
// Stub: will be implemented when filter.go exists (Plan 04).
func TestKeywordPreFilter(t *testing.T) {
    t.Skip("stub — implement after filter.go exists")
}

// TestScannerPipeline verifies end-to-end scan of testdata returns expected findings.
// Stub: will be implemented when engine.go exists (Plan 04).
func TestScannerPipeline(t *testing.T) {
    t.Skip("stub — implement after engine.go exists")
}
```

Create the `cmd/` package directory with a minimal stub so main.go compiles:

**cmd/root.go** (minimal stub — will be replaced by Plan 05):
```go
package cmd

import "os"

// Execute is a stub. The real command tree is built in Plan 05.
func Execute() {
    _ = os.Args
}
```

After creating all files, run `go build ./...` to confirm the module compiles.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build ./... && go test ./... -short && echo "BUILD OK"</automated>
</verify>
<acceptance_criteria>
- `go build ./...` exits 0 with no errors
- `go test ./... -short` exits 0 (all stubs skip, no failures)
- main.go is under 30 lines
- testdata/samples/openai_key.txt contains `sk-proj-` prefix
- testdata/samples/anthropic_key.txt contains `sk-ant-api03-` prefix
- testdata/samples/no_keys.txt contains no key patterns
- pkg/providers/registry_test.go, pkg/storage/db_test.go, pkg/engine/scanner_test.go each exist with skip-based stubs
- cmd/root.go exists so `go build ./...` compiles
</acceptance_criteria>
<done>Module compiles, test stubs exist, testdata fixtures created. Subsequent plans can now add production code and make tests green.</done>
</task>

</tasks>

<verification>
After both tasks:
- `cd /home/salva/Documents/apikey && go build ./...` exits 0
- `go test ./... -short` exits 0
- `grep -r 'sk-proj-' testdata/` finds the OpenAI test fixture
- `grep -r 'sk-ant-api03-' testdata/` finds the Anthropic test fixture
- go.mod has all required dependencies at specified versions
</verification>

<success_criteria>
- go.mod initialized with module path `github.com/salvacybersec/keyhunter` and Go 1.22
- All 10 Phase 1 dependencies installed at correct versions
- main.go under 30 lines, compiles successfully
- 3 test stub files exist (providers, storage, engine)
- 4 testdata fixture files exist (openai key, anthropic key, multiple keys, no keys)
- `go build ./...` and `go test ./... -short` both exit 0
</success_criteria>

<output>
After completion, create `.planning/phases/01-foundation/01-01-SUMMARY.md` following the summary template.
</output>

663 .planning/phases/01-foundation/01-02-PLAN.md Normal file
@@ -0,0 +1,663 @@

---
phase: 01-foundation
plan: 02
type: execute
wave: 1
depends_on: [01-01]
files_modified:
  - providers/openai.yaml
  - providers/anthropic.yaml
  - providers/huggingface.yaml
  - pkg/providers/definitions/openai.yaml
  - pkg/providers/definitions/anthropic.yaml
  - pkg/providers/definitions/huggingface.yaml
  - pkg/providers/schema.go
  - pkg/providers/loader.go
  - pkg/providers/registry.go
  - pkg/providers/registry_test.go
autonomous: true
requirements: [CORE-02, CORE-03, CORE-06, PROV-10]

must_haves:
  truths:
    - "Provider YAML files are embedded at compile time — no filesystem access at runtime"
    - "Registry loads all YAML files from embed.FS and returns a slice of Provider structs"
    - "Provider schema validation rejects YAML missing format_version or last_verified"
    - "Aho-Corasick automaton is built from all provider keywords at registry init"
    - "keyhunter providers list command lists providers (tested via registry methods)"
  artifacts:
    - path: "providers/openai.yaml"
      provides: "Reference provider definition with all schema fields"
      contains: "format_version"
    - path: "pkg/providers/schema.go"
      provides: "Provider, Pattern, VerifySpec Go structs with UnmarshalYAML validation"
      exports: ["Provider", "Pattern", "VerifySpec"]
    - path: "pkg/providers/registry.go"
      provides: "Registry struct with List, Get, Stats, AC methods"
      exports: ["Registry", "NewRegistry"]
    - path: "pkg/providers/loader.go"
      provides: "embed.FS declaration and fs.WalkDir loading logic"
      contains: "go:embed"
  key_links:
    - from: "pkg/providers/loader.go"
      to: "providers/*.yaml"
      via: "//go:embed directive"
      pattern: "go:embed.*providers"
    - from: "pkg/providers/registry.go"
      to: "github.com/petar-dambovaliev/aho-corasick"
      via: "AC automaton build at NewRegistry()"
      pattern: "ahocorasick"
    - from: "pkg/providers/schema.go"
      to: "format_version and last_verified YAML fields"
      via: "UnmarshalYAML validation"
      pattern: "UnmarshalYAML"
---

<objective>
Build the provider registry: YAML schema structs with validation, embed.FS loader, in-memory registry with List/Get/Stats/AC methods, and three reference provider YAML definitions. The Aho-Corasick automaton is built from all provider keywords at registry initialization.

Purpose: Every downstream subsystem (scan engine, CLI providers command, verification engine) depends on the Registry interface. This plan establishes the stable contract they build against.
Output: providers/*.yaml, pkg/providers/{schema,loader,registry}.go, registry_test.go (stubs filled).
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/01-foundation/01-RESEARCH.md
@.planning/phases/01-foundation/01-01-SUMMARY.md

<interfaces>
<!-- Provider YAML schema (from ARCHITECTURE.md and RESEARCH.md) -->
Full provider YAML structure:
```yaml
format_version: 1
name: openai
display_name: OpenAI
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-proj-"
  - "openai"
patterns:
  - regex: 'sk-proj-[A-Za-z0-9_\-]{48,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.openai.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```

<!-- Go struct mapping -->
Provider struct fields:
  FormatVersion int        (yaml:"format_version" — must be >= 1)
  Name          string     (yaml:"name")
  DisplayName   string     (yaml:"display_name")
  Tier          int        (yaml:"tier")
  LastVerified  string     (yaml:"last_verified" — must be non-empty)
  Keywords      []string   (yaml:"keywords")
  Patterns      []Pattern  (yaml:"patterns")
  Verify        VerifySpec (yaml:"verify")

Pattern struct fields:
  Regex      string  (yaml:"regex")
  EntropyMin float64 (yaml:"entropy_min")
  Confidence string  (yaml:"confidence" — "high", "medium", "low")

VerifySpec struct fields:
  Method        string            (yaml:"method")
  URL           string            (yaml:"url")
  Headers       map[string]string (yaml:"headers")
  ValidStatus   []int             (yaml:"valid_status")
  InvalidStatus []int             (yaml:"invalid_status")

<!-- Registry methods needed by downstream plans -->
type Registry struct { ... }
func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats // {Total int, ByTier map[int]int, ByConfidence map[string]int}
func (r *Registry) AC() ahocorasick.AhoCorasick // pre-built automaton

<!-- embed path convention -->
A //go:embed path is resolved relative to the source file that declares it, and it may not
contain "..". Since loader.go lives at pkg/providers/loader.go, it cannot embed
../../providers/*.yaml, even though RESEARCH.md shows that path.

Resolution: keep the embedded provider YAML files inside the package at
pkg/providers/definitions/*.yaml and load them with //go:embed definitions/*.yaml.
The root-level providers/ directory remains as the user-facing reference copy.
Update files_modified accordingly.
</interfaces>

</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Provider YAML schema structs with validation</name>
<files>pkg/providers/schema.go, providers/openai.yaml, providers/anthropic.yaml, providers/huggingface.yaml</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 1: Provider Registry, Provider YAML schema section, PROV-10 row in requirements table)
- /home/salva/Documents/apikey/.planning/research/ARCHITECTURE.md (Provider Registry component, YAML schema example)
</read_first>
<behavior>
- Test 1: Provider with format_version=0 → UnmarshalYAML returns error "format_version must be >= 1"
- Test 2: Provider with empty last_verified → UnmarshalYAML returns error "last_verified is required"
- Test 3: Valid provider YAML → UnmarshalYAML succeeds, Provider.Name == "openai"
- Test 4: Provider with no patterns → loaded successfully (patterns list can be empty for schema-only providers)
- Test 5: Pattern.Confidence not in {"high","medium","low"} → error "confidence must be high, medium, or low"
</behavior>
<action>
Create pkg/providers/schema.go:

```go
package providers

import (
    "fmt"

    "gopkg.in/yaml.v3"
)

// Provider represents a single API key provider definition loaded from YAML.
type Provider struct {
    FormatVersion int        `yaml:"format_version"`
    Name          string     `yaml:"name"`
    DisplayName   string     `yaml:"display_name"`
    Tier          int        `yaml:"tier"`
    LastVerified  string     `yaml:"last_verified"`
    Keywords      []string   `yaml:"keywords"`
    Patterns      []Pattern  `yaml:"patterns"`
    Verify        VerifySpec `yaml:"verify"`
}

// Pattern defines a single regex pattern for API key detection.
type Pattern struct {
    Regex      string  `yaml:"regex"`
    EntropyMin float64 `yaml:"entropy_min"`
    Confidence string  `yaml:"confidence"`
}

// VerifySpec defines how to verify a key is live (used by the Phase 5 verification engine).
type VerifySpec struct {
    Method        string            `yaml:"method"`
    URL           string            `yaml:"url"`
    Headers       map[string]string `yaml:"headers"`
    ValidStatus   []int             `yaml:"valid_status"`
    InvalidStatus []int             `yaml:"invalid_status"`
}

// RegistryStats holds aggregate statistics about loaded providers.
type RegistryStats struct {
    Total        int
    ByTier       map[int]int
    ByConfidence map[string]int
}

// UnmarshalYAML implements yaml.Unmarshaler with schema validation (satisfies PROV-10).
func (p *Provider) UnmarshalYAML(value *yaml.Node) error {
    // Use a type alias to avoid infinite recursion
    type ProviderAlias Provider
    var alias ProviderAlias
    if err := value.Decode(&alias); err != nil {
        return err
    }
    if alias.FormatVersion < 1 {
        return fmt.Errorf("provider %q: format_version must be >= 1 (got %d)", alias.Name, alias.FormatVersion)
    }
    if alias.LastVerified == "" {
        return fmt.Errorf("provider %q: last_verified is required", alias.Name)
    }
    validConfidences := map[string]bool{"high": true, "medium": true, "low": true, "": true}
    for _, pat := range alias.Patterns {
        if !validConfidences[pat.Confidence] {
            return fmt.Errorf("provider %q: pattern confidence %q must be high, medium, or low", alias.Name, pat.Confidence)
        }
    }
    *p = Provider(alias)
    return nil
}
```

Create the three reference YAML provider definitions. These are SCHEMA EXAMPLES for Phase 1; full pattern libraries come in Phase 2-3.

**providers/openai.yaml:**
```yaml
format_version: 1
name: openai
display_name: OpenAI
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-proj-"
  - "openai"
patterns:
  - regex: 'sk-proj-[A-Za-z0-9_\-]{48,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.openai.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```

**providers/anthropic.yaml:**
```yaml
format_version: 1
name: anthropic
display_name: Anthropic
tier: 1
last_verified: "2026-04-04"
keywords:
  - "sk-ant-api03-"
  - "anthropic"
patterns:
  - regex: 'sk-ant-api03-[A-Za-z0-9_\-]{93,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://api.anthropic.com/v1/models
  headers:
    x-api-key: "{KEY}"
    anthropic-version: "2023-06-01"
  valid_status: [200]
  invalid_status: [401, 403]
```

**providers/huggingface.yaml:**
```yaml
format_version: 1
name: huggingface
display_name: HuggingFace
tier: 3
last_verified: "2026-04-04"
keywords:
  - "hf_"
  - "huggingface"
patterns:
  - regex: 'hf_[A-Za-z0-9]{34,}'
    entropy_min: 3.5
    confidence: high
verify:
  method: GET
  url: https://huggingface.co/api/whoami-v2
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build ./pkg/providers/... && go test ./pkg/providers/... -run TestProviderSchemaValidation -v 2>&1 | head -30</automated>
</verify>
<acceptance_criteria>
- `go build ./pkg/providers/...` exits 0
- providers/openai.yaml contains `format_version: 1` and `last_verified`
- providers/anthropic.yaml contains `format_version: 1` and `last_verified`
- providers/huggingface.yaml contains `format_version: 1` and `last_verified`
- pkg/providers/schema.go exports: Provider, Pattern, VerifySpec, RegistryStats
- Provider.UnmarshalYAML returns error when format_version < 1
- Provider.UnmarshalYAML returns error when last_verified is empty
- `grep -q 'UnmarshalYAML' pkg/providers/schema.go` exits 0
</acceptance_criteria>
<done>Provider schema structs exist with validation. Three reference YAML files exist with all required fields.</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: Embed loader, registry with Aho-Corasick, and filled test stubs</name>
<files>pkg/providers/loader.go, pkg/providers/registry.go, pkg/providers/registry_test.go</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 1: Provider Registry with Compile-Time Embed — exact code example)
- /home/salva/Documents/apikey/pkg/providers/schema.go (types just created in Task 1)
</read_first>
<behavior>
- Test 1: NewRegistry() loads 3 providers from embedded YAML → registry.List() returns slice of length 3
- Test 2: registry.Get("openai") → returns Provider with Name=="openai", bool==true
- Test 3: registry.Get("nonexistent") → returns zero Provider, bool==false
- Test 4: registry.Stats().Total == 3 and Stats().ByTier[1] == 2 (openai + anthropic are tier 1)
- Test 5: AC automaton built — registry.AC().FindAll("sk-proj-abc") returns non-empty slice
- Test 6: AC automaton does NOT match — registry.AC().FindAll("hello world") returns empty slice
</behavior>
<action>
IMPORTANT NOTE ON EMBED PATHS: Go's embed package does NOT allow paths containing "..",
so loader.go at pkg/providers/loader.go cannot embed ../../providers/*.yaml.

Solution: the embedded copies live at pkg/providers/definitions/*.yaml and are loaded with
//go:embed definitions/*.yaml. Keep BOTH locations for Phase 1: the root providers/
directory holds the user-facing reference files created in Task 1, and
pkg/providers/definitions/ holds the embedded copies. Copy the three YAML files from
providers/ to pkg/providers/definitions/ at the end of this task.
|
||||
|
||||
Create **pkg/providers/loader.go**:
```go
package providers

import (
	"embed"
	"fmt"
	"io/fs"
	"path/filepath"

	"gopkg.in/yaml.v3"
)

//go:embed definitions/*.yaml
var definitionsFS embed.FS

// loadProviders reads all YAML files from the embedded definitions FS.
func loadProviders() ([]Provider, error) {
	var providers []Provider
	err := fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".yaml" {
			return nil
		}
		data, err := definitionsFS.ReadFile(path)
		if err != nil {
			return fmt.Errorf("reading provider file %s: %w", path, err)
		}
		var p Provider
		if err := yaml.Unmarshal(data, &p); err != nil {
			return fmt.Errorf("parsing provider %s: %w", path, err)
		}
		providers = append(providers, p)
		return nil
	})
	return providers, err
}
```

Create **pkg/providers/registry.go**:
```go
package providers

import (
	"fmt"

	ahocorasick "github.com/petar-dambovaliev/aho-corasick"
)

// Registry is the in-memory store of all loaded provider definitions.
// It is initialized once at startup and is safe for concurrent reads.
type Registry struct {
	providers []Provider
	index     map[string]int          // name -> slice index
	ac        ahocorasick.AhoCorasick // pre-built automaton for keyword pre-filter
}

// NewRegistry loads all embedded provider YAML files, validates them, builds the
// Aho-Corasick automaton from all provider keywords, and returns the Registry.
func NewRegistry() (*Registry, error) {
	providers, err := loadProviders()
	if err != nil {
		return nil, fmt.Errorf("loading providers: %w", err)
	}

	index := make(map[string]int, len(providers))
	var keywords []string
	for i, p := range providers {
		index[p.Name] = i
		keywords = append(keywords, p.Keywords...)
	}

	builder := ahocorasick.NewAhoCorasickBuilder(ahocorasick.Opts{DFA: true})
	ac := builder.Build(keywords)

	return &Registry{
		providers: providers,
		index:     index,
		ac:        ac,
	}, nil
}

// List returns all loaded providers.
func (r *Registry) List() []Provider {
	return r.providers
}

// Get returns a provider by name and a boolean indicating whether it was found.
func (r *Registry) Get(name string) (Provider, bool) {
	idx, ok := r.index[name]
	if !ok {
		return Provider{}, false
	}
	return r.providers[idx], true
}

// Stats returns aggregate statistics about the loaded providers.
func (r *Registry) Stats() RegistryStats {
	stats := RegistryStats{
		Total:        len(r.providers),
		ByTier:       make(map[int]int),
		ByConfidence: make(map[string]int),
	}
	for _, p := range r.providers {
		stats.ByTier[p.Tier]++
		for _, pat := range p.Patterns {
			stats.ByConfidence[pat.Confidence]++
		}
	}
	return stats
}

// AC returns the pre-built Aho-Corasick automaton for keyword pre-filtering.
func (r *Registry) AC() ahocorasick.AhoCorasick {
	return r.ac
}
```

Then copy the three YAML files into the embed location:
```bash
mkdir -p /home/salva/Documents/apikey/pkg/providers/definitions
cp /home/salva/Documents/apikey/providers/openai.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
cp /home/salva/Documents/apikey/providers/anthropic.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
cp /home/salva/Documents/apikey/providers/huggingface.yaml /home/salva/Documents/apikey/pkg/providers/definitions/
```

Finally, fill in **pkg/providers/registry_test.go** (replacing the stubs from Plan 01). Note the `gopkg.in/yaml.v3` import, which the schema-validation test requires:

```go
package providers_test

import (
	"testing"

	"github.com/salvacybersec/keyhunter/pkg/providers"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"gopkg.in/yaml.v3"
)

func TestRegistryLoad(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)
	assert.GreaterOrEqual(t, len(reg.List()), 3, "expected at least 3 providers")
}

func TestRegistryGet(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	p, ok := reg.Get("openai")
	assert.True(t, ok)
	assert.Equal(t, "openai", p.Name)
	assert.Equal(t, 1, p.Tier)

	_, notOk := reg.Get("nonexistent-provider")
	assert.False(t, notOk)
}

func TestRegistryStats(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	stats := reg.Stats()
	assert.GreaterOrEqual(t, stats.Total, 3)
	assert.GreaterOrEqual(t, stats.ByTier[1], 2)
}

func TestAhoCorasickBuild(t *testing.T) {
	reg, err := providers.NewRegistry()
	require.NoError(t, err)

	ac := reg.AC()
	matches := ac.FindAll("export OPENAI_API_KEY=sk-proj-abc")
	assert.NotEmpty(t, matches)

	noMatches := ac.FindAll("hello world nothing here")
	assert.Empty(t, noMatches)
}

func TestProviderSchemaValidation(t *testing.T) {
	invalid := []byte("format_version: 0\nname: invalid\nlast_verified: \"\"\n")
	var p providers.Provider
	err := yaml.Unmarshal(invalid, &p)
	assert.Error(t, err)
	assert.Contains(t, err.Error(), "format_version")
}
```
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/providers/... -v -count=1 2>&1 | tail -20</automated>
</verify>
<acceptance_criteria>
- `go test ./pkg/providers/... -v` exits 0 with all 5 tests PASS (not SKIP)
- TestRegistryLoad passes with >= 3 providers
- TestRegistryGet passes — "openai" found, "nonexistent" not found
- TestRegistryStats passes — Total >= 3
- TestAhoCorasickBuild passes — "sk-proj-" match found, "hello world" empty
- TestProviderSchemaValidation passes — error on format_version=0
- `grep -r 'go:embed' pkg/providers/loader.go` exits 0
- pkg/providers/definitions/ directory exists with 3 YAML files
</acceptance_criteria>
<done>Registry loads providers from embedded YAML, builds Aho-Corasick automaton, exposes List/Get/Stats/AC. All 5 tests pass.</done>
</task>

</tasks>

<verification>
After both tasks:
- `go test ./pkg/providers/... -v -count=1` exits 0 with 5 tests PASS
- `go build ./...` still exits 0
- `grep -q 'format_version' providers/openai.yaml providers/anthropic.yaml providers/huggingface.yaml` exits 0
- `grep -q 'go:embed' pkg/providers/loader.go` exits 0
- pkg/providers/definitions/ has 3 YAML files (same content as providers/)
</verification>

<success_criteria>
- 3 reference provider YAML files exist in providers/ and pkg/providers/definitions/ with format_version and last_verified
- Provider schema validates format_version >= 1 and non-empty last_verified (PROV-10)
- Registry loads providers from embed.FS at compile time (CORE-02)
- Aho-Corasick automaton built from all keywords at NewRegistry() (CORE-06)
- Registry exposes List(), Get(), Stats(), AC() (CORE-03)
- 5 provider tests all pass
</success_criteria>

<output>
After completion, create `.planning/phases/01-foundation/01-02-SUMMARY.md` following the summary template.
</output>

.planning/phases/01-foundation/01-03-PLAN.md — new file, 634 lines
@@ -0,0 +1,634 @@

---
phase: 01-foundation
plan: 03
type: execute
wave: 1
depends_on: [01-01]
files_modified:
  - pkg/storage/schema.sql
  - pkg/storage/encrypt.go
  - pkg/storage/crypto.go
  - pkg/storage/db.go
  - pkg/storage/findings.go
  - pkg/storage/db_test.go
autonomous: true
requirements: [STOR-01, STOR-02, STOR-03]

must_haves:
  truths:
    - "SQLite database opens, runs migrations from embedded schema.sql, and closes cleanly"
    - "AES-256-GCM Encrypt/Decrypt roundtrip produces the original plaintext"
    - "Argon2id DeriveKey with the same passphrase and salt always returns the same 32-byte key"
    - "A Finding can be saved to the database with the key_value stored encrypted and retrieved as plaintext"
    - "The raw database file does NOT contain plaintext API key values"
  artifacts:
    - path: "pkg/storage/encrypt.go"
      provides: "Encrypt(plaintext, key) and Decrypt(ciphertext, key) using AES-256-GCM"
      exports: ["Encrypt", "Decrypt"]
    - path: "pkg/storage/crypto.go"
      provides: "DeriveKey(passphrase, salt) using Argon2id RFC 9106 params"
      exports: ["DeriveKey", "NewSalt"]
    - path: "pkg/storage/db.go"
      provides: "DB struct with Open(), Close(), WAL mode, embedded schema migration"
      exports: ["DB", "Open"]
    - path: "pkg/storage/findings.go"
      provides: "SaveFinding(finding, encKey) and ListFindings(encKey) CRUD"
      exports: ["SaveFinding", "ListFindings", "Finding"]
    - path: "pkg/storage/schema.sql"
      provides: "CREATE TABLE statements for findings, scans, settings"
      contains: "CREATE TABLE IF NOT EXISTS findings"
  key_links:
    - from: "pkg/storage/findings.go"
      to: "pkg/storage/encrypt.go"
      via: "Encrypt() called before INSERT, Decrypt() called after SELECT"
      pattern: "Encrypt|Decrypt"
    - from: "pkg/storage/db.go"
      to: "pkg/storage/schema.sql"
      via: "//go:embed schema.sql and db.Exec on open"
      pattern: "go:embed.*schema"
    - from: "pkg/storage/crypto.go"
      to: "golang.org/x/crypto/argon2"
      via: "argon2.IDKey call"
      pattern: "argon2\\.IDKey"
---

<objective>
Build the storage layer: AES-256-GCM column encryption, Argon2id key derivation, SQLite database with WAL mode and embedded schema, and Finding CRUD operations that transparently encrypt key values on write and decrypt on read.

Purpose: Scanner results from Plan 04 and CLI commands from Plan 05 need a storage layer to persist findings. The encryption contract (Encrypt/Decrypt/DeriveKey) must exist before the scanner pipeline can store keys.
Output: pkg/storage/{encrypt,crypto,db,findings}.go, schema.sql, and db_test.go (stubs filled).
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/01-foundation/01-RESEARCH.md
@.planning/phases/01-foundation/01-01-SUMMARY.md

<interfaces>
<!-- AES-256-GCM encrypt/decrypt pattern from RESEARCH.md Pattern 3 -->
func Encrypt(plaintext []byte, key []byte) ([]byte, error)
// key must be exactly 32 bytes (AES-256)
// nonce prepended to ciphertext in returned []byte
// uses crypto/aes + crypto/cipher GCM

func Decrypt(ciphertext []byte, key []byte) ([]byte, error)
// expects nonce-prepended format from Encrypt()
// returns ErrCiphertextTooShort if len < nonceSize

<!-- Argon2id key derivation pattern from RESEARCH.md Pattern 4 -->
func DeriveKey(passphrase []byte, salt []byte) []byte
// params: time=1, memory=64*1024, threads=4, keyLen=32
// returns exactly 32 bytes deterministically

func NewSalt() ([]byte, error)
// generates 16 random bytes via crypto/rand

<!-- SQLite schema — findings table -->
findings table columns:
  id INTEGER PRIMARY KEY AUTOINCREMENT
  scan_id INTEGER REFERENCES scans(id)
  provider_name TEXT NOT NULL
  key_value BLOB NOT NULL -- AES-256-GCM encrypted, nonce prepended
  key_masked TEXT NOT NULL -- first8...last4, stored plaintext for display
  confidence TEXT NOT NULL -- "high", "medium", "low"
  source_path TEXT
  source_type TEXT -- "file", "dir", "git", "stdin", "url"
  line_number INTEGER
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP

scans table columns:
  id INTEGER PRIMARY KEY AUTOINCREMENT
  started_at DATETIME NOT NULL
  finished_at DATETIME
  source_path TEXT
  finding_count INTEGER DEFAULT 0
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP

settings table columns:
  key TEXT PRIMARY KEY
  value TEXT NOT NULL
  updated_at DATETIME DEFAULT CURRENT_TIMESTAMP

<!-- Finding struct for inter-package communication -->
type Finding struct {
	ID           int64
	ScanID       int64
	ProviderName string
	KeyValue     string // plaintext — encrypted before storage
	KeyMasked    string // first8chars...last4chars
	Confidence   string
	SourcePath   string
	SourceType   string
	LineNumber   int
}

<!-- DB driver registration -->
import _ "modernc.org/sqlite"
// driver registered as "sqlite" (NOT "sqlite3")
db, err := sql.Open("sqlite", dataSourceName)
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: AES-256-GCM encryption and Argon2id key derivation</name>
<files>pkg/storage/encrypt.go, pkg/storage/crypto.go</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 3: AES-256-GCM Column Encryption and Pattern 4: Argon2id Key Derivation — exact code examples)
</read_first>
<behavior>
- Test 1: Encrypt then Decrypt with the same key → returns the original plaintext exactly
- Test 2: Encrypt produces output longer than input (nonce + tag overhead)
- Test 3: Two Encrypt calls on the same plaintext → different ciphertext (random nonce)
- Test 4: Decrypt with wrong key → returns error (GCM authentication fails)
- Test 5: DeriveKey with same passphrase+salt → same 32-byte output (deterministic)
- Test 6: DeriveKey output is exactly 32 bytes
- Test 7: NewSalt() returns 16 bytes, two calls return different values
</behavior>
<action>
Create **pkg/storage/encrypt.go**:
```go
package storage

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"io"
)

// ErrCiphertextTooShort is returned when ciphertext is shorter than the GCM nonce size.
var ErrCiphertextTooShort = errors.New("ciphertext too short")

// Encrypt encrypts plaintext using AES-256-GCM with a random nonce.
// The nonce is prepended to the returned ciphertext.
// key must be exactly 32 bytes (AES-256).
func Encrypt(plaintext []byte, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Seal appends encrypted data to nonce, so the nonce is prepended
	ciphertext := gcm.Seal(nonce, nonce, plaintext, nil)
	return ciphertext, nil
}

// Decrypt decrypts ciphertext produced by Encrypt.
// Expects the nonce to be prepended to the ciphertext.
func Decrypt(ciphertext []byte, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonceSize := gcm.NonceSize()
	if len(ciphertext) < nonceSize {
		return nil, ErrCiphertextTooShort
	}
	nonce, ciphertext := ciphertext[:nonceSize], ciphertext[nonceSize:]
	return gcm.Open(nil, nonce, ciphertext, nil)
}
```

Create **pkg/storage/crypto.go**:
```go
package storage

import (
	"crypto/rand"

	"golang.org/x/crypto/argon2"
)

const (
	argon2Time    uint32 = 1
	argon2Memory  uint32 = 64 * 1024 // 64 MB — RFC 9106 Section 7.3
	argon2Threads uint8  = 4
	argon2KeyLen  uint32 = 32 // AES-256 key length
	saltSize             = 16
)

// DeriveKey produces a 32-byte AES-256 key from a passphrase and salt using Argon2id.
// Uses RFC 9106 Section 7.3 recommended parameters.
// Given the same passphrase and salt, always returns the same key.
func DeriveKey(passphrase []byte, salt []byte) []byte {
	return argon2.IDKey(passphrase, salt, argon2Time, argon2Memory, argon2Threads, argon2KeyLen)
}

// NewSalt generates a cryptographically random 16-byte salt.
// Store alongside the database and reuse on each key derivation.
func NewSalt() ([]byte, error) {
	salt := make([]byte, saltSize)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	return salt, nil
}
```
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build ./pkg/storage/... && echo "BUILD OK"</automated>
</verify>
<acceptance_criteria>
- `go build ./pkg/storage/...` exits 0
- pkg/storage/encrypt.go exports: Encrypt, Decrypt, ErrCiphertextTooShort
- pkg/storage/crypto.go exports: DeriveKey, NewSalt
- `grep -q 'argon2\.IDKey' pkg/storage/crypto.go` exits 0
- `grep -q 'crypto/aes' pkg/storage/encrypt.go` exits 0
- `grep -q 'cipher\.NewGCM' pkg/storage/encrypt.go` exits 0
</acceptance_criteria>
<done>Encrypt/Decrypt and DeriveKey/NewSalt exist and compile. Encryption uses AES-256-GCM with a random nonce. Key derivation uses Argon2id RFC 9106 parameters.</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: SQLite database, schema, Finding CRUD, and filled test stubs</name>
<files>pkg/storage/schema.sql, pkg/storage/db.go, pkg/storage/findings.go, pkg/storage/db_test.go</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (STOR-01 row, Pattern 1 for embed usage pattern)
- /home/salva/Documents/apikey/pkg/storage/encrypt.go (Encrypt/Decrypt signatures)
- /home/salva/Documents/apikey/pkg/storage/crypto.go (DeriveKey signature)
</read_first>
<behavior>
- Test 1: Open(":memory:") returns *DB without error, schema tables exist
- Test 2: Encrypt/Decrypt roundtrip — Encrypt([]byte("sk-proj-abc"), key) then Decrypt returns "sk-proj-abc"
- Test 3: DeriveKey(passphrase, salt) twice returns identical 32 bytes
- Test 4: NewSalt() twice returns different slices
- Test 5: SaveFinding stores finding → ListFindings decrypts and returns the original plaintext KeyValue
- Test 6: Database file (when not :memory:) does NOT contain literal "sk-proj-test" in raw bytes
</behavior>
<action>
Create **pkg/storage/schema.sql**:
```sql
-- KeyHunter database schema
-- Version: 1

CREATE TABLE IF NOT EXISTS scans (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    started_at DATETIME NOT NULL,
    finished_at DATETIME,
    source_path TEXT,
    finding_count INTEGER DEFAULT 0,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS findings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id INTEGER REFERENCES scans(id),
    provider_name TEXT NOT NULL,
    key_value BLOB NOT NULL,
    key_masked TEXT NOT NULL,
    confidence TEXT NOT NULL,
    source_path TEXT,
    source_type TEXT,
    line_number INTEGER,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS settings (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

-- Indexes for common queries
CREATE INDEX IF NOT EXISTS idx_findings_scan_id ON findings(scan_id);
CREATE INDEX IF NOT EXISTS idx_findings_provider ON findings(provider_name);
CREATE INDEX IF NOT EXISTS idx_findings_created ON findings(created_at DESC);
```

Create **pkg/storage/db.go**:
```go
package storage

import (
	"database/sql"
	_ "embed"
	"fmt"

	_ "modernc.org/sqlite"
)

//go:embed schema.sql
var schemaSQLBytes []byte

// DB wraps the sql.DB connection with KeyHunter-specific behavior.
type DB struct {
	sql *sql.DB
}

// Open opens or creates a SQLite database at path, runs embedded schema migrations,
// and enables WAL mode for better concurrent read performance.
// Use ":memory:" for tests.
func Open(path string) (*DB, error) {
	sqlDB, err := sql.Open("sqlite", path)
	if err != nil {
		return nil, fmt.Errorf("opening database: %w", err)
	}

	// Enable WAL mode for concurrent reads
	if _, err := sqlDB.Exec("PRAGMA journal_mode=WAL"); err != nil {
		sqlDB.Close()
		return nil, fmt.Errorf("enabling WAL mode: %w", err)
	}

	// Enable foreign keys
	if _, err := sqlDB.Exec("PRAGMA foreign_keys=ON"); err != nil {
		sqlDB.Close()
		return nil, fmt.Errorf("enabling foreign keys: %w", err)
	}

	// Run schema migrations
	if _, err := sqlDB.Exec(string(schemaSQLBytes)); err != nil {
		sqlDB.Close()
		return nil, fmt.Errorf("running schema migrations: %w", err)
	}

	return &DB{sql: sqlDB}, nil
}

// Close closes the underlying database connection.
func (db *DB) Close() error {
	return db.sql.Close()
}

// SQL returns the underlying sql.DB for advanced use cases.
func (db *DB) SQL() *sql.DB {
	return db.sql
}
```

Create **pkg/storage/findings.go**:
```go
package storage

import (
	"fmt"
	"time"
)

// Finding represents a detected API key with metadata.
// KeyValue is always plaintext in this struct — encryption happens at the storage boundary.
type Finding struct {
	ID           int64
	ScanID       int64
	ProviderName string
	KeyValue     string // plaintext — encrypted before storage, decrypted after retrieval
	KeyMasked    string // first8...last4, stored plaintext
	Confidence   string
	SourcePath   string
	SourceType   string
	LineNumber   int
	CreatedAt    time.Time
}

// MaskKey returns the masked form of a key: first 8 chars + "..." + last 4 chars.
// If the key is too short (< 12 chars), returns the fixed mask "****".
func MaskKey(key string) string {
	if len(key) < 12 {
		return "****"
	}
	return key[:8] + "..." + key[len(key)-4:]
}

// SaveFinding encrypts the finding's KeyValue and persists the finding to the database.
// encKey must be a 32-byte AES-256 key (from DeriveKey).
func (db *DB) SaveFinding(f Finding, encKey []byte) (int64, error) {
	encrypted, err := Encrypt([]byte(f.KeyValue), encKey)
	if err != nil {
		return 0, fmt.Errorf("encrypting key value: %w", err)
	}

	masked := f.KeyMasked
	if masked == "" {
		masked = MaskKey(f.KeyValue)
	}

	res, err := db.sql.Exec(
		`INSERT INTO findings (scan_id, provider_name, key_value, key_masked, confidence, source_path, source_type, line_number)
		 VALUES (?, ?, ?, ?, ?, ?, ?, ?)`,
		f.ScanID, f.ProviderName, encrypted, masked, f.Confidence, f.SourcePath, f.SourceType, f.LineNumber,
	)
	if err != nil {
		return 0, fmt.Errorf("inserting finding: %w", err)
	}
	return res.LastInsertId()
}

// ListFindings retrieves all findings, decrypting key values using encKey.
// encKey must be the same 32-byte key used during SaveFinding.
func (db *DB) ListFindings(encKey []byte) ([]Finding, error) {
	rows, err := db.sql.Query(
		`SELECT id, scan_id, provider_name, key_value, key_masked, confidence,
		        source_path, source_type, line_number, created_at
		 FROM findings ORDER BY created_at DESC`,
	)
	if err != nil {
		return nil, fmt.Errorf("querying findings: %w", err)
	}
	defer rows.Close()

	var findings []Finding
	for rows.Next() {
		var f Finding
		var encrypted []byte
		var createdAt string
		err := rows.Scan(
			&f.ID, &f.ScanID, &f.ProviderName, &encrypted, &f.KeyMasked,
			&f.Confidence, &f.SourcePath, &f.SourceType, &f.LineNumber, &createdAt,
		)
		if err != nil {
			return nil, fmt.Errorf("scanning finding row: %w", err)
		}
		plain, err := Decrypt(encrypted, encKey)
		if err != nil {
			return nil, fmt.Errorf("decrypting finding %d: %w", f.ID, err)
		}
		f.KeyValue = string(plain)
		f.CreatedAt, _ = time.Parse("2006-01-02 15:04:05", createdAt)
		findings = append(findings, f)
	}
	return findings, rows.Err()
}
```

Fill **pkg/storage/db_test.go** (replacing stubs from Plan 01):
|
||||
```go
|
||||
package storage_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/salvacybersec/keyhunter/pkg/storage"
|
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestDBOpen(t *testing.T) {
	db, err := storage.Open(":memory:")
	require.NoError(t, err)
	defer db.Close()

	// Verify schema tables exist
	rows, err := db.SQL().Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
	require.NoError(t, err)
	defer rows.Close()

	var tables []string
	for rows.Next() {
		var name string
		require.NoError(t, rows.Scan(&name))
		tables = append(tables, name)
	}
	assert.Contains(t, tables, "findings")
	assert.Contains(t, tables, "scans")
	assert.Contains(t, tables, "settings")
}

func TestEncryptDecryptRoundtrip(t *testing.T) {
	key := make([]byte, 32) // deterministic test key: bytes 0..31
	for i := range key {
		key[i] = byte(i)
	}
	plaintext := []byte("sk-proj-supersecretapikey1234")

	ciphertext, err := storage.Encrypt(plaintext, key)
	require.NoError(t, err)
	assert.Greater(t, len(ciphertext), len(plaintext), "ciphertext should be longer than plaintext")

	recovered, err := storage.Decrypt(ciphertext, key)
	require.NoError(t, err)
	assert.Equal(t, plaintext, recovered)
}

func TestEncryptNonDeterministic(t *testing.T) {
	key := make([]byte, 32)
	plain := []byte("test-key")
	ct1, err1 := storage.Encrypt(plain, key)
	ct2, err2 := storage.Encrypt(plain, key)
	require.NoError(t, err1)
	require.NoError(t, err2)
	assert.NotEqual(t, ct1, ct2, "same plaintext encrypted twice should produce different ciphertext")
}

func TestDecryptWrongKey(t *testing.T) {
	key1 := make([]byte, 32)
	key2 := make([]byte, 32)
	key2[0] = 0xFF

	ct, err := storage.Encrypt([]byte("secret"), key1)
	require.NoError(t, err)

	_, err = storage.Decrypt(ct, key2)
	assert.Error(t, err, "decryption with wrong key should fail")
}

func TestArgon2KeyDerivation(t *testing.T) {
	passphrase := []byte("my-secure-passphrase")
	salt := []byte("1234567890abcdef") // 16 bytes

	key1 := storage.DeriveKey(passphrase, salt)
	key2 := storage.DeriveKey(passphrase, salt)

	assert.Equal(t, 32, len(key1), "derived key must be 32 bytes")
	assert.Equal(t, key1, key2, "same passphrase+salt must produce same key")
}

func TestNewSalt(t *testing.T) {
	salt1, err1 := storage.NewSalt()
	salt2, err2 := storage.NewSalt()
	require.NoError(t, err1)
	require.NoError(t, err2)
	assert.Equal(t, 16, len(salt1))
	assert.NotEqual(t, salt1, salt2, "two salts should differ")
}

func TestSaveFindingEncrypted(t *testing.T) {
	db, err := storage.Open(":memory:")
	require.NoError(t, err)
	defer db.Close()

	// Derive a test key
	key := storage.DeriveKey([]byte("testpassphrase"), []byte("testsalt1234xxxx"))

	f := storage.Finding{
		ProviderName: "openai",
		KeyValue:     "sk-proj-test1234567890abcdefghijklmnopqr",
		Confidence:   "high",
		SourcePath:   "/test/file.env",
		SourceType:   "file",
		LineNumber:   42,
	}

	id, err := db.SaveFinding(f, key)
	require.NoError(t, err)
	assert.Greater(t, id, int64(0))

	findings, err := db.ListFindings(key)
	require.NoError(t, err)
	require.Len(t, findings, 1)
	assert.Equal(t, "sk-proj-test1234567890abcdefghijklmnopqr", findings[0].KeyValue)
	assert.Equal(t, "openai", findings[0].ProviderName)
	// Verify masking
	assert.Equal(t, "sk-proj-...opqr", findings[0].KeyMasked)
}
```
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/storage/... -v -count=1 2>&1 | tail -25</automated>
</verify>
<acceptance_criteria>
- `go test ./pkg/storage/... -v -count=1` exits 0 with all 7 tests PASS (no SKIP)
- TestDBOpen finds tables: findings, scans, settings
- TestEncryptDecryptRoundtrip passes — recovered plaintext matches original
- TestEncryptNonDeterministic passes — two encryptions differ
- TestDecryptWrongKey passes — wrong key causes error
- TestArgon2KeyDerivation passes — 32 bytes, deterministic
- TestNewSalt passes — 16 bytes, non-deterministic
- TestSaveFindingEncrypted passes — stored and retrieved with correct KeyValue and KeyMasked
- `grep -q 'go:embed.*schema' pkg/storage/db.go` exits 0
- `grep -q 'modernc.org/sqlite' pkg/storage/db.go` exits 0
- `grep -q 'journal_mode=WAL' pkg/storage/db.go` exits 0
</acceptance_criteria>
<done>Storage layer complete — SQLite opens with schema, AES-256-GCM encrypt/decrypt works, Argon2id key derivation works, SaveFinding/ListFindings encrypt/decrypt transparently. All 7 tests pass.</done>
</task>

</tasks>

<verification>
After both tasks:
- `go test ./pkg/storage/... -v -count=1` exits 0 with 7 tests PASS
- `go build ./...` still exits 0
- `grep -q 'argon2\.IDKey' pkg/storage/crypto.go` exits 0
- `grep -q 'cipher\.NewGCM' pkg/storage/encrypt.go` exits 0
- `grep -q 'journal_mode=WAL' pkg/storage/db.go` exits 0
- schema.sql contains CREATE TABLE for findings, scans, settings
</verification>
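The `cipher.NewGCM` grep and the roundtrip/wrong-key tests above imply a sealed-nonce construction. A minimal stdlib sketch of what pkg/storage/encrypt.go could look like (the prepended-nonce layout is an assumption of this sketch, not something the plan mandates):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"io"
)

// Encrypt seals plaintext with AES-256-GCM. A fresh random nonce is prepended
// to the returned ciphertext, which is why ciphertext is always longer than
// plaintext and why encrypting the same input twice differs.
func Encrypt(plaintext, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Seal appends ciphertext+tag to nonce, giving nonce||ciphertext||tag.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// Decrypt splits off the nonce and opens the ciphertext. GCM authentication
// makes this fail on a wrong key or any tampering.
func Decrypt(ciphertext, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(ciphertext) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	ct, _ := Encrypt([]byte("sk-proj-secret"), key)
	pt, _ := Decrypt(ct, key)
	fmt.Println(string(pt)) // sk-proj-secret
}
```

The random nonce is what makes TestEncryptNonDeterministic pass, and the GCM tag is what makes TestDecryptWrongKey fail cleanly instead of returning garbage.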

<success_criteria>
- SQLite database opens and auto-migrates from embedded schema.sql (STOR-01)
- AES-256-GCM column encryption works: Encrypt + Decrypt roundtrip returns original (STOR-02)
- Argon2id key derivation: DeriveKey deterministic, 32 bytes, RFC 9106 params (STOR-03)
- Finding CRUD: SaveFinding encrypts before INSERT, ListFindings decrypts after SELECT
- All 7 storage tests pass
</success_criteria>

<output>
After completion, create `.planning/phases/01-foundation/01-03-SUMMARY.md` following the summary template.
</output>
682
.planning/phases/01-foundation/01-04-PLAN.md
Normal file
@@ -0,0 +1,682 @@
---
phase: 01-foundation
plan: 04
type: execute
wave: 2
depends_on: [01-02]
files_modified:
  - pkg/engine/chunk.go
  - pkg/engine/finding.go
  - pkg/engine/entropy.go
  - pkg/engine/filter.go
  - pkg/engine/detector.go
  - pkg/engine/engine.go
  - pkg/engine/sources/source.go
  - pkg/engine/sources/file.go
  - pkg/engine/scanner_test.go
autonomous: true
requirements: [CORE-01, CORE-04, CORE-05, CORE-06, CORE-07]

must_haves:
  truths:
    - "Shannon entropy function returns expected values for known inputs"
    - "Aho-Corasick pre-filter passes chunks containing provider keywords and drops those without"
    - "Detector correctly identifies OpenAI and Anthropic key patterns in test fixtures via regex"
    - "Full scan pipeline: scan testdata/samples/openai_key.txt → Finding with ProviderName==openai"
    - "Full scan pipeline: scan testdata/samples/no_keys.txt → zero findings"
    - "Worker pool uses ants v2 with configurable worker count"
  artifacts:
    - path: "pkg/engine/chunk.go"
      provides: "Chunk struct (Data []byte, Source string, Offset int64)"
      exports: ["Chunk"]
    - path: "pkg/engine/finding.go"
      provides: "Finding struct (provider, key value, masked, confidence, source, line)"
      exports: ["Finding", "MaskKey"]
    - path: "pkg/engine/entropy.go"
      provides: "Shannon(s string) float64 — ~10 line stdlib math implementation"
      exports: ["Shannon"]
    - path: "pkg/engine/filter.go"
      provides: "KeywordFilter stage — runs Aho-Corasick and passes/drops chunks"
      exports: ["KeywordFilter"]
    - path: "pkg/engine/detector.go"
      provides: "Detect stage — applies provider regexps and entropy check to chunks"
      exports: ["Detect"]
    - path: "pkg/engine/engine.go"
      provides: "Engine struct with Scan(ctx, src, cfg) <-chan Finding"
      exports: ["Engine", "NewEngine", "ScanConfig"]
    - path: "pkg/engine/sources/source.go"
      provides: "Source interface with Chunks(ctx, chan<- Chunk) error"
      exports: ["Source"]
    - path: "pkg/engine/sources/file.go"
      provides: "FileSource implementing Source for single-file scanning"
      exports: ["FileSource", "NewFileSource"]
  key_links:
    - from: "pkg/engine/engine.go"
      to: "pkg/providers/registry.go"
      via: "Engine holds *providers.Registry, uses Registry.AC() for pre-filter"
      pattern: "providers\\.Registry"
    - from: "pkg/engine/filter.go"
      to: "github.com/petar-dambovaliev/aho-corasick"
      via: "AC.FindAll() on each chunk"
      pattern: "FindAll"
    - from: "pkg/engine/detector.go"
      to: "pkg/engine/entropy.go"
      via: "Shannon() called when EntropyMin > 0 in pattern"
      pattern: "Shannon"
    - from: "pkg/engine/engine.go"
      to: "github.com/panjf2000/ants/v2"
      via: "ants.NewPool for detector workers"
      pattern: "ants\\.NewPool"
---

<objective>
Build the three-stage scanning engine pipeline: Aho-Corasick keyword pre-filter, regex + entropy detector workers using an ants goroutine pool, and a FileSource adapter. Wire them together in an Engine that emits Findings on a channel.

Purpose: The scan engine is the core differentiator. Plans 02 and 03 provide its dependencies (Registry for patterns + keywords, storage types for Finding). The CLI (Plan 05) calls Engine.Scan() to implement `keyhunter scan`.
Output: pkg/engine/{chunk,finding,entropy,filter,detector,engine}.go and sources/{source,file}.go. scanner_test.go stubs filled.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/01-foundation/01-RESEARCH.md
@.planning/phases/01-foundation/01-02-SUMMARY.md

<interfaces>
<!-- Provider Registry types (from Plan 02) -->
package providers

type Provider struct {
	Name     string
	Keywords []string
	Patterns []Pattern
	Tier     int
}

type Pattern struct {
	Regex      string
	EntropyMin float64
	Confidence string
}

type Registry struct { ... }
func (r *Registry) List() []Provider
func (r *Registry) AC() ahocorasick.AhoCorasick // pre-built Aho-Corasick

<!-- Three-stage pipeline pattern from RESEARCH.md Pattern 2 -->
chunksChan     chan Chunk   (buffer: 1000)
detectableChan chan Chunk   (buffer: 500)
resultsChan    chan Finding (buffer: 100)

Stage 1: Source.Chunks() → chunksChan (goroutine, closes chan on done)
Stage 2: KeywordFilter(chunksChan) → detectableChan (goroutine, AC.FindAll)
Stage 3: N detector workers (ants pool) → resultsChan

<!-- ScanConfig -->
type ScanConfig struct {
	Workers int  // default: runtime.NumCPU() * 8
	Verify  bool // Phase 5 — always false in Phase 1
	Unmask  bool // for output layer
}

<!-- Source interface -->
type Source interface {
	Chunks(ctx context.Context, out chan<- Chunk) error
}

<!-- FileSource -->
type FileSource struct {
	Path      string
	ChunkSize int // bytes per chunk, default 4096
}

Chunking strategy: read file in chunks of ChunkSize bytes with overlap of max(256, maxPatternLen)
to avoid splitting a key across chunk boundaries.
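The boundary arithmetic can be sanity-checked in isolation. A small sketch (the `chunkRanges` helper is illustrative only, not an artifact of this plan):

```go
package main

import "fmt"

// chunkRanges lists the [start, end) byte ranges overlapping chunking produces.
// size is ChunkSize and overlap is the boundary overlap, so successive chunk
// starts advance by size-overlap bytes.
func chunkRanges(total, size, overlap int) [][2]int {
	var ranges [][2]int
	step := size - overlap
	for start := 0; start < total; start += step {
		end := start + size
		if end > total {
			end = total
		}
		ranges = append(ranges, [2]int{start, end})
		if end == total {
			break
		}
	}
	return ranges
}

func main() {
	// A 10000-byte file with ChunkSize 4096 and overlap 256: chunks start at
	// 0, 3840, 7680, so a key shorter than the overlap that straddles one
	// boundary still appears whole in the following chunk.
	for _, r := range chunkRanges(10000, 4096, 256) {
		fmt.Println(r[0], r[1])
	}
}
```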

<!-- Aho-Corasick import -->
import ahocorasick "github.com/petar-dambovaliev/aho-corasick"
// ac.FindAll(s string) []ahocorasick.Match — returns match positions

<!-- ants import -->
import "github.com/panjf2000/ants/v2"
// pool, _ := ants.NewPool(workers, ants.WithOptions(...))
// pool.Submit(func() { ... })
// pool.ReleaseWithTimeout(timeout)
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Core types and Shannon entropy function</name>
<files>pkg/engine/chunk.go, pkg/engine/finding.go, pkg/engine/entropy.go</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (CORE-04 row: Shannon entropy, ~10-line stdlib function, threshold 3.5 bits/char)
- /home/salva/Documents/apikey/pkg/storage/findings.go (Finding and MaskKey defined there — engine.Finding is a separate type for the pipeline)
</read_first>
<behavior>
- Test 1: Shannon("aaaaaaa") → value near 0.0 (all same characters, no entropy)
- Test 2: Shannon("abcdefgh") → value near 3.0 (8 distinct chars)
- Test 3: Shannon("sk-proj-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqr") → >= 3.5 (real key entropy)
- Test 4: Shannon("") → 0.0 (empty string)
- Test 5: MaskKey("sk-proj-abc1234") → "sk-proj-...1234" (first 8 + last 4)
- Test 6: MaskKey("abc") → "****" (too short to mask)
</behavior>
<action>
Create **pkg/engine/chunk.go**:
```go
package engine

// Chunk is a segment of file content passed through the scanning pipeline.
type Chunk struct {
	Data   []byte // raw bytes
	Source string // file path, URL, or description
	Offset int64  // byte offset of this chunk within the source
}
```

Create **pkg/engine/finding.go**:
```go
package engine

import "time"

// Finding represents a detected API key from the scanning pipeline.
// KeyValue holds the plaintext key — the storage layer encrypts it before persisting.
type Finding struct {
	ProviderName string
	KeyValue     string // full plaintext key
	KeyMasked    string // first8...last4
	Confidence   string // "high", "medium", "low"
	Source       string // file path or description
	SourceType   string // "file", "dir", "git", "stdin", "url"
	LineNumber   int
	Offset       int64
	DetectedAt   time.Time
}

// MaskKey returns a masked representation: first 8 chars + "..." + last 4 chars.
// Returns "****" if the key is shorter than 12 characters.
func MaskKey(key string) string {
	if len(key) < 12 {
		return "****"
	}
	return key[:8] + "..." + key[len(key)-4:]
}
```

Create **pkg/engine/entropy.go**:
```go
package engine

import "math"

// Shannon computes the Shannon entropy of a string in bits per character.
// Returns 0.0 for empty strings.
// A value >= 3.5 indicates high randomness, consistent with real API keys.
func Shannon(s string) float64 {
	if len(s) == 0 {
		return 0.0
	}
	freq := make(map[rune]float64)
	for _, c := range s {
		freq[c]++
	}
	n := float64(len([]rune(s)))
	var entropy float64
	for _, count := range freq {
		p := count / n
		entropy -= p * math.Log2(p)
	}
	return entropy
}
```
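To see the 3.5 bits/char threshold separating placeholders from key-like strings, Shannon can be exercised directly. A standalone sketch (the function body is repeated verbatim so this runs on its own; the placeholder string is an example, not a plan fixture):

```go
package main

import (
	"fmt"
	"math"
)

// Shannon: same definition as above, repeated so this sketch runs standalone.
func Shannon(s string) float64 {
	if len(s) == 0 {
		return 0.0
	}
	freq := make(map[rune]float64)
	for _, c := range s {
		freq[c]++
	}
	n := float64(len([]rune(s)))
	var entropy float64
	for _, count := range freq {
		p := count / n
		entropy -= p * math.Log2(p)
	}
	return entropy
}

func main() {
	fmt.Printf("%.2f\n", Shannon("aaaaaaa"))           // 0.00 — one symbol, zero information
	fmt.Printf("%.2f\n", Shannon("abcdefgh"))          // 3.00 — 8 equiprobable chars, log2(8)
	fmt.Printf("%.2f\n", Shannon("YOUR_API_KEY_HERE")) // below 3.5 — repeated letters drag it down
}
```

This is why the detector's `EntropyMin` check filters out documentation placeholders that happen to match a provider regex.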
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build ./pkg/engine/... && echo "BUILD OK"</automated>
</verify>
<acceptance_criteria>
- `go build ./pkg/engine/...` exits 0
- pkg/engine/chunk.go exports Chunk with fields Data, Source, Offset
- pkg/engine/finding.go exports Finding and MaskKey
- pkg/engine/entropy.go exports Shannon using math.Log2
- `grep -q 'math\.Log2' pkg/engine/entropy.go` exits 0
- Shannon("aaaaaaa") == 0.0 (manually verifiable from code)
- MaskKey("sk-proj-abc1234") produces "sk-proj-...1234"
</acceptance_criteria>
<done>Chunk, Finding, MaskKey, and Shannon exist and compile. Shannon uses stdlib math only — no external library.</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: Pipeline stages, engine orchestration, FileSource, and filled test stubs</name>
<files>
pkg/engine/filter.go,
pkg/engine/detector.go,
pkg/engine/engine.go,
pkg/engine/sources/source.go,
pkg/engine/sources/file.go,
pkg/engine/scanner_test.go
</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (Pattern 2: Three-Stage Scanning Pipeline — exact channel-based code example)
- /home/salva/Documents/apikey/pkg/engine/chunk.go
- /home/salva/Documents/apikey/pkg/engine/finding.go
- /home/salva/Documents/apikey/pkg/engine/entropy.go
- /home/salva/Documents/apikey/pkg/providers/registry.go (Registry.AC() and Registry.List() signatures)
</read_first>
<behavior>
- Test 1: Scan testdata/samples/openai_key.txt → 1 finding, ProviderName=="openai", KeyValue contains "sk-proj-"
- Test 2: Scan testdata/samples/anthropic_key.txt → 1 finding, ProviderName=="anthropic"
- Test 3: Scan testdata/samples/no_keys.txt → 0 findings
- Test 4: Scan testdata/samples/multiple_keys.txt → 2 findings (openai + anthropic)
- Test 5: Shannon("sk-proj-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqr") >= 3.5 (entropy check)
- Test 6: KeywordFilter drops a chunk with text "hello world" (no provider keywords)
</behavior>
<action>
Create **pkg/engine/sources/source.go**:
```go
package sources

import (
	"context"

	"github.com/salvacybersec/keyhunter/pkg/engine"
)

// Source is the interface all input adapters must implement.
// Chunks writes content segments to the out channel until the source is exhausted or ctx is cancelled.
type Source interface {
	Chunks(ctx context.Context, out chan<- engine.Chunk) error
}
```

Create **pkg/engine/sources/file.go**:
```go
package sources

import (
	"context"
	"os"

	"github.com/salvacybersec/keyhunter/pkg/engine"
)

const defaultChunkSize = 4096
const chunkOverlap = 256 // overlap between chunks to avoid splitting keys at boundaries

// FileSource reads a single file and emits overlapping chunks.
type FileSource struct {
	Path      string
	ChunkSize int
}

// NewFileSource creates a FileSource for the given path with the default chunk size.
func NewFileSource(path string) *FileSource {
	return &FileSource{Path: path, ChunkSize: defaultChunkSize}
}

// Chunks reads the file in overlapping segments and sends each chunk to out.
func (f *FileSource) Chunks(ctx context.Context, out chan<- engine.Chunk) error {
	data, err := os.ReadFile(f.Path)
	if err != nil {
		return err
	}
	size := f.ChunkSize
	if size <= 0 {
		size = defaultChunkSize
	}
	if len(data) <= size {
		// File fits in one chunk
		select {
		case <-ctx.Done():
			return ctx.Err()
		case out <- engine.Chunk{Data: data, Source: f.Path, Offset: 0}:
		}
		return nil
	}
	// Emit overlapping chunks so a key split at a boundary still appears whole in the next chunk.
	step := size - chunkOverlap
	if step <= 0 {
		step = size // guard against ChunkSize <= chunkOverlap, which would never advance
	}
	for start := 0; start < len(data); start += step {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		chunk := engine.Chunk{
			Data:   data[start:end],
			Source: f.Path,
			Offset: int64(start), // true byte offset of this chunk within the file
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case out <- chunk:
		}
		if end == len(data) {
			break
		}
	}
	return nil
}
```

Create **pkg/engine/filter.go**:
```go
package engine

import (
	ahocorasick "github.com/petar-dambovaliev/aho-corasick"
)

// KeywordFilter filters a stream of chunks using an Aho-Corasick automaton.
// Only chunks that contain at least one provider keyword are sent to out.
// This is Stage 2 of the pipeline (runs after Source, before Detect).
func KeywordFilter(ac ahocorasick.AhoCorasick, in <-chan Chunk, out chan<- Chunk) {
	for chunk := range in {
		if len(ac.FindAll(string(chunk.Data))) > 0 {
			out <- chunk
		}
	}
}
```

Create **pkg/engine/detector.go**:
```go
package engine

import (
	"regexp"
	"strings"
	"time"

	"github.com/salvacybersec/keyhunter/pkg/providers"
)

// Detect applies provider regex patterns and optional entropy checks to a chunk.
// It returns all findings from the chunk.
func Detect(chunk Chunk, providerList []providers.Provider) []Finding {
	var findings []Finding
	content := string(chunk.Data)

	for _, p := range providerList {
		for _, pat := range p.Patterns {
			re, err := regexp.Compile(pat.Regex)
			if err != nil {
				continue // invalid regex — skip silently
			}
			matches := re.FindAllString(content, -1)
			for _, match := range matches {
				// Apply entropy check if threshold is set
				if pat.EntropyMin > 0 && Shannon(match) < pat.EntropyMin {
					continue // too low entropy — likely a placeholder
				}
				line := lineNumber(content, match)
				findings = append(findings, Finding{
					ProviderName: p.Name,
					KeyValue:     match,
					KeyMasked:    MaskKey(match),
					Confidence:   pat.Confidence,
					Source:       chunk.Source,
					SourceType:   "file",
					LineNumber:   line,
					Offset:       chunk.Offset,
					DetectedAt:   time.Now(),
				})
			}
		}
	}
	return findings
}

// lineNumber returns the 1-based line number where match first appears in content.
func lineNumber(content, match string) int {
	idx := strings.Index(content, match)
	if idx < 0 {
		return 0
	}
	return strings.Count(content[:idx], "\n") + 1
}
```
Create **pkg/engine/engine.go**:
```go
package engine

import (
	"context"
	"runtime"
	"sync"
	"time"

	"github.com/panjf2000/ants/v2"
	"github.com/salvacybersec/keyhunter/pkg/providers"
)

// ScanConfig controls scan execution parameters.
type ScanConfig struct {
	Workers int  // number of detector goroutines; defaults to runtime.NumCPU() * 8
	Verify  bool // opt-in active verification (Phase 5)
	Unmask  bool // include full key in Finding.KeyValue
}

// Source is any chunk producer. It is declared here rather than importing
// pkg/engine/sources, which would create an import cycle (sources imports
// engine for the Chunk type). *sources.FileSource satisfies it structurally.
type Source interface {
	Chunks(ctx context.Context, out chan<- Chunk) error
}

// Engine orchestrates the three-stage scanning pipeline.
type Engine struct {
	registry *providers.Registry
}

// NewEngine creates an Engine backed by the given provider registry.
func NewEngine(registry *providers.Registry) *Engine {
	return &Engine{registry: registry}
}

// Scan runs the three-stage pipeline against src and returns a channel of Findings.
// The channel is closed when all chunks have been processed.
// The caller must drain the channel fully or cancel ctx to avoid goroutine leaks.
func (e *Engine) Scan(ctx context.Context, src Source, cfg ScanConfig) (<-chan Finding, error) {
	workers := cfg.Workers
	if workers <= 0 {
		workers = runtime.NumCPU() * 8
	}

	chunksChan := make(chan Chunk, 1000)
	detectableChan := make(chan Chunk, 500)
	resultsChan := make(chan Finding, 100)

	// Stage 1: source → chunksChan
	go func() {
		defer close(chunksChan)
		_ = src.Chunks(ctx, chunksChan)
	}()

	// Stage 2: keyword pre-filter → detectableChan
	go func() {
		defer close(detectableChan)
		KeywordFilter(e.registry.AC(), chunksChan, detectableChan)
	}()

	// Stage 3: detector workers → resultsChan
	pool, err := ants.NewPool(workers)
	if err != nil {
		close(resultsChan)
		return nil, err
	}
	providerList := e.registry.List()

	var wg sync.WaitGroup
	var mu sync.Mutex

	go func() {
		defer func() {
			wg.Wait()
			close(resultsChan)
			pool.ReleaseWithTimeout(5 * time.Second)
		}()

		for chunk := range detectableChan {
			c := chunk // capture loop variable for the closure
			wg.Add(1)
			_ = pool.Submit(func() {
				defer wg.Done()
				found := Detect(c, providerList)
				mu.Lock()
				for _, f := range found {
					select {
					case resultsChan <- f:
					case <-ctx.Done():
					}
				}
				mu.Unlock()
			})
		}
	}()

	return resultsChan, nil
}
```

Fill **pkg/engine/scanner_test.go** (replacing stubs from Plan 01):
```go
package engine_test

import (
	"context"
	"testing"

	"github.com/salvacybersec/keyhunter/pkg/engine"
	"github.com/salvacybersec/keyhunter/pkg/engine/sources"
	"github.com/salvacybersec/keyhunter/pkg/providers"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func newTestRegistry(t *testing.T) *providers.Registry {
	t.Helper()
	reg, err := providers.NewRegistry()
	require.NoError(t, err)
	return reg
}

func TestShannonEntropy(t *testing.T) {
	assert.InDelta(t, 0.0, engine.Shannon("aaaaaaa"), 0.01)
	assert.Greater(t, engine.Shannon("sk-proj-ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqr"), 3.5)
	assert.Equal(t, 0.0, engine.Shannon(""))
}

func TestKeywordPreFilter(t *testing.T) {
	reg := newTestRegistry(t)
	ac := reg.AC()

	// Chunk with OpenAI keyword should pass
	matches := ac.FindAll("export OPENAI_API_KEY=sk-proj-test")
	assert.NotEmpty(t, matches)

	// Chunk with no keywords should be dropped
	noMatches := ac.FindAll("hello world no secrets here")
	assert.Empty(t, noMatches)
}

func TestScannerPipelineOpenAI(t *testing.T) {
	reg := newTestRegistry(t)
	eng := engine.NewEngine(reg)
	src := sources.NewFileSource("../../testdata/samples/openai_key.txt")
	cfg := engine.ScanConfig{Workers: 2}

	ch, err := eng.Scan(context.Background(), src, cfg)
	require.NoError(t, err)

	var findings []engine.Finding
	for f := range ch {
		findings = append(findings, f)
	}

	require.Len(t, findings, 1, "expected exactly 1 finding in openai_key.txt")
	assert.Equal(t, "openai", findings[0].ProviderName)
	assert.Contains(t, findings[0].KeyValue, "sk-proj-")
}

func TestScannerPipelineAnthropic(t *testing.T) {
	reg := newTestRegistry(t)
	eng := engine.NewEngine(reg)
	src := sources.NewFileSource("../../testdata/samples/anthropic_key.txt")
	cfg := engine.ScanConfig{Workers: 2}

	ch, err := eng.Scan(context.Background(), src, cfg)
	require.NoError(t, err)

	var findings []engine.Finding
	for f := range ch {
		findings = append(findings, f)
	}

	require.Len(t, findings, 1, "expected exactly 1 finding in anthropic_key.txt")
	assert.Equal(t, "anthropic", findings[0].ProviderName)
}

func TestScannerPipelineNoKeys(t *testing.T) {
	reg := newTestRegistry(t)
	eng := engine.NewEngine(reg)
	src := sources.NewFileSource("../../testdata/samples/no_keys.txt")
	cfg := engine.ScanConfig{Workers: 2}

	ch, err := eng.Scan(context.Background(), src, cfg)
	require.NoError(t, err)

	var findings []engine.Finding
	for f := range ch {
		findings = append(findings, f)
	}

	assert.Empty(t, findings, "expected zero findings in no_keys.txt")
}

func TestScannerPipelineMultipleKeys(t *testing.T) {
	reg := newTestRegistry(t)
	eng := engine.NewEngine(reg)
	src := sources.NewFileSource("../../testdata/samples/multiple_keys.txt")
	cfg := engine.ScanConfig{Workers: 2}

	ch, err := eng.Scan(context.Background(), src, cfg)
	require.NoError(t, err)

	var findings []engine.Finding
	for f := range ch {
		findings = append(findings, f)
	}

	assert.GreaterOrEqual(t, len(findings), 2, "expected at least 2 findings in multiple_keys.txt")

	var names []string
	for _, f := range findings {
		names = append(names, f.ProviderName)
	}
	assert.Contains(t, names, "openai")
	assert.Contains(t, names, "anthropic")
}
```
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/engine/... -v -count=1 2>&1 | tail -30</automated>
</verify>
<acceptance_criteria>
- `go test ./pkg/engine/... -v -count=1` exits 0 with all tests PASS (no SKIP)
- TestShannonEntropy passes — 0.0 for "aaaaaaa", >= 3.5 for real key pattern
- TestKeywordPreFilter passes — AC matches sk-proj-, empty for "hello world"
- TestScannerPipelineOpenAI passes — 1 finding with ProviderName=="openai"
- TestScannerPipelineNoKeys passes — 0 findings
- TestScannerPipelineMultipleKeys passes — >= 2 findings with both provider names
- `grep -q 'ants\.NewPool' pkg/engine/engine.go` exits 0
- `grep -q 'KeywordFilter' pkg/engine/engine.go` exits 0
- `go build ./...` still exits 0
</acceptance_criteria>
<done>Three-stage scanning pipeline works end-to-end: FileSource → KeywordFilter (AC) → Detect (regex + entropy) → Finding channel. All engine tests pass.</done>
</task>

</tasks>

<verification>
After both tasks:
- `go test ./pkg/engine/... -v -count=1` exits 0 with 6 tests PASS
- `go build ./...` exits 0
- `grep -q 'ants\.NewPool' pkg/engine/engine.go` exits 0
- `grep -q 'math\.Log2' pkg/engine/entropy.go` exits 0
- Scanning testdata/samples/openai_key.txt returns 1 finding with provider "openai"
- Scanning testdata/samples/no_keys.txt returns 0 findings
</verification>

<success_criteria>
- Three-stage pipeline: AC pre-filter → regex + entropy detector → results channel (CORE-01, CORE-06)
- Shannon entropy function using stdlib math (CORE-04)
- ants v2 goroutine pool with configurable worker count (CORE-05)
- FileSource adapter reading files in overlapping chunks (CORE-07 partial — full mmap in Phase 4)
- All engine tests pass against real testdata fixtures
</success_criteria>

<output>
After completion, create `.planning/phases/01-foundation/01-04-SUMMARY.md` following the summary template.
</output>
|
||||
748
.planning/phases/01-foundation/01-05-PLAN.md
Normal file
@@ -0,0 +1,748 @@
---
phase: 01-foundation
plan: 05
type: execute
wave: 3
depends_on: [01-02, 01-03, 01-04]
files_modified:
  - cmd/root.go
  - cmd/scan.go
  - cmd/providers.go
  - cmd/config.go
  - pkg/config/config.go
  - pkg/output/table.go
autonomous: false
requirements: [CLI-01, CLI-02, CLI-03, CLI-04, CLI-05]

must_haves:
  truths:
    - "`keyhunter scan ./testdata/samples/openai_key.txt` runs the pipeline and prints a finding"
    - "`keyhunter providers list` prints a table with at least 3 providers"
    - "`keyhunter providers info openai` prints OpenAI provider details"
    - "`keyhunter config init` creates ~/.keyhunter.yaml without error"
    - "`keyhunter config set workers 16` persists the value to ~/.keyhunter.yaml"
    - "`keyhunter --help` shows all top-level commands: scan, providers, config"
  artifacts:
    - path: "cmd/root.go"
      provides: "Cobra root command with PersistentPreRunE config loading"
      contains: "cobra.Command"
    - path: "cmd/scan.go"
      provides: "scan command wiring Engine + FileSource + output table"
      exports: ["scanCmd"]
    - path: "cmd/providers.go"
      provides: "providers list/info/stats subcommands using Registry"
      exports: ["providersCmd"]
    - path: "cmd/config.go"
      provides: "config init/set/get subcommands using Viper"
      exports: ["configCmd"]
    - path: "pkg/config/config.go"
      provides: "Config struct with Load() and defaults"
      exports: ["Config", "Load"]
    - path: "pkg/output/table.go"
      provides: "lipgloss terminal table for printing Findings"
      exports: ["PrintFindings"]
  key_links:
    - from: "cmd/scan.go"
      to: "pkg/engine/engine.go"
      via: "engine.NewEngine(registry).Scan() called in RunE"
      pattern: "engine\\.NewEngine"
    - from: "cmd/scan.go"
      to: "pkg/storage/db.go"
      via: "storage.Open() called, SaveFinding for each result"
      pattern: "storage\\.Open"
    - from: "cmd/root.go"
      to: "github.com/spf13/viper"
      via: "viper.SetConfigFile in PersistentPreRunE"
      pattern: "viper\\.SetConfigFile"
    - from: "cmd/providers.go"
      to: "pkg/providers/registry.go"
      via: "Registry.List(), Registry.Get(), Registry.Stats() called"
      pattern: "registry\\.List|registry\\.Get|registry\\.Stats"
---
|
||||
|
||||
<objective>
Wire all subsystems together through the Cobra CLI: the scan command (engine + storage + output), the providers list/info/stats commands, and the config init/set/get commands. This is the integration layer — all business logic lives in pkg/; cmd/ only wires it together.

Purpose: Satisfies all Phase 1 CLI requirements and delivers the first working `keyhunter scan` command that completes the end-to-end success criteria.
Output: cmd/{root,scan,providers,config}.go, pkg/config/config.go, pkg/output/table.go.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/01-foundation/01-RESEARCH.md
@.planning/phases/01-foundation/01-02-SUMMARY.md
@.planning/phases/01-foundation/01-03-SUMMARY.md
@.planning/phases/01-foundation/01-04-SUMMARY.md

<interfaces>
<!-- Engine (from Plan 04) -->
package engine
type ScanConfig struct { Workers int; Verify bool; Unmask bool }
func NewEngine(registry *providers.Registry) *Engine
func (e *Engine) Scan(ctx context.Context, src sources.Source, cfg ScanConfig) (<-chan Finding, error)

<!-- FileSource (from Plan 04) -->
package sources
func NewFileSource(path string) *FileSource

<!-- Finding type (from Plan 04) -->
type Finding struct {
    ProviderName string
    KeyValue     string
    KeyMasked    string
    Confidence   string
    Source       string
    LineNumber   int
}

<!-- Storage (from Plan 03) -->
package storage
func Open(path string) (*DB, error)
func (db *DB) SaveFinding(f Finding, encKey []byte) (int64, error)
func DeriveKey(passphrase []byte, salt []byte) []byte
func NewSalt() ([]byte, error)

<!-- Registry (from Plan 02) -->
package providers
func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats

<!-- Config defaults -->
DBPath: ~/.keyhunter/keyhunter.db
ConfigPath: ~/.keyhunter.yaml
Workers: runtime.NumCPU() * 8
Passphrase: (prompt if not in env KEYHUNTER_PASSPHRASE — Phase 1: use empty string as dev default)

<!-- Viper config keys -->
"database.path" → DBPath
"scan.workers" → Workers
"encryption.passphrase" → Passphrase (sensitive — warn in help)

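Taken together, the defaults and Viper keys above correspond to a config file shaped like this minimal sketch (the concrete values and comments are illustrative assumptions, not required content):

```yaml
# ~/.keyhunter.yaml — illustrative example
database:
  path: /home/user/.keyhunter/keyhunter.db
scan:
  workers: 0          # 0 = auto (runtime.NumCPU() * 8)
encryption:
  passphrase: ""      # sensitive — prefer the KEYHUNTER_PASSPHRASE env var
```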
<!-- lipgloss table output -->
Columns: PROVIDER | MASKED KEY | CONFIDENCE | SOURCE | LINE
Colors: use lipgloss.NewStyle().Foreground() for confidence: high=green, medium=yellow, low=red
</interfaces>
</context>

<tasks>

<task type="auto" tdd="false">
<name>Task 1: Config package, output table, and root command</name>
<files>pkg/config/config.go, pkg/output/table.go, cmd/root.go</files>
<read_first>
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (CLI-01, CLI-02, CLI-03 rows; Standard Stack: cobra v1.10.2 + viper v1.21.0)
- /home/salva/Documents/apikey/pkg/engine/finding.go (Finding struct fields for output)
</read_first>
<action>
Create **pkg/config/config.go**:
```go
package config

import (
	"os"
	"path/filepath"
	"runtime"
)

// Config holds all KeyHunter runtime configuration.
// Values are populated from ~/.keyhunter.yaml, environment variables,
// and CLI flags (listed lowest to highest precedence).
type Config struct {
	DBPath     string // path to SQLite database file
	ConfigPath string // path to config YAML file
	Workers    int    // number of scanner worker goroutines
	Passphrase string // encryption passphrase (sensitive)
}

// Load returns a Config with defaults applied.
// Callers should override individual fields after Load() using viper-bound values.
func Load() Config {
	home, _ := os.UserHomeDir()
	return Config{
		DBPath:     filepath.Join(home, ".keyhunter", "keyhunter.db"),
		ConfigPath: filepath.Join(home, ".keyhunter.yaml"),
		Workers:    runtime.NumCPU() * 8,
		Passphrase: "", // Phase 1: empty passphrase; Phase 6+ will prompt
	}
}
```

Create **pkg/output/table.go**:
```go
package output

import (
	"fmt"
	"os"

	"github.com/charmbracelet/lipgloss"
	"github.com/salvacybersec/keyhunter/pkg/engine"
)

var (
	styleHigh   = lipgloss.NewStyle().Foreground(lipgloss.Color("2")) // green
	styleMedium = lipgloss.NewStyle().Foreground(lipgloss.Color("3")) // yellow
	styleLow    = lipgloss.NewStyle().Foreground(lipgloss.Color("1")) // red
	styleHeader = lipgloss.NewStyle().Bold(true).Underline(true)
)

// PrintFindings writes findings as a colored terminal table to stdout.
// If unmask is true, KeyValue is shown; otherwise KeyMasked is shown.
func PrintFindings(findings []engine.Finding, unmask bool) {
	if len(findings) == 0 {
		fmt.Println("No API keys found.")
		return
	}

	// Header
	fmt.Fprintf(os.Stdout, "%-20s %-40s %-10s %-30s %s\n",
		styleHeader.Render("PROVIDER"),
		styleHeader.Render("KEY"),
		styleHeader.Render("CONFIDENCE"),
		styleHeader.Render("SOURCE"),
		styleHeader.Render("LINE"),
	)
	fmt.Println(lipgloss.NewStyle().Foreground(lipgloss.Color("8")).Render(
		"──────────────────────────────────────────────────────────────────────────────────────────────────────────",
	))

	for _, f := range findings {
		keyDisplay := f.KeyMasked
		if unmask {
			keyDisplay = f.KeyValue
		}

		confStyle := styleLow
		switch f.Confidence {
		case "high":
			confStyle = styleHigh
		case "medium":
			confStyle = styleMedium
		}

		fmt.Fprintf(os.Stdout, "%-20s %-40s %-10s %-30s %d\n",
			f.ProviderName,
			keyDisplay,
			confStyle.Render(f.Confidence),
			truncate(f.Source, 28),
			f.LineNumber,
		)
	}
	fmt.Printf("\n%d key(s) found.\n", len(findings))
}

// truncate keeps the trailing max-3 characters of s, prefixed with "...".
func truncate(s string, max int) string {
	if len(s) <= max {
		return s
	}
	return "..." + s[len(s)-max+3:]
}
```

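For reference, `KeyMasked` is produced upstream by the engine (Plan 04), not by this package. A minimal sketch of one plausible masking scheme — the `maskKey` helper below is hypothetical and is not one of the planned files:

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey keeps the first and last four characters of a key and blanks the
// middle, so findings stay identifiable without exposing the secret.
// Hypothetical helper — the real masking logic lives in pkg/engine.
func maskKey(key string) string {
	if len(key) <= 8 {
		return strings.Repeat("*", len(key))
	}
	return key[:4] + strings.Repeat("*", len(key)-8) + key[len(key)-4:]
}

func main() {
	fmt.Println(maskKey("sk-proj-abcdefghijklmnop")) // prints sk-p****************mnop
}
```

Whatever the real scheme, the table code above only consumes the pre-masked string, so masking policy stays in one place.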
Create **cmd/root.go** (replaces the stub from Plan 01):
```go
package cmd

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/spf13/cobra"
	"github.com/spf13/viper"
)

var cfgFile string

// rootCmd is the base command when called without any subcommands.
var rootCmd = &cobra.Command{
	Use:   "keyhunter",
	Short: "KeyHunter — detect leaked LLM API keys across 108+ providers",
	Long: `KeyHunter scans files, git history, and internet sources for leaked LLM API keys.
Supports 108+ providers with Aho-Corasick pre-filtering and regex + entropy detection.`,
	SilenceUsage: true,
}

// Execute is the entry point called by main.go.
func Execute() {
	if err := rootCmd.Execute(); err != nil {
		os.Exit(1)
	}
}

func init() {
	cobra.OnInitialize(initConfig)
	rootCmd.PersistentFlags().StringVar(&cfgFile, "config", "", "config file (default: ~/.keyhunter.yaml)")
	rootCmd.AddCommand(scanCmd)
	rootCmd.AddCommand(providersCmd)
	rootCmd.AddCommand(configCmd)
}

func initConfig() {
	if cfgFile != "" {
		viper.SetConfigFile(cfgFile)
	} else {
		home, err := os.UserHomeDir()
		if err != nil {
			fmt.Fprintln(os.Stderr, "warning: cannot determine home directory:", err)
			return
		}
		viper.SetConfigName(".keyhunter")
		viper.SetConfigType("yaml")
		viper.AddConfigPath(home)
		viper.AddConfigPath(".")
	}

	viper.SetEnvPrefix("KEYHUNTER")
	viper.AutomaticEnv()

	// Defaults
	viper.SetDefault("scan.workers", 0) // 0 = auto (CPU*8)
	viper.SetDefault("database.path", filepath.Join(mustHomeDir(), ".keyhunter", "keyhunter.db"))

	// Config file is optional — ignore if not found
	_ = viper.ReadInConfig()
}

// mustHomeDir returns the user's home directory, or "" if it cannot be determined.
func mustHomeDir() string {
	h, _ := os.UserHomeDir()
	return h
}
```

</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build ./... && ./keyhunter --help 2>&1 | grep -E "scan|providers|config" && echo "HELP OK"</automated>
</verify>
<acceptance_criteria>
- `go build ./...` exits 0
- `./keyhunter --help` shows "scan", "providers", and "config" in the command list
- pkg/config/config.go exports Config and Load
- pkg/output/table.go exports PrintFindings
- cmd/root.go declares rootCmd and Execute(), and references scanCmd, providersCmd, configCmd
- `grep -q 'viper\.SetConfigFile\|viper\.SetConfigName' cmd/root.go` exits 0
- lipgloss used for header and confidence coloring
</acceptance_criteria>
<done>Root command, config package, and output table exist. `keyhunter --help` shows the three top-level commands.</done>
</task>

<task type="auto" tdd="false">
|
||||
<name>Task 2: scan, providers, and config subcommands</name>
|
||||
<files>cmd/scan.go, cmd/providers.go, cmd/config.go</files>
|
||||
<read_first>
|
||||
- /home/salva/Documents/apikey/.planning/phases/01-foundation/01-RESEARCH.md (CLI-04, CLI-05 rows, Pattern 2 pipeline usage)
|
||||
- /home/salva/Documents/apikey/cmd/root.go (rootCmd, viper setup)
|
||||
- /home/salva/Documents/apikey/pkg/engine/engine.go (Engine.Scan, ScanConfig)
|
||||
- /home/salva/Documents/apikey/pkg/storage/db.go (Open, SaveFinding)
|
||||
- /home/salva/Documents/apikey/pkg/providers/registry.go (NewRegistry, List, Get, Stats)
|
||||
</read_first>
|
||||
<action>
|
||||
Create **cmd/scan.go**:
|
||||
```go
|
||||
package cmd
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"runtime"
|
||||
|
||||
"github.com/spf13/cobra"
|
||||
"github.com/spf13/viper"
|
||||
"github.com/salvacybersec/keyhunter/pkg/config"
|
||||
"github.com/salvacybersec/keyhunter/pkg/engine"
|
||||
"github.com/salvacybersec/keyhunter/pkg/engine/sources"
|
||||
"github.com/salvacybersec/keyhunter/pkg/output"
|
||||
"github.com/salvacybersec/keyhunter/pkg/providers"
|
||||
"github.com/salvacybersec/keyhunter/pkg/storage"
|
||||
)
|
||||
|
||||
var (
|
||||
flagWorkers int
|
||||
flagVerify bool
|
||||
flagUnmask bool
|
||||
flagOutput string
|
||||
flagExclude []string
|
||||
)
|
||||
|
||||
var scanCmd = &cobra.Command{
|
||||
Use: "scan <path>",
|
||||
Short: "Scan a file or directory for leaked API keys",
|
||||
Args: cobra.ExactArgs(1),
|
||||
RunE: func(cmd *cobra.Command, args []string) error {
|
||||
target := args[0]
|
||||
|
||||
// Load config
|
||||
cfg := config.Load()
|
||||
if viper.GetInt("scan.workers") > 0 {
|
||||
cfg.Workers = viper.GetInt("scan.workers")
|
||||
}
|
||||
|
||||
// Workers flag overrides config
|
||||
workers := flagWorkers
|
||||
if workers <= 0 {
|
||||
workers = cfg.Workers
|
||||
}
|
||||
if workers <= 0 {
|
||||
workers = runtime.NumCPU() * 8
|
||||
}
|
||||
|
||||
// Initialize registry
|
||||
reg, err := providers.NewRegistry()
|
||||
if err != nil {
|
||||
return fmt.Errorf("loading providers: %w", err)
|
||||
}
|
||||
|
||||
// Initialize engine
|
||||
eng := engine.NewEngine(reg)
|
||||
src := sources.NewFileSource(target)
|
||||
|
||||
scanCfg := engine.ScanConfig{
|
||||
Workers: workers,
|
||||
Verify: flagVerify,
|
||||
Unmask: flagUnmask,
|
||||
}
|
||||
|
||||
// Open database (ensure directory exists)
|
||||
dbPath := viper.GetString("database.path")
|
||||
if dbPath == "" {
|
||||
dbPath = cfg.DBPath
|
||||
}
|
||||
if err := os.MkdirAll(filepath.Dir(dbPath), 0700); err != nil {
|
||||
return fmt.Errorf("creating database directory: %w", err)
|
||||
}
|
||||
db, err := storage.Open(dbPath)
|
||||
if err != nil {
|
||||
return fmt.Errorf("opening database: %w", err)
|
||||
}
|
||||
defer db.Close()
|
||||
|
||||
// Derive encryption key (Phase 1: empty passphrase with fixed dev salt)
|
||||
salt := []byte("keyhunter-dev-s0") // Phase 1 placeholder — Phase 6 replaces with proper salt storage
|
||||
encKey := storage.DeriveKey([]byte(cfg.Passphrase), salt)
|
||||
|
||||
// Run scan
|
||||
ch, err := eng.Scan(context.Background(), src, scanCfg)
|
||||
if err != nil {
|
||||
return fmt.Errorf("starting scan: %w", err)
|
||||
}
|
||||
|
||||
var findings []engine.Finding
|
||||
for f := range ch {
|
||||
findings = append(findings, f)
|
||||
// Persist to storage
|
||||
storeFinding := storage.Finding{
|
||||
ProviderName: f.ProviderName,
|
||||
KeyValue: f.KeyValue,
|
||||
KeyMasked: f.KeyMasked,
|
||||
Confidence: f.Confidence,
|
||||
SourcePath: f.Source,
|
||||
SourceType: f.SourceType,
|
||||
LineNumber: f.LineNumber,
|
||||
}
|
||||
if _, err := db.SaveFinding(storeFinding, encKey); err != nil {
|
||||
fmt.Fprintf(os.Stderr, "warning: failed to save finding: %v\n", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Output
|
||||
switch flagOutput {
|
||||
case "json":
|
||||
// Phase 6 — basic JSON for now
|
||||
fmt.Printf("[] # JSON output: Phase 6\n")
|
||||
default:
|
||||
output.PrintFindings(findings, flagUnmask)
|
||||
}
|
||||
|
||||
// Exit code semantics (CLI-05 / OUT-06): 0=clean, 1=found, 2=error
|
||||
if len(findings) > 0 {
|
||||
os.Exit(1)
|
||||
}
|
||||
return nil
|
||||
},
|
||||
}
|
||||
|
||||
func init() {
|
||||
scanCmd.Flags().IntVar(&flagWorkers, "workers", 0, "number of worker goroutines (default: CPU*8)")
|
||||
scanCmd.Flags().BoolVar(&flagVerify, "verify", false, "actively verify found keys (opt-in, Phase 5)")
|
||||
scanCmd.Flags().BoolVar(&flagUnmask, "unmask", false, "show full key values (default: masked)")
|
||||
scanCmd.Flags().StringVar(&flagOutput, "output", "table", "output format: table, json (more in Phase 6)")
|
||||
scanCmd.Flags().StringSliceVar(&flagExclude, "exclude", nil, "glob patterns to exclude (e.g. *.min.js)")
|
||||
viper.BindPFlag("scan.workers", scanCmd.Flags().Lookup("workers"))
|
||||
}
|
||||
```
|
||||
|
||||
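Because `os.Exit` bypasses deferred functions, exit codes are often computed in a helper and the process exits only at the outermost level. A sketch of that pattern — the `run` function is hypothetical and is not one of the planned files:

```go
package main

import "fmt"

// run does the real work and returns the process exit code.
// Deferred cleanup inside run executes normally, because os.Exit
// would only be called after run returns.
func run(findings int) int {
	defer fmt.Println("cleanup ran") // stands in for db.Close()
	if findings > 0 {
		return 1 // keys found
	}
	return 0 // clean
}

func main() {
	// A real main would end with os.Exit(run(...)).
	fmt.Println("exit code:", run(1))
}
```

The plan's RunE closure keeps the direct-exit style for simplicity; the pattern above is the alternative if the cleanup surface grows.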
Create **cmd/providers.go**:
```go
package cmd

import (
	"fmt"
	"os"
	"strings"

	"github.com/charmbracelet/lipgloss"
	"github.com/spf13/cobra"
	"github.com/salvacybersec/keyhunter/pkg/providers"
)

var providersCmd = &cobra.Command{
	Use:   "providers",
	Short: "Manage and inspect provider definitions",
}

var providersListCmd = &cobra.Command{
	Use:   "list",
	Short: "List all loaded provider definitions",
	RunE: func(cmd *cobra.Command, args []string) error {
		reg, err := providers.NewRegistry()
		if err != nil {
			return err
		}
		bold := lipgloss.NewStyle().Bold(true)
		fmt.Fprintf(os.Stdout, "%-20s %-6s %-8s %s\n",
			bold.Render("NAME"), bold.Render("TIER"), bold.Render("PATTERNS"), bold.Render("KEYWORDS"))
		fmt.Println(strings.Repeat("─", 70))
		for _, p := range reg.List() {
			fmt.Fprintf(os.Stdout, "%-20s %-6d %-8d %s\n",
				p.Name, p.Tier, len(p.Patterns), strings.Join(p.Keywords, ", "))
		}
		stats := reg.Stats()
		fmt.Printf("\nTotal: %d providers\n", stats.Total)
		return nil
	},
}

var providersInfoCmd = &cobra.Command{
	Use:   "info <name>",
	Short: "Show detailed info for a provider",
	Args:  cobra.ExactArgs(1),
	RunE: func(cmd *cobra.Command, args []string) error {
		reg, err := providers.NewRegistry()
		if err != nil {
			return err
		}
		p, ok := reg.Get(args[0])
		if !ok {
			return fmt.Errorf("provider %q not found", args[0])
		}
		fmt.Printf("Name:          %s\n", p.Name)
		fmt.Printf("Display Name:  %s\n", p.DisplayName)
		fmt.Printf("Tier:          %d\n", p.Tier)
		fmt.Printf("Last Verified: %s\n", p.LastVerified)
		fmt.Printf("Keywords:      %s\n", strings.Join(p.Keywords, ", "))
		fmt.Printf("Patterns:      %d\n", len(p.Patterns))
		for i, pat := range p.Patterns {
			fmt.Printf("  [%d] regex=%s confidence=%s entropy_min=%.1f\n",
				i+1, pat.Regex, pat.Confidence, pat.EntropyMin)
		}
		if p.Verify.URL != "" {
			fmt.Printf("Verify URL:    %s %s\n", p.Verify.Method, p.Verify.URL)
		}
		return nil
	},
}

var providersStatsCmd = &cobra.Command{
	Use:   "stats",
	Short: "Show provider statistics",
	RunE: func(cmd *cobra.Command, args []string) error {
		reg, err := providers.NewRegistry()
		if err != nil {
			return err
		}
		stats := reg.Stats()
		fmt.Printf("Total providers: %d\n", stats.Total)
		fmt.Printf("By tier:\n")
		for tier := 1; tier <= 9; tier++ {
			if count := stats.ByTier[tier]; count > 0 {
				fmt.Printf("  Tier %d: %d\n", tier, count)
			}
		}
		fmt.Printf("By confidence:\n")
		for conf, count := range stats.ByConfidence {
			fmt.Printf("  %s: %d\n", conf, count)
		}
		return nil
	},
}

func init() {
	providersCmd.AddCommand(providersListCmd)
	providersCmd.AddCommand(providersInfoCmd)
	providersCmd.AddCommand(providersStatsCmd)
}
```

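The `entropy_min` value that `providers info` prints refers to Shannon entropy in bits per character, which the Plan 04 detector uses to reject low-randomness regex matches. A self-contained sketch of that computation (illustrative of the technique in spirit, not a copy of the engine's code):

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns bits of entropy per character of s.
// Random API keys score high (~4+ bits for base62 alphabets);
// repeated characters and English words score low.
func shannonEntropy(s string) float64 {
	if s == "" {
		return 0
	}
	freq := make(map[rune]int)
	total := 0
	for _, r := range s {
		freq[r]++
		total++
	}
	var h float64
	for _, n := range freq {
		p := float64(n) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

func main() {
	fmt.Printf("%.2f\n", shannonEntropy("aaaaaaaa"))                  // prints 0.00
	fmt.Printf("%.2f\n", shannonEntropy("sk-proj-x7Kd92mQ4fTz8LwR")) // key-like: well above entropy_min
}
```

A provider pattern's `entropy_min` is compared against this per-character figure, so thresholds stay independent of key length.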
Create **cmd/config.go**:
```go
package cmd

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/spf13/cobra"
	"github.com/spf13/viper"
)

var configCmd = &cobra.Command{
	Use:   "config",
	Short: "Manage KeyHunter configuration",
}

var configInitCmd = &cobra.Command{
	Use:   "init",
	Short: "Create default configuration file at ~/.keyhunter.yaml",
	RunE: func(cmd *cobra.Command, args []string) error {
		home, err := os.UserHomeDir()
		if err != nil {
			return fmt.Errorf("cannot determine home directory: %w", err)
		}
		configPath := filepath.Join(home, ".keyhunter.yaml")

		// Set defaults before writing
		viper.SetDefault("scan.workers", 0)
		viper.SetDefault("database.path", filepath.Join(home, ".keyhunter", "keyhunter.db"))

		if err := viper.WriteConfigAs(configPath); err != nil {
			return fmt.Errorf("writing config: %w", err)
		}
		fmt.Printf("Config initialized: %s\n", configPath)
		return nil
	},
}

var configSetCmd = &cobra.Command{
	Use:   "set <key> <value>",
	Short: "Set a configuration value",
	Args:  cobra.ExactArgs(2),
	RunE: func(cmd *cobra.Command, args []string) error {
		key, value := args[0], args[1]
		viper.Set(key, value)
		if err := viper.WriteConfig(); err != nil {
			// If config file doesn't exist yet, create it
			home, _ := os.UserHomeDir()
			configPath := filepath.Join(home, ".keyhunter.yaml")
			if err2 := viper.WriteConfigAs(configPath); err2 != nil {
				return fmt.Errorf("writing config: %w", err2)
			}
		}
		fmt.Printf("Set %s = %s\n", key, value)
		return nil
	},
}

var configGetCmd = &cobra.Command{
	Use:   "get <key>",
	Short: "Get a configuration value",
	Args:  cobra.ExactArgs(1),
	RunE: func(cmd *cobra.Command, args []string) error {
		val := viper.Get(args[0])
		if val == nil {
			return fmt.Errorf("key %q not found", args[0])
		}
		fmt.Printf("%v\n", val)
		return nil
	},
}

func init() {
	configCmd.AddCommand(configInitCmd)
	configCmd.AddCommand(configSetCmd)
	configCmd.AddCommand(configGetCmd)
}
```

</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go build -o keyhunter . && ./keyhunter providers list && ./keyhunter providers info openai && echo "PROVIDERS OK"</automated>
</verify>
<acceptance_criteria>
- `go build -o keyhunter .` exits 0
- `./keyhunter --help` shows the scan, providers, and config commands
- `./keyhunter providers list` prints a table with >= 3 rows, including "openai"
- `./keyhunter providers info openai` prints Name, Tier, Keywords, Patterns, Verify URL
- `./keyhunter providers stats` reports "Total providers:" with a count of at least 3
- `./keyhunter config init` creates or updates ~/.keyhunter.yaml
- `./keyhunter config set scan.workers 16` exits 0
- `./keyhunter scan testdata/samples/openai_key.txt` exits 1 (keys found) and prints a table row with "openai"
- `./keyhunter scan testdata/samples/no_keys.txt` exits 0 and prints "No API keys found."
- `grep -q 'viper\.BindPFlag' cmd/scan.go` exits 0
</acceptance_criteria>
<done>Full CLI works: scan finds and persists keys, providers list/info/stats work, config init/set/get work. All Phase 1 success criteria met.</done>
</task>

<task type="checkpoint:human-verify" gate="blocking">
|
||||
<what-built>
|
||||
Complete Phase 1 implementation:
|
||||
- Provider registry with 3 YAML definitions, Aho-Corasick automaton, schema validation
|
||||
- Storage layer with AES-256-GCM encryption, Argon2id key derivation, SQLite WAL mode
|
||||
- Three-stage scan engine: keyword pre-filter → regex + entropy detector → finding channel
|
||||
- CLI: keyhunter scan, providers list/info/stats, config init/set/get
|
||||
</what-built>
|
||||
<how-to-verify>
|
||||
Run these commands from the project root and confirm each expected output:
|
||||
|
||||
1. `cd /home/salva/Documents/apikey && go test ./... -v -count=1`
|
||||
Expected: All tests PASS, zero FAIL, zero SKIP (except original stubs now filled)
|
||||
|
||||
2. `./keyhunter scan testdata/samples/openai_key.txt`
|
||||
Expected: Exit code 1, table printed with 1 row showing "openai" provider, masked key
|
||||
|
||||
3. `./keyhunter scan testdata/samples/no_keys.txt`
|
||||
Expected: Exit code 0, "No API keys found." printed
|
||||
|
||||
4. `./keyhunter providers list`
|
||||
Expected: Table with openai, anthropic, huggingface rows
|
||||
|
||||
5. `./keyhunter providers info openai`
|
||||
Expected: Name, Tier 1, Keywords including "sk-proj-", Pattern regex shown
|
||||
|
||||
6. `./keyhunter config init`
|
||||
Expected: "Config initialized: ~/.keyhunter.yaml" and the file exists
|
||||
|
||||
7. `./keyhunter config set scan.workers 16 && ./keyhunter config get scan.workers`
|
||||
Expected: "Set scan.workers = 16" then "16"
|
||||
|
||||
8. Build the binary with production flags:
|
||||
`CGO_ENABLED=0 go build -ldflags="-s -w" -o keyhunter-prod .`
|
||||
Expected: Builds without error, binary produced
|
||||
</how-to-verify>
|
||||
<resume-signal>Type "approved" if all 8 checks pass, or describe which check failed and what output you saw.</resume-signal>
|
||||
</task>
|
||||
|
||||
</tasks>

<verification>
Full Phase 1 integration check:
- `go test ./... -count=1` exits 0
- `./keyhunter scan testdata/samples/openai_key.txt` exits 1 with findings table
- `./keyhunter scan testdata/samples/no_keys.txt` exits 0 with "No API keys found."
- `./keyhunter providers list` shows 3+ providers
- `./keyhunter config init` creates ~/.keyhunter.yaml
- `CGO_ENABLED=0 go build -ldflags="-s -w" -o keyhunter-prod .` exits 0
</verification>

<success_criteria>
- Cobra CLI with scan, providers, config commands (CLI-01)
- `keyhunter config init` creates ~/.keyhunter.yaml (CLI-02)
- `keyhunter config set key value` persists (CLI-03)
- `keyhunter providers list/info/stats` work (CLI-04)
- scan flags: --workers, --verify, --unmask, --output, --exclude (CLI-05)
- All Phase 1 success criteria from ROADMAP.md satisfied:
  1. `keyhunter scan ./somefile` runs the three-stage pipeline and returns findings with provider names
  2. Findings persisted to SQLite with AES-256 encrypted key_value
  3. `keyhunter config init` and `config set` work
  4. `keyhunter providers list/info` return provider metadata from YAML
  5. Provider YAML has format_version and last_verified, validated at load time
</success_criteria>

<output>
After completion, create `.planning/phases/01-foundation/01-05-SUMMARY.md` following the summary template.
</output>