docs(05): create phase 5 verification engine plans
424
.planning/phases/05-verification-engine/05-03-PLAN.md
Normal file
@@ -0,0 +1,424 @@
---
phase: 05-verification-engine
plan: 03
type: execute
wave: 1
depends_on: [05-01]
files_modified:
  - pkg/verify/verifier.go
  - pkg/verify/verifier_test.go
  - pkg/verify/result.go
autonomous: true
requirements: [VRFY-02, VRFY-03, VRFY-05]
must_haves:
  truths:
    - "HTTPVerifier.Verify(ctx, finding, provider) returns a Result with Status in {live,dead,rate_limited,error,unknown}"
    - "{{KEY}} in the URL, Headers values, and Body is substituted with the plaintext key"
    - "HTTP codes in EffectiveSuccessCodes → Status='live'; in EffectiveFailureCodes → Status='dead'; in EffectiveRateLimitCodes → Status='rate_limited'"
    - "Metadata extracted from JSON response via gjson paths when response Content-Type is application/json"
    - "Per-call context timeout is respected; timeout → Status='error', Error contains 'timeout' or 'deadline'"
    - "http:// verify URLs are rejected (HTTPS-only); missing verify URL → Status='unknown'"
    - "ants pool with configurable worker count runs verification in parallel"
  artifacts:
    - path: "pkg/verify/verifier.go"
      provides: "HTTPVerifier struct, VerifyAll(ctx, []Finding, reg, workers) <-chan Result"
      contains: "HTTPVerifier"
    - path: "pkg/verify/result.go"
      provides: "Result struct with Status constants"
      contains: "StatusLive"
  key_links:
    - from: "pkg/verify/verifier.go"
      to: "provider.Verify (VerifySpec)"
      via: "template substitution + http.Client.Do"
      pattern: "{{KEY}}"
    - from: "pkg/verify/verifier.go"
      to: "github.com/tidwall/gjson"
      via: "metadata extraction"
      pattern: "gjson.GetBytes"
    - from: "pkg/verify/verifier.go"
      to: "github.com/panjf2000/ants/v2"
      via: "worker pool"
      pattern: "ants.NewPool"
---

<objective>
Build the core HTTPVerifier. It takes a Finding plus its Provider, substitutes {{KEY}} into the VerifySpec URL, headers, and body, makes a single HTTP call with a bounded timeout, classifies the response as live/dead/rate_limited/error/unknown, and extracts metadata via gjson. An ants worker pool verifies many findings in parallel.

Purpose: VRFY-02 (YAML-driven verification, no hardcoded logic), VRFY-03 (metadata extraction), VRFY-05 (configurable per-key timeout).
Output: pkg/verify/verifier.go with the HTTPVerifier, pkg/verify/result.go with the Result types, and pkg/verify/verifier_test.go with unit tests using httptest.
</objective>
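
The substitution step in the objective can be sketched in isolation. This is a minimal illustration, not the plan's implementation: `substituteKey` is a hypothetical helper name, and the sample key and URL are made up. It shows why the double-brace form should be expanded before the legacy single-brace form: `{{KEY}}` contains `{KEY}`, so the reverse order would leave stray braces around the key.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteKey is a hypothetical helper (an assumption for illustration).
// It expands "{{KEY}}" first and the legacy "{KEY}" second, because the
// double-brace form contains the single-brace form as a substring.
func substituteKey(template, key string) string {
	out := strings.ReplaceAll(template, "{{KEY}}", key)
	return strings.ReplaceAll(out, "{KEY}", key)
}

func main() {
	// Header-style template using the current placeholder form.
	fmt.Println(substituteKey("Bearer {{KEY}}", "sk-test-keyvalue"))
	// URL-style template using the legacy placeholder form.
	fmt.Println(substituteKey("https://host/v1/models?key={KEY}", "sk-test-keyvalue"))
}
```

One helper applied to the URL, each header value, and the body would keep the three substitution sites consistent.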

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/05-verification-engine/05-CONTEXT.md
@pkg/providers/schema.go
@pkg/engine/finding.go

<interfaces>
After Plan 05-01 completes, these are the shapes available:

```go
// pkg/providers/schema.go
type VerifySpec struct {
	Method, URL, Body string
	Headers map[string]string
	SuccessCodes, FailureCodes, RateLimitCodes []int
	MetadataPaths map[string]string // display-name -> gjson path
	ValidStatus, InvalidStatus []int // legacy
}
func (v VerifySpec) EffectiveSuccessCodes() []int
func (v VerifySpec) EffectiveFailureCodes() []int
func (v VerifySpec) EffectiveRateLimitCodes() []int

type Provider struct {
	Name string
	Verify VerifySpec
	// ...
}

// pkg/engine/finding.go
type Finding struct {
	ProviderName, KeyValue, KeyMasked string
	// ...
	Verified bool
	VerifyStatus string
	VerifyHTTPCode int
	VerifyMetadata map[string]string
	VerifyError string
}
```

Registry (existing) exposes `func (r *Registry) Get(name string) (*Provider, bool)`.
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Result types + HTTPVerifier.Verify single-key logic</name>
<files>pkg/verify/result.go, pkg/verify/verifier.go, pkg/verify/verifier_test.go</files>
<behavior>
- Verify(ctx, finding, provider) with missing VerifySpec.URL → Result{Status: StatusUnknown}
- URL starting with "http://" → Result{Status: StatusError, Error: "verify URL must be HTTPS"}
- Default Method is GET when VerifySpec.Method is empty
- "{{KEY}}" (and the legacy "{KEY}" form) substituted in the URL, in every Header value, and in Body
- 200 (or any code in EffectiveSuccessCodes) → StatusLive
- 401/403 (or any code in EffectiveFailureCodes) → StatusDead
- 429 (or any code in EffectiveRateLimitCodes) → StatusRateLimited; Retry-After header captured in Result.RetryAfter
- Any other code → StatusUnknown
- JSON response with MetadataPaths set → Metadata populated via gjson
- Non-JSON response → Metadata empty (no error)
- ctx deadline exceeded → StatusError with "timeout" or "deadline" in Error
</behavior>
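
The status mapping in the behavior list can be sketched as a pure function. This is a minimal sketch under stated assumptions: `classify` is a hypothetical name, the three slices stand in for the `Effective*Codes()` results, and the default code sets used in `main` (200; 401 and 403; 429) are inferred from the bullets above rather than confirmed against Plan 05-01.

```go
package main

import "fmt"

// classify is a hypothetical stand-alone version of the status mapping.
// Precedence is success, then failure, then rate limit; any code that
// appears in none of the lists maps to "unknown".
func classify(code int, success, failure, rateLimit []int) string {
	contains := func(xs []int, x int) bool {
		for _, v := range xs {
			if v == x {
				return true
			}
		}
		return false
	}
	switch {
	case contains(success, code):
		return "live"
	case contains(failure, code):
		return "dead"
	case contains(rateLimit, code):
		return "rate_limited"
	default:
		return "unknown"
	}
}

func main() {
	// Assumed default code sets, for illustration only.
	success, failure, rateLimit := []int{200}, []int{401, 403}, []int{429}
	for _, code := range []int{200, 401, 429, 500} {
		fmt.Println(code, classify(code, success, failure, rateLimit))
	}
}
```

Keeping the mapping free of I/O makes it straightforward to table-test separately from the HTTP round trip.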
<action>
1. Create `pkg/verify/result.go`:
```go
package verify

import "time"

const (
	StatusLive        = "live"
	StatusDead        = "dead"
	StatusRateLimited = "rate_limited"
	StatusError       = "error"
	StatusUnknown     = "unknown"
)

type Result struct {
	ProviderName string
	KeyMasked    string
	Status       string // one of the Status* constants
	HTTPCode     int
	Metadata     map[string]string
	RetryAfter   time.Duration
	ResponseTime time.Duration
	Error        string
}
```

2. Create `pkg/verify/verifier.go`:
```go
package verify

import (
	"bytes"
	"context"
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
	"time"

	"github.com/salvacybersec/keyhunter/pkg/engine"
	"github.com/salvacybersec/keyhunter/pkg/providers"
	"github.com/tidwall/gjson"
)

// DefaultTimeout is the per-call timeout used when none is configured.
const DefaultTimeout = 10 * time.Second

// HTTPVerifier verifies findings against a provider's YAML-defined VerifySpec.
type HTTPVerifier struct {
	Client  *http.Client
	Timeout time.Duration
}

// NewHTTPVerifier returns a verifier whose client requires TLS 1.2 or newer.
// A non-positive timeout falls back to DefaultTimeout.
func NewHTTPVerifier(timeout time.Duration) *HTTPVerifier {
	if timeout <= 0 {
		timeout = DefaultTimeout
	}
	return &HTTPVerifier{
		Client: &http.Client{
			Timeout: timeout,
			Transport: &http.Transport{
				TLSClientConfig: &tls.Config{MinVersion: tls.VersionTLS12},
			},
		},
		Timeout: timeout,
	}
}

// Verify runs a single verification against a provider's verify endpoint.
// It never returns an error; transport and classification errors are encoded in Result.
func (v *HTTPVerifier) Verify(ctx context.Context, f engine.Finding, p *providers.Provider) Result {
	start := time.Now()
	res := Result{ProviderName: f.ProviderName, KeyMasked: f.KeyMasked, Status: StatusUnknown}

	spec := p.Verify
	if spec.URL == "" {
		return res // StatusUnknown: provider has no verify endpoint
	}
	if strings.HasPrefix(strings.ToLower(spec.URL), "http://") {
		res.Status = StatusError
		res.Error = "verify URL must be HTTPS"
		return res
	}

	// Substitute {{KEY}} in URL (some providers pass key in query string e.g. Google AI)
	url := strings.ReplaceAll(spec.URL, "{{KEY}}", f.KeyValue)
	// Also support legacy {KEY} form used by some existing YAMLs
	url = strings.ReplaceAll(url, "{KEY}", f.KeyValue)

	method := spec.Method
	if method == "" {
		method = http.MethodGet
	}

	var bodyReader io.Reader
	if spec.Body != "" {
		body := strings.ReplaceAll(spec.Body, "{{KEY}}", f.KeyValue)
		body = strings.ReplaceAll(body, "{KEY}", f.KeyValue)
		bodyReader = bytes.NewBufferString(body)
	}

	reqCtx, cancel := context.WithTimeout(ctx, v.Timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(reqCtx, method, url, bodyReader)
	if err != nil {
		res.Status = StatusError
		res.Error = err.Error()
		return res
	}
	for k, val := range spec.Headers {
		substituted := strings.ReplaceAll(val, "{{KEY}}", f.KeyValue)
		substituted = strings.ReplaceAll(substituted, "{KEY}", f.KeyValue)
		req.Header.Set(k, substituted)
	}

	resp, err := v.Client.Do(req)
	res.ResponseTime = time.Since(start)
	if err != nil {
		res.Status = StatusError
		res.Error = err.Error()
		return res
	}
	defer resp.Body.Close()
	res.HTTPCode = resp.StatusCode

	// Classify
	if containsInt(spec.EffectiveSuccessCodes(), resp.StatusCode) {
		res.Status = StatusLive
	} else if containsInt(spec.EffectiveFailureCodes(), resp.StatusCode) {
		res.Status = StatusDead
	} else if containsInt(spec.EffectiveRateLimitCodes(), resp.StatusCode) {
		res.Status = StatusRateLimited
		// Only the delta-seconds form of Retry-After is parsed here.
		if ra := resp.Header.Get("Retry-After"); ra != "" {
			if secs, err := strconv.Atoi(ra); err == nil {
				res.RetryAfter = time.Duration(secs) * time.Second
			}
		}
	} else {
		res.Status = StatusUnknown
	}

	// Metadata extraction only on live responses with JSON body and MetadataPaths
	if res.Status == StatusLive && len(spec.MetadataPaths) > 0 {
		ct := resp.Header.Get("Content-Type")
		if strings.Contains(ct, "application/json") {
			bodyBytes, _ := io.ReadAll(io.LimitReader(resp.Body, 1<<20)) // 1 MiB cap
			res.Metadata = make(map[string]string, len(spec.MetadataPaths))
			for displayName, path := range spec.MetadataPaths {
				r := gjson.GetBytes(bodyBytes, path)
				if r.Exists() {
					res.Metadata[displayName] = r.String()
				}
			}
		}
	}
	return res
}

func containsInt(haystack []int, needle int) bool {
	for _, x := range haystack {
		if x == needle {
			return true
		}
	}
	return false
}

// Unused import guard
var _ = fmt.Sprintf
```
(If `fmt` ends up unused, remove both the `"fmt"` import and the `var _ = fmt.Sprintf` guard; removing only the guard leaves an unused import and the build fails.)
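
The verifier above parses `Retry-After` only as a whole number of seconds. HTTP also allows the header to carry a date, so a server that sends the date form would leave `Result.RetryAfter` at zero. The sketch below shows one way both forms could be handled. It is an optional extension, not something the plan requires: `parseRetryAfter` and `exampleNow` are hypothetical names, and the current time is passed in so the date branch is deterministic.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// exampleNow is a fixed reference time used only by this illustration.
var exampleNow = time.Date(2015, time.October, 21, 7, 27, 30, 0, time.UTC)

// parseRetryAfter is a hypothetical helper. It accepts a non-negative number
// of seconds or an HTTP date and returns how long to wait from now.
// Unparseable values and dates in the past yield zero.
func parseRetryAfter(value string, now time.Time) time.Duration {
	if secs, err := strconv.Atoi(value); err == nil {
		if secs < 0 {
			return 0
		}
		return time.Duration(secs) * time.Second
	}
	if t, err := http.ParseTime(value); err == nil {
		if d := t.Sub(now); d > 0 {
			return d
		}
	}
	return 0
}

func main() {
	fmt.Println(parseRetryAfter("30", exampleNow))
	fmt.Println(parseRetryAfter("Wed, 21 Oct 2015 07:28:00 GMT", exampleNow))
	fmt.Println(parseRetryAfter("soon", exampleNow))
}
```

If this were adopted, `TestVerify_RateLimited_429_WithRetryAfter` would still pass unchanged, since the seconds form behaves the same.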

3. Create `pkg/verify/verifier_test.go` using httptest.NewTLSServer. Tests:
- `TestVerify_Live_200`: server returns 200, assert StatusLive, HTTPCode=200
- `TestVerify_Dead_401`: server returns 401, assert StatusDead
- `TestVerify_RateLimited_429_WithRetryAfter`: server returns 429 with `Retry-After: 30`, assert StatusRateLimited and RetryAfter == 30s
- `TestVerify_MetadataExtraction`: JSON response `{"organization":{"name":"Acme"},"tier":"plus"}`, MetadataPaths={"org":"organization.name","tier":"tier"}, assert Metadata["org"]=="Acme" and Metadata["tier"]=="plus"
- `TestVerify_KeySubstitution_InHeader`: server inspects `Authorization` header, verify spec Headers={"Authorization":"Bearer {{KEY}}"}, assert server received "Bearer sk-test-keyvalue"
- `TestVerify_KeySubstitution_InBody`: POST with Body `{"api_key":"{{KEY}}"}`, server reads body and asserts substitution
- `TestVerify_KeySubstitution_InURL`: URL `https://host/v1/models?key={{KEY}}`, server inspects req.URL.Query().Get("key")
- `TestVerify_MissingURL_Unknown`: empty spec.URL, assert StatusUnknown
- `TestVerify_HTTPRejected`: URL `http://example.com`, assert StatusError, Error contains "HTTPS"
- `TestVerify_Timeout`: server sleeps 200ms, verifier timeout 50ms, assert StatusError and Error matches /timeout|deadline|canceled/i

For httptest.NewTLSServer, set `verifier.Client.Transport = server.Client().Transport` so the test cert validates. Use a small helper to build a *providers.Provider inline.
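
The TLS test-server pattern described above can be sketched with the standard library alone. This is an illustration, not the plan's test file: `receivedAuthHeader` is a hypothetical helper. It starts a throwaway HTTPS server, sends one request through the client returned by the server (which trusts the server's test certificate), and reports the `Authorization` value the handler saw.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// receivedAuthHeader is a hypothetical helper for illustration. It returns
// the Authorization header value observed by a temporary TLS test server.
func receivedAuthHeader(auth string) (string, error) {
	got := make(chan string, 1) // buffered so the handler never blocks
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got <- r.Header.Get("Authorization")
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", auth)

	// srv.Client() is configured to trust the server's test certificate.
	resp, err := srv.Client().Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-got, nil
}

func main() {
	v, err := receivedAuthHeader("Bearer sk-test-keyvalue")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(v)
}
```

In the real tests the same trust is obtained by assigning `server.Client().Transport` to the verifier's client, so that `Verify` itself performs the request.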
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/verify/... -run Verify -v</automated>
</verify>
<acceptance_criteria>
- `grep -q 'HTTPVerifier' pkg/verify/verifier.go`
- `grep -q 'StatusLive\|StatusDead\|StatusRateLimited' pkg/verify/result.go`
- `grep -q 'gjson.GetBytes' pkg/verify/verifier.go`
- `grep -q '{{KEY}}' pkg/verify/verifier.go`
- All 10 verifier test cases pass
- `go build ./...` succeeds
</acceptance_criteria>
<done>Single-key verification classifies status correctly, substitutes key template, extracts JSON metadata, enforces HTTPS + timeout.</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: VerifyAll worker pool with ants</name>
<files>pkg/verify/verifier.go, pkg/verify/verifier_test.go</files>
<behavior>
- VerifyAll(ctx, findings, reg, workers) returns <-chan Result; closes channel after all findings processed
- Workers count respected (default 10 if <= 0)
- Findings whose provider is missing from registry → emit Result{Status: StatusUnknown, Error: "provider not found in registry"}
- ctx cancellation stops further dispatch; channel still closes cleanly
</behavior>
<action>
Append to `pkg/verify/verifier.go` (merge the two imports into the file's existing import block; Go rejects import declarations that follow other declarations):
```go
// Add to the existing import block at the top of the file:
//   "sync"
//   "github.com/panjf2000/ants/v2"

const DefaultWorkers = 10

// VerifyAll runs verification for all findings via an ants worker pool.
// The returned channel is closed after every finding has been processed or ctx is cancelled.
func (v *HTTPVerifier) VerifyAll(ctx context.Context, findings []engine.Finding, reg *providers.Registry, workers int) <-chan Result {
	if workers <= 0 {
		workers = DefaultWorkers
	}
	// Buffered to len(findings) so workers never block on send.
	out := make(chan Result, len(findings))
	pool, err := ants.NewPool(workers)
	if err != nil {
		// On pool creation failure, emit one error result per finding and close.
		go func() {
			defer close(out)
			for _, f := range findings {
				out <- Result{ProviderName: f.ProviderName, KeyMasked: f.KeyMasked, Status: StatusError, Error: "pool init: " + err.Error()}
			}
		}()
		return out
	}

	var wg sync.WaitGroup
	go func() {
		defer close(out)
		defer pool.Release()
		for i := range findings {
			if ctx.Err() != nil {
				break
			}
			f := findings[i]
			wg.Add(1)
			submitErr := pool.Submit(func() {
				defer wg.Done()
				prov, ok := reg.Get(f.ProviderName)
				if !ok {
					out <- Result{ProviderName: f.ProviderName, KeyMasked: f.KeyMasked, Status: StatusUnknown, Error: "provider not found in registry"}
					return
				}
				out <- v.Verify(ctx, f, prov)
			})
			if submitErr != nil {
				// The task never ran, so balance the Add above.
				wg.Done()
				out <- Result{ProviderName: f.ProviderName, KeyMasked: f.KeyMasked, Status: StatusError, Error: submitErr.Error()}
			}
		}
		wg.Wait()
	}()
	return out
}
```

NOTE: verify the exact API of `reg.Get` by checking pkg/providers/registry.go before writing. If the method is named differently (e.g. `Find`, `Lookup`), use that. Also verify that ants/v2 is already in go.mod from earlier phases; if not, `go get github.com/panjf2000/ants/v2`.
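
The dispatch, wait, and close sequence used by VerifyAll can be exercised without any third-party dependency. The sketch below is a stdlib-only stand-in, not the ants-based implementation: a buffered channel acts as a semaphore in place of the pool, and `fanOut` and `runDemo` are hypothetical names. It illustrates the two properties the tests check: every dispatched input yields one result, and a cancelled context stops dispatch while the channel still closes.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fanOut is a hypothetical stand-in for the pool-based dispatcher. At most
// `workers` calls to work run at once; the result channel is closed once
// every dispatched input has produced a result.
func fanOut(ctx context.Context, inputs []int, workers int, work func(int) int) <-chan int {
	if workers <= 0 {
		workers = 10
	}
	out := make(chan int, len(inputs)) // buffered so workers never block on send
	sem := make(chan struct{}, workers)
	go func() {
		defer close(out)
		var wg sync.WaitGroup
		for _, in := range inputs {
			if ctx.Err() != nil {
				break // stop dispatching; in-flight work still finishes
			}
			sem <- struct{}{} // acquire a worker slot
			wg.Add(1)
			go func(in int) {
				defer wg.Done()
				defer func() { <-sem }() // release the slot
				out <- work(in)
			}(in)
		}
		wg.Wait()
	}()
	return out
}

// runDemo dispatches n dummy inputs and returns how many results arrived.
// With cancelFirst set, the context is cancelled before dispatch begins.
func runDemo(n, workers int, cancelFirst bool) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	count := 0
	for range fanOut(ctx, make([]int, n), workers, func(x int) int { return x + 1 }) {
		count++
	}
	return count
}

func main() {
	fmt.Println(runDemo(5, 3, false))
	fmt.Println(runDemo(100, 3, true))
}
```

A consumer of VerifyAll would drain the channel the same way, with a `for range` loop that ends when the producer closes it.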

Append to `pkg/verify/verifier_test.go`:
- `TestVerifyAll_MultipleFindings`: 5 findings against one test server returning 200, workers=3, assert 5 StatusLive results received
- `TestVerifyAll_MissingProvider`: finding with ProviderName="nonexistent", assert Result.Status == StatusUnknown and Error contains "not found"
- `TestVerifyAll_ContextCancellation`: 100 findings, server sleeps 100ms each, cancel ctx after 50ms, assert channel closes within 1s and fewer than 100 results received

Use a real Registry built via providers.NewRegistry() or a minimal test helper that constructs a Registry with a single test provider. If NewRegistry embeds all real providers, prefer that and add a test provider dynamically if there is an API for it; otherwise add a `newTestRegistry(t, p *Provider) *Registry` helper in the test file.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/verify/... -run VerifyAll -v</automated>
</verify>
<acceptance_criteria>
- `grep -q 'ants.NewPool' pkg/verify/verifier.go`
- `grep -q 'VerifyAll' pkg/verify/verifier.go`
- All 3 VerifyAll test cases pass
- `go build ./...` succeeds
- Race detector clean: `go test ./pkg/verify/... -race -run VerifyAll`
</acceptance_criteria>
<done>Parallel verification via ants pool works; graceful cancellation; missing providers handled.</done>
</task>

</tasks>

<verification>
- `go build ./...` clean
- `go test ./pkg/verify/... -v -race` all pass
- Verifier is YAML-driven (no provider name switches in verifier.go): `grep -v "StatusLive\|StatusDead\|StatusError\|StatusUnknown\|StatusRateLimited" pkg/verify/verifier.go | grep -i "openai\|anthropic\|groq"` returns nothing
</verification>

<success_criteria>
- VRFY-02: single HTTPVerifier drives all providers via YAML VerifySpec
- VRFY-03: metadata extracted via gjson paths on JSON responses
- VRFY-05: per-call timeout respected, default 10s, configurable
- Unit tests cover live/dead/rate-limited/error/unknown + key substitution + metadata + cancellation
</success_criteria>

<output>
After completion, create `.planning/phases/05-verification-engine/05-03-SUMMARY.md`
</output>