Wave 0 contracts for the verification engine are in place:

- VerifySpec extended with SuccessCodes/FailureCodes/RateLimitCodes/MetadataPaths/Body
- Finding extended with Verified/VerifyStatus/VerifyHTTPCode/VerifyMetadata/VerifyError
- findings table schema migrated with verify_* columns (fresh + legacy DBs)
- gjson dep wired as direct require
- VRFY-02, VRFY-03 marked complete

---
gsd_state_version: 1.0
milestone: v1.0
milestone_name: milestone
status: executing
stopped_at: Completed 05-01-PLAN.md
last_updated: "2026-04-05T12:44:11.076Z"
last_activity: 2026-04-05
progress:
  total_phases: 18
  completed_phases: 4
  total_plans: 28
  completed_plans: 24
  percent: 20
---

# Project State

## Project Reference

See: .planning/PROJECT.md (updated 2026-04-04)

**Core value:** Detect leaked LLM API keys across more providers and more internet sources than any other tool, with active verification to confirm keys are real and alive.
**Current focus:** Phase 05 — verification-engine

## Current Position

Phase: 05 (verification-engine) — EXECUTING
Plan: 2 of 5
Status: Ready to execute
Last activity: 2026-04-05

Progress: [██░░░░░░░░] 20%

## Performance Metrics

**Velocity:**

- Total plans completed: 0
- Average duration: —
- Total execution time: 0 hours

**By Phase:**

| Phase | Plans | Total | Avg/Plan |
|-------|-------|-------|----------|
| - | - | - | - |

**Recent Trend:**

- Last 5 plans: —
- Trend: —

*Updated after each plan completion*

| Plan | Duration | Tasks | Files |
|------|----------|-------|-------|
| Phase 01-foundation P02 | 9 | 2 tasks | 11 files |
| Phase 01-foundation P04 | 5min | 2 tasks | 12 files |
| Phase 01-foundation P05 | 4min | 2 tasks | 8 files |
| Phase 02-tier-1-2-providers P02 | 1m | 2 tasks | 12 files |
| Phase 02-tier-1-2-providers P03 | 3min | 2 tasks | 14 files |
| Phase 02-tier-1-2-providers P01 | 3min | 2 tasks | 12 files |
| Phase 02-tier-1-2-providers P04 | 1min | 2 tasks | 14 files |
| Phase 02-tier-1-2-providers P05 | 2min | 1 task | 1 file |
| Phase 03-tier-3-9-providers P04 | 3m | 2 tasks | 20 files |
| Phase 03-tier-3-9-providers P02 | 70 | 2 tasks | 22 files |
| Phase 03-tier-3-9-providers P06 | 3m | 2 tasks | 16 files |
| Phase 03-tier-3-9-providers P01 | 3m | 2 tasks | 32 files |
| Phase 03 P08 | 2min | 1 task | 1 file |
| Phase 04 P01 | 1m | 1 task | 2 files |
| Phase 04-input-sources P03 | 6m | 1 task | 2 files |
| Phase 04 P02 | 4min | 1 task | 3 files |
| Phase 04 P05 | 3min | 1 task | 2 files |
| Phase 05 P01 | 3m43s | 2 tasks | 10 files |

## Accumulated Context

### Decisions

Decisions are logged in PROJECT.md Key Decisions table.
Recent decisions affecting current work:

- Roadmap: CGO_ENABLED=0 throughout — modernc.org/sqlite over mattn/go-sqlite3 (see PROJECT.md)
- Roadmap: Per-source rate limiter architecture (Phase 9) must precede all OSINT source modules (Phases 10-16)
- Roadmap: AES-256 encryption added in Phase 1, not post-hoc — avoids migration complexity
- Roadmap: Verification (Phase 5) requires consent prompt + LEGAL.md — not optional polish
- [Phase 01-foundation]: Provider YAML in dual locations: providers/ (user-visible) and pkg/providers/definitions/ (embed) — Go embed cannot use '..' paths
- [Phase 01-foundation]: Aho-Corasick built with DFA=true at NewRegistry() for O(n) keyword pre-filtering across all providers
- [Phase 01-foundation]: pkg/types/chunk.go breaks engine<->sources circular import; ants pool with WaitGroup+Mutex for detector coordination
- [Phase 01-foundation]: Per-installation salt via settings table -- no hardcoded salt in production code
- [Phase 01-foundation]: Exit code semantics: 0=clean, 1=keys-found, 2=error for CI/CD integration
- [Phase 02-tier-1-2-providers]: AWS Bedrock verify URL left empty — SigV4 signing deferred to Phase 5 verification engine
- [Phase 03-tier-3-9-providers]: Keyword-only detection for providers without documented key prefixes (You.com, Unstructured, Runway, Midjourney) to avoid false positives.
- [Phase 04]: Use 'go mod download' instead of 'go mod tidy' when bootstrapping dependencies ahead of their consumers
- [Phase 04-input-sources]: GitSource walks heads+tags+remotes+stash with per-OID blob dedup
- [Phase 04]: Introduced selectSource dispatcher with sourceFlags struct for testable CLI source routing
- [Phase 05]: Keep legacy VerifySpec ValidStatus/InvalidStatus alongside canonical SuccessCodes/FailureCodes; Effective*() helpers pick canonical-first with fallback
- [Phase 05]: Store Finding.VerifyMetadata as JSON TEXT column; legacy DBs migrated in-place via PRAGMA table_info + conditional ALTER TABLE in storage.Open()
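
The in-place migration from the Phase 05 storage decision (PRAGMA table_info plus conditional ALTER TABLE) might look roughly like this. The column names, column types, and the function names `existingColumns`, `missingColumnStatements`, and `migrate` are hypothetical; the notes only say that `verify_*` columns are added to the `findings` table inside `storage.Open()`.

```go
package main

import (
	"database/sql"
	"fmt"
)

// verifyColumns lists hypothetical column names and definitions.
// NOT NULL columns carry a default so ALTER TABLE ADD COLUMN is accepted.
var verifyColumns = []struct{ Name, DDL string }{
	{"verified", "INTEGER NOT NULL DEFAULT 0"},
	{"verify_status", "TEXT NOT NULL DEFAULT ''"},
	{"verify_http_code", "INTEGER NOT NULL DEFAULT 0"},
	{"verify_metadata", "TEXT NOT NULL DEFAULT ''"}, // JSON stored as TEXT
	{"verify_error", "TEXT NOT NULL DEFAULT ''"},
}

// existingColumns reads the current column names via PRAGMA table_info.
// The table name is a trusted constant here, never user input.
func existingColumns(db *sql.DB, table string) (map[string]bool, error) {
	rows, err := db.Query(fmt.Sprintf("PRAGMA table_info(%s)", table))
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	cols := map[string]bool{}
	for rows.Next() {
		var (
			cid, notNull, pk int
			name, colType    string
			dflt             sql.NullString
		)
		if err := rows.Scan(&cid, &name, &colType, &notNull, &dflt, &pk); err != nil {
			return nil, err
		}
		cols[name] = true
	}
	return cols, rows.Err()
}

// missingColumnStatements returns one ALTER TABLE per absent column.
func missingColumnStatements(table string, have map[string]bool) []string {
	var stmts []string
	for _, c := range verifyColumns {
		if !have[c.Name] {
			stmts = append(stmts,
				fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s", table, c.Name, c.DDL))
		}
	}
	return stmts
}

// migrate adds whatever is missing; running it again is a no-op.
func migrate(db *sql.DB) error {
	have, err := existingColumns(db, "findings")
	if err != nil {
		return err
	}
	for _, stmt := range missingColumnStatements("findings", have) {
		if _, err := db.Exec(stmt); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Simulate a legacy table that has only some columns; no driver needed.
	have := map[string]bool{"id": true, "verified": true}
	for _, stmt := range missingColumnStatements("findings", have) {
		fmt.Println(stmt)
	}
}
```

Checking the column set before altering keeps the migration idempotent, so the same `Open` path can serve both fresh and legacy databases.
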
### Pending Todos

None yet.

### Blockers/Concerns

- Phase 1: Argon2 vs PBKDF2 for database encryption key derivation — needs decision before Storage Layer implementation
- Phase 1: Aho-Corasick library choice (cloudflare/ahocorasick vs bobrik/ahocorasick) — verify which TruffleHog uses
- Phase 2+: Provider YAML patterns for 108 providers — lesser-known providers need targeted research (Chinese LLMs, niche APIs)
- Phase 11: Google Custom Search API quota (100 queries/day free tier) vs direct scraping ToS trade-off — product decision needed

## Session Continuity

Last session: 2026-04-05T12:44:04.063Z
Stopped at: Completed 05-01-PLAN.md
Resume file: None