From 2dc70787080fa0c25dfa28d98ff11761ecb2a58a Mon Sep 17 00:00:00 2001 From: salvacybersec Date: Mon, 6 Apr 2026 00:17:53 +0300 Subject: [PATCH] docs(08-01): complete dork engine foundation plan SUMMARY, STATE, ROADMAP, and REQUIREMENTS updates for pkg/dorks foundation + custom_dorks storage (DORK-01, DORK-03). --- .planning/REQUIREMENTS.md | 4 +- .planning/ROADMAP.md | 2 +- .planning/STATE.md | 21 +-- .../phases/08-dork-engine/08-01-SUMMARY.md | 140 ++++++++++++++++++ 4 files changed, 155 insertions(+), 12 deletions(-) create mode 100644 .planning/phases/08-dork-engine/08-01-SUMMARY.md diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md index 4367f80..6ef6b41 100644 --- a/.planning/REQUIREMENTS.md +++ b/.planning/REQUIREMENTS.md @@ -211,9 +211,9 @@ Requirements for initial release. Each maps to roadmap phases. ### Dork Engine -- [ ] **DORK-01**: YAML-based dork definitions (GitHub, Google, Shodan, Censys, ZoomEye, FOFA, GitLab, Bing) +- [x] **DORK-01**: YAML-based dork definitions (GitHub, Google, Shodan, Censys, ZoomEye, FOFA, GitLab, Bing) - [ ] **DORK-02**: 150+ built-in dorks across all sources -- [ ] **DORK-03**: keyhunter dorks list/add/run/export commands +- [x] **DORK-03**: keyhunter dorks list/add/run/export commands - [ ] **DORK-04**: Category-filtered dork execution (--category=frontier) ### Web Dashboard diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md index 3121d67..4fb764f 100644 --- a/.planning/ROADMAP.md +++ b/.planning/ROADMAP.md @@ -180,7 +180,7 @@ Plans: **Plans**: 7 plans Plans: -- [ ] 08-01-PLAN.md — Dork schema, go:embed loader, registry, executor interface, custom_dorks storage table +- [x] 08-01-PLAN.md — Dork schema, go:embed loader, registry, executor interface, custom_dorks storage table - [ ] 08-02-PLAN.md — 50 GitHub dork YAML definitions across 5 categories - [ ] 08-03-PLAN.md — 30 Google + 20 Shodan dork YAML definitions - [ ] 08-04-PLAN.md — 15 Censys + 10 ZoomEye + 10 FOFA + 10 GitLab + 5 Bing dork YAML definitions diff --git a/.planning/STATE.md b/.planning/STATE.md index b4d54f6..4ca255a 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -3,14 +3,14 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone status: executing -stopped_at: Completed 06-06-PLAN.md -last_updated: "2026-04-05T21:05:04.569Z" +stopped_at: Completed 08-01-PLAN.md +last_updated: "2026-04-05T21:17:48.315Z" last_activity: 2026-04-05 progress: total_phases: 18 completed_phases: 7 - total_plans: 40 - completed_plans: 40 + total_plans: 47 + completed_plans: 41 percent: 20 --- @@ -21,12 +21,12 @@ progress: See: .planning/PROJECT.md (updated 2026-04-04) **Core value:** Detect leaked LLM API keys across more providers and more internet sources than any other tool, with active verification to confirm keys are real and alive. -**Current focus:** Phase 07 — import-cicd +**Current focus:** Phase 08 — dork-engine ## Current Position -Phase: 8 -Plan: Not started +Phase: 08 (dork-engine) — EXECUTING +Plan: 2 of 7 Status: Ready to execute Last activity: 2026-04-05 @@ -78,6 +78,7 @@ Progress: [██░░░░░░░░] 20% | Phase 06 P03 | ~6m | 1 tasks | 2 files | | Phase 06-output-reporting P05 | 4min | 2 tasks | 3 files | | Phase 06 P06 | 3min | 2 tasks | 3 files | +| Phase 08-dork-engine P01 | 15min | 2 tasks | 10 files | ## Accumulated Context @@ -109,6 +110,8 @@ Recent decisions affecting current work: - [Phase 06]: Registry pattern for output formatters; TableFormatter strips ANSI when writer is not a TTY via zero-value lipgloss.Style - [Phase 06]: SARIF 2.1.0 via hand-rolled structs (no library) per CLAUDE.md - [Phase 06-output-reporting]: keys export rejects SARIF (scan-only); keys show always unmasked; keys verify updates findings inline via db.SQL().Exec +- [Phase 08-dork-engine]: pkg/dorks mirrors pkg/providers go:embed pattern; //go:embed definitions/* tolerates empty .gitkeep-only tree +- [Phase 08-dork-engine]: Runner + Executor interface separate from Registry so 08-05 GitHub executor registers without touching YAML loader ### Pending Todos @@ -123,6 +126,6 @@ None yet. ## Session Continuity -Last session: 2026-04-05T20:42:54.082Z -Stopped at: Completed 06-06-PLAN.md +Last session: 2026-04-05T21:17:48.311Z +Stopped at: Completed 08-01-PLAN.md Resume file: None diff --git a/.planning/phases/08-dork-engine/08-01-SUMMARY.md b/.planning/phases/08-dork-engine/08-01-SUMMARY.md new file mode 100644 index 0000000..ed7d20f --- /dev/null +++ b/.planning/phases/08-dork-engine/08-01-SUMMARY.md @@ -0,0 +1,140 @@ +--- +phase: 08-dork-engine +plan: 01 +subsystem: dork-engine +tags: [dorks, yaml, go-embed, sqlite, registry, executor] + +requires: + - phase: 01-foundation + provides: pkg/providers go:embed loader + Registry pattern, pkg/storage.Open + schema.sql migration harness +provides: + - pkg/dorks package (schema, go:embed loader, Registry, Executor interface, Runner dispatcher) + - ErrSourceNotImplemented / ErrMissingAuth sentinel errors for per-source executors + - custom_dorks SQLite table with SaveCustomDork / ListCustomDorks / GetCustomDork / GetCustomDorkByDorkID / DeleteCustomDork CRUD + - pkg/dorks/definitions/ tree ready for Wave 2 YAML batches +affects: [08-02, 08-03, 08-04, 08-05, 08-06, 08-07, 09-osint, 10-osint] + +tech-stack: + added: [] + patterns: + - "go:embed dir/* directive tolerating empty tree via fs.SkipAll on ErrNotExist" + - "Registry mirroring pkg/providers: byID + bySource + byCategory index maps" + - "Runner pattern for per-source Executor dispatch with sentinel-error fallbacks" + - "JSON-serialised tags stored in SQLite TEXT column with rowScanner interface" + +key-files: + created: + - pkg/dorks/schema.go + - pkg/dorks/loader.go + - pkg/dorks/registry.go + - pkg/dorks/executor.go + - pkg/dorks/registry_test.go + - pkg/dorks/definitions/.gitkeep + - dorks/.gitkeep + - pkg/storage/custom_dorks.go + - pkg/storage/custom_dorks_test.go + modified: + - pkg/storage/schema.sql + +key-decisions: + - "Used //go:embed definitions/* (not definitions/*.yaml) so the placeholder .gitkeep keeps the embed directive valid pre-YAML" + - "Loader tolerates a definitions tree containing only .gitkeep by skipping non-.yaml entries and returning nil dorks without error" + - "Runner + Executor interface split (separate from Registry) so Plan 08-05 can wire GitHub live without touching registry internals" + - "custom_dorks.dork_id UNIQUE constraint at SQL layer prevents user duplicate IDs; embedded dork IDs are orthogonal (read-only)" + - "Tags serialised as JSON in a TEXT column (no join table) to keep CRUD trivial for a user-authored data set" + +patterns-established: + - "Embedded YAML registries in KeyHunter follow the pkg/providers blueprint: schema.go + loader.go (go:embed) + registry.go with index maps + per-thing tests using NewRegistryFromX synthetic fixtures" + - "Per-source executor dispatch: interface + Runner.Register + ErrSourceNotImplemented sentinel for deferred backends" + +requirements-completed: [DORK-01, DORK-03] + +duration: ~15min +completed: 2026-04-05 +--- + +# Phase 08 Plan 01: Dork Engine Foundation Summary + +**pkg/dorks package mirroring the pkg/providers go:embed pattern with Registry, Executor interface, Runner dispatch, and custom_dorks SQLite CRUD — ready for Wave 2 YAML batches and the Plan 08-05 GitHub executor.** + +## Performance + +- **Duration:** ~15 min +- **Started:** 2026-04-05 +- **Completed:** 2026-04-05 +- **Tasks:** 2 +- **Files created:** 9 +- **Files modified:** 1 + +## Accomplishments +- `pkg/dorks` package compiles and exposes `NewRegistry`, `NewRegistryFromDorks`, `Get`, `List`, `ListBySource`, `ListByCategory`, `Stats`, `Runner`, `Executor`, `Match`, `ErrSourceNotImplemented`, `ErrMissingAuth` +- `go:embed definitions/*` loader tolerates an empty tree (only `.gitkeep`), so Wave 2 plans can drop YAML files into `pkg/dorks/definitions/{source}/` with zero loader changes +- `custom_dorks` table + CRUD fully exercised via round-trip tests on `storage.Open(":memory:")` +- All tests green, `go build ./...` clean, no regressions in existing storage tests + +## Task Commits + +1. **Task 1: pkg/dorks foundation (schema, loader, registry, executor, tests)** — `fd6efbb` (feat) +2. **Task 2: custom_dorks storage table + CRUD + tests** — `01062b8` (feat) + +_Note: Tests were co-located with implementation in a single commit per task; `NewRegistryFromDorks` served as the synthetic-fixture entry point so the embedded FS was never required during testing._ + +## Files Created/Modified + +- `pkg/dorks/schema.go` — `Dork` struct, `ValidSources`/`ValidCategories`, `Validate()`, `Stats` +- `pkg/dorks/loader.go` — `//go:embed definitions/*` loader tolerating empty tree, validates every parsed dork, wraps errors with file path +- `pkg/dorks/registry.go` — `Registry` with `byID`, `bySource`, `byCategory` indexes and `List/Get/ListBySource/ListByCategory/Stats` +- `pkg/dorks/executor.go` — `Match`, `Executor` interface, `Runner` dispatcher, `ErrSourceNotImplemented`, `ErrMissingAuth` +- `pkg/dorks/registry_test.go` — 9 test functions covering registry methods, `Validate`, `Runner` dispatch, empty-tree `NewRegistry` +- `pkg/dorks/definitions/.gitkeep` — keeps embed directive valid pre-YAML +- `dorks/.gitkeep` — placeholder at repo root per plan layout +- `pkg/storage/schema.sql` — added `custom_dorks` CREATE TABLE + `idx_custom_dorks_source` + `idx_custom_dorks_category` +- `pkg/storage/custom_dorks.go` — `CustomDork` struct + `SaveCustomDork / ListCustomDorks / GetCustomDork / GetCustomDorkByDorkID / DeleteCustomDork`, JSON-encoded tags, shared `rowScanner` helper +- `pkg/storage/custom_dorks_test.go` — 7 tests: round-trip, newest-first list, not-found, by-dork-id lookup, delete + double-delete no-op, unique constraint, schema migration idempotency + +## Decisions Made + +- **`definitions/*` instead of `definitions/*.yaml`**: The `.yaml`-only form fails compilation when the directory contains only `.gitkeep`. Using `*` and filtering by extension at walk time allows the foundation plan to ship with zero YAML and for Wave 2 plans to add files incrementally. +- **Separate `Runner` from `Registry`**: The executor plumbing is its own type so downstream plans (08-05 GitHub, 09-16 OSINT) can register implementations without coupling to the YAML index. `Registry` stays read-only and cache-safe. +- **Tags as JSON TEXT, not a join table**: A junction table for ~150 user-authored rows worth of tags is not justified. `encoding/json` round-trip is trivial and keeps the CRUD API flat. +- **`dork_id` UNIQUE at SQL layer**: Enforces user-visible ID uniqueness deterministically; tested via duplicate-insert failure. + +## Deviations from Plan + +None — plan executed exactly as written. The only nuance was the deliberate use of `//go:embed definitions/*` instead of `//go:embed definitions` (the plan action step called out this exact contingency and the implementation picked the tolerant form). + +## Issues Encountered + +None. + +## User Setup Required + +None — no external service configuration required for this plan. + +## Next Phase Readiness + +- Wave 2 plans (08-02 / 08-03 / 08-04 / 08-07) can drop YAML files into `pkg/dorks/definitions/{source}/` and they will load automatically via `NewRegistry()` +- Plan 08-05 (GitHub live executor) can implement `dorks.Executor` and `Runner.Register` without modifying registry code +- Plan 08-06 (CLI `dorks add/list/delete/info`) has `pkg/storage.CustomDork` CRUD available immediately +- `ErrSourceNotImplemented` gives Plan 08-05's CLI a clean error path for non-GitHub sources until OSINT phases 9-16 arrive + +## Self-Check: PASSED + +- pkg/dorks/schema.go — FOUND +- pkg/dorks/loader.go — FOUND +- pkg/dorks/registry.go — FOUND +- pkg/dorks/executor.go — FOUND +- pkg/dorks/registry_test.go — FOUND +- pkg/dorks/definitions/.gitkeep — FOUND +- dorks/.gitkeep — FOUND +- pkg/storage/schema.sql — FOUND (modified) +- pkg/storage/custom_dorks.go — FOUND +- pkg/storage/custom_dorks_test.go — FOUND +- commit fd6efbb — FOUND +- commit 01062b8 — FOUND +- `go build ./...` — PASSED +- `go test ./pkg/dorks/... ./pkg/storage/...` — PASSED + +--- +*Phase: 08-dork-engine* +*Completed: 2026-04-05*