---
phase: 08-dork-engine
plan: 01
subsystem: dork-engine
tags: [dorks, yaml, go-embed, sqlite, registry, executor]

requires:
  - phase: 01-foundation
    provides: pkg/providers go:embed loader + Registry pattern, pkg/storage.Open + schema.sql migration harness
provides:
  - pkg/dorks package (schema, go:embed loader, Registry, Executor interface, Runner dispatcher)
  - ErrSourceNotImplemented / ErrMissingAuth sentinel errors for per-source executors
  - custom_dorks SQLite table with SaveCustomDork / ListCustomDorks / GetCustomDork / GetCustomDorkByDorkID / DeleteCustomDork CRUD
  - pkg/dorks/definitions/ tree ready for Wave 2 YAML batches
affects: [08-02, 08-03, 08-04, 08-05, 08-06, 08-07, 09-osint, 10-osint]

tech-stack:
  added: []
  patterns:
    - "go:embed dir/* directive tolerating empty tree via fs.SkipAll on ErrNotExist"
    - "Registry mirroring pkg/providers: byID + bySource + byCategory index maps"
    - "Runner pattern for per-source Executor dispatch with sentinel-error fallbacks"
    - "JSON-serialised tags stored in SQLite TEXT column with rowScanner interface"

key-files:
  created:
    - pkg/dorks/schema.go
    - pkg/dorks/loader.go
    - pkg/dorks/registry.go
    - pkg/dorks/executor.go
    - pkg/dorks/registry_test.go
    - pkg/dorks/definitions/.gitkeep
    - dorks/.gitkeep
    - pkg/storage/custom_dorks.go
    - pkg/storage/custom_dorks_test.go
  modified:
    - pkg/storage/schema.sql

key-decisions:
  - "Used //go:embed definitions/* (not definitions/*.yaml) so the placeholder .gitkeep keeps the embed directive valid pre-YAML"
  - "Loader tolerates a definitions tree containing only .gitkeep by skipping non-.yaml entries and returning nil dorks without error"
  - "Runner + Executor interface split (separate from Registry) so Plan 08-05 can wire GitHub live without touching registry internals"
  - "custom_dorks.dork_id UNIQUE constraint at SQL layer prevents user duplicate IDs; embedded dork IDs are orthogonal (read-only)"
  - "Tags serialised as JSON in a TEXT column (no join table) to keep CRUD trivial for a user-authored data set"

patterns-established:
  - "Embedded YAML registries in KeyHunter follow the pkg/providers blueprint: schema.go + loader.go (go:embed) + registry.go with index maps + per-thing tests using NewRegistryFromX synthetic fixtures"
  - "Per-source executor dispatch: interface + Runner.Register + ErrSourceNotImplemented sentinel for deferred backends"

requirements-completed: [DORK-01, DORK-03]

duration: ~15min
completed: 2026-04-05
---

# Phase 08 Plan 01: Dork Engine Foundation Summary

**pkg/dorks package mirroring the pkg/providers go:embed pattern with Registry, Executor interface, Runner dispatch, and custom_dorks SQLite CRUD; ready for Wave 2 YAML batches and the Plan 08-05 GitHub executor.**

## Performance

- **Duration:** ~15 min
- **Started:** 2026-04-05
- **Completed:** 2026-04-05
- **Tasks:** 2
- **Files created:** 9
- **Files modified:** 1

## Accomplishments

- `pkg/dorks` package compiles and exposes `NewRegistry`, `NewRegistryFromDorks`, `Get`, `List`, `ListBySource`, `ListByCategory`, `Stats`, `Runner`, `Executor`, `Match`, `ErrSourceNotImplemented`, `ErrMissingAuth`
- `go:embed definitions/*` loader tolerates an empty tree (only `.gitkeep`), so Wave 2 plans can drop YAML files into `pkg/dorks/definitions/{source}/` with zero loader changes
- `custom_dorks` table + CRUD fully exercised via round-trip tests on `storage.Open(":memory:")`
- All tests green, `go build ./...` clean, no regressions in existing storage tests

## Task Commits

1. **Task 1: pkg/dorks foundation (schema, loader, registry, executor, tests)**: `fd6efbb` (feat)
2. **Task 2: custom_dorks storage table + CRUD + tests**: `01062b8` (feat)

_Note: Tests were co-located with implementation in a single commit per task; `NewRegistryFromDorks` served as the synthetic-fixture entry point so the embedded FS was never required during testing._

## Files Created/Modified

- `pkg/dorks/schema.go`: `Dork` struct, `ValidSources`/`ValidCategories`, `Validate()`, `Stats`
- `pkg/dorks/loader.go`: `//go:embed definitions/*` loader tolerating empty tree, validates every parsed dork, wraps errors with file path
- `pkg/dorks/registry.go`: `Registry` with `byID`, `bySource`, `byCategory` indexes and `List/Get/ListBySource/ListByCategory/Stats`
- `pkg/dorks/executor.go`: `Match`, `Executor` interface, `Runner` dispatcher, `ErrSourceNotImplemented`, `ErrMissingAuth`
- `pkg/dorks/registry_test.go`: 9 test functions covering registry methods, `Validate`, `Runner` dispatch, empty-tree `NewRegistry`
- `pkg/dorks/definitions/.gitkeep`: keeps embed directive valid pre-YAML
- `dorks/.gitkeep`: placeholder at repo root per plan layout
- `pkg/storage/schema.sql`: added `custom_dorks` CREATE TABLE + `idx_custom_dorks_source` + `idx_custom_dorks_category`
- `pkg/storage/custom_dorks.go`: `CustomDork` struct + `SaveCustomDork / ListCustomDorks / GetCustomDork / GetCustomDorkByDorkID / DeleteCustomDork`, JSON-encoded tags, shared `rowScanner` helper
- `pkg/storage/custom_dorks_test.go`: 7 tests: round-trip, newest-first list, not-found, by-dork-id lookup, delete + double-delete no-op, unique constraint, schema migration idempotency

## Decisions Made

- **`definitions/*` instead of `definitions/*.yaml`**: The `.yaml`-only form fails compilation when the directory contains only `.gitkeep`. Using `*` and filtering by extension at walk time lets the foundation plan ship with zero YAML and lets Wave 2 plans add files incrementally.
- **Separate `Runner` from `Registry`**: The executor plumbing is its own type so downstream plans (08-05 GitHub, 09-16 OSINT) can register implementations without coupling to the YAML index. `Registry` stays read-only and cache-safe.
- **Tags as JSON TEXT, not a join table**: A junction table for ~150 user-authored rows' worth of tags is not justified. The `encoding/json` round-trip is trivial and keeps the CRUD API flat.
- **`dork_id` UNIQUE at SQL layer**: Enforces user-visible ID uniqueness deterministically; tested via duplicate-insert failure.

## Deviations from Plan

None. The plan was executed exactly as written. The only nuance was the deliberate use of `//go:embed definitions/*` instead of `//go:embed definitions` (the plan action step called out this exact contingency and the implementation picked the tolerant form).

## Issues Encountered

None.

## User Setup Required

None. No external service configuration is required for this plan.

## Next Phase Readiness

- Wave 2 plans (08-02 / 08-03 / 08-04 / 08-07) can drop YAML files into `pkg/dorks/definitions/{source}/` and they will load automatically via `NewRegistry()`
- Plan 08-05 (GitHub live executor) can implement `dorks.Executor` and call `Runner.Register` without modifying registry code
- Plan 08-06 (CLI `dorks add/list/delete/info`) has `pkg/storage.CustomDork` CRUD available immediately
- `ErrSourceNotImplemented` gives Plan 08-05's CLI a clean error path for non-GitHub sources until OSINT phases 9-16 arrive

## Self-Check: PASSED

- pkg/dorks/schema.go: FOUND
- pkg/dorks/loader.go: FOUND
- pkg/dorks/registry.go: FOUND
- pkg/dorks/executor.go: FOUND
- pkg/dorks/registry_test.go: FOUND
- pkg/dorks/definitions/.gitkeep: FOUND
- dorks/.gitkeep: FOUND
- pkg/storage/schema.sql: FOUND (modified)
- pkg/storage/custom_dorks.go: FOUND
- pkg/storage/custom_dorks_test.go: FOUND
- commit fd6efbb: FOUND
- commit 01062b8: FOUND
- `go build ./...`: PASSED
- `go test ./pkg/dorks/... ./pkg/storage/...`: PASSED

---

*Phase: 08-dork-engine*
*Completed: 2026-04-05*
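
To illustrate the `Runner` / `Executor` split and the sentinel-error fallback recorded under Decisions Made, here is a minimal sketch. It is not the code in `pkg/dorks/executor.go`: only the names `Runner`, `Executor`, `Match`, `Register`, `ErrSourceNotImplemented`, and `ErrMissingAuth` come from this summary. The method signatures (`Source`, `Execute`, `Run`), the struct fields, the error messages, and the `fakeGitHub` executor are assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Sentinel errors named in the summary; the messages are assumptions.
var (
	ErrSourceNotImplemented = errors.New("dork source not implemented")
	ErrMissingAuth          = errors.New("dork source requires credentials")
)

// Dork and Match are reduced to the fields this sketch needs (assumed shapes).
type Dork struct {
	ID     string
	Source string
	Query  string
}

type Match struct {
	DorkID string
	URL    string
}

// Executor is implemented once per source (assumed method set).
type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

// Runner dispatches a dork to the executor registered for its source.
// It holds no reference to the Registry, so the two stay decoupled.
type Runner struct {
	executors map[string]Executor
}

func NewRunner() *Runner {
	return &Runner{executors: make(map[string]Executor)}
}

// Register adds or replaces the executor for e.Source().
func (r *Runner) Register(e Executor) {
	r.executors[e.Source()] = e
}

// Run wraps ErrSourceNotImplemented when no executor handles d.Source.
func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
	e, ok := r.executors[d.Source]
	if !ok {
		return nil, fmt.Errorf("%w: %s", ErrSourceNotImplemented, d.Source)
	}
	return e.Execute(ctx, d, limit)
}

// fakeGitHub is a hypothetical stand-in for a live executor.
type fakeGitHub struct{ token string }

func (f fakeGitHub) Source() string { return "github" }

func (f fakeGitHub) Execute(_ context.Context, d Dork, _ int) ([]Match, error) {
	if f.token == "" {
		return nil, ErrMissingAuth
	}
	return []Match{{DorkID: d.ID, URL: "https://example.invalid/result"}}, nil
}

// demo exercises the three paths: success, unknown source, missing token.
func demo() (matches int, notImplemented, missingAuth bool) {
	ctx := context.Background()
	r := NewRunner()

	r.Register(fakeGitHub{token: "t"})
	ms, _ := r.Run(ctx, Dork{ID: "gh-1", Source: "github"}, 10)

	_, err := r.Run(ctx, Dork{ID: "sh-1", Source: "shodan"}, 10)
	notImplemented = errors.Is(err, ErrSourceNotImplemented)

	r.Register(fakeGitHub{}) // replace with an executor that has no token
	_, err = r.Run(ctx, Dork{ID: "gh-1", Source: "github"}, 10)
	missingAuth = errors.Is(err, ErrMissingAuth)

	return len(ms), notImplemented, missingAuth
}

func main() {
	fmt.Println(demo()) // prints: 1 true true
}
```

Callers check the sentinels with `errors.Is`, so an executor or the runner can wrap them with extra context (here, the source name) without breaking the comparison.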