---
phase: 08-dork-engine
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - pkg/dorks/schema.go
  - pkg/dorks/loader.go
  - pkg/dorks/registry.go
  - pkg/dorks/executor.go
  - pkg/dorks/registry_test.go
  - pkg/dorks/definitions/.gitkeep
  - dorks/.gitkeep
  - pkg/storage/schema.sql
  - pkg/storage/custom_dorks.go
  - pkg/storage/custom_dorks_test.go
autonomous: true
requirements:
  - DORK-01
  - DORK-03
must_haves:
  truths:
    - "pkg/dorks.NewRegistry() loads embedded YAML files without error"
    - "Registry.List(), Get(id), Stats(), ListBySource(), ListByCategory() return correct data"
    - "Executor interface defined and per-source Executor map exists (all stubbed except placeholder)"
    - "custom_dorks table exists and SaveCustomDork/ListCustomDorks/DeleteCustomDork work round-trip"
  artifacts:
    - path: "pkg/dorks/schema.go"
      provides: "Dork struct matching 08-CONTEXT YAML schema"
      contains: "type Dork struct"
    - path: "pkg/dorks/loader.go"
      provides: "go:embed loader mirroring pkg/providers/loader.go"
      contains: "//go:embed definitions"
    - path: "pkg/dorks/registry.go"
      provides: "Registry with List/Get/Stats/ListBySource/ListByCategory"
      contains: "func NewRegistry"
    - path: "pkg/dorks/executor.go"
      provides: "Executor interface + source dispatch + ErrSourceNotImplemented"
      contains: "type Executor interface"
    - path: "pkg/storage/custom_dorks.go"
      provides: "SaveCustomDork/ListCustomDorks/DeleteCustomDork/GetCustomDork"
      contains: "custom_dorks"
  key_links:
    - from: "pkg/dorks/loader.go"
      to: "pkg/dorks/definitions/*/*.yaml"
      via: "go:embed"
      pattern: "embed.FS"
    - from: "pkg/storage/schema.sql"
      to: "custom_dorks table"
      via: "CREATE TABLE"
      pattern: "CREATE TABLE IF NOT EXISTS custom_dorks"
---

Foundation of the dork engine: schema, go:embed loader, registry, executor interface, and storage table for user-added custom dorks.
Mirrors the proven pkg/providers pattern from Phase 1 so downstream plans can drop 150+ YAML files into pkg/dorks/definitions/{source}/ and have them load immediately at startup.

Purpose: Unblock parallel Wave 2 plans (50-dork YAML batches and GitHub live executor).
Output: pkg/dorks package with passing tests + custom_dorks table migration.

@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/08-dork-engine/08-CONTEXT.md
@pkg/providers/loader.go
@pkg/providers/registry.go
@pkg/storage/db.go
@pkg/storage/schema.sql

From pkg/providers/loader.go:
```go
//go:embed definitions/*.yaml
var definitionsFS embed.FS

func loadProviders() ([]Provider, error) {
	fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error { ... })
}
```

From pkg/providers/registry.go:
```go
type Registry struct { providers []Provider; index map[string]int; ... }

func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats
```

From pkg/storage/db.go:
```go
type DB struct { sql *sql.DB }

func (db *DB) SQL() *sql.DB
```

Task 1: Dork schema, go:embed loader, registry, executor interface

pkg/dorks/schema.go, pkg/dorks/loader.go, pkg/dorks/registry.go, pkg/dorks/executor.go, pkg/dorks/registry_test.go, pkg/dorks/definitions/.gitkeep, dorks/.gitkeep

- Test: registry with two synthetic YAMLs under definitions/ loads 2 dorks
- Test: Registry.Get("openai-github-envfile") returns the correct Dork
- Test: Registry.ListBySource("github") returns only github dorks
- Test: Registry.ListByCategory("frontier") returns only frontier dorks
- Test: Registry.Stats() returns ByCategory + BySource counts
- Test: Runner.Run with a dork whose source is "shodan" returns ErrSourceNotImplemented
- Test: Dork.Validate() rejects empty id/source/query

1.
Create pkg/dorks/schema.go:
```go
package dorks

type Dork struct {
	ID          string   `yaml:"id"`
	Name        string   `yaml:"name"`
	Source      string   `yaml:"source"`   // github|google|shodan|censys|zoomeye|fofa|gitlab|bing
	Category    string   `yaml:"category"` // frontier|specialized|infrastructure|emerging|enterprise
	Query       string   `yaml:"query"`
	Description string   `yaml:"description"`
	Tags        []string `yaml:"tags"`
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

func (d Dork) Validate() error { /* non-empty id/source/query + source in ValidSources */ }
```

2. Create pkg/dorks/loader.go mirroring pkg/providers/loader.go:
```go
//go:embed definitions
var definitionsFS embed.FS

func loadDorks() ([]Dork, error) {
	// fs.WalkDir on "definitions", descend into {source}/ subdirs, parse *.yaml
}
```
Walk pattern: definitions/github/*.yaml, definitions/google/*.yaml, etc. Every file decoded via yaml.Unmarshal into Dork. Call Validate() per file; wrap errors with file path. Return combined slice.

3. Create pkg/dorks/registry.go:
```go
type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

func NewRegistry() (*Registry, error)           // uses loadDorks()
func NewRegistryFromDorks(ds []Dork) *Registry  // for tests
func (r *Registry) List() []Dork
func (r *Registry) Get(id string) (Dork, bool)
func (r *Registry) ListBySource(src string) []Dork
func (r *Registry) ListByCategory(cat string) []Dork
func (r *Registry) Stats() Stats // {Total int; BySource map[string]int; ByCategory map[string]int}
```

4.
Create pkg/dorks/executor.go (interface + source dispatcher, stubs only; the real GitHub implementation comes in Plan 08-05):
```go
package dorks

import (
	"context"
	"errors"
	"fmt"
)

var ErrSourceNotImplemented = errors.New("dork source not yet implemented")
var ErrMissingAuth = errors.New("dork source requires auth credentials")

type Match struct {
	DorkID  string
	Source  string
	URL     string
	Snippet string // content chunk to feed into engine detector
	Path    string // file path in repo, if applicable
}

type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

type Runner struct {
	executors map[string]Executor
}

func NewRunner() *Runner { return &Runner{executors: map[string]Executor{}} }

func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }

// Run dispatches to the executor registered for d.Source, or returns a
// wrapped ErrSourceNotImplemented when none is registered.
func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
	ex, ok := r.executors[d.Source]
	if !ok {
		return nil, fmt.Errorf("%w: %s (coming Phase 9-16)", ErrSourceNotImplemented, d.Source)
	}
	return ex.Execute(ctx, d, limit)
}
```
No real executors are registered here; Plan 08-05 wires the GitHub executor via a separate constructor (NewRunnerWithGitHub or similar).

5. Create pkg/dorks/registry_test.go with the behavior cases listed above. Use NewRegistryFromDorks for synthetic fixtures; do NOT touch the real embedded FS (downstream plans populate it). One test MAY call NewRegistry() and assert only that err is nil or "definitions directory empty"; either outcome is acceptable before any YAML exists.

6. Create placeholder files so that go:embed succeeds with an empty tree:
- pkg/dorks/definitions/.gitkeep (empty)
- dorks/.gitkeep (empty)

IMPORTANT: go:embed requires at least one matching file. If `//go:embed definitions` fails when only .gitkeep exists (a pattern that names a directory skips files beginning with `.` or `_`), switch the directive to `//go:embed definitions/*` and handle the empty case by returning nil dorks (no error) when WalkDir sees only .gitkeep. Tests must pass with zero real YAML present.

cd /home/salva/Documents/apikey && go test ./pkg/dorks/...
-v

pkg/dorks builds, all registry + executor tests pass, loader tolerates an empty definitions tree, ErrSourceNotImplemented returned for an unregistered source.

Task 2: custom_dorks storage table + CRUD

pkg/storage/schema.sql, pkg/storage/custom_dorks.go, pkg/storage/custom_dorks_test.go

- Test: SaveCustomDork inserts a row and returns an auto-increment ID
- Test: ListCustomDorks returns all saved custom dorks, newest first
- Test: GetCustomDork(id) returns the dork or sql.ErrNoRows
- Test: DeleteCustomDork(id) removes it; a subsequent Get returns sql.ErrNoRows
- Test: schema migration is idempotent. Calling Open twice on :memory: creates a new DB each time, so instead re-execute the schema on the same *sql.DB and verify that the CREATE TABLE IF NOT EXISTS form does not error

1. Append to pkg/storage/schema.sql:
```sql
CREATE TABLE IF NOT EXISTS custom_dorks (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    dork_id     TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL,
    source      TEXT NOT NULL,
    category    TEXT NOT NULL,
    query       TEXT NOT NULL,
    description TEXT,
    tags        TEXT, -- JSON array
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_custom_dorks_source ON custom_dorks(source);
CREATE INDEX IF NOT EXISTS idx_custom_dorks_category ON custom_dorks(category);
```

2. Create pkg/storage/custom_dorks.go:
```go
type CustomDork struct {
	ID          int64
	DorkID      string
	Name        string
	Source      string
	Category    string
	Query       string
	Description string
	Tags        []string
	CreatedAt   time.Time
}

func (db *DB) SaveCustomDork(d CustomDork) (int64, error)
func (db *DB) ListCustomDorks() ([]CustomDork, error)
func (db *DB) GetCustomDork(id int64) (CustomDork, error) // returns sql.ErrNoRows if missing
func (db *DB) GetCustomDorkByDorkID(dorkID string) (CustomDork, error)
func (db *DB) DeleteCustomDork(id int64) (int64, error)
```
Tags are round-tripped via encoding/json (TEXT column). dork_id is UNIQUE so a user cannot create duplicate custom IDs.

3. Create pkg/storage/custom_dorks_test.go covering the behavior cases above.
Use storage.Open(":memory:") as the existing storage tests do.

cd /home/salva/Documents/apikey && go test ./pkg/storage/... -run CustomDork -v

custom_dorks table created on Open(), CRUD round-trip tests pass, no regressions in the existing storage test suite.

- `go build ./...` succeeds
- `go test ./pkg/dorks/... ./pkg/storage/...` passes
- `grep -r "//go:embed" pkg/dorks/` shows the definitions embed directive

- pkg/dorks.NewRegistry() compiles and runs (zero or more embedded dorks)
- Executor interface + ErrSourceNotImplemented in place for Plan 08-05 and 08-06
- custom_dorks CRUD functional; downstream `dorks add`/`dorks delete` commands have a storage backend to call

After completion, create `.planning/phases/08-dork-engine/08-01-SUMMARY.md`
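
For reference while executing Task 1, here is a minimal sketch of how Dork.Validate and the NewRegistryFromDorks indexes might be written. It is not the planned implementation. Dork is trimmed to the fields the indexes need, the error messages, the second fixture ID, and the placeholder query strings are assumptions, and it runs as a standalone `package main` rather than as `package dorks`. Only the identifiers named in the plan (Dork, ValidSources, Registry, NewRegistryFromDorks, Get, ListBySource) come from the plan itself.

```go
package main

import (
	"errors"
	"fmt"
)

// Dork is a trimmed copy of the plan's struct (YAML tags and some fields omitted).
type Dork struct {
	ID, Name, Source, Category, Query string
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

// Validate rejects an empty id, source, or query, and any source not in ValidSources.
// The error wording is illustrative only.
func (d Dork) Validate() error {
	switch {
	case d.ID == "":
		return errors.New("dork: id is required")
	case d.Source == "":
		return fmt.Errorf("dork %q: source is required", d.ID)
	case d.Query == "":
		return fmt.Errorf("dork %q: query is required", d.ID)
	}
	for _, s := range ValidSources {
		if s == d.Source {
			return nil
		}
	}
	return fmt.Errorf("dork %q: unknown source %q", d.ID, d.Source)
}

// Registry stores dorks once and indexes them by slice position.
type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

// NewRegistryFromDorks builds all three indexes in a single pass.
// If two dorks share an ID, the later one wins in byID.
func NewRegistryFromDorks(ds []Dork) *Registry {
	r := &Registry{
		dorks:      ds,
		byID:       make(map[string]int, len(ds)),
		bySource:   map[string][]int{},
		byCategory: map[string][]int{},
	}
	for i, d := range ds {
		r.byID[d.ID] = i
		r.bySource[d.Source] = append(r.bySource[d.Source], i)
		r.byCategory[d.Category] = append(r.byCategory[d.Category], i)
	}
	return r
}

// Get returns the dork with the given ID, if present.
func (r *Registry) Get(id string) (Dork, bool) {
	i, ok := r.byID[id]
	if !ok {
		return Dork{}, false
	}
	return r.dorks[i], true
}

// ListBySource returns copies of every dork registered for src.
func (r *Registry) ListBySource(src string) []Dork {
	out := make([]Dork, 0, len(r.bySource[src]))
	for _, i := range r.bySource[src] {
		out = append(out, r.dorks[i])
	}
	return out
}

func main() {
	r := NewRegistryFromDorks([]Dork{
		{ID: "openai-github-envfile", Source: "github", Category: "frontier", Query: "placeholder query"},
		{ID: "example-shodan-dork", Source: "shodan", Category: "infrastructure", Query: "placeholder query"},
	})
	d, ok := r.Get("openai-github-envfile")
	fmt.Println(d.ID, ok, len(r.ListBySource("github"))) // prints: openai-github-envfile true 1
}
```

Storing slice positions in the indexes keeps each Dork in one place, so ListByCategory and Stats can be derived from the same maps without copying the data at build time.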