keyhunter/.planning/phases/08-dork-engine/08-01-PLAN.md
2026-04-06 00:13:13 +03:00

---
phase: 08-dork-engine
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - pkg/dorks/schema.go
  - pkg/dorks/loader.go
  - pkg/dorks/registry.go
  - pkg/dorks/executor.go
  - pkg/dorks/registry_test.go
  - pkg/dorks/definitions/.gitkeep
  - dorks/.gitkeep
  - pkg/storage/schema.sql
  - pkg/storage/custom_dorks.go
  - pkg/storage/custom_dorks_test.go
autonomous: true
requirements:
  - DORK-01
  - DORK-03
must_haves:
  truths:
    - "pkg/dorks.NewRegistry() loads embedded YAML files without error"
    - "Registry.List(), Get(id), Stats(), ListBySource(), ListByCategory() return correct data"
    - "ExecuteDork interface defined and per-source Executor map exists (all stubbed except placeholder)"
    - "custom_dorks table exists and SaveCustomDork/ListCustomDorks/DeleteCustomDork work round-trip"
  artifacts:
    - path: "pkg/dorks/schema.go"
      provides: "Dork struct matching 08-CONTEXT YAML schema"
      contains: "type Dork struct"
    - path: "pkg/dorks/loader.go"
      provides: "go:embed loader mirroring pkg/providers/loader.go"
      contains: "//go:embed definitions"
    - path: "pkg/dorks/registry.go"
      provides: "Registry with List/Get/Stats/ListBySource/ListByCategory"
      contains: "func NewRegistry"
    - path: "pkg/dorks/executor.go"
      provides: "Executor interface + source dispatch + ErrSourceNotImplemented"
      contains: "type Executor interface"
    - path: "pkg/storage/custom_dorks.go"
      provides: "SaveCustomDork/ListCustomDorks/DeleteCustomDork/GetCustomDork"
      contains: "custom_dorks"
  key_links:
    - from: "pkg/dorks/loader.go"
      to: "pkg/dorks/definitions/*/*.yaml"
      via: "go:embed"
      pattern: "embed.FS"
    - from: "pkg/storage/schema.sql"
      to: "custom_dorks table"
      via: "CREATE TABLE"
      pattern: "CREATE TABLE IF NOT EXISTS custom_dorks"
---

Foundation of the dork engine: schema, go:embed loader, registry, executor interface, and storage table for user-added custom dorks. Mirrors the proven pkg/providers pattern from Phase 1 so downstream plans can drop 150+ YAML files into pkg/dorks/definitions/{source}/ and have them immediately load at startup.

Purpose: Unblock parallel Wave 2 plans (50-dork YAML batches and GitHub live executor).

Output: pkg/dorks package with passing tests + custom_dorks table migration.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/08-dork-engine/08-CONTEXT.md @pkg/providers/loader.go @pkg/providers/registry.go @pkg/storage/db.go @pkg/storage/schema.sql

From pkg/providers/loader.go:

```go
//go:embed definitions/*.yaml
var definitionsFS embed.FS

func loadProviders() ([]Provider, error) {
    fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error { ... })
}
```

From pkg/providers/registry.go:

```go
type Registry struct { providers []Provider; index map[string]int; ... }
func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats
```

From pkg/storage/db.go:

```go
type DB struct { sql *sql.DB }
func (db *DB) SQL() *sql.DB
```

Task 1: Dork schema, go:embed loader, registry, executor interface

Files: pkg/dorks/schema.go, pkg/dorks/loader.go, pkg/dorks/registry.go, pkg/dorks/executor.go, pkg/dorks/registry_test.go, pkg/dorks/definitions/.gitkeep, dorks/.gitkeep

Behavior:
- Test: registry with two synthetic YAMLs under definitions/ loads 2 dorks
- Test: Registry.Get("openai-github-envfile") returns the correct Dork
- Test: Registry.ListBySource("github") returns only github dorks
- Test: Registry.ListByCategory("frontier") returns only frontier dorks
- Test: Registry.Stats() returns ByCategory + BySource counts
- Test: executor.ExecuteDork with source "shodan" returns ErrSourceNotImplemented
- Test: Dork.Validate() rejects empty id/source/query

Action:

1. Create pkg/dorks/schema.go:
   ```go
   package dorks

   type Dork struct {
       ID          string   `yaml:"id"`
       Name        string   `yaml:"name"`
       Source      string   `yaml:"source"`   // github|google|shodan|censys|zoomeye|fofa|gitlab|bing
       Category    string   `yaml:"category"` // frontier|specialized|infrastructure|emerging|enterprise
       Query       string   `yaml:"query"`
       Description string   `yaml:"description"`
       Tags        []string `yaml:"tags"`
   }

   var ValidSources = []string{"github","google","shodan","censys","zoomeye","fofa","gitlab","bing"}

   func (d Dork) Validate() error { /* non-empty id/source/query + source in ValidSources */ }
   ```
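
The `Validate()` stub in step 1 only describes its checks in a comment. A minimal sketch of one way to fill it in is shown here, assuming the error wording is free to choose (the messages below are illustrative, not taken from 08-CONTEXT) and omitting the yaml tags so the example stays standard-library only:

```go
package main

import (
	"errors"
	"fmt"
)

// Dork repeats the fields from step 1; yaml tags are left out of this sketch.
type Dork struct {
	ID, Name, Source, Category, Query, Description string
	Tags                                           []string
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

// isValidSource reports whether s appears in ValidSources.
func isValidSource(s string) bool {
	for _, v := range ValidSources {
		if v == s {
			return true
		}
	}
	return false
}

// Validate rejects an empty id, source, or query, and any source that is
// not listed in ValidSources. The error texts are assumptions.
func (d Dork) Validate() error {
	switch {
	case d.ID == "":
		return errors.New("dork: id is required")
	case d.Source == "":
		return fmt.Errorf("dork %q: source is required", d.ID)
	case d.Query == "":
		return fmt.Errorf("dork %q: query is required", d.ID)
	case !isValidSource(d.Source):
		return fmt.Errorf("dork %q: unknown source %q", d.ID, d.Source)
	}
	return nil
}

func main() {
	// The query string is a made-up placeholder.
	ok := Dork{ID: "openai-github-envfile", Source: "github", Query: "placeholder query"}
	fmt.Println("valid dork:", ok.Validate())

	bad := Dork{ID: "example", Source: "pastebin", Query: "placeholder query"}
	fmt.Println("invalid dork:", bad.Validate())
}
```

Checking the empty fields before the source whitelist keeps the "missing" and "unknown" cases distinguishable in test assertions.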

2. Create pkg/dorks/loader.go mirroring pkg/providers/loader.go:
   ```go
   //go:embed definitions
   var definitionsFS embed.FS

   func loadDorks() ([]Dork, error) {
       // fs.WalkDir on "definitions", descend into {source}/ subdirs, parse *.yaml
   }
   ```
   Walk pattern: definitions/github/*.yaml, definitions/google/*.yaml, etc.
   Every file decoded via yaml.Unmarshal into Dork. Call Validate() per file; wrap
   errors with file path. Return combined slice.

3. Create pkg/dorks/registry.go:
   ```go
   type Registry struct {
       dorks      []Dork
       byID       map[string]int
       bySource   map[string][]int
       byCategory map[string][]int
   }

   func NewRegistry() (*Registry, error) // uses loadDorks()
   func NewRegistryFromDorks(ds []Dork) *Registry // for tests
   func (r *Registry) List() []Dork
   func (r *Registry) Get(id string) (Dork, bool)
   func (r *Registry) ListBySource(src string) []Dork
   func (r *Registry) ListByCategory(cat string) []Dork
   func (r *Registry) Stats() Stats // {Total int; BySource map[string]int; ByCategory map[string]int}
   ```
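
The index fields in the Registry struct suggest that lookups resolve through slice positions rather than copies of each Dork. The sketch below shows one possible way to build and query those indexes; the `pick` helper and the trimmed-down `Dork` type are assumptions made for this example, and a duplicate ID simply overwrites the earlier index entry here, which the real loader may prefer to reject:

```go
package main

import "fmt"

// Dork is reduced to the fields the indexes need.
type Dork struct{ ID, Source, Category, Query string }

type Stats struct {
	Total      int
	BySource   map[string]int
	ByCategory map[string]int
}

type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

// NewRegistryFromDorks copies ds and records each dork's position in the
// id, source, and category indexes.
func NewRegistryFromDorks(ds []Dork) *Registry {
	r := &Registry{
		dorks:      append([]Dork(nil), ds...),
		byID:       make(map[string]int, len(ds)),
		bySource:   map[string][]int{},
		byCategory: map[string][]int{},
	}
	for i, d := range r.dorks {
		r.byID[d.ID] = i
		r.bySource[d.Source] = append(r.bySource[d.Source], i)
		r.byCategory[d.Category] = append(r.byCategory[d.Category], i)
	}
	return r
}

// pick is a hypothetical helper that resolves index positions to dorks.
func (r *Registry) pick(idx []int) []Dork {
	out := make([]Dork, 0, len(idx))
	for _, i := range idx {
		out = append(out, r.dorks[i])
	}
	return out
}

func (r *Registry) List() []Dork { return append([]Dork(nil), r.dorks...) }

func (r *Registry) Get(id string) (Dork, bool) {
	i, ok := r.byID[id]
	if !ok {
		return Dork{}, false
	}
	return r.dorks[i], true
}

func (r *Registry) ListBySource(src string) []Dork   { return r.pick(r.bySource[src]) }
func (r *Registry) ListByCategory(cat string) []Dork { return r.pick(r.byCategory[cat]) }

func (r *Registry) Stats() Stats {
	s := Stats{Total: len(r.dorks), BySource: map[string]int{}, ByCategory: map[string]int{}}
	for k, v := range r.bySource {
		s.BySource[k] = len(v)
	}
	for k, v := range r.byCategory {
		s.ByCategory[k] = len(v)
	}
	return s
}

func main() {
	// Synthetic fixtures; the second entry is invented for the example.
	r := NewRegistryFromDorks([]Dork{
		{ID: "openai-github-envfile", Source: "github", Category: "frontier"},
		{ID: "example-shodan-dork", Source: "shodan", Category: "infrastructure"},
	})
	d, ok := r.Get("openai-github-envfile")
	fmt.Println(d.Source, ok)
	fmt.Println(len(r.ListBySource("github")), len(r.ListByCategory("frontier")))
	fmt.Printf("%+v\n", r.Stats())
}
```

Returning fresh slices from `List` and `pick` keeps callers from mutating the registry's backing array.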

4. Create pkg/dorks/executor.go (interface + source dispatcher, stubs only —
   GitHub real impl comes in Plan 08-05):
   ```go
   var ErrSourceNotImplemented = errors.New("dork source not yet implemented")
   var ErrMissingAuth = errors.New("dork source requires auth credentials")

   type Match struct {
       DorkID   string
       Source   string
       URL      string
       Snippet  string // content chunk to feed into engine detector
       Path     string // file path in repo, if applicable
   }

   type Executor interface {
       Source() string
       Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
   }

   type Runner struct {
       executors map[string]Executor
   }

   func NewRunner() *Runner { return &Runner{executors: map[string]Executor{}} }
   func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }
   func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
       ex, ok := r.executors[d.Source]
       if !ok { return nil, fmt.Errorf("%w: %s (coming Phase 9-16)", ErrSourceNotImplemented, d.Source) }
       return ex.Execute(ctx, d, limit)
   }
   ```
   No real executors are registered here — Plan 08-05 wires the GitHub executor via
   a separate constructor (NewRunnerWithGitHub or similar).

5. Create pkg/dorks/registry_test.go with the behavior cases listed above.
   Use NewRegistryFromDorks for synthetic fixtures — do NOT touch the real
   embedded FS (downstream plans populate it). One test MAY call NewRegistry()
   and only assert err is nil or "definitions directory empty" — acceptable
   either way pre-YAML.

6. Create placeholder files to make go:embed succeed with empty tree:
   - pkg/dorks/definitions/.gitkeep (empty)
   - dorks/.gitkeep (empty)

   IMPORTANT: every go:embed pattern must match at least one file, and a
   directory pattern skips files whose names begin with "." or "_". With only
   .gitkeep present, `//go:embed definitions` therefore fails to build. In that
   case switch the directive to `//go:embed definitions/*` (an explicit glob
   does match .gitkeep) and handle the empty case by returning nil dorks (no
   error) when WalkDir sees only .gitkeep. Test must pass with zero real YAML
   present.
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/dorks/... -v`

Done: pkg/dorks builds, all registry + executor tests pass, loader tolerates empty definitions tree, ErrSourceNotImplemented returned for unknown source.

Task 2: custom_dorks storage table + CRUD

Files: pkg/storage/schema.sql, pkg/storage/custom_dorks.go, pkg/storage/custom_dorks_test.go

Behavior:
- Test: SaveCustomDork inserts a row and returns an auto-increment ID
- Test: ListCustomDorks returns all saved custom dorks newest first
- Test: GetCustomDork(id) returns the dork or sql.ErrNoRows
- Test: DeleteCustomDork(id) removes it; subsequent Get returns ErrNoRows
- Test: schema migration is idempotent. Opening :memory: twice creates a new DB on each call, so instead verify the CREATE TABLE IF NOT EXISTS form by re-executing the schema on the same *sql.DB

Action:

1. Append to pkg/storage/schema.sql:
   ```sql
   CREATE TABLE IF NOT EXISTS custom_dorks (
       id          INTEGER PRIMARY KEY AUTOINCREMENT,
       dork_id     TEXT NOT NULL UNIQUE,
       name        TEXT NOT NULL,
       source      TEXT NOT NULL,
       category    TEXT NOT NULL,
       query       TEXT NOT NULL,
       description TEXT,
       tags        TEXT, -- JSON array
       created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
   );
   CREATE INDEX IF NOT EXISTS idx_custom_dorks_source ON custom_dorks(source);
   CREATE INDEX IF NOT EXISTS idx_custom_dorks_category ON custom_dorks(category);
   ```

2. Create pkg/storage/custom_dorks.go:
   ```go
   type CustomDork struct {
       ID          int64
       DorkID      string
       Name        string
       Source      string
       Category    string
       Query       string
       Description string
       Tags        []string
       CreatedAt   time.Time
   }

   func (db *DB) SaveCustomDork(d CustomDork) (int64, error)
   func (db *DB) ListCustomDorks() ([]CustomDork, error)
   func (db *DB) GetCustomDork(id int64) (CustomDork, error) // returns sql.ErrNoRows if missing
   func (db *DB) GetCustomDorkByDorkID(dorkID string) (CustomDork, error)
   func (db *DB) DeleteCustomDork(id int64) (int64, error)
   ```
   Tags are round-tripped via encoding/json (TEXT column). dork_id is UNIQUE,
   so a user cannot create duplicate custom IDs.

3. Create pkg/storage/custom_dorks_test.go covering the behavior cases above.
   Use storage.Open(":memory:") as the existing storage tests do.
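
The tags round-trip from step 2 can be sketched without a database, since it comes down to two small conversions around the TEXT column. The helper names `encodeTags` and `decodeTags` are hypothetical, and storing nil tags as `[]` rather than SQL NULL is an assumption of this example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeTags serializes tags for the tags TEXT column. Nil is written as
// "[]" (an assumption) so the stored value is always a JSON array.
func encodeTags(tags []string) (string, error) {
	if tags == nil {
		tags = []string{}
	}
	b, err := json.Marshal(tags)
	if err != nil {
		return "", fmt.Errorf("encode tags: %w", err)
	}
	return string(b), nil
}

// decodeTags parses the column value back into a slice. An empty string,
// for example a NULL column read through sql.NullString, means no tags.
func decodeTags(col string) ([]string, error) {
	if col == "" {
		return nil, nil
	}
	var tags []string
	if err := json.Unmarshal([]byte(col), &tags); err != nil {
		return nil, fmt.Errorf("decode tags: %w", err)
	}
	return tags, nil
}

func main() {
	stored, err := encodeTags([]string{"openai", "env"})
	fmt.Println(stored, err)

	tags, err := decodeTags(stored)
	fmt.Println(tags, err)
}
```

Since the schema leaves `description` and `tags` nullable, scanning those columns into `sql.NullString` before calling a helper like `decodeTags` avoids a scan error on NULL.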
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/storage/... -run CustomDork -v`

Done: custom_dorks table created on Open(), CRUD round-trip tests pass, no regressions in the existing storage test suite.

Verification:
- `go build ./...` succeeds
- `go test ./pkg/dorks/... ./pkg/storage/...` passes
- `grep -r "//go:embed" pkg/dorks/` shows the definitions embed directive

<success_criteria>

- pkg/dorks.NewRegistry() compiles and runs (zero or more embedded dorks)
- Executor interface + ErrSourceNotImplemented in place for Plan 08-05 and 08-06
- custom_dorks CRUD functional; downstream `dorks add` / `dorks delete` commands have a storage backend to call

</success_criteria>

After completion, create `.planning/phases/08-dork-engine/08-01-SUMMARY.md`