---
phase: 08-dork-engine
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/dorks/schema.go
- pkg/dorks/loader.go
- pkg/dorks/registry.go
- pkg/dorks/executor.go
- pkg/dorks/registry_test.go
- pkg/dorks/definitions/.gitkeep
- dorks/.gitkeep
- pkg/storage/schema.sql
- pkg/storage/custom_dorks.go
- pkg/storage/custom_dorks_test.go
autonomous: true
requirements:
- DORK-01
- DORK-03
must_haves:
  truths:
    - "pkg/dorks.NewRegistry() loads embedded YAML files without error"
    - "Registry.List(), Get(id), Stats(), ListBySource(), ListByCategory() return correct data"
    - "ExecuteDork interface defined and per-source Executor map exists (all stubbed except placeholder)"
    - "custom_dorks table exists and SaveCustomDork/ListCustomDorks/DeleteCustomDork work round-trip"
  artifacts:
    - path: "pkg/dorks/schema.go"
      provides: "Dork struct matching 08-CONTEXT YAML schema"
      contains: "type Dork struct"
    - path: "pkg/dorks/loader.go"
      provides: "go:embed loader mirroring pkg/providers/loader.go"
      contains: "//go:embed definitions"
    - path: "pkg/dorks/registry.go"
      provides: "Registry with List/Get/Stats/ListBySource/ListByCategory"
      contains: "func NewRegistry"
    - path: "pkg/dorks/executor.go"
      provides: "Executor interface + source dispatch + ErrSourceNotImplemented"
      contains: "type Executor interface"
    - path: "pkg/storage/custom_dorks.go"
      provides: "SaveCustomDork/ListCustomDorks/DeleteCustomDork/GetCustomDork"
      contains: "custom_dorks"
  key_links:
    - from: "pkg/dorks/loader.go"
      to: "pkg/dorks/definitions/*/*.yaml"
      via: "go:embed"
      pattern: "embed.FS"
    - from: "pkg/storage/schema.sql"
      to: "custom_dorks table"
      via: "CREATE TABLE"
      pattern: "CREATE TABLE IF NOT EXISTS custom_dorks"
---
Foundation of the dork engine: schema, go:embed loader, registry, executor interface,
and storage table for user-added custom dorks. Mirrors the proven pkg/providers pattern
from Phase 1 so downstream plans can drop 150+ YAML files into pkg/dorks/definitions/{source}/
and have them immediately load at startup.
Purpose: Unblock parallel Wave 2 plans (50-dork YAML batches and GitHub live executor).
Output: pkg/dorks package with passing tests + custom_dorks table migration.
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/08-dork-engine/08-CONTEXT.md
@pkg/providers/loader.go
@pkg/providers/registry.go
@pkg/storage/db.go
@pkg/storage/schema.sql
From pkg/providers/loader.go:
```go
//go:embed definitions/*.yaml
var definitionsFS embed.FS
func loadProviders() ([]Provider, error) {
	fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error { ... })
}
```
From pkg/providers/registry.go:
```go
type Registry struct { providers []Provider; index map[string]int; ... }
func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats
```
From pkg/storage/db.go:
```go
type DB struct { sql *sql.DB }
func (db *DB) SQL() *sql.DB
```
Task 1: Dork schema, go:embed loader, registry, executor interface
pkg/dorks/schema.go,
pkg/dorks/loader.go,
pkg/dorks/registry.go,
pkg/dorks/executor.go,
pkg/dorks/registry_test.go,
pkg/dorks/definitions/.gitkeep,
dorks/.gitkeep
- Test: registry with two synthetic YAMLs under definitions/ loads 2 dorks
- Test: Registry.Get("openai-github-envfile") returns the correct Dork
- Test: Registry.ListBySource("github") returns only github dorks
- Test: Registry.ListByCategory("frontier") returns only frontier dorks
- Test: Registry.Stats() returns ByCategory + BySource counts
- Test: Runner.Run with a dork whose source is "shodan" (no executor registered) returns ErrSourceNotImplemented
- Test: Dork.Validate() rejects empty id/source/query
1. Create pkg/dorks/schema.go:
```go
package dorks

type Dork struct {
	ID          string   `yaml:"id"`
	Name        string   `yaml:"name"`
	Source      string   `yaml:"source"`   // github|google|shodan|censys|zoomeye|fofa|gitlab|bing
	Category    string   `yaml:"category"` // frontier|specialized|infrastructure|emerging|enterprise
	Query       string   `yaml:"query"`
	Description string   `yaml:"description"`
	Tags        []string `yaml:"tags"`
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

func (d Dork) Validate() error { /* non-empty id/source/query + source in ValidSources */ }
```
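The Validate() stub above could be filled in along these lines. This is a minimal sketch, not the final implementation: the `ErrInvalidDork` sentinel, the `validSource` helper, and the sample query string are assumptions made for illustration, and the YAML tags are dropped so the block stays stdlib-only.

```go
package main

import (
	"errors"
	"fmt"
)

// Dork mirrors the plan's struct; YAML tags are omitted in this sketch.
type Dork struct {
	ID, Name, Source, Category, Query, Description string
	Tags                                           []string
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

// ErrInvalidDork is a hypothetical sentinel so callers can use errors.Is.
var ErrInvalidDork = errors.New("invalid dork")

// validSource reports whether s is one of ValidSources.
func validSource(s string) bool {
	for _, v := range ValidSources {
		if v == s {
			return true
		}
	}
	return false
}

// Validate rejects an empty id, source, or query, and any source that is
// not listed in ValidSources.
func (d Dork) Validate() error {
	switch {
	case d.ID == "":
		return fmt.Errorf("%w: id is required", ErrInvalidDork)
	case d.Source == "":
		return fmt.Errorf("%w: %s: source is required", ErrInvalidDork, d.ID)
	case d.Query == "":
		return fmt.Errorf("%w: %s: query is required", ErrInvalidDork, d.ID)
	case !validSource(d.Source):
		return fmt.Errorf("%w: %s: unknown source %q", ErrInvalidDork, d.ID, d.Source)
	}
	return nil
}

func main() {
	// The query string is illustrative only.
	d := Dork{ID: "openai-github-envfile", Source: "github", Query: "filename:.env OPENAI_API_KEY"}
	fmt.Println(d.Validate())

	d.Source = "yahoo" // not in ValidSources
	fmt.Println(d.Validate())
}
```

Wrapping a sentinel keeps the loader free to add the file path on top with another `%w`, as step 2 requires.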
2. Create pkg/dorks/loader.go mirroring pkg/providers/loader.go:
```go
//go:embed definitions
var definitionsFS embed.FS
func loadDorks() ([]Dork, error) {
	// fs.WalkDir on "definitions", descend into {source}/ subdirs, parse *.yaml
}
```
Walk pattern: definitions/github/*.yaml, definitions/google/*.yaml, etc.
Every file decoded via yaml.Unmarshal into Dork. Call Validate() per file; wrap
errors with file path. Return combined slice.
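The walk itself might look like the sketch below. It only collects the `*.yaml` paths; decoding each file with yaml.Unmarshal and calling Validate() is left out so the example needs nothing beyond the standard library. `yamlPaths` and `treeFS` are hypothetical names, and `fstest.MapFS` stands in for the embedded `definitionsFS`.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// yamlPaths walks root inside fsys and returns every *.yaml file path,
// including files nested in per-source subdirectories. Placeholders such
// as .gitkeep are skipped, so an empty tree yields no paths and no error.
func yamlPaths(fsys fs.FS, root string) ([]string, error) {
	var paths []string
	err := fs.WalkDir(fsys, root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return fmt.Errorf("walk %s: %w", p, err)
		}
		if d.IsDir() || path.Ext(p) != ".yaml" {
			return nil
		}
		paths = append(paths, p)
		return nil
	})
	return paths, err
}

// treeFS builds an in-memory file system containing empty files with the
// given names. It exists only to drive this example.
func treeFS(names ...string) fstest.MapFS {
	m := fstest.MapFS{}
	for _, n := range names {
		m[n] = &fstest.MapFile{}
	}
	return m
}

func main() {
	fsys := treeFS(
		"definitions/.gitkeep",
		"definitions/github/openai.yaml",
		"definitions/google/example.yaml",
	)
	paths, err := yamlPaths(fsys, "definitions")
	fmt.Println(paths, err)

	// Only the placeholder present: nothing to load, and no error.
	empty, err := yamlPaths(treeFS("definitions/.gitkeep"), "definitions")
	fmt.Println(len(empty), err)
}
```

Taking an `fs.FS` rather than reading the package-level variable directly is a design choice of this sketch; it lets tests exercise the walk without touching the real embedded tree.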
3. Create pkg/dorks/registry.go:
```go
type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

func NewRegistry() (*Registry, error)          // uses loadDorks()
func NewRegistryFromDorks(ds []Dork) *Registry // for tests

func (r *Registry) List() []Dork
func (r *Registry) Get(id string) (Dork, bool)
func (r *Registry) ListBySource(src string) []Dork
func (r *Registry) ListByCategory(cat string) []Dork
func (r *Registry) Stats() Stats // {Total int; BySource map[string]int; ByCategory map[string]int}
```
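One way the three index maps could be built and queried is sketched below. The `Dork` struct is trimmed to the fields the indexes need, the `pick` helper and the "example-shodan" fixture are hypothetical, and letting a later duplicate ID overwrite an earlier one is an assumption of this sketch rather than something the plan specifies.

```go
package main

import "fmt"

// Dork is trimmed to the fields the indexes need.
type Dork struct{ ID, Source, Category, Query string }

type Stats struct {
	Total      int
	BySource   map[string]int
	ByCategory map[string]int
}

type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

// NewRegistryFromDorks indexes ds by id, source, and category. Each index
// stores positions into r.dorks rather than copies of the structs.
func NewRegistryFromDorks(ds []Dork) *Registry {
	r := &Registry{
		dorks:      ds,
		byID:       make(map[string]int, len(ds)),
		bySource:   map[string][]int{},
		byCategory: map[string][]int{},
	}
	for i, d := range ds {
		r.byID[d.ID] = i // assumption: a later duplicate id wins
		r.bySource[d.Source] = append(r.bySource[d.Source], i)
		r.byCategory[d.Category] = append(r.byCategory[d.Category], i)
	}
	return r
}

func (r *Registry) Get(id string) (Dork, bool) {
	i, ok := r.byID[id]
	if !ok {
		return Dork{}, false
	}
	return r.dorks[i], true
}

// pick resolves a list of positions into the dorks they point at.
func (r *Registry) pick(idx []int) []Dork {
	out := make([]Dork, 0, len(idx))
	for _, i := range idx {
		out = append(out, r.dorks[i])
	}
	return out
}

func (r *Registry) ListBySource(src string) []Dork   { return r.pick(r.bySource[src]) }
func (r *Registry) ListByCategory(cat string) []Dork { return r.pick(r.byCategory[cat]) }

func (r *Registry) Stats() Stats {
	s := Stats{Total: len(r.dorks), BySource: map[string]int{}, ByCategory: map[string]int{}}
	for src, idx := range r.bySource {
		s.BySource[src] = len(idx)
	}
	for cat, idx := range r.byCategory {
		s.ByCategory[cat] = len(idx)
	}
	return s
}

func main() {
	r := NewRegistryFromDorks([]Dork{
		{ID: "openai-github-envfile", Source: "github", Category: "frontier"},
		{ID: "example-shodan", Source: "shodan", Category: "infrastructure"},
	})
	d, ok := r.Get("openai-github-envfile")
	fmt.Println(d.Source, ok, len(r.ListBySource("github")), r.Stats().Total)
}
```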
4. Create pkg/dorks/executor.go (interface + source dispatcher, stubs only —
GitHub real impl comes in Plan 08-05):
```go
var ErrSourceNotImplemented = errors.New("dork source not yet implemented")
var ErrMissingAuth = errors.New("dork source requires auth credentials")

type Match struct {
	DorkID  string
	Source  string
	URL     string
	Snippet string // content chunk to feed into engine detector
	Path    string // file path in repo, if applicable
}

type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

type Runner struct {
	executors map[string]Executor
}

func NewRunner() *Runner { return &Runner{executors: map[string]Executor{}} }

func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }

func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
	ex, ok := r.executors[d.Source]
	if !ok {
		return nil, fmt.Errorf("%w: %s (coming Phase 9-16)", ErrSourceNotImplemented, d.Source)
	}
	return ex.Execute(ctx, d, limit)
}
```
No real executors are registered here — Plan 08-05 wires the GitHub executor via
a separate constructor (NewRunnerWithGitHub or similar).
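Until a real executor exists, the dispatch path can be exercised with a test double. The sketch below repeats the Runner from step 4 with `Dork` and `Match` trimmed down; `fakeExecutor`, `tryRun`, and the `example.invalid` URL are hypothetical and exist only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Dork and Match are trimmed versions of the plan's types.
type Dork struct{ ID, Source, Query string }
type Match struct{ DorkID, Source, URL string }

var ErrSourceNotImplemented = errors.New("dork source not yet implemented")

type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

type Runner struct{ executors map[string]Executor }

func NewRunner() *Runner              { return &Runner{executors: map[string]Executor{}} }
func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }

func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
	ex, ok := r.executors[d.Source]
	if !ok {
		return nil, fmt.Errorf("%w: %s", ErrSourceNotImplemented, d.Source)
	}
	return ex.Execute(ctx, d, limit)
}

// fakeExecutor is a hypothetical test double that returns one canned match.
type fakeExecutor struct{ source string }

func (f fakeExecutor) Source() string { return f.source }

func (f fakeExecutor) Execute(ctx context.Context, d Dork, limit int) ([]Match, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // honor cancellation before doing any work
	}
	if limit < 1 {
		return nil, nil
	}
	return []Match{{DorkID: d.ID, Source: f.source, URL: "https://example.invalid/hit"}}, nil
}

// tryRun runs one dork of the given source against a Runner that knows only
// the fake "github" executor. It reports the number of matches and whether
// the error wraps ErrSourceNotImplemented.
func tryRun(source string) (int, bool) {
	r := NewRunner()
	r.Register(fakeExecutor{source: "github"})
	ms, err := r.Run(context.Background(), Dork{ID: "demo", Source: source, Query: "q"}, 10)
	return len(ms), errors.Is(err, ErrSourceNotImplemented)
}

func main() {
	fmt.Println(tryRun("github")) // prints "1 false"
	fmt.Println(tryRun("shodan")) // prints "0 true"
}
```

Because Run wraps the sentinel with `%w`, callers should test with errors.Is rather than comparing the error directly.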
5. Create pkg/dorks/registry_test.go with the behavior cases listed above.
Use NewRegistryFromDorks for synthetic fixtures — do NOT touch the real
embedded FS (downstream plans populate it). One test MAY call NewRegistry()
and only assert err is nil or "definitions directory empty" — acceptable
either way pre-YAML.
6. Create placeholder files to make go:embed succeed with empty tree:
- pkg/dorks/definitions/.gitkeep (empty)
- dorks/.gitkeep (empty)
IMPORTANT: go:embed requires at least one matching file, and a bare directory
pattern skips files whose names begin with `.` or `_`, so .gitkeep alone does
not count. If `//go:embed definitions` fails when only .gitkeep exists, switch
the directive to `//go:embed definitions/*` (the wildcard does match .gitkeep)
and handle the empty case by returning nil dorks (no error) when WalkDir sees
only .gitkeep. Test must pass with zero real YAML present.
cd /home/salva/Documents/apikey && go test ./pkg/dorks/... -v
pkg/dorks builds, all registry + executor tests pass, loader tolerates empty
definitions tree, ErrSourceNotImplemented returned for any source without a
registered executor.
Task 2: custom_dorks storage table + CRUD
pkg/storage/schema.sql,
pkg/storage/custom_dorks.go,
pkg/storage/custom_dorks_test.go
- Test: SaveCustomDork inserts a row and returns an auto-increment ID
- Test: ListCustomDorks returns all saved custom dorks newest first
- Test: GetCustomDork(id) returns the dork or sql.ErrNoRows
- Test: DeleteCustomDork(id) removes it; subsequent Get returns ErrNoRows
- Test: schema migration is idempotent. Calling Open(":memory:") twice creates a fresh database each time and proves nothing, so re-execute the schema on the same *sql.DB and assert that the CREATE TABLE IF NOT EXISTS statements return no error.
1. Append to pkg/storage/schema.sql:
```sql
CREATE TABLE IF NOT EXISTS custom_dorks (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    dork_id     TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL,
    source      TEXT NOT NULL,
    category    TEXT NOT NULL,
    query       TEXT NOT NULL,
    description TEXT,
    tags        TEXT, -- JSON array
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_custom_dorks_source ON custom_dorks(source);
CREATE INDEX IF NOT EXISTS idx_custom_dorks_category ON custom_dorks(category);
```
2. Create pkg/storage/custom_dorks.go:
```go
type CustomDork struct {
	ID          int64
	DorkID      string
	Name        string
	Source      string
	Category    string
	Query       string
	Description string
	Tags        []string
	CreatedAt   time.Time
}

func (db *DB) SaveCustomDork(d CustomDork) (int64, error)
func (db *DB) ListCustomDorks() ([]CustomDork, error)
func (db *DB) GetCustomDork(id int64) (CustomDork, error) // returns sql.ErrNoRows if missing
func (db *DB) GetCustomDorkByDorkID(dorkID string) (CustomDork, error)
func (db *DB) DeleteCustomDork(id int64) (int64, error)
```
Tags are round-tripped through encoding/json into the TEXT column. dork_id is
UNIQUE, so a user cannot create two custom dorks with the same ID.
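The tag round-trip could be handled by a pair of small helpers like the ones below. This is a sketch under assumptions: `encodeTags` and `decodeTags` are hypothetical names, storing a nil slice as `[]` instead of `null` is a choice made here, and the SQLite calls are omitted so the block runs on the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeTags serializes tags for the TEXT column. A nil slice is stored as
// "[]" rather than "null" so the column always holds a JSON array.
func encodeTags(tags []string) (string, error) {
	if tags == nil {
		tags = []string{}
	}
	b, err := json.Marshal(tags)
	if err != nil {
		return "", fmt.Errorf("encode tags: %w", err)
	}
	return string(b), nil
}

// decodeTags parses a column value back into a slice. An empty string,
// which is how this sketch represents a NULL column, decodes to nil.
func decodeTags(raw string) ([]string, error) {
	if raw == "" {
		return nil, nil
	}
	var tags []string
	if err := json.Unmarshal([]byte(raw), &tags); err != nil {
		return nil, fmt.Errorf("decode tags %q: %w", raw, err)
	}
	return tags, nil
}

func main() {
	raw, err := encodeTags([]string{"openai", "env"})
	fmt.Println(raw, err)

	tags, err := decodeTags(raw)
	fmt.Println(tags, err)
}
```

Since the `tags` column is nullable, scanning it straight into a Go `string` fails on NULL rows; reading it through `sql.NullString` and passing its `String` field to the decoder is one way around that.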
3. Create pkg/storage/custom_dorks_test.go covering the behavior cases above.
Use storage.Open(":memory:") as the existing storage tests do.
cd /home/salva/Documents/apikey && go test ./pkg/storage/... -run CustomDork -v
custom_dorks table created on Open(), CRUD round-trip tests pass, no
regressions in the existing storage test suite.
- `go build ./...` succeeds
- `go test ./pkg/dorks/... ./pkg/storage/...` passes
- `grep -r "//go:embed" pkg/dorks/` shows the definitions embed directive
- pkg/dorks.NewRegistry() compiles and runs (zero or more embedded dorks)
- Executor interface + ErrSourceNotImplemented in place for Plan 08-05 and 08-06
- custom_dorks CRUD functional; downstream `dorks add`/`dorks delete` commands have
a storage backend to call