---
phase: 08-dork-engine
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - pkg/dorks/schema.go
  - pkg/dorks/loader.go
  - pkg/dorks/registry.go
  - pkg/dorks/executor.go
  - pkg/dorks/registry_test.go
  - pkg/dorks/definitions/.gitkeep
  - dorks/.gitkeep
  - pkg/storage/schema.sql
  - pkg/storage/custom_dorks.go
  - pkg/storage/custom_dorks_test.go
autonomous: true
requirements:
  - DORK-01
  - DORK-03
must_haves:
  truths:
    - "pkg/dorks.NewRegistry() loads embedded YAML files without error"
    - "Registry.List(), Get(id), Stats(), ListBySource(), ListByCategory() return correct data"
    - "ExecuteDork interface defined and per-source Executor map exists (all stubbed except placeholder)"
    - "custom_dorks table exists and SaveCustomDork/ListCustomDorks/DeleteCustomDork work round-trip"
  artifacts:
    - path: "pkg/dorks/schema.go"
      provides: "Dork struct matching 08-CONTEXT YAML schema"
      contains: "type Dork struct"
    - path: "pkg/dorks/loader.go"
      provides: "go:embed loader mirroring pkg/providers/loader.go"
      contains: "//go:embed definitions"
    - path: "pkg/dorks/registry.go"
      provides: "Registry with List/Get/Stats/ListBySource/ListByCategory"
      contains: "func NewRegistry"
    - path: "pkg/dorks/executor.go"
      provides: "Executor interface + source dispatch + ErrSourceNotImplemented"
      contains: "type Executor interface"
    - path: "pkg/storage/custom_dorks.go"
      provides: "SaveCustomDork/ListCustomDorks/DeleteCustomDork/GetCustomDork"
      contains: "custom_dorks"
  key_links:
    - from: "pkg/dorks/loader.go"
      to: "pkg/dorks/definitions/*/*.yaml"
      via: "go:embed"
      pattern: "embed.FS"
    - from: "pkg/storage/schema.sql"
      to: "custom_dorks table"
      via: "CREATE TABLE"
      pattern: "CREATE TABLE IF NOT EXISTS custom_dorks"
---

<objective>
Foundation of the dork engine: schema, go:embed loader, registry, executor interface,
and storage table for user-added custom dorks. Mirrors the proven pkg/providers pattern
from Phase 1 so downstream plans can drop 150+ YAML files into pkg/dorks/definitions/{source}/
and have them immediately load at startup.

Purpose: Unblock parallel Wave 2 plans (50-dork YAML batches and GitHub live executor).
Output: pkg/dorks package with passing tests + custom_dorks table migration.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/08-dork-engine/08-CONTEXT.md
@pkg/providers/loader.go
@pkg/providers/registry.go
@pkg/storage/db.go
@pkg/storage/schema.sql

<interfaces>
<!-- Mirror from pkg/providers: use the exact same go:embed / Registry pattern. -->

From pkg/providers/loader.go:
```go
//go:embed definitions/*.yaml
var definitionsFS embed.FS

func loadProviders() ([]Provider, error) {
	fs.WalkDir(definitionsFS, "definitions", func(path string, d fs.DirEntry, err error) error { ... })
}
```

From pkg/providers/registry.go:
```go
type Registry struct { providers []Provider; index map[string]int; ... }
func NewRegistry() (*Registry, error)
func (r *Registry) List() []Provider
func (r *Registry) Get(name string) (Provider, bool)
func (r *Registry) Stats() RegistryStats
```

From pkg/storage/db.go:
```go
type DB struct { sql *sql.DB }
func (db *DB) SQL() *sql.DB
```
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Dork schema, go:embed loader, registry, executor interface</name>
<files>
pkg/dorks/schema.go,
pkg/dorks/loader.go,
pkg/dorks/registry.go,
pkg/dorks/executor.go,
pkg/dorks/registry_test.go,
pkg/dorks/definitions/.gitkeep,
dorks/.gitkeep
</files>
<behavior>
- Test: registry with two synthetic YAMLs under definitions/ loads 2 dorks
- Test: Registry.Get("openai-github-envfile") returns the correct Dork
- Test: Registry.ListBySource("github") returns only github dorks
- Test: Registry.ListByCategory("frontier") returns only frontier dorks
- Test: Registry.Stats() returns ByCategory + BySource counts
- Test: Runner.Run for a dork with source "shodan" (no executor registered) returns ErrSourceNotImplemented
- Test: Dork.Validate() rejects empty id/source/query
</behavior>
<action>
1. Create pkg/dorks/schema.go:
```go
package dorks

type Dork struct {
	ID          string   `yaml:"id"`
	Name        string   `yaml:"name"`
	Source      string   `yaml:"source"`   // github|google|shodan|censys|zoomeye|fofa|gitlab|bing
	Category    string   `yaml:"category"` // frontier|specialized|infrastructure|emerging|enterprise
	Query       string   `yaml:"query"`
	Description string   `yaml:"description"`
	Tags        []string `yaml:"tags"`
}

var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

func (d Dork) Validate() error { /* non-empty id/source/query + source in ValidSources */ }
```
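
The `Validate()` stub above only names its checks in a comment. One possible shape for it is sketched below, as a standalone program with a trimmed-down `Dork`; the error wording and the example query string are assumptions made for illustration, not part of the plan.

```go
package main

import (
	"errors"
	"fmt"
)

// Dork carries only the fields Validate inspects; the full struct is in schema.go.
type Dork struct {
	ID     string
	Source string
	Query  string
}

// ValidSources lists the sources Validate accepts.
var ValidSources = []string{"github", "google", "shodan", "censys", "zoomeye", "fofa", "gitlab", "bing"}

// Validate returns the first problem found: an empty id, source, or query,
// or a source that does not appear in ValidSources.
// The error messages are illustrative assumptions.
func (d Dork) Validate() error {
	switch {
	case d.ID == "":
		return errors.New("dork: id is required")
	case d.Source == "":
		return fmt.Errorf("dork %q: source is required", d.ID)
	case d.Query == "":
		return fmt.Errorf("dork %q: query is required", d.ID)
	}
	for _, s := range ValidSources {
		if s == d.Source {
			return nil
		}
	}
	return fmt.Errorf("dork %q: unknown source %q", d.ID, d.Source)
}

func main() {
	// The query text here is a made-up example.
	ok := Dork{ID: "openai-github-envfile", Source: "github", Query: "filename:.env OPENAI_API_KEY"}
	fmt.Println(ok.Validate()) // <nil>

	bad := Dork{ID: "x", Source: "pastebin", Query: "q"}
	fmt.Println(bad.Validate() != nil) // true
}
```

A linear scan over `ValidSources` is enough at eight entries; a set would only matter if the list grew substantially.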

2. Create pkg/dorks/loader.go mirroring pkg/providers/loader.go:
```go
//go:embed definitions
var definitionsFS embed.FS

func loadDorks() ([]Dork, error) {
	// fs.WalkDir on "definitions", descend into {source}/ subdirs, parse *.yaml
}
```
Walk pattern: definitions/github/*.yaml, definitions/google/*.yaml, etc.
Decode every file via yaml.Unmarshal into a Dork, call Validate() on each, and wrap
any error with the file path. Return the combined slice.

3. Create pkg/dorks/registry.go:
```go
type Registry struct {
	dorks      []Dork
	byID       map[string]int
	bySource   map[string][]int
	byCategory map[string][]int
}

func NewRegistry() (*Registry, error)           // uses loadDorks()
func NewRegistryFromDorks(ds []Dork) *Registry  // for tests
func (r *Registry) List() []Dork
func (r *Registry) Get(id string) (Dork, bool)
func (r *Registry) ListBySource(src string) []Dork
func (r *Registry) ListByCategory(cat string) []Dork
func (r *Registry) Stats() Stats // {Total int; BySource map[string]int; ByCategory map[string]int}
```
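
The three index maps in `Registry` can be filled in a single pass over the slice. The sketch below shows one way `NewRegistryFromDorks`, `Get`, and `ListBySource` might fit together. It uses a reduced `Dork`, and it assumes that a later dork silently overwrites an earlier one with the same ID; the plan does not specify duplicate handling. The second fixture ID is invented for the example.

```go
package main

import "fmt"

// Dork is reduced to the fields the indexes use.
type Dork struct {
	ID, Source, Category string
}

type Registry struct {
	dorks      []Dork
	byID       map[string]int   // dork ID -> position in dorks
	bySource   map[string][]int // source -> positions in dorks
	byCategory map[string][]int // category -> positions in dorks
}

// NewRegistryFromDorks copies ds and builds all three indexes in one pass.
// Assumption: on a duplicate ID the later entry wins in byID.
func NewRegistryFromDorks(ds []Dork) *Registry {
	r := &Registry{
		dorks:      append([]Dork(nil), ds...),
		byID:       make(map[string]int, len(ds)),
		bySource:   map[string][]int{},
		byCategory: map[string][]int{},
	}
	for i, d := range r.dorks {
		r.byID[d.ID] = i
		r.bySource[d.Source] = append(r.bySource[d.Source], i)
		r.byCategory[d.Category] = append(r.byCategory[d.Category], i)
	}
	return r
}

// Get looks a dork up by ID.
func (r *Registry) Get(id string) (Dork, bool) {
	i, ok := r.byID[id]
	if !ok {
		return Dork{}, false
	}
	return r.dorks[i], true
}

// ListBySource returns copies of every dork registered for src.
func (r *Registry) ListBySource(src string) []Dork {
	idxs := r.bySource[src]
	out := make([]Dork, 0, len(idxs))
	for _, i := range idxs {
		out = append(out, r.dorks[i])
	}
	return out
}

func main() {
	r := NewRegistryFromDorks([]Dork{
		{ID: "openai-github-envfile", Source: "github", Category: "frontier"},
		{ID: "example-shodan-dork", Source: "shodan", Category: "infrastructure"}, // hypothetical fixture
	})
	d, ok := r.Get("openai-github-envfile")
	fmt.Println(d.Source, ok)                  // github true
	fmt.Println(len(r.ListBySource("github"))) // 1
}
```

Storing positions rather than copies keeps `dorks` as the single source of truth, and `ListByCategory` would follow the same shape as `ListBySource`.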

4. Create pkg/dorks/executor.go (interface + source dispatcher, stubs only;
the real GitHub implementation comes in Plan 08-05):
```go
package dorks

import (
	"context"
	"errors"
	"fmt"
)

var ErrSourceNotImplemented = errors.New("dork source not yet implemented")
var ErrMissingAuth = errors.New("dork source requires auth credentials")

type Match struct {
	DorkID  string
	Source  string
	URL     string
	Snippet string // content chunk to feed into engine detector
	Path    string // file path in repo, if applicable
}

type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

type Runner struct {
	executors map[string]Executor
}

func NewRunner() *Runner { return &Runner{executors: map[string]Executor{}} }
func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }
func (r *Runner) Run(ctx context.Context, d Dork, limit int) ([]Match, error) {
	ex, ok := r.executors[d.Source]
	if !ok {
		return nil, fmt.Errorf("%w: %s (coming Phase 9-16)", ErrSourceNotImplemented, d.Source)
	}
	return ex.Execute(ctx, d, limit)
}
```
No real executors are registered here. Plan 08-05 wires the GitHub executor via
a separate constructor (NewRunnerWithGitHub or similar).

5. Create pkg/dorks/registry_test.go with the behavior cases listed above.
Use NewRegistryFromDorks for synthetic fixtures; do NOT touch the real
embedded FS (downstream plans populate it). One test MAY call NewRegistry()
and assert only that err is nil or is the "definitions directory empty" error;
either outcome is acceptable before any YAML exists.

6. Create placeholder files to make go:embed succeed with an empty tree:
- pkg/dorks/definitions/.gitkeep (empty)
- dorks/.gitkeep (empty)

IMPORTANT: go:embed requires at least one embeddable file per pattern, and a
bare directory pattern skips files whose names begin with `.` or `_`. If
`//go:embed definitions` therefore fails to build when only .gitkeep exists,
switch the directive to `//go:embed definitions/*` and handle the empty case by
returning nil dorks (no error) when WalkDir sees only .gitkeep. The test must
pass with zero real YAML present.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/dorks/... -v</automated>
</verify>
<done>
pkg/dorks builds, all registry + executor tests pass, loader tolerates empty
definitions tree, ErrSourceNotImplemented returned for unknown source.
</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: custom_dorks storage table + CRUD</name>
<files>
pkg/storage/schema.sql,
pkg/storage/custom_dorks.go,
pkg/storage/custom_dorks_test.go
</files>
<behavior>
- Test: SaveCustomDork inserts a row and returns an auto-increment ID
- Test: ListCustomDorks returns all saved custom dorks newest first
- Test: GetCustomDork(id) returns the dork or sql.ErrNoRows
- Test: DeleteCustomDork(id) removes it; subsequent Get returns ErrNoRows
- Test: schema migration is idempotent. Calling Open twice on :memory: creates a new DB each time, so instead re-execute the schema on the same *sql.DB and verify that the CREATE TABLE IF NOT EXISTS statements succeed
</behavior>
<action>
1. Append to pkg/storage/schema.sql:
```sql
CREATE TABLE IF NOT EXISTS custom_dorks (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    dork_id     TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL,
    source      TEXT NOT NULL,
    category    TEXT NOT NULL,
    query       TEXT NOT NULL,
    description TEXT,
    tags        TEXT, -- JSON array
    created_at  DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX IF NOT EXISTS idx_custom_dorks_source ON custom_dorks(source);
CREATE INDEX IF NOT EXISTS idx_custom_dorks_category ON custom_dorks(category);
```

2. Create pkg/storage/custom_dorks.go:
```go
type CustomDork struct {
	ID          int64
	DorkID      string
	Name        string
	Source      string
	Category    string
	Query       string
	Description string
	Tags        []string
	CreatedAt   time.Time
}

func (db *DB) SaveCustomDork(d CustomDork) (int64, error)
func (db *DB) ListCustomDorks() ([]CustomDork, error)
func (db *DB) GetCustomDork(id int64) (CustomDork, error) // returns sql.ErrNoRows if missing
func (db *DB) GetCustomDorkByDorkID(dorkID string) (CustomDork, error)
func (db *DB) DeleteCustomDork(id int64) (int64, error)
```
Tags are round-tripped via encoding/json (TEXT column). dork_id is UNIQUE, so
a user cannot create duplicate custom IDs.
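
The JSON round trip for the `tags` column could be isolated in two small helpers, sketched below. The helper names `encodeTags` and `decodeTags` are hypothetical. Two further choices are assumptions rather than plan requirements: a nil slice is stored as `[]`, and an empty column value (for example a NULL read through `sql.NullString`) decodes to nil tags.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeTags serializes tags to the JSON array stored in the TEXT column.
// A nil slice is written as "[]" rather than "null" (an assumption of this sketch).
func encodeTags(tags []string) (string, error) {
	if tags == nil {
		tags = []string{}
	}
	b, err := json.Marshal(tags)
	if err != nil {
		return "", fmt.Errorf("encode tags: %w", err)
	}
	return string(b), nil
}

// decodeTags parses the stored JSON array. An empty string, such as the
// value read from a NULL column, decodes to nil tags without an error.
func decodeTags(s string) ([]string, error) {
	if s == "" {
		return nil, nil
	}
	var tags []string
	if err := json.Unmarshal([]byte(s), &tags); err != nil {
		return nil, fmt.Errorf("decode tags: %w", err)
	}
	return tags, nil
}

func main() {
	s, _ := encodeTags([]string{"openai", "env"})
	fmt.Println(s) // ["openai","env"]

	tags, err := decodeTags(s)
	fmt.Println(tags, err) // [openai env] <nil>
}
```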

3. Create pkg/storage/custom_dorks_test.go covering the behavior cases above.
Use storage.Open(":memory:") as the existing storage tests do.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/storage/... -run CustomDork -v</automated>
</verify>
<done>
custom_dorks table created on Open(), CRUD round-trip tests pass, no
regressions in the existing storage test suite.
</done>
</task>

</tasks>

<verification>
- `go build ./...` succeeds
- `go test ./pkg/dorks/... ./pkg/storage/...` passes
- `grep -r "//go:embed" pkg/dorks/` shows the definitions embed directive
</verification>

<success_criteria>
- pkg/dorks.NewRegistry() compiles and runs (zero or more embedded dorks)
- Executor interface + ErrSourceNotImplemented in place for Plan 08-05 and 08-06
- custom_dorks CRUD functional; downstream `dorks add`/`dorks delete` commands have
a storage backend to call
</success_criteria>

<output>
After completion, create `.planning/phases/08-dork-engine/08-01-SUMMARY.md`
</output>