| phase | plan | subsystem | tags | requires | provides | affects | tech-stack | key-files | key-decisions | patterns-established | requirements-completed | duration | completed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 08-dork-engine | 02 | dorks | | | | | | | | | | 12min | 2026-04-05 |
Phase 8 Plan 02: 50 GitHub Dorks Summary
50 production GitHub code-search dorks across 5 categories (frontier, specialized, infrastructure, emerging, enterprise) covering 40+ LLM/AI providers, embedded via go:embed and mirrored into the user-visible dorks/ tree.
Performance
- Duration: ~12 min
- Started: 2026-04-05T21:09:00Z
- Completed: 2026-04-05T21:21:00Z
- Tasks: 2
- Files modified: 11 (10 created, 1 modified)
Accomplishments
- 50 GitHub dorks loadable via `pkg/dorks.NewRegistry().ListBySource("github")`
- All 5 dork taxonomy categories populated (frontier 15, specialized 10, infrastructure 10, emerging 10, enterprise 5)
- Loader extended to parse YAML list form without breaking existing one-dork-per-file tests
- Dual-location mirror maintained per the Phase 8 architecture decision
Task Commits
- Task 1: 25 GitHub dorks (frontier + specialized categories): `09722ea` (feat)
- Task 2: 25 GitHub dorks (infrastructure + emerging + enterprise): `9755b37` (feat)
Files Created/Modified
- `pkg/dorks/definitions/github/frontier.yaml`: 15 Tier 1/2 dorks (OpenAI, Anthropic, Google AI, Azure OpenAI, AWS Bedrock, xAI, Cohere, Mistral, Groq, Together, Replicate)
- `pkg/dorks/definitions/github/specialized.yaml`: 10 Tier 3 dorks (Perplexity, Voyage, Jina, AssemblyAI, Deepgram, ElevenLabs, Stability, HuggingFace)
- `pkg/dorks/definitions/github/infrastructure.yaml`: 10 Tier 5 gateway + Tier 8 self-hosted dorks (OpenRouter, LiteLLM, Portkey, Helicone, Cloudflare AI, Vercel AI, Ollama, vLLM, LocalAI)
- `pkg/dorks/definitions/github/emerging.yaml`: 10 Tier 4 Chinese + Tier 6 vector DB dorks (DeepSeek, Moonshot, Qwen, Zhipu, MiniMax, Pinecone, Weaviate, Qdrant, Chroma, Writer)
- `pkg/dorks/definitions/github/enterprise.yaml`: 5 Tier 7/9 enterprise dorks (Codeium, Tabnine, Databricks, Snowflake Cortex, IBM watsonx)
- `dorks/github/*.yaml`: mirror of all five category files for user-visible inspection
- `pkg/dorks/loader.go`: parse YAML list first, fall back to single-dork mapping
Decisions Made
- Accept both YAML shapes in the loader (list + single) so the existing `TestNewRegistry_EmptyDefinitionsTreeOK` test and any future one-off dork files keep working.
- Split the 50 dorks into five category files rather than one `github.yaml`: this is easier to review and aligns with the taxonomy categories enumerated in `schema.ValidCategories`.
- Use real provider prefixes verified against `pkg/providers/definitions/*.yaml` (`sk-proj-`, `sk-ant-api03-`, `AIzaSy`, `gsk_`, `r8_`, `pplx-`, `hf_`, `sk-or-v1-`, etc.) so future live execution in plan 08-05 returns genuine hits.
Deviations from Plan
Auto-fixed Issues
1. [Rule 3 — Blocking] Removed empty source subdirectories that broke go:embed
- Found during: Task 1 verification (`go test ./pkg/dorks/...`)
- Issue: Plan 08-01 left `pkg/dorks/definitions/{bing,fofa,gitlab,shodan}/` as empty directories. Once `pkg/dorks/definitions/github/` gained real YAML files, `//go:embed definitions/*` started recursing into siblings and errored with `cannot embed directory definitions/bing: contains no embeddable files`.
- Fix: Removed the four empty subdirs. They will be re-created by the Wave 2 plans that populate each source (shodan 08-03, etc.). The remaining non-empty siblings (`censys`, `google`, `zoomeye`) already contain YAML from parallel plans and embed fine.
- Files modified: deleted `pkg/dorks/definitions/{bing,fofa,gitlab,shodan}/`
- Verification: `go test ./pkg/dorks/...` passes, registry loads 126 dorks total (50 github + 76 from parallel sources) with 0 errors.
- Committed in: `09722ea` (folded into Task 1)
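The failure mode above can be caught before the build. The following is a minimal sketch, not part of the project: `hasEmbeddable`, `emptyEmbedDirs`, and `sampleTree` are hypothetical helpers, and the in-memory tree stands in for `pkg/dorks/definitions/`. It relies on the documented go:embed rule that files whose names begin with `.` or `_` are excluded when a directory is embedded, so a lone `.gitkeep` does not make a directory embeddable.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// hidden reports whether go:embed would skip this name when embedding a directory.
func hidden(name string) bool {
	return strings.HasPrefix(name, ".") || strings.HasPrefix(name, "_")
}

// hasEmbeddable reports whether dir holds at least one non-hidden file.
func hasEmbeddable(fsys fs.FS, dir string) (bool, error) {
	found := false
	err := fs.WalkDir(fsys, dir, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if p != dir && hidden(d.Name()) {
				return fs.SkipDir // hidden subdirectories are not embedded
			}
			return nil
		}
		if !hidden(d.Name()) {
			found = true
		}
		return nil
	})
	return found, err
}

// emptyEmbedDirs lists the immediate subdirectories of root that would
// trigger "contains no embeddable files".
func emptyEmbedDirs(fsys fs.FS, root string) ([]string, error) {
	entries, err := fs.ReadDir(fsys, root) // sorted by name
	if err != nil {
		return nil, err
	}
	var empty []string
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		p := path.Join(root, e.Name())
		ok, err := hasEmbeddable(fsys, p)
		if err != nil {
			return nil, err
		}
		if !ok {
			empty = append(empty, p)
		}
	}
	return empty, nil
}

// sampleTree is an illustrative layout, not the real repository contents.
func sampleTree() fstest.MapFS {
	return fstest.MapFS{
		"definitions/github/frontier.yaml": {Data: []byte("- id: example\n")},
		"definitions/bing":                 {Mode: fs.ModeDir},
		"definitions/shodan":               {Mode: fs.ModeDir},
		"definitions/fofa/.gitkeep":        {Data: nil},
	}
}

func main() {
	empty, err := emptyEmbedDirs(sampleTree(), "definitions")
	if err != nil {
		panic(err)
	}
	fmt.Println(empty)
}
```

Against a real checkout the same functions could be pointed at `os.DirFS("pkg/dorks")`, for example from a test that fails when the returned slice is non-empty.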
2. [Rule 2 — Missing critical] Extended loader to accept YAML list form
- Found during: Task 1 planning (reading `pkg/dorks/loader.go`)
- Issue: The 08-01 loader only accepted one-dork-per-file (`yaml.Unmarshal(data, &Dork{})`). The plan explicitly anticipated this and instructed me to adapt the loader to also accept top-level lists.
- Fix: Loader now tries `[]Dork` first; if that yields entries, each is validated and appended. Otherwise it falls back to single-dork parsing (preserving the empty-definitions-tree test from 08-01). Empty YAML is tolerated.
- Files modified: `pkg/dorks/loader.go`
- Verification: `go test ./pkg/dorks/...` passes: both the empty-tree test and the new 50-dork load succeed.
- Committed in: `09722ea` (Task 1 commit)
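The list-first, single-fallback flow described in the fix might look roughly like the sketch below. It is an illustration under assumptions, not the code in `pkg/dorks/loader.go`: the `Dork` fields, `validate`, and `parseDorks` are hypothetical, and the decoder is passed in as a function so the example runs with only the standard library (`encoding/json` stands in here; the real loader uses `yaml.Unmarshal`, whose error behaviour on a shape mismatch may differ in detail).

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// Dork is a trimmed-down, hypothetical stand-in for the real type.
type Dork struct {
	ID     string `json:"id"`
	Source string `json:"source"`
	Query  string `json:"query"`
}

// decodeFunc has the same shape as yaml.Unmarshal and json.Unmarshal.
type decodeFunc func(data []byte, v any) error

// demoDecode is the stand-in decoder used by this sketch.
var demoDecode decodeFunc = json.Unmarshal

// validate is a placeholder for whatever checks the real loader applies.
func validate(d Dork) error {
	if d.ID == "" || d.Query == "" {
		return errors.New("dork needs id and query")
	}
	return nil
}

// parseDorks tries the list form first and falls back to a single mapping.
func parseDorks(data []byte, decode decodeFunc) ([]Dork, error) {
	if len(bytes.TrimSpace(data)) == 0 {
		return nil, nil // empty file is tolerated and contributes no dorks
	}
	var list []Dork
	if err := decode(data, &list); err == nil && len(list) > 0 {
		for _, d := range list {
			if err := validate(d); err != nil {
				return nil, fmt.Errorf("dork %q: %w", d.ID, err)
			}
		}
		return list, nil
	}
	// Not a non-empty list: treat the document as one dork.
	var one Dork
	if err := decode(data, &one); err != nil {
		return nil, err
	}
	if err := validate(one); err != nil {
		return nil, fmt.Errorf("dork %q: %w", one.ID, err)
	}
	return []Dork{one}, nil
}

func main() {
	list := []byte(`[{"id":"a","source":"github","query":"sk-proj-"},{"id":"b","source":"github","query":"gsk_"}]`)
	single := []byte(`{"id":"c","source":"github","query":"hf_"}`)
	for _, in := range [][]byte{list, single, nil} {
		dorks, err := parseDorks(in, demoDecode)
		fmt.Println(len(dorks), err)
	}
}
```

Trying the list shape first keeps the multi-dork category files on the fast path, while the fallback leaves existing one-dork files and the empty-tree case unchanged.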
Total deviations: 2 auto-fixed (1 blocking, 1 missing-critical — both anticipated by the plan text)
Impact on plan: None — both changes were explicitly forecast in the plan's <action> block. No scope creep.
Issues Encountered
- go:embed does not embed empty directories, so the empty source subdirs left over from 08-01 broke compilation once the `github/` sibling had content. Resolved by deleting them; future source plans will recreate them with real content.
User Setup Required
None — these are built-in dorks embedded into the binary.
Next Phase Readiness
- Plan 08-03 and later source plans can follow the same multi-dork YAML list pattern established here.
- Plan 08-05 (GitHub live executor) now has 50 real queries to execute against the GitHub Code Search API.
- Registry statistics: `ListBySource("github")` returns 50; all 5 categories represented.
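For later source plans that want the same kind of check, the registry statistics can be expressed as a small smoke test. The sketch below uses a hypothetical in-memory `Registry` and `Dork` (the real types in `pkg/dorks` are not shown in this summary) and fills it with placeholder entries sized to the category counts reported above.

```go
package main

import "fmt"

// Dork and Registry are hypothetical stand-ins for the pkg/dorks types.
type Dork struct {
	ID, Source, Category string
}

type Registry struct{ dorks []Dork }

// ListBySource returns every dork registered for the given source.
func (r *Registry) ListBySource(source string) []Dork {
	var out []Dork
	for _, d := range r.dorks {
		if d.Source == source {
			out = append(out, d)
		}
	}
	return out
}

// countByCategory tallies dorks per taxonomy category.
func countByCategory(dorks []Dork) map[string]int {
	counts := make(map[string]int)
	for _, d := range dorks {
		counts[d.Category]++
	}
	return counts
}

// sampleRegistry builds placeholder entries matching the reported split.
func sampleRegistry() *Registry {
	sizes := map[string]int{
		"frontier": 15, "specialized": 10, "infrastructure": 10,
		"emerging": 10, "enterprise": 5,
	}
	r := &Registry{}
	for cat, n := range sizes {
		for i := 1; i <= n; i++ {
			id := fmt.Sprintf("github-%s-%02d", cat, i)
			r.dorks = append(r.dorks, Dork{ID: id, Source: "github", Category: cat})
		}
	}
	// One entry from another source, to show that the filter excludes it.
	r.dorks = append(r.dorks, Dork{ID: "other-01", Source: "shodan", Category: "infrastructure"})
	return r
}

func main() {
	gh := sampleRegistry().ListBySource("github")
	fmt.Println(len(gh), len(countByCategory(gh)))
}
```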
Self-Check: PASSED
- File exists: `pkg/dorks/definitions/github/frontier.yaml`: FOUND
- File exists: `pkg/dorks/definitions/github/specialized.yaml`: FOUND
- File exists: `pkg/dorks/definitions/github/infrastructure.yaml`: FOUND
- File exists: `pkg/dorks/definitions/github/emerging.yaml`: FOUND
- File exists: `pkg/dorks/definitions/github/enterprise.yaml`: FOUND
- File exists: `dorks/github/frontier.yaml`: FOUND
- File exists: `dorks/github/specialized.yaml`: FOUND
- File exists: `dorks/github/infrastructure.yaml`: FOUND
- File exists: `dorks/github/emerging.yaml`: FOUND
- File exists: `dorks/github/enterprise.yaml`: FOUND
- Commit exists: `09722ea`: FOUND
- Commit exists: `9755b37`: FOUND
- Runtime verification: `dorks.NewRegistry().ListBySource("github")` returned 50 dorks across 5 categories: PASSED
- `go test ./pkg/dorks/...`: PASSED
- `go build ./...`: PASSED
Phase: 08-dork-engine
Completed: 2026-04-05