keyhunter/.planning/phases/08-dork-engine/08-02-PLAN.md
2026-04-06 00:13:13 +03:00


---
phase: 08-dork-engine
plan: 02
type: execute
wave: 2
depends_on:
  - 08-01
files_modified:
  - pkg/dorks/definitions/github/frontier.yaml
  - pkg/dorks/definitions/github/specialized.yaml
  - pkg/dorks/definitions/github/infrastructure.yaml
  - pkg/dorks/definitions/github/emerging.yaml
  - pkg/dorks/definitions/github/enterprise.yaml
  - dorks/github/frontier.yaml
  - dorks/github/specialized.yaml
  - dorks/github/infrastructure.yaml
  - dorks/github/emerging.yaml
  - dorks/github/enterprise.yaml
autonomous: true
requirements:
  - DORK-01
  - DORK-02
  - DORK-04
must_haves:
  truths:
    - "pkg/dorks.NewRegistry() loads at least 50 github dorks"
    - "Dorks cover all 5 categories (frontier, specialized, infrastructure, emerging, enterprise)"
    - 'Registry.ListBySource("github") returns >= 50 entries'
    - "All dork IDs are unique and pass Dork.Validate()"
  artifacts:
    - path: "pkg/dorks/definitions/github/frontier.yaml"
      provides: "~15 GitHub dorks for Tier 1/2 frontier providers"
      contains: "source: github"
    - path: "pkg/dorks/definitions/github/specialized.yaml"
      provides: "~10 GitHub dorks for Tier 3 specialized providers"
      contains: "category: specialized"
  key_links:
    - from: "pkg/dorks/definitions/github/*.yaml"
      to: "pkg/dorks/loader.go"
      via: "go:embed compile-time embed"
      pattern: "source: github"
---

Populate the GitHub source with 50 production dork queries covering every provider category. Each dork is a real GitHub Code Search query formatted per the Dork schema from Plan 08-01. The files are mirrored into `dorks/github/` (user-visible) and `pkg/dorks/definitions/github/` (go:embed target) per the Phase 1 dual-location pattern.

Purpose: A third of the 150+ dork requirement (DORK-02) lives here. GitHub is the largest single source because it is the primary live executor (Plan 08-05) and because leaked keys overwhelmingly show up in .env/config files.

Output: 50 GitHub dorks, embedded and loadable.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/08-dork-engine/08-CONTEXT.md
@.planning/phases/08-dork-engine/08-01-PLAN.md
@pkg/providers/definitions/openai.yaml
@pkg/dorks/schema.go

Task 1: 25 GitHub dorks (frontier + specialized categories)

Files: pkg/dorks/definitions/github/frontier.yaml, pkg/dorks/definitions/github/specialized.yaml, dorks/github/frontier.yaml, dorks/github/specialized.yaml

Create both category files (frontier.yaml and specialized.yaml), each in both locations, using the YAML list format supported by the loader. Each file is a YAML document containing a top-level list of Dork entries. If the loader in 08-01 was written to expect one Dork per file, update it here to also accept a list: check pkg/dorks/loader.go and adapt (preferred: the loader accepts either a `type dorkFile struct { Dorks []Dork }` wrapper or a top-level list). Use the list form.

File format (list of Dork):
```yaml
- id: openai-github-envfile
  name: "OpenAI API Key in .env files"
  source: github
  category: frontier
  query: 'sk-proj- extension:env'
  description: "Finds OpenAI project keys committed in .env files"
  tags: [openai, env, tier1]
- id: openai-github-pyfile
  ...
```

**frontier.yaml — 15 dorks** covering Tier 1/2 providers. Each provider gets
1-3 dorks. Use real, validated prefixes from pkg/providers/definitions/*.yaml:
- openai-github-envfile: `sk-proj- extension:env`
- openai-github-pyfile: `sk-proj- extension:py`
- openai-github-jsonfile: `sk-proj- extension:json`
- anthropic-github-envfile: `sk-ant-api03- extension:env`
- anthropic-github-pyfile: `sk-ant-api03- extension:py`
- google-ai-github-envfile: `AIzaSy extension:env "GOOGLE_API_KEY"`
- google-ai-github-jsonfile: `AIzaSy extension:json "generativelanguage"`
- azure-openai-envfile: `AZURE_OPENAI_KEY extension:env`
- aws-bedrock-envfile: `AKIA extension:env "bedrock"`
- xai-envfile: `xai- extension:env`
- cohere-envfile: `COHERE_API_KEY extension:env`
- mistral-envfile: `MISTRAL_API_KEY extension:env`
- groq-envfile: `gsk_ extension:env`
- together-envfile: `TOGETHER_API_KEY extension:env`
- replicate-envfile: `r8_ extension:env`

All with category: frontier, appropriate tags. Each query MUST be a literal
GitHub Code Search query — no templating.

**specialized.yaml — 10 dorks** covering Tier 3 providers:
- perplexity-envfile: `pplx- extension:env`
- voyage-envfile: `VOYAGE_API_KEY extension:env`
- jina-envfile: `jina_ extension:env`
- assemblyai-envfile: `ASSEMBLYAI_API_KEY extension:env`
- deepgram-envfile: `DEEPGRAM_API_KEY extension:env`
- elevenlabs-envfile: `ELEVENLABS_API_KEY extension:env`
- stability-envfile: `sk-stability- extension:env`
- huggingface-envfile: `hf_ extension:env`
- perplexity-config: `pplx- filename:config.yaml`
- deepgram-config: `DEEPGRAM filename:.env.local`

category: specialized.

Write identical content to both `pkg/dorks/definitions/github/{file}.yaml`
and `dorks/github/{file}.yaml`. The pkg/ copy is for go:embed, the dorks/
copy is user-visible.

**Adapt loader if needed.** If 08-01 wrote `yaml.Unmarshal(data, &Dork{})`
(single dork per file), change to:
```go
var list []Dork
if err := yaml.Unmarshal(data, &list); err != nil { return err }
dorks = append(dorks, list...)
```
Run `go test ./pkg/dorks/...` to confirm.
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/dorks/... && go run ./cmd/... 2>&1 || true; awk 'FNR==1{print FILENAME}/^- id:/{c++}END{print "count:",c}' pkg/dorks/definitions/github/frontier.yaml pkg/dorks/definitions/github/specialized.yaml`

Done: 25 dorks loaded, all pass Validate(), tests pass.

Task 2: 25 GitHub dorks (infrastructure + emerging + enterprise)

Files: pkg/dorks/definitions/github/infrastructure.yaml, pkg/dorks/definitions/github/emerging.yaml, pkg/dorks/definitions/github/enterprise.yaml, dorks/github/infrastructure.yaml, dorks/github/emerging.yaml, dorks/github/enterprise.yaml

Create six YAML files (three pairs) using the same list format as Task 1.

**infrastructure.yaml — 10 dorks** (Tier 5 gateways + Tier 8 self-hosted):
- openrouter-envfile: `sk-or-v1- extension:env`
- openrouter-pyfile: `sk-or-v1- extension:py`
- litellm-envfile: `LITELLM_MASTER_KEY extension:env`
- portkey-envfile: `PORTKEY_API_KEY extension:env`
- helicone-envfile: `sk-helicone- extension:env`
- cloudflare-ai-envfile: `CF_API_TOKEN "ai.run"`
- vercel-ai-envfile: `VERCEL_AI extension:env`
- ollama-config: `OLLAMA_HOST filename:docker-compose.yaml`
- vllm-config: `vllm.entrypoints filename:config.yaml`
- localai-envfile: `LOCALAI_API_KEY extension:env`

category: infrastructure.

**emerging.yaml — 10 dorks** (Tier 4 Chinese + Tier 6 niche + vector DBs):
- deepseek-envfile: `sk- extension:env "deepseek"`
- moonshot-envfile: `sk- extension:env "moonshot"`
- qwen-envfile: `DASHSCOPE_API_KEY extension:env`
- zhipu-envfile: `ZHIPU_API_KEY extension:env`
- minimax-envfile: `MINIMAX_API_KEY extension:env`
- pinecone-envfile: `PINECONE_API_KEY extension:env`
- weaviate-envfile: `WEAVIATE_API_KEY extension:env`
- qdrant-envfile: `QDRANT_API_KEY extension:env`
- chroma-envfile: `CHROMA_API_KEY extension:env`
- writer-envfile: `WRITER_API_KEY extension:env`

category: emerging.

**enterprise.yaml — 5 dorks** (Tier 7 dev tools + Tier 9 enterprise):
- codeium-envfile: `CODEIUM_API_KEY extension:env`
- tabnine-envfile: `TABNINE_TOKEN extension:env`
- databricks-envfile: `DATABRICKS_TOKEN extension:env`
- snowflake-cortex: `SNOWFLAKE_PASSWORD "cortex"`
- watsonx-envfile: `WATSONX_APIKEY extension:env`

category: enterprise.

Write each YAML to both pkg/dorks/definitions/github/ and dorks/github/.
All dorks use source: github.
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/dorks/... && grep -c '^- id:' pkg/dorks/definitions/github/*.yaml | awk -F: '{s+=$NF}END{print "total github dorks:",s; if(s<50) exit 1}'`

Done: 50 total GitHub dorks across 5 category files, loader picks all up, counts pass.

Verification:

- `cd /home/salva/Documents/apikey && go test ./pkg/dorks/...` passes
- Registry reports >= 50 dorks via a throwaway main or test assertion

<success_criteria>

- 50 GitHub dorks loadable via pkg/dorks.NewRegistry()
- All 5 categories represented
- Dual location (dorks/ + pkg/dorks/definitions/) maintained

</success_criteria>

After completion, create `.planning/phases/08-dork-engine/08-02-SUMMARY.md`