---
phase: 08-dork-engine
plan: 04
subsystem: dork-engine
tags: [dorks, censys, zoomeye, fofa, gitlab, bing, yaml, embed]
requires:
- 08-01 # dork schema, loader, registry, executor foundation
provides:
- 15-censys-dorks
- 10-zoomeye-dorks
- 10-fofa-dorks
- 10-gitlab-dorks
- 5-bing-dorks
- 150-dork-grand-total
affects:
- pkg/dorks/definitions/
- dorks/
tech-stack:
  added: []
  patterns:
  - "Dual-located YAML: authoritative copy under pkg/dorks/definitions/{source}/*.yaml is go:embed'd into the binary; dorks/ mirror at repo root stays discoverable for operators browsing the tree."
  - "Top-level YAML sequence (`- id: ...`) per source; loader.go already supports both list and single-dork shapes."
  - "Category mix per source reflects the real surface: infrastructure sources (Censys/ZoomEye/FOFA) lean heavy infrastructure + a couple frontier cert/proxy catches; code-search (GitLab) spreads across frontier/specialized/emerging; Bing mixes pastebin leaks with one infra catch."
key-files:
  created:
  - pkg/dorks/definitions/censys/all.yaml
  - pkg/dorks/definitions/zoomeye/all.yaml
  - pkg/dorks/definitions/fofa/all.yaml
  - pkg/dorks/definitions/gitlab/all.yaml
  - pkg/dorks/definitions/bing/all.yaml
  - dorks/censys/all.yaml
  - dorks/zoomeye/all.yaml
  - dorks/fofa/all.yaml
  - dorks/gitlab/all.yaml
  - dorks/bing/all.yaml
  modified: []
decisions:
- "Used a single all.yaml per source (rather than per-category files) because the volumes per source are small (5-15); splitting by category would scatter 1-3 dork files across directories with no offsetting gain."
- "Marked Azure OpenAI cert, Bedrock cert, and OpenAI-proxy queries as `frontier` (not infrastructure) because they directly fingerprint frontier-model vendors, even when the underlying delivery is via infra indicators (TLS CN / body fragments)."
- "Did not author an executor stub for any of the 5 new sources; plan 08-01 already returns ErrSourceNotImplemented for every non-GitHub source, and live execution is explicitly deferred to OSINT phases 9-16."
metrics:
  duration: ~5min
  tasks_completed: 2
  files_created: 10
  dorks_added: 50
  grand_total: 150
  completed: 2026-04-05
---

# Phase 08 Plan 04: Censys + ZoomEye + FOFA + GitLab + Bing Dorks Summary

50 dorks across the five sources not covered by the GitHub, Google, and Shodan plans, delivered as embedded YAML, bringing the phase grand total to the DORK-02 target of 150.

## What Was Built

- **Censys (15):** Search 2.0 queries against `services.http.response.*`, `services.tls.certificates.*`, and `services.port` for Ollama (:11434), vLLM, LocalAI, Open WebUI, LM Studio, NVIDIA Triton, Hugging Face TGI, LiteLLM (:4000), Portkey, LangServe, FastChat, text-generation-webui, plus Azure OpenAI and AWS Bedrock certificate CNs and an OpenAI-compatible proxy body-content catch. 12 infrastructure + 3 frontier.
- **ZoomEye (10):** Mirrors the Censys surface using `app:`, `title:`, `service:`, and `port:` operators. 9 infrastructure + 1 frontier (OpenAI-proxy title match).
- **FOFA (10):** Native `title=`, `body=`, `port=`, `cert=` queries covering Ollama, vLLM, LocalAI, Open WebUI, LiteLLM, Triton, LangServe, TGI, plus two frontier catches (Azure OpenAI cert, OpenAI proxy leaking `api_key`). 8 infrastructure + 2 frontier.
- **GitLab (10):** Code-search dorks for committed `.env`, `.json`, and `.py` files across OpenAI, Anthropic, Google Generative AI, Groq, Cohere, Hugging Face, OpenRouter, Perplexity, DeepSeek, and Pinecone. Mix: frontier (3), specialized (3), emerging (3), infrastructure (1).
- **Bing (5):** `site:pastebin.com`, `filetype:env`, and `intitle:` / `inbody:` operators catching pasted OpenAI/Anthropic/HF keys, `.env` files, and exposed Ollama dashboards. 3 frontier + 1 specialized + 1 infrastructure.

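
The three infrastructure sources express the same target with different operator styles. As an illustration only, the sketch below renders the Ollama port (11434) in each engine's port operator, using the `services.port`, `port:`, and `port=` prefixes named above. The `Dork` struct, its field names, the `portQuery` helper, and the ids are hypothetical and are not taken from the 08-01 schema or the shipped YAML.

```go
package main

import "fmt"

// Dork is a hypothetical, simplified shape for one definition.
// The real schema from plan 08-01 may use different field names.
type Dork struct {
	ID       string
	Source   string
	Category string
	Query    string
}

// portQuery renders a port filter in the operator style of the given
// source. Only the three infrastructure sources are covered; any other
// source returns an error.
func portQuery(source string, port int) (string, error) {
	switch source {
	case "censys":
		return fmt.Sprintf("services.port: %d", port), nil
	case "zoomeye":
		return fmt.Sprintf("port:%d", port), nil
	case "fofa":
		return fmt.Sprintf("port=\"%d\"", port), nil
	default:
		return "", fmt.Errorf("no port operator known for source %q", source)
	}
}

func main() {
	for _, src := range []string{"censys", "zoomeye", "fofa"} {
		q, err := portQuery(src, 11434)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// The id is an illustrative placeholder, not a shipped dork id.
		d := Dork{ID: src + "-example-port", Source: src, Category: "infrastructure", Query: q}
		fmt.Printf("%s [%s] %s\n", d.Source, d.Category, d.Query)
	}
}
```

A real definition would typically combine the port filter with a title, body, or certificate match to cut false positives; the sketch isolates the port operator to show only the syntax difference.
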
## Success Criteria Status

- [x] `Registry.ListBySource("censys")` returns 15 (verified via `grep -c '^- id:'`)
- [x] `Registry.ListBySource("zoomeye")` returns 10
- [x] `Registry.ListBySource("fofa")` returns 10
- [x] `Registry.ListBySource("gitlab")` returns 10
- [x] `Registry.ListBySource("bing")` returns 5
- [x] Grand total across all 8 sources: **150** (github 50 + google 30 + shodan 20 + censys 15 + zoomeye 10 + fofa 10 + gitlab 10 + bing 5)
- [x] All files dual-located under `pkg/dorks/definitions/` and `dorks/`
- [x] `go test ./pkg/dorks/...` passes

## Decisions Made

1. **Single `all.yaml` per source:** per-category splits would create 1-3-entry files for sources that top out at 5-15 dorks total. A single file keeps the tree flat and matches the volume.
2. **Cert fingerprints tagged frontier:** Azure OpenAI and Bedrock certificate CNs are infra-level indicators, but their target is a frontier vendor. The category field drives filtering, so they belong in `frontier` for operators running `dorks run --category frontier`.
3. **No executor stubs needed:** 08-01 already routes non-GitHub sources to `ErrSourceNotImplemented`, and live execution lands in Phases 9-16. These YAML files are pure definitions.

## Deviations from Plan

None. The plan was executed exactly as written. The loader already supported list-form YAML from plans 08-02/08-03, so no loader change was required.

## Commits

| Task | Description                        | Commit    |
| ---- | ---------------------------------- | --------- |
| 1    | 15 Censys + 10 ZoomEye dorks       | `1c86800` |
| 2    | 10 FOFA + 10 GitLab + 5 Bing dorks | `c504cbd` |

## Verification

```
$ go test ./pkg/dorks/...
ok github.com/salvacybersec/keyhunter/pkg/dorks
$ total=0
$ for s in github google shodan censys zoomeye fofa gitlab bing; do
    # count top-level list entries in every definition file for this source
    n=$(grep -rh '^- id:' "pkg/dorks/definitions/$s/" | wc -l)
    echo "$s: $n"; total=$((total + n))
  done; echo "TOTAL: $total"
github: 50
google: 30
shodan: 20
censys: 15
zoomeye: 10
fofa: 10
gitlab: 10
bing: 5
TOTAL: 150
```

## Known Stubs

None.
All 50 dorks are complete definitions; execution is intentionally deferred to OSINT phases 9-16 per 08-CONTEXT.md.

## Self-Check: PASSED

- FOUND: pkg/dorks/definitions/censys/all.yaml
- FOUND: pkg/dorks/definitions/zoomeye/all.yaml
- FOUND: pkg/dorks/definitions/fofa/all.yaml
- FOUND: pkg/dorks/definitions/gitlab/all.yaml
- FOUND: pkg/dorks/definitions/bing/all.yaml
- FOUND: dorks/censys/all.yaml
- FOUND: dorks/zoomeye/all.yaml
- FOUND: dorks/fofa/all.yaml
- FOUND: dorks/gitlab/all.yaml
- FOUND: dorks/bing/all.yaml
- FOUND commit: 1c86800
- FOUND commit: c504cbd
- TEST: go test ./pkg/dorks/... PASSED
- COUNT: grand total 150 (target >= 150)
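
The grep-based count used for verification (`^- id:` at the start of a line) can also be reproduced in Go, for instance as a cheap guard in a test. The following is a minimal sketch under stated assumptions: `countDorks` is a hypothetical helper, not part of `pkg/dorks`, and it is a line-oriented heuristic rather than a YAML parser. The sample document and its ids are placeholders; the per-source numbers are the ones reported in this summary.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countDorks counts lines that begin with "- id:", the same pattern the
// grep check matches. Indented entries are deliberately not counted, so
// only top-level sequence items in a list-form file contribute.
func countDorks(yamlText string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(yamlText))
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "- id:") {
			n++
		}
	}
	return n
}

func main() {
	// Illustrative two-entry document; ids and fields are placeholders.
	sample := "- id: example-one\n  source: bing\n- id: example-two\n  source: bing\n"
	fmt.Println(countDorks(sample)) // prints 2

	// Per-source counts as reported in this summary.
	counts := map[string]int{
		"github": 50, "google": 30, "shodan": 20, "censys": 15,
		"zoomeye": 10, "fofa": 10, "gitlab": 10, "bing": 5,
	}
	total := 0
	for _, c := range counts {
		total += c
	}
	fmt.Println(total) // prints 150
}
```

Because the helper only matches a line prefix, it would miscount a file that uses the single-dork shape or a nested list; the registry's `ListBySource` remains the authoritative count.
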