keyhunter/.planning/phases/08-dork-engine/08-04-SUMMARY.md

phase: 08-dork-engine
plan: 04
subsystem: dork-engine
tags: [dorks, censys, zoomeye, fofa, gitlab, bing, yaml, embed]
requires: [08-01]
provides: [15-censys-dorks, 10-zoomeye-dorks, 10-fofa-dorks, 10-gitlab-dorks, 5-bing-dorks, 150-dork-grand-total]
affects: [pkg/dorks/definitions/, dorks/]
tech-stack:
  added: []
  patterns:
    - Dual-located YAML: the authoritative copy under pkg/dorks/definitions/{source}/*.yaml is go:embed'd into the binary; the dorks/ mirror at the repo root stays discoverable for operators browsing the tree.
    - Top-level YAML sequence (`- id: ...`) per source; loader.go already supports both list and single-dork shapes.
    - Category mix per source reflects the real surface: infrastructure sources (Censys/ZoomEye/FOFA) lean heavily toward infrastructure plus a couple of frontier cert/proxy catches; code search (GitLab) spreads across frontier/specialized/emerging; Bing mixes pastebin leaks with one infra catch.
key-files:
  created:
    - pkg/dorks/definitions/censys/all.yaml
    - pkg/dorks/definitions/zoomeye/all.yaml
    - pkg/dorks/definitions/fofa/all.yaml
    - pkg/dorks/definitions/gitlab/all.yaml
    - pkg/dorks/definitions/bing/all.yaml
    - dorks/censys/all.yaml
    - dorks/zoomeye/all.yaml
    - dorks/fofa/all.yaml
    - dorks/gitlab/all.yaml
    - dorks/bing/all.yaml
  modified: []
decisions:
  - Used a single all.yaml per source (rather than per-category files) because the volumes per source are small (5-15); splitting by category would scatter files of 1-3 dorks across directories with no offsetting gain.
  - Marked Azure OpenAI cert, Bedrock cert, and OpenAI-proxy queries as `frontier` (not infrastructure) because they directly fingerprint frontier-model vendors, even when the underlying delivery is via infra indicators (TLS CN / body fragments).
  - Did not author an executor stub for any of the 5 new sources; plan 08-01 already returns ErrSourceNotImplemented for every non-GitHub source, and live execution is explicitly deferred to OSINT phases 9-16.
metrics:
  duration: ~5min
  tasks_completed: 2
  files_created: 10
  dorks_added: 50
  grand_total: 150
  completed: 2026-04-05

Phase 08 Plan 04: Censys + ZoomEye + FOFA + GitLab + Bing Dorks Summary

50 dorks across five additional sources (Censys, ZoomEye, FOFA, GitLab, Bing), complementing the existing GitHub, Google, and Shodan sets, delivered as embedded YAML and bringing the phase grand total to the DORK-02 target of 150.

What Was Built

  • Censys (15): Search 2.0 queries against services.http.response.*, services.tls.certificates.*, and services.port for Ollama (:11434), vLLM, LocalAI, Open WebUI, LM Studio, NVIDIA Triton, Hugging Face TGI, LiteLLM (:4000), Portkey, LangServe, FastChat, text-generation-webui, plus Azure OpenAI and AWS Bedrock certificate CNs and an OpenAI-compatible proxy body-content catch. 12 infrastructure + 3 frontier.
  • ZoomEye (10): Mirrors the Censys surface using app:, title:, service:, and port: operators. 9 infrastructure + 1 frontier (OpenAI-proxy title match).
  • FOFA (10): Native title=, body=, port=, cert= queries covering Ollama, vLLM, LocalAI, Open WebUI, LiteLLM, Triton, LangServe, TGI, plus two frontier catches (Azure OpenAI cert, OpenAI proxy leaking api_key). 8 infrastructure + 2 frontier.
  • GitLab (10): Code-search dorks for committed .env, .json, and .py files across OpenAI, Anthropic, Google Generative AI, Groq, Cohere, Hugging Face, OpenRouter, Perplexity, DeepSeek, and Pinecone. Mix: frontier (3), specialized (3), emerging (3), infrastructure (1).
  • Bing (5): site:pastebin.com + filetype:env + intitle:/inbody: operators catching pasted OpenAI/Anthropic/HF keys, .env files, and exposed Ollama dashboards. 3 frontier + 1 specialized + 1 infrastructure.

Success Criteria Status

  • Registry.ListBySource("censys") returns 15 (verified via grep -c '^- id:')
  • Registry.ListBySource("zoomeye") returns 10
  • Registry.ListBySource("fofa") returns 10
  • Registry.ListBySource("gitlab") returns 10
  • Registry.ListBySource("bing") returns 5
  • Grand total across all 8 sources: 150 (github 50 + google 30 + shodan 20 + censys 15 + zoomeye 10 + fofa 10 + gitlab 10 + bing 5)
  • All files dual-located under pkg/dorks/definitions/ and dorks/
  • go test ./pkg/dorks/... passes
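The per-source checks above rely on a registry lookup keyed by source name. As a minimal sketch of that idea (the `Dork` fields, the `NewRegistry` constructor, and the `ListBySource` signature are assumptions for illustration; only the method name appears in this summary), the lookup could be a simple grouping by source:

```go
package main

import "fmt"

// Dork is a hypothetical, trimmed-down definition record.
type Dork struct {
	ID       string
	Source   string
	Category string
}

// Registry is a hypothetical in-memory index; the real type may differ.
type Registry struct {
	bySource map[string][]Dork
}

// NewRegistry groups dorks by source, preserving input order within a source.
func NewRegistry(dorks []Dork) *Registry {
	r := &Registry{bySource: make(map[string][]Dork)}
	for _, d := range dorks {
		r.bySource[d.Source] = append(r.bySource[d.Source], d)
	}
	return r
}

// ListBySource returns the dorks registered for one source (nil if unknown).
func (r *Registry) ListBySource(source string) []Dork {
	return r.bySource[source]
}

func main() {
	// Placeholder IDs, not real entries from the YAML files.
	r := NewRegistry([]Dork{
		{ID: "censys-example-1", Source: "censys", Category: "infrastructure"},
		{ID: "censys-example-2", Source: "censys", Category: "frontier"},
		{ID: "bing-example-1", Source: "bing", Category: "frontier"},
	})
	fmt.Println(len(r.ListBySource("censys"))) // 2
	fmt.Println(len(r.ListBySource("fofa")))   // 0
}
```

With the real definitions loaded, the same call would be expected to report the counts listed above (15, 10, 10, 10, 5).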

Decisions Made

  1. Single all.yaml per source: per-category splits would create 1-3-entry files for sources that top out at 5-15 dorks total. A single file keeps the tree flat and matches the volume.
  2. Cert fingerprints tagged frontier: Azure OpenAI and Bedrock certificate CNs are infra-level indicators, but their target is a frontier vendor. The category field drives filtering, so they belong in frontier for operators running dorks run --category frontier.
  3. No executor stubs needed: 08-01 already routes non-GitHub sources to ErrSourceNotImplemented, and live execution lands in phases 9-16. These YAML files are pure definitions.
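Decisions 2 and 3 can be illustrated together. The following is a sketch under stated assumptions, not the project's code: `ErrSourceNotImplemented` is named in this summary, but its definition, the `Dork` fields, and the `FilterByCategory`, `Execute`, and `IsNotImplemented` helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSourceNotImplemented is named in the summary; this definition is assumed.
var ErrSourceNotImplemented = errors.New("dorks: source not implemented")

// Dork is a hypothetical, trimmed-down definition record.
type Dork struct {
	ID       string
	Source   string
	Category string
}

// FilterByCategory keeps only dorks whose Category matches exactly,
// which is roughly what a --category flag would need.
func FilterByCategory(dorks []Dork, category string) []Dork {
	var out []Dork
	for _, d := range dorks {
		if d.Category == category {
			out = append(out, d)
		}
	}
	return out
}

// Execute is a hypothetical dispatcher: only "github" is treated as live,
// every other source returns the wrapped sentinel error.
func Execute(d Dork) error {
	if d.Source != "github" {
		return fmt.Errorf("execute %s: %w", d.ID, ErrSourceNotImplemented)
	}
	return nil // a real GitHub executor would run the query here
}

// IsNotImplemented reports whether err wraps ErrSourceNotImplemented.
func IsNotImplemented(err error) bool {
	return errors.Is(err, ErrSourceNotImplemented)
}

func main() {
	// Placeholder IDs, not real entries from the YAML files.
	dorks := []Dork{
		{ID: "censys-cert-example", Source: "censys", Category: "frontier"},
		{ID: "censys-port-example", Source: "censys", Category: "infrastructure"},
	}
	for _, d := range FilterByCategory(dorks, "frontier") {
		err := Execute(d)
		fmt.Println(d.ID, "deferred:", IsNotImplemented(err))
	}
}
```

Tagging a certificate query as frontier means it survives the category filter even though, in this sketch, running it still yields the not-implemented error until the later phases land.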

Deviations from Plan

None — plan executed exactly as written. The loader already supported list-form YAML from plans 08-02/08-03, so no loader change was required.

Commits

Task | Description                        | Commit
1    | 15 Censys + 10 ZoomEye dorks       | 1c86800
2    | 10 FOFA + 10 GitLab + 5 Bing dorks | c504cbd

Verification

$ go test ./pkg/dorks/...
ok  	github.com/salvacybersec/keyhunter/pkg/dorks

$ total=0; for s in github google shodan censys zoomeye fofa gitlab bing; do
    n=$(grep -rh '^- id:' pkg/dorks/definitions/$s/ | wc -l)
    printf '%s: %d\n' "$s" "$n"; total=$((total + n))
  done; printf 'TOTAL:  %d\n' "$total"
github: 50
google: 30
shodan: 20
censys: 15
zoomeye: 10
fofa: 10
gitlab: 10
bing: 5
TOTAL:  150
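The same count can be taken from Go against any `fs.FS`, which is also the interface an `embed.FS` satisfies. This is a hedged sketch rather than the project's test: `CountDorks` and `demoFS` are hypothetical names, the in-memory file stands in for the real YAML, and unlike `grep -rh` it only looks at `.yaml` files.

```go
package main

import (
	"bufio"
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// CountDorks walks dir inside fsys and counts lines that start with "- id:"
// in every .yaml file, approximating grep -rh '^- id:' | wc -l.
func CountDorks(fsys fs.FS, dir string) (int, error) {
	count := 0
	err := fs.WalkDir(fsys, dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || !strings.HasSuffix(path, ".yaml") {
			return nil
		}
		f, err := fsys.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		sc := bufio.NewScanner(f)
		for sc.Scan() {
			if strings.HasPrefix(sc.Text(), "- id:") {
				count++
			}
		}
		return sc.Err()
	})
	return count, err
}

// demoFS builds a tiny in-memory tree with made-up entries.
func demoFS() fs.FS {
	return fstest.MapFS{
		"censys/all.yaml": {Data: []byte("- id: example-a\n  query: x\n- id: example-b\n  query: y\n")},
	}
}

func main() {
	n, err := CountDorks(demoFS(), "censys")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("censys:", n) // censys: 2
}
```

Against the working tree, `os.DirFS("pkg/dorks/definitions")` could be passed in place of `demoFS()`.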

Known Stubs

None. All 50 dorks are complete definitions; execution is intentionally deferred to OSINT phases 9-16 per 08-CONTEXT.md.

Self-Check: PASSED

  • FOUND: pkg/dorks/definitions/censys/all.yaml
  • FOUND: pkg/dorks/definitions/zoomeye/all.yaml
  • FOUND: pkg/dorks/definitions/fofa/all.yaml
  • FOUND: pkg/dorks/definitions/gitlab/all.yaml
  • FOUND: pkg/dorks/definitions/bing/all.yaml
  • FOUND: dorks/censys/all.yaml
  • FOUND: dorks/zoomeye/all.yaml
  • FOUND: dorks/fofa/all.yaml
  • FOUND: dorks/gitlab/all.yaml
  • FOUND: dorks/bing/all.yaml
  • FOUND commit: 1c86800
  • FOUND commit: c504cbd
  • TEST: go test ./pkg/dorks/... PASSED
  • COUNT: grand total 150 (target >= 150)
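
The FOUND checks confirm that both copies of each file exist. Assuming the dorks/ mirror is meant to be byte-identical to the embedded copy (an assumption; this summary does not state it), a one-directional drift check could be sketched as follows, with `MirrorDiff` and `demoTrees` as hypothetical names:

```go
package main

import (
	"bytes"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// MirrorDiff returns the paths of regular files in primary that are missing
// from mirror or whose bytes differ. Extra files in mirror are not reported.
func MirrorDiff(primary, mirror fs.FS) ([]string, error) {
	var diffs []string
	err := fs.WalkDir(primary, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		want, err := fs.ReadFile(primary, path)
		if err != nil {
			return err
		}
		got, err := fs.ReadFile(mirror, path)
		if err != nil || !bytes.Equal(want, got) {
			diffs = append(diffs, path)
		}
		return nil
	})
	return diffs, err
}

// demoTrees builds two small in-memory trees where bing/all.yaml has drifted.
func demoTrees() (fs.FS, fs.FS) {
	primary := fstest.MapFS{
		"bing/all.yaml":   {Data: []byte("- id: example-a\n")},
		"censys/all.yaml": {Data: []byte("- id: example-b\n")},
	}
	mirror := fstest.MapFS{
		"bing/all.yaml":   {Data: []byte("- id: example-changed\n")},
		"censys/all.yaml": {Data: []byte("- id: example-b\n")},
	}
	return primary, mirror
}

func main() {
	primary, mirror := demoTrees()
	diffs, err := MirrorDiff(primary, mirror)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(diffs) // [bing/all.yaml]
}
```

On disk, `os.DirFS("pkg/dorks/definitions")` and `os.DirFS("dorks")` could serve as the two trees; an empty result would mean the mirror matches.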