docs(08-04): complete censys+zoomeye+fofa+gitlab+bing dorks plan

salvacybersec
2026-04-06 00:22:36 +03:00
parent 17f17edf1e
commit 213177ddf4


@@ -0,0 +1,129 @@
---
phase: 08-dork-engine
plan: 04
subsystem: dork-engine
tags: [dorks, censys, zoomeye, fofa, gitlab, bing, yaml, embed]
requires:
- 08-01 # dork schema, loader, registry, executor foundation
provides:
- 15-censys-dorks
- 10-zoomeye-dorks
- 10-fofa-dorks
- 10-gitlab-dorks
- 5-bing-dorks
- 150-dork-grand-total
affects:
- pkg/dorks/definitions/
- dorks/
tech-stack:
added: []
patterns:
- "Dual-located YAML: authoritative copy under pkg/dorks/definitions/{source}/*.yaml is go:embed'd into the binary; dorks/ mirror at repo root stays discoverable for operators browsing the tree."
- "Top-level YAML sequence (`- id: ...`) per source — loader.go already supports both list and single-dork shapes."
- "Category mix per source reflects the real surface: infrastructure sources (Censys/ZoomEye/FOFA) lean heavily toward infrastructure, plus a couple of frontier cert/proxy catches; code-search (GitLab) spreads across frontier/specialized/emerging; Bing mixes pastebin leaks with one infra catch."
key-files:
created:
- pkg/dorks/definitions/censys/all.yaml
- pkg/dorks/definitions/zoomeye/all.yaml
- pkg/dorks/definitions/fofa/all.yaml
- pkg/dorks/definitions/gitlab/all.yaml
- pkg/dorks/definitions/bing/all.yaml
- dorks/censys/all.yaml
- dorks/zoomeye/all.yaml
- dorks/fofa/all.yaml
- dorks/gitlab/all.yaml
- dorks/bing/all.yaml
modified: []
decisions:
- "Used a single all.yaml per source (rather than per-category files) because the volumes per source are small (5-15) — splitting by category would scatter 1-3 dork files across directories with no offsetting gain."
- "Marked Azure OpenAI cert, Bedrock cert, and OpenAI-proxy queries as `frontier` (not infrastructure) because they directly fingerprint frontier-model vendors, even when the underlying delivery is via infra indicators (TLS CN / body fragments)."
- "Did not author an executor stub for any of the 5 new sources — plan 08-01 already returns ErrSourceNotImplemented for every non-GitHub source, and live execution is explicitly deferred to OSINT phases 9-16."
metrics:
duration: ~5min
tasks_completed: 2
files_created: 10
dorks_added: 50
grand_total: 150
completed: 2026-04-05
---
# Phase 08 Plan 04: Censys + ZoomEye + FOFA + GitLab + Bing Dorks Summary
50 dorks across the five remaining sources (everything other than GitHub, Google, and Shodan), delivered as embedded YAML, bringing the phase grand total to the DORK-02 target of 150.
## What Was Built
- **Censys (15):** Search 2.0 queries against `services.http.response.*`, `services.tls.certificates.*`, and `services.port` for Ollama (:11434), vLLM, LocalAI, Open WebUI, LM Studio, NVIDIA Triton, Hugging Face TGI, LiteLLM (:4000), Portkey, LangServe, FastChat, text-generation-webui, plus Azure OpenAI and AWS Bedrock certificate CNs and an OpenAI-compatible proxy body-content catch. 12 infrastructure + 3 frontier.
- **ZoomEye (10):** Mirrors the Censys surface using `app:`, `title:`, `service:`, and `port:` operators. 9 infrastructure + 1 frontier (OpenAI-proxy title match).
- **FOFA (10):** Native `title=`, `body=`, `port=`, `cert=` queries covering Ollama, vLLM, LocalAI, Open WebUI, LiteLLM, Triton, LangServe, TGI, plus two frontier catches (Azure OpenAI cert, OpenAI proxy leaking `api_key`). 8 infrastructure + 2 frontier.
- **GitLab (10):** Code-search dorks for committed `.env`, `.json`, and `.py` files across OpenAI, Anthropic, Google Generative AI, Groq, Cohere, Hugging Face, OpenRouter, Perplexity, DeepSeek, and Pinecone. Mix: frontier (3), specialized (3), emerging (3), infrastructure (1).
- **Bing (5):** Combines `site:pastebin.com`, `filetype:env`, and `intitle:`/`inbody:` operators to catch pasted OpenAI/Anthropic/HF keys, `.env` files, and exposed Ollama dashboards. 3 frontier + 1 specialized + 1 infrastructure.
## Success Criteria Status
- [x] `Registry.ListBySource("censys")` returns 15 (verified via `grep -c '^- id:'`)
- [x] `Registry.ListBySource("zoomeye")` returns 10
- [x] `Registry.ListBySource("fofa")` returns 10
- [x] `Registry.ListBySource("gitlab")` returns 10
- [x] `Registry.ListBySource("bing")` returns 5
- [x] Grand total across all 8 sources: **150** (github 50 + google 30 + shodan 20 + censys 15 + zoomeye 10 + fofa 10 + gitlab 10 + bing 5)
- [x] All files dual-located under `pkg/dorks/definitions/` and `dorks/`
- [x] `go test ./pkg/dorks/...` passes
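The per-source and per-category counts above can be pictured with a minimal registry sketch. The `Dork` and `Registry` types below are hypothetical stand-ins (the real schema and registry come from plan 08-01 and are not shown in this summary), `ListByCategory` is an assumed companion to `ListBySource`, and the IDs are placeholders that only reproduce the Bing category mix.

```go
package main

import "fmt"

// Dork is a hypothetical, trimmed-down record; the real schema comes from plan 08-01.
type Dork struct {
	ID       string
	Source   string
	Category string
}

// Registry is an assumed in-memory index, not the project's actual type.
type Registry struct {
	dorks []Dork
}

// ListBySource returns the dorks whose Source equals source.
func (r *Registry) ListBySource(source string) []Dork {
	var out []Dork
	for _, d := range r.dorks {
		if d.Source == source {
			out = append(out, d)
		}
	}
	return out
}

// ListByCategory returns the dorks whose Category equals category
// (assumed helper; only ListBySource is named in the summary).
func (r *Registry) ListByCategory(category string) []Dork {
	var out []Dork
	for _, d := range r.dorks {
		if d.Category == category {
			out = append(out, d)
		}
	}
	return out
}

// sampleRegistry fills the registry with placeholder entries matching the
// Bing mix reported above: 3 frontier + 1 specialized + 1 infrastructure.
func sampleRegistry() *Registry {
	mix := []struct {
		category string
		n        int
	}{{"frontier", 3}, {"specialized", 1}, {"infrastructure", 1}}

	r := &Registry{}
	for _, m := range mix {
		for i := 1; i <= m.n; i++ {
			r.dorks = append(r.dorks, Dork{
				ID:       fmt.Sprintf("bing-%s-%d", m.category, i), // placeholder ID
				Source:   "bing",
				Category: m.category,
			})
		}
	}
	return r
}

func main() {
	r := sampleRegistry()
	fmt.Println(len(r.ListBySource("bing")))       // 5
	fmt.Println(len(r.ListByCategory("frontier"))) // 3
}
```

A linear scan is enough at this volume (150 definitions in total); an index keyed by source would only matter for a much larger registry.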
## Decisions Made
1. **Single `all.yaml` per source:** per-category splits would create files of 1-3 entries for sources that top out at 5-15 dorks total. A single file keeps the tree flat and matches the volume.
2. **Cert fingerprints tagged frontier:** Azure OpenAI and Bedrock certificate CNs are infra-level indicators, but their target is a frontier vendor. The category field drives filtering, so they belong in `frontier` for operators running `dorks run --category frontier`.
3. **No executor stubs needed:** 08-01 already routes non-GitHub sources to `ErrSourceNotImplemented`, and live execution lands in Phases 9-16. These YAML files are pure definitions.
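Decision 3 relies on a sentinel error, which might look roughly like the sketch below. Only the name `ErrSourceNotImplemented` comes from this summary; its message, the `Execute` signature, and the `isDeferred` helper are assumptions made for illustration, not the code from plan 08-01.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSourceNotImplemented uses the sentinel name from the summary; the
// declaration and message here are assumptions.
var ErrSourceNotImplemented = errors.New("dork source not implemented")

// Execute is a hypothetical dispatcher in which only "github" is wired up.
// Every other source returns the sentinel, wrapped with the source name.
func Execute(source, query string) ([]string, error) {
	if source != "github" {
		return nil, fmt.Errorf("source %q: %w", source, ErrSourceNotImplemented)
	}
	// A real executor would run the query against the GitHub API here.
	return []string{}, nil
}

// isDeferred reports whether executing against source hits the sentinel.
func isDeferred(source string) bool {
	_, err := Execute(source, "placeholder query")
	return errors.Is(err, ErrSourceNotImplemented)
}

func main() {
	for _, s := range []string{"github", "censys", "zoomeye", "fofa", "gitlab", "bing"} {
		fmt.Printf("%s: deferred=%v\n", s, isDeferred(s))
	}
}
```

Wrapping with `%w` keeps the source name in the message while still letting callers match the sentinel with `errors.Is`, so new YAML definitions can ship without any per-source executor code.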
## Deviations from Plan
None — plan executed exactly as written. The loader already supported list-form YAML from plans 08-02/08-03, so no loader change was required.
## Commits
| Task | Description | Commit |
| ---- | ----------------------------------- | --------- |
| 1 | 15 Censys + 10 ZoomEye dorks | `1c86800` |
| 2 | 10 FOFA + 10 GitLab + 5 Bing dorks | `c504cbd` |
## Verification
```
$ go test ./pkg/dorks/...
ok github.com/salvacybersec/keyhunter/pkg/dorks
$ total=0; for s in github google shodan censys zoomeye fofa gitlab bing; do
    n=$(grep -rh '^- id:' pkg/dorks/definitions/$s/ | wc -l)
    echo "$s: $n"; total=$((total + n))
  done; echo "TOTAL: $total"
github: 50
google: 30
shodan: 20
censys: 15
zoomeye: 10
fofa: 10
gitlab: 10
bing: 5
TOTAL: 150
```
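The same count can be approximated in Go against any `fs.FS`, which is the interface an `embed.FS` satisfies. This is a sketch under stated assumptions: `countDorks` and `demoFS` are hypothetical names, the source name is taken from the directory holding each file (matching the flat `{source}/all.yaml` layout), only `.yaml` files are scanned (the `grep -r` above reads every file), and the in-memory tree holds placeholder entries rather than the real definitions.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io/fs"
	"path"
	"strings"
	"testing/fstest"
)

// countDorks walks root inside fsys and counts lines that start with "- id:",
// grouped by the directory that contains each YAML file.
func countDorks(fsys fs.FS, root string) (map[string]int, error) {
	counts := make(map[string]int)
	err := fs.WalkDir(fsys, root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || path.Ext(p) != ".yaml" {
			return nil
		}
		data, err := fs.ReadFile(fsys, p)
		if err != nil {
			return err
		}
		source := path.Base(path.Dir(p)) // e.g. "bing" for .../bing/all.yaml
		sc := bufio.NewScanner(bytes.NewReader(data))
		for sc.Scan() {
			if strings.HasPrefix(sc.Text(), "- id:") {
				counts[source]++
			}
		}
		return sc.Err()
	})
	return counts, err
}

// demoFS is an in-memory stand-in for the embedded definitions tree.
func demoFS() fstest.MapFS {
	return fstest.MapFS{
		"pkg/dorks/definitions/bing/all.yaml": {
			Data: []byte("- id: bing-a\n  query: x\n- id: bing-b\n  query: y\n"),
		},
		"pkg/dorks/definitions/fofa/all.yaml": {
			Data: []byte("- id: fofa-a\n  query: z\n"),
		},
	}
}

func main() {
	counts, err := countDorks(demoFS(), "pkg/dorks/definitions")
	if err != nil {
		panic(err)
	}
	fmt.Println("bing:", counts["bing"])
	fmt.Println("fofa:", counts["fofa"])
}
```

Like the `grep`, this counts top-level sequence entries textually; it does not validate that each entry parses as a complete dork.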
## Known Stubs
None. All 50 dorks are complete definitions; execution is intentionally deferred to OSINT phases 9-16 per 08-CONTEXT.md.
## Self-Check: PASSED
- FOUND: pkg/dorks/definitions/censys/all.yaml
- FOUND: pkg/dorks/definitions/zoomeye/all.yaml
- FOUND: pkg/dorks/definitions/fofa/all.yaml
- FOUND: pkg/dorks/definitions/gitlab/all.yaml
- FOUND: pkg/dorks/definitions/bing/all.yaml
- FOUND: dorks/censys/all.yaml
- FOUND: dorks/zoomeye/all.yaml
- FOUND: dorks/fofa/all.yaml
- FOUND: dorks/gitlab/all.yaml
- FOUND: dorks/bing/all.yaml
- FOUND commit: 1c86800
- FOUND commit: c504cbd
- TEST: go test ./pkg/dorks/... PASSED
- COUNT: grand total 150 (target >= 150)
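The FOUND entries confirm that both copies of each file exist. A stricter check, assuming the `dorks/` mirror is meant to be byte-identical to the embedded copy, could be sketched as follows. `mirrorDiffs` and `demoTree` are hypothetical names, and the in-memory tree is placeholder data that deliberately includes one drifted file and one missing mirror.

```go
package main

import (
	"bytes"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// mirrorDiffs returns the sources whose embedded copy and root mirror are
// missing or differ byte-for-byte.
func mirrorDiffs(fsys fs.FS, sources []string) []string {
	var bad []string
	for _, s := range sources {
		embedded, errA := fs.ReadFile(fsys, "pkg/dorks/definitions/"+s+"/all.yaml")
		mirror, errB := fs.ReadFile(fsys, "dorks/"+s+"/all.yaml")
		if errA != nil || errB != nil || !bytes.Equal(embedded, mirror) {
			bad = append(bad, s)
		}
	}
	return bad
}

// demoTree is an in-memory stand-in for the repository: censys is in sync,
// bing has drifted, and fofa lacks its root mirror.
func demoTree() fstest.MapFS {
	same := []byte("- id: placeholder\n")
	return fstest.MapFS{
		"pkg/dorks/definitions/censys/all.yaml": {Data: same},
		"dorks/censys/all.yaml":                 {Data: same},
		"pkg/dorks/definitions/bing/all.yaml":   {Data: same},
		"dorks/bing/all.yaml":                   {Data: []byte("- id: drifted\n")},
		"pkg/dorks/definitions/fofa/all.yaml":   {Data: same},
	}
}

func main() {
	bad := mirrorDiffs(demoTree(), []string{"censys", "bing", "fofa"})
	fmt.Println(bad) // [bing fofa]
}
```

Against the real repository the same function could be called with `os.DirFS(".")` from the repo root.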