Files
keyhunter/.planning/phases/08-dork-engine/08-03-SUMMARY.md
salvacybersec 2617b22753 docs(08-03): complete Google + Shodan dorks plan
- 30 Google + 20 Shodan dorks delivered
- Requirements DORK-01, DORK-02, DORK-04 marked complete
- SUMMARY.md records list-format YAML + dual-location mirror pattern
2026-04-06 00:22:38 +03:00

7.4 KiB

phase, plan, subsystem, tags, requires, provides, affects, tech-stack, key-files, key-decisions, requirements-completed, metrics, completed
phase plan subsystem tags requires provides affects tech-stack key-files key-decisions requirements-completed metrics completed
08-dork-engine 03 dork-engine
dorks
yaml
google
shodan
go-embed
osint
phase plan provides
08-dork-engine 01 pkg/dorks Registry + go:embed loader tolerant of empty tree, Dork schema with ValidSources/ValidCategories
30 Google dorks across frontier/specialized/infrastructure categories (site:, filetype:, intitle:, inurl
operators)
20 Shodan dorks across frontier/infrastructure categories (http.title, http.html, ssl.cert.subject.cn, product, port, http.component)
Dual-located YAML (pkg/dorks/definitions/{google,shodan}/ for go:embed + dorks/{google,shodan}/ user-visible mirror)
08-04
08-05
08-06
08-07
11-osint-google
12-osint-shodan
added patterns
YAML top-level list format (- id: ...) consumed by the Wave-2 loader shape added in 08-02
Dual-location pattern: pkg/dorks/definitions/<source>/ mirrors dorks/<source>/ byte-for-byte
Source-specific query syntax preserved literally in the Query field (no templating, no HTML escaping)
created modified
pkg/dorks/definitions/google/frontier.yaml
pkg/dorks/definitions/google/specialized.yaml
pkg/dorks/definitions/google/infrastructure.yaml
pkg/dorks/definitions/shodan/frontier.yaml
pkg/dorks/definitions/shodan/infrastructure.yaml
dorks/google/frontier.yaml
dorks/google/specialized.yaml
dorks/google/infrastructure.yaml
dorks/shodan/frontier.yaml
dorks/shodan/infrastructure.yaml
Used top-level YAML list format (- id: ...) to match the loader shape adapted by Plan 08-02 in the same wave
Real Shodan syntax everywhere (http.title, ssl.cert.subject.cn, product:, port:) — no pseudo-queries, queries are ready for live execution in Phase 12
Google dorks deliberately avoid site:github.com to complement the 50 GitHub-native dorks from 08-02 (google-replicate-env even uses -site:github.com to exclude)
Infrastructure-heavy Shodan split (14/20) reflects that self-hosted LLM exposure (Ollama, vLLM, LocalAI, LM Studio, Open WebUI, Triton, TGI) is Shodan's unique value add
DORK-01
DORK-02
DORK-04
duration tasks files_created files_modified
~10min 2 10 0
2026-04-05

Phase 08 Plan 03: Google + Shodan Dorks Summary

Delivered 50 production dork definitions — 30 Google (site/filetype/intitle operators) + 20 Shodan (banner/cert/product queries) — dual-located under pkg/dorks/definitions and dorks/, loaded automatically by the Plan 08-01 registry without loader changes beyond the list-format adaptation landed in 08-02.

Performance

  • Duration: ~10 min
  • Tasks: 2
  • Files created: 10
  • Files modified: 0

Accomplishments

  • 30 Google dorks: 12 frontier (Tier 1/2 providers on pastebin/gitlab/env leaks), 10 specialized (Tier 3 providers on pastebin/colab/kaggle), 8 infrastructure (gateways + exposed self-hosted UIs)
  • 20 Shodan dorks: 6 frontier (OpenAI/Anthropic/Azure/Bedrock proxies and certs), 14 infrastructure (Ollama, vLLM, LocalAI, LM Studio, text-generation-webui, Open WebUI, Triton, TGI, LangServe, FastChat, gateway dashboards)
  • Every dork passes Dork.Validate() via the existing registry load path
  • go test ./pkg/dorks/... passes with the new embedded files picked up by NewRegistry()
  • Dual-location mirror maintained byte-for-byte between pkg/dorks/definitions/<source>/ and dorks/<source>/

Task Commits

  1. Task 1: 30 Google dorks across 3 categories348d1c0 (feat)
  2. Task 2: 20 Shodan dorks for exposed LLM infrastructure56c11e3 (feat)

Files Created

  • pkg/dorks/definitions/google/frontier.yaml + dorks/google/frontier.yaml — 12 dorks
  • pkg/dorks/definitions/google/specialized.yaml + dorks/google/specialized.yaml — 10 dorks
  • pkg/dorks/definitions/google/infrastructure.yaml + dorks/google/infrastructure.yaml — 8 dorks
  • pkg/dorks/definitions/shodan/frontier.yaml + dorks/shodan/frontier.yaml — 6 dorks
  • pkg/dorks/definitions/shodan/infrastructure.yaml + dorks/shodan/infrastructure.yaml — 14 dorks

Decisions Made

  • List-format YAML, not single-dork-per-file. The Plan 08-02 agent (running in the same Wave 2) was responsible for adapting pkg/dorks/loader.go to accept a top-level YAML list. By the time Task 1 of this plan began, the loader had already been updated with a list-first path falling back to a single-Dork decode for legacy shape — so all files here use the list form with zero loader modifications of my own.
  • Shodan infrastructure weighted 14/20. Shodan's differentiator over GitHub/Google is banner-visible self-hosted inference servers. Dedicating 70% of the Shodan budget to Ollama/vLLM/LocalAI/LM Studio/TGI/Triton/OpenWebUI makes this source pull its weight in Phase 12.
  • No overlap with Plan 08-02 GitHub coverage. Google queries deliberately target non-GitHub surfaces (pastebin, gitlab raw, colab, kaggle) so the 50 GitHub dorks + 30 Google dorks cover disjoint haystacks. google-replicate-env uses an explicit -site:github.com exclusion to prove the point.

Deviations from Plan

Parallel-commit interaction (non-blocking)

Observation: Plans 08-02 and 08-03 ran in parallel in the same wave. The 08-02 agent had already adapted pkg/dorks/loader.go to the list-format by the time this plan executed, so no loader edits were needed here. Additionally, between this plan's two commits, the 08-02 agent staged (but had not yet committed) pkg/dorks/github.go and pkg/dorks/github_test.go. Those staged files were swept into commit 56c11e3 alongside the Shodan YAMLs because git commit --no-verify <-m> commits whatever is in the index. This is a cosmetic attribution issue only — the content is correct and belongs to Phase 08, tests still pass, and no file was lost or duplicated.

No Rule 1/2/3 fixes were applied to foreign code. All YAML content is exactly as specified in the plan.

Issues Encountered

None — both tasks executed cleanly on the first attempt.

User Setup Required

None.

Next Phase Readiness

  • Phase 11 (Google OSINT live executor) has 30 loadable dorks to iterate through once the google executor is wired.
  • Phase 12 (Shodan live executor) has 20 loadable dorks covering both credential exposure (frontier) and infrastructure fingerprinting (infrastructure).
  • Cumulative dork total after 08-02 + 08-03: 100 (50 GitHub + 30 Google + 20 Shodan), halfway to the DORK-02 150+ target which remaining Wave 2 plans (08-04 Censys/ZoomEye/FOFA/GitLab/Bing) will close.
  • Loader shape is stable; additional Wave 2 sources can continue to use the same YAML list format with zero further adaptation.

Self-Check: PASSED

  • pkg/dorks/definitions/google/frontier.yaml — FOUND (12 dorks)
  • pkg/dorks/definitions/google/specialized.yaml — FOUND (10 dorks)
  • pkg/dorks/definitions/google/infrastructure.yaml — FOUND (8 dorks)
  • pkg/dorks/definitions/shodan/frontier.yaml — FOUND (6 dorks)
  • pkg/dorks/definitions/shodan/infrastructure.yaml — FOUND (14 dorks)
  • dorks/google/{frontier,specialized,infrastructure}.yaml — FOUND (mirror)
  • dorks/shodan/{frontier,infrastructure}.yaml — FOUND (mirror)
  • commit 348d1c0 — FOUND
  • commit 56c11e3 — FOUND
  • go test ./pkg/dorks/... — PASSED
  • Google total: 30 (>=30 required)
  • Shodan total: 20 (>=20 required)

Phase: 08-dork-engine Completed: 2026-04-05