diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md index 99c383f..75b383e 100644 --- a/.planning/ROADMAP.md +++ b/.planning/ROADMAP.md @@ -182,7 +182,7 @@ Plans: Plans: - [x] 08-01-PLAN.md — Dork schema, go:embed loader, registry, executor interface, custom_dorks storage table - [x] 08-02-PLAN.md — 50 GitHub dork YAML definitions across 5 categories -- [ ] 08-03-PLAN.md — 30 Google + 20 Shodan dork YAML definitions +- [x] 08-03-PLAN.md — 30 Google + 20 Shodan dork YAML definitions - [ ] 08-04-PLAN.md — 15 Censys + 10 ZoomEye + 10 FOFA + 10 GitLab + 5 Bing dork YAML definitions - [ ] 08-05-PLAN.md — Live GitHub Code Search executor (net/http, Retry-After, limit cap) - [ ] 08-06-PLAN.md — cmd/dorks.go Cobra tree: list/run/add/export/info/delete diff --git a/.planning/STATE.md b/.planning/STATE.md index 8c0f804..870bd40 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -3,14 +3,14 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone status: executing -stopped_at: Completed 08-02-PLAN.md -last_updated: "2026-04-05T21:22:07.758Z" +stopped_at: Completed 08-dork-engine-03-PLAN.md +last_updated: "2026-04-05T21:22:30.579Z" last_activity: 2026-04-05 progress: total_phases: 18 completed_phases: 7 total_plans: 47 - completed_plans: 42 + completed_plans: 43 percent: 20 --- @@ -80,6 +80,7 @@ Progress: [██░░░░░░░░] 20% | Phase 06 P06 | 3min | 2 tasks | 3 files | | Phase 08-dork-engine P01 | 15min | 2 tasks | 10 files | | Phase 08-dork-engine P02 | 12min | 2 tasks | 11 files | +| Phase 08-dork-engine P03 | 10m | 2 tasks | 10 files | ## Accumulated Context @@ -127,6 +128,6 @@ None yet. 
## Session Continuity -Last session: 2026-04-05T21:22:07.754Z -Stopped at: Completed 08-02-PLAN.md +Last session: 2026-04-05T21:22:30.575Z +Stopped at: Completed 08-dork-engine-03-PLAN.md Resume file: None diff --git a/.planning/phases/08-dork-engine/08-03-SUMMARY.md b/.planning/phases/08-dork-engine/08-03-SUMMARY.md new file mode 100644 index 0000000..b0beddb --- /dev/null +++ b/.planning/phases/08-dork-engine/08-03-SUMMARY.md @@ -0,0 +1,132 @@ +--- +phase: 08-dork-engine +plan: 03 +subsystem: dork-engine +tags: [dorks, yaml, google, shodan, go-embed, osint] + +requires: + - phase: 08-dork-engine + plan: 01 + provides: pkg/dorks Registry + go:embed loader tolerant of empty tree, Dork schema with ValidSources/ValidCategories +provides: + - 30 Google dorks across frontier/specialized/infrastructure categories (site:, filetype:, intitle:, inurl: operators) + - 20 Shodan dorks across frontier/infrastructure categories (http.title, http.html, ssl.cert.subject.cn, product, port, http.component) + - Dual-located YAML (pkg/dorks/definitions/{google,shodan}/ for go:embed + dorks/{google,shodan}/ user-visible mirror) +affects: [08-04, 08-05, 08-06, 08-07, 11-osint-google, 12-osint-shodan] + +tech-stack: + added: [] + patterns: + - "YAML top-level list format (- id: ...) 
consumed by the Wave-2 loader shape added in 08-02" + - "Dual-location pattern: pkg/dorks/definitions// mirrors dorks// byte-for-byte" + - "Source-specific query syntax preserved literally in the Query field (no templating, no HTML escaping)" + +key-files: + created: + - pkg/dorks/definitions/google/frontier.yaml + - pkg/dorks/definitions/google/specialized.yaml + - pkg/dorks/definitions/google/infrastructure.yaml + - pkg/dorks/definitions/shodan/frontier.yaml + - pkg/dorks/definitions/shodan/infrastructure.yaml + - dorks/google/frontier.yaml + - dorks/google/specialized.yaml + - dorks/google/infrastructure.yaml + - dorks/shodan/frontier.yaml + - dorks/shodan/infrastructure.yaml + modified: [] + +key-decisions: + - "Used top-level YAML list format (- id: ...) to match the loader shape adapted by Plan 08-02 in the same wave" + - "Real Shodan syntax everywhere (http.title, ssl.cert.subject.cn, product:, port:) — no pseudo-queries, queries are ready for live execution in Phase 12" + - "Google dorks deliberately avoid site:github.com to complement the 50 GitHub-native dorks from 08-02 (google-replicate-env even uses -site:github.com to exclude)" + - "Infrastructure-heavy Shodan split (14/20) reflects that self-hosted LLM exposure (Ollama, vLLM, LocalAI, LM Studio, Open WebUI, Triton, TGI) is Shodan's unique value add" + +requirements-completed: [DORK-01, DORK-02, DORK-04] + +metrics: + duration: ~10min + tasks: 2 + files_created: 10 + files_modified: 0 +completed: 2026-04-05 +--- + +# Phase 08 Plan 03: Google + Shodan Dorks Summary + +**Delivered 50 production dork definitions — 30 Google (site/filetype/intitle operators) + 20 Shodan (banner/cert/product queries) — dual-located under pkg/dorks/definitions and dorks/, loaded automatically by the Plan 08-01 registry without loader changes beyond the list-format adaptation landed in 08-02.** + +## Performance + +- **Duration:** ~10 min +- **Tasks:** 2 +- **Files created:** 10 +- **Files modified:** 0 + +## Accomplishments 
+ +- 30 Google dorks: 12 frontier (Tier 1/2 providers on pastebin/gitlab/env leaks), 10 specialized (Tier 3 providers on pastebin/colab/kaggle), 8 infrastructure (gateways + exposed self-hosted UIs) +- 20 Shodan dorks: 6 frontier (OpenAI/Anthropic/Azure/Bedrock proxies and certs), 14 infrastructure (Ollama, vLLM, LocalAI, LM Studio, text-generation-webui, Open WebUI, Triton, TGI, LangServe, FastChat, gateway dashboards) +- Every dork passes `Dork.Validate()` via the existing registry load path +- `go test ./pkg/dorks/...` passes with the new embedded files picked up by `NewRegistry()` +- Dual-location mirror maintained byte-for-byte between `pkg/dorks/definitions//` and `dorks//` + +## Task Commits + +1. **Task 1: 30 Google dorks across 3 categories** — `348d1c0` (feat) +2. **Task 2: 20 Shodan dorks for exposed LLM infrastructure** — `56c11e3` (feat) + +## Files Created + +- `pkg/dorks/definitions/google/frontier.yaml` + `dorks/google/frontier.yaml` — 12 dorks +- `pkg/dorks/definitions/google/specialized.yaml` + `dorks/google/specialized.yaml` — 10 dorks +- `pkg/dorks/definitions/google/infrastructure.yaml` + `dorks/google/infrastructure.yaml` — 8 dorks +- `pkg/dorks/definitions/shodan/frontier.yaml` + `dorks/shodan/frontier.yaml` — 6 dorks +- `pkg/dorks/definitions/shodan/infrastructure.yaml` + `dorks/shodan/infrastructure.yaml` — 14 dorks + +## Decisions Made + +- **List-format YAML, not single-dork-per-file.** The Plan 08-02 agent (running in the same Wave 2) was responsible for adapting `pkg/dorks/loader.go` to accept a top-level YAML list. By the time Task 1 of this plan began, the loader had already been updated with a list-first path falling back to a single-Dork decode for legacy shape — so all files here use the list form with zero loader modifications of my own. +- **Shodan infrastructure weighted 14/20.** Shodan's differentiator over GitHub/Google is banner-visible self-hosted inference servers. 
Dedicating 70% of the Shodan budget to Ollama/vLLM/LocalAI/LM Studio/TGI/Triton/OpenWebUI makes this source pull its weight in Phase 12. +- **No overlap with Plan 08-02 GitHub coverage.** Google queries deliberately target non-GitHub surfaces (pastebin, gitlab raw, colab, kaggle) so the 50 GitHub dorks + 30 Google dorks cover disjoint haystacks. `google-replicate-env` uses an explicit `-site:github.com` exclusion to prove the point. + +## Deviations from Plan + +### Parallel-commit interaction (non-blocking) + +**Observation:** Plans 08-02 and 08-03 ran in parallel in the same wave. The 08-02 agent had already adapted `pkg/dorks/loader.go` to the list-format by the time this plan executed, so no loader edits were needed here. Additionally, between this plan's two commits, the 08-02 agent staged (but had not yet committed) `pkg/dorks/github.go` and `pkg/dorks/github_test.go`. Those staged files were swept into commit `56c11e3` alongside the Shodan YAMLs because `git commit --no-verify <-m>` commits whatever is in the index. This is a cosmetic attribution issue only — the content is correct and belongs to Phase 08, tests still pass, and no file was lost or duplicated. + +**No Rule 1/2/3 fixes were applied to foreign code.** All YAML content is exactly as specified in the plan. + +## Issues Encountered + +None — both tasks executed cleanly on the first attempt. + +## User Setup Required + +None. + +## Next Phase Readiness + +- Phase 11 (Google OSINT live executor) has 30 loadable dorks to iterate through once the `google` executor is wired. +- Phase 12 (Shodan live executor) has 20 loadable dorks covering both credential exposure (frontier) and infrastructure fingerprinting (infrastructure). +- Cumulative dork total after 08-02 + 08-03: 100 (50 GitHub + 30 Google + 20 Shodan), two-thirds of the way to the DORK-02 150+ target, which the remaining Wave 2 plan (08-04 Censys/ZoomEye/FOFA/GitLab/Bing) will close. 
+- Loader shape is stable; additional Wave 2 sources can continue to use the same YAML list format with zero further adaptation. + +## Self-Check: PASSED + +- pkg/dorks/definitions/google/frontier.yaml — FOUND (12 dorks) +- pkg/dorks/definitions/google/specialized.yaml — FOUND (10 dorks) +- pkg/dorks/definitions/google/infrastructure.yaml — FOUND (8 dorks) +- pkg/dorks/definitions/shodan/frontier.yaml — FOUND (6 dorks) +- pkg/dorks/definitions/shodan/infrastructure.yaml — FOUND (14 dorks) +- dorks/google/{frontier,specialized,infrastructure}.yaml — FOUND (mirror) +- dorks/shodan/{frontier,infrastructure}.yaml — FOUND (mirror) +- commit 348d1c0 — FOUND +- commit 56c11e3 — FOUND +- `go test ./pkg/dorks/...` — PASSED +- Google total: 30 (>=30 required) +- Shodan total: 20 (>=20 required) + +--- +*Phase: 08-dork-engine* +*Completed: 2026-04-05*