keyhunter/.planning/phases/11-osint_search_paste/11-01-SUMMARY.md
salvacybersec 61a9d527ee docs(11-01): complete search engine dorking sources plan
- SUMMARY.md for 5 search engine sources (Google, Bing, DuckDuckGo, Yandex, Brave)
- STATE.md updated with position and decisions
- Requirements RECON-DORK-01/02/03 marked complete
2026-04-06 11:55:46 +03:00

phase: 11-osint-search-paste
plan: 01
subsystem: recon
tags: google-custom-search, bing-web-search, duckduckgo, yandex-xml, brave-search, dorking, osint
requires:
  • phase: 10-osint-code-hosting
    provides: ReconSource interface, sources.Client, LimiterRegistry, BuildQueries/formatQuery
provides:
  • GoogleDorkSource - Google Custom Search JSON API dorking
  • BingDorkSource - Bing Web Search API v7 dorking
  • DuckDuckGoSource - HTML scraping (credential-free)
  • YandexSource - Yandex XML Search API dorking
  • BraveSource - Brave Search API dorking
  • formatQuery cases for all five search engines
affects:
  • 11-osint-search-paste
  • 11-03 RegisterAll wiring
tech-stack (added, patterns):
  • encoding/xml for Yandex XML parsing
  • search-engine dork query format via formatQuery
  • XML API response parsing
key-files (created, modified):
  • pkg/recon/sources/google.go
  • pkg/recon/sources/google_test.go
  • pkg/recon/sources/bing.go
  • pkg/recon/sources/bing_test.go
  • pkg/recon/sources/duckduckgo.go
  • pkg/recon/sources/duckduckgo_test.go
  • pkg/recon/sources/yandex.go
  • pkg/recon/sources/yandex_test.go
  • pkg/recon/sources/brave.go
  • pkg/recon/sources/brave_test.go
  • pkg/recon/sources/queries.go
key-decisions:
  • All five search sources use dork query format: site:pastebin.com OR site:github.com "keyword" to focus on paste/code hosting leak sites
  • DuckDuckGo is credential-free (HTML scraping) with RespectsRobots=true; other four require API keys
  • Yandex uses encoding/xml for XML response parsing; all others use encoding/json
  • extractGoogleKeyword reverse-parser shared by Bing/Yandex/Brave for keyword-to-provider mapping
patterns-established:
  • Search engine dork sources: same Sweep loop pattern as Phase 10 code hosting sources
  • XML API sources: encoding/xml with nested struct unmarshaling (Yandex)
requirements-completed: RECON-DORK-01, RECON-DORK-02, RECON-DORK-03
duration: 3min
completed: 2026-04-06

Phase 11 Plan 01: Search Engine Dorking Sources Summary

Five search engine dorking ReconSource implementations (Google, Bing, DuckDuckGo, Yandex, Brave) with dork-style queries targeting paste/code hosting sites

Performance

  • Duration: 3 min
  • Started: 2026-04-06T08:51:30Z
  • Completed: 2026-04-06T08:54:52Z
  • Tasks: 2
  • Files modified: 11

Accomplishments

  • GoogleDorkSource and BingDorkSource with JSON API integration and httptest-based tests
  • DuckDuckGoSource with HTML scraping (credential-free, RespectsRobots=true)
  • YandexSource with XML Search API and encoding/xml response parsing
  • BraveSource with Brave Search API and X-Subscription-Token auth
  • formatQuery updated with dork syntax for all five search engines
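
The header-based auth and httptest-style testing mentioned above can be illustrated with a minimal sketch. It is not the code in brave.go: the `searchResponse` shape, `fetchURLs`, and `newStubServer` are hypothetical names chosen for illustration, and the real Brave Search API payload is richer than the single `results[].url` field assumed here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// searchResponse is a hypothetical, simplified response shape used only
// for this sketch; it is not the actual Brave Search API schema.
type searchResponse struct {
	Results []struct {
		URL string `json:"url"`
	} `json:"results"`
}

// fetchURLs sends one query with the token in the X-Subscription-Token
// header and returns the result URLs decoded from the JSON body.
func fetchURLs(client *http.Client, baseURL, token, query string) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL, nil)
	if err != nil {
		return nil, err
	}
	q := req.URL.Query()
	q.Set("q", query)
	req.URL.RawQuery = q.Encode()
	req.Header.Set("X-Subscription-Token", token)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}

	var parsed searchResponse
	if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(parsed.Results))
	for _, r := range parsed.Results {
		urls = append(urls, r.URL)
	}
	return urls, nil
}

// newStubServer returns a local test server that answers 401 unless the
// request carries the expected token, mimicking an "unauth" test case.
func newStubServer(wantToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Subscription-Token") != wantToken {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"results":[{"url":"https://pastebin.com/raw/abc"}]}`)
	}))
}

func main() {
	srv := newStubServer("test-token")
	defer srv.Close()

	urls, err := fetchURLs(srv.Client(), srv.URL, "test-token", `"sk-proj-"`)
	fmt.Println(urls, err)
}
```

Pointing the client at a stub server keeps such tests offline and lets the unauthorized path be exercised simply by passing a wrong token.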

Task Commits

Each task was committed atomically:

  1. Task 1: GoogleDorkSource + BingDorkSource + formatQuery updates - 7272e65 (feat)
  2. Task 2: DuckDuckGoSource + YandexSource + BraveSource - 7707053 (feat)

Files Created/Modified

  • pkg/recon/sources/google.go - Google Custom Search JSON API source (APIKey + CX required)
  • pkg/recon/sources/google_test.go - Google source tests (enabled, sweep, cancel, unauth)
  • pkg/recon/sources/bing.go - Bing Web Search API v7 source (Ocp-Apim-Subscription-Key)
  • pkg/recon/sources/bing_test.go - Bing source tests
  • pkg/recon/sources/duckduckgo.go - DuckDuckGo HTML scraper (no API key, always enabled)
  • pkg/recon/sources/duckduckgo_test.go - DuckDuckGo tests including empty registry
  • pkg/recon/sources/yandex.go - Yandex XML Search API (user + key required, XML parsing)
  • pkg/recon/sources/yandex_test.go - Yandex tests
  • pkg/recon/sources/brave.go - Brave Search API (X-Subscription-Token)
  • pkg/recon/sources/brave_test.go - Brave tests
  • pkg/recon/sources/queries.go - Added google/bing/duckduckgo/yandex/brave formatQuery cases
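
As a rough illustration of what the search-engine formatQuery cases and the keyword reverse-parser might look like, here is a minimal sketch. `formatDorkQuery` and `extractKeyword` are hypothetical stand-ins, not the functions in queries.go; only the dork shape `site:pastebin.com OR site:github.com "keyword"` comes from this summary.

```go
package main

import (
	"fmt"
	"strings"
)

// dorkSites restricts results to the paste/code hosting sites named in
// the summary's dork format.
const dorkSites = "site:pastebin.com OR site:github.com"

// formatDorkQuery is a hypothetical stand-in for the search-engine cases
// of formatQuery: the five engines get the dork form, anything else gets
// the bare keyword.
func formatDorkQuery(source, keyword string) string {
	switch source {
	case "google", "bing", "duckduckgo", "yandex", "brave":
		return dorkSites + ` "` + keyword + `"`
	default:
		return keyword
	}
}

// extractKeyword reverses formatDorkQuery by returning the text between
// the first and last double quote. Queries without a quoted part are
// returned unchanged.
func extractKeyword(query string) string {
	start := strings.Index(query, `"`)
	end := strings.LastIndex(query, `"`)
	if start == -1 || end <= start {
		return query
	}
	return query[start+1 : end]
}

func main() {
	q := formatDorkQuery("google", "sk-proj-")
	fmt.Println(q)
	fmt.Println(extractKeyword(q))
}
```

Because every engine shares one query shape in this sketch, a single reverse-parser can recover the keyword for all of them, which is consistent with the summary's note that the parser is reused across sources.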

Decisions Made

  • All five search sources use dork query format site:pastebin.com OR site:github.com "keyword" to focus results on leak-likely sites
  • DuckDuckGo is the only credential-free source; uses HTML scraping with extractAnchorHrefs (shared with Replit)
  • Yandex requires encoding/xml for its XML Search API response format
  • extractGoogleKeyword reverse-parser reused across Bing/Yandex/Brave for keyword-to-provider name mapping
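
The nested struct unmarshaling used for Yandex can be sketched as follows. This is an assumption-laden example rather than the code in yandex.go: the `yandexResponse` type, the `parseURLs` helper, and the trimmed sample document are illustrative, and the element path is a simplified guess at the XML Search response layout.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// yandexResponse models only the elements this sketch needs. The path
// response>results>grouping>group is an assumed, simplified layout.
type yandexResponse struct {
	XMLName xml.Name `xml:"yandexsearch"`
	Groups  []struct {
		Docs []struct {
			URL string `xml:"url"`
		} `xml:"doc"`
	} `xml:"response>results>grouping>group"`
}

// parseURLs unmarshals the XML body and flattens every doc URL from
// every group into one slice.
func parseURLs(body []byte) ([]string, error) {
	var parsed yandexResponse
	if err := xml.Unmarshal(body, &parsed); err != nil {
		return nil, err
	}
	var urls []string
	for _, g := range parsed.Groups {
		for _, d := range g.Docs {
			urls = append(urls, d.URL)
		}
	}
	return urls, nil
}

// sample is a hand-written document for illustration, not a captured
// API response.
const sample = `<?xml version="1.0" encoding="utf-8"?>
<yandexsearch version="1.0">
  <response>
    <results>
      <grouping>
        <group><doc><url>https://pastebin.com/raw/abc</url></doc></group>
        <group><doc><url>https://github.com/example/repo</url></doc></group>
      </grouping>
    </results>
  </response>
</yandexsearch>`

func main() {
	urls, err := parseURLs([]byte(sample))
	fmt.Println(urls, err)
}
```

The `a>b>c` struct tag lets encoding/xml walk the nesting without a named Go type per level, which keeps the parsing code close in size to the JSON sources.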

Deviations from Plan

None - plan executed exactly as written.

Issues Encountered

None.

User Setup Required

None - no external service configuration required.

Next Phase Readiness

  • All five search engine sources ready for RegisterAll wiring in Plan 11-03
  • Each source follows established ReconSource pattern for seamless engine integration

Phase: 11-osint-search-paste
Completed: 2026-04-06