---
phase: 11-osint-search-paste
plan: 01
subsystem: recon
tags: [google-custom-search, bing-web-search, duckduckgo, yandex-xml, brave-search, dorking, osint]

requires:
  - phase: 10-osint-code-hosting
    provides: "ReconSource interface, sources.Client, LimiterRegistry, BuildQueries/formatQuery"
provides:
  - "GoogleDorkSource - Google Custom Search JSON API dorking"
  - "BingDorkSource - Bing Web Search API v7 dorking"
  - "DuckDuckGoSource - HTML scraping (credential-free)"
  - "YandexSource - Yandex XML Search API dorking"
  - "BraveSource - Brave Search API dorking"
  - "formatQuery cases for all five search engines"
affects: [11-osint-search-paste, 11-03 RegisterAll wiring]

tech-stack:
  added: [encoding/xml for Yandex XML parsing]
  patterns: [search-engine dork query format via formatQuery, XML API response parsing]

key-files:
  created:
    - pkg/recon/sources/google.go
    - pkg/recon/sources/google_test.go
    - pkg/recon/sources/bing.go
    - pkg/recon/sources/bing_test.go
    - pkg/recon/sources/duckduckgo.go
    - pkg/recon/sources/duckduckgo_test.go
    - pkg/recon/sources/yandex.go
    - pkg/recon/sources/yandex_test.go
    - pkg/recon/sources/brave.go
    - pkg/recon/sources/brave_test.go
  modified:
    - pkg/recon/sources/queries.go

key-decisions:
  - "All five search sources use dork query format: site:pastebin.com OR site:github.com \"keyword\" to focus on paste/code hosting leak sites"
  - "DuckDuckGo is credential-free (HTML scraping) with RespectsRobots=true; other four require API keys"
  - "Yandex uses encoding/xml for XML response parsing; all others use encoding/json"
  - "extractGoogleKeyword reverse-parser shared by Bing/Yandex/Brave for keyword-to-provider mapping"

patterns-established:
  - "Search engine dork sources: same Sweep loop pattern as Phase 10 code hosting sources"
  - "XML API sources: encoding/xml with nested struct unmarshaling (Yandex)"

requirements-completed: [RECON-DORK-01, RECON-DORK-02, RECON-DORK-03]

duration: 3min
completed: 2026-04-06
---

# Phase 11 Plan 01: Search Engine Dorking Sources Summary

**Five search engine dorking ReconSource implementations (Google, Bing, DuckDuckGo, Yandex, Brave) with dork-style queries targeting paste/code hosting sites**

## Performance

- **Duration:** 3 min
- **Started:** 2026-04-06T08:51:30Z
- **Completed:** 2026-04-06T08:54:52Z
- **Tasks:** 2
- **Files modified:** 11

## Accomplishments
- GoogleDorkSource and BingDorkSource with JSON API integration and httptest-based tests
- DuckDuckGoSource with HTML scraping (credential-free, RespectsRobots=true)
- YandexSource with XML Search API and encoding/xml response parsing
- BraveSource with Brave Search API and X-Subscription-Token auth
- formatQuery updated with dork syntax for all five search engines

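The XML parsing mentioned for YandexSource can be sketched roughly as follows. This is a minimal illustration, not the code in `yandex.go`: the type `yandexResponse`, the helper `parseYandexURLs`, and the element layout (`response > results > grouping > group > doc > url`) are assumptions made for the example and should be checked against the real response format.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// yandexResponse is a hypothetical shape for the search response.
// The element path below is an assumption, not taken from yandex.go.
type yandexResponse struct {
	Groups []struct {
		Docs []struct {
			URL string `xml:"url"`
		} `xml:"doc"`
	} `xml:"response>results>grouping>group"`
}

// parseYandexURLs (hypothetical helper) unmarshals the XML body and
// flattens every <doc><url> value into a single slice.
func parseYandexURLs(body []byte) ([]string, error) {
	var r yandexResponse
	if err := xml.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	var urls []string
	for _, g := range r.Groups {
		for _, d := range g.Docs {
			urls = append(urls, d.URL)
		}
	}
	return urls, nil
}

// sampleYandexXML is made-up test data, not a captured API response.
const sampleYandexXML = `<?xml version="1.0" encoding="utf-8"?>
<yandexsearch version="1.0">
  <response>
    <results>
      <grouping>
        <group><doc><url>https://pastebin.com/abc</url></doc></group>
        <group><doc><url>https://github.com/x/y</url></doc></group>
      </grouping>
    </results>
  </response>
</yandexsearch>`

func main() {
	urls, err := parseYandexURLs([]byte(sampleYandexXML))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

Because the struct has no `XMLName` field, `encoding/xml` does not check the root element's name, so the sketch depends only on the nested path.
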
## Task Commits

Each task was committed atomically:

1. **Task 1: GoogleDorkSource + BingDorkSource + formatQuery updates** - `7272e65` (feat)
2. **Task 2: DuckDuckGoSource + YandexSource + BraveSource** - `7707053` (feat)

## Files Created/Modified
- `pkg/recon/sources/google.go` - Google Custom Search JSON API source (APIKey + CX required)
- `pkg/recon/sources/google_test.go` - Google source tests (enabled, sweep, cancel, unauth)
- `pkg/recon/sources/bing.go` - Bing Web Search API v7 source (Ocp-Apim-Subscription-Key)
- `pkg/recon/sources/bing_test.go` - Bing source tests
- `pkg/recon/sources/duckduckgo.go` - DuckDuckGo HTML scraper (no API key, always enabled)
- `pkg/recon/sources/duckduckgo_test.go` - DuckDuckGo tests including empty registry
- `pkg/recon/sources/yandex.go` - Yandex XML Search API (user + key required, XML parsing)
- `pkg/recon/sources/yandex_test.go` - Yandex tests
- `pkg/recon/sources/brave.go` - Brave Search API (X-Subscription-Token)
- `pkg/recon/sources/brave_test.go` - Brave tests
- `pkg/recon/sources/queries.go` - Added google/bing/duckduckgo/yandex/brave formatQuery cases

## Decisions Made
- All five search sources use dork query format `site:pastebin.com OR site:github.com "keyword"` to focus results on leak-likely sites
- DuckDuckGo is the only credential-free source; uses HTML scraping with extractAnchorHrefs (shared with Replit)
- Yandex requires encoding/xml for its XML Search API response format
- extractGoogleKeyword reverse-parser reused across Bing/Yandex/Brave for keyword-to-provider name mapping

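The dork format and the reverse-parsing step described above can be sketched as a pair of functions. This is an illustrative approximation only: `buildDorkQuery`, `extractDorkKeyword`, and the `dorkSites` list are hypothetical stand-ins, not the actual `formatQuery` or `extractGoogleKeyword` code in `queries.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// dorkSites is an illustrative list based on the example in this summary;
// the real list may differ.
var dorkSites = []string{"pastebin.com", "github.com"}

// buildDorkQuery (hypothetical) joins site: filters with OR and appends
// the keyword in double quotes for exact-phrase matching.
func buildDorkQuery(keyword string) string {
	filters := make([]string, 0, len(dorkSites))
	for _, s := range dorkSites {
		filters = append(filters, "site:"+s)
	}
	return strings.Join(filters, " OR ") + ` "` + keyword + `"`
}

// extractDorkKeyword (hypothetical) reverses buildDorkQuery by returning
// the text inside the last pair of double quotes. If the query has no
// quoted segment, it is returned unchanged.
func extractDorkKeyword(query string) string {
	end := strings.LastIndex(query, `"`)
	if end <= 0 {
		return query
	}
	start := strings.LastIndex(query[:end], `"`)
	if start < 0 {
		return query
	}
	return query[start+1 : end]
}

func main() {
	q := buildDorkQuery("sk-proj-")
	fmt.Println(q)
	fmt.Println(extractDorkKeyword(q))
}
```

Recovering the keyword from the formatted query lets a source map each search result back to the provider that keyword belongs to without carrying extra state through the sweep loop.
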
## Deviations from Plan

None - plan executed exactly as written.

## Issues Encountered

None.

## User Setup Required

None - no external service configuration required.

## Next Phase Readiness
- All five search engine sources ready for RegisterAll wiring in Plan 11-03
- Each source follows established ReconSource pattern for seamless engine integration

---

*Phase: 11-osint-search-paste*
*Completed: 2026-04-06*