- FOFASource searches FOFA API with base64-encoded queries (email+key auth)
- NetlasSource searches Netlas API with X-API-Key header auth
- BinaryEdgeSource searches BinaryEdge API with X-Key header auth
- All three implement recon.ReconSource with shared Client retry/backoff
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extend httptest mux with fixtures for Google, Bing, DuckDuckGo, Yandex, Brave
- Add Pastebin (routed /pb/), GistPaste (/gp/), PasteSites (injected platform)
- Assert all 18 SourceTypes emit at least one finding via SweepAll
- DuckDuckGoSource scrapes HTML search (no API key, always enabled, RespectsRobots=true)
- YandexSource uses Yandex XML Search API (user+key required, XML response parsing)
- BraveSource uses Brave Search API (X-Subscription-Token header, JSON response)
- All three follow established error handling: 401 aborts, transient continues, ctx cancellation returns
- GoogleDorkSource uses Google Custom Search JSON API (APIKey+CX required)
- BingDorkSource uses Bing Web Search API v7 (Ocp-Apim-Subscription-Key header)
- formatQuery now handles google/bing/duckduckgo/yandex/brave dork syntax
- Both sources follow established pattern: retry via Client, rate limit via LimiterRegistry
- PastebinSource: two-phase search+raw-fetch with keyword matching
- GistPasteSource: scrapes gist.github.com public search (no auth)
- Both implement recon.ReconSource with httptest-based tests
Closes 2 verification gaps:
1. --sources=github,gitlab flag filters registered sources before sweep
2. Findings persisted to SQLite via storage.SaveFinding after dedup
Also adds Engine.Get() method for source lookup by name.
- queries /api/spaces and /api/models via Hub API
- token optional: slower rate when absent (10s vs 3.6s)
- emits Findings with SourceType=recon:huggingface and prefixed Source URLs
- compile-time assert implements recon.ReconSource
- BuildQueries(reg, source) dedups keywords and formats per-source syntax
- github/gist use 'keyword' in:file; others use bare keyword
- SourcesConfig placeholder struct for Wave 2 plans to depend on
- RegisterAll no-op stub (Plan 10-09 will fill)
- Dedup drops duplicates keyed by sha256(ProviderName|KeyMasked|Source)
- Preserves input order and first-seen metadata (stable dedup)
- Same provider+masked with different Source URLs are kept separate
- Uses engine.Finding directly to avoid alias collision with Plan 09-01
- Engine.Register/List/SweepAll with ants pool fanout
- ExampleSource emits two deterministic findings (SourceType=recon:example)
- Tests cover Register/List idempotency, SweepAll aggregation, empty-registry,
and Enabled() filtering
- Parses robots.txt via temoto/robotstxt
- Caches per host for 1 hour; second call within TTL skips HTTP fetch
- Default-allow on network/parse/4xx/5xx errors
- Matches 'keyhunter' user-agent against disallowed paths
- Client field allows httptest injection
Satisfies RECON-INFRA-07.
- Pool of 10 realistic browser User-Agents (Chrome/Firefox/Safari/Edge)
- Covers Windows, macOS, Linux, iOS, Android
- RandomUserAgent returns a random pool entry
- StealthHeaders returns UA + Accept-Language header map