Commit Graph

64 Commits

Author SHA1 Message Date
salvacybersec
03deb603b3 test(10-02): add failing tests for GitHubSource 2026-04-06 01:12:56 +03:00
salvacybersec
9273f356e6 feat(10-01): add provider-driven query generator and RegisterAll skeleton
- BuildQueries(reg, source) dedups keywords and formats per-source syntax
- github/gist use 'keyword' in:file; others use bare keyword
- SourcesConfig placeholder struct for Wave 2 plans to depend on
- RegisterAll no-op stub (Plan 10-09 will fill)
2026-04-06 01:09:57 +03:00
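
The query-generation behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `buildQueries` is a hypothetical stand-in that takes a plain keyword slice because the provider registry type is not shown here, and the double-quote style around the keyword is an assumption.

```go
package main

import "fmt"

// buildQueries is a hypothetical stand-in for BuildQueries. It drops repeated
// keywords (keeping first-seen order) and formats each one for the source.
func buildQueries(keywords []string, source string) []string {
	seen := make(map[string]struct{}, len(keywords))
	queries := make([]string, 0, len(keywords))
	for _, kw := range keywords {
		if _, dup := seen[kw]; dup {
			continue
		}
		seen[kw] = struct{}{}
		switch source {
		case "github", "gist":
			// Code-search syntax: quoted keyword plus the in:file qualifier.
			// The exact quoting used by the real code is assumed.
			queries = append(queries, fmt.Sprintf("%q in:file", kw))
		default:
			// Every other source receives the bare keyword.
			queries = append(queries, kw)
		}
	}
	return queries
}

func main() {
	fmt.Println(buildQueries([]string{"sk-proj-", "sk-proj-", "AKIA"}, "github"))
	fmt.Println(buildQueries([]string{"sk-proj-", "AKIA"}, "shodan"))
}
```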
salvacybersec
75024e4701 feat(10-01): add shared retry HTTP client for recon sources
- Client.Do retries 429/403/5xx honoring Retry-After
- 401 returns ErrUnauthorized immediately (no retry)
- Context cancellation honored during retry sleeps
- Default UA keyhunter-recon/1.0, 30s timeout, 2 retries
2026-04-06 01:09:02 +03:00
salvacybersec
a754ff7546 test(09-06): add recon pipeline integration test
- Exercises Engine + LimiterRegistry + Stealth + Dedup end-to-end
- testSource emits 5 findings with one duplicate pair (Dedup -> 4)
- TestRobotsOnlyWhenRespectsRobots asserts robots gating via httptest
- Covers RECON-INFRA-05/06/07/08
2026-04-06 00:51:08 +03:00
salvacybersec
c2137edc41 merge: plan 09-03 stealth+dedup 2026-04-06 00:45:13 +03:00
salvacybersec
2988fdf9b3 feat(09-03): implement stable cross-source finding Dedup
- Dedup drops duplicates keyed by sha256(ProviderName|KeyMasked|Source)
- Preserves input order and first-seen metadata (stable dedup)
- Same provider+masked with different Source URLs are kept separate
- Uses engine.Finding directly to avoid alias collision with Plan 09-01
2026-04-06 00:43:07 +03:00
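
A stable dedup keyed the way this commit describes might be sketched as below. The `Finding` struct here is a hypothetical reduction of `engine.Finding` to the three fields named in the key; the real type presumably carries more metadata.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Finding is a hypothetical, reduced stand-in for engine.Finding.
type Finding struct {
	ProviderName string
	KeyMasked    string
	Source       string
}

// dedupKey hashes ProviderName|KeyMasked|Source, as named in the commit.
func dedupKey(f Finding) [sha256.Size]byte {
	return sha256.Sum256([]byte(f.ProviderName + "|" + f.KeyMasked + "|" + f.Source))
}

// Dedup keeps the first occurrence of each key and preserves input order.
func Dedup(in []Finding) []Finding {
	seen := make(map[[sha256.Size]byte]struct{}, len(in))
	out := make([]Finding, 0, len(in))
	for _, f := range in {
		k := dedupKey(f)
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	in := []Finding{
		{"openai", "sk-...abcd", "https://example.com/a"},
		{"openai", "sk-...abcd", "https://example.com/b"}, // different Source: kept
		{"openai", "sk-...abcd", "https://example.com/a"}, // exact repeat: dropped
	}
	fmt.Println(len(Dedup(in))) // prints 2
}
```

Because `Source` is part of the key, the same masked key seen at two URLs yields two findings, which matches the "kept separate" note above.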
salvacybersec
851b2432b8 feat(09-01): add Engine with parallel fanout and ExampleSource
- Engine.Register/List/SweepAll with ants pool fanout
- ExampleSource emits two deterministic findings (SourceType=recon:example)
- Tests cover Register/List idempotency, SweepAll aggregation, empty-registry,
  and Enabled() filtering
2026-04-06 00:42:51 +03:00
salvacybersec
ecfa2bff28 test(09-03): add failing test for cross-source Dedup 2026-04-06 00:42:45 +03:00
salvacybersec
0373931490 feat(09-04): implement RobotsCache with 1h per-host TTL
- Parses robots.txt via temoto/robotstxt
- Caches per host for 1 hour; second call within TTL skips HTTP fetch
- Default-allow on network/parse/4xx/5xx errors
- Matches 'keyhunter' user-agent against disallowed paths
- Client field allows httptest injection

Satisfies RECON-INFRA-07.
2026-04-06 00:42:33 +03:00
salvacybersec
2c140e9661 feat(09-03): implement stealth UA pool and StealthHeaders
- Pool of 10 realistic browser User-Agents (Chrome/Firefox/Safari/Edge)
- Covers Windows, macOS, Linux, iOS, Android
- RandomUserAgent returns a random pool entry
- StealthHeaders returns UA + Accept-Language header map
2026-04-06 00:42:22 +03:00
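
A user-agent pool of this kind could be sketched as follows. The three strings are illustrative examples rather than the ten entries in the actual pool, and both the `map[string]string` return type and the `Accept-Language` value are assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
)

// userAgents is an illustrative pool; the real one reportedly has 10 entries.
var userAgents = []string{
	"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
	"Mozilla/5.0 (Macintosh; Intel Mac OS X 14_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15",
	"Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
}

// RandomUserAgent returns a uniformly random entry from the pool.
func RandomUserAgent() string {
	return userAgents[rand.Intn(len(userAgents))]
}

// StealthHeaders returns a browser-like header set: a random User-Agent
// plus an Accept-Language value (the value here is assumed).
func StealthHeaders() map[string]string {
	return map[string]string{
		"User-Agent":      RandomUserAgent(),
		"Accept-Language": "en-US,en;q=0.9",
	}
}

func main() {
	for name, value := range StealthHeaders() {
		fmt.Printf("%s: %s\n", name, value)
	}
}
```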
salvacybersec
4bd6c6b05f test(09-04): add failing tests for RobotsCache
- Allowed/Disallowed path matching
- Cache hit counter assertion
- Default-allow on 5xx or network error
- keyhunter UA matching precedence
2026-04-06 00:42:03 +03:00
salvacybersec
bbbc05fa46 test(09-03): add failing test for stealth UA pool 2026-04-06 00:41:55 +03:00
salvacybersec
590fc33955 feat(09-02): add LimiterRegistry with per-source rate limiters and jitter
- NewLimiterRegistry + For(name, rate, burst) idempotent lookup
- Wait blocks on token then applies 100ms-1s jitter when stealth
- Per-source isolation (RECON-INFRA-05), ctx cancellation honored
- Tests: isolation, idempotency, ctx cancel, jitter range, no-jitter
2026-04-06 00:41:33 +03:00
salvacybersec
10af12d358 feat(09-01): add ReconSource interface and Config
- Define ReconSource interface: Name/RateLimit/Burst/RespectsRobots/Enabled/Sweep
- Alias recon.Finding = engine.Finding for shared storage path
- Config struct carries Stealth, RespectRobots, EnabledSources, Query
2026-04-06 00:40:46 +03:00