---
phase: 09-osint-infrastructure
plan: 03
subsystem: recon
tags: [stealth, user-agent, dedup, sha256, osint]

requires:
  - phase: 09-osint-infrastructure
    provides: "pkg/recon package namespace (Plan 09-01, parallel wave 1)"
provides:
  - "pkg/recon/stealth.go: 10-entry browser UA pool with RandomUserAgent/StealthHeaders helpers"
  - "pkg/recon/dedup.go: stable cross-source Finding dedup keyed by sha256(provider|masked|source)"
affects: [09-01, 09-02, 10-sources, 11-sources, 12-sources, 13-sources, 14-sources, 15-sources, 16-sources]

tech-stack:
  added: []
  patterns:
    - "stdlib-only dedup (crypto/sha256 + encoding/hex)"
    - "first-seen-wins stable dedup preserving input order"
    - "cross-platform UA pool covering desktop + mobile"

key-files:
  created:
    - pkg/recon/stealth.go
    - pkg/recon/stealth_test.go
    - pkg/recon/dedup.go
    - pkg/recon/dedup_test.go
  modified: []

key-decisions:
  - "Use engine.Finding directly in dedup.go instead of a local Finding alias to avoid duplicate type declaration with Plan 09-01's source.go in parallel wave 1"
  - "Hash key = sha256(ProviderName|KeyMasked|Source) so same key found at different URLs is retained"
  - "Stable dedup: first-seen metadata (DetectedAt, Confidence) wins over later duplicates"

patterns-established:
  - "Stealth mode helpers: exported RandomUserAgent + StealthHeaders for recon sources to merge into requests"
  - "Stable dedup primitive: Dedup([]engine.Finding) []engine.Finding, stdlib only, O(n)"

requirements-completed: [RECON-INFRA-06]

duration: 8min
completed: 2026-04-05
---

# Phase 09 Plan 03: Stealth UA Pool + Cross-Source Dedup Summary

**10-entry browser User-Agent pool with RandomUserAgent/StealthHeaders and a stable SHA256-keyed Finding Dedup primitive ready for SweepAll orchestration.**

## Performance

- **Duration:** ~8 min
- **Started:** 2026-04-05T21:35:00Z
- **Completed:** 2026-04-05T21:43:18Z
- **Tasks:** 2 (both TDD)
- **Files created:** 4

## Accomplishments

- Stealth UA pool with 10 realistic browser User-Agents covering Chrome/Firefox/Safari/Edge on Windows, macOS, Linux, iOS, and Android
- `RandomUserAgent()` + `StealthHeaders()` helpers returning rotated UA + `Accept-Language: en-US,en;q=0.9`
- Stable cross-source `Dedup([]engine.Finding) []engine.Finding` keyed by `sha256(ProviderName|KeyMasked|Source)`
- First-seen metadata preserved; the same provider + masked key found at different Source URLs is kept as distinct findings
- `go test ./pkg/recon/` green, `go vet ./pkg/recon/...` clean

## Task Commits

TDD flow (test → feat per task):

1. **Task 1: Stealth UA pool + RandomUserAgent**
   - RED: `bbbc05f` (test: add failing test for stealth UA pool)
   - GREEN: `2c140e9` (feat: implement stealth UA pool and StealthHeaders)
2. **Task 2: Cross-source finding dedup**
   - RED: `ecfa2bf` (test: add failing test for cross-source Dedup)
   - GREEN: `2988fdf` (feat: implement stable cross-source finding Dedup)

## Files Created/Modified

- `pkg/recon/stealth.go`: 10-entry UA pool, `RandomUserAgent`, `StealthHeaders`
- `pkg/recon/stealth_test.go`: `TestUAPoolSize`, `TestRandomUserAgentInPool` (100 iterations), `TestStealthHeadersHasUA`
- `pkg/recon/dedup.go`: `Dedup([]engine.Finding) []engine.Finding` with sha256 key + stable first-seen semantics
- `pkg/recon/dedup_test.go`: `TestDedupEmpty`, `TestDedupNoDuplicates`, `TestDedupAllDuplicates`, `TestDedupPreservesFirstSeen`, `TestDedupDifferentSource`

## Decisions Made

- **Use `engine.Finding` directly in `dedup.go` rather than a local `recon.Finding` alias.** Plan 09-01 (same wave, parallel) will declare `type Finding = engine.Finding` in `pkg/recon/source.go`. Declaring it again here would cause a post-merge duplicate declaration. Referencing `engine.Finding` explicitly is forward-compatible: when 09-01 merges, `recon.Finding` becomes available and this file continues to compile either way.
- **Dedup key = `sha256(ProviderName|KeyMasked|Source)`.** Masked key avoids hashing plaintext; including `Source` ensures a leaked key found at multiple URLs is reported at every location rather than collapsed to one.
- **Stable first-seen wins.** Iteration is single-pass with a `seen` map; output order matches input order.

## Deviations from Plan

### Auto-fixed Issues

**1. [Rule 3 - Blocking] Use `engine.Finding` instead of local `Finding` alias**

- **Found during:** Task 2 (Dedup implementation)
- **Issue:** Plan 09-03 executes in wave 1 parallel with Plan 09-01. Plan 09-01 declares `type Finding = engine.Finding` in `pkg/recon/source.go`. The original plan body for 09-03 referenced bare `Finding` in `dedup.go`, which would require either a duplicate alias (post-merge conflict/duplicate declaration) or a dependency on 09-01's file that does not yet exist on this branch.
- **Fix:** Imported `github.com/salvacybersec/keyhunter/pkg/engine` in `dedup.go` and `dedup_test.go` and used `engine.Finding` directly. Behavior and test coverage are identical; signature is `Dedup([]engine.Finding) []engine.Finding`. A doc comment in `dedup.go` records the rationale.
- **Files modified:** `pkg/recon/dedup.go`, `pkg/recon/dedup_test.go`
- **Verification:** `go test ./pkg/recon/ -count=1` passes; `go vet ./pkg/recon/...` clean.
- **Committed in:** `2988fdf` (Task 2 GREEN commit)

---

**Total deviations:** 1 auto-fixed (1 blocking / parallel-safety)

**Impact on plan:** No scope change. The public signature matches downstream expectations because `recon.Finding` is a type alias: `[]recon.Finding` and `[]engine.Finding` are interchangeable, so SweepAll (Plan 09-01) can still call `Dedup` without any adapter.

## Issues Encountered

None beyond the deviation above.

## User Setup Required

None.

## Next Phase Readiness

- Plan 09-02 (rate limiter + jitter) can import `StealthHeaders` for outbound requests when `Config.Stealth` is true.
- Plan 09-01's `Engine.SweepAll` can call `recon.Dedup(all)` before returning to satisfy RECON-INFRA-08's "deduplicates findings before persisting" criterion.
- RECON-INFRA-06 (stealth UA rotation) satisfied.

## Self-Check: PASSED

- FOUND: pkg/recon/stealth.go
- FOUND: pkg/recon/stealth_test.go
- FOUND: pkg/recon/dedup.go
- FOUND: pkg/recon/dedup_test.go
- FOUND commit: bbbc05f
- FOUND commit: 2c140e9
- FOUND commit: ecfa2bf
- FOUND commit: 2988fdf

---
*Phase: 09-osint-infrastructure*
*Plan: 03*
*Completed: 2026-04-05*
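
## Appendix: Dedup Sketch

The dedup behavior recorded under Decisions Made (single pass, `seen` map, key of `sha256(ProviderName|KeyMasked|Source)`, first-seen wins) can be illustrated with a minimal sketch. This is not the code in `pkg/recon/dedup.go`: the `Finding` struct below is a hypothetical stand-in for `engine.Finding` that models only a few of the fields named in this summary, the field types are assumptions, and `dedupKey` is an illustrative helper name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Finding is a hypothetical stand-in for engine.Finding. Only fields named
// in this summary are modeled; their types are assumptions.
type Finding struct {
	ProviderName string
	KeyMasked    string
	Source       string
	Confidence   string
}

// dedupKey (illustrative helper) returns hex(sha256(ProviderName|KeyMasked|Source)).
func dedupKey(f Finding) string {
	sum := sha256.Sum256([]byte(f.ProviderName + "|" + f.KeyMasked + "|" + f.Source))
	return hex.EncodeToString(sum[:])
}

// Dedup keeps the first occurrence of each key and preserves input order.
// It makes one pass over the input, so it runs in O(n).
func Dedup(in []Finding) []Finding {
	seen := make(map[string]struct{}, len(in))
	out := make([]Finding, 0, len(in))
	for _, f := range in {
		k := dedupKey(f)
		if _, dup := seen[k]; dup {
			continue // a later duplicate never overwrites first-seen metadata
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	in := []Finding{
		{ProviderName: "openai", KeyMasked: "sk-a...1234", Source: "https://example.com/a", Confidence: "high"},
		{ProviderName: "openai", KeyMasked: "sk-a...1234", Source: "https://example.com/a", Confidence: "low"},
		{ProviderName: "openai", KeyMasked: "sk-a...1234", Source: "https://example.com/b", Confidence: "high"},
	}
	for _, f := range Dedup(in) {
		fmt.Println(f.Source, f.Confidence)
	}
}
```

In this sketch the second entry is dropped because it shares provider, masked key, and source with the first, while the third survives because its `Source` differs. The hash is computed over the masked key only, so plaintext key material never enters the key.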