TestReconPipelineIntegration wires all four together
TestReconPipelineIntegration
Phase 9 integration test + phase summary. Proves the four recon infra components compose correctly before Phases 10-16 start building sources on top, and documents completion for roadmap tracking.
Purpose: Final safety net for the phase. Catches cross-component bugs (e.g., limiter deadlock, dedup hash collision, robots TTL leak) that unit tests on individual files miss.
Output: pkg/recon/integration_test.go, .planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-CONTEXT.md
@.planning/phases/09-osint-infrastructure/09-01-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-02-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-03-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-04-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-05-SUMMARY.md
Task 1: End-to-end integration test
pkg/recon/integration_test.go
- Define a local TestSource struct (in the _test.go file) that:
- Name() returns "test"
- RateLimit() returns rate.Limit(100), Burst() returns 10
- RespectsRobots() returns false
- Enabled() returns true
- Sweep emits 5 Findings, 2 of which are exact duplicates of each other (same provider+masked+source), leaving 4 unique
- TestReconPipelineIntegration:
- Construct Engine, Register TestSource
- Construct LimiterRegistry and call Wait(ctx, "test", rate.Limit(100), 10, true) once to verify the jitter path does not panic
- Call Engine.SweepAll(ctx, Config{Stealth: true})
- Assert len(findings) == 5 (raw), len(Dedup(findings)) == 4 (after dedup)
- Assert every finding has SourceType starting with "recon:"
- TestRobotsOnlyWhenRespectsRobots:
- Create two sources: webSource (RespectsRobots true) and apiSource (RespectsRobots false)
- Verify that the RobotsCache call path is exercised only for webSource. A counting shim is one option; the simpler route is for the test to invoke RobotsCache.Allowed manually before calling webSource.Sweep and to never invoke it on the apiSource path.
- This is a documentation-style test with minimal logic: assert `webSource.RespectsRobots() == true && apiSource.RespectsRobots() == false`, then assert that RobotsCache.Allowed succeeds when called. The requirement that it is never called when RespectsRobots returns false is trivially satisfied by not invoking it.
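If the counting-shim variant is preferred over the trivial one, it could look roughly like the sketch below. Everything here is a hypothetical stand-in written for illustration: `source`, `stubSource`, `countingRobots`, and `sweepWithRobots` are not the package's real types, and the real RobotsCache.Allowed takes a context and a URL rather than a source name.

```go
package main

import "fmt"

// source is a hypothetical stand-in for the ReconSource interface;
// only the two methods needed for this illustration are included.
type source interface {
	Name() string
	RespectsRobots() bool
}

// stubSource is a minimal test double with a fixed name and robots policy.
type stubSource struct {
	name   string
	robots bool
}

func (s stubSource) Name() string         { return s.name }
func (s stubSource) RespectsRobots() bool { return s.robots }

// countingRobots is a hypothetical shim that records how often the
// robots check runs for each source name.
type countingRobots struct {
	calls map[string]int
}

// Allowed always permits the fetch; it only records which source asked.
func (c *countingRobots) Allowed(sourceName string) bool {
	c.calls[sourceName]++
	return true
}

// sweepWithRobots consults the robots shim only for sources that opt in.
func sweepWithRobots(srcs []source, rc *countingRobots) {
	for _, s := range srcs {
		if s.RespectsRobots() && !rc.Allowed(s.Name()) {
			continue // disallowed by robots.txt: skip this source
		}
		// The real sweep would run here.
	}
}

func main() {
	rc := &countingRobots{calls: map[string]int{}}
	srcs := []source{
		stubSource{name: "web", robots: true},
		stubSource{name: "api", robots: false},
	}
	sweepWithRobots(srcs, rc)
	fmt.Println(rc.calls["web"], rc.calls["api"]) // prints "1 0"
}
```

The short-circuit in `s.RespectsRobots() && ...` is what keeps the shim untouched for the API source, so a call count of zero is a real assertion rather than a tautology.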
Create pkg/recon/integration_test.go. Declare testSource and testWebSource structs within the test file. Use `httptest.NewServer` for the robots portion, serving "User-agent: *\nAllow: /\n".
The test should live in the same package (`recon`, not `recon_test`) so it can reference the package's identifiers directly, without a `recon.` qualifier, and can reach unexported symbols if needed.
Assertions via testify require:
- require.Equal(t, 5, len(raw))
- require.Equal(t, 4, len(Dedup(raw)))
- require.Equal(t, "recon:test", raw[0].SourceType)
- require.NoError(t, limiter.Wait(ctx, "test", rate.Limit(100), 10, true))
- require.True(t, webSource.RespectsRobots())
- require.False(t, apiSource.RespectsRobots())
- allowed, err := rc.Allowed(ctx, server.URL+"/foo"); require.NoError(t, err); require.True(t, allowed)
Per RECON-INFRA-05/06/07/08: each requirement has at least one assertion in this integration test.
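The robots fixture described above can be sketched with the standard library alone. This is a minimal illustration, not the test itself: `newRobotsServer` and `fetchRobots` are hypothetical helper names, and the real test would hand `server.URL` to RobotsCache.Allowed instead of fetching the file by hand.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// robotsBody is the permissive policy named in the task description.
const robotsBody = "User-agent: *\nAllow: /\n"

// newRobotsServer starts a local server that answers /robots.txt with
// robotsBody and returns 404 for every other path. The caller must Close it.
func newRobotsServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/robots.txt", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain")
		io.WriteString(w, robotsBody)
	})
	return httptest.NewServer(mux)
}

// fetchRobots is a hypothetical helper that downloads robots.txt from baseURL.
func fetchRobots(baseURL string) (string, error) {
	resp, err := http.Get(baseURL + "/robots.txt")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	srv := newRobotsServer()
	defer srv.Close()

	body, err := fetchRobots(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

Because the server binds to a loopback address chosen at runtime, the test stays hermetic and needs no network access beyond localhost.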
cd /home/salva/Documents/apikey && go test ./pkg/recon/ -run 'TestReconPipelineIntegration|TestRobotsOnlyWhenRespectsRobots' -count=1
Integration test passes. All 4 RECON-INFRA requirement IDs have at least one assertion covering them.
Task 2: Write 09-PHASE-SUMMARY.md
.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
Create the phase summary documenting:
- Requirements closed: RECON-INFRA-05, RECON-INFRA-06, RECON-INFRA-07, RECON-INFRA-08 (all 4)
- Key artifacts: pkg/recon/{source,engine,limiter,stealth,dedup,robots,example}.go + tests
- CLI surface: `keyhunter recon full`, `keyhunter recon list`
- Decisions adopted: per-source limiter (no centralization), default-allow on robots fetch failure, dedup by sha256(provider|masked|source), UA pool of 10
- New dependency: github.com/temoto/robotstxt
- Handoff to Phase 10: all real sources implement the ReconSource interface and register via `buildReconEngine()` in cmd/recon.go (or, ideally, via package init side effects once that pattern is established in Phase 10)
- Known gaps deferred: proxy/TOR (out of scope), per-source retry (each source handles own retries), distributed rate limiting (out of scope)
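For readers of the summary, the dedup decision listed above can be illustrated with a small sketch. It assumes the key is the hex SHA-256 of the three fields joined by `|`; the `finding` type, the `key` and `dedup` functions, and the sample values are hypothetical stand-ins, not the package's actual Finding or Dedup.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// finding is a hypothetical, trimmed-down stand-in for the package's Finding type.
type finding struct {
	Provider string
	Masked   string
	Source   string
}

// key hashes provider|masked|source, the dedup identity named in the decisions list.
func key(f finding) string {
	sum := sha256.Sum256([]byte(f.Provider + "|" + f.Masked + "|" + f.Source))
	return hex.EncodeToString(sum[:])
}

// dedup keeps the first occurrence of each key and preserves input order.
func dedup(in []finding) []finding {
	seen := make(map[string]struct{}, len(in))
	out := make([]finding, 0, len(in))
	for _, f := range in {
		k := key(f)
		if _, ok := seen[k]; ok {
			continue // same provider, masked value, and source already kept
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	// Five raw findings; the first and last are exact duplicates.
	raw := []finding{
		{Provider: "openai", Masked: "sk-a...1111", Source: "recon:test"},
		{Provider: "openai", Masked: "sk-b...2222", Source: "recon:test"},
		{Provider: "github", Masked: "ghp_...3333", Source: "recon:test"},
		{Provider: "stripe", Masked: "sk_l...4444", Source: "recon:test"},
		{Provider: "openai", Masked: "sk-a...1111", Source: "recon:test"},
	}
	fmt.Println(len(raw), len(dedup(raw))) // prints "5 4"
}
```

This mirrors the 5-raw, 4-unique shape that the integration test asserts.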
Follow the standard SUMMARY.md template from @$HOME/.claude/get-shit-done/templates/summary.md.
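test -s /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-05" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-06" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-07" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-08" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md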
09-PHASE-SUMMARY.md exists, non-empty, names all 4 requirement IDs.
- `go test ./pkg/recon/... -count=1` passes (all unit + integration)
- `go build ./...` passes
- `go vet ./...` clean
- 09-PHASE-SUMMARY.md exists with all 4 RECON-INFRA IDs