keyhunter/.planning/phases/09-osint-infrastructure/09-06-PLAN.md
2026-04-06 00:39:27 +03:00


---
phase: 09-osint-infrastructure
plan: 06
type: execute
wave: 2
depends_on: ["09-01", "09-02", "09-03", "09-04", "09-05"]
files_modified:
- pkg/recon/integration_test.go
- .planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
autonomous: true
requirements: [RECON-INFRA-05, RECON-INFRA-06, RECON-INFRA-07, RECON-INFRA-08]
must_haves:
  truths:
    - "Integration test exercises Engine + LimiterRegistry + Dedup together against a synthetic source that emits duplicates"
    - "Integration test verifies --stealth path calls RandomUserAgent without errors"
    - "Integration test verifies RobotsCache.Allowed is invoked only when RespectsRobots()==true"
    - "Phase summary documents all 4 requirement IDs as complete"
  artifacts:
    - path: "pkg/recon/integration_test.go"
      provides: "End-to-end test: Engine + Limiter + Stealth + Robots + Dedup"
      contains: "func TestReconPipelineIntegration"
    - path: ".planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md"
      provides: "Phase completion summary with requirement ID coverage and next-phase guidance"
  key_links:
    - from: "pkg/recon/integration_test.go"
      to: "pkg/recon.Engine + LimiterRegistry + RobotsCache + Dedup"
      via: "TestReconPipelineIntegration wires all four together"
      pattern: "TestReconPipelineIntegration"
---
<objective>
Write the Phase 9 integration test and the phase summary. The test proves that the four recon infra components compose correctly before Phases 10-16 start building sources on top; the summary documents completion for roadmap tracking.
Purpose: Final safety net for the phase. Catches cross-component bugs (e.g., limiter deadlock, dedup hash collision, robots TTL leak) that unit tests on individual files miss.
Output: pkg/recon/integration_test.go, .planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/phases/09-osint-infrastructure/09-CONTEXT.md
@.planning/phases/09-osint-infrastructure/09-01-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-02-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-03-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-04-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-05-SUMMARY.md
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: End-to-end integration test</name>
<files>pkg/recon/integration_test.go</files>
<behavior>
- Define a local testSource struct (in the _test.go file) whose methods behave as follows:
  - Name() returns "test"
  - RateLimit() returns rate.Limit(100), Burst() returns 10
  - RespectsRobots() returns false
  - Enabled returns true
  - Sweep emits 5 Findings, one of which exactly duplicates another (same provider+masked+source), so 4 are unique
- TestReconPipelineIntegration:
  - Construct Engine, Register testSource
  - Construct LimiterRegistry and call Wait(ctx, "test", rate.Limit(100), 10, true) once to verify the jitter path does not panic
  - Call Engine.SweepAll(ctx, Config{Stealth: true})
  - Assert len(findings) == 5 (raw), len(Dedup(findings)) == 4 (after dedup)
  - Assert every finding has SourceType starting with "recon:"
- TestRobotsOnlyWhenRespectsRobots:
  - Create two sources: webSource (RespectsRobots true) and apiSource (RespectsRobots false)
  - Verify that the RobotsCache call path is exercised only for webSource. A counter shim is enough: the test invokes RobotsCache.Allowed manually for webSource before calling webSource.Sweep, and skips it for apiSource.
  - This is a documentation-style test with minimal logic: assert `webSource.RespectsRobots() == true && apiSource.RespectsRobots() == false`, then assert that RobotsCache.Allowed succeeds when called. The "never called when RespectsRobots returns false" half is trivially satisfied by not invoking it.
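
The gating rule behind TestRobotsOnlyWhenRespectsRobots can be sketched in isolation. The block below is a minimal illustration, not the pkg/recon implementation: `robotsChecker`, `countingChecker`, `stubSource`, and `gate` are hypothetical names, and the real RobotsCache.Allowed also takes a ctx, which is omitted here for brevity.

```go
package main

import "fmt"

// robotsChecker is a hypothetical stand-in for RobotsCache (ctx omitted).
type robotsChecker interface {
	Allowed(url string) (bool, error)
}

// countingChecker allows every URL and records how often it was consulted.
type countingChecker struct{ calls int }

func (c *countingChecker) Allowed(url string) (bool, error) {
	c.calls++
	return true, nil
}

// source mirrors only the part of the source interface this test cares about.
type source interface {
	Name() string
	RespectsRobots() bool
}

// stubSource is a hypothetical test double with a configurable robots flag.
type stubSource struct {
	name    string
	respect bool
}

func (s stubSource) Name() string         { return s.name }
func (s stubSource) RespectsRobots() bool { return s.respect }

// gate consults the checker only for sources that opt in via RespectsRobots.
func gate(src source, rc robotsChecker, url string) (bool, error) {
	if !src.RespectsRobots() {
		return true, nil // API-style sources skip robots.txt entirely
	}
	return rc.Allowed(url)
}

func main() {
	rc := &countingChecker{}
	web := stubSource{name: "web", respect: true}
	api := stubSource{name: "api", respect: false}

	_, _ = gate(web, rc, "https://example.com/foo")
	_, _ = gate(api, rc, "https://example.com/foo")

	fmt.Println("robots lookups:", rc.calls) // prints "robots lookups: 1"
}
```

A counter on the checker turns the "never called" half of the assertion into something observable, rather than something satisfied only by omission.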
</behavior>
<action>
Create pkg/recon/integration_test.go. Declare testSource and testWebSource structs within the test file. Use `httptest.NewServer` for the robots portion, serving "User-agent: *\nAllow: /\n".
The test file should declare `package recon` (not `recon_test`), so it references the package's identifiers unqualified and can also reach unexported symbols.
Assertions via testify require:
- require.Equal(t, 5, len(raw))
- require.Equal(t, 4, len(Dedup(raw)))
- require.Equal(t, "recon:test", f.SourceType) for every f in raw
- require.NoError(t, limiter.Wait(ctx, "test", rate.Limit(100), 10, true))
- require.True(t, webSource.RespectsRobots())
- require.False(t, apiSource.RespectsRobots())
- allowed, err := rc.Allowed(ctx, server.URL+"/foo"); require.NoError(t, err); require.True(t, allowed)
Per RECON-INFRA-05/06/07/08: each requirement has at least one assertion in this integration test.
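
The 5-raw / 4-unique expectation can be checked against a standalone sketch of the dedup rule, sha256(provider|masked|source). Everything here is illustrative: `finding`, `key`, `dedup`, and `syntheticFindings` are hypothetical stand-ins, and keeping the first occurrence in input order is an assumption about how the real Dedup behaves.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// finding is a hypothetical, trimmed-down stand-in for the real Finding type.
type finding struct {
	Provider   string
	Masked     string
	Source     string
	SourceType string
}

// key hashes provider|masked|source into a hex digest used as the dedup key.
func key(f finding) string {
	sum := sha256.Sum256([]byte(f.Provider + "|" + f.Masked + "|" + f.Source))
	return hex.EncodeToString(sum[:])
}

// dedup keeps the first occurrence of each key and preserves input order
// (an assumption; the real Dedup may order results differently).
func dedup(in []finding) []finding {
	seen := make(map[string]struct{}, len(in))
	out := make([]finding, 0, len(in))
	for _, f := range in {
		k := key(f)
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

// syntheticFindings returns 5 findings; the last one repeats the first.
func syntheticFindings() []finding {
	mk := func(provider, masked string) finding {
		return finding{
			Provider:   provider,
			Masked:     masked,
			Source:     "test",
			SourceType: "recon:test",
		}
	}
	return []finding{
		mk("openai", "sk-****1111"),
		mk("openai", "sk-****2222"),
		mk("anthropic", "sk-ant-****3333"),
		mk("groq", "gsk_****4444"),
		mk("openai", "sk-****1111"), // exact duplicate of the first entry
	}
}

func main() {
	raw := syntheticFindings()
	fmt.Printf("raw=%d unique=%d\n", len(raw), len(dedup(raw))) // prints "raw=5 unique=4"
}
```

With a single duplicate pair among five findings, the raw count stays at 5 and the deduplicated count drops to 4, which is the pair of numbers the assertions above check.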
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/ -run 'TestReconPipelineIntegration|TestRobotsOnlyWhenRespectsRobots' -count=1</automated>
</verify>
<done>Integration test passes. All 4 RECON-INFRA requirement IDs have at least one assertion covering them.</done>
</task>
<task type="auto">
<name>Task 2: Write 09-PHASE-SUMMARY.md</name>
<files>.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md</files>
<action>
Create the phase summary documenting:
- Requirements closed: RECON-INFRA-05, RECON-INFRA-06, RECON-INFRA-07, RECON-INFRA-08 (all 4)
- Key artifacts: pkg/recon/{source,engine,limiter,stealth,dedup,robots,example}.go + tests
- CLI surface: `keyhunter recon full`, `keyhunter recon list`
- Decisions adopted: per-source limiter (no centralization), default-allow on robots fetch failure, dedup by sha256(provider|masked|source), UA pool of 10
- New dependency: github.com/temoto/robotstxt
- Handoff to Phase 10: all real sources implement ReconSource interface and register via `buildReconEngine()` in cmd/recon.go (or ideally via package init side-effects once the pattern is established in Phase 10)
- Known gaps deferred: proxy/TOR (out of scope), per-source retry (each source handles own retries), distributed rate limiting (out of scope)
Follow the standard SUMMARY.md template from @$HOME/.claude/get-shit-done/templates/summary.md.
</action>
<verify>
<automated>test -s /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-05" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-08" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md</automated>
</verify>
<done>09-PHASE-SUMMARY.md exists, non-empty, names all 4 requirement IDs.</done>
</task>
</tasks>
<verification>
- `go test ./pkg/recon/... -count=1` passes (all unit + integration)
- `go build ./...` passes
- `go vet ./...` clean
- 09-PHASE-SUMMARY.md exists with all 4 RECON-INFRA IDs
</verification>
<success_criteria>
- Integration test proves Engine + Limiter + Stealth + Robots + Dedup compose correctly
- Phase summary documents completion of all 4 requirement IDs
- Phase 10 can start immediately against a stable pkg/recon contract
</success_criteria>
<output>
After completion, create `.planning/phases/09-osint-infrastructure/09-06-SUMMARY.md`
</output>