---
phase: 09-osint-infrastructure
plan: 06
type: execute
wave: 2
depends_on: ["09-01", "09-02", "09-03", "09-04", "09-05"]
files_modified:
  - pkg/recon/integration_test.go
  - .planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
autonomous: true
requirements: [RECON-INFRA-05, RECON-INFRA-06, RECON-INFRA-07, RECON-INFRA-08]
must_haves:
  truths:
    - "Integration test exercises Engine + LimiterRegistry + Dedup together against a synthetic source that emits duplicates"
    - "Integration test verifies --stealth path calls RandomUserAgent without errors"
    - "Integration test verifies RobotsCache.Allowed is invoked only when RespectsRobots()==true"
    - "Phase summary documents all 4 requirement IDs as complete"
  artifacts:
    - path: "pkg/recon/integration_test.go"
      provides: "End-to-end test: Engine + Limiter + Stealth + Robots + Dedup"
      contains: "func TestReconPipelineIntegration"
    - path: ".planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md"
      provides: "Phase completion summary with requirement ID coverage and next-phase guidance"
  key_links:
    - from: "pkg/recon/integration_test.go"
      to: "pkg/recon.Engine + LimiterRegistry + RobotsCache + Dedup"
      via: "TestReconPipelineIntegration wires all four together"
      pattern: "TestReconPipelineIntegration"
---

<objective>
Phase 9 integration test + phase summary. Proves the four recon infra components compose correctly before Phases 10-16 start building sources on top, and documents completion for roadmap tracking.

Purpose: Final safety net for the phase. Catches cross-component bugs (e.g., limiter deadlock, dedup hash collision, robots TTL leak) that unit tests on individual files miss.
Output: pkg/recon/integration_test.go, .planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/09-osint-infrastructure/09-CONTEXT.md
@.planning/phases/09-osint-infrastructure/09-01-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-02-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-03-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-04-SUMMARY.md
@.planning/phases/09-osint-infrastructure/09-05-SUMMARY.md
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: End-to-end integration test</name>
<files>pkg/recon/integration_test.go</files>
<behavior>
- Define a local TestSource struct (in the _test.go file) that:
  - Name() returns "test"
  - RateLimit() returns rate.Limit(100), Burst() returns 10
  - RespectsRobots() returns false
  - Enabled returns true
  - Sweep emits 5 Findings, 2 of which are exact duplicates of each other (same provider+masked+source), leaving 4 unique
- TestReconPipelineIntegration:
  - Construct Engine, Register TestSource
  - Construct LimiterRegistry and call Wait(ctx, "test", rate.Limit(100), 10, true) once to verify the jitter path does not panic or return an error
  - Call Engine.SweepAll(ctx, Config{Stealth: true})
  - Assert len(findings) == 5 (raw), len(Dedup(findings)) == 4 (after dedup)
  - Assert every finding has SourceType starting with "recon:"
- TestRobotsOnlyWhenRespectsRobots:
  - Create two sources: webSource (RespectsRobots true) and apiSource (RespectsRobots false)
  - Verify that the RobotsCache call path is exercised only for webSource. A counter behind a shim is one option; the test can simulate this by manually invoking RobotsCache.Allowed for webSource before calling webSource.Sweep, and asserting that the apiSource path skips it.
  - This is a documentation-style test with minimal logic: assert `webSource.RespectsRobots() == true && apiSource.RespectsRobots() == false`, then assert that RobotsCache.Allowed works when called and is never called when RespectsRobots returns false (trivially satisfied by not invoking it).
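
The counter-shim idea for TestRobotsOnlyWhenRespectsRobots could look roughly like the sketch below. It is illustrative only: `source`, `stubSource`, `countingRobots`, and `sweepGate` are hypothetical stand-ins and not the real pkg/recon types, and the shim's `Allowed` takes just a URL, whereas the real RobotsCache.Allowed is expected to take a context and return an error as well.

```go
package main

import "fmt"

// source is a hypothetical stand-in for the ReconSource interface; only the
// two methods this sketch needs are included.
type source interface {
	Name() string
	RespectsRobots() bool
}

// stubSource is a minimal source whose robots behavior is set by a field.
type stubSource struct {
	name   string
	robots bool
}

func (s stubSource) Name() string         { return s.name }
func (s stubSource) RespectsRobots() bool { return s.robots }

// countingRobots is a hypothetical shim that records how many times Allowed
// is called, so a test can assert on the call count.
type countingRobots struct {
	calls int
}

// Allowed always permits the URL in this sketch; a real cache would fetch
// and parse robots.txt here.
func (c *countingRobots) Allowed(url string) bool {
	c.calls++
	return true
}

// sweepGate consults robots only for sources that opt in via RespectsRobots.
func sweepGate(s source, rc *countingRobots, url string) bool {
	if !s.RespectsRobots() {
		return true // API-style sources skip the robots check entirely
	}
	return rc.Allowed(url)
}

func main() {
	rc := &countingRobots{}
	web := stubSource{name: "web", robots: true}
	api := stubSource{name: "api", robots: false}

	sweepGate(api, rc, "https://example.com/foo")
	fmt.Println("calls after api source:", rc.calls) // 0
	sweepGate(web, rc, "https://example.com/foo")
	fmt.Println("calls after web source:", rc.calls) // 1
}
```

Counting calls makes the "never called" half of the assertion observable instead of trivially true, at the cost of introducing a gate function that the real Engine may or may not expose in this form.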
</behavior>
<action>
Create pkg/recon/integration_test.go. Declare testSource and testWebSource structs within the test file. Use `httptest.NewServer` for the robots portion, serving "User-agent: *\nAllow: /\n".

The test should live in the same package (`recon`, not `recon_test`) so it can reference pkg/recon identifiers directly, including unexported symbols as well as exported ones.

Assertions via testify require:
- require.Equal(t, 5, len(raw))
- require.Equal(t, 4, len(Dedup(raw)))
- require.Equal(t, "recon:test", raw[0].SourceType)
- require.NoError(t, limiter.Wait(ctx, "test", rate.Limit(100), 10, true))
- require.True(t, webSource.RespectsRobots())
- require.False(t, apiSource.RespectsRobots())
- allowed, err := rc.Allowed(ctx, server.URL+"/foo"); require.NoError(t, err); require.True(t, allowed)

Per RECON-INFRA-05/06/07/08: each requirement has at least one assertion in this integration test.
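
As a rough illustration of the 5-raw / 4-deduplicated assertions above, the sketch below keys findings on sha256(provider|masked|source) and keeps the first occurrence of each key. The `finding` struct, its field names, the helper names, and the sample values are all assumptions made for this example; the real Finding type and Dedup function in pkg/recon may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// finding is a hypothetical, trimmed-down stand-in for the package's Finding type.
type finding struct {
	Provider   string
	Masked     string
	Source     string
	SourceType string
}

// dedupKey hashes provider|masked|source into a hex-encoded SHA-256 digest.
func dedupKey(f finding) string {
	sum := sha256.Sum256([]byte(f.Provider + "|" + f.Masked + "|" + f.Source))
	return hex.EncodeToString(sum[:])
}

// dedup keeps the first finding seen for each key and preserves input order.
func dedup(in []finding) []finding {
	seen := make(map[string]struct{}, len(in))
	out := make([]finding, 0, len(in))
	for _, f := range in {
		k := dedupKey(f)
		if _, ok := seen[k]; ok {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

// syntheticFindings mimics the test source: five findings, two of them identical.
// All values are made up for the example.
func syntheticFindings() []finding {
	return []finding{
		{Provider: "openai", Masked: "sk-a...1111", Source: "test", SourceType: "recon:test"},
		{Provider: "openai", Masked: "sk-a...1111", Source: "test", SourceType: "recon:test"}, // duplicate of the first
		{Provider: "openai", Masked: "sk-b...2222", Source: "test", SourceType: "recon:test"},
		{Provider: "anthropic", Masked: "sk-c...3333", Source: "test", SourceType: "recon:test"},
		{Provider: "github", Masked: "ghp_...4444", Source: "test", SourceType: "recon:test"},
	}
}

func main() {
	raw := syntheticFindings()
	fmt.Println(len(raw), len(dedup(raw))) // prints "5 4"
}
```

Keeping the first occurrence preserves emission order, which keeps an assertion such as `raw[0].SourceType == "recon:test"` stable whether it runs before or after deduplication.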
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/ -run 'TestReconPipelineIntegration|TestRobotsOnlyWhenRespectsRobots' -count=1</automated>
</verify>
<done>Integration test passes. All 4 RECON-INFRA requirement IDs have at least one assertion covering them.</done>
</task>

<task type="auto">
<name>Task 2: Write 09-PHASE-SUMMARY.md</name>
<files>.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md</files>
<action>
Create the phase summary documenting:
- Requirements closed: RECON-INFRA-05, RECON-INFRA-06, RECON-INFRA-07, RECON-INFRA-08 (all 4)
- Key artifacts: pkg/recon/{source,engine,limiter,stealth,dedup,robots,example}.go + tests
- CLI surface: `keyhunter recon full`, `keyhunter recon list`
- Decisions adopted: per-source limiter (no centralization), default-allow on robots fetch failure, dedup by sha256(provider|masked|source), UA pool of 10
- New dependency: github.com/temoto/robotstxt
- Handoff to Phase 10: all real sources implement ReconSource interface and register via `buildReconEngine()` in cmd/recon.go (or ideally via package init side-effects once the pattern is established in Phase 10)
- Known gaps deferred: proxy/TOR (out of scope), per-source retry (each source handles own retries), distributed rate limiting (out of scope)

Follow the standard SUMMARY.md template from @$HOME/.claude/get-shit-done/templates/summary.md.
</action>
<verify>
<automated>test -s /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-05" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md && grep -q "RECON-INFRA-08" /home/salva/Documents/apikey/.planning/phases/09-osint-infrastructure/09-PHASE-SUMMARY.md</automated>
</verify>
<done>09-PHASE-SUMMARY.md exists, non-empty, names all 4 requirement IDs.</done>
</task>

</tasks>

<verification>
- `go test ./pkg/recon/... -count=1` passes (all unit + integration)
- `go build ./...` passes
- `go vet ./...` clean
- 09-PHASE-SUMMARY.md exists with all 4 RECON-INFRA IDs
</verification>

<success_criteria>
- Integration test proves Engine + Limiter + Stealth + Robots + Dedup compose correctly
- Phase summary documents completion of all 4 requirement IDs
- Phase 10 can start immediately against a stable pkg/recon contract
</success_criteria>

<output>
After completion, create `.planning/phases/09-osint-infrastructure/09-06-SUMMARY.md`
</output>