docs(09-01): complete recon framework foundation plan
135
.planning/phases/09-osint-infrastructure/09-01-SUMMARY.md
Normal file
@@ -0,0 +1,135 @@
---
phase: 09-osint-infrastructure
plan: 01
subsystem: pkg/recon
tags: [recon, osint, interface, engine, ants, fanout]
dependency-graph:
  requires:
    - pkg/engine.Finding
    - github.com/panjf2000/ants/v2
    - golang.org/x/time/rate
  provides:
    - recon.ReconSource interface
    - recon.Engine (Register/List/SweepAll)
    - recon.Config
    - recon.Finding (alias of engine.Finding)
    - recon.ExampleSource (reference stub)
  affects:
    - Wave 1 siblings (limiter, stealth, robots) compose with Engine
    - Wave 2 CLI plan (09-05) wires Engine into cmd/recon
    - Phases 10-16 implement ReconSource for real sources
tech-stack:
  added: []
  patterns:
    - Ants pool for parallel source fanout
    - Type alias (Finding = engine.Finding) for shared storage path
    - Interface-per-source plugin model
key-files:
  created:
    - pkg/recon/source.go
    - pkg/recon/engine.go
    - pkg/recon/example.go
    - pkg/recon/engine_test.go
  modified: []
decisions:
  - "Finding is a type alias of engine.Finding, not a new struct; keeps storage and verification paths unified; recon sources only need to set SourceType=recon:<name>"
  - "Dedup is intentionally NOT done in SweepAll; plan 09-03 owns pkg/recon/dedup.go; SweepAll only aggregates"
  - "Engine sizes the ants pool to len(active) sources; small N (tens, not thousands), so per-sweep pool allocation is cheap and avoids cross-sweep state"
  - "Context cancellation in SweepAll drains the out channel in a detached goroutine to prevent source senders from blocking after cancel"
  - "Register is idempotent by Name(); re-registering replaces; guards against double-init loops"
metrics:
  duration: "~3m"
  completed: "2026-04-05"
  tasks: 2
  files: 4
---

# Phase 9 Plan 1: Recon Framework Foundation Summary

This plan delivers the `ReconSource` interface, an `Engine` with ants-pool parallel fanout, and an `ExampleSource` stub. Together they form the contract every OSINT source in Phases 10-16 will implement.

## What Was Built

**`pkg/recon/source.go`** defines the public contract:

- `ReconSource` interface: `Name() / RateLimit() / Burst() / RespectsRobots() / Enabled(Config) / Sweep(ctx, query, out)`
- `Config` struct: `Stealth`, `RespectRobots`, `EnabledSources`, `Query`
- `Finding` as a Go type alias of `engine.Finding`: recon findings flow through the same storage path as file/git/stdin scanning; sources simply set `SourceType = "recon:<name>"`.

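As a rough illustration of why the alias keeps a single storage path, here is a minimal sketch. `engineFinding` and its fields are hypothetical stand-ins for `engine.Finding`, whose real definition is not shown in this summary; `store` and `tagRecon` are illustrative helpers, not functions from the package.

```go
package main

import "fmt"

// engineFinding stands in for engine.Finding. The fields are assumptions
// made for this sketch, not the real struct definition.
type engineFinding struct {
	Provider   string
	KeyMasked  string
	SourceType string
}

// Finding is declared with "=", so it is an alias: another name for the
// very same type, not a new type that needs conversion.
type Finding = engineFinding

// store represents existing code written against the scan-side type.
func store(f engineFinding) string {
	return f.SourceType + "/" + f.Provider
}

// tagRecon builds the SourceType value a recon source is expected to set.
func tagRecon(name string) string {
	return "recon:" + name
}

func main() {
	f := Finding{Provider: "openai", KeyMasked: "sk-...abcd", SourceType: tagRecon("example")}
	// No conversion is required because Finding and engineFinding are identical.
	fmt.Println(store(f)) // prints "recon:example/openai"
}
```

Had `Finding` been declared as a new struct instead, every call into existing storage or verification code would need an explicit conversion or a parallel code path.
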
**`pkg/recon/engine.go`** is the orchestrator:

- `NewEngine()` / `Register(src)` / `List()` with an RWMutex-guarded map and sorted name listing.
- `SweepAll(ctx, cfg)` collects enabled sources, allocates an `ants.Pool` sized to `len(active)`, submits one task per source, aggregates findings via a buffered channel, and closes that channel once the WaitGroup completes. A context-cancel branch starts a detached drainer so senders never block post-cancel.
- Deduplication is deliberately deferred to plan 09-03 (`dedup.go`).

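The registry half of the engine (an RWMutex-guarded map with sorted listing, where re-registering a name replaces the earlier entry) might look roughly like the sketch below. The `registry`, `namedSource`, and `stub` names are hypothetical; the real `Engine` stores full `ReconSource` values and may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// namedSource is a trimmed-down, hypothetical stand-in for ReconSource.
type namedSource interface {
	Name() string
}

// registry mirrors the Register/List behavior described in the summary.
type registry struct {
	mu      sync.RWMutex
	sources map[string]namedSource
}

func newRegistry() *registry {
	return &registry{sources: make(map[string]namedSource)}
}

// Register stores src under its name. Registering the same name again
// overwrites the previous entry, which makes the call idempotent.
func (r *registry) Register(src namedSource) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sources[src.Name()] = src
}

// List returns the registered names in sorted order.
func (r *registry) List() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.sources))
	for name := range r.sources {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// stub is a minimal source used only to exercise the registry.
type stub string

func (s stub) Name() string { return string(s) }

func main() {
	r := newRegistry()
	r.Register(stub("shodan"))
	r.Register(stub("github"))
	r.Register(stub("shodan")) // replaces the first entry, no duplicate
	fmt.Println(r.List())      // prints "[github shodan]"
}
```

Sorting in `List` matters because Go map iteration order is unspecified; without it, CLI output and tests would be nondeterministic.
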
**`pkg/recon/example.go`** ships a deterministic `ExampleSource` that emits two fake findings (openai + anthropic, masked keys, `recon:example` SourceType). It lets Wave 2 CLI work and the dashboard verify the end-to-end pipeline without any network I/O.

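A deterministic source of this kind could be sketched as follows. This is not the shipped `ExampleSource`: `fakeSource`, the `Finding` field names, the masked key strings, and the `collect` helper are all assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Finding is a local stand-in for recon.Finding; the field names are assumptions.
type Finding struct {
	Provider   string
	KeyMasked  string
	SourceType string
}

// fakeSource is a hypothetical source written in the spirit of ExampleSource.
type fakeSource struct{}

func (fakeSource) Name() string { return "example" }

// Sweep emits the same two findings on every call and performs no network I/O.
func (s fakeSource) Sweep(ctx context.Context, query string, out chan<- Finding) error {
	fixed := []Finding{
		{Provider: "openai", KeyMasked: "sk-...0001"},
		{Provider: "anthropic", KeyMasked: "sk-ant-...0002"},
	}
	for _, f := range fixed {
		f.SourceType = "recon:" + s.Name()
		select {
		case out <- f:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// collect runs one sweep with a background context and gathers the output.
func collect(s fakeSource) ([]Finding, error) {
	out := make(chan Finding, 2) // large enough for Sweep to finish unread
	err := s.Sweep(context.Background(), "", out)
	close(out)
	var got []Finding
	for f := range out {
		got = append(got, f)
	}
	return got, err
}

func main() {
	got, err := collect(fakeSource{})
	if err != nil {
		fmt.Println("sweep failed:", err)
		return
	}
	for _, f := range got {
		fmt.Println(f.SourceType, f.Provider, f.KeyMasked)
	}
}
```

Because the output is fixed, a test or a CLI smoke run can assert on exact counts and tags without mocking any HTTP client.
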
**`pkg/recon/engine_test.go`** covers:

- `TestRegisterList`: empty engine, register, idempotent re-register, sorted output.
- `TestSweepAll`: full fanout path via ExampleSource, asserts 2 findings tagged `recon:example` with populated provider/masked/source fields.
- `TestSweepAll_NoSources`: empty registry returns `nil, nil`.
- `TestSweepAll_FiltersDisabled`: sources whose `Enabled()` returns false are excluded.

## Tasks Completed

| Task | Name                           | Commit  | Files                                           |
| ---- | ------------------------------ | ------- | ----------------------------------------------- |
| 1    | ReconSource interface + Config | 10af12d | pkg/recon/source.go                             |
| 2    | Engine + ExampleSource + tests | 851b243 | pkg/recon/engine.go, example.go, engine_test.go |

## Verification

- `go build ./pkg/recon/...`: clean
- `go vet ./pkg/recon/...`: clean
- `go test ./pkg/recon/ -count=1`: PASS (4/4 new tests; existing limiter/robots tests from sibling Wave 1 plans continue to pass)

## Key Decisions

1. **Type alias over new struct.** `type Finding = engine.Finding` makes a recon finding the same Go type as a scan finding rather than a lookalike copy. Storage, verification, and output paths already handle it; sources only tag `SourceType = "recon:<name>"`. This avoids a parallel Finding hierarchy.

2. **Per-sweep pool.** `ants.NewPool(len(active))` is allocated inside `SweepAll` and released via `defer pool.Release()`. With tens of sources this is cheap, and it avoids sharing pool state across concurrent sweeps. A long-lived shared pool can be introduced later if profiling warrants it.

3. **Dedup deferred.** Per 09-CONTEXT.md, `pkg/recon/dedup.go` is owned by plan 09-03. `SweepAll` returns the raw aggregate so the caller can choose when to dedup (batched persistence vs streaming).

4. **Cancellation safety.** On `ctx.Done()` mid-collection, `SweepAll` spawns a detached `for range out {}` drainer before returning `ctx.Err()`. This prevents goroutines inside the ants pool from blocking on `out <- f` after the caller has left.

5. **ExampleSource is a real implementation, not a mock.** It lives in the production package (no `_test.go` suffix) so the CLI (`keyhunter recon list` / `recon full`) and the dashboard can exercise the pipeline end-to-end before any Phase 10-16 source lands. It performs zero network I/O.

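The fanout, aggregation, and cancellation-drain decisions can be sketched together as shown below. This is an approximation under stated assumptions, not the code in `engine.go`: plain goroutines are swapped in for the ants pool so the example needs only the standard library, the `source` interface is cut down to three methods, per-source errors are ignored, and returning `nil` findings on cancellation is a guess. `fixed` and `run` are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Finding and source are simplified stand-ins for the recon types.
type Finding struct{ SourceType string }

type source interface {
	Name() string
	Enabled() bool
	Sweep(ctx context.Context, out chan<- Finding) error
}

// sweepAll fans out to every enabled source and aggregates what they emit.
func sweepAll(ctx context.Context, sources []source) ([]Finding, error) {
	var active []source
	for _, s := range sources {
		if s.Enabled() {
			active = append(active, s)
		}
	}
	if len(active) == 0 {
		return nil, nil
	}

	out := make(chan Finding, len(active))
	var wg sync.WaitGroup
	for _, s := range active {
		wg.Add(1)
		go func(s source) { // the real engine submits this to an ants pool
			defer wg.Done()
			_ = s.Sweep(ctx, out) // errors ignored for brevity
		}(s)
	}
	// Close out once every source has returned so the collector can stop.
	go func() {
		wg.Wait()
		close(out)
	}()

	var all []Finding
	for {
		select {
		case f, ok := <-out:
			if !ok {
				return all, nil
			}
			all = append(all, f)
		case <-ctx.Done():
			// Keep draining so a sender that ignores ctx cannot block forever.
			go func() {
				for range out {
				}
			}()
			return nil, ctx.Err()
		}
	}
}

// fixed is a hypothetical source that emits n findings.
type fixed struct {
	name    string
	enabled bool
	n       int
}

func (f fixed) Name() string  { return f.name }
func (f fixed) Enabled() bool { return f.enabled }
func (f fixed) Sweep(ctx context.Context, out chan<- Finding) error {
	for i := 0; i < f.n; i++ {
		select {
		case out <- Finding{SourceType: "recon:" + f.name}:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// run is a convenience wrapper that sweeps with a background context.
func run(sources ...source) ([]Finding, error) {
	return sweepAll(context.Background(), sources)
}

func main() {
	got, err := run(
		fixed{name: "a", enabled: true, n: 2},
		fixed{name: "b", enabled: false, n: 5},
		fixed{name: "c", enabled: true, n: 1},
	)
	fmt.Println(len(got), err) // prints "3 <nil>"
}
```

The order of findings across sources is not deterministic, so callers and tests should assert on counts or sort the result rather than rely on arrival order.
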
## Deviations from Plan

None. The plan was executed exactly as written. Tests were written RED before the engine/example implementation (TDD per `tdd="true"`), then driven to GREEN. Two tests beyond the plan's stated minimum (`TestSweepAll_NoSources`, `TestSweepAll_FiltersDisabled`) were added to cover the empty-registry and disabled-source branches of `SweepAll`; this is purely additive coverage with no behavior change.

## Known Stubs

- `ExampleSource` is itself a stub by design (documented in 09-CONTEXT.md and in the source file doc comment). It will remain in the package as a reference implementation and CI smoke-test source; real sources arrive in Phases 10-16. This is an intentional stub, not an unfinished task.

## Interfaces Provided to Downstream Plans

```go
// Wave 1 siblings (limiter, stealth, robots) compose orthogonally with Engine:
type ReconSource interface {
	Name() string
	RateLimit() rate.Limit
	Burst() int
	RespectsRobots() bool
	Enabled(cfg Config) bool
	Sweep(ctx context.Context, query string, out chan<- Finding) error
}

// Wave 2 CLI (plan 09-05) will call:
e := recon.NewEngine()
e.Register(shodan.New(...))
e.Register(github.New(...))
findings, err := e.SweepAll(ctx, recon.Config{Stealth: true, Query: "..."})
```

## Self-Check: PASSED

- pkg/recon/source.go: FOUND
- pkg/recon/engine.go: FOUND
- pkg/recon/example.go: FOUND
- pkg/recon/engine_test.go: FOUND
- Commit 10af12d: FOUND
- Commit 851b243: FOUND
- `go test ./pkg/recon/ -count=1`: PASS