2026-04-06 00:44:04 +03:00

phase: 09-osint-infrastructure
plan: 01
subsystem: pkg/recon
tags: recon, osint, interface, engine, ants, fanout

dependency-graph
  requires:
    • pkg/engine.Finding
    • github.com/panjf2000/ants/v2
    • golang.org/x/time/rate
  provides:
    • recon.ReconSource interface
    • recon.Engine (Register/List/SweepAll)
    • recon.Config
    • recon.Finding (alias of engine.Finding)
    • recon.ExampleSource (reference stub)
  affects:
    • Wave 1 siblings (limiter, stealth, robots) compose with Engine
    • Wave 2 CLI plan (09-05) wires Engine into cmd/recon
    • Phases 10-16 implement ReconSource for real sources

tech-stack
  added / patterns:
    • Ants pool for parallel source fanout
    • Type alias (Finding = engine.Finding) for shared storage path
    • Interface-per-source plugin model

key-files
  created / modified:
    • pkg/recon/source.go
    • pkg/recon/engine.go
    • pkg/recon/example.go
    • pkg/recon/engine_test.go

decisions
  • Finding is a type alias of engine.Finding, not a new struct. This keeps storage and verification paths unified; recon sources only need to set SourceType=recon:<name>.
  • Dedup is intentionally NOT done in SweepAll. Plan 09-03 owns pkg/recon/dedup.go; SweepAll only aggregates.
  • Engine sizes the ants pool to len(active) sources. N is small (tens, not thousands), so per-sweep pool allocation is cheap and avoids cross-sweep state.
  • Context cancellation in SweepAll drains the out channel in a detached goroutine to prevent source senders from blocking after cancel.
  • Register is idempotent by Name(): re-registering replaces the existing entry, which guards against double-init loops.

metrics
  duration: ~3m
  completed: 2026-04-05
  tasks: 2
  files: 4

Phase 9 Plan 1: Recon Framework Foundation Summary

ReconSource interface, Engine with ants-pool parallel fanout, and ExampleSource stub — the contract every OSINT source in Phases 10-16 will implement.

What Was Built

pkg/recon/source.go defines the public contract:

  • ReconSource interface: Name() / RateLimit() / Burst() / RespectsRobots() / Enabled(Config) / Sweep(ctx, query, out)
  • Config struct: Stealth, RespectRobots, EnabledSources, Query
  • Finding as a Go type alias of engine.Finding — recon findings flow through the same storage path as file/git/stdin scanning; sources simply set SourceType = "recon:<name>".
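The type-alias point above can be illustrated with a minimal sketch. EngineFinding stands in for engine.Finding, and its fields, the save function, and the tag helper are all hypothetical names chosen for this example; only the alias declaration form and the recon:<name> tagging come from the summary.

```go
package main

import "fmt"

// EngineFinding stands in for engine.Finding; the fields are hypothetical.
type EngineFinding struct {
	Provider   string
	SourceType string
}

// Finding is an alias: it names the very same type as EngineFinding,
// so no conversion is ever needed between the two.
type Finding = EngineFinding

// save models a storage function written against the original type.
func save(f EngineFinding) string {
	return f.SourceType + " " + f.Provider
}

// tag builds the SourceType value a recon source would set.
func tag(name string) string { return "recon:" + name }

func main() {
	f := Finding{Provider: "openai", SourceType: tag("example")}
	fmt.Println(save(f)) // a Finding is accepted where EngineFinding is expected
}
```

Had Finding been declared as a new struct (type Finding struct {...}) instead of an alias, the call to save would need an explicit conversion, which is the parallel hierarchy the plan wanted to avoid.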

pkg/recon/engine.go is the orchestrator:

  • NewEngine() / Register(src) / List() with an RWMutex-guarded map and sorted name listing.
  • SweepAll(ctx, cfg) collects the enabled sources, allocates an ants.Pool sized to len(active), submits one task per source, aggregates findings through a buffered channel, and closes that channel once the WaitGroup completes. If the context is cancelled, a detached drainer goroutine keeps reading the channel so senders never block after cancellation.
  • Deduplication is deliberately deferred to plan 09-03 (dedup.go).

pkg/recon/example.go ships a deterministic ExampleSource that emits two fake findings (openai + anthropic, masked keys, recon:example SourceType). It lets Wave 2 CLI work and the dashboard verify the end-to-end pipeline without any network I/O.

pkg/recon/engine_test.go covers:

  • TestRegisterList — empty engine, register, idempotent re-register, sorted output.
  • TestSweepAll — full fanout path via ExampleSource, asserts 2 findings tagged recon:example with populated provider/masked/source fields.
  • TestSweepAll_NoSources — empty registry returns nil, nil.
  • TestSweepAll_FiltersDisabled — sources whose Enabled() returns false are excluded.

Tasks Completed

Task  Name                            Commit   Files
1     ReconSource interface + Config  10af12d  pkg/recon/source.go
2     Engine + ExampleSource + tests  851b243  pkg/recon/engine.go, example.go, engine_test.go

Verification

  • go build ./pkg/recon/... — clean
  • go vet ./pkg/recon/... — clean
  • go test ./pkg/recon/ -count=1 — PASS (4/4 new tests; existing limiter/robots tests from sibling Wave 1 plans continue to pass)

Key Decisions

  1. Type alias over new struct. type Finding = engine.Finding makes recon.Finding and engine.Finding the same type, so the existing storage, verification, and output paths handle recon findings unchanged; sources only tag SourceType = "recon:<name>". This avoids a parallel Finding hierarchy.

  2. Per-sweep pool. ants.NewPool(len(active)) is allocated inside SweepAll and released via defer pool.Release(). With tens of sources this is cheap and eliminates shared-state bugs across concurrent sweeps. A long-lived shared pool can be introduced later if profiling warrants it.

  3. Dedup deferred. Per 09-CONTEXT.md, pkg/recon/dedup.go is owned by plan 09-03. SweepAll returns the raw aggregate so the caller can choose when to dedup (batched persistence vs streaming).

  4. Cancellation safety. On ctx.Done() mid-collection, SweepAll spawns a detached for range out {} drainer before returning ctx.Err(). This prevents goroutines inside the ants pool from blocking on out <- f after the caller has left.

  5. ExampleSource is a real implementation, not a mock. It lives in the production package (no _test.go suffix) so the CLI (keyhunter recon list / recon full) and the dashboard can exercise the pipeline end-to-end before any Phase 10-16 source lands. It performs zero network I/O.

Deviations from Plan

None affecting behavior; the plan was executed as written. Tests were written RED before the engine/example implementation (TDD per tdd="true"), then driven to GREEN. Two tests beyond the plan's stated minimum (TestSweepAll_NoSources, TestSweepAll_FiltersDisabled) were added to cover the empty-registry and disabled-source branches of SweepAll. This is purely additive coverage with no behavior change.

Known Stubs

  • ExampleSource is itself a stub by design (documented in 09-CONTEXT.md and in the source file doc comment). It will remain in the package as a reference implementation and CI smoke-test source, while the real sources that land in Phases 10-16 take over actual reconnaissance. This is an intentional stub, not an unfinished task.

Interfaces Provided to Downstream Plans

// Wave 1 siblings (limiter, stealth, robots) compose orthogonally with Engine:
type ReconSource interface {
    Name() string
    RateLimit() rate.Limit
    Burst() int
    RespectsRobots() bool
    Enabled(cfg Config) bool
    Sweep(ctx context.Context, query string, out chan<- Finding) error
}

// Wave 2 CLI (plan 09-05) will call:
e := recon.NewEngine()
e.Register(shodan.New(...))
e.Register(github.New(...))
findings, err := e.SweepAll(ctx, recon.Config{Stealth: true, Query: "..."})

Self-Check: PASSED

  • pkg/recon/source.go — FOUND
  • pkg/recon/engine.go — FOUND
  • pkg/recon/example.go — FOUND
  • pkg/recon/engine_test.go — FOUND
  • Commit 10af12d — FOUND
  • Commit 851b243 — FOUND
  • go test ./pkg/recon/ -count=1 — PASS