keyhunter/.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
2026-04-06 01:10:57 +03:00

phase: 10-osint-code-hosting
plan: 01
subsystem: recon/sources
tags: [recon, osint, http, foundation, wave-1]
requires:
  - pkg/recon.Engine (Phase 9)
  - pkg/providers.Registry
  - pkg/recon.LimiterRegistry (Phase 9)
provides:
  - pkg/recon/sources.Client (retry-aware HTTP wrapper)
  - pkg/recon/sources.ErrUnauthorized
  - pkg/recon/sources.ParseRetryAfter
  - pkg/recon/sources.BuildQueries
  - pkg/recon/sources.SourcesConfig
  - pkg/recon/sources.RegisterAll (stub)
affects:
  - pkg/recon/sources (new package)
tech_stack_added: []
patterns:
  - Retry on 429/403/5xx honoring Retry-After; 401 is terminal
  - Context cancellation honored during retry backoff sleeps
  - Provider-driven query generation with per-source syntax switch
key_files_created:
  - pkg/recon/sources/doc.go
  - pkg/recon/sources/httpclient.go
  - pkg/recon/sources/httpclient_test.go
  - pkg/recon/sources/queries.go
  - pkg/recon/sources/queries_test.go
  - pkg/recon/sources/register.go
  - pkg/recon/sources/testhelpers_test.go
key_files_modified: []
decisions:
  - Client handles retry only; callers invoke LimiterRegistry.Wait before Do (single-purpose)
  - github/gist use 'kw' in:file syntax; gitlab/bitbucket/codeberg/huggingface use bare keywords
  - Unknown source names fall back to bare keyword (safe default for future sources)
  - SourcesConfig shipped as placeholder struct so Wave 2 plans can type-depend on its shape
metrics:
  duration_minutes: 4
  tasks_completed: 2
  tests_added: 18
  completed_at: 2026-04-05T22:10:00Z

Phase 10 Plan 01: Recon Sources Foundation Summary

One-liner: A retry-aware HTTP client, a provider-driven query generator, and an empty RegisterAll bootstrap that together let Wave 2 plans 10-02..10-08 run in parallel.

What Was Built

The shared foundation for every Phase 10 code-hosting source now lives in pkg/recon/sources:

  1. Client: wraps *http.Client and retries on 429, 403, and 5xx responses, honoring Retry-After and aborting the backoff sleep when the context is cancelled. A 401 short-circuits to ErrUnauthorized with no retries. Defaults: User-Agent keyhunter-recon/1.0, 30s timeout, 2 retries.
  2. BuildQueries(reg, source): iterates providers.Registry.List(), deduplicates keywords across providers, sorts them for determinism, and applies per-source search syntax via formatQuery. GitHub and Gist get "keyword" in:file; all other sources, including unknown ones, get the bare keyword.
  3. SourcesConfig + RegisterAll: a placeholder struct carrying per-source tokens and the shared Registry/Limiters, plus a no-op registration function with a nil-engine guard. Plan 10-09 will fill in the body after Wave 2 delivers the individual sources.
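The retry behaviour described for Client can be sketched roughly as follows. This is a minimal illustration, not the code in httpclient.go: the retryClient type, its fields, the retryable and runAgainst helpers, and the 10ms fallback backoff are all assumptions, and only the delay-seconds form of Retry-After is handled here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync/atomic"
	"time"
)

// ErrUnauthorized mirrors the sentinel named in the summary; this definition is assumed.
var ErrUnauthorized = errors.New("sources: unauthorized")

// retryClient is a hypothetical stand-in for sources.Client.
type retryClient struct {
	HTTP       *http.Client
	UserAgent  string
	MaxRetries int
	Backoff    time.Duration // assumed fallback when Retry-After is missing
}

// retryable reports whether a status code should be retried (429, 403, 5xx).
func retryable(code int) bool {
	return code == http.StatusTooManyRequests || code == http.StatusForbidden || code >= 500
}

// Get issues a GET request, retrying up to MaxRetries times.
func (c *retryClient) Get(ctx context.Context, url string) (*http.Response, error) {
	lastStatus := 0
	for attempt := 0; attempt <= c.MaxRetries; attempt++ {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("User-Agent", c.UserAgent)
		resp, err := c.HTTP.Do(req)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode == http.StatusUnauthorized {
			resp.Body.Close()
			return nil, ErrUnauthorized // terminal: retrying bad credentials cannot help
		}
		if !retryable(resp.StatusCode) {
			return resp, nil
		}
		lastStatus = resp.StatusCode
		wait := c.Backoff
		if secs, err := strconv.Atoi(resp.Header.Get("Retry-After")); err == nil && secs >= 0 {
			wait = time.Duration(secs) * time.Second
		}
		resp.Body.Close()
		if attempt == c.MaxRetries {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done(): // cancellation interrupts the backoff sleep
			return nil, ctx.Err()
		case <-time.After(wait):
		}
	}
	return nil, fmt.Errorf("retries exhausted, last status %d", lastStatus)
}

// runAgainst serves the given status codes in order (repeating the last one)
// from a local test server and reports the final status and the request count.
func runAgainst(codes ...int) (int, int, error) {
	var calls atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		i := int(calls.Add(1)) - 1
		if i >= len(codes) {
			i = len(codes) - 1
		}
		w.Header().Set("Retry-After", "0")
		w.WriteHeader(codes[i])
	}))
	defer srv.Close()
	c := &retryClient{HTTP: srv.Client(), UserAgent: "keyhunter-recon/1.0", MaxRetries: 2, Backoff: 10 * time.Millisecond}
	resp, err := c.Get(context.Background(), srv.URL)
	if err != nil {
		return 0, int(calls.Load()), err
	}
	defer resp.Body.Close()
	return resp.StatusCode, int(calls.Load()), nil
}

func main() {
	fmt.Println(runAgainst(429, 429, 200)) // succeeds on the third attempt
	fmt.Println(runAgainst(401))           // stops after a single request
	fmt.Println(runAgainst(503))           // gives up after 1 try + 2 retries
}
```

With 2 retries the client makes at most three requests per call, which is why the sketch skips the sleep after the last attempt instead of waiting and then failing.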

Tasks

| # | Name | Commit | Status |
|---|------|--------|--------|
| 1 | Shared retry HTTP client helper | 75024e4 | done |
| 2 | Provider-driven query generator + RegisterAll skeleton | 9273f35 | done |

Tests

All tests green (go test ./pkg/recon/sources/... → PASS in ~3.1s).

HTTP client tests (httptest-backed):

  • OK pass-through, 429 retry, 403 retry, 401 no-retry (ErrUnauthorized), ctx cancel during backoff, retries exhausted, default UA, ParseRetryAfter table.
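For readers unfamiliar with the header, Retry-After may carry either a number of seconds or an HTTP date. The sketch below shows one way a parser covering both forms could look, exercised with a small table. The lowercase parseRetryAfter, its (value, now) signature, and the exampleNow reference time are assumptions made for illustration; the real ParseRetryAfter may be shaped differently.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// exampleNow is a fixed reference time so the demo is deterministic (illustrative only).
var exampleNow = time.Date(2026, 4, 5, 22, 0, 0, 0, time.UTC)

// parseRetryAfter is a hypothetical parser for the Retry-After header value.
// It returns the delay to wait and whether the value was understood.
func parseRetryAfter(v string, now time.Time) (time.Duration, bool) {
	v = strings.TrimSpace(v)
	if v == "" {
		return 0, false
	}
	// Form 1: delay in whole seconds, e.g. "5".
	if secs, err := strconv.Atoi(v); err == nil {
		if secs < 0 {
			return 0, false
		}
		return time.Duration(secs) * time.Second, true
	}
	// Form 2: an HTTP date, e.g. "Sun, 05 Apr 2026 22:00:30 GMT".
	if t, err := http.ParseTime(v); err == nil {
		if d := t.Sub(now); d > 0 {
			return d, true
		}
		return 0, true // the date has already passed, so no wait is needed
	}
	return 0, false
}

func main() {
	cases := []string{"5", "Sun, 05 Apr 2026 22:00:30 GMT", "Sun, 05 Apr 2026 21:00:00 GMT", "", "soon", "-3"}
	for _, in := range cases {
		d, ok := parseRetryAfter(in, exampleNow)
		fmt.Printf("%q -> %v (ok=%v)\n", in, d, ok)
	}
}
```

Passing the current time in as a parameter keeps the date branch testable without sleeping or stubbing the clock.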

Query generator tests:

  • GitHub/Gist in:file syntax, GitLab/HuggingFace bare keywords, unknown-source default, nil registry, cross-provider dedup, empty-keyword skip.
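The behaviours these tests cover (dedup, sort, per-source syntax, empty-keyword skip, unknown-source default) can be illustrated with a small sketch. The provider struct, the lowercase buildQueries taking a plain slice instead of a providers.Registry, and the sample keywords are all hypothetical; only the "keyword" in:file rule for GitHub and Gist and the bare-keyword default come from this summary.

```go
package main

import (
	"fmt"
	"sort"
)

// provider is a hypothetical stand-in for an entry returned by providers.Registry.List().
type provider struct {
	Name     string
	Keywords []string
}

// formatQuery applies the per-source search syntax.
func formatQuery(source, keyword string) string {
	switch source {
	case "github", "gist":
		return `"` + keyword + `" in:file`
	default:
		// gitlab, bitbucket, codeberg, huggingface and any unknown source.
		return keyword
	}
}

// buildQueries deduplicates keywords across providers, skips empty ones,
// sorts them for deterministic output, and formats one query per keyword.
func buildQueries(providers []provider, source string) []string {
	seen := make(map[string]struct{})
	var keywords []string
	for _, p := range providers {
		for _, kw := range p.Keywords {
			if kw == "" {
				continue
			}
			if _, dup := seen[kw]; dup {
				continue
			}
			seen[kw] = struct{}{}
			keywords = append(keywords, kw)
		}
	}
	sort.Strings(keywords)
	queries := make([]string, 0, len(keywords))
	for _, kw := range keywords {
		queries = append(queries, formatQuery(source, kw))
	}
	return queries
}

// sampleProviders returns made-up data with a duplicate and an empty keyword.
func sampleProviders() []provider {
	return []provider{
		{Name: "alpha", Keywords: []string{"sk-proj-", "sk-"}},
		{Name: "beta", Keywords: []string{"sk-ant-"}},
		{Name: "gamma", Keywords: []string{"sk-", ""}},
	}
}

func main() {
	for _, source := range []string{"github", "gitlab", "some-future-source"} {
		fmt.Println(source, buildQueries(sampleProviders(), source))
	}
}
```

Sorting after deduplication means the result does not depend on the order in which providers are listed, which is what makes the output usable as a stable cache key.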

RegisterAll tests:

  • Nil-engine no-panic, empty-cfg no-panic on real engine.

Decisions Made

  • Single-purpose Client: rate limiting is the caller's job via recon.LimiterRegistry.Wait, which keeps retry/backoff logic decoupled from rate policy and avoids coupling 10 sources to a single limiter injection shape.
  • Deterministic queries: sorting keywords means test output is stable and cache keys are reproducible when future plans memoize search results.
  • Placeholder SourcesConfig: Wave 2 plans can write sources.SourcesConfig{...} against a stable shape before Plan 10-09 ships credential loading.
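The first decision implies a calling convention for Wave 2 sources: wait on the limiter, then issue the request. A rough sketch of that ordering is shown below. The limiter and fetcher interfaces, their method signatures, the searchOnce helper, and the recorder fake are all assumptions made for illustration and do not reflect the actual signatures of recon.LimiterRegistry.Wait or Client.Do.

```go
package main

import (
	"context"
	"fmt"
)

// limiter is a hypothetical view of the rate-limit side (assumed signature).
type limiter interface {
	Wait(ctx context.Context, source string) error
}

// fetcher is a hypothetical view of the retry-aware HTTP side (assumed signature).
type fetcher interface {
	Fetch(ctx context.Context, url string) (string, error)
}

// searchOnce shows the ordering a source would follow for each query:
// block on the rate limiter first, then hand off to the retrying client.
func searchOnce(ctx context.Context, lim limiter, f fetcher, source, url string) (string, error) {
	if err := lim.Wait(ctx, source); err != nil {
		return "", fmt.Errorf("rate limit wait for %s: %w", source, err)
	}
	return f.Fetch(ctx, url)
}

// recorder is a fake implementing both interfaces; it logs the call order.
type recorder struct{ trace []string }

func (r *recorder) Wait(ctx context.Context, source string) error {
	if err := ctx.Err(); err != nil {
		return err // a cancelled context fails the wait before any request is made
	}
	r.trace = append(r.trace, "wait:"+source)
	return nil
}

func (r *recorder) Fetch(ctx context.Context, url string) (string, error) {
	r.trace = append(r.trace, "fetch:"+url)
	return "ok", nil
}

// demo runs searchOnce with a live or pre-cancelled context and returns the call trace.
func demo(cancelled bool) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelled {
		cancel()
	}
	r := &recorder{}
	_, err := searchOnce(ctx, r, r, "github", "https://example.invalid/search")
	return r.trace, err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```

Because the two concerns meet only in the caller, either side can be swapped or faked in tests without touching the other.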

Deviations from Plan

None. The plan was executed as written. A small testhelpers_test.go file was added (not listed in files_modified) purely to expose a test-only newTestEngine() helper shared between test files; this is idiomatic Go test scaffolding, not a functional deviation.

Verification

  • go build ./... — clean
  • go vet ./pkg/recon/sources/... — clean
  • go test ./pkg/recon/sources/... -timeout 60s — PASS (3.11s)

Ready For

Wave 2 plans (10-02 GitHub, 10-03 GitLab, 10-04 Bitbucket, 10-05 Gist, 10-06 Codeberg, 10-07 HuggingFace, 10-08 Kaggle/sandboxes) can now import pkg/recon/sources and use Client + BuildQueries in parallel without conflicts. Plan 10-09 will populate RegisterAll with the full source list and wire it into cmd/recon.go.

Self-Check: PASSED

All 7 artifact files present, both commits (75024e4, 9273f35) reachable in git history, SUMMARY.md on disk.