keyhunter/.planning/phases/11-osint_search_paste/11-03-PLAN.md

---
phase: 11-osint-search-paste
plan: 03
type: execute
wave: 2
depends_on:
  - 11-01
  - 11-02
files_modified:
  - pkg/recon/sources/register.go
  - pkg/recon/sources/register_test.go
  - pkg/recon/sources/integration_test.go
  - cmd/recon.go
autonomous: true
requirements:
  - RECON-DORK-01
  - RECON-DORK-02
  - RECON-DORK-03
  - RECON-PASTE-01
must_haves:
  truths:
    - "RegisterAll wires all 8 new Phase 11 sources onto the recon engine alongside the 10 Phase 10 sources"
    - "cmd/recon.go reads Google/Bing/Yandex/Brave API keys from env vars and viper config"
    - "keyhunter recon list shows all 18 sources (10 Phase 10 + 8 Phase 11)"
    - "Integration test with httptest fixtures proves SweepAll emits findings from all 18 source types"
    - "Sources with missing credentials are registered but Enabled()==false"
  artifacts:
    - path: pkg/recon/sources/register.go
      provides: "RegisterAll extended with Phase 11 sources"
      contains: "GoogleDorkSource"
    - path: pkg/recon/sources/register_test.go
      provides: "Guardrail test asserting 18 sources registered"
      contains: "18"
    - path: pkg/recon/sources/integration_test.go
      provides: "SweepAll integration test covering all 18 sources"
      contains: "recon:google"
    - path: cmd/recon.go
      provides: "Credential wiring for search engine API keys"
      contains: "GoogleAPIKey"
  key_links:
    - from: pkg/recon/sources/register.go
      to: pkg/recon/sources/google.go
      via: "RegisterAll calls engine.Register(GoogleDorkSource)"
      pattern: "GoogleDorkSource"
    - from: cmd/recon.go
      to: pkg/recon/sources/register.go
      via: "SourcesConfig credential fields"
      pattern: "GoogleAPIKey|GoogleCX|BingAPIKey|YandexUser|YandexAPIKey|BraveAPIKey"
---

Wire all 8 Phase 11 sources into RegisterAll, extend SourcesConfig with search engine credentials, update cmd/recon.go for env/viper credential lookup, and create the integration test proving all 18 sources work end-to-end via SweepAll.

Purpose: Complete Phase 11 by connecting all new sources to the engine and proving the full 18-source sweep works.

Output: Updated register.go, register_test.go, integration_test.go, cmd/recon.go.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @pkg/recon/sources/register.go @pkg/recon/sources/register_test.go @pkg/recon/sources/integration_test.go @cmd/recon.go

From pkg/recon/sources/register.go (current):

```go
type SourcesConfig struct {
	GitHubToken        string
	GitLabToken        string
	BitbucketToken     string
	BitbucketWorkspace string
	CodebergToken      string
	HuggingFaceToken   string
	KaggleUser         string
	KaggleKey          string
	Registry           *providers.Registry
	Limiters           *recon.LimiterRegistry
}

func RegisterAll(engine *recon.Engine, cfg SourcesConfig)
```

From cmd/recon.go (current):

func buildReconEngine() *recon.Engine  // constructs SourcesConfig, calls RegisterAll
func firstNonEmpty(a, b string) string

New sources from Plan 11-01 (to be registered):

type GoogleDorkSource struct { APIKey, CX, BaseURL string; Registry; Limiters; client }
type BingDorkSource struct { APIKey, BaseURL string; Registry; Limiters; client }
type DuckDuckGoSource struct { BaseURL string; Registry; Limiters; Client }
type YandexSource struct { User, APIKey, BaseURL string; Registry; Limiters; client }
type BraveSource struct { APIKey, BaseURL string; Registry; Limiters; client }

New sources from Plan 11-02 (to be registered):

type PastebinSource struct { BaseURL string; Registry; Limiters; Client }
type GistPasteSource struct { BaseURL string; Registry; Limiters; Client }
type PasteSitesSource struct { Platforms; BaseURL string; Registry; Limiters; Client }

Task 1: Extend SourcesConfig + RegisterAll + cmd/recon.go credential wiring

Files: pkg/recon/sources/register.go, pkg/recon/sources/register_test.go, cmd/recon.go

Behavior:
- SourcesConfig gains 6 new fields: GoogleAPIKey, GoogleCX, BingAPIKey, YandexUser, YandexAPIKey, BraveAPIKey
- RegisterAll registers 18 sources total (10 Phase 10 + 8 Phase 11)
- RegisterAll with nil engine is still a no-op
- TestRegisterAll_WiresAllEighteenSources asserts eng.List() contains all 18 names sorted
- TestRegisterAll_MissingCredsStillRegistered asserts 18 sources with empty config
- buildReconEngine reads: GOOGLE_API_KEY / recon.google.api_key, GOOGLE_CX / recon.google.cx, BING_API_KEY / recon.bing.api_key, YANDEX_USER / recon.yandex.user, YANDEX_API_KEY / recon.yandex.api_key, BRAVE_API_KEY / recon.brave.api_key
- reconCmd Long description updated to mention Phase 11 sources

Update `pkg/recon/sources/register.go`:
- Add to SourcesConfig: GoogleAPIKey, GoogleCX, BingAPIKey, YandexUser, YandexAPIKey, BraveAPIKey (all string)
- Add Phase 11 registrations to RegisterAll after the Phase 10 block:
  ```go
  // Phase 11: Search engine dorking sources.
  engine.Register(&GoogleDorkSource{APIKey: cfg.GoogleAPIKey, CX: cfg.GoogleCX, Registry: reg, Limiters: lim})
  engine.Register(&BingDorkSource{APIKey: cfg.BingAPIKey, Registry: reg, Limiters: lim})
  engine.Register(&DuckDuckGoSource{Registry: reg, Limiters: lim})
  engine.Register(&YandexSource{User: cfg.YandexUser, APIKey: cfg.YandexAPIKey, Registry: reg, Limiters: lim})
  engine.Register(&BraveSource{APIKey: cfg.BraveAPIKey, Registry: reg, Limiters: lim})

  // Phase 11: Paste site sources.
  engine.Register(&PastebinSource{Registry: reg, Limiters: lim})
  engine.Register(&GistPasteSource{Registry: reg, Limiters: lim})
  engine.Register(&PasteSitesSource{Registry: reg, Limiters: lim})
  ```
- Update doc comment on RegisterAll to say "Phase 10 + Phase 11" and total "18 sources"
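
The "registered but disabled" behavior for sources with missing credentials can be pictured with a minimal sketch. Everything here is hypothetical: `Source`, `keyedSource`, and `engineSketch` are simplified stand-ins, and the real recon interfaces (including the actual signature of `Enabled`) may differ.

```go
package main

import "fmt"

// Source is a simplified, hypothetical stand-in for the recon source interface.
type Source interface {
	Name() string
	Enabled() bool
}

// keyedSource mimics a credential-gated source such as a search engine API client.
type keyedSource struct {
	name   string
	apiKey string
}

func (s *keyedSource) Name() string { return s.name }

// Enabled reports false when no API key was supplied (assumed gating rule).
func (s *keyedSource) Enabled() bool { return s.apiKey != "" }

// engineSketch is a hypothetical registry keyed by source name.
type engineSketch struct {
	sources map[string]Source
}

// Register stores the source unconditionally; a nil source is ignored.
func (e *engineSketch) Register(s Source) {
	if s == nil {
		return
	}
	if e.sources == nil {
		e.sources = make(map[string]Source)
	}
	e.sources[s.Name()] = s
}

func main() {
	eng := &engineSketch{}
	eng.Register(&keyedSource{name: "brave"})                      // no credential
	eng.Register(&keyedSource{name: "bing", apiKey: "fake-token"}) // credential set

	fmt.Println(len(eng.sources), eng.sources["brave"].Enabled(), eng.sources["bing"].Enabled())
	// prints: 2 false true
}
```

Registering unconditionally keeps the source count stable regardless of configuration, which is what lets a guardrail test assert an exact total even with an empty config.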

Update `pkg/recon/sources/register_test.go`:
- TestRegisterAll_WiresAllEighteenSources: want list = sorted 18 names: ["bing", "bitbucket", "brave", "codeberg", "codesandbox", "duckduckgo", "gist", "gistpaste", "github", "gitlab", "google", "huggingface", "kaggle", "pastebin", "pastesites", "replit", "sandboxes", "yandex"]
- TestRegisterAll_MissingCredsStillRegistered: assert n == 18
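
As a sanity check on the expected list itself, the following sketch verifies that the 18 names are sorted and unique. `checkNames` is a hypothetical helper for illustration only; the real guardrail test compares against `eng.List()`.

```go
package main

import (
	"fmt"
	"sort"
)

// wantNames is the expected source name list taken from the plan.
var wantNames = []string{
	"bing", "bitbucket", "brave", "codeberg", "codesandbox", "duckduckgo",
	"gist", "gistpaste", "github", "gitlab", "google", "huggingface",
	"kaggle", "pastebin", "pastesites", "replit", "sandboxes", "yandex",
}

// checkNames (hypothetical) validates count, ordering, and uniqueness.
func checkNames(names []string, want int) error {
	if len(names) != want {
		return fmt.Errorf("got %d names, want %d", len(names), want)
	}
	if !sort.StringsAreSorted(names) {
		return fmt.Errorf("names are not sorted")
	}
	seen := make(map[string]bool, len(names))
	for _, n := range names {
		if seen[n] {
			return fmt.Errorf("duplicate name %q", n)
		}
		seen[n] = true
	}
	return nil
}

func main() {
	fmt.Println(checkNames(wantNames, 18))
	// prints: <nil>
}
```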

Update `cmd/recon.go`:
- Add to SourcesConfig construction in buildReconEngine():
  GoogleAPIKey: firstNonEmpty(os.Getenv("GOOGLE_API_KEY"), viper.GetString("recon.google.api_key")),
  GoogleCX:     firstNonEmpty(os.Getenv("GOOGLE_CX"), viper.GetString("recon.google.cx")),
  BingAPIKey:   firstNonEmpty(os.Getenv("BING_API_KEY"), viper.GetString("recon.bing.api_key")),
  YandexUser:   firstNonEmpty(os.Getenv("YANDEX_USER"), viper.GetString("recon.yandex.user")),
  YandexAPIKey: firstNonEmpty(os.Getenv("YANDEX_API_KEY"), viper.GetString("recon.yandex.api_key")),
  BraveAPIKey:  firstNonEmpty(os.Getenv("BRAVE_API_KEY"), viper.GetString("recon.brave.api_key")),
- Update reconCmd.Long to list Phase 11 sources
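
The lookup order above (environment variable first, config value as fallback) can be sketched as follows. The body of `firstNonEmpty` is an assumption inferred from its name and signature, `lookupCredential` is a hypothetical helper, and the plain map stands in for `viper.GetString`.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns a when it is non-empty, otherwise b (assumed behavior).
func firstNonEmpty(a, b string) string {
	if a != "" {
		return a
	}
	return b
}

// lookupCredential (hypothetical) prefers the env var over the config key.
// The config map is a stand-in for viper.GetString.
func lookupCredential(envVar, configKey string, config map[string]string) string {
	return firstNonEmpty(os.Getenv(envVar), config[configKey])
}

func main() {
	config := map[string]string{"recon.brave.api_key": "from-config"}

	os.Unsetenv("BRAVE_API_KEY")
	fmt.Println(lookupCredential("BRAVE_API_KEY", "recon.brave.api_key", config))
	// prints: from-config

	os.Setenv("BRAVE_API_KEY", "from-env")
	fmt.Println(lookupCredential("BRAVE_API_KEY", "recon.brave.api_key", config))
	// prints: from-env
}
```

If both lookups come back empty the field stays empty, which matches the requirement that a source with missing credentials is still registered.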

Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestRegisterAll" -v -count=1 && go build ./cmd/...`

Done: RegisterAll registers 18 sources. cmd/recon.go compiles with credential wiring. Guardrail tests pass.

Task 2: Integration test -- SweepAll across all 18 sources

Files: pkg/recon/sources/integration_test.go

Behavior:
- TestIntegration_AllSources_SweepAll registers all 18 sources with BaseURL overrides pointing at an httptest mux
- SweepAll returns findings from all 18 SourceType values
- Each SourceType (recon:github, recon:gitlab, ..., recon:google, recon:bing, recon:duckduckgo, recon:yandex, recon:brave, recon:pastebin, recon:gistpaste, recon:pastesites) has at least 1 finding

Update `pkg/recon/sources/integration_test.go`:
- Extend the existing httptest mux with handlers for the 8 new sources:

Google Custom Search: mux.HandleFunc("/customsearch/v1", ...) serves JSON `{"items":[{"link":"https://pastebin.com/abc123","title":"leak","snippet":"sk-proj-xxx"}]}`

Bing Web Search: mux.HandleFunc("/v7.0/search", ...) serves JSON `{"webPages":{"value":[{"url":"https://example.com/leak","name":"leak"}]}}`

DuckDuckGo HTML: mux.HandleFunc("/html/", ...) serves HTML with `<a class="result__a" href="https://example.com/ddg-leak">result</a>`

Yandex XML: mux.HandleFunc("/search/xml", ...) serves XML `<yandexsearch><response><results><grouping><group><doc><url>https://example.com/yandex-leak</url></doc></group></grouping></results></response></yandexsearch>`

Brave Search: mux.HandleFunc("/res/v1/web/search", ...) serves JSON `{"web":{"results":[{"url":"https://example.com/brave-leak","title":"leak"}]}}`

Pastebin search + raw: mux.HandleFunc("/pastebin-search", ...) serves HTML with paste links; mux.HandleFunc("/pastebin-raw/", ...) serves raw content with "sk-proj-ABC"

GistPaste search + raw: mux.HandleFunc("/gistpaste-search", ...) serves HTML with gist links; mux.HandleFunc("/gistpaste-raw/", ...) serves raw content with keyword

PasteSites: mux.HandleFunc("/pastesites-search", ...) + mux.HandleFunc("/pastesites-raw/", ...) follow the same search + raw pattern as Pastebin and GistPaste

Register all 18 sources on the engine with BaseURL=srv.URL and fake tokens as credentials for the API-backed sources. Then call eng.SweepAll and assert that the byType map has all 18 SourceType keys.

Update wantTypes to include: "recon:google", "recon:bing", "recon:duckduckgo", "recon:yandex", "recon:brave", "recon:pastebin", "recon:gistpaste", "recon:pastesites"
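
The per-type assertion can be sketched as a small helper that reports which wanted types produced no findings. `missingTypes` and the `map[string]int` shape of `byType` are assumptions for illustration, and only the eight Phase 11 types named above are listed.

```go
package main

import (
	"fmt"
	"sort"
)

// phase11Types are the new SourceType values named in the plan.
var phase11Types = []string{
	"recon:google", "recon:bing", "recon:duckduckgo", "recon:yandex",
	"recon:brave", "recon:pastebin", "recon:gistpaste", "recon:pastesites",
}

// missingTypes (hypothetical) returns the wanted types that have zero findings, sorted.
func missingTypes(byType map[string]int, want []string) []string {
	var missing []string
	for _, t := range want {
		if byType[t] == 0 {
			missing = append(missing, t)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Simulate a sweep in which every type except recon:yandex emitted a finding.
	byType := make(map[string]int)
	for _, t := range phase11Types {
		byType[t] = 1
	}
	delete(byType, "recon:yandex")

	fmt.Println(missingTypes(byType, phase11Types))
	// prints: [recon:yandex]
}
```

Reporting the missing types by name, rather than only comparing counts, makes a failing sweep point directly at the source whose fixture or parser is broken.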

Keep the existing 10 Phase 10 source fixtures and registrations intact.

Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestIntegration_AllSources" -v -count=1 -timeout=60s`

Done: Integration test proves SweepAll emits findings from all 18 sources. Full Phase 11 wiring confirmed end-to-end.

Full Phase 11 verification:

```bash
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -v -count=1 -timeout=120s && go build ./cmd/...
```

<success_criteria>

- RegisterAll registers 18 sources (10 Phase 10 + 8 Phase 11)
- cmd/recon.go compiles with all credential wiring
- Integration test passes with all 18 SourceTypes emitting findings
- go build ./cmd/... succeeds
- Guardrail test asserts exact 18-source name list

</success_criteria>

After completion, create `.planning/phases/11-osint_search_paste/11-03-SUMMARY.md`