keyhunter/.planning/phases/10-osint-code-hosting/10-09-PLAN.md

---
phase: 10-osint-code-hosting
plan: 09
type: execute
wave: 3
depends_on:
  - 10-01
  - 10-02
  - 10-03
  - 10-04
  - 10-05
  - 10-06
  - 10-07
  - 10-08
files_modified:
  - pkg/recon/sources/register.go
  - pkg/recon/sources/register_test.go
  - pkg/recon/sources/integration_test.go
  - cmd/recon.go
autonomous: true
requirements: []
must_haves:
  truths:
    - "RegisterAll wires all 10 Phase 10 sources onto a recon.Engine"
    - "cmd/recon.go buildReconEngine() reads viper config + env vars for tokens and calls RegisterAll"
    - "Integration test spins up httptest servers for all sources, runs SweepAll via Engine, asserts Findings from each source arrive with correct SourceType"
    - "Guardrail: enabling a source without its required credential logs a skip but does not error"
  artifacts:
    - path: "pkg/recon/sources/register.go"
      provides: "RegisterAll with 10 source constructors wired"
      contains: "engine.Register"
    - path: "pkg/recon/sources/integration_test.go"
      provides: "End-to-end SweepAll test with httptest fixtures for every source"
    - path: "cmd/recon.go"
      provides: "CLI reads config and invokes sources.RegisterAll"
  key_links:
    - from: "cmd/recon.go"
      to: "pkg/recon/sources.RegisterAll"
      via: "sources.RegisterAll(eng, cfg)"
      pattern: "sources.RegisterAll"
    - from: "pkg/recon/sources/register.go"
      to: "pkg/recon.Engine.Register"
      via: "engine.Register(source)"
      pattern: "engine.Register"
---

Final Wave 3 plan: wire every Phase 10 source into `sources.RegisterAll`, update `cmd/recon.go` to construct a real `SourcesConfig` from viper/env, and add an end-to-end integration test that drives all 10 sources through recon.Engine.SweepAll using httptest fixtures.

Purpose: Users can run `keyhunter recon full --sources=github,gitlab,...` and get actual findings from any Phase 10 source whose credential is configured, or which needs no credential.

Output: Wired `register.go` + `cmd/recon.go` + passing integration test.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-02-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-03-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-04-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-05-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-06-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-07-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-08-SUMMARY.md
@pkg/recon/engine.go
@pkg/recon/source.go
@pkg/providers/registry.go
@cmd/recon.go

After Wave 2, each source file in pkg/recon/sources/ exports a constructor roughly of the form:

- `func NewGitHubSource(token, reg, lim) *GitHubSource`
- `func NewGitLabSource(token, reg, lim) *GitLabSource`
- `func NewBitbucketSource(token, workspace, reg, lim) *BitbucketSource`
- `func NewGistSource(token, reg, lim) *GistSource`
- `func NewCodebergSource(token, reg, lim) *CodebergSource`
- `func NewHuggingFaceSource(token, reg, lim) *HuggingFaceSource`
- `func NewReplitSource(reg, lim) *ReplitSource`
- `func NewCodeSandboxSource(reg, lim) *CodeSandboxSource`
- `func NewSandboxesSource(reg, lim) *SandboxesSource`
- `func NewKaggleSource(user, key, reg, lim) *KaggleSource`

(Verify actual signatures when reading Wave 2 SUMMARYs before writing register.go.)

Task 1: Wire RegisterAll + register_test.go

Files: pkg/recon/sources/register.go, pkg/recon/sources/register_test.go

Behavior:
- Test A: RegisterAll with a fresh engine and empty SourcesConfig registers all 10 sources by name (GitHub/GitLab/Bitbucket/Gist/Codeberg/HuggingFace/Replit/CodeSandbox/Sandboxes/Kaggle)
- Test B: engine.List() returns all 10 source names in sorted order
- Test C: Calling RegisterAll(nil, cfg) is a no-op (no panic)
- Test D: Sources without creds are still registered but their Enabled() returns false

Action: Rewrite `pkg/recon/sources/register.go` RegisterAll body to construct each source with appropriate fields from SourcesConfig and call engine.Register:

```go
func RegisterAll(engine *recon.Engine, cfg SourcesConfig) {
	// A nil engine is a no-op rather than a panic (Test C).
	if engine == nil {
		return
	}
	reg := cfg.Registry
	lim := cfg.Limiters

	// Token-authenticated sources. Gist reuses the GitHub token.
	engine.Register(NewGitHubSource(cfg.GitHubToken, reg, lim))
	engine.Register(NewGitLabSource(cfg.GitLabToken, reg, lim))
	engine.Register(NewBitbucketSource(cfg.BitbucketToken, cfg.BitbucketWorkspace, reg, lim))
	engine.Register(NewGistSource(cfg.GitHubToken, reg, lim))
	engine.Register(NewCodebergSource(cfg.CodebergToken, reg, lim))
	engine.Register(NewHuggingFaceSource(cfg.HuggingFaceToken, reg, lim))

	// Sources constructed without credentials.
	engine.Register(NewReplitSource(reg, lim))
	engine.Register(NewCodeSandboxSource(reg, lim))
	engine.Register(NewSandboxesSource(reg, lim))

	// Kaggle needs both a username and an API key.
	engine.Register(NewKaggleSource(cfg.KaggleUser, cfg.KaggleKey, reg, lim))
}
```
Extend SourcesConfig with any fields Wave 2 introduced (BitbucketWorkspace,
CodebergToken). Adjust field names to actual Wave 2 SUMMARY signatures.
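
Test D and the guardrail truth both rest on the idea that a source is always registered and only its `Enabled()` result depends on credentials. The following is a minimal sketch of that idea, not the Wave 2 code: the `Source` interface, `tokenSource`, `openSource`, and `runnable` are hypothetical stand-ins, and the real `Enabled()` in pkg/recon/source.go may take arguments.

```go
package main

import "fmt"

// Source is a hypothetical stand-in for the recon source interface; the real
// definition lives in pkg/recon/source.go and may differ.
type Source interface {
	Name() string
	Enabled() bool
}

// tokenSource models a source that requires a credential (for example GitHub).
type tokenSource struct {
	name  string
	token string
}

func (s tokenSource) Name() string  { return s.name }
func (s tokenSource) Enabled() bool { return s.token != "" }

// openSource models a source that needs no credential (for example Replit).
type openSource struct{ name string }

func (s openSource) Name() string  { return s.name }
func (s openSource) Enabled() bool { return true }

// runnable splits registered sources into those to sweep and those to skip.
// A skipped source is reported, never treated as an error.
func runnable(all []Source) (run, skipped []string) {
	for _, s := range all {
		if s.Enabled() {
			run = append(run, s.Name())
		} else {
			skipped = append(skipped, s.Name())
		}
	}
	return run, skipped
}

func main() {
	all := []Source{
		tokenSource{name: "github"}, // no token configured
		tokenSource{name: "gitlab", token: "placeholder-token"},
		openSource{name: "replit"},
	}
	run, skipped := runnable(all)
	fmt.Println("run:", run)
	fmt.Println("skipped:", skipped)
}
```

Keeping registration unconditional means `engine.List()` stays stable regardless of which tokens are present, which is what Tests A and B depend on.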

Create `pkg/recon/sources/register_test.go`:
- Build minimal registry via providers.NewRegistryFromProviders with 1 synthetic provider
- Build recon.Engine, call RegisterAll with cfg having all creds empty
- Assert eng.List() returns exactly these 10 names:
  bitbucket, codeberg, codesandbox, gist, github, gitlab, huggingface, kaggle, replit, sandboxes
- Assert nil engine call is no-op (no panic)
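
The shape of those assertions can be sketched without the real packages. Here `engine`, `registerAll`, and `sameNames` are hypothetical stand-ins for `recon.Engine`, `sources.RegisterAll`, and a test helper; only the ten expected names come from this plan.

```go
package main

import (
	"fmt"
	"sort"
)

// engine is a hypothetical stand-in for recon.Engine exposing only the two
// methods this plan relies on: Register and List.
type engine struct{ names map[string]bool }

func newEngine() *engine { return &engine{names: map[string]bool{}} }

func (e *engine) Register(name string) { e.names[name] = true }

// List returns the registered names in sorted order.
func (e *engine) List() []string {
	out := make([]string, 0, len(e.names))
	for n := range e.names {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

// wantNames is the sorted list the plan expects from eng.List().
var wantNames = []string{
	"bitbucket", "codeberg", "codesandbox", "gist", "github",
	"gitlab", "huggingface", "kaggle", "replit", "sandboxes",
}

// registerAll mirrors the nil guard and registration order of RegisterAll.
func registerAll(e *engine) {
	if e == nil {
		return
	}
	for _, n := range []string{
		"github", "gitlab", "bitbucket", "gist", "codeberg",
		"huggingface", "replit", "codesandbox", "sandboxes", "kaggle",
	} {
		e.Register(n)
	}
}

// sameNames reports whether two string slices are element-wise equal.
func sameNames(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	registerAll(nil) // must not panic
	e := newEngine()
	registerAll(e)
	fmt.Println("names match:", sameNames(e.List(), wantNames))
}
```

In the real test the same comparison would run against `eng.List()` after `RegisterAll(eng, cfg)` with every credential left empty.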
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestRegisterAll -v -timeout 30s`

Done: RegisterAll wires all 10 sources; register_test green.

Task 2: Integration test across all sources + cmd/recon.go wiring

Files: pkg/recon/sources/integration_test.go, cmd/recon.go

Behavior:
- Integration test: spins up 10 httptest servers (or one multiplexed server with per-path routing) that return canned responses for each source's endpoints
- Uses BaseURL overrides on each source (direct construction, not RegisterAll, since RegisterAll uses production URLs)
- Registers each override-configured source on a fresh recon.Engine and calls SweepAll
- Asserts at least 1 Finding emerged for each of the 10 SourceType values: recon:github, recon:gitlab, recon:bitbucket, recon:gist, recon:codeberg, recon:huggingface, recon:replit, recon:codesandbox, recon:sandboxes, recon:kaggle
- CLI: `keyhunter recon list` (after wiring) prints all 10 source names in addition to "example"

Action: Create `pkg/recon/sources/integration_test.go`:
- Build a single httptest server with a mux routing per-path:
  - `/search/code` (github) → ghSearchResponse JSON
  - `/api/v4/search` (gitlab) → blob array JSON
  - `/2.0/workspaces/ws/search/code` (bitbucket) → values JSON
  - `/gists/public` + `/raw/gist1` (gist) → gist list + raw matching keyword
  - `/api/v1/repos/search` (codeberg) → data array
  - `/api/spaces`, `/api/models` (huggingface) → id arrays
  - `/search?q=...&type=repls` (replit) → HTML fixture
  - `/search?query=...&type=sandboxes` (codesandbox) → HTML fixture
  - `/codepen-search` (sandboxes sub) → HTML; `/jsfiddle-search` → JSON
  - `/api/v1/kernels/list` (kaggle) → ref array
- The replit and codesandbox fixtures share the `/search` path, so that handler must branch on the `type` query parameter
- For each source, construct with BaseURL/Platforms overrides pointing at test server
- Register all on a fresh recon.Engine
- Provide synthetic providers.Registry with keyword "sk-proj-" matching openai
- Call eng.SweepAll(ctx, recon.Config{Query:"ignored"})
- Assert findings grouped by SourceType covers all 10 expected values
- Use a 30s test timeout
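
A single multiplexed fixture server might look like the sketch below. It covers only three of the routes, and the response bodies are placeholders rather than the real API shapes; `newFixtureMux` and `fixtureBody` are hypothetical helper names. `http.ServeMux` matches on the path alone, so the two `/search` fixtures are told apart by the `type` query parameter.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newFixtureMux routes a subset of the planned endpoints to canned bodies.
// The bodies are placeholders; real fixtures would mirror each API's JSON/HTML.
func newFixtureMux() *http.ServeMux {
	mux := http.NewServeMux()

	// github: exact path match.
	mux.HandleFunc("/search/code", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"fixture":"github"}`)
	})

	// replit and codesandbox both use /search, so dispatch on ?type=.
	mux.HandleFunc("/search", func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Query().Get("type") {
		case "repls":
			io.WriteString(w, "<html>replit fixture</html>")
		case "sandboxes":
			io.WriteString(w, "<html>codesandbox fixture</html>")
		default:
			http.NotFound(w, r)
		}
	})
	return mux
}

// fixtureBody starts a throwaway server, GETs pathAndQuery, and returns the body.
func fixtureBody(pathAndQuery string) (string, error) {
	srv := httptest.NewServer(newFixtureMux())
	defer srv.Close()

	resp, err := http.Get(srv.URL + pathAndQuery)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	for _, p := range []string{
		"/search/code",
		"/search?q=x&type=repls",
		"/search?query=x&type=sandboxes",
	} {
		body, err := fixtureBody(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", p, body)
	}
}
```

In the integration test the server would be created once, and each source's BaseURL override would point at `srv.URL`.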
Update `cmd/recon.go`:
- Import `github.com/salvacybersec/keyhunter/pkg/recon/sources`, `github.com/spf13/viper`, and the providers package (plus `fmt` and `os`, which `buildReconEngine()` uses, if they are not already imported)
- In `buildReconEngine()`:
  ```go
  func buildReconEngine() *recon.Engine {
      e := recon.NewEngine()
      e.Register(recon.ExampleSource{})
      reg, err := providers.NewRegistry()
      if err != nil {
          fmt.Fprintf(os.Stderr, "recon: failed to load providers: %v\n", err)
          return e
      }
      cfg := sources.SourcesConfig{
          Registry:           reg,
          Limiters:           recon.NewLimiterRegistry(),
          GitHubToken:        firstNonEmpty(os.Getenv("GITHUB_TOKEN"), viper.GetString("recon.github.token")),
          GitLabToken:        firstNonEmpty(os.Getenv("GITLAB_TOKEN"), viper.GetString("recon.gitlab.token")),
          BitbucketToken:     firstNonEmpty(os.Getenv("BITBUCKET_TOKEN"), viper.GetString("recon.bitbucket.token")),
          BitbucketWorkspace: viper.GetString("recon.bitbucket.workspace"),
          CodebergToken:      firstNonEmpty(os.Getenv("CODEBERG_TOKEN"), viper.GetString("recon.codeberg.token")),
          HuggingFaceToken:   firstNonEmpty(os.Getenv("HUGGINGFACE_TOKEN"), viper.GetString("recon.huggingface.token")),
          KaggleUser:         firstNonEmpty(os.Getenv("KAGGLE_USERNAME"), viper.GetString("recon.kaggle.username")),
          KaggleKey:          firstNonEmpty(os.Getenv("KAGGLE_KEY"), viper.GetString("recon.kaggle.key")),
      }
      sources.RegisterAll(e, cfg)
      return e
  }

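  // firstNonEmpty returns a when it is set and b otherwise, so environment
  // variables take precedence over viper config values.
  func firstNonEmpty(a, b string) string {
      if a != "" {
          return a
      }
      return b
  }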
  ```
- Preserve existing reconFullCmd / reconListCmd behavior.
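
The precedence rule in `buildReconEngine()` (environment variable first, viper config second) can be tried in isolation. In this sketch a plain map stands in for viper, and `resolveToken` and the `KEYHUNTER_SKETCH_TOKEN` variable are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns a when it is non-empty and b otherwise.
func firstNonEmpty(a, b string) string {
	if a != "" {
		return a
	}
	return b
}

// resolveToken prefers the environment variable envKey and falls back to the
// config value under cfgKey. The map is a stand-in for viper.GetString.
func resolveToken(envKey, cfgKey string, cfg map[string]string) string {
	return firstNonEmpty(os.Getenv(envKey), cfg[cfgKey])
}

func main() {
	cfg := map[string]string{"recon.github.token": "from-config"}
	const envKey = "KEYHUNTER_SKETCH_TOKEN" // hypothetical variable for the demo

	os.Unsetenv(envKey)
	fmt.Println(resolveToken(envKey, "recon.github.token", cfg)) // config value wins

	os.Setenv(envKey, "from-env")
	fmt.Println(resolveToken(envKey, "recon.github.token", cfg)) // env value wins
}
```

When neither side is set the result is the empty string, which is the case the credential guardrail is meant to handle with a logged skip.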
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestIntegration -v -timeout 60s && go build ./... && go run . recon list | sort`

Done: Integration test passes with at least one Finding per SourceType across all 10 sources. `keyhunter recon list` prints all 10 source names plus "example".

Verification:
- `go build ./...`
- `go vet ./...`
- `go test ./pkg/recon/sources/... -v -timeout 60s`
- `go test ./pkg/recon/... -timeout 60s` (ensure no regression in Phase 9 recon tests)
- `go run . recon list` prints all 10 new source names
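
The "at least one Finding per SourceType" criterion could be checked with a small coverage helper along these lines. `Finding` here is a hypothetical stand-in carrying only a `SourceType` field, and `missingSourceTypes` is an illustrative name; the ten expected values are the ones listed in this plan.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical stand-in for the recon finding type; only the
// SourceType field matters for this check.
type Finding struct {
	SourceType string
}

// expectedSourceTypes lists the ten SourceType values named in this plan.
var expectedSourceTypes = []string{
	"recon:github", "recon:gitlab", "recon:bitbucket", "recon:gist",
	"recon:codeberg", "recon:huggingface", "recon:replit",
	"recon:codesandbox", "recon:sandboxes", "recon:kaggle",
}

// missingSourceTypes returns, in sorted order, every expected SourceType that
// has no finding. An empty result means every source produced a finding.
func missingSourceTypes(findings []Finding, expected []string) []string {
	seen := make(map[string]bool, len(findings))
	for _, f := range findings {
		seen[f.SourceType] = true
	}
	var missing []string
	for _, st := range expected {
		if !seen[st] {
			missing = append(missing, st)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	findings := []Finding{
		{SourceType: "recon:github"},
		{SourceType: "recon:gitlab"},
	}
	fmt.Println("missing:", missingSourceTypes(findings, expectedSourceTypes))
}
```

Reporting the missing types by name, rather than only a count, makes a failing integration run point directly at the source whose fixture or parser is broken.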

<success_criteria> All Phase 10 code hosting sources registered via sources.RegisterAll, wired into cmd/recon.go, and exercised end-to-end by an integration test hitting httptest fixtures for every source. Phase 10 requirements RECON-CODE-01..10 complete. </success_criteria>

After completion, create `.planning/phases/10-osint-code-hosting/10-09-SUMMARY.md`.