---
phase: 10-osint-code-hosting
plan: 09
type: execute
wave: 3
depends_on: [10-01, 10-02, 10-03, 10-04, 10-05, 10-06, 10-07, 10-08]
files_modified:
  - pkg/recon/sources/register.go
  - pkg/recon/sources/register_test.go
  - pkg/recon/sources/integration_test.go
  - cmd/recon.go
autonomous: true
requirements: []
must_haves:
  truths:
    - "RegisterAll wires all 10 Phase 10 sources onto a recon.Engine"
    - "cmd/recon.go buildReconEngine() reads viper config + env vars for tokens and calls RegisterAll"
    - "Integration test spins up httptest servers for all sources, runs SweepAll via Engine, asserts Findings from each source arrive with correct SourceType"
    - "Guardrail: enabling a source without its required credential logs a skip but does not error"
  artifacts:
    - path: "pkg/recon/sources/register.go"
      provides: "RegisterAll with 10 source constructors wired"
      contains: "engine.Register"
    - path: "pkg/recon/sources/integration_test.go"
      provides: "End-to-end SweepAll test with httptest fixtures for every source"
    - path: "cmd/recon.go"
      provides: "CLI reads config and invokes sources.RegisterAll"
  key_links:
    - from: "cmd/recon.go"
      to: "pkg/recon/sources.RegisterAll"
      via: "sources.RegisterAll(eng, cfg)"
      pattern: "sources\\.RegisterAll"
    - from: "pkg/recon/sources/register.go"
      to: "pkg/recon.Engine.Register"
      via: "engine.Register(source)"
      pattern: "engine\\.Register"
---

<objective>
Final Wave 3 plan: wire every Phase 10 source into `sources.RegisterAll`, update
`cmd/recon.go` to construct a real `SourcesConfig` from viper/env, and add an
end-to-end integration test that drives all 10 sources through recon.Engine.SweepAll
using httptest fixtures.

Purpose: Users can run `keyhunter recon full --sources=github,gitlab,...` and get
actual findings from any Phase 10 source whose credential is configured.
Output: Wired register.go + cmd/recon.go + passing integration test.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-02-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-03-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-04-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-05-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-06-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-07-SUMMARY.md
@.planning/phases/10-osint-code-hosting/10-08-SUMMARY.md
@pkg/recon/engine.go
@pkg/recon/source.go
@pkg/providers/registry.go
@cmd/recon.go

<interfaces>
After Wave 2, each source file in pkg/recon/sources/ exports a constructor
roughly of the form:
  func NewGitHubSource(token, reg, lim) *GitHubSource
  func NewGitLabSource(token, reg, lim) *GitLabSource
  func NewBitbucketSource(token, workspace, reg, lim) *BitbucketSource
  func NewGistSource(token, reg, lim) *GistSource
  func NewCodebergSource(token, reg, lim) *CodebergSource
  func NewHuggingFaceSource(token, reg, lim) *HuggingFaceSource
  func NewReplitSource(reg, lim) *ReplitSource
  func NewCodeSandboxSource(reg, lim) *CodeSandboxSource
  func NewSandboxesSource(reg, lim) *SandboxesSource
  func NewKaggleSource(user, key, reg, lim) *KaggleSource

(Verify actual signatures when reading Wave 2 SUMMARYs before writing register.go.)
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Wire RegisterAll + register_test.go</name>
<files>pkg/recon/sources/register.go, pkg/recon/sources/register_test.go</files>
<behavior>
- Test A: RegisterAll with a fresh engine and empty SourcesConfig registers all 10 sources by name (GitHub/GitLab/Bitbucket/Gist/Codeberg/HuggingFace/Replit/CodeSandbox/Sandboxes/Kaggle)
- Test B: engine.List() returns all 10 source names in sorted order
- Test C: Calling RegisterAll(nil, cfg) is a no-op (no panic)
- Test D: Sources without creds are still registered but their Enabled() returns false
</behavior>
<action>
Rewrite the body of RegisterAll in `pkg/recon/sources/register.go` so that it
constructs each source from the matching SourcesConfig fields and calls engine.Register:
```go
// RegisterAll registers every Phase 10 source on engine.
// A nil engine is a no-op.
func RegisterAll(engine *recon.Engine, cfg SourcesConfig) {
	if engine == nil {
		return
	}
	reg := cfg.Registry
	lim := cfg.Limiters
	engine.Register(NewGitHubSource(cfg.GitHubToken, reg, lim))
	engine.Register(NewGitLabSource(cfg.GitLabToken, reg, lim))
	engine.Register(NewBitbucketSource(cfg.BitbucketToken, cfg.BitbucketWorkspace, reg, lim))
	// Gist reuses the GitHub token.
	engine.Register(NewGistSource(cfg.GitHubToken, reg, lim))
	engine.Register(NewCodebergSource(cfg.CodebergToken, reg, lim))
	engine.Register(NewHuggingFaceSource(cfg.HuggingFaceToken, reg, lim))
	engine.Register(NewReplitSource(reg, lim))
	engine.Register(NewCodeSandboxSource(reg, lim))
	engine.Register(NewSandboxesSource(reg, lim))
	engine.Register(NewKaggleSource(cfg.KaggleUser, cfg.KaggleKey, reg, lim))
}
```

Extend SourcesConfig with any fields Wave 2 introduced (BitbucketWorkspace,
CodebergToken). Adjust field names to match the actual signatures in the Wave 2 SUMMARYs.

Create `pkg/recon/sources/register_test.go`:
- Build minimal registry via providers.NewRegistryFromProviders with 1 synthetic provider
- Build recon.Engine, call RegisterAll with cfg having all creds empty
- Assert eng.List() returns exactly these 10 names:
  bitbucket, codeberg, codesandbox, gist, github, gitlab, huggingface, kaggle, replit, sandboxes
- Assert nil engine call is no-op (no panic)
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestRegisterAll -v -timeout 30s</automated>
</verify>
<done>
RegisterAll wires all 10 sources; register_test green.
</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: Integration test across all sources + cmd/recon.go wiring</name>
<files>pkg/recon/sources/integration_test.go, cmd/recon.go</files>
<behavior>
- Integration test: spins up 10 httptest servers (or one multiplexed server with per-path routing) that return canned responses for each source's endpoints
- Uses BaseURL overrides on each source (direct construction, not RegisterAll, since RegisterAll uses production URLs)
- Registers each override-configured source on a fresh recon.Engine and calls SweepAll
- Asserts at least 1 Finding emerged for each of the 10 SourceType values: recon:github, recon:gitlab, recon:bitbucket, recon:gist, recon:codeberg, recon:huggingface, recon:replit, recon:codesandbox, recon:sandboxes, recon:kaggle
- CLI: `keyhunter recon list` (after wiring) prints all 10 source names in addition to "example"
</behavior>
<action>
Create `pkg/recon/sources/integration_test.go`:
- Build a single httptest server with a mux routing per-path:
  - `/search/code` (github) → ghSearchResponse JSON
  - `/api/v4/search` (gitlab) → blob array JSON
  - `/2.0/workspaces/ws/search/code` (bitbucket) → values JSON
  - `/gists/public` + `/raw/gist1` (gist) → gist list + raw matching keyword
  - `/api/v1/repos/search` (codeberg) → data array
  - `/api/spaces`, `/api/models` (huggingface) → id arrays
  - `/search?q=...&type=repls` (replit) → HTML fixture
  - `/search?query=...&type=sandboxes` (codesandbox) → HTML fixture
  - `/codepen-search` (sandboxes sub) → HTML; `/jsfiddle-search` → JSON
  - `/api/v1/kernels/list` (kaggle) → ref array
- For each source, construct with BaseURL/Platforms overrides pointing at test server
- Register all on a fresh recon.Engine
- Provide synthetic providers.Registry with keyword "sk-proj-" matching openai
- Call eng.SweepAll(ctx, recon.Config{Query:"ignored"})
- Assert findings grouped by SourceType covers all 10 expected values
- Use a 30s test timeout

Update `cmd/recon.go`:
- Import `github.com/salvacybersec/keyhunter/pkg/recon/sources`, `github.com/spf13/viper`, and the providers package
- In `buildReconEngine()`:
```go
func buildReconEngine() *recon.Engine {
	e := recon.NewEngine()
	e.Register(recon.ExampleSource{})
	reg, err := providers.NewRegistry()
	if err != nil {
		// Fall back to the example-only engine if providers cannot be loaded.
		fmt.Fprintf(os.Stderr, "recon: failed to load providers: %v\n", err)
		return e
	}
	// Environment variables take precedence over viper config values.
	cfg := sources.SourcesConfig{
		Registry:           reg,
		Limiters:           recon.NewLimiterRegistry(),
		GitHubToken:        firstNonEmpty(os.Getenv("GITHUB_TOKEN"), viper.GetString("recon.github.token")),
		GitLabToken:        firstNonEmpty(os.Getenv("GITLAB_TOKEN"), viper.GetString("recon.gitlab.token")),
		BitbucketToken:     firstNonEmpty(os.Getenv("BITBUCKET_TOKEN"), viper.GetString("recon.bitbucket.token")),
		BitbucketWorkspace: viper.GetString("recon.bitbucket.workspace"),
		CodebergToken:      firstNonEmpty(os.Getenv("CODEBERG_TOKEN"), viper.GetString("recon.codeberg.token")),
		HuggingFaceToken:   firstNonEmpty(os.Getenv("HUGGINGFACE_TOKEN"), viper.GetString("recon.huggingface.token")),
		KaggleUser:         firstNonEmpty(os.Getenv("KAGGLE_USERNAME"), viper.GetString("recon.kaggle.username")),
		KaggleKey:          firstNonEmpty(os.Getenv("KAGGLE_KEY"), viper.GetString("recon.kaggle.key")),
	}
	sources.RegisterAll(e, cfg)
	return e
}

// firstNonEmpty returns a if it is non-empty, otherwise b.
func firstNonEmpty(a, b string) string {
	if a != "" {
		return a
	}
	return b
}
```
- Preserve existing reconFullCmd / reconListCmd behavior.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestIntegration -v -timeout 60s && go build ./... && go run . recon list | sort</automated>
</verify>
<done>
Integration test passes with at least one Finding per SourceType across all 10
sources. `keyhunter recon list` prints all 10 source names plus "example".
</done>
</task>

</tasks>

<verification>
- `go build ./...`
- `go vet ./...`
- `go test ./pkg/recon/sources/... -v -timeout 60s`
- `go test ./pkg/recon/... -timeout 60s` (ensure no regression in Phase 9 recon tests)
- `go run . recon list` prints all 10 new source names
</verification>

<success_criteria>
All Phase 10 code hosting sources registered via sources.RegisterAll, wired into
cmd/recon.go, and exercised end-to-end by an integration test hitting httptest
fixtures for every source. Phase 10 requirements RECON-CODE-01..10 complete.
</success_criteria>

<output>
After completion, create `.planning/phases/10-osint-code-hosting/10-09-SUMMARY.md`.
</output>