diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index ec6207b..295c67b 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -270,7 +270,12 @@ Plans:
3. `keyhunter recon --sources=dockerhub` extracts and scans image layers and build args from public Docker Hub images
4. `keyhunter recon --sources=k8s` discovers publicly exposed Kubernetes dashboards and scans publicly readable Secret/ConfigMap objects
5. `keyhunter recon --sources=terraform,helm,ansible` scans Terraform registry modules, Helm chart repositories, and Ansible Galaxy roles
-**Plans**: TBD
+**Plans**: 4 plans
+Plans:
+- [ ] 13-01-PLAN.md — NpmSource + PyPISource + CratesIOSource + RubyGemsSource (RECON-PKG-01, RECON-PKG-02)
+- [ ] 13-02-PLAN.md — MavenSource + NuGetSource + GoProxySource + PackagistSource (RECON-PKG-02, RECON-PKG-03)
+- [ ] 13-03-PLAN.md — DockerHubSource + KubernetesSource + TerraformSource + HelmSource (RECON-INFRA-01..04)
+- [ ] 13-04-PLAN.md — RegisterAll wiring + integration test (all Phase 13 reqs)
### Phase 14: OSINT CI/CD Logs, Web Archives & Frontend Leaks
**Goal**: Users can scan public CI/CD build logs, historical web snapshots from the Wayback Machine and CommonCrawl, and frontend JavaScript artifacts (source maps, webpack bundles, exposed .env files) for leaked API keys
diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-01-PLAN.md b/.planning/phases/13-osint_package_registries_container_iac/13-01-PLAN.md
new file mode 100644
index 0000000..f173ed0
--- /dev/null
+++ b/.planning/phases/13-osint_package_registries_container_iac/13-01-PLAN.md
@@ -0,0 +1,235 @@
+---
+phase: 13-osint_package_registries_container_iac
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+ - pkg/recon/sources/npm.go
+ - pkg/recon/sources/npm_test.go
+ - pkg/recon/sources/pypi.go
+ - pkg/recon/sources/pypi_test.go
+ - pkg/recon/sources/cratesio.go
+ - pkg/recon/sources/cratesio_test.go
+ - pkg/recon/sources/rubygems.go
+ - pkg/recon/sources/rubygems_test.go
+autonomous: true
+requirements:
+ - RECON-PKG-01
+ - RECON-PKG-02
+
+must_haves:
+ truths:
+ - "NpmSource searches npm registry for packages matching provider keywords and emits findings"
+ - "PyPISource searches PyPI for packages matching provider keywords and emits findings"
+ - "CratesIOSource searches crates.io for crates matching provider keywords and emits findings"
+ - "RubyGemsSource searches rubygems.org for gems matching provider keywords and emits findings"
+ - "All four sources handle context cancellation, empty registries, and HTTP errors gracefully"
+ artifacts:
+ - path: "pkg/recon/sources/npm.go"
+ provides: "NpmSource implementing recon.ReconSource"
+ contains: "func (s *NpmSource) Sweep"
+ - path: "pkg/recon/sources/npm_test.go"
+ provides: "httptest-based tests for NpmSource"
+ contains: "httptest.NewServer"
+ - path: "pkg/recon/sources/pypi.go"
+ provides: "PyPISource implementing recon.ReconSource"
+ contains: "func (s *PyPISource) Sweep"
+ - path: "pkg/recon/sources/pypi_test.go"
+ provides: "httptest-based tests for PyPISource"
+ contains: "httptest.NewServer"
+ - path: "pkg/recon/sources/cratesio.go"
+ provides: "CratesIOSource implementing recon.ReconSource"
+ contains: "func (s *CratesIOSource) Sweep"
+ - path: "pkg/recon/sources/cratesio_test.go"
+ provides: "httptest-based tests for CratesIOSource"
+ contains: "httptest.NewServer"
+ - path: "pkg/recon/sources/rubygems.go"
+ provides: "RubyGemsSource implementing recon.ReconSource"
+ contains: "func (s *RubyGemsSource) Sweep"
+ - path: "pkg/recon/sources/rubygems_test.go"
+ provides: "httptest-based tests for RubyGemsSource"
+ contains: "httptest.NewServer"
+ key_links:
+ - from: "pkg/recon/sources/npm.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+ - from: "pkg/recon/sources/pypi.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+---
+
+
+Implement four package registry ReconSource modules: npm, PyPI, Crates.io, and RubyGems.
+
+Purpose: Enables KeyHunter to scan the four most popular package registries for packages that may contain leaked API keys, covering JavaScript, Python, Rust, and Ruby ecosystems.
+Output: 4 source files + 4 test files in pkg/recon/sources/
+
+
+
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@pkg/recon/source.go
+@pkg/recon/sources/register.go
+@pkg/recon/sources/httpclient.go
+@pkg/recon/sources/queries.go
+@pkg/recon/sources/replit.go (pattern reference — credentialless scraper source)
+@pkg/recon/sources/github.go (pattern reference — API-key-gated source)
+@pkg/recon/sources/replit_test.go (test pattern reference)
+
+
+
+
+From pkg/recon/source.go:
+```go
+type ReconSource interface {
+ Name() string
+ RateLimit() rate.Limit
+ Burst() int
+ RespectsRobots() bool
+ Enabled(cfg Config) bool
+ Sweep(ctx context.Context, query string, out chan<- Finding) error
+}
+```
+
+From pkg/recon/sources/httpclient.go:
+```go
+func NewClient() *Client
+func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
+```
+
+From pkg/recon/sources/queries.go:
+```go
+func BuildQueries(reg *providers.Registry, source string) []string
+```
+
+
+
+
+
+
+ Task 1: Implement NpmSource and PyPISource
+ pkg/recon/sources/npm.go, pkg/recon/sources/npm_test.go, pkg/recon/sources/pypi.go, pkg/recon/sources/pypi_test.go
+
+Create NpmSource in npm.go following the established ReplitSource pattern (credentialless, RespectsRobots=true):
+
+**NpmSource** (npm.go):
+- Struct: `NpmSource` with fields `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `Client *Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*NpmSource)(nil)`
+- Name() returns "npm"
+- RateLimit() returns rate.Every(2 * time.Second) — npm registry is generous but be polite
+- Burst() returns 2
+- RespectsRobots() returns false (API endpoint, not scraped HTML)
+- Enabled() always returns true (no credentials needed)
+- BaseURL defaults to "https://registry.npmjs.org" if empty
+- Sweep() logic:
+ 1. Call BuildQueries(s.Registry, "npm") to get keyword list
+ 2. For each keyword, GET `{BaseURL}/-/v1/search?text={keyword}&size=20`
+ 3. Parse JSON response: `{"objects": [{"package": {"name": "...", "links": {"npm": "..."}}}]}`
+ 4. Define response structs: `npmSearchResponse`, `npmObject`, `npmPackage`, `npmLinks`
+ 5. Emit one Finding per result with Source=links.npm (or construct from package name), SourceType="recon:npm", Confidence="low"
+ 6. Honor ctx cancellation between queries, use Limiters.Wait before each request
+
+**PyPISource** (pypi.go):
+- Same pattern as NpmSource
+- Name() returns "pypi"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false
+- Enabled() always true
+- BaseURL defaults to "https://pypi.org"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "pypi")
+ 2. For each keyword, GET `{BaseURL}/search/?q={keyword}&o=` (HTML page) OR use the XML-RPC/JSON approach:
+ Actually use the simple JSON API: GET `{BaseURL}/pypi/{keyword}/json` is for specific packages.
+ For search, use: GET `https://pypi.org/search/?q={keyword}` and parse HTML for project links.
+ Simpler approach: GET `{BaseURL}/simple/` is too large. Use the warehouse search page.
+ Best approach: GET `{BaseURL}/search/?q={keyword}` returns HTML. Parse `` links.
+ 3. Parse HTML response for project links matching `/project/[^/]+/` pattern
+ 4. Emit Finding per result with Source="{BaseURL}/project/{name}/", SourceType="recon:pypi"
+ 5. Use extractAnchorHrefs pattern or a simpler regex on href attributes
+
+**Tests** — Follow replit_test.go pattern exactly:
+- npm_test.go: httptest server returning canned npm search JSON. Test Sweep extracts findings, test Name/Rate/Burst, test ctx cancellation, test Enabled always true.
+- pypi_test.go: httptest server returning canned HTML with package-snippet links. Same test categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestNpm|TestPyPI" -v -count=1
+
+ NpmSource and PyPISource pass all tests: Sweep emits correct findings from httptest fixtures, Name/Rate/Burst/Enabled return expected values, ctx cancellation is handled
+
+
+
+ Task 2: Implement CratesIOSource and RubyGemsSource
+ pkg/recon/sources/cratesio.go, pkg/recon/sources/cratesio_test.go, pkg/recon/sources/rubygems.go, pkg/recon/sources/rubygems_test.go
+
+**CratesIOSource** (cratesio.go):
+- Struct: `CratesIOSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*CratesIOSource)(nil)`
+- Name() returns "crates"
+- RateLimit() returns rate.Every(1 * time.Second) — crates.io asks for 1 req/sec
+- Burst() returns 1
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://crates.io"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "crates")
+ 2. For each keyword, GET `{BaseURL}/api/v1/crates?q={keyword}&per_page=20`
+ 3. Parse JSON: `{"crates": [{"id": "...", "name": "...", "repository": "..."}]}`
+ 4. Define response structs: `cratesSearchResponse`, `crateEntry`
+ 5. Emit Finding per crate: Source="https://crates.io/crates/{name}", SourceType="recon:crates"
+ 6. IMPORTANT: crates.io requires a custom User-Agent header. Set req.Header.Set("User-Agent", "keyhunter-recon/1.0 (https://github.com/salvacybersec/keyhunter)") before passing to client.Do
+
+**RubyGemsSource** (rubygems.go):
+- Same pattern
+- Name() returns "rubygems"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://rubygems.org"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "rubygems")
+ 2. For each keyword, GET `{BaseURL}/api/v1/search.json?query={keyword}&page=1`
+ 3. Parse JSON array: `[{"name": "...", "project_uri": "..."}]`
+ 4. Define response struct: `rubyGemEntry`
+ 5. Emit Finding per gem: Source=project_uri, SourceType="recon:rubygems"
+
+**Tests** — same httptest pattern:
+- cratesio_test.go: httptest serving canned JSON with crate entries. Verify User-Agent header is set. Test all standard categories.
+- rubygems_test.go: httptest serving canned JSON array. Test all standard categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestCratesIO|TestRubyGems" -v -count=1
+
+ CratesIOSource and RubyGemsSource pass all tests. CratesIO sends proper User-Agent header. Both emit correct findings from httptest fixtures.
+
+
+
+
+
+All 8 new files compile and pass tests:
+```bash
+go test ./pkg/recon/sources/ -run "TestNpm|TestPyPI|TestCratesIO|TestRubyGems" -v -count=1
+go vet ./pkg/recon/sources/
+```
+
+
+
+- 4 new source files implement recon.ReconSource interface
+- 4 test files use httptest with canned fixtures
+- All tests pass
+- No compilation errors across the package
+
+
+
diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-02-PLAN.md b/.planning/phases/13-osint_package_registries_container_iac/13-02-PLAN.md
new file mode 100644
index 0000000..63176de
--- /dev/null
+++ b/.planning/phases/13-osint_package_registries_container_iac/13-02-PLAN.md
@@ -0,0 +1,215 @@
+---
+phase: 13-osint_package_registries_container_iac
+plan: 02
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+ - pkg/recon/sources/maven.go
+ - pkg/recon/sources/maven_test.go
+ - pkg/recon/sources/nuget.go
+ - pkg/recon/sources/nuget_test.go
+ - pkg/recon/sources/goproxy.go
+ - pkg/recon/sources/goproxy_test.go
+ - pkg/recon/sources/packagist.go
+ - pkg/recon/sources/packagist_test.go
+autonomous: true
+requirements:
+ - RECON-PKG-02
+ - RECON-PKG-03
+
+must_haves:
+ truths:
+ - "MavenSource searches Maven Central for artifacts matching provider keywords and emits findings"
+ - "NuGetSource searches NuGet gallery for packages matching provider keywords and emits findings"
+ - "GoProxySource searches Go module proxy for modules matching provider keywords and emits findings"
+ - "PackagistSource searches Packagist for PHP packages matching provider keywords and emits findings"
+ - "All four sources handle context cancellation, empty registries, and HTTP errors gracefully"
+ artifacts:
+ - path: "pkg/recon/sources/maven.go"
+ provides: "MavenSource implementing recon.ReconSource"
+ contains: "func (s *MavenSource) Sweep"
+ - path: "pkg/recon/sources/nuget.go"
+ provides: "NuGetSource implementing recon.ReconSource"
+ contains: "func (s *NuGetSource) Sweep"
+ - path: "pkg/recon/sources/goproxy.go"
+ provides: "GoProxySource implementing recon.ReconSource"
+ contains: "func (s *GoProxySource) Sweep"
+ - path: "pkg/recon/sources/packagist.go"
+ provides: "PackagistSource implementing recon.ReconSource"
+ contains: "func (s *PackagistSource) Sweep"
+ key_links:
+ - from: "pkg/recon/sources/maven.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+ - from: "pkg/recon/sources/nuget.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+---
+
+
+Implement four package registry ReconSource modules: Maven Central, NuGet, Go Proxy, and Packagist.
+
+Purpose: Extends package registry coverage to Java/JVM, .NET, Go, and PHP ecosystems, completing the full set of 8 package registries for RECON-PKG-02 and RECON-PKG-03.
+Output: 4 source files + 4 test files in pkg/recon/sources/
+
+
+
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@pkg/recon/source.go
+@pkg/recon/sources/httpclient.go
+@pkg/recon/sources/queries.go
+@pkg/recon/sources/replit.go (pattern reference)
+@pkg/recon/sources/replit_test.go (test pattern reference)
+
+
+From pkg/recon/source.go:
+```go
+type ReconSource interface {
+ Name() string
+ RateLimit() rate.Limit
+ Burst() int
+ RespectsRobots() bool
+ Enabled(cfg Config) bool
+ Sweep(ctx context.Context, query string, out chan<- Finding) error
+}
+```
+
+From pkg/recon/sources/httpclient.go:
+```go
+func NewClient() *Client
+func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
+```
+
+From pkg/recon/sources/queries.go:
+```go
+func BuildQueries(reg *providers.Registry, source string) []string
+```
+
+
+
+
+
+
+ Task 1: Implement MavenSource and NuGetSource
+ pkg/recon/sources/maven.go, pkg/recon/sources/maven_test.go, pkg/recon/sources/nuget.go, pkg/recon/sources/nuget_test.go
+
+**MavenSource** (maven.go):
+- Struct: `MavenSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*MavenSource)(nil)`
+- Name() returns "maven"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true (no credentials needed)
+- BaseURL defaults to "https://search.maven.org"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "maven")
+ 2. For each keyword, GET `{BaseURL}/solrsearch/select?q={keyword}&rows=20&wt=json`
+ 3. Parse JSON: `{"response": {"docs": [{"g": "group", "a": "artifact", "latestVersion": "1.0"}]}}`
+ 4. Define response structs: `mavenSearchResponse`, `mavenResponseBody`, `mavenDoc`
+ 5. Emit Finding per doc: Source="https://search.maven.org/artifact/{g}/{a}/{latestVersion}/jar", SourceType="recon:maven"
+
+**NuGetSource** (nuget.go):
+- Struct: `NuGetSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*NuGetSource)(nil)`
+- Name() returns "nuget"
+- RateLimit() returns rate.Every(1 * time.Second)
+- Burst() returns 3
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://azuresearch-usnc.nuget.org"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "nuget")
+ 2. For each keyword, GET `{BaseURL}/query?q={keyword}&take=20`
+ 3. Parse JSON: `{"data": [{"id": "...", "version": "...", "projectUrl": "..."}]}`
+ 4. Define response structs: `nugetSearchResponse`, `nugetPackage`
+ 5. Emit Finding per package: Source=projectUrl (fallback to "https://www.nuget.org/packages/{id}"), SourceType="recon:nuget"
+
+**Tests** — httptest pattern:
+- maven_test.go: httptest serving canned Solr JSON. Test Sweep extracts findings, Name/Rate/Burst, ctx cancellation.
+- nuget_test.go: httptest serving canned NuGet search JSON. Same test categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet" -v -count=1
+
+ MavenSource and NuGetSource pass all tests: findings extracted from httptest fixtures, metadata methods return expected values
+
+
+
+ Task 2: Implement GoProxySource and PackagistSource
+ pkg/recon/sources/goproxy.go, pkg/recon/sources/goproxy_test.go, pkg/recon/sources/packagist.go, pkg/recon/sources/packagist_test.go
+
+**GoProxySource** (goproxy.go):
+- Struct: `GoProxySource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*GoProxySource)(nil)`
+- Name() returns "goproxy"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false
+- Enabled() always true
+- BaseURL defaults to "https://pkg.go.dev"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "goproxy")
+ 2. For each keyword, GET `{BaseURL}/search?q={keyword}&m=package` — this returns HTML
+ 3. Parse HTML for search result links matching pattern `/[^"]+` inside `` elements with class containing "SearchSnippet"
+ 4. Simpler approach: use regex to extract hrefs matching `href="(/[a-z][^"]*)"` from search result snippet divs
+ 5. Emit Finding per result: Source="{BaseURL}{path}", SourceType="recon:goproxy"
+ 6. Note: pkg.go.dev search returns HTML, not JSON. Use the same HTML parsing approach as ReplitSource (extractAnchorHrefs with appropriate regex).
+ 7. Define a package-level regexp: `goProxyLinkRE = regexp.MustCompile(`^/[a-z][a-z0-9./_-]*$`)` to match Go module paths
+
+**PackagistSource** (packagist.go):
+- Struct: `PackagistSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*PackagistSource)(nil)`
+- Name() returns "packagist"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://packagist.org"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "packagist")
+ 2. For each keyword, GET `{BaseURL}/search.json?q={keyword}&per_page=20`
+ 3. Parse JSON: `{"results": [{"name": "vendor/package", "url": "..."}]}`
+ 4. Define response structs: `packagistSearchResponse`, `packagistPackage`
+ 5. Emit Finding per package: Source=url, SourceType="recon:packagist"
+
+**Tests** — httptest pattern:
+- goproxy_test.go: httptest serving canned HTML with search result links. Test extraction of Go module paths.
+- packagist_test.go: httptest serving canned Packagist JSON. Test all standard categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestGoProxy|TestPackagist" -v -count=1
+
+ GoProxySource and PackagistSource pass all tests. GoProxy HTML parsing extracts module paths correctly. Packagist JSON parsing works.
+
+
+
+
+
+All 8 new files compile and pass tests:
+```bash
+go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet|TestGoProxy|TestPackagist" -v -count=1
+go vet ./pkg/recon/sources/
+```
+
+
+
+- 4 new source files implement recon.ReconSource interface
+- 4 test files use httptest with canned fixtures
+- All tests pass
+- No compilation errors across the package
+
+
+
diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-03-PLAN.md b/.planning/phases/13-osint_package_registries_container_iac/13-03-PLAN.md
new file mode 100644
index 0000000..60c8527
--- /dev/null
+++ b/.planning/phases/13-osint_package_registries_container_iac/13-03-PLAN.md
@@ -0,0 +1,224 @@
+---
+phase: 13-osint_package_registries_container_iac
+plan: 03
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+ - pkg/recon/sources/dockerhub.go
+ - pkg/recon/sources/dockerhub_test.go
+ - pkg/recon/sources/kubernetes.go
+ - pkg/recon/sources/kubernetes_test.go
+ - pkg/recon/sources/terraform.go
+ - pkg/recon/sources/terraform_test.go
+ - pkg/recon/sources/helm.go
+ - pkg/recon/sources/helm_test.go
+autonomous: true
+requirements:
+ - RECON-INFRA-01
+ - RECON-INFRA-02
+ - RECON-INFRA-03
+ - RECON-INFRA-04
+
+must_haves:
+ truths:
+ - "DockerHubSource searches Docker Hub for images matching provider keywords and emits findings"
+ - "KubernetesSource searches for publicly exposed Kubernetes configs via search/dorking and emits findings"
+ - "TerraformSource searches Terraform Registry for modules matching provider keywords and emits findings"
+ - "HelmSource searches Artifact Hub for Helm charts matching provider keywords and emits findings"
+ - "All four sources handle context cancellation, empty registries, and HTTP errors gracefully"
+ artifacts:
+ - path: "pkg/recon/sources/dockerhub.go"
+ provides: "DockerHubSource implementing recon.ReconSource"
+ contains: "func (s *DockerHubSource) Sweep"
+ - path: "pkg/recon/sources/kubernetes.go"
+ provides: "KubernetesSource implementing recon.ReconSource"
+ contains: "func (s *KubernetesSource) Sweep"
+ - path: "pkg/recon/sources/terraform.go"
+ provides: "TerraformSource implementing recon.ReconSource"
+ contains: "func (s *TerraformSource) Sweep"
+ - path: "pkg/recon/sources/helm.go"
+ provides: "HelmSource implementing recon.ReconSource"
+ contains: "func (s *HelmSource) Sweep"
+ key_links:
+ - from: "pkg/recon/sources/dockerhub.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+ - from: "pkg/recon/sources/terraform.go"
+ to: "pkg/recon/source.go"
+ via: "implements ReconSource interface"
+ pattern: "var _ recon\\.ReconSource"
+---
+
+
+Implement four container and infrastructure-as-code ReconSource modules: Docker Hub, Kubernetes, Terraform Registry, and Helm (via Artifact Hub).
+
+Purpose: Enables KeyHunter to scan container images, Kubernetes configs, Terraform modules, and Helm charts for leaked API keys embedded in infrastructure definitions.
+Output: 4 source files + 4 test files in pkg/recon/sources/
+
+
+
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@pkg/recon/source.go
+@pkg/recon/sources/httpclient.go
+@pkg/recon/sources/queries.go
+@pkg/recon/sources/replit.go (pattern reference)
+@pkg/recon/sources/shodan.go (pattern reference — search API source)
+@pkg/recon/sources/replit_test.go (test pattern reference)
+
+
+From pkg/recon/source.go:
+```go
+type ReconSource interface {
+ Name() string
+ RateLimit() rate.Limit
+ Burst() int
+ RespectsRobots() bool
+ Enabled(cfg Config) bool
+ Sweep(ctx context.Context, query string, out chan<- Finding) error
+}
+```
+
+From pkg/recon/sources/httpclient.go:
+```go
+func NewClient() *Client
+func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
+```
+
+From pkg/recon/sources/queries.go:
+```go
+func BuildQueries(reg *providers.Registry, source string) []string
+```
+
+
+
+
+
+
+ Task 1: Implement DockerHubSource and KubernetesSource
+ pkg/recon/sources/dockerhub.go, pkg/recon/sources/dockerhub_test.go, pkg/recon/sources/kubernetes.go, pkg/recon/sources/kubernetes_test.go
+
+**DockerHubSource** (dockerhub.go):
+- Struct: `DockerHubSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*DockerHubSource)(nil)`
+- Name() returns "dockerhub"
+- RateLimit() returns rate.Every(2 * time.Second) — Docker Hub rate limits unauthenticated at ~100 pulls/6h, search is more lenient
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true (Docker Hub search is unauthenticated)
+- BaseURL defaults to "https://hub.docker.com"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "dockerhub")
+ 2. For each keyword, GET `{BaseURL}/v2/search/repositories/?query={keyword}&page_size=20`
+ 3. Parse JSON: `{"results": [{"repo_name": "...", "description": "...", "is_official": false}]}`
+ 4. Define response structs: `dockerHubSearchResponse`, `dockerHubRepo`
+ 5. Emit Finding per result: Source="https://hub.docker.com/r/{repo_name}", SourceType="recon:dockerhub"
+ 6. Description in finding can hint at build-arg or env-var exposure
+
+**KubernetesSource** (kubernetes.go):
+- Struct: `KubernetesSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*KubernetesSource)(nil)`
+- Name() returns "k8s"
+- RateLimit() returns rate.Every(3 * time.Second)
+- Burst() returns 1
+- RespectsRobots() returns true (searches public web for exposed K8s dashboards/configs)
+- Enabled() always true
+- BaseURL defaults to "https://search.censys.io" — uses Censys-style search for exposed K8s dashboards
+- ALTERNATIVE simpler approach: Search GitHub for exposed Kubernetes manifests containing secrets.
+ Use BaseURL "https://api.github.com" and search for `kind: Secret` or `apiVersion: v1 kind: ConfigMap` with provider keywords.
+ BUT this duplicates GitHubSource.
+- BEST approach: Use a dedicated search via pkg.go.dev-style HTML scraping but for Kubernetes YAML files on public artifact hubs.
+ Actually, the simplest approach that aligns with RECON-INFRA-02 ("discovers publicly exposed Kubernetes dashboards and scans publicly readable Secret/ConfigMap objects"):
+ Use Shodan/Censys-style dork queries. But those sources already exist.
+- FINAL approach: KubernetesSource searches Artifact Hub (artifacthub.io) for Kubernetes manifests/operators that may embed secrets. ArtifactHub has a JSON API.
+ GET `{BaseURL}/api/v1/packages/search?ts_query_web={keyword}&kind=0&limit=20` (kind=0 = Helm charts, but also covers operators)
+ Actually, use kind=6 for "Kube Operator" or leave blank for all kinds.
+ BaseURL defaults to "https://artifacthub.io"
+ Parse JSON: `{"packages": [{"name": "...", "normalized_name": "...", "repository": {"name": "...", "url": "..."}}]}`
+ Emit Finding: Source="https://artifacthub.io/packages/{repository.kind}/{repository.name}/{package.name}", SourceType="recon:k8s"
+
+**Tests** — httptest pattern:
+- dockerhub_test.go: httptest serving canned Docker Hub search JSON. Verify findings have correct SourceType and Source URL format.
+- kubernetes_test.go: httptest serving canned Artifact Hub search JSON. Standard test categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestDockerHub|TestKubernetes" -v -count=1
+
+ DockerHubSource and KubernetesSource pass all tests: Docker Hub search returns repo findings, K8s source finds Artifact Hub packages
+
+
+
+ Task 2: Implement TerraformSource and HelmSource
+ pkg/recon/sources/terraform.go, pkg/recon/sources/terraform_test.go, pkg/recon/sources/helm.go, pkg/recon/sources/helm_test.go
+
+**TerraformSource** (terraform.go):
+- Struct: `TerraformSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*TerraformSource)(nil)`
+- Name() returns "terraform"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://registry.terraform.io"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "terraform")
+ 2. For each keyword, GET `{BaseURL}/v1/modules?q={keyword}&limit=20`
+ 3. Parse JSON: `{"modules": [{"id": "namespace/name/provider", "namespace": "...", "name": "...", "provider": "...", "description": "..."}]}`
+ 4. Define response structs: `terraformSearchResponse`, `terraformModule`
+ 5. Emit Finding per module: Source="https://registry.terraform.io/modules/{namespace}/{name}/{provider}", SourceType="recon:terraform"
+
+**HelmSource** (helm.go):
+- Struct: `HelmSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
+- Compile-time assertion: `var _ recon.ReconSource = (*HelmSource)(nil)`
+- Name() returns "helm"
+- RateLimit() returns rate.Every(2 * time.Second)
+- Burst() returns 2
+- RespectsRobots() returns false (JSON API)
+- Enabled() always true
+- BaseURL defaults to "https://artifacthub.io"
+- Sweep() logic:
+ 1. BuildQueries(s.Registry, "helm")
+ 2. For each keyword, GET `{BaseURL}/api/v1/packages/search?ts_query_web={keyword}&kind=0&limit=20` (kind=0 = Helm charts)
+ 3. Parse JSON: `{"packages": [{"package_id": "...", "name": "...", "normalized_name": "...", "repository": {"name": "...", "kind": 0}}]}`
+ 4. Define response structs: `artifactHubSearchResponse`, `artifactHubPackage`, `artifactHubRepo`
+ 5. Emit Finding per package: Source="https://artifacthub.io/packages/helm/{repo.name}/{package.name}", SourceType="recon:helm"
+ 6. Note: HelmSource and KubernetesSource both use Artifact Hub but with different `kind` parameters and different SourceType tags. Keep them separate — different concerns.
+
+**Tests** — httptest pattern:
+- terraform_test.go: httptest serving canned Terraform registry JSON. Verify module URL construction from namespace/name/provider.
+- helm_test.go: httptest serving canned Artifact Hub JSON for Helm charts. Standard test categories.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestTerraform|TestHelm" -v -count=1
+
+ TerraformSource and HelmSource pass all tests. Terraform constructs correct module URLs. Helm extracts Artifact Hub packages correctly.
+
+
+
+
+
+All 8 new files compile and pass tests:
+```bash
+go test ./pkg/recon/sources/ -run "TestDockerHub|TestKubernetes|TestTerraform|TestHelm" -v -count=1
+go vet ./pkg/recon/sources/
+```
+
+
+
+- 4 new source files implement recon.ReconSource interface
+- 4 test files use httptest with canned fixtures
+- All tests pass
+- No compilation errors across the package
+
+
+
diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-04-PLAN.md b/.planning/phases/13-osint_package_registries_container_iac/13-04-PLAN.md
new file mode 100644
index 0000000..d790745
--- /dev/null
+++ b/.planning/phases/13-osint_package_registries_container_iac/13-04-PLAN.md
@@ -0,0 +1,237 @@
+---
+phase: 13-osint_package_registries_container_iac
+plan: 04
+type: execute
+wave: 2
+depends_on:
+ - "13-01"
+ - "13-02"
+ - "13-03"
+files_modified:
+ - pkg/recon/sources/register.go
+ - pkg/recon/sources/register_test.go
+ - pkg/recon/sources/integration_test.go
+ - cmd/recon.go
+autonomous: true
+requirements:
+ - RECON-PKG-01
+ - RECON-PKG-02
+ - RECON-PKG-03
+ - RECON-INFRA-01
+ - RECON-INFRA-02
+ - RECON-INFRA-03
+ - RECON-INFRA-04
+
+must_haves:
+ truths:
+ - "RegisterAll registers all 12 new Phase 13 sources (40 total) on the engine"
+ - "All 40 sources appear in engine.List() sorted alphabetically"
+ - "Integration test runs SweepAll across all 40 sources with httptest fixtures and gets at least one finding per SourceType"
+ - "cmd/recon.go wires any new SourcesConfig fields needed for Phase 13 sources"
+ artifacts:
+ - path: "pkg/recon/sources/register.go"
+ provides: "Updated RegisterAll with 12 new Phase 13 source registrations"
+ contains: "NpmSource"
+ - path: "pkg/recon/sources/register_test.go"
+ provides: "Updated test asserting 40 sources registered"
+ contains: "40"
+ - path: "pkg/recon/sources/integration_test.go"
+ provides: "Updated integration test with httptest mux handlers for all 12 new sources"
+ contains: "recon:npm"
+ key_links:
+ - from: "pkg/recon/sources/register.go"
+ to: "pkg/recon/sources/npm.go"
+ via: "engine.Register call"
+ pattern: "NpmSource"
+ - from: "pkg/recon/sources/register.go"
+ to: "pkg/recon/sources/dockerhub.go"
+ via: "engine.Register call"
+ pattern: "DockerHubSource"
+ - from: "pkg/recon/sources/integration_test.go"
+ to: "all 12 new sources"
+ via: "httptest mux handlers"
+ pattern: "recon:(npm|pypi|crates|rubygems|maven|nuget|goproxy|packagist|dockerhub|k8s|terraform|helm)"
+---
+
+
+Wire all 12 Phase 13 sources into RegisterAll, update register_test.go to assert 40 total sources, and extend the integration test with httptest handlers for all new sources.
+
+Purpose: Connects the individually-implemented sources into the recon engine so `keyhunter recon` discovers and runs them. Integration test proves end-to-end SweepAll works across all 40 sources.
+Output: Updated register.go, register_test.go, integration_test.go, cmd/recon.go
+
+
+
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@pkg/recon/sources/register.go
+@pkg/recon/sources/register_test.go
+@pkg/recon/sources/integration_test.go
+@cmd/recon.go
+
+
+@.planning/phases/13-osint_package_registries_container_iac/13-01-SUMMARY.md
+@.planning/phases/13-osint_package_registries_container_iac/13-02-SUMMARY.md
+@.planning/phases/13-osint_package_registries_container_iac/13-03-SUMMARY.md
+
+
+From pkg/recon/sources/register.go (current):
+```go
+type SourcesConfig struct {
+ GitHubToken string
+ // ... existing fields ...
+ Registry *providers.Registry
+ Limiters *recon.LimiterRegistry
+}
+
+func RegisterAll(engine *recon.Engine, cfg SourcesConfig) { ... }
+```
+
+From pkg/recon/engine.go:
+```go
+func (e *Engine) Register(src ReconSource)
+func (e *Engine) List() []string // sorted source names
+```
+
+New sources created by Plans 13-01..03 (all credentialless, struct-literal style):
+- NpmSource{BaseURL, Registry, Limiters, Client}
+- PyPISource{BaseURL, Registry, Limiters, Client}
+- CratesIOSource{BaseURL, Registry, Limiters, Client}
+- RubyGemsSource{BaseURL, Registry, Limiters, Client}
+- MavenSource{BaseURL, Registry, Limiters, Client}
+- NuGetSource{BaseURL, Registry, Limiters, Client}
+- GoProxySource{BaseURL, Registry, Limiters, Client}
+- PackagistSource{BaseURL, Registry, Limiters, Client}
+- DockerHubSource{BaseURL, Registry, Limiters, Client}
+- KubernetesSource{BaseURL, Registry, Limiters, Client}
+- TerraformSource{BaseURL, Registry, Limiters, Client}
+- HelmSource{BaseURL, Registry, Limiters, Client}
+
+
+
+
+
+
+ Task 1: Wire Phase 13 sources into RegisterAll and update register_test
+ pkg/recon/sources/register.go, pkg/recon/sources/register_test.go
+
+**register.go updates:**
+1. Add a `// Phase 13: Package registry sources (credentialless).` comment block after the Phase 12 cloud storage block
+2. Register all 8 package registry sources as struct literals (no New* constructors needed since they're credentialless):
+ ```go
+ engine.Register(&NpmSource{Registry: reg, Limiters: lim})
+ engine.Register(&PyPISource{Registry: reg, Limiters: lim})
+ engine.Register(&CratesIOSource{Registry: reg, Limiters: lim})
+ engine.Register(&RubyGemsSource{Registry: reg, Limiters: lim})
+ engine.Register(&MavenSource{Registry: reg, Limiters: lim})
+ engine.Register(&NuGetSource{Registry: reg, Limiters: lim})
+ engine.Register(&GoProxySource{Registry: reg, Limiters: lim})
+ engine.Register(&PackagistSource{Registry: reg, Limiters: lim})
+ ```
+3. Add a `// Phase 13: Container & IaC sources (credentialless).` comment block
+4. Register all 4 infra sources:
+ ```go
+ engine.Register(&DockerHubSource{Registry: reg, Limiters: lim})
+ engine.Register(&KubernetesSource{Registry: reg, Limiters: lim})
+ engine.Register(&TerraformSource{Registry: reg, Limiters: lim})
+ engine.Register(&HelmSource{Registry: reg, Limiters: lim})
+ ```
+5. Update the RegisterAll doc comment: change "28 sources total" to "40 sources total" and mention Phase 13
+6. No new SourcesConfig fields needed — all Phase 13 sources are credentialless
+
+**register_test.go updates:**
+1. Rename `TestRegisterAll_WiresAllTwentyEightSources` to `TestRegisterAll_WiresAllFortySources`
+2. Update `want` slice to include all 12 new names in alphabetical order: "crates", "dockerhub", "goproxy", "helm", "k8s", "maven", "npm", "nuget", "packagist", "pypi", "rubygems", "terraform" merged into existing list
+3. Update `TestRegisterAll_MissingCredsStillRegistered` count from 28 to 40
+4. The full sorted list should be: azureblob, binaryedge, bing, bitbucket, brave, censys, codeberg, codesandbox, crates, dockerhub, dospaces, duckduckgo, fofa, gcs, gist, gistpaste, github, gitlab, google, goproxy, helm, huggingface, k8s, kaggle, maven, netlas, npm, nuget, packagist, pastebin, pastesites, pypi, replit, rubygems, s3, sandboxes, shodan, spaces, terraform, yandex, zoomeye
+ Wait — that's 41. Let me recount existing: azureblob, binaryedge, bing, bitbucket, brave, censys, codeberg, codesandbox, duckduckgo, fofa, gcs, gist, gistpaste, github, gitlab, google, huggingface, kaggle, netlas, pastebin, pastesites, replit, s3, sandboxes, shodan, spaces, yandex, zoomeye = 28.
+ Add 12 new: crates, dockerhub, goproxy, helm, k8s, maven, npm, nuget, packagist, pypi, rubygems, terraform = 12.
+ But wait — check if dospaces is already in the list. Looking at register.go: DOSpacesScanner is registered. Check its Name(). Need to verify.
+ Read the current want list from register_test.go to be precise. It has 28 entries already listed. Add the 12 new ones merged alphabetically. Total = 40.
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestRegisterAll" -v -count=1
+
+ RegisterAll registers all 40 sources. TestRegisterAll_WiresAllFortySources passes with complete sorted name list. Missing creds test asserts 40.
+
+
+
+ Task 2: Extend integration test with Phase 13 httptest handlers
+ pkg/recon/sources/integration_test.go, cmd/recon.go
+
+**integration_test.go updates:**
+1. Add httptest mux handlers for all 12 new sources. Each handler serves canned JSON/HTML fixture matching the API format that source expects:
+
+ **npm** — `mux.HandleFunc("/npm/-/v1/search", ...)` returning `{"objects": [{"package": {"name": "leak-pkg", "links": {"npm": "https://npmjs.com/package/leak-pkg"}}}]}`
+
+ **pypi** — `mux.HandleFunc("/pypi/search/", ...)` returning HTML with `` links
+
+ **crates** — `mux.HandleFunc("/crates/api/v1/crates", ...)` returning `{"crates": [{"name": "leaked-crate"}]}`
+
+ **rubygems** — `mux.HandleFunc("/rubygems/api/v1/search.json", ...)` returning `[{"name": "leaked-gem", "project_uri": "https://rubygems.org/gems/leaked-gem"}]`
+
+ **maven** — `mux.HandleFunc("/maven/solrsearch/select", ...)` returning `{"response": {"docs": [{"g": "com.leak", "a": "sdk", "latestVersion": "1.0"}]}}`
+
+ **nuget** — `mux.HandleFunc("/nuget/query", ...)` returning `{"data": [{"id": "LeakedPkg", "version": "1.0"}]}`
+
+ **goproxy** — `mux.HandleFunc("/goproxy/search", ...)` returning HTML with `` links
+
+ **packagist** — `mux.HandleFunc("/packagist/search.json", ...)` returning `{"results": [{"name": "vendor/leaked", "url": "https://packagist.org/packages/vendor/leaked"}]}`
+
+ **dockerhub** — `mux.HandleFunc("/dockerhub/v2/search/repositories/", ...)` returning `{"results": [{"repo_name": "user/leaked-image"}]}`
+
+ **k8s** — `mux.HandleFunc("/k8s/api/v1/packages/search", ...)` returning `{"packages": [{"name": "leaked-operator", "repository": {"name": "bitnami", "kind": 6}}]}`
+
+ **terraform** — `mux.HandleFunc("/terraform/v1/modules", ...)` returning `{"modules": [{"namespace": "hashicorp", "name": "leaked", "provider": "aws"}]}`
+
+ **helm** — `mux.HandleFunc("/helm/api/v1/packages/search", ...)` returning `{"packages": [{"name": "leaked-chart", "repository": {"name": "bitnami", "kind": 0}}]}`
+
+ NOTE: The mux path prefixes (e.g., `/npm/`, `/pypi/`) are conventions to route in a single httptest server. Each source constructor in the test sets BaseURL to `srv.URL + "/npm"`, `srv.URL + "/pypi"`, etc.
+
+2. Register each new source with BaseURL pointing at `srv.URL + "/{prefix}"`:
+ ```go
+ engine.Register(&NpmSource{BaseURL: srv.URL + "/npm", Registry: reg, Limiters: lim, Client: NewClient()})
+ // ... same for all 12
+ ```
+
+3. Update the expected SourceType set to include all 12 new types: "recon:npm", "recon:pypi", "recon:crates", "recon:rubygems", "recon:maven", "recon:nuget", "recon:goproxy", "recon:packagist", "recon:dockerhub", "recon:k8s", "recon:terraform", "recon:helm"
+
+4. Update the test name/comment from "28 sources" to "40 sources"
+
+**cmd/recon.go updates:**
+- No new SourcesConfig fields needed since all Phase 13 sources are credentialless
+- Verify the existing cmd/recon.go RegisterAll call passes through correctly — no changes expected but confirm no compilation errors
+
+
+ cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestIntegration_AllSources" -v -count=1 -timeout=60s
+
+ Integration test passes with all 40 sources producing at least one finding each via httptest. Full package compiles clean.
+
+
+
+
+
+Full test suite passes:
+```bash
+go test ./pkg/recon/sources/ -v -count=1 -timeout=120s
+go vet ./pkg/recon/sources/
+go build ./cmd/...
+```
+
+
+
+- RegisterAll registers 40 sources (28 existing + 12 new)
+- register_test.go asserts exact 40-name sorted list
+- Integration test exercises all 40 sources via httptest
+- cmd/recon.go compiles with updated register.go
+- `go test ./pkg/recon/sources/ -count=1` all green
+
+
+