---
phase: 13-osint_package_registries_container_iac
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - pkg/recon/sources/maven.go
  - pkg/recon/sources/maven_test.go
  - pkg/recon/sources/nuget.go
  - pkg/recon/sources/nuget_test.go
  - pkg/recon/sources/goproxy.go
  - pkg/recon/sources/goproxy_test.go
  - pkg/recon/sources/packagist.go
  - pkg/recon/sources/packagist_test.go
autonomous: true
requirements:
  - RECON-PKG-02
  - RECON-PKG-03
must_haves:
  truths:
    - "MavenSource searches Maven Central for artifacts matching provider keywords and emits findings"
    - "NuGetSource searches NuGet gallery for packages matching provider keywords and emits findings"
    - "GoProxySource searches Go module proxy for modules matching provider keywords and emits findings"
    - "PackagistSource searches Packagist for PHP packages matching provider keywords and emits findings"
    - "All four sources handle context cancellation, empty registries, and HTTP errors gracefully"
  artifacts:
    - path: "pkg/recon/sources/maven.go"
      provides: "MavenSource implementing recon.ReconSource"
      contains: "func (s *MavenSource) Sweep"
    - path: "pkg/recon/sources/nuget.go"
      provides: "NuGetSource implementing recon.ReconSource"
      contains: "func (s *NuGetSource) Sweep"
    - path: "pkg/recon/sources/goproxy.go"
      provides: "GoProxySource implementing recon.ReconSource"
      contains: "func (s *GoProxySource) Sweep"
    - path: "pkg/recon/sources/packagist.go"
      provides: "PackagistSource implementing recon.ReconSource"
      contains: "func (s *PackagistSource) Sweep"
  key_links:
    - from: "pkg/recon/sources/maven.go"
      to: "pkg/recon/source.go"
      via: "implements ReconSource interface"
      pattern: "var _ recon\\.ReconSource"
    - from: "pkg/recon/sources/nuget.go"
      to: "pkg/recon/source.go"
      via: "implements ReconSource interface"
      pattern: "var _ recon\\.ReconSource"
---

Implement four package registry ReconSource modules: Maven Central, NuGet, Go Proxy, and Packagist.

Purpose: Extends package registry coverage to Java/JVM, .NET, Go, and PHP ecosystems, completing the full set of 8 package registries for RECON-PKG-02 and RECON-PKG-03.

Output: 4 source files + 4 test files in pkg/recon/sources/

@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/replit.go (pattern reference)
@pkg/recon/sources/replit_test.go (test pattern reference)

From pkg/recon/source.go:
```go
type ReconSource interface {
	Name() string
	RateLimit() rate.Limit
	Burst() int
	RespectsRobots() bool
	Enabled(cfg Config) bool
	Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```

From pkg/recon/sources/httpclient.go:
```go
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```

From pkg/recon/sources/queries.go:
```go
func BuildQueries(reg *providers.Registry, source string) []string
```

Task 1: Implement MavenSource and NuGetSource

pkg/recon/sources/maven.go, pkg/recon/sources/maven_test.go, pkg/recon/sources/nuget.go, pkg/recon/sources/nuget_test.go

**MavenSource** (maven.go):
- Struct: `MavenSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*MavenSource)(nil)`
- Name() returns "maven"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false (JSON API)
- Enabled() always true (no credentials needed)
- BaseURL defaults to "https://search.maven.org"
- Sweep() logic:
  1. BuildQueries(s.Registry, "maven")
  2. For each keyword, GET `{BaseURL}/solrsearch/select?q={keyword}&rows=20&wt=json`
  3. Parse JSON: `{"response": {"docs": [{"g": "group", "a": "artifact", "latestVersion": "1.0"}]}}`
  4. Define response structs: `mavenSearchResponse`, `mavenResponseBody`, `mavenDoc`
  5. Emit Finding per doc: Source="https://search.maven.org/artifact/{g}/{a}/{latestVersion}/jar", SourceType="recon:maven"

**NuGetSource** (nuget.go):
- Struct: `NuGetSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*NuGetSource)(nil)`
- Name() returns "nuget"
- RateLimit() returns rate.Every(1 * time.Second)
- Burst() returns 3
- RespectsRobots() returns false (JSON API)
- Enabled() always true
- BaseURL defaults to "https://azuresearch-usnc.nuget.org"
- Sweep() logic:
  1. BuildQueries(s.Registry, "nuget")
  2. For each keyword, GET `{BaseURL}/query?q={keyword}&take=20`
  3. Parse JSON: `{"data": [{"id": "...", "version": "...", "projectUrl": "..."}]}`
  4. Define response structs: `nugetSearchResponse`, `nugetPackage`
  5. Emit Finding per package: Source=projectUrl (fallback to "https://www.nuget.org/packages/{id}"), SourceType="recon:nuget"

**Tests** (httptest pattern):
- maven_test.go: httptest serving canned Solr JSON. Test Sweep extracts findings, Name/Rate/Burst, ctx cancellation.
- nuget_test.go: httptest serving canned NuGet search JSON. Same test categories.

cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet" -v -count=1

MavenSource and NuGetSource pass all tests: findings extracted from httptest fixtures, metadata methods return expected values

Task 2: Implement GoProxySource and PackagistSource

pkg/recon/sources/goproxy.go, pkg/recon/sources/goproxy_test.go, pkg/recon/sources/packagist.go, pkg/recon/sources/packagist_test.go

**GoProxySource** (goproxy.go):
- Struct: `GoProxySource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*GoProxySource)(nil)`
- Name() returns "goproxy"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false
- Enabled() always true
- BaseURL defaults to "https://pkg.go.dev"
- Sweep() logic:
  1. BuildQueries(s.Registry, "goproxy")
  2. For each keyword, GET `{BaseURL}/search?q={keyword}&m=package`; this returns HTML
  3. Parse HTML for search result links matching pattern `/[^"]+` inside elements with class containing "SearchSnippet"
  4. Simpler approach: use regex to extract hrefs matching `href="(/[a-z][^"]*)"` from search result snippet divs
  5. Emit Finding per result: Source="{BaseURL}{path}", SourceType="recon:goproxy"
  6. Note: pkg.go.dev search returns HTML, not JSON. Use the same HTML parsing approach as ReplitSource (extractAnchorHrefs with appropriate regex).
  7. Define a package-level regexp: ``goProxyLinkRE = regexp.MustCompile(`^/[a-z][a-z0-9./_-]*$`)`` to match Go module paths

**PackagistSource** (packagist.go):
- Struct: `PackagistSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*PackagistSource)(nil)`
- Name() returns "packagist"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false (JSON API)
- Enabled() always true
- BaseURL defaults to "https://packagist.org"
- Sweep() logic:
  1. BuildQueries(s.Registry, "packagist")
  2. For each keyword, GET `{BaseURL}/search.json?q={keyword}&per_page=20`
  3. Parse JSON: `{"results": [{"name": "vendor/package", "url": "..."}]}`
  4. Define response structs: `packagistSearchResponse`, `packagistPackage`
  5. Emit Finding per package: Source=url, SourceType="recon:packagist"

**Tests** (httptest pattern):
- goproxy_test.go: httptest serving canned HTML with search result links. Test extraction of Go module paths.
- packagist_test.go: httptest serving canned Packagist JSON. Test all standard categories.

cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestGoProxy|TestPackagist" -v -count=1

GoProxySource and PackagistSource pass all tests. GoProxy HTML parsing extracts module paths correctly. Packagist JSON parsing works.

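The GoProxySource link extraction described in Task 2 could look roughly like the sketch below. It combines the two regular expressions the plan names: the href capture from step 4 and `goProxyLinkRE` from step 7. The helper name `extractModulePaths` and the variable `hrefRE` are hypothetical, and the sketch works on a raw HTML string rather than reusing the project's `extractAnchorHrefs`, whose signature is not shown here.

```go
package main

import "regexp"

// hrefRE captures root-relative href values (step 4 of the plan).
// The variable name is an assumption made for this sketch.
var hrefRE = regexp.MustCompile(`href="(/[a-z][^"]*)"`)

// goProxyLinkRE is the module-path filter from step 7 of the plan.
var goProxyLinkRE = regexp.MustCompile(`^/[a-z][a-z0-9./_-]*$`)

// extractModulePaths is a hypothetical helper: it returns the unique
// href paths in html that pass goProxyLinkRE, in first-seen order.
func extractModulePaths(html string) []string {
	seen := make(map[string]bool)
	var paths []string
	for _, m := range hrefRE.FindAllStringSubmatch(html, -1) {
		p := m[1] // first capture group: the path without quotes
		if !goProxyLinkRE.MatchString(p) || seen[p] {
			continue // skip query-string links, uppercase paths, duplicates
		}
		seen[p] = true
		paths = append(paths, p)
	}
	return paths
}
```

Two properties of the plan's pattern are worth keeping in mind when writing the fixture: it is lowercase-only, so a path such as `/github.com/BurntSushi/toml` is filtered out, and it also accepts short navigation paths such as `/about`, so limiting the input to the search snippet region matters.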
All 8 new files compile and pass tests:
```bash
go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet|TestGoProxy|TestPackagist" -v -count=1
go vet ./pkg/recon/sources/
```

- 4 new source files implement recon.ReconSource interface
- 4 test files use httptest with canned fixtures
- All tests pass
- No compilation errors across the package

After completion, create `.planning/phases/13-osint_package_registries_container_iac/13-02-SUMMARY.md`
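
As an illustration of the httptest fixture pattern that the success criteria rely on, here is a minimal sketch for the Maven case. It uses the response struct names and the Source URL format from Task 1. Everything else is an assumption: `searchMaven`, `newFixtureServer`, and `runFixtureSearch` are hypothetical helpers, the request goes through `http.DefaultClient` instead of the project's `Client`, and the function returns Source strings instead of `recon.Finding` values because the `Finding` type is not shown in this plan.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// Response structs named in Task 1; the field layout follows the sample JSON.
type mavenDoc struct {
	G             string `json:"g"`
	A             string `json:"a"`
	LatestVersion string `json:"latestVersion"`
}

type mavenResponseBody struct {
	Docs []mavenDoc `json:"docs"`
}

type mavenSearchResponse struct {
	Response mavenResponseBody `json:"response"`
}

// searchMaven (hypothetical) queries the Solr endpoint for one keyword
// and returns the Source URL for each returned doc.
func searchMaven(ctx context.Context, baseURL, keyword string) ([]string, error) {
	u := baseURL + "/solrsearch/select?q=" + url.QueryEscape(keyword) + "&rows=20&wt=json"
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req) // assumption: stands in for the project's Client
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("maven search: unexpected status %d", resp.StatusCode)
	}
	var parsed mavenSearchResponse
	if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
		return nil, err
	}
	sources := make([]string, 0, len(parsed.Response.Docs))
	for _, d := range parsed.Response.Docs {
		sources = append(sources, fmt.Sprintf(
			"https://search.maven.org/artifact/%s/%s/%s/jar", d.G, d.A, d.LatestVersion))
	}
	return sources, nil
}

// newFixtureServer (hypothetical) serves one canned Solr response.
func newFixtureServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"response":{"docs":[{"g":"com.example","a":"demo-sdk","latestVersion":"1.0"}]}}`)
	}))
}

// runFixtureSearch (hypothetical) wires the fixture server to searchMaven.
func runFixtureSearch() ([]string, error) {
	srv := newFixtureServer()
	defer srv.Close()
	return searchMaven(context.Background(), srv.URL, "demo")
}
```

In a real `maven_test.go` the same fixture server would be passed to the source as `BaseURL`, and the assertions would run against the findings received on the `out` channel.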