---
phase: 13-osint_package_registries_container_iac
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/maven.go
- pkg/recon/sources/maven_test.go
- pkg/recon/sources/nuget.go
- pkg/recon/sources/nuget_test.go
- pkg/recon/sources/goproxy.go
- pkg/recon/sources/goproxy_test.go
- pkg/recon/sources/packagist.go
- pkg/recon/sources/packagist_test.go
autonomous: true
requirements:
- RECON-PKG-02
- RECON-PKG-03
must_haves:
  truths:
    - "MavenSource searches Maven Central for artifacts matching provider keywords and emits findings"
    - "NuGetSource searches NuGet gallery for packages matching provider keywords and emits findings"
    - "GoProxySource searches Go module proxy for modules matching provider keywords and emits findings"
    - "PackagistSource searches Packagist for PHP packages matching provider keywords and emits findings"
    - "All four sources handle context cancellation, empty registries, and HTTP errors gracefully"
  artifacts:
    - path: "pkg/recon/sources/maven.go"
      provides: "MavenSource implementing recon.ReconSource"
      contains: "func (s *MavenSource) Sweep"
    - path: "pkg/recon/sources/nuget.go"
      provides: "NuGetSource implementing recon.ReconSource"
      contains: "func (s *NuGetSource) Sweep"
    - path: "pkg/recon/sources/goproxy.go"
      provides: "GoProxySource implementing recon.ReconSource"
      contains: "func (s *GoProxySource) Sweep"
    - path: "pkg/recon/sources/packagist.go"
      provides: "PackagistSource implementing recon.ReconSource"
      contains: "func (s *PackagistSource) Sweep"
  key_links:
    - from: "pkg/recon/sources/maven.go"
      to: "pkg/recon/source.go"
      via: "implements ReconSource interface"
      pattern: "var _ recon\\.ReconSource"
    - from: "pkg/recon/sources/nuget.go"
      to: "pkg/recon/source.go"
      via: "implements ReconSource interface"
      pattern: "var _ recon\\.ReconSource"
---
Implement four package registry ReconSource modules: Maven Central, NuGet, Go Proxy, and Packagist.
Purpose: Extends package registry coverage to Java/JVM, .NET, Go, and PHP ecosystems, completing the full set of 8 package registries for RECON-PKG-02 and RECON-PKG-03.
Output: 4 source files + 4 test files in pkg/recon/sources/
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/replit.go (pattern reference)
@pkg/recon/sources/replit_test.go (test pattern reference)
From pkg/recon/source.go:
```go
type ReconSource interface {
	Name() string
	RateLimit() rate.Limit
	Burst() int
	RespectsRobots() bool
	Enabled(cfg Config) bool
	Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
From pkg/recon/sources/queries.go:
```go
func BuildQueries(reg *providers.Registry, source string) []string
```
Task 1: Implement MavenSource and NuGetSource
pkg/recon/sources/maven.go, pkg/recon/sources/maven_test.go, pkg/recon/sources/nuget.go, pkg/recon/sources/nuget_test.go
**MavenSource** (maven.go):
- Struct: `MavenSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*MavenSource)(nil)`
- Name() returns "maven"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false (JSON API)
- Enabled() always true (no credentials needed)
- BaseURL defaults to "https://search.maven.org"
- Sweep() logic:
1. BuildQueries(s.Registry, "maven")
2. For each keyword, GET `{BaseURL}/solrsearch/select?q={keyword}&rows=20&wt=json`
3. Parse JSON: `{"response": {"docs": [{"g": "group", "a": "artifact", "latestVersion": "1.0"}]}}`
4. Define response structs: `mavenSearchResponse`, `mavenResponseBody`, `mavenDoc`
5. Emit Finding per doc: Source="https://search.maven.org/artifact/{g}/{a}/{latestVersion}/jar", SourceType="recon:maven"
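The Maven steps above can be sketched roughly as follows. This is a minimal illustration, not the final implementation: the struct names come from the plan, while `mavenSearchURL` and `mavenArtifactURLs` are hypothetical helpers introduced here, and query-escaping the keyword is an assumption the plan does not state explicitly.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// Response shapes for the Solr search endpoint; only the fields named in the plan are decoded.
type mavenSearchResponse struct {
	Response mavenResponseBody `json:"response"`
}

type mavenResponseBody struct {
	Docs []mavenDoc `json:"docs"`
}

type mavenDoc struct {
	G             string `json:"g"`
	A             string `json:"a"`
	LatestVersion string `json:"latestVersion"`
}

// mavenSearchURL (hypothetical helper) builds the search URL with the keyword query-escaped.
func mavenSearchURL(baseURL, keyword string) string {
	q := url.Values{}
	q.Set("q", keyword)
	q.Set("rows", "20")
	q.Set("wt", "json")
	return baseURL + "/solrsearch/select?" + q.Encode()
}

// mavenArtifactURLs (hypothetical helper) decodes a response body and
// returns one artifact URL per doc, in the format the plan uses for Finding.Source.
func mavenArtifactURLs(body []byte) ([]string, error) {
	var resp mavenSearchResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("maven: decode response: %w", err)
	}
	urls := make([]string, 0, len(resp.Response.Docs))
	for _, d := range resp.Response.Docs {
		urls = append(urls, fmt.Sprintf("https://search.maven.org/artifact/%s/%s/%s/jar", d.G, d.A, d.LatestVersion))
	}
	return urls, nil
}

func main() {
	body := []byte(`{"response":{"docs":[{"g":"com.example","a":"sdk","latestVersion":"1.0"}]}}`)
	urls, err := mavenArtifactURLs(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(mavenSearchURL("https://search.maven.org", "example keyword"))
	fmt.Println(urls)
}
```

An empty `docs` array yields an empty slice and no error, which matches the requirement that empty registries be handled gracefully.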
**NuGetSource** (nuget.go):
- Struct: `NuGetSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*NuGetSource)(nil)`
- Name() returns "nuget"
- RateLimit() returns rate.Every(1 * time.Second)
- Burst() returns 3
- RespectsRobots() returns false (JSON API)
- Enabled() always true
- BaseURL defaults to "https://azuresearch-usnc.nuget.org"
- Sweep() logic:
1. BuildQueries(s.Registry, "nuget")
2. For each keyword, GET `{BaseURL}/query?q={keyword}&take=20`
3. Parse JSON: `{"data": [{"id": "...", "version": "...", "projectUrl": "..."}]}`
4. Define response structs: `nugetSearchResponse`, `nugetPackage`
5. Emit Finding per package: Source=projectUrl (fallback to "https://www.nuget.org/packages/{id}"), SourceType="recon:nuget"
**Tests** — httptest pattern:
- maven_test.go: httptest serving canned Solr JSON. Test that Sweep extracts findings, that Name/RateLimit/Burst return the expected values, and that Sweep honors ctx cancellation.
- nuget_test.go: httptest serving canned NuGet search JSON. Same test categories.
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet" -v -count=1
MavenSource and NuGetSource pass all tests: findings extracted from httptest fixtures, metadata methods return expected values
Task 2: Implement GoProxySource and PackagistSource
pkg/recon/sources/goproxy.go, pkg/recon/sources/goproxy_test.go, pkg/recon/sources/packagist.go, pkg/recon/sources/packagist_test.go
**GoProxySource** (goproxy.go):
- Struct: `GoProxySource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*GoProxySource)(nil)`
- Name() returns "goproxy"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false
- Enabled() always true
- BaseURL defaults to "https://pkg.go.dev"
- Sweep() logic:
1. BuildQueries(s.Registry, "goproxy")
2. For each keyword, GET `{BaseURL}/search?q={keyword}&m=package` — this returns HTML
3. Parse the HTML for search result links: hrefs matching `/[^"]+` inside elements whose class contains "SearchSnippet"
4. Simpler approach: use regex to extract hrefs matching `href="(/[a-z][^"]*)"` from search result snippet divs
5. Emit Finding per result: Source="{BaseURL}{path}", SourceType="recon:goproxy"
6. Note: pkg.go.dev search returns HTML, not JSON. Use the same HTML parsing approach as ReplitSource (extractAnchorHrefs with appropriate regex).
7. Define a package-level regexp to match Go module paths: ``goProxyLinkRE = regexp.MustCompile(`^/[a-z][a-z0-9./_-]*$`)``
**PackagistSource** (packagist.go):
- Struct: `PackagistSource` with `BaseURL`, `Registry`, `Limiters`, `Client`
- Compile-time assertion: `var _ recon.ReconSource = (*PackagistSource)(nil)`
- Name() returns "packagist"
- RateLimit() returns rate.Every(2 * time.Second)
- Burst() returns 2
- RespectsRobots() returns false (JSON API)
- Enabled() always true
- BaseURL defaults to "https://packagist.org"
- Sweep() logic:
1. BuildQueries(s.Registry, "packagist")
2. For each keyword, GET `{BaseURL}/search.json?q={keyword}&per_page=20`
3. Parse JSON: `{"results": [{"name": "vendor/package", "url": "..."}]}`
4. Define response structs: `packagistSearchResponse`, `packagistPackage`
5. Emit Finding per package: Source=url, SourceType="recon:packagist"
**Tests** — httptest pattern:
- goproxy_test.go: httptest serving canned HTML with search result links. Test extraction of Go module paths.
- packagist_test.go: httptest serving canned Packagist JSON. Same test categories as Task 1: Sweep extraction, Name/RateLimit/Burst, ctx cancellation.
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestGoProxy|TestPackagist" -v -count=1
GoProxySource and PackagistSource pass all tests. GoProxy HTML parsing extracts module paths correctly. Packagist JSON parsing works.
All 8 new files compile and pass tests:
```bash
go test ./pkg/recon/sources/ -run "TestMaven|TestNuGet|TestGoProxy|TestPackagist" -v -count=1
go vet ./pkg/recon/sources/
```
- 4 new source files implement recon.ReconSource interface
- 4 test files use httptest with canned fixtures
- All tests pass
- No compilation errors across the package