diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index dd6db5d..6857900 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -126,8 +126,8 @@ Requirements for initial release. Each maps to roadmap phases.
 ### OSINT/Recon — Package Registries
 
 - [ ] **RECON-PKG-01**: npm registry package scanning (download + extract + grep)
-- [ ] **RECON-PKG-02**: PyPI package scanning
-- [ ] **RECON-PKG-03**: RubyGems, crates.io, Maven, NuGet, Packagist, Go proxy scanning
+- [x] **RECON-PKG-02**: PyPI package scanning
+- [x] **RECON-PKG-03**: RubyGems, crates.io, Maven, NuGet, Packagist, Go proxy scanning
 
 ### OSINT/Recon — Container & Infrastructure
 
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 295c67b..00cae75 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -273,7 +273,7 @@ Plans:
 **Plans**: 4 plans
 Plans:
 - [ ] 13-01-PLAN.md — NpmSource + PyPISource + CratesIOSource + RubyGemsSource (RECON-PKG-01, RECON-PKG-02)
-- [ ] 13-02-PLAN.md — MavenSource + NuGetSource + GoProxySource + PackagistSource (RECON-PKG-02, RECON-PKG-03)
+- [x] 13-02-PLAN.md — MavenSource + NuGetSource + GoProxySource + PackagistSource (RECON-PKG-02, RECON-PKG-03)
 - [ ] 13-03-PLAN.md — DockerHubSource + KubernetesSource + TerraformSource + HelmSource (RECON-INFRA-01..04)
 - [ ] 13-04-PLAN.md — RegisterAll wiring + integration test (all Phase 13 reqs)
 
@@ -355,7 +355,7 @@ Phases execute in numeric order: 1 → 2 → 3 → ... → 18
 | 10. OSINT Code Hosting | 9/9 | Complete | 2026-04-06 |
 | 11. OSINT Search & Paste | 3/3 | Complete | 2026-04-06 |
 | 12. OSINT IoT & Cloud Storage | 4/4 | Complete | 2026-04-06 |
-| 13. OSINT Package Registries & Container/IaC | 0/? | Not started | - |
+| 13. OSINT Package Registries & Container/IaC | 1/4 | In Progress| |
 | 14. OSINT CI/CD Logs, Web Archives & Frontend Leaks | 0/? | Not started | - |
 | 15. OSINT Forums, Collaboration & Log Aggregators | 0/? | Not started | - |
 | 16. OSINT Threat Intel, Mobile, DNS & API Marketplaces | 0/? | Not started | - |
diff --git a/.planning/STATE.md b/.planning/STATE.md
index 4d71baa..5e59f3d 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,15 +2,15 @@
 gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
-status: completed
-stopped_at: Completed 12-04-PLAN.md
-last_updated: "2026-04-06T09:45:38.963Z"
+status: executing
+stopped_at: Completed 13-02-PLAN.md
+last_updated: "2026-04-06T09:54:37.643Z"
 last_activity: 2026-04-06
 progress:
   total_phases: 18
   completed_phases: 12
-  total_plans: 69
-  completed_plans: 70
+  total_plans: 73
+  completed_plans: 71
   percent: 20
 ---
 
@@ -21,13 +21,13 @@ progress:
 See: .planning/PROJECT.md (updated 2026-04-04)
 
 **Core value:** Detect leaked LLM API keys across more providers and more internet sources than any other tool, with active verification to confirm keys are real and alive.
-**Current focus:** Phase 12 — osint_iot_cloud_storage (in progress)
+**Current focus:** Phase 13 — osint-package-registries
 
 ## Current Position
 
-Phase: 13
-Plan: Not started
-Status: Plan 04 complete
+Phase: 13 (osint-package-registries) — EXECUTING
+Plan: 2 of 4
+Status: Ready to execute
 Last activity: 2026-04-06
 
 Progress: [██░░░░░░░░] 20%
@@ -93,6 +93,7 @@ Progress: [██░░░░░░░░] 20%
 | Phase 11 P01 | 3min | 2 tasks | 11 files |
 | Phase 12 P01 | 3min | 2 tasks | 6 files |
 | Phase 12 P04 | 14min | 2 tasks | 4 files |
+| Phase 13 P02 | 3min | 2 tasks | 8 files |
 
 ## Accumulated Context
 
@@ -135,6 +136,7 @@ Recent decisions affecting current work:
 - [Phase 11]: All five search sources use dork query format to focus on paste/code hosting leak sites
 - [Phase 12]: Shodan/Censys/ZoomEye use bare keyword queries; Censys POST+BasicAuth, Shodan key param, ZoomEye API-KEY header
 - [Phase 12]: RegisterAll extended to 28 sources (18 Phase 10-11 + 10 Phase 12); cloud scanners credentialless, IoT scanners credential-gated
+- [Phase 13]: GoProxy regex requires domain dot to filter non-module paths; NuGet projectUrl fallback to nuget.org canonical
 
 ### Pending Todos
 
@@ -149,6 +151,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-04-06T09:42:09.000Z
-Stopped at: Completed 12-04-PLAN.md
+Last session: 2026-04-06T09:54:37.639Z
+Stopped at: Completed 13-02-PLAN.md
 Resume file: None
diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-02-SUMMARY.md b/.planning/phases/13-osint_package_registries_container_iac/13-02-SUMMARY.md
new file mode 100644
index 0000000..4e2a79c
--- /dev/null
+++ b/.planning/phases/13-osint_package_registries_container_iac/13-02-SUMMARY.md
@@ -0,0 +1,121 @@
+---
+phase: 13-osint_package_registries_container_iac
+plan: 02
+subsystem: recon
+tags: [maven, nuget, goproxy, packagist, osint, package-registry]
+
+# Dependency graph
+requires:
+  - phase: 09-osint-infrastructure
+    provides: ReconSource interface, LimiterRegistry, shared Client
+  - phase: 10-osint-code-hosting
+    provides: BuildQueries, extractAnchorHrefs HTML parsing helper
+provides:
+  - MavenSource searching Maven Central Solr API
+  - NuGetSource searching NuGet gallery JSON API
+  - GoProxySource parsing pkg.go.dev HTML search results
+  - PackagistSource searching Packagist JSON API
+affects: [13-04, register-all-wiring]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns: [JSON API source pattern for Maven/NuGet/Packagist, HTML scraping reuse for GoProxy via extractAnchorHrefs]
+
+key-files:
+  created:
+    - pkg/recon/sources/maven.go
+    - pkg/recon/sources/maven_test.go
+    - pkg/recon/sources/nuget.go
+    - pkg/recon/sources/nuget_test.go
+    - pkg/recon/sources/goproxy.go
+    - pkg/recon/sources/goproxy_test.go
+    - pkg/recon/sources/packagist.go
+    - pkg/recon/sources/packagist_test.go
+  modified: []
+
+key-decisions:
+  - "GoProxy regex requires domain dot to filter non-module paths like /about"
+  - "NuGet uses projectUrl with fallback to nuget.org/packages/{id} when empty"
+
+patterns-established:
+  - "JSON registry source: parse response, emit Finding per result, continue on HTTP errors"
+  - "HTML registry source: reuse extractAnchorHrefs with domain-aware regex"
+
+requirements-completed: [RECON-PKG-02, RECON-PKG-03]
+
+# Metrics
+duration: 3min
+completed: 2026-04-06
+---
+
+# Phase 13 Plan 02: Maven, NuGet, GoProxy, Packagist Sources Summary
+
+**Four package registry ReconSources covering Java/JVM (Maven Central), .NET (NuGet), Go (pkg.go.dev), and PHP (Packagist) ecosystems**
+
+## Performance
+
+- **Duration:** 3 min
+- **Started:** 2026-04-06T09:51:21Z
+- **Completed:** 2026-04-06T09:54:16Z
+- **Tasks:** 2
+- **Files modified:** 8
+
+## Accomplishments
+- MavenSource queries Maven Central's Solr search API, parsing grouped artifact results
+- NuGetSource queries NuGet gallery with projectUrl fallback to nuget.org canonical URL
+- GoProxySource parses pkg.go.dev HTML search results reusing extractAnchorHrefs with domain-aware regex
+- PackagistSource queries Packagist JSON search API for PHP packages
+- All four sources: httptest fixtures, context cancellation, metadata method tests (16 tests total)
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Implement MavenSource and NuGetSource** - `2361315` (feat)
+2. **Task 2: Implement GoProxySource and PackagistSource** - `018bb16` (feat)
+
+## Files Created/Modified
+- `pkg/recon/sources/maven.go` - MavenSource querying Maven Central Solr API
+- `pkg/recon/sources/maven_test.go` - httptest with canned Solr JSON fixture
+- `pkg/recon/sources/nuget.go` - NuGetSource querying NuGet gallery search API
+- `pkg/recon/sources/nuget_test.go` - httptest with canned NuGet JSON, projectUrl fallback test
+- `pkg/recon/sources/goproxy.go` - GoProxySource parsing pkg.go.dev HTML search
+- `pkg/recon/sources/goproxy_test.go` - httptest with canned HTML, module path extraction test
+- `pkg/recon/sources/packagist.go` - PackagistSource querying Packagist JSON API
+- `pkg/recon/sources/packagist_test.go` - httptest with canned Packagist JSON fixture
+
+## Decisions Made
+- GoProxy regex tightened to require a dot in the path (`^/[a-z][a-z0-9_-]*\.[a-z0-9./_-]+$`) to distinguish Go module paths from site navigation links like /about
+- NuGet uses projectUrl when available, falls back to canonical nuget.org URL when empty
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 1 - Bug] GoProxy regex too permissive**
+- **Found during:** Task 2 (GoProxySource implementation)
+- **Issue:** Original regex `^/[a-z][a-z0-9./_-]*$` matched non-module paths like /about
+- **Fix:** Tightened to require a dot character (domain separator) in the path
+- **Files modified:** pkg/recon/sources/goproxy.go
+- **Verification:** Test now correctly extracts only 2 module paths from fixture HTML
+- **Committed in:** 018bb16
+
+---
+
+**Total deviations:** 1 auto-fixed (1 bug)
+**Impact on plan:** Minor regex fix for correctness. No scope creep.
+
+## Issues Encountered
+None
+
+## User Setup Required
+None - no external service configuration required.
+
+## Next Phase Readiness
+- All four package registry sources ready for RegisterAll wiring in plan 13-04
+- Sources follow established pattern: BaseURL override for tests, BuildQueries for keyword generation, LimiterRegistry for rate coordination
+
+---
+*Phase: 13-osint_package_registries_container_iac*
+*Completed: 2026-04-06*