From c595fef148d4820307d51df70ccae7195cac9462 Mon Sep 17 00:00:00 2001 From: salvacybersec Date: Mon, 6 Apr 2026 12:55:06 +0300 Subject: [PATCH] docs(13-01): complete package registry sources plan - SUMMARY.md with 4 sources, 16 tests, 8 files - STATE.md updated with decisions and metrics - Requirements RECON-PKG-01, RECON-PKG-02 marked complete --- .planning/REQUIREMENTS.md | 4 +- .planning/STATE.md | 13 ++- .../13-01-SUMMARY.md | 106 ++++++++++++++++++ 3 files changed, 116 insertions(+), 7 deletions(-) create mode 100644 .planning/phases/13-osint_package_registries_container_iac/13-01-SUMMARY.md diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md index dd6db5d..45677da 100644 --- a/.planning/REQUIREMENTS.md +++ b/.planning/REQUIREMENTS.md @@ -125,8 +125,8 @@ Requirements for initial release. Each maps to roadmap phases. ### OSINT/Recon — Package Registries -- [ ] **RECON-PKG-01**: npm registry package scanning (download + extract + grep) -- [ ] **RECON-PKG-02**: PyPI package scanning +- [x] **RECON-PKG-01**: npm registry package scanning (download + extract + grep) +- [x] **RECON-PKG-02**: PyPI package scanning - [ ] **RECON-PKG-03**: RubyGems, crates.io, Maven, NuGet, Packagist, Go proxy scanning ### OSINT/Recon — Container & Infrastructure diff --git a/.planning/STATE.md b/.planning/STATE.md index 4d71baa..3140256 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -3,14 +3,14 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone status: completed -stopped_at: Completed 12-04-PLAN.md -last_updated: "2026-04-06T09:45:38.963Z" +stopped_at: Completed 13-01-PLAN.md +last_updated: "2026-04-06T09:54:51.064Z" last_activity: 2026-04-06 progress: total_phases: 18 completed_phases: 12 total_plans: 69 - completed_plans: 70 + completed_plans: 71 percent: 20 --- @@ -93,6 +93,7 @@ Progress: [██░░░░░░░░] 20% | Phase 11 P01 | 3min | 2 tasks | 11 files | | Phase 12 P01 | 3min | 2 tasks | 6 files | | Phase 12 P04 | 14min | 2 tasks | 4 files | +| Phase 13 P01 | 3min | 2 tasks | 8 files | ## Accumulated Context @@ -135,6 +136,8 @@ Recent decisions affecting current work: - [Phase 11]: All five search sources use dork query format to focus on paste/code hosting leak sites - [Phase 12]: Shodan/Censys/ZoomEye use bare keyword queries; Censys POST+BasicAuth, Shodan key param, ZoomEye API-KEY header - [Phase 12]: RegisterAll extended to 28 sources (18 Phase 10-11 + 10 Phase 12); cloud scanners credentialless, IoT scanners credential-gated +- [Phase 13]: PyPI uses HTML scraping with extractAnchorHrefs since no public search JSON API +- [Phase 13]: CratesIO sets custom User-Agent per crates.io API requirements ### Pending Todos @@ -149,6 +152,6 @@ None yet. ## Session Continuity -Last session: 2026-04-06T09:42:09.000Z -Stopped at: Completed 12-04-PLAN.md +Last session: 2026-04-06T09:54:51.060Z +Stopped at: Completed 13-01-PLAN.md Resume file: None diff --git a/.planning/phases/13-osint_package_registries_container_iac/13-01-SUMMARY.md b/.planning/phases/13-osint_package_registries_container_iac/13-01-SUMMARY.md new file mode 100644 index 0000000..2814a7c --- /dev/null +++ b/.planning/phases/13-osint_package_registries_container_iac/13-01-SUMMARY.md @@ -0,0 +1,106 @@ +--- +phase: 13-osint_package_registries_container_iac +plan: 01 +subsystem: recon +tags: [npm, pypi, crates.io, rubygems, package-registry, osint] + +requires: + - phase: 10-osint-code-hosting + provides: ReconSource interface, Client, BuildQueries, LimiterRegistry patterns +provides: + - NpmSource searching npm registry JSON API + - PyPISource scraping pypi.org search HTML + - CratesIOSource searching crates.io JSON API with custom User-Agent + - RubyGemsSource searching rubygems.org search.json API +affects: [13-osint_package_registries_container_iac, register.go] + +tech-stack: + added: [] + patterns: [JSON API source pattern, HTML scraping source pattern with extractAnchorHrefs reuse] + +key-files: + created: + - pkg/recon/sources/npm.go + - pkg/recon/sources/npm_test.go + - pkg/recon/sources/pypi.go + - pkg/recon/sources/pypi_test.go + - pkg/recon/sources/cratesio.go + - pkg/recon/sources/cratesio_test.go + - pkg/recon/sources/rubygems.go + - pkg/recon/sources/rubygems_test.go + modified: [] + +key-decisions: + - "PyPI uses HTML scraping with extractAnchorHrefs (reusing Replit pattern) since PyPI has no public search JSON API" + - "CratesIO sets custom User-Agent per crates.io API requirements" + +patterns-established: + - "Package registry source pattern: credentialless, JSON API search, bare keyword queries via BuildQueries" + +requirements-completed: [RECON-PKG-01, RECON-PKG-02] + +duration: 3min +completed: 2026-04-06 +--- + +# Phase 13 Plan 01: Package Registry Sources Summary + +**Four package registry ReconSources (npm, PyPI, crates.io, RubyGems) searching JS/Python/Rust/Ruby ecosystems for provider keyword matches** + +## Performance + +- **Duration:** 3 min +- **Started:** 2026-04-06T09:51:16Z +- **Completed:** 2026-04-06T09:54:00Z +- **Tasks:** 2 +- **Files modified:** 8 + +## Accomplishments +- NpmSource searches npm registry JSON API with 20-result pagination per keyword +- PyPISource scrapes pypi.org search HTML reusing extractAnchorHrefs from Replit pattern +- CratesIOSource queries crates.io JSON API with required custom User-Agent header +- RubyGemsSource queries rubygems.org search.json with fallback URL construction +- All four sources credentialless, rate-limited, context-aware with httptest test coverage + +## Task Commits + +Each task was committed atomically: + +1. **Task 1: Implement NpmSource and PyPISource** - `4b268d1` (feat) +2. **Task 2: Implement CratesIOSource and RubyGemsSource** - `9907e24` (feat) + +## Files Created/Modified +- `pkg/recon/sources/npm.go` - NpmSource searching npm registry JSON API +- `pkg/recon/sources/npm_test.go` - httptest tests for NpmSource (4 tests) +- `pkg/recon/sources/pypi.go` - PyPISource scraping pypi.org search HTML +- `pkg/recon/sources/pypi_test.go` - httptest tests for PyPISource (4 tests) +- `pkg/recon/sources/cratesio.go` - CratesIOSource with custom User-Agent +- `pkg/recon/sources/cratesio_test.go` - httptest tests verifying User-Agent header (4 tests) +- `pkg/recon/sources/rubygems.go` - RubyGemsSource searching rubygems.org JSON API +- `pkg/recon/sources/rubygems_test.go` - httptest tests for RubyGemsSource (4 tests) + +## Decisions Made +- PyPI uses HTML scraping with extractAnchorHrefs (reusing Replit pattern) since PyPI has no public search JSON API +- CratesIO sets custom User-Agent header per crates.io API policy requirements +- All sources use bare keyword queries via BuildQueries default path + +## Deviations from Plan + +None - plan executed exactly as written. + +## Issues Encountered +None + +## User Setup Required +None - no external service configuration required. + +## Known Stubs +None - all sources fully wired with real API endpoints and functional Sweep implementations. + +## Next Phase Readiness +- Four package registry sources ready for RegisterAll wiring +- Pattern established for remaining registry sources (Maven, NuGet, GoProxy) + +--- +*Phase: 13-osint_package_registries_container_iac* +*Completed: 2026-04-06*