From 1acbedc03a8054f7213fa63cec6b6a095032013a Mon Sep 17 00:00:00 2001 From: salvacybersec Date: Mon, 6 Apr 2026 01:28:32 +0300 Subject: [PATCH] docs(10-09): complete RegisterAll + integration test plan --- .planning/ROADMAP.md | 16 +-- .planning/STATE.md | 16 +-- .../10-osint-code-hosting/10-09-SUMMARY.md | 100 ++++++++++++++++++ 3 files changed, 117 insertions(+), 15 deletions(-) create mode 100644 .planning/phases/10-osint-code-hosting/10-09-SUMMARY.md diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md index 5c637e8..0d042f2 100644 --- a/.planning/ROADMAP.md +++ b/.planning/ROADMAP.md @@ -21,7 +21,7 @@ Decimal phases appear between their surrounding integers in numeric order. - [ ] **Phase 7: Import Adapters & CI/CD Integration** - TruffleHog/Gitleaks import + pre-commit hooks + SARIF to GitHub Security - [ ] **Phase 8: Dork Engine** - YAML-based dork definitions with 150+ built-in dorks and management commands - [ ] **Phase 9: OSINT Infrastructure** - Per-source rate limiter architecture and recon engine framework before any sources -- [ ] **Phase 10: OSINT Code Hosting** - GitHub, GitLab, Bitbucket, HuggingFace and 6 more code hosting sources +- [x] **Phase 10: OSINT Code Hosting** - GitHub, GitLab, Bitbucket, HuggingFace and 6 more code hosting sources (completed 2026-04-05) - [ ] **Phase 11: OSINT Search & Paste** - Search engine dorking and paste site aggregation - [ ] **Phase 12: OSINT IoT & Cloud Storage** - Shodan/Censys/ZoomEye/FOFA and S3/GCS/Azure cloud storage scanning - [ ] **Phase 13: OSINT Package Registries & Container/IaC** - npm/PyPI/crates.io and Docker Hub/K8s/Terraform scanning @@ -219,13 +219,13 @@ Plans: Plans: - [x] 10-01-PLAN.md — Shared HTTP client + provider-query generator + RegisterAll skeleton - [x] 10-02-PLAN.md — GitHubSource (RECON-CODE-01) -- [ ] 10-03-PLAN.md — GitLabSource (RECON-CODE-02) -- [ ] 10-04-PLAN.md — BitbucketSource + GistSource (RECON-CODE-03, RECON-CODE-04) -- [ ] 10-05-PLAN.md — CodebergSource/Gitea 
(RECON-CODE-05) -- [ ] 10-06-PLAN.md — HuggingFaceSource (RECON-CODE-08) +- [x] 10-03-PLAN.md — GitLabSource (RECON-CODE-02) +- [x] 10-04-PLAN.md — BitbucketSource + GistSource (RECON-CODE-03, RECON-CODE-04) +- [x] 10-05-PLAN.md — CodebergSource/Gitea (RECON-CODE-05) +- [x] 10-06-PLAN.md — HuggingFaceSource (RECON-CODE-08) - [x] 10-07-PLAN.md — Replit + CodeSandbox + Sandboxes scrapers (RECON-CODE-06, RECON-CODE-07, RECON-CODE-10) -- [ ] 10-08-PLAN.md — KaggleSource (RECON-CODE-09) -- [ ] 10-09-PLAN.md — RegisterAll wiring + CLI integration + end-to-end test +- [x] 10-08-PLAN.md — KaggleSource (RECON-CODE-09) +- [x] 10-09-PLAN.md — RegisterAll wiring + CLI integration + end-to-end test ### Phase 11: OSINT Search & Paste **Goal**: Users can run automated search engine dorking against Google, Bing, DuckDuckGo, Yandex, and Brave, and scan 15+ paste site aggregations for leaked API keys @@ -336,7 +336,7 @@ Phases execute in numeric order: 1 → 2 → 3 → ... → 18 | 7. Import Adapters & CI/CD Integration | 0/? | Not started | - | | 8. Dork Engine | 0/? | Not started | - | | 9. OSINT Infrastructure | 2/6 | In Progress| | -| 10. OSINT Code Hosting | 3/9 | In Progress| | +| 10. OSINT Code Hosting | 9/9 | Complete | 2026-04-05 | | 11. OSINT Search & Paste | 0/? | Not started | - | | 12. OSINT IoT & Cloud Storage | 0/? | Not started | - | | 13. OSINT Package Registries & Container/IaC | 0/? 
| Not started | - | diff --git a/.planning/STATE.md b/.planning/STATE.md index 2420e3f..f567054 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -3,14 +3,14 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone status: executing -stopped_at: Completed 10-07-PLAN.md -last_updated: "2026-04-05T22:19:41.729Z" +stopped_at: Completed 10-09-PLAN.md +last_updated: "2026-04-05T22:28:27.416Z" last_activity: 2026-04-05 progress: total_phases: 18 - completed_phases: 9 + completed_phases: 10 total_plans: 62 - completed_plans: 57 + completed_plans: 63 percent: 20 --- @@ -26,7 +26,7 @@ See: .planning/PROJECT.md (updated 2026-04-04) ## Current Position Phase: 10 (osint-code-hosting) — EXECUTING -Plan: 3 of 9 +Plan: 4 of 9 Status: Ready to execute Last activity: 2026-04-05 @@ -88,6 +88,7 @@ Progress: [██░░░░░░░░] 20% | Phase 10-osint-code-hosting P01 | 4m | 2 tasks | 7 files | | Phase 10-osint-code-hosting P02 | 5min | 1 tasks | 2 files | | Phase 10-osint-code-hosting P07 | 6 | 2 tasks | 6 files | +| Phase 10 P09 | 12min | 2 tasks | 5 files | ## Accumulated Context @@ -124,6 +125,7 @@ Recent decisions affecting current work: - [Phase 10-osint-code-hosting]: Client handles retry only; rate limiting is caller's responsibility via LimiterRegistry - [Phase 10-osint-code-hosting]: github/gist use 'kw' in:file; all other sources use bare keyword - [Phase 10-osint-code-hosting]: GitHubSource reuses shared sources.Client + LimiterRegistry; builds queries from providers.Registry via BuildQueries; missing token disables (not errors) +- [Phase 10]: RegisterAll registers all ten Phase 10 sources unconditionally; missing credentials flip Enabled()==false rather than hiding sources from the CLI catalog ### Pending Todos @@ -138,6 +140,6 @@ None yet. 
## Session Continuity -Last session: 2026-04-05T22:19:41.725Z -Stopped at: Completed 10-07-PLAN.md +Last session: 2026-04-05T22:28:27.412Z +Stopped at: Completed 10-09-PLAN.md Resume file: None diff --git a/.planning/phases/10-osint-code-hosting/10-09-SUMMARY.md b/.planning/phases/10-osint-code-hosting/10-09-SUMMARY.md new file mode 100644 index 0000000..47997f8 --- /dev/null +++ b/.planning/phases/10-osint-code-hosting/10-09-SUMMARY.md @@ -0,0 +1,100 @@ +--- +phase: 10-osint-code-hosting +plan: 09 +subsystem: recon +tags: [register, integration, cmd, viper, httptest] + +requires: + - phase: 10-osint-code-hosting + provides: "Ten code-hosting ReconSource implementations (Plans 10-01..10-08)" +provides: + - "sources.RegisterAll wires all ten Phase 10 sources onto a recon.Engine" + - "cmd/recon.go constructs real SourcesConfig from env + viper and invokes RegisterAll" + - "End-to-end SweepAll integration test exercising every source against one multiplexed httptest server" +affects: [11-osint-pastebins, 12-osint-search-engines, cli-recon] + +tech-stack: + added: [] + patterns: + - "Env-var → viper fallback (firstNonEmpty) for recon credential lookup" + - "Unconditional source registration: credless sources register but Enabled()==false, uniform CLI surface" + - "Single httptest.ServeMux routing per-path fixtures for multi-source integration tests" + +key-files: + created: + - pkg/recon/sources/register_test.go + - pkg/recon/sources/integration_test.go + - .planning/phases/10-osint-code-hosting/deferred-items.md + modified: + - pkg/recon/sources/register.go + - cmd/recon.go + +key-decisions: + - "Register all ten sources unconditionally so `keyhunter recon list` shows the full catalog regardless of configured credentials; missing creds just flip Enabled()==false" + - "Integration test constructs sources directly with BaseURL overrides (not via RegisterAll) because RegisterAll wires production URLs" + - "Credential precedence: env var → viper config key → empty (source 
disabled)" + - "Single multiplexed httptest server used instead of ten separate servers — simpler and matches how recon.Engine fans out in parallel" + - "firstNonEmpty helper kept local to cmd/recon.go rather than pkg-level to avoid exporting a trivial utility" + +patterns-established: + - "sources.RegisterAll(engine, cfg) is the single call cmd-layer code must make to wire Phase 10" + - "Integration tests that need to drive many sources from one server encode the sub-source into the URL path (/search/code, /api/v4/search, etc.)" + - "Struct literals for sources that lazy-init `client` in Sweep; NewXxxSource constructor for sources that don't (GitHubSource, KaggleSource, HuggingFaceSource)" + +requirements-completed: [RECON-CODE-10] + +duration: 12min +completed: 2026-04-05 +--- + +# Phase 10 Plan 09: RegisterAll + cmd/recon + Integration Test Summary + +**Ten Phase 10 code-hosting sources now wire onto recon.Engine via sources.RegisterAll, the CLI reads credentials from env+viper, and an end-to-end integration test drives every source through SweepAll against one multiplexed httptest server.** + +## Performance + +- **Duration:** ~12 min +- **Tasks:** 2 (both TDD) +- **Files created:** 3 +- **Files modified:** 2 + +## Accomplishments + +- `sources.RegisterAll` wires all ten sources (github, gitlab, bitbucket, gist, codeberg, huggingface, replit, codesandbox, sandboxes, kaggle) onto a `*recon.Engine` in one call +- Extended `SourcesConfig` with `BitbucketWorkspace` and `CodebergToken` fields to match Wave 2 constructor signatures +- `cmd/recon.go` now loads providers.Registry, constructs a full `SourcesConfig` from env vars (`GITHUB_TOKEN`, `GITLAB_TOKEN`, `BITBUCKET_TOKEN`, `BITBUCKET_WORKSPACE`, `CODEBERG_TOKEN`, `HUGGINGFACE_TOKEN`, `KAGGLE_USERNAME`, `KAGGLE_KEY`) with viper fallback keys under `recon..*`, and calls `sources.RegisterAll` +- `keyhunter recon list` now prints all eleven source names (`example` + ten Phase 10 sources) +- Integration test 
(`integration_test.go::TestIntegration_AllSources_SweepAll`) spins up a single `httptest` server with per-path handlers for every source's API/HTML fixture, registers all ten sources (with BaseURL overrides) on a fresh `recon.Engine`, runs `SweepAll`, and asserts at least one `Finding` was emitted for each of the ten `recon:*` `SourceType` values +- `register_test.go` covers RegisterAll contracts: exactly ten sources registered in deterministic sorted order, nil engine is a no-op, and empty credentials still produce a full registration list + +## Verification + +- `go test ./pkg/recon/sources/ -run TestRegisterAll -v` → 4 passing (nil, empty cfg, all-ten, missing-creds) +- `go test ./pkg/recon/sources/ -run TestIntegration_AllSources_SweepAll -v` → passing; asserts 10/10 SourceType buckets populated +- `go test ./pkg/recon/...` → all green (35s, includes pre-existing per-source suites) +- `go vet ./...` → clean +- `go build ./...` → clean +- `go run . recon list` → prints `bitbucket codeberg codesandbox example gist github gitlab huggingface kaggle replit sandboxes` + +## Deviations from Plan + +None — plan executed as written. One out-of-scope finding was identified and logged to `deferred-items.md` (GitHubSource.Sweep dereferences `s.client` without a nil check; safe in current code paths because `RegisterAll` uses `NewGitHubSource` which initializes it, but a latent footgun for future struct-literal callers). + +## Known Stubs + +None. All ten sources are production-wired through `RegisterAll` and exercised by the integration test against realistic fixtures. 
+ +## Commits + +- `4628ccf` test(10-09): add failing RegisterAll wiring tests +- `fb3e573` feat(10-09): wire all ten Phase 10 sources in RegisterAll +- `8528108` test(10-09): add end-to-end SweepAll integration test across all ten sources +- `e00fb17` feat(10-09): wire sources.RegisterAll into cmd/recon with viper+env credential lookup + +## Self-Check: PASSED + +- pkg/recon/sources/register.go — FOUND +- pkg/recon/sources/register_test.go — FOUND +- pkg/recon/sources/integration_test.go — FOUND +- cmd/recon.go — FOUND +- commits 4628ccf, fb3e573, 8528108, e00fb17 — FOUND
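Note on the credential precedence recorded under key-decisions (env var, then viper config key, then empty, which leaves the source disabled): the following is a minimal Go sketch of that lookup combined with unconditional registration. It is not taken from cmd/recon.go. Only the helper name `firstNonEmpty` and the env var names come from the summary; its body, the `tokenSource` type, `resolveToken`, and the config keys are hypothetical, and a plain map stands in for viper so the sketch stays stdlib-only.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns the first non-empty string among vals, or "" when
// all of them are empty. Only the name comes from the summary; this body
// is an assumption about what such a helper would do.
func firstNonEmpty(vals ...string) string {
	for _, v := range vals {
		if v != "" {
			return v
		}
	}
	return ""
}

// tokenSource is a hypothetical stand-in for a recon source that needs a
// token before it can run.
type tokenSource struct {
	name  string
	token string
}

// Enabled reports whether the source has the credential it needs.
func (s tokenSource) Enabled() bool { return s.token != "" }

// resolveToken applies the precedence: env var, then config value, then
// empty. cfg is a plain map standing in for viper (assumption).
func resolveToken(envKey, cfgKey string, cfg map[string]string) string {
	return firstNonEmpty(os.Getenv(envKey), cfg[cfgKey])
}

func main() {
	// Placeholder config key, not the project's actual viper key.
	cfg := map[string]string{"gitlab.token": "token-from-config"}

	sources := []tokenSource{
		{name: "github", token: resolveToken("GITHUB_TOKEN", "github.token", cfg)},
		{name: "gitlab", token: resolveToken("GITLAB_TOKEN", "gitlab.token", cfg)},
	}

	// Every source is listed; missing credentials only change Enabled().
	for _, s := range sources {
		fmt.Printf("%-8s enabled=%v\n", s.name, s.Enabled())
	}
}
```

In this sketch a source with no token still appears in the listing and only reports `enabled=false`, which mirrors the stated reason for registering all ten sources regardless of configured credentials.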