| phase | plan | subsystem | tags | requires | provides | affects | tech-stack | key-files | key-decisions | patterns-established | requirements-completed | duration | completed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10-osint-code-hosting | 08 | recon |  |  |  |  |  |  |  |  |  | 8min | 2026-04-05 |
Phase 10 Plan 08: KaggleSource Summary
KaggleSource emits Findings from Kaggle public notebook search via HTTP Basic auth against /api/v1/kernels/list
Performance
- Duration: ~8 min
- Tasks: 1 (TDD)
- Files created: 2
Accomplishments
- KaggleSource type implementing recon.ReconSource (Name, RateLimit, Burst, RespectsRobots, Enabled, Sweep)
- Credentials-gated: both User AND Key required; missing either returns nil with zero HTTP calls
- HTTP Basic auth wired via req.SetBasicAuth to Kaggle's /api/v1/kernels/list endpoint
- Findings normalized with SourceType "recon:kaggle" and Source = WebBaseURL + "/code/" + ref
- 60 req/min rate limit via rate.Every(1*time.Second), burst 1, honoring per-source LimiterRegistry
- Compile-time interface assertion: var _ recon.ReconSource = (*KaggleSource)(nil)
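The credential gate and Basic auth wiring listed above can be sketched roughly as follows. This is a minimal illustration, not the code in kaggle.go: the type kaggleSketch, the simplified Sweep signature (returning Source URLs instead of emitting Findings), the "search" query parameter, and the https://www.kaggle.com stem are all assumptions made for the example, and rate limiting is left out.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// kaggleSketch is a hypothetical, simplified stand-in for KaggleSource.
type kaggleSketch struct {
	User, Key  string
	BaseURL    string // API host; tests can point this at httptest
	WebBaseURL string // stem used to build Finding Source URLs
}

// Enabled reports whether both credentials are present.
func (s kaggleSketch) Enabled() bool { return s.User != "" && s.Key != "" }

// Sweep returns Source URLs for one query. The real method's signature is
// not shown in this summary; this shape is an assumption.
func (s kaggleSketch) Sweep(ctx context.Context, query string) ([]string, error) {
	if !s.Enabled() {
		return nil, nil // credentials-gated: no HTTP call, no error
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, s.BaseURL+"/api/v1/kernels/list", nil)
	if err != nil {
		return nil, err
	}
	q := req.URL.Query()
	q.Set("search", query) // assumed parameter name
	req.URL.RawQuery = q.Encode()
	req.SetBasicAuth(s.User, s.Key)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("kaggle: unexpected status %d", resp.StatusCode)
	}
	var kernels []struct {
		Ref string `json:"ref"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&kernels); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(kernels))
	for _, k := range kernels {
		urls = append(urls, s.WebBaseURL+"/code/"+k.Ref)
	}
	return urls, nil
}

// demo runs the sketch against a local fake API. It returns the URLs found
// with both credentials set and the number of HTTP calls made without a Key.
func demo() (urls []string, callsWithoutCreds int64, err error) {
	var calls int64
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&calls, 1)
		u, k, ok := r.BasicAuth()
		if !ok || u != "testuser" || k != "testkey" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `[{"ref":"alice/nb-one"},{"ref":"bob/nb-two"}]`)
	}))
	defer srv.Close()

	off := kaggleSketch{User: "testuser", BaseURL: srv.URL, WebBaseURL: "https://www.kaggle.com"}
	if _, err := off.Sweep(context.Background(), "api key"); err != nil {
		return nil, 0, err
	}
	callsWithoutCreds = atomic.LoadInt64(&calls)

	on := off
	on.Key = "testkey"
	urls, err = on.Sweep(context.Background(), "api key")
	return urls, callsWithoutCreds, err
}

func main() {
	urls, calls, err := demo()
	fmt.Println(urls, calls, err)
}
```

Keeping the Enabled check as the first statement of Sweep is what lets a test count requests and assert that none were made when a credential is missing.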
Task Commits
- Task 1: KaggleSource + tests (TDD): 243b740 (feat)
Files Created
- pkg/recon/sources/kaggle.go: KaggleSource implementation, kaggleKernel decoder, interface assertion
- pkg/recon/sources/kaggle_test.go: 6 httptest-driven tests
Test Coverage
| Test | Covers |
|---|---|
| TestKaggle_Enabled | All 4 credential combinations (empty/empty, user-only, key-only, both) |
| TestKaggle_Sweep_BasicAuthAndFindings | Authorization header decoded as testuser:testkey, 2 refs → 2 Findings with correct Source URLs and recon:kaggle SourceType |
| TestKaggle_Sweep_MissingCredentials_NoHTTP | Atomic counter verifies zero HTTP calls when either User or Key empty |
| TestKaggle_Sweep_Unauthorized | 401 response wrapped as ErrUnauthorized |
| TestKaggle_Sweep_CtxCancellation | Pre-cancelled ctx returns context.Canceled promptly |
| TestKaggle_ReconSourceInterface | Compile + runtime assertions on Name, Burst, RespectsRobots, RateLimit |
All 6 tests pass in isolation: go test ./pkg/recon/sources/ -run TestKaggle -v
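As an illustration of the cancellation case in the table, the sketch below shows how a pre-cancelled context surfaces from net/http as an error matching context.Canceled. The helper names fetch and preCancelledYieldsCanceled are hypothetical and are not taken from kaggle_test.go.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetch is a hypothetical helper that issues one context-bound GET.
func fetch(ctx context.Context, url string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// preCancelledYieldsCanceled cancels the context before the request is sent
// and reports whether the resulting error matches context.Canceled.
func preCancelledYieldsCanceled() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "[]")
	}))
	defer srv.Close()

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // cancel up front, as the test description says

	// http.Client wraps the failure in *url.Error, so errors.Is is used
	// instead of a direct comparison.
	return errors.Is(fetch(ctx, srv.URL), context.Canceled)
}

func main() {
	fmt.Println("canceled:", preCancelledYieldsCanceled())
}
```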
Decisions Made
- Missing-cred behavior: Sweep returns nil (no error) when either credential is absent. This matches the GitHubSource pattern: disabled sources are logged and skipped at the Engine level rather than raising an error.
- Decode tolerance: kaggleKernel struct only declares Ref string. Other fields (title, author, language) are silently discarded so upstream API changes don't break the source.
- Error downgrade: Non-401 HTTP errors skip to the next query rather than aborting the whole sweep. 401 is the only hard-fail case because it means the credentials are actually invalid, not that the failure is transient.
- Dual BaseURL fields: BaseURL (API) and WebBaseURL (Finding URL stem) are separate struct fields so tests can point BaseURL at httptest.NewServer while WebBaseURL stays at the production kaggle.com domain for assertion stability.
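The decode-tolerance and error-downgrade decisions can be sketched together. This is an illustration under stated assumptions, not the code in kaggle.go: the names kernelRef, reply, collectRefs, and errUnauthorized are hypothetical, canned status/body pairs stand in for real HTTP responses, and skipping an undecodable body is an assumption the summary does not confirm.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// errUnauthorized is a hypothetical stand-in for ErrUnauthorized.
var errUnauthorized = errors.New("unauthorized")

// kernelRef declares only Ref. encoding/json ignores unknown fields by
// default, so title, author, and language are dropped without error.
type kernelRef struct {
	Ref string `json:"ref"`
}

// reply is a canned (status, body) pair standing in for one HTTP response.
type reply struct {
	Status int
	Body   string
}

// collectRefs applies the policy per query: 401 aborts the sweep, any other
// non-200 status is skipped, and 200 bodies are decoded into refs.
func collectRefs(replies []reply) ([]string, error) {
	var refs []string
	for _, r := range replies {
		if r.Status == http.StatusUnauthorized {
			return nil, fmt.Errorf("kaggle sweep: %w", errUnauthorized)
		}
		if r.Status != http.StatusOK {
			continue // treated as transient: move on to the next query
		}
		var kernels []kernelRef
		if err := json.Unmarshal([]byte(r.Body), &kernels); err != nil {
			continue // assumption: an undecodable body is skipped as well
		}
		for _, k := range kernels {
			refs = append(refs, k.Ref)
		}
	}
	return refs, nil
}

func main() {
	refs, err := collectRefs([]reply{
		{Status: 500},
		{Status: 200, Body: `[{"ref":"alice/nb","title":"demo","author":"alice"}]`},
	})
	fmt.Println(refs, err)

	_, err = collectRefs([]reply{{Status: 401}})
	fmt.Println(errors.Is(err, errUnauthorized))
}
```

Wrapping the sentinel with %w keeps errors.Is usable by callers while still allowing the message to carry context about which source failed.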
Deviations from Plan
None — plan executed exactly as written. All truths from frontmatter (must_haves) satisfied:
- KaggleSource queries /api/v1/kernels/list with Basic auth → TestKaggle_Sweep_BasicAuthAndFindings
- Disabled when either credential empty → TestKaggle_Enabled + TestKaggle_Sweep_MissingCredentials_NoHTTP
- Findings tagged recon:kaggle with Source = web + /code/ + ref → TestKaggle_Sweep_BasicAuthAndFindings
Issues Encountered
- Sibling-wave file churn: During testing, sibling Wave 2 plans (10-02 GitHub, 10-05 Replit, 10-07 CodeSandbox, 10-03 GitLab) had already dropped partial files into pkg/recon/sources/ in the main repo. A stray github_test.go with no github.go broke package compilation. Resolved by running tests in this plan's git worktree, where only kaggle.go and kaggle_test.go are present alongside the Plan 10-01 scaffolding. No cross-plan changes were made, so the scope boundary was respected. The final wave merge will resolve all sibling files together.
Next Phase Readiness
- KaggleSource is ready for registration in Plan 10-09 (RegisterAll wiring).
- No blockers for downstream plans. RECON-CODE-09 satisfied.
Self-Check: PASSED
- File exists: pkg/recon/sources/kaggle.go: FOUND
- File exists: pkg/recon/sources/kaggle_test.go: FOUND
- Commit exists: 243b740: FOUND (feat(10-08): add KaggleSource with HTTP Basic auth)
- Tests pass: 6/6 TestKaggle_* (verified with sibling files stashed to isolate package build)
Phase: 10-osint-code-hosting Plan: 08 Completed: 2026-04-05