Merge branch 'worktree-agent-ad7ef8d3'
117
.planning/phases/10-osint-code-hosting/10-08-SUMMARY.md
Normal file
@@ -0,0 +1,117 @@
---
phase: 10-osint-code-hosting
plan: 08
subsystem: recon
tags: [kaggle, osint, http-basic-auth, httptest]

requires:
  - phase: 10-osint-code-hosting
    provides: "recon.ReconSource interface, sources.Client, BuildQueries, LimiterRegistry (Plan 10-01)"
provides:
  - "KaggleSource implementing recon.ReconSource against Kaggle /api/v1/kernels/list"
  - "HTTP Basic auth wiring via req.SetBasicAuth(user, key)"
  - "Finding normalization to Source=<web>/code/<ref>, SourceType=recon:kaggle"
affects: [10-09-register, 10-full-integration]

tech-stack:
  added: []
  patterns:
    - "Basic-auth recon source pattern (user + key) as counterpart to bearer-token sources"
    - "Credential-gated Sweep: return nil without HTTP when either credential missing"

key-files:
  created:
    - pkg/recon/sources/kaggle.go
    - pkg/recon/sources/kaggle_test.go
  modified: []

key-decisions:
  - "Short-circuit Sweep with nil error when User or Key is empty — no HTTP, no log spam"
  - "kaggleKernel decoder ignores non-ref fields so API additions don't break decode"
  - "Ignore decode errors and continue to next query (downgrade, not abort) — matches GitHubSource pattern"

patterns-established:
  - "Basic auth: req.SetBasicAuth(s.User, s.Key) after NewRequestWithContext"
  - "Web URL derivation from API ref: web + /code/ + ref"

requirements-completed: [RECON-CODE-09]

duration: 8min
completed: 2026-04-05
---

# Phase 10 Plan 08: KaggleSource Summary

**KaggleSource emits Findings from Kaggle public notebook search via HTTP Basic auth against /api/v1/kernels/list**

## Performance

- **Duration:** ~8 min
- **Tasks:** 1 (TDD)
- **Files created:** 2

## Accomplishments

- KaggleSource type implementing recon.ReconSource (Name, RateLimit, Burst, RespectsRobots, Enabled, Sweep)
- Credentials-gated: both User AND Key required; missing either returns nil with zero HTTP calls
- HTTP Basic auth wired via req.SetBasicAuth to Kaggle's /api/v1/kernels/list endpoint
- Findings normalized with SourceType "recon:kaggle" and Source = WebBaseURL + "/code/" + ref
- 60 req/min rate limit via rate.Every(1*time.Second), burst 1, honoring per-source LimiterRegistry
- Compile-time interface assertion: `var _ recon.ReconSource = (*KaggleSource)(nil)`

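The credential gate, Basic auth wiring, and URL derivation listed above can be sketched roughly as follows. This is not the contents of `kaggle.go`: the `kaggleSketch` type, its `search` method, the `webURL` helper, the `search` query parameter name, and the `ref` JSON key are all assumptions made for illustration, and a local `httptest` server stands in for Kaggle.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// kernel declares only the field this sketch needs (JSON key assumed to be "ref").
type kernel struct {
	Ref string `json:"ref"`
}

// kaggleSketch is a hypothetical stand-in for KaggleSource.
type kaggleSketch struct {
	User, Key  string
	BaseURL    string // API root; tests can point this at httptest
	WebBaseURL string // stem for the URLs reported in Findings
}

// enabled reports whether both credentials are present.
func (s kaggleSketch) enabled() bool { return s.User != "" && s.Key != "" }

// webURL derives the public notebook URL from an API ref.
func webURL(web, ref string) string { return web + "/code/" + ref }

// search runs one query and returns notebook URLs. When either credential
// is missing it returns nil, nil without issuing any HTTP request.
func (s kaggleSketch) search(ctx context.Context, query string) ([]string, error) {
	if !s.enabled() {
		return nil, nil
	}
	// The "search" parameter name is an assumption of this sketch.
	endpoint := s.BaseURL + "/api/v1/kernels/list?search=" + url.QueryEscape(query)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(s.User, s.Key)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var kernels []kernel
	if err := json.NewDecoder(resp.Body).Decode(&kernels); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(kernels))
	for _, k := range kernels {
		urls = append(urls, webURL(s.WebBaseURL, k.Ref))
	}
	return urls, nil
}

func main() {
	// Local server that accepts only testuser:testkey.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, key, ok := r.BasicAuth()
		if !ok || user != "testuser" || key != "testkey" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `[{"ref":"alice/notebook-one","title":"ignored"}]`)
	}))
	defer srv.Close()

	s := kaggleSketch{User: "testuser", Key: "testkey", BaseURL: srv.URL, WebBaseURL: "https://www.kaggle.com"}
	urls, err := s.search(context.Background(), "example-query")
	fmt.Println(urls, err)
}
```

The rate limiter is left out of the sketch because it depends on `golang.org/x/time/rate`; `rate.Every(1*time.Second)` allows one request per second, which is where the 60 req/min figure comes from.
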
## Task Commits

1. **Task 1: KaggleSource + tests (TDD)** — `243b740` (feat)

## Files Created

- `pkg/recon/sources/kaggle.go` — KaggleSource implementation, kaggleKernel decoder, interface assertion
- `pkg/recon/sources/kaggle_test.go` — 6 httptest-driven tests

## Test Coverage

| Test | Covers |
|------|--------|
| TestKaggle_Enabled | All 4 credential combinations (empty/empty, user-only, key-only, both) |
| TestKaggle_Sweep_BasicAuthAndFindings | Authorization header decoded as testuser:testkey, 2 refs → 2 Findings with correct Source URLs and recon:kaggle SourceType |
| TestKaggle_Sweep_MissingCredentials_NoHTTP | Atomic counter verifies zero HTTP calls when either User or Key empty |
| TestKaggle_Sweep_Unauthorized | 401 response wrapped as ErrUnauthorized |
| TestKaggle_Sweep_CtxCancellation | Pre-cancelled ctx returns context.Canceled promptly |
| TestKaggle_ReconSourceInterface | Compile + runtime assertions on Name, Burst, RespectsRobots, RateLimit |

All 6 tests pass in isolation: `go test ./pkg/recon/sources/ -run TestKaggle -v`

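Two of the checks in the table, the atomic request counter and the pre-cancelled context, follow a pattern that can be shown outside the test file. The sketch below is illustrative only: `fetch`, `countingServer`, and `probe` are hypothetical names, and `fetch` stands in for a single Sweep request rather than reproducing the real method.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// fetch is a hypothetical stand-in for one Sweep request. It skips HTTP
// entirely when either credential is empty.
func fetch(ctx context.Context, baseURL, user, key string) error {
	if user == "" || key == "" {
		return nil
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL, nil)
	if err != nil {
		return err
	}
	req.SetBasicAuth(user, key)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// countingServer starts a test server and returns it together with a
// counter that records how many requests reached the handler.
func countingServer() (*httptest.Server, *int32) {
	hits := new(int32)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt32(hits, 1)
		w.WriteHeader(http.StatusOK)
	}))
	return srv, hits
}

// probe runs both checks and reports the number of requests seen while the
// source was gated, and whether a pre-cancelled context surfaced as
// context.Canceled.
func probe() (int32, bool) {
	srv, hits := countingServer()
	defer srv.Close()

	// Missing key: no request should reach the server.
	_ = fetch(context.Background(), srv.URL, "testuser", "")
	gatedHits := atomic.LoadInt32(hits)

	// Pre-cancelled context: the client returns an error wrapping context.Canceled.
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	err := fetch(ctx, srv.URL, "testuser", "testkey")
	return gatedHits, errors.Is(err, context.Canceled)
}

func main() {
	gatedHits, cancelled := probe()
	fmt.Println("requests while gated:", gatedHits)
	fmt.Println("cancellation surfaced:", cancelled)
}
```

Counting requests on the server side checks the "no HTTP" claim directly, which asserting on the return value alone would not.
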
## Decisions Made

- **Missing-cred behavior:** Sweep returns nil (no error) when either credential is absent. This matches the GitHubSource pattern: disabled sources are logged and skipped at the Engine level rather than raising an error.
- **Decode tolerance:** The kaggleKernel struct declares only `Ref string`. Other fields (title, author, language) are silently discarded, so upstream API additions don't break the source.
- **Error downgrade:** Non-401 HTTP errors skip to the next query rather than aborting the whole sweep. A 401 is the only hard-fail case because it means the credentials are invalid, not that the failure is transient.
- **Dual BaseURL fields:** BaseURL (API) and WebBaseURL (Finding URL stem) are separate struct fields so tests can point BaseURL at httptest.NewServer while WebBaseURL stays at the production kaggle.com domain for assertion stability.

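The decode-tolerance and error-downgrade decisions can be illustrated with canned responses instead of live HTTP. In this sketch the `response` type, the `sweep` and `decodeRefs` functions, the `errUnauthorized` sentinel, and the `ref` JSON key are hypothetical; only the `kaggleKernel` name and its single `Ref string` field come from the summary.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// errUnauthorized is a hypothetical sentinel standing in for ErrUnauthorized.
var errUnauthorized = errors.New("unauthorized")

// kaggleKernel declares only Ref. encoding/json ignores unknown object
// keys by default, so extra fields in the payload do not fail the decode.
type kaggleKernel struct {
	Ref string `json:"ref"`
}

// decodeRefs extracts the refs from one JSON array payload.
func decodeRefs(body string) ([]string, error) {
	var kernels []kaggleKernel
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&kernels); err != nil {
		return nil, err
	}
	refs := make([]string, 0, len(kernels))
	for _, k := range kernels {
		refs = append(refs, k.Ref)
	}
	return refs, nil
}

// response is a canned status/body pair representing one query's result.
type response struct {
	status int
	body   string
}

// sweep walks the per-query results. A 401 aborts with a wrapped sentinel;
// any other non-200 status or decode failure is skipped.
func sweep(responses []response) ([]string, error) {
	var refs []string
	for _, r := range responses {
		if r.status == http.StatusUnauthorized {
			return nil, fmt.Errorf("kaggle: %w", errUnauthorized)
		}
		if r.status != http.StatusOK {
			continue // downgrade: move on to the next query
		}
		got, err := decodeRefs(r.body)
		if err != nil {
			continue // downgrade: a bad payload does not end the sweep
		}
		refs = append(refs, got...)
	}
	return refs, nil
}

func main() {
	refs, err := sweep([]response{
		{status: http.StatusInternalServerError},
		{status: http.StatusOK, body: "not json"},
		{status: http.StatusOK, body: `[{"ref":"alice/nb","title":"x","language":"python"}]`},
	})
	fmt.Println(refs, err)
}
```

Wrapping the sentinel with `%w` lets callers distinguish the credential failure via `errors.Is` without string matching.
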
## Deviations from Plan

None — plan executed exactly as written. All truths from frontmatter (`must_haves`) satisfied:
- KaggleSource queries `/api/v1/kernels/list` with Basic auth → TestKaggle_Sweep_BasicAuthAndFindings
- Disabled when either credential empty → TestKaggle_Enabled + TestKaggle_Sweep_MissingCredentials_NoHTTP
- Findings tagged recon:kaggle with Source = web + /code/ + ref → TestKaggle_Sweep_BasicAuthAndFindings

## Issues Encountered

- **Sibling-wave file churn:** During testing, sibling Wave 2 plans (10-02 GitHub, 10-05 Replit, 10-07 CodeSandbox, 10-03 GitLab) had already dropped partial files into `pkg/recon/sources/` in the main repo. A stray `github_test.go` with no `github.go` broke package compilation. Resolved by running tests in this plan's git worktree where only kaggle.go and kaggle_test.go are present alongside the Plan 10-01 scaffolding. No cross-plan changes made — scope boundary respected. Final wave merge will resolve all sibling files together.

## Next Phase Readiness

- KaggleSource is ready for registration in Plan 10-09 (`RegisterAll` wiring).
- No blockers for downstream plans. RECON-CODE-09 satisfied.

## Self-Check: PASSED

- File exists: `pkg/recon/sources/kaggle.go` — FOUND
- File exists: `pkg/recon/sources/kaggle_test.go` — FOUND
- Commit exists: `243b740` — FOUND (feat(10-08): add KaggleSource with HTTP Basic auth)
- Tests pass: 6/6 TestKaggle_* (verified with sibling files stashed to isolate package build)

---
*Phase: 10-osint-code-hosting*
*Plan: 08*
*Completed: 2026-04-05*