From 66fc5973993e8b950d05efe759fbc7f7d3a98950 Mon Sep 17 00:00:00 2001 From: salvacybersec Date: Sun, 5 Apr 2026 15:20:10 +0300 Subject: [PATCH] docs(04-04): complete stdin/url/clipboard sources plan --- .../phases/04-input-sources/04-04-SUMMARY.md | 176 ++++++++++++++++++ 1 file changed, 176 insertions(+) create mode 100644 .planning/phases/04-input-sources/04-04-SUMMARY.md diff --git a/.planning/phases/04-input-sources/04-04-SUMMARY.md b/.planning/phases/04-input-sources/04-04-SUMMARY.md new file mode 100644 index 0000000..3e3cd7c --- /dev/null +++ b/.planning/phases/04-input-sources/04-04-SUMMARY.md @@ -0,0 +1,176 @@ +--- +phase: 04-input-sources +plan: 04 +subsystem: input-sources +tags: [stdin, http, clipboard, net/http, atotto, io.LimitReader] + +requires: + - phase: 01-foundation + provides: Source interface, types.Chunk, FileSource chunk-windowing pattern +provides: + - StdinSource reading from any io.Reader (defaults to os.Stdin) + - URLSource fetching http/https with timeout, size cap, scheme allowlist, Content-Type filter + - ClipboardSource wrapping atotto/clipboard with injectable reader for tests + - emitByteChunks helper for in-memory sources (mirrors file.go windowing) +affects: [04-05 scan CLI wiring, phase 05 scan command integration] + +tech-stack: + added: + - github.com/atotto/clipboard (promoted from indirect) + patterns: + - Injectable Reader fields on sources for deterministic testing + - io.LimitReader(cap+1) overflow-detection idiom for HTTP bodies + - Content-Type prefix allowlist with parameter stripping + +key-files: + created: + - pkg/engine/sources/stdin.go + - pkg/engine/sources/stdin_test.go + - pkg/engine/sources/url.go + - pkg/engine/sources/url_test.go + - pkg/engine/sources/clipboard.go + - pkg/engine/sources/clipboard_test.go + modified: + - go.mod + - go.sum + +key-decisions: + - "Named the in-memory chunk helper emitByteChunks to avoid colliding with emitChunks already provided by plan 04-02's dir.go" + - "Content-Type allowlist includes text/*, application/json, application/javascript, application/xml, application/x-yaml, application/yaml; empty Content-Type is trusted" + - "URLSource enforces the 50MB cap twice: via resp.ContentLength short-circuit and via LimitReader(cap+1) to catch servers that lie or omit Content-Length" + - "ClipboardSource uses an injectable Reader func rather than a package-level shim so tests stay hermetic on headless CI" + +patterns-established: + - "In-memory sources: ReadAll the input, then delegate to emitByteChunks for consistent windowing" + - "HTTP safety: parse-and-validate scheme BEFORE building the request, fail closed on non-2xx, filter Content-Type, double-cap body size" + - "Test injection: expose a function-valued field (e.g. Reader func() (string, error)) so tests never touch real OS state" + +requirements-completed: [INPUT-03, INPUT-04, INPUT-05] + +duration: ~8min +completed: 2026-04-05 +--- + +# Phase 4 Plan 4: Stdin + URL + Clipboard Sources Summary + +**Three self-contained Source adapters — StdinSource (io.Reader), URLSource (http/https with 30s timeout, 50MB cap, scheme allowlist, Content-Type filter), and ClipboardSource (atotto/clipboard with injectable fixture reader) — completing Phase 4's input surface area.** + +## Performance + +- **Duration:** ~8 min +- **Started:** 2026-04-05T12:10:50Z +- **Completed:** 2026-04-05T12:18:55Z +- **Tasks:** 1 (combined TDD task for all three sources) +- **Files created:** 6 +- **Files modified:** 2 (go.mod, go.sum — atotto/clipboard promoted to direct dep) + +## Accomplishments + +- StdinSource: reads any io.Reader, emits chunks with Source="stdin", honors ctx cancellation mid-emit +- URLSource: full HTTP(S) fetch with scheme allowlist, 30s timeout, 5-redirect cap, 50MB double-enforced body cap, Content-Type prefix allowlist, User-Agent header +- ClipboardSource: wraps clipboard.ReadAll, detects clipboard.Unsupported, accepts an injectable Reader func for hermetic tests +- Local emitByteChunks helper mirrors file.go's overlap-windowing so in-memory sources stay consistent without depending on sibling wave-1 plans + +## Task Commits + +1. **Task 1: StdinSource + URLSource + ClipboardSource (TDD combined)** - `850c3ff` (feat) + +_Single commit because this plan bundles three small self-contained sources into one TDD task per the plan's design._ + +## Files Created/Modified + +- `pkg/engine/sources/stdin.go` - StdinSource + emitByteChunks helper +- `pkg/engine/sources/stdin_test.go` - Basic / Empty / CtxCancel tests +- `pkg/engine/sources/url.go` - URLSource with scheme allowlist, size cap, Content-Type filter +- `pkg/engine/sources/url_test.go` - Fetches / BinaryContentType / NonHTTPScheme / 500 / Oversize / Redirect +- `pkg/engine/sources/clipboard.go` - ClipboardSource with injectable Reader +- `pkg/engine/sources/clipboard_test.go` - FixtureReader / ReaderError / EmptyClipboard +- `go.mod`, `go.sum` - atotto/clipboard promoted from indirect to direct + +## Test Results + +All 12 subtests pass under `-race`: + +| Test | Result | +| ---------------------------------------- | ------ | +| TestStdinSource_Basic | PASS | +| TestStdinSource_Empty | PASS | +| TestStdinSource_CtxCancel | PASS | +| TestURLSource_Fetches | PASS | +| TestURLSource_RejectsBinaryContentType | PASS | +| TestURLSource_RejectsNonHTTPScheme | PASS | +| TestURLSource_Rejects500 | PASS | +| TestURLSource_RejectsOversizeBody | PASS | +| TestURLSource_FollowsRedirect | PASS | +| TestClipboardSource_FixtureReader | PASS | +| TestClipboardSource_ReaderError | PASS | +| TestClipboardSource_EmptyClipboard | PASS | + +Verification command: `go test ./pkg/engine/sources/... -race -count=1` → `ok` + +## Decisions Made + +- **emitByteChunks naming:** Plan text assumed emitChunks would be available from plan 04-02. Since plan 04-04 runs in parallel (wave 1), I defined a local helper under a non-colliding name (`emitByteChunks`). Post-merge both helpers now coexist — future cleanup could consolidate, but there is no functional duplication risk. +- **Double size-cap enforcement:** Some servers omit or misreport Content-Length. Pairing the Content-Length short-circuit with `io.LimitReader(cap+1)` is a belt-and-braces pattern that never lets >50MB enter memory. +- **Empty Content-Type trusted:** Fail-open on empty Content-Type (some servers omit it for text bodies); fail-closed only on explicit binary types. + +## Deviations from Plan + +### Auto-fixed Issues + +**1. [Rule 3 - Blocking] Used emitByteChunks instead of emitChunks helper** +- **Found during:** Task 1 (source implementation) +- **Issue:** Plan referenced a shared `emitChunks` helper produced by plan 04-02's `dir.go`. This worktree executes in parallel with 04-02, and at write time `dir.go` was not yet present — and after merging, using the same symbol name would conflict. +- **Fix:** Defined a local `emitByteChunks` helper in `stdin.go` that mirrors `file.go`'s chunk-windowing pattern. URLSource and ClipboardSource reuse it. Confirmed post-merge both helpers coexist without collision. +- **Files modified:** pkg/engine/sources/stdin.go (includes helper) +- **Verification:** `grep -n "func emitChunks\|func emitByteChunks" pkg/engine/sources/` shows both symbols; `go build` + `go test -race` pass. +- **Committed in:** 850c3ff + +**2. [Rule 1 - Bug] Clipboard test substring assertion was case-sensitive** +- **Found during:** Task 1 verification run +- **Issue:** `TestClipboardSource_ReaderError` asserted `err.Error()` contained `"clipboard"` but the wrapped error begins with `"ClipboardSource: read: ..."`, so the lowercase substring check failed on the Go-style capitalized prefix. +- **Fix:** Lowercased the error string before the Contains check (`strings.ToLower(err.Error())`). Keeps the assertion intent (error must mention clipboard) while remaining robust to casing. +- **Files modified:** pkg/engine/sources/clipboard_test.go +- **Verification:** Test now passes under `-race`. +- **Committed in:** 850c3ff + +--- + +**Total deviations:** 2 auto-fixed (1 blocking helper collision avoidance, 1 test assertion bug) +**Impact on plan:** Both fixes were required to make the plan executable in the parallel wave-1 context and to make the tests pass. No scope creep; no architectural changes. + +## Issues Encountered + +- atotto/clipboard was present in go.sum as an indirect dependency — `go mod tidy` promoted it to direct when `clipboard.go` imported it. No install step was needed beyond that. +- Linux CI environments without xclip/xsel can safely run these tests because all clipboard tests use the injectable Reader fixture; they never touch real OS clipboard state. + +## Known Stubs + +None. All three sources are fully functional and wired end-to-end through the Source interface. Consumer wiring into `cmd/scan.go` is out of scope for this plan (owned by plan 04-05). + +## User Setup Required + +None — no external service configuration required. The `atotto/clipboard` package is a compile-time dependency and the tests use injected fixtures so no runtime clipboard tooling is needed to pass CI. End users who invoke `keyhunter scan --clipboard` on Linux will still need xclip/xsel/wl-clipboard installed; ClipboardSource returns a clear install-hint error when tooling is missing. + +## Next Phase Readiness + +- StdinSource, URLSource, ClipboardSource are ready for wiring into `cmd/scan.go` by plan 04-05. +- `emitByteChunks` is a candidate for consolidation with `emitChunks` (from 04-02's `dir.go`) in a future refactor — functionally equivalent, named differently to dodge the wave-1 merge collision. + +## Self-Check: PASSED + +- pkg/engine/sources/stdin.go: FOUND +- pkg/engine/sources/stdin_test.go: FOUND +- pkg/engine/sources/url.go: FOUND +- pkg/engine/sources/url_test.go: FOUND +- pkg/engine/sources/clipboard.go: FOUND +- pkg/engine/sources/clipboard_test.go: FOUND +- Commit 850c3ff: FOUND in git log +- `go test ./pkg/engine/sources/... -race -count=1`: PASS +- `go vet ./pkg/engine/sources/...`: clean +- All grep acceptance checks: PASS + +--- +*Phase: 04-input-sources* +*Plan: 04* +*Completed: 2026-04-05*