From ca526d8e3265e17cc260a2522e7fbbf1c83da7a8 Mon Sep 17 00:00:00 2001
From: salvacybersec
Date: Mon, 6 Apr 2026 00:00:24 +0300
Subject: [PATCH] docs(07-04): complete import command plan

---
 .../phases/07-import-cicd/07-04-SUMMARY.md | 100 ++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 .planning/phases/07-import-cicd/07-04-SUMMARY.md

diff --git a/.planning/phases/07-import-cicd/07-04-SUMMARY.md b/.planning/phases/07-import-cicd/07-04-SUMMARY.md
new file mode 100644
index 0000000..c4ffbb8
--- /dev/null
+++ b/.planning/phases/07-import-cicd/07-04-SUMMARY.md
@@ -0,0 +1,100 @@
+---
+phase: 07-import-cicd
+plan: 04
+subsystem: cmd/import
+tags: [cli, importer, storage, dedup]
+requires:
+  - pkg/importer (07-01, 07-02, 07-03)
+  - pkg/storage.SaveFinding
+  - cmd.openDBWithKey
+provides:
+  - cmd.importCmd (keyhunter import)
+  - pkg/storage.FindingExistsByKey
+affects:
+  - cmd/stubs.go (import stub removed)
+tech-stack:
+  added: []
+  patterns:
+    - RunE extracted for direct test invocation
+    - Cross-import dedup via DB tuple lookup (no decrypt needed)
+key-files:
+  created:
+    - cmd/import.go
+    - cmd/import_test.go
+  modified:
+    - pkg/storage/queries.go
+    - pkg/storage/queries_test.go
+    - cmd/stubs.go
+decisions:
+  - Dedup identity = (provider, masked key, source path, line number); matches importer.FindingKey semantics and requires no key decryption at lookup time
+  - Reuse openDBWithKey helper rather than duplicating encryption bootstrap
+  - Report total findings, new, and duplicates (in-file + DB) in summary line
+metrics:
+  duration: ~4 min
+  completed: 2026-04-06
+requirements: [IMP-01, IMP-02, IMP-03]
+---
+
+# Phase 7 Plan 4: Import Command Wiring Summary
+
+Wires `keyhunter import --format=trufflehog|gitleaks|gitleaks-csv ` end-to-end: parses external scanner output via pkg/importer, deduplicates in-file and against the existing KeyHunter database, and persists new findings to encrypted SQLite storage.
+
+## What Was Built
+
+- **cmd/import.go** — new `importCmd` with required `--format` flag dispatching to `TruffleHogImporter`, `GitleaksImporter`, or `GitleaksCSVImporter`. `runImport` opens the file, decodes, runs `importer.Dedup`, then for each unique finding checks `db.FindingExistsByKey` before `db.SaveFinding`. Emits `Imported N findings (M new, K duplicates)` to stdout where K combines in-file duplicates and pre-existing DB matches.
+- **engineToStorage helper** — bridges the `engine.Source` / `storage.SourcePath` field name gap and defaults `DetectedAt`.
+- **pkg/storage.FindingExistsByKey** — thin `SELECT 1 ... LIMIT 1` lookup keyed on `(provider_name, key_masked, source_path, line_number)`. Makes repeat imports idempotent without decrypting stored key values.
+- **cmd/stubs.go** — `importCmd` stub block removed; new `var importCmd` in cmd/import.go takes over the identifier so no cmd/root.go change is required.
+
+## Tests
+
+- `TestSelectImporter` — table covering trufflehog / gitleaks / gitleaks-csv / bogus / empty.
+- `TestEngineToStorage` — verifies Source->SourcePath mapping and all verify_* fields.
+- `TestRunImport_TruffleHogEndToEnd` — loads `pkg/importer/testdata/trufflehog-sample.json`, runs `runImport` twice: first pass asserts `Imported 3 findings (3 new, 0 duplicates)` and ≥3 rows in `db.ListFindings`; second pass asserts `0 new, 3 duplicates`.
+- `TestRunImport_UnknownFormat` — asserts selectImporter surfaces the "unknown format" error.
+- `TestRunImport_MissingFile` — asserts wrapped "opening" error for a nonexistent path.
+- `TestFindingExistsByKey` — hit case plus four miss cases (each tuple field flipped).
+
+All tests pass: `go build ./...` clean, `go test ./cmd/... ./pkg/storage/... ./pkg/importer/...` ok.
+
+## Deviations from Plan
+
+- **[Rule 3 - Blocking]** The plan sketch left `openDBForImport` and `findingExistsInDB` as TODOs inside cmd/import.go. Replaced inline: `openDBForImport` collapsed into a direct call to the existing `openDBWithKey` helper (per plan's executor note), and `findingExistsInDB` was replaced by a new `storage.FindingExistsByKey` method so dedup runs as a single indexed SQL lookup instead of loading and decrypting every stored finding.
+- **[Rule 2 - Missing critical functionality]** `cmd/stubs.go` was already stripped of the `hookCmd` block by a sibling wave-2 plan when this plan reached it. The import stub removal still applied cleanly; no conflict.
+- Added `TestRunImport_UnknownFormat` and `TestRunImport_MissingFile` beyond the plan's test list to lock in error-path behavior, since the success-path test already exercises most of the happy path.
+
+## Verification
+
+```
+cd /home/salva/Documents/apikey
+go build ./...
+go test ./cmd/... ./pkg/storage/... ./pkg/importer/...
+# ok github.com/salvacybersec/keyhunter/cmd 0.448s
+# ok github.com/salvacybersec/keyhunter/pkg/storage 0.148s
+# ok github.com/salvacybersec/keyhunter/pkg/importer (cached)
+```
+
+Manual smoke (matches `` block in plan):
+
+```
+go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
+# Imported 3 findings (3 new, 0 duplicates)
+go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
+# Imported 3 findings (0 new, 3 duplicates)
+```
+
+The end-to-end test exercises this exact sequence against a tempdir DB.
+
+## Commits
+
+- `9dbb0b8` feat(07-04): wire keyhunter import command with dedup and DB persist
+
+## Self-Check: PASSED
+
+- cmd/import.go: FOUND
+- cmd/import_test.go: FOUND
+- pkg/storage/queries.go FindingExistsByKey: FOUND
+- pkg/storage/queries_test.go TestFindingExistsByKey: FOUND
+- cmd/stubs.go importCmd removed: CONFIRMED (grep empty)
+- Commit 9dbb0b8: FOUND
+- Tests green across cmd, pkg/storage, pkg/importer
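
Reviewer note: the two-stage dedup described in the summary (in-file first, then against stored findings) can be sketched roughly as below. This is a minimal illustration and not the code from `cmd/import.go`: `Finding`, `memStore`, `dedupInFile`, and `importFindings` are hypothetical names, the in-memory map stands in for the encrypted SQLite database and `FindingExistsByKey`, and it assumes N in the summary line counts every decoded finding.

```go
package main

import "fmt"

// Finding is a simplified, hypothetical stand-in for an imported finding.
type Finding struct {
	Provider  string
	KeyMasked string
	Source    string
	Line      int
}

// findingKey is the dedup identity named in the summary:
// (provider, masked key, source path, line number).
type findingKey struct {
	provider, keyMasked, source string
	line                        int
}

func keyOf(f Finding) findingKey {
	return findingKey{f.Provider, f.KeyMasked, f.Source, f.Line}
}

// memStore is an in-memory substitute for the database (assumption).
type memStore struct{ seen map[findingKey]bool }

func newMemStore() *memStore { return &memStore{seen: make(map[findingKey]bool)} }

// exists plays the role of the tuple lookup; no key material is needed.
func (s *memStore) exists(f Finding) bool { return s.seen[keyOf(f)] }

func (s *memStore) save(f Finding) { s.seen[keyOf(f)] = true }

// dedupInFile keeps the first occurrence of each identity and reports
// how many later occurrences were dropped.
func dedupInFile(in []Finding) (unique []Finding, dropped int) {
	seen := make(map[findingKey]bool)
	for _, f := range in {
		if seen[keyOf(f)] {
			dropped++
			continue
		}
		seen[keyOf(f)] = true
		unique = append(unique, f)
	}
	return unique, dropped
}

// importFindings saves only findings the store has not seen and returns
// a summary line in the format quoted in the summary document.
func importFindings(s *memStore, in []Finding) string {
	unique, dupes := dedupInFile(in)
	added := 0
	for _, f := range unique {
		if s.exists(f) {
			dupes++ // already stored by an earlier import
			continue
		}
		s.save(f)
		added++
	}
	return fmt.Sprintf("Imported %d findings (%d new, %d duplicates)", len(in), added, dupes)
}

func main() {
	store := newMemStore()
	findings := []Finding{
		{"aws", "AKIA...1234", "config/a.env", 1},
		{"github", "ghp_...abcd", "scripts/b.sh", 7},
		{"stripe", "sk_l...wxyz", "app/c.py", 42},
	}
	fmt.Println(importFindings(store, findings)) // first import: all new
	fmt.Println(importFindings(store, findings)) // repeat import: all duplicates
}
```

Because the identity tuple holds only the masked key, the existence check never touches decrypted key material, which is the property the summary relies on to make repeat imports idempotent.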