---
phase: 07-import-cicd
plan: 04
subsystem: cmd/import
tags: [cli, importer, storage, dedup]
requires:
  - pkg/importer (07-01, 07-02, 07-03)
  - pkg/storage.SaveFinding
  - cmd.openDBWithKey
provides:
  - cmd.importCmd (keyhunter import)
  - pkg/storage.FindingExistsByKey
affects:
  - cmd/stubs.go (import stub removed)
tech-stack:
  added: []
  patterns:
    - RunE extracted for direct test invocation
    - Cross-import dedup via DB tuple lookup (no decrypt needed)
key-files:
  created:
    - cmd/import.go
    - cmd/import_test.go
  modified:
    - pkg/storage/queries.go
    - pkg/storage/queries_test.go
    - cmd/stubs.go
decisions:
  - Dedup identity = (provider, masked key, source path, line number); matches importer.FindingKey semantics and requires no key decryption at lookup time
  - Reuse openDBWithKey helper rather than duplicating encryption bootstrap
  - Report total findings, new, and duplicates (in-file + DB) in summary line
metrics:
  duration: ~4 min
  completed: 2026-04-06
requirements: [IMP-01, IMP-02, IMP-03]
---

# Phase 7 Plan 4: Import Command Wiring Summary

Wires `keyhunter import --format=trufflehog|gitleaks|gitleaks-csv ` end-to-end: parses external scanner output via pkg/importer, deduplicates in-file and against the existing KeyHunter database, and persists new findings to encrypted SQLite storage.

## What Was Built

- **cmd/import.go**: new `importCmd` with required `--format` flag dispatching to `TruffleHogImporter`, `GitleaksImporter`, or `GitleaksCSVImporter`. `runImport` opens the file, decodes, runs `importer.Dedup`, then for each unique finding checks `db.FindingExistsByKey` before `db.SaveFinding`. Emits `Imported N findings (M new, K duplicates)` to stdout where K combines in-file duplicates and pre-existing DB matches.
- **engineToStorage helper**: bridges the `engine.Source` / `storage.SourcePath` field name gap and defaults `DetectedAt`.
- **pkg/storage.FindingExistsByKey**: thin `SELECT 1 ... LIMIT 1` lookup keyed on `(provider_name, key_masked, source_path, line_number)`. Makes repeat imports idempotent without decrypting stored key values.
- **cmd/stubs.go**: `importCmd` stub block removed; new `var importCmd` in cmd/import.go takes over the identifier so no cmd/root.go change is required.

## Tests

- `TestSelectImporter`: table covering trufflehog / gitleaks / gitleaks-csv / bogus / empty.
- `TestEngineToStorage`: verifies Source->SourcePath mapping and all verify_* fields.
- `TestRunImport_TruffleHogEndToEnd`: loads `pkg/importer/testdata/trufflehog-sample.json`, runs `runImport` twice: first pass asserts `Imported 3 findings (3 new, 0 duplicates)` and ≥3 rows in `db.ListFindings`; second pass asserts `0 new, 3 duplicates`.
- `TestRunImport_UnknownFormat`: asserts selectImporter surfaces the "unknown format" error.
- `TestRunImport_MissingFile`: asserts wrapped "opening" error for a nonexistent path.
- `TestFindingExistsByKey`: hit case plus four miss cases (each tuple field flipped).

All tests pass: `go build ./...` clean, `go test ./cmd/... ./pkg/storage/... ./pkg/importer/...` ok.

## Deviations from Plan

- **[Rule 3 - Blocking]** The plan sketch left `openDBForImport` and `findingExistsInDB` as TODOs inside cmd/import.go. Replaced inline: `openDBForImport` collapsed into a direct call to the existing `openDBWithKey` helper (per plan's executor note), and `findingExistsInDB` was replaced by a new `storage.FindingExistsByKey` method so dedup runs as a single indexed SQL lookup instead of loading+decrypting every stored finding.
- **[Rule 2 - Missing critical functionality]** `cmd/stubs.go` was already stripped of the `hookCmd` block by a sibling wave-2 plan when this plan reached it. The import stub removal still applied cleanly; no conflict.
- Added `TestRunImport_UnknownFormat` and `TestRunImport_MissingFile` beyond the plan's test list to lock in error-path behavior since the success path exercises most of the happy code.

## Verification

```
cd /home/salva/Documents/apikey
go build ./...
go test ./cmd/... ./pkg/storage/... ./pkg/importer/...
# ok  github.com/salvacybersec/keyhunter/cmd 0.448s
# ok  github.com/salvacybersec/keyhunter/pkg/storage 0.148s
# ok  github.com/salvacybersec/keyhunter/pkg/importer (cached)
```

Manual smoke (matches `` block in plan):

```
go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
# Imported 3 findings (3 new, 0 duplicates)
go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
# Imported 3 findings (0 new, 3 duplicates)
```

The end-to-end test exercises this exact sequence against a tempdir DB.

## Commits

- `9dbb0b8` feat(07-04): wire keyhunter import command with dedup and DB persist

## Self-Check: PASSED

- cmd/import.go: FOUND
- cmd/import_test.go: FOUND
- pkg/storage/queries.go FindingExistsByKey: FOUND
- pkg/storage/queries_test.go TestFindingExistsByKey: FOUND
- cmd/stubs.go importCmd removed: CONFIRMED (grep empty)
- Commit 9dbb0b8: FOUND
- Tests green across cmd, pkg/storage, pkg/importer
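
## Appendix: Dedup Flow Sketch

The two-stage dedup described under What Was Built (in-file first, then against the database) can be illustrated with a small standalone sketch. This is not the code in cmd/import.go: `Finding`, `memStore`, `dedup`, and `importFindings` are hypothetical stand-ins, the in-memory map replaces the encrypted SQLite store, and the method signatures are assumed rather than taken from pkg/storage. It also assumes that N in the summary line counts every decoded finding, so N = new + duplicates.

```go
package main

import "fmt"

// Finding is a hypothetical, trimmed-down stand-in for the real finding type.
type Finding struct {
	Provider  string
	KeyMasked string
	Source    string
	Line      int
}

// key is the dedup identity tuple: (provider, masked key, source path, line number).
type key struct {
	provider, keyMasked, source string
	line                        int
}

func keyOf(f Finding) key {
	return key{f.Provider, f.KeyMasked, f.Source, f.Line}
}

// memStore is an in-memory stand-in for the encrypted SQLite store (assumption).
type memStore struct {
	seen map[key]bool
}

func newMemStore() *memStore {
	return &memStore{seen: make(map[key]bool)}
}

// FindingExistsByKey reports whether a finding with the same tuple is stored.
// Only masked values are compared, so no decryption is involved.
func (s *memStore) FindingExistsByKey(f Finding) bool {
	return s.seen[keyOf(f)]
}

// SaveFinding records the finding under its identity tuple.
func (s *memStore) SaveFinding(f Finding) {
	s.seen[keyOf(f)] = true
}

// dedup drops in-file repeats, keeping the first occurrence of each tuple.
func dedup(in []Finding) (unique []Finding, dropped int) {
	seen := make(map[key]bool, len(in))
	for _, f := range in {
		k := keyOf(f)
		if seen[k] {
			dropped++
			continue
		}
		seen[k] = true
		unique = append(unique, f)
	}
	return unique, dropped
}

// importFindings runs in-file dedup, then a store lookup per unique finding,
// and returns a summary line in the format quoted in this document.
func importFindings(s *memStore, in []Finding) string {
	unique, dups := dedup(in)
	newCount := 0
	for _, f := range unique {
		if s.FindingExistsByKey(f) {
			dups++ // already present in the store
			continue
		}
		s.SaveFinding(f)
		newCount++
	}
	return fmt.Sprintf("Imported %d findings (%d new, %d duplicates)", len(in), newCount, dups)
}

func main() {
	s := newMemStore()
	batch := []Finding{
		{Provider: "openai", KeyMasked: "sk-a...1111", Source: "config.env", Line: 3},
		{Provider: "openai", KeyMasked: "sk-b...2222", Source: "config.env", Line: 9},
		{Provider: "github", KeyMasked: "ghp_...3333", Source: "deploy.sh", Line: 1},
	}
	fmt.Println(importFindings(s, batch)) // Imported 3 findings (3 new, 0 duplicates)
	fmt.Println(importFindings(s, batch)) // Imported 3 findings (0 new, 3 duplicates)
}
```

Running the same batch twice shows the idempotency property: the second pass saves nothing, because every tuple is already present in the store.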