keyhunter/.planning/phases/07-import-cicd/07-04-SUMMARY.md
2026-04-06 00:00:24 +03:00

  • phase: 07-import-cicd
  • plan: 04
  • subsystem: cmd/import
  • tags: cli, importer, storage, dedup
  • requires: pkg/importer (07-01, 07-02, 07-03); pkg/storage.SaveFinding; cmd.openDBWithKey
  • provides: cmd.importCmd (keyhunter import); pkg/storage.FindingExistsByKey
  • affects: cmd/stubs.go (import stub removed)
  • tech-stack (added / patterns): RunE extracted for direct test invocation; cross-import dedup via DB tuple lookup (no decrypt needed)
  • key-files (created / modified): cmd/import.go; cmd/import_test.go; pkg/storage/queries.go; pkg/storage/queries_test.go; cmd/stubs.go
  • decisions:
      • Dedup identity = (provider, masked key, source path, line number); matches importer.FindingKey semantics and requires no key decryption at lookup time
      • Reuse openDBWithKey helper rather than duplicating encryption bootstrap
      • Report total findings, new, and duplicates (in-file + DB) in summary line
  • metrics: duration ~4 min; completed 2026-04-06
  • requirements: IMP-01, IMP-02, IMP-03

Phase 7 Plan 4: Import Command Wiring Summary

Wires keyhunter import --format=trufflehog|gitleaks|gitleaks-csv <file> end-to-end: parses external scanner output via pkg/importer, deduplicates in-file and against the existing KeyHunter database, and persists new findings to encrypted SQLite storage.
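The format dispatch can be pictured with a minimal sketch. The Importer interface, the stubImporter type, and the selectImporter signature below are assumptions made for illustration; the real importers live in pkg/importer and their exact API is not shown in this summary.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Finding is a hypothetical stand-in for the engine finding type.
type Finding struct {
	Provider string
	Source   string
	Line     int
}

// Importer is an assumed interface shape, not the actual pkg/importer API.
type Importer interface {
	Name() string
	Import(r io.Reader) ([]Finding, error)
}

// stubImporter stands in for TruffleHogImporter, GitleaksImporter,
// and GitleaksCSVImporter so the sketch stays self-contained.
type stubImporter struct{ name string }

func (s stubImporter) Name() string { return s.name }

func (s stubImporter) Import(r io.Reader) ([]Finding, error) {
	return nil, errors.New("parsing is not implemented in this sketch")
}

// selectImporter maps the --format value to an importer and rejects
// anything else, including the empty string.
func selectImporter(format string) (Importer, error) {
	switch format {
	case "trufflehog", "gitleaks", "gitleaks-csv":
		return stubImporter{name: format}, nil
	default:
		return nil, fmt.Errorf("unknown format %q (want trufflehog, gitleaks, or gitleaks-csv)", format)
	}
}

func main() {
	for _, f := range []string{"trufflehog", "gitleaks", "gitleaks-csv", "bogus", ""} {
		imp, err := selectImporter(f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("selected:", imp.Name())
	}
}
```

Returning an error from the default branch keeps an unsupported or missing --format value from silently falling through to a parser.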

What Was Built

  • cmd/import.go — new importCmd with required --format flag dispatching to TruffleHogImporter, GitleaksImporter, or GitleaksCSVImporter. runImport opens the file, decodes, runs importer.Dedup, then for each unique finding checks db.FindingExistsByKey before db.SaveFinding. Emits Imported N findings (M new, K duplicates) to stdout where K combines in-file duplicates and pre-existing DB matches.
  • engineToStorage helper — bridges the engine.Source / storage.SourcePath field name gap and defaults DetectedAt.
  • pkg/storage.FindingExistsByKey — thin SELECT 1 ... LIMIT 1 lookup keyed on (provider_name, key_masked, source_path, line_number). Makes repeat imports idempotent without decrypting stored key values.
  • cmd/stubs.go: importCmd stub block removed; the new var importCmd in cmd/import.go takes over the identifier, so no cmd/root.go change is required.
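The import loop described above can be sketched as follows, under stated assumptions: the Store interface, memStore, and importFindings are hypothetical names, and an in-memory map stands in for the encrypted SQLite database. Only the order of operations (in-file dedup, existence check, save, summary line) is taken from this summary.

```go
package main

import "fmt"

// FindingKey mirrors the dedup identity named in this summary:
// (provider, masked key, source path, line number).
type FindingKey struct {
	Provider  string
	KeyMasked string
	Source    string
	Line      int
}

// Store is a hypothetical slice of the storage API used by the loop.
type Store interface {
	FindingExistsByKey(k FindingKey) (bool, error)
	SaveFinding(k FindingKey) error
}

// memStore is an in-memory stand-in for the real database.
type memStore struct{ seen map[FindingKey]bool }

func (m *memStore) FindingExistsByKey(k FindingKey) (bool, error) { return m.seen[k], nil }

func (m *memStore) SaveFinding(k FindingKey) error {
	m.seen[k] = true
	return nil
}

// importFindings drops in-file duplicates, skips findings already in the
// store, saves the rest, and returns the summary line.
func importFindings(db Store, findings []FindingKey) (string, error) {
	total := len(findings)
	inFile := make(map[FindingKey]bool, total)
	unique := make([]FindingKey, 0, total)
	for _, f := range findings {
		if inFile[f] {
			continue // duplicate within the imported file
		}
		inFile[f] = true
		unique = append(unique, f)
	}

	newCount := 0
	for _, f := range unique {
		exists, err := db.FindingExistsByKey(f)
		if err != nil {
			return "", fmt.Errorf("checking existing finding: %w", err)
		}
		if exists {
			continue // already stored by an earlier import
		}
		if err := db.SaveFinding(f); err != nil {
			return "", fmt.Errorf("saving finding: %w", err)
		}
		newCount++
	}
	// Duplicates combine in-file repeats and pre-existing store matches.
	return fmt.Sprintf("Imported %d findings (%d new, %d duplicates)", total, newCount, total-newCount), nil
}

func main() {
	db := &memStore{seen: map[FindingKey]bool{}}
	batch := []FindingKey{
		{"openai", "sk-ab...wxyz", "config/.env", 12},
		{"aws", "AKIA...7QXZ", "deploy/vars.tf", 3},
		{"github", "ghp_...9f2a", "ci/token.txt", 1},
	}
	for i := 0; i < 2; i++ {
		line, err := importFindings(db, batch)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(line)
	}
}
```

Running the same batch twice prints "Imported 3 findings (3 new, 0 duplicates)" and then "Imported 3 findings (0 new, 3 duplicates)", which is the idempotent behavior the existence check is meant to provide.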

Tests

  • TestSelectImporter — table covering trufflehog / gitleaks / gitleaks-csv / bogus / empty.
  • TestEngineToStorage — verifies Source->SourcePath mapping and all verify_* fields.
  • TestRunImport_TruffleHogEndToEnd — loads pkg/importer/testdata/trufflehog-sample.json, runs runImport twice: first pass asserts Imported 3 findings (3 new, 0 duplicates) and ≥3 rows in db.ListFindings; second pass asserts 0 new, 3 duplicates.
  • TestRunImport_UnknownFormat — asserts selectImporter surfaces the "unknown format" error.
  • TestRunImport_MissingFile — asserts wrapped "opening" error for a nonexistent path.
  • TestFindingExistsByKey — hit case plus four miss cases (each tuple field flipped).
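The "each tuple field flipped" idea behind TestFindingExistsByKey can be illustrated with a small sketch. The findingKey struct and missVariants helper are hypothetical, and a map lookup stands in for the database query; the point is only that changing any single field of the tuple must turn a hit into a miss.

```go
package main

import "fmt"

// findingKey is a hypothetical copy of the four-field dedup tuple.
type findingKey struct {
	Provider, KeyMasked, SourcePath string
	Line                            int
}

// missVariants returns four keys, each differing from base in exactly
// one field, so every field is shown to take part in the identity.
func missVariants(base findingKey) []findingKey {
	p, k, s, l := base, base, base, base
	p.Provider += "-other"
	k.KeyMasked += "-other"
	s.SourcePath += "-other"
	l.Line++
	return []findingKey{p, k, s, l}
}

func main() {
	base := findingKey{"openai", "sk-ab...wxyz", "config/.env", 12}
	stored := map[findingKey]bool{base: true}

	fmt.Println("hit:", stored[base])
	for _, v := range missVariants(base) {
		fmt.Println("miss:", !stored[v])
	}
}
```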

The build and all tests pass: go build ./... is clean, and go test ./cmd/... ./pkg/storage/... ./pkg/importer/... reports ok for every package.

Deviations from Plan

  • [Rule 3 - Blocking] The plan sketch left openDBForImport and findingExistsInDB as TODOs inside cmd/import.go. Both were resolved in place: openDBForImport collapsed into a direct call to the existing openDBWithKey helper (per the plan's executor note), and findingExistsInDB was replaced by a new storage.FindingExistsByKey method, so dedup runs as a single indexed SQL lookup instead of loading and decrypting every stored finding.
  • [Rule 2 - Missing critical functionality] cmd/stubs.go was already stripped of the hookCmd block by a sibling wave-2 plan when this plan reached it. The import stub removal still applied cleanly; no conflict.
  • Added TestRunImport_UnknownFormat and TestRunImport_MissingFile beyond the plan's test list to lock in error-path behavior, since the end-to-end test mostly exercises the success path.
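The single-lookup approach from the first deviation might look roughly like the sketch below. The column names come from this summary; the table name findings, the queryRower interface, and the function signature are assumptions, and the real storage.FindingExistsByKey may differ.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// existsQuery uses the column names from this summary. The table name
// "findings" is an assumption made for this sketch.
const existsQuery = `SELECT 1 FROM findings
WHERE provider_name = ? AND key_masked = ? AND source_path = ? AND line_number = ?
LIMIT 1`

// queryRower is a hypothetical narrow interface satisfied by *sql.DB
// and *sql.Tx, so the lookup can run inside or outside a transaction.
type queryRower interface {
	QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
}

// Compile-time check that *sql.DB fits the interface.
var _ queryRower = (*sql.DB)(nil)

// findingExistsByKey reports whether a row with the given tuple exists.
// It compares only the masked key, so nothing has to be decrypted.
func findingExistsByKey(ctx context.Context, q queryRower, provider, masked, path string, line int) (bool, error) {
	var one int
	err := q.QueryRowContext(ctx, existsQuery, provider, masked, path, line).Scan(&one)
	if errors.Is(err, sql.ErrNoRows) {
		return false, nil // no match is a normal result, not an error
	}
	if err != nil {
		return false, fmt.Errorf("finding exists lookup: %w", err)
	}
	return true, nil
}

func main() {
	// No SQL driver is registered in this sketch, so it prints the
	// query instead of executing it.
	fmt.Println(existsQuery)
}
```

Treating sql.ErrNoRows as a plain false keeps callers from having to distinguish "not found" from a real database failure.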

Verification

cd /home/salva/Documents/apikey
go build ./...
go test ./cmd/... ./pkg/storage/... ./pkg/importer/...
# ok  github.com/salvacybersec/keyhunter/cmd        0.448s
# ok  github.com/salvacybersec/keyhunter/pkg/storage 0.148s
# ok  github.com/salvacybersec/keyhunter/pkg/importer (cached)

Manual smoke test (matches the <verification> block in the plan):

go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
# Imported 3 findings (3 new, 0 duplicates)
go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
# Imported 3 findings (0 new, 3 duplicates)

The end-to-end test exercises this exact sequence against a tempdir DB.

Commits

  • 9dbb0b8 feat(07-04): wire keyhunter import command with dedup and DB persist

Self-Check: PASSED

  • cmd/import.go: FOUND
  • cmd/import_test.go: FOUND
  • pkg/storage/queries.go FindingExistsByKey: FOUND
  • pkg/storage/queries_test.go TestFindingExistsByKey: FOUND
  • cmd/stubs.go importCmd removed: CONFIRMED (grep empty)
  • Commit 9dbb0b8: FOUND
  • Tests green across cmd, pkg/storage, pkg/importer