docs(07): create phase 7 import & CI/CD plans
@@ -158,7 +158,15 @@ Plans:
|
||||
2. `keyhunter import --format=gitleaks results.json` and `--format=csv` both import and deduplicate against existing findings
|
||||
3. `keyhunter hook install` installs a git pre-commit hook; running `git commit` on a file with a known API key blocks the commit and prints findings
|
||||
4. `keyhunter scan --output=sarif` produces a valid SARIF 2.1.0 file that GitHub Code Scanning accepts without errors
|
||||
**Plans**: TBD
|
||||
**Plans**: 6 plans
|
||||
|
||||
Plans:
|
||||
- [ ] 07-01-PLAN.md — pkg/importer Importer interface + TruffleHog v3 JSON parser + fixtures (IMP-01)
|
||||
- [ ] 07-02-PLAN.md — Gitleaks JSON + CSV parsers (IMP-02)
|
||||
- [ ] 07-03-PLAN.md — Dedup helper + SARIF GitHub Code Scanning validation test (IMP-03, CICD-02)
|
||||
- [ ] 07-04-PLAN.md — cmd/import.go wiring format dispatch, dedup, DB persistence (IMP-01/02/03)
|
||||
- [ ] 07-05-PLAN.md — cmd/hook.go install/uninstall with embedded pre-commit script (CICD-01)
|
||||
- [ ] 07-06-PLAN.md — docs/CI-CD.md + README CI/CD section with GitHub Actions workflow (CICD-01, CICD-02)
|
||||
|
||||
### Phase 8: Dork Engine
|
||||
**Goal**: Users can run, manage, and extend a library of 150+ built-in YAML dorks across GitHub, Google, Shodan, Censys, ZoomEye, FOFA, GitLab, and Bing — using the same extensibility pattern as provider definitions
|
||||
|
||||
177
.planning/phases/07-import-cicd/07-01-PLAN.md
Normal file
@@ -0,0 +1,177 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 01
|
||||
type: execute
|
||||
wave: 1
|
||||
depends_on: []
|
||||
files_modified:
|
||||
- pkg/importer/importer.go
|
||||
- pkg/importer/trufflehog.go
|
||||
- pkg/importer/trufflehog_test.go
|
||||
- pkg/importer/testdata/trufflehog-sample.json
|
||||
autonomous: true
|
||||
requirements: [IMP-01]
|
||||
must_haves:
|
||||
truths:
|
||||
- "TruffleHog v3 JSON output can be parsed into []engine.Finding"
|
||||
- "Detector names from TruffleHog are normalized to lowercase KeyHunter provider names"
|
||||
- "Verified flag in TruffleHog JSON maps to Finding.Verified + VerifyStatus"
|
||||
artifacts:
|
||||
- path: pkg/importer/importer.go
|
||||
provides: "Importer interface"
|
||||
contains: "type Importer interface"
|
||||
- path: pkg/importer/trufflehog.go
|
||||
provides: "TruffleHog v3 JSON parser"
|
||||
contains: "func (TruffleHogImporter) Import"
|
||||
- path: pkg/importer/testdata/trufflehog-sample.json
|
||||
provides: "Test fixture matching TruffleHog v3 JSON schema"
|
||||
key_links:
|
||||
- from: pkg/importer/trufflehog.go
|
||||
to: pkg/engine/finding.go
|
||||
via: "constructs engine.Finding from TruffleHog records"
|
||||
pattern: "engine\\.Finding\\{"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Create the pkg/importer package with the Importer interface and the TruffleHog v3 JSON adapter.
|
||||
|
||||
Purpose: External tool import requires a uniform contract so the CLI command (Plan 07-04) can dispatch by format flag. TruffleHog is a widely used secret scanner; its v3 JSON output is the canonical import target (IMP-01).
|
||||
Output: Importer interface, TruffleHogImporter implementation, unit test, JSON fixture.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/PROJECT.md
|
||||
@.planning/ROADMAP.md
|
||||
@.planning/STATE.md
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@pkg/engine/finding.go
|
||||
|
||||
<interfaces>
|
||||
From pkg/engine/finding.go:
|
||||
```go
type Finding struct {
	ProviderName   string
	KeyValue       string
	KeyMasked      string
	Confidence     string
	Source         string
	SourceType     string
	LineNumber     int
	Offset         int64
	DetectedAt     time.Time
	Verified       bool
	VerifyStatus   string
	VerifyHTTPCode int
	VerifyMetadata map[string]string
	VerifyError    string
}

func MaskKey(key string) string
```
|
||||
</interfaces>
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto" tdd="true">
|
||||
<name>Task 1: Importer interface + TruffleHog parser + fixtures</name>
|
||||
<files>pkg/importer/importer.go, pkg/importer/trufflehog.go, pkg/importer/testdata/trufflehog-sample.json</files>
|
||||
<behavior>
|
||||
- Importer interface declares: Import(r io.Reader) ([]engine.Finding, error) and Name() string
|
||||
- TruffleHogImporter parses a JSON array of objects with fields {SourceID, SourceName, SourceMetadata (object), DetectorName, DetectorType, Verified (bool), Raw (string), Redacted (string), ExtraData (object)}
|
||||
- Each record maps to engine.Finding:
    ProviderName = normalizeTruffleHogName(DetectorName) // e.g. "OpenAI" -> "openai", "GithubV2" -> "github", "AWS" -> "aws"
    KeyValue = Raw
    KeyMasked = engine.MaskKey(Raw)
    Confidence = "high" if Verified else "medium"
    SourceType = "import:trufflehog"
    Source = extractSourcePath(SourceMetadata): traverses SourceMetadata.Data.{Git,Filesystem,Github}.file/link/repository, falls back to SourceName
    LineNumber = extracted from SourceMetadata.Data.Git.line if present, else 0
    Verified = Verified
    VerifyStatus = "live" if Verified else "unverified"
    DetectedAt = time.Now()
|
||||
- normalizeTruffleHogName(): lowercase, strip trailing version digits ("GithubV2" -> "github"), map known aliases (AWS -> aws, GCP -> gcp). Unknown names: lowercased as-is.
|
||||
- Invalid JSON returns wrapped error
|
||||
- Empty array returns empty slice, nil error
|
||||
- Name() returns "trufflehog"
|
||||
</behavior>
|
||||
<action>
|
||||
Create pkg/importer/importer.go:
|
||||
```go
package importer

import (
	"io"

	"github.com/salvacybersec/keyhunter/pkg/engine"
)

// Importer parses output from an external secret scanner and returns
// normalized engine.Finding records. Implementations must be stateless
// and safe for reuse across calls.
type Importer interface {
	Name() string
	Import(r io.Reader) ([]engine.Finding, error)
}
```
|
||||
|
||||
Create pkg/importer/trufflehog.go:
|
||||
- Define `type TruffleHogImporter struct{}`
|
||||
- Define `type trufflehogRecord struct` with JSON tags matching TruffleHog v3: SourceID, SourceName, SourceMetadata (json.RawMessage), DetectorName, DetectorType int, Verified bool, Raw, Redacted string, ExtraData json.RawMessage.
|
||||
- Implement `Name() string` returning "trufflehog".
|
||||
- Implement `Import(r io.Reader) ([]engine.Finding, error)`:
|
||||
1. json.NewDecoder(r).Decode(&records) — if error, wrap: fmt.Errorf("decoding trufflehog json: %w", err)
|
||||
2. For each record: build engine.Finding per behavior spec. Skip records with an empty Raw silently; no skip count is returned.
|
||||
3. Return slice + nil.
|
||||
- Implement `normalizeTruffleHogName(detector string) string`:
|
||||
- lowercased := strings.ToLower(detector)
|
||||
- trim trailing "v\d+" via regexp (package-level var ``var tfhVersionSuffix = regexp.MustCompile(`v\d+$`)``)
|
||||
- apply alias map: {"gcp": "gcp", "aws": "aws", "openai": "openai", "anthropic": "anthropic", "huggingface": "huggingface"}
|
||||
- return trimmed
|
||||
- Implement `extractSourcePath(meta json.RawMessage) (path string, line int)`:
|
||||
- Unmarshal into `struct{ Data struct{ Git *struct{ File, Repository, Commit string; Line int } ; Filesystem *struct{ File string }; Github *struct{ File, Link, Repository string } } }`
|
||||
- Return first non-empty in priority: Git.File, Filesystem.File, Github.File, Github.Link, Git.Repository, Github.Repository. Line from Git.Line.
|
||||
- On unmarshal error: return "", 0 (not fatal).
|
||||
|
||||
Create pkg/importer/testdata/trufflehog-sample.json with a realistic fixture containing 3 records:
|
||||
- record 1: DetectorName "OpenAI", Verified true, Raw "sk-proj-abcdef1234567890abcdef", SourceMetadata.Data.Git.File "src/config.py", Line 42
|
||||
- record 2: DetectorName "AnthropicV2", Verified false, Raw "sk-ant-api03-xxxxxxxxxxxxxxxx", SourceMetadata.Data.Filesystem.File "/tmp/leaked.env"
|
||||
- record 3: DetectorName "AWS", Verified true, Raw "AKIAIOSFODNN7EXAMPLE", SourceMetadata.Data.Github.Link "https://github.com/foo/bar/blob/main/a.yml"
|
||||
|
||||
Create pkg/importer/trufflehog_test.go:
|
||||
- TestTruffleHogImporter_Import: open testdata, call Import, assert len==3, assert findings[0].ProviderName=="openai", Confidence=="high", Verified==true, Source=="src/config.py", LineNumber==42.
|
||||
- TestTruffleHogImporter_NormalizeName: table test — {"OpenAI","openai"}, {"GithubV2","github"}, {"AnthropicV2","anthropic"}, {"AWS","aws"}, {"UnknownDetector","unknowndetector"}.
|
||||
- TestTruffleHogImporter_EmptyArray: Import(strings.NewReader("[]")) returns empty slice, nil error.
|
||||
- TestTruffleHogImporter_InvalidJSON: Import(strings.NewReader("not json")) returns error.
|
||||
- TestTruffleHogImporter_Name: asserts "trufflehog".
|
||||
|
||||
All TruffleHog v3 field decisions per 07-CONTEXT.md decisions block.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go test ./pkg/importer/... -run TruffleHog -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- pkg/importer/importer.go declares Importer interface
|
||||
- pkg/importer/trufflehog.go implements Importer for TruffleHog v3 JSON
|
||||
- All 5 tests pass
|
||||
- go build ./pkg/importer/... succeeds
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
go test ./pkg/importer/... -v passes. go vet ./pkg/importer/... clean.
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
TruffleHog v3 JSON can be loaded from disk and converted to []engine.Finding with correct provider name normalization and verify status mapping.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-01-SUMMARY.md`.
|
||||
</output>
|
||||
147
.planning/phases/07-import-cicd/07-02-PLAN.md
Normal file
@@ -0,0 +1,147 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 02
|
||||
type: execute
|
||||
wave: 1
|
||||
depends_on: []
|
||||
files_modified:
|
||||
- pkg/importer/gitleaks.go
|
||||
- pkg/importer/gitleaks_test.go
|
||||
- pkg/importer/testdata/gitleaks-sample.json
|
||||
- pkg/importer/testdata/gitleaks-sample.csv
|
||||
autonomous: true
|
||||
requirements: [IMP-02]
|
||||
must_haves:
|
||||
truths:
|
||||
- "Gitleaks JSON output parses to []engine.Finding"
|
||||
- "Gitleaks CSV output parses to []engine.Finding"
|
||||
- "Gitleaks RuleID normalizes to KeyHunter lowercase provider names"
|
||||
artifacts:
|
||||
- path: pkg/importer/gitleaks.go
|
||||
provides: "GitleaksImporter (JSON) + GitleaksCSVImporter"
|
||||
contains: "func (GitleaksImporter) Import"
|
||||
- path: pkg/importer/testdata/gitleaks-sample.json
|
||||
provides: "Gitleaks JSON fixture"
|
||||
- path: pkg/importer/testdata/gitleaks-sample.csv
|
||||
provides: "Gitleaks CSV fixture"
|
||||
key_links:
|
||||
- from: pkg/importer/gitleaks.go
|
||||
to: pkg/engine/finding.go
|
||||
via: "constructs engine.Finding from Gitleaks records"
|
||||
pattern: "engine\\.Finding\\{"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Add Gitleaks adapters (JSON + CSV) to pkg/importer implementing the Importer interface from Plan 07-01.
|
||||
|
||||
Purpose: Gitleaks is the second major secret scanner; ingesting its output (both JSON and CSV flavors) lets users unify findings (IMP-02).
|
||||
Output: GitleaksImporter, GitleaksCSVImporter, tests, fixtures.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@pkg/engine/finding.go
|
||||
|
||||
<interfaces>
|
||||
Contract defined by Plan 07-01 in pkg/importer/importer.go:
|
||||
```go
type Importer interface {
	Name() string
	Import(r io.Reader) ([]engine.Finding, error)
}
```
|
||||
NOTE: If pkg/importer/importer.go does not yet exist at execution time (wave 1 plans run in parallel), this plan's executor MUST first create that file with the interface above. The TruffleHog plan (07-01) will reuse the same file.
|
||||
</interfaces>
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto" tdd="true">
|
||||
<name>Task 1: Gitleaks JSON + CSV parsers with fixtures</name>
|
||||
<files>pkg/importer/gitleaks.go, pkg/importer/gitleaks_test.go, pkg/importer/testdata/gitleaks-sample.json, pkg/importer/testdata/gitleaks-sample.csv</files>
|
||||
<behavior>
|
||||
- GitleaksImporter.Import parses a JSON array of records with fields: Description, StartLine, EndLine, StartColumn, EndColumn, Match, Secret, File, SymlinkFile, Commit, Entropy (float), Author, Email, Date, Message, Tags ([]string), RuleID, Fingerprint
|
||||
- Each JSON record maps to engine.Finding:
|
||||
ProviderName = normalizeGitleaksRuleID(RuleID) // "openai-api-key" -> "openai", "aws-access-token" -> "aws", "generic-api-key" -> "generic"
|
||||
KeyValue = Secret
|
||||
KeyMasked = engine.MaskKey(Secret)
|
||||
Confidence = "medium" (Gitleaks doesn't verify)
|
||||
Source = File (fallback SymlinkFile)
|
||||
SourceType = "import:gitleaks"
|
||||
LineNumber = StartLine
|
||||
DetectedAt = time.Now()
|
||||
Verified = false, VerifyStatus = "unverified"
|
||||
- GitleaksCSVImporter.Import reads CSV via encoding/csv. A header row is mandatory. The gitleaks default column order is RuleID,Commit,File,SymlinkFile,Secret,Match,StartLine,EndLine,StartColumn,EndColumn,Author,Message,Date,Email,Fingerprint,Tags, but the importer parses the header into a name-to-index map, so the column order does not matter; only the header names must match.
|
||||
- normalizeGitleaksRuleID lowercases the rule ID and trims the first matching suffix, checked in this order: "-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key". E.g., "openai-api-key" -> "openai", "anthropic-api-key" -> "anthropic", "aws-access-token" -> "aws", "github-pat" -> "github-pat" (no suffix match, kept as-is but lowercased).
|
||||
- Empty array / empty CSV (header only): returns empty slice nil error.
|
||||
- Malformed JSON or CSV: returns wrapped error.
|
||||
- Name() methods return "gitleaks" and "gitleaks-csv" respectively.
|
||||
</behavior>
|
||||
<action>
|
||||
If pkg/importer/importer.go does not exist yet (parallel execution with 07-01), create it first with the Importer interface (see <interfaces> above).
|
||||
|
||||
Create pkg/importer/gitleaks.go:
|
||||
- Package `importer`; imports: encoding/csv, encoding/json, fmt, io, strconv, strings, time, engine pkg.
|
||||
- Define `type GitleaksImporter struct{}` and `type GitleaksCSVImporter struct{}`.
|
||||
- Define `type gitleaksRecord struct` with JSON tags matching the Gitleaks schema above.
|
||||
- Implement `(GitleaksImporter) Name() string` -> "gitleaks"; Import decodes JSON array, loops building engine.Finding.
|
||||
- Implement `(GitleaksCSVImporter) Name() string` -> "gitleaks-csv"; Import uses csv.NewReader(r), reads header row, builds `map[string]int` of column index, then loops reading records. Parses StartLine via strconv.Atoi; swallows parse errors by setting LineNumber=0.
|
||||
- Implement `normalizeGitleaksRuleID(id string) string`:
|
||||
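```go
// normalizeGitleaksRuleID lowercases the rule ID and strips the first
// matching suffix. Longer suffixes come first so that "-secret-key" is
// not reduced to "-secret" by the shorter "-key" entry.
func normalizeGitleaksRuleID(id string) string {
	id = strings.ToLower(id)
	suffixes := []string{"-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key"}
	for _, s := range suffixes {
		if strings.HasSuffix(id, s) {
			return strings.TrimSuffix(id, s)
		}
	}
	return id
}
```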
|
||||
- Helper `buildGitleaksFinding(ruleID, secret, file, symlink string, startLine int) engine.Finding` shared between JSON and CSV paths:
|
||||
- source := file; if source == "" { source = symlink }
|
||||
- returns engine.Finding{ProviderName: normalizeGitleaksRuleID(ruleID), KeyValue: secret, KeyMasked: engine.MaskKey(secret), Confidence: "medium", Source: source, SourceType: "import:gitleaks", LineNumber: startLine, DetectedAt: time.Now(), VerifyStatus: "unverified"}
|
||||
|
||||
Create pkg/importer/testdata/gitleaks-sample.json — JSON array with 3 records covering:
|
||||
- {"RuleID":"openai-api-key","Secret":"sk-proj-1234567890abcdef1234","File":"config/app.yml","StartLine":12, ...}
|
||||
- {"RuleID":"aws-access-token","Secret":"AKIAIOSFODNN7EXAMPLE","File":"terraform/main.tf","StartLine":55, ...}
|
||||
- {"RuleID":"generic-api-key","Secret":"xoxp-abcdefghijklmnopqrstuvwxyz","File":"scripts/deploy.sh","StartLine":3, ...}
|
||||
|
||||
Create pkg/importer/testdata/gitleaks-sample.csv with header row and the same 3 rows (in Gitleaks default column order).
|
||||
|
||||
Create pkg/importer/gitleaks_test.go:
|
||||
- TestGitleaksImporter_JSON: loads fixture, expects 3 findings, findings[0].ProviderName=="openai", findings[1].ProviderName=="aws", Source/LineNumber correct.
|
||||
- TestGitleaksImporter_CSV: loads CSV fixture, same 3 findings, same assertions.
|
||||
- TestGitleaksImporter_NormalizeRuleID: table — {"openai-api-key","openai"}, {"aws-access-token","aws"}, {"anthropic-api-key","anthropic"}, {"generic-api-key","generic"}, {"github-pat","github-pat"}.
|
||||
- TestGitleaksImporter_EmptyArray, TestGitleaksImporter_EmptyCSV (header only).
|
||||
- TestGitleaksImporter_InvalidJSON returns error.
|
||||
- Name() assertions for both importers.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go test ./pkg/importer/... -run Gitleaks -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- GitleaksImporter + GitleaksCSVImporter implemented
|
||||
- JSON + CSV fixtures committed
|
||||
- All Gitleaks tests pass
|
||||
- go build ./pkg/importer/... succeeds
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
go test ./pkg/importer/... passes. Both JSON and CSV paths produce identical Finding slices from equivalent fixtures.
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
Gitleaks output (JSON and CSV) ingests into normalized engine.Finding records with correct provider name mapping and line number extraction.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-02-SUMMARY.md`.
|
||||
</output>
|
||||
190
.planning/phases/07-import-cicd/07-03-PLAN.md
Normal file
@@ -0,0 +1,190 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 03
|
||||
type: execute
|
||||
wave: 1
|
||||
depends_on: []
|
||||
files_modified:
|
||||
- pkg/importer/dedup.go
|
||||
- pkg/importer/dedup_test.go
|
||||
- pkg/output/sarif_github_test.go
|
||||
- testdata/sarif/sarif-2.1.0-minimal-schema.json
|
||||
autonomous: true
|
||||
requirements: [IMP-03, CICD-02]
|
||||
must_haves:
|
||||
truths:
|
||||
- "Duplicate findings (same provider + masked key + source + line number) are detected via stable hash"
|
||||
- "SARIF output from Phase 6 contains all GitHub-required fields for code scanning uploads"
|
||||
artifacts:
|
||||
- path: pkg/importer/dedup.go
|
||||
provides: "FindingKey hash + Dedup function"
|
||||
contains: "func FindingKey"
|
||||
- path: pkg/output/sarif_github_test.go
|
||||
provides: "GitHub code scanning SARIF validation test"
|
||||
contains: "TestSARIFGitHubValidation"
|
||||
key_links:
|
||||
- from: pkg/importer/dedup.go
|
||||
to: pkg/engine/finding.go
|
||||
via: "hashes engine.Finding fields"
|
||||
pattern: "engine\\.Finding"
|
||||
- from: pkg/output/sarif_github_test.go
|
||||
to: pkg/output/sarif.go
|
||||
via: "renders SARIFFormatter output and validates required fields"
|
||||
pattern: "SARIFFormatter"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Build two independent assets needed by Plan 07-04 and the GitHub integration story: (1) deduplication helper for imported findings (IMP-03), (2) a SARIF GitHub validation test that asserts Phase 6's SARIF output satisfies GitHub Code Scanning requirements (CICD-02).
|
||||
|
||||
Purpose: Imports will be re-run repeatedly; without dedup the database fills with copies. GitHub upload validation closes the loop on CICD-02 by proving SARIF output is acceptable without manual upload.
|
||||
Output: Dedup package function, dedup unit tests, SARIF validation test, minimal schema fixture.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@pkg/engine/finding.go
|
||||
@pkg/output/sarif.go
|
||||
|
||||
<interfaces>
|
||||
From pkg/output/sarif.go:
|
||||
```go
|
||||
type SARIFFormatter struct{}
|
||||
func (SARIFFormatter) Format(findings []engine.Finding, w io.Writer, opts Options) error
|
||||
```
|
||||
From pkg/engine/finding.go: engine.Finding with ProviderName, KeyMasked, Source, LineNumber.
|
||||
</interfaces>
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto" tdd="true">
|
||||
<name>Task 1: Dedup helper for imported findings</name>
|
||||
<files>pkg/importer/dedup.go, pkg/importer/dedup_test.go</files>
|
||||
<behavior>
|
||||
- FindingKey(f engine.Finding) string returns hex-encoded SHA-256 over "provider\x00masked\x00source\x00line".
|
||||
- Dedup(in []engine.Finding) (unique []engine.Finding, duplicates int): preserves first-seen order, drops subsequent matches of the same FindingKey, returns count of dropped.
|
||||
- Two findings with same provider+masked+source+line are duplicates regardless of other fields (DetectedAt, Confidence).
|
||||
- Different source paths or different line numbers are NOT duplicates.
|
||||
</behavior>
|
||||
<action>
|
||||
Create pkg/importer/dedup.go:
|
||||
```go
package importer

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"

	"github.com/salvacybersec/keyhunter/pkg/engine"
)

// FindingKey returns a stable identity hash for a finding based on the
// provider name, masked key, source path, and line number. This is the
// dedup identity used by import pipelines so the same underlying secret
// is not inserted twice when re-importing the same scanner output.
func FindingKey(f engine.Finding) string {
	payload := fmt.Sprintf("%s\x00%s\x00%s\x00%d", f.ProviderName, f.KeyMasked, f.Source, f.LineNumber)
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// Dedup removes duplicate findings from in-memory slices before insert.
// Order of first-seen findings is preserved. Returns the deduplicated
// slice and the number of duplicates dropped.
func Dedup(in []engine.Finding) ([]engine.Finding, int) {
	seen := make(map[string]struct{}, len(in))
	out := make([]engine.Finding, 0, len(in))
	dropped := 0
	for _, f := range in {
		k := FindingKey(f)
		if _, ok := seen[k]; ok {
			dropped++
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out, dropped
}
```
|
||||
|
||||
Create pkg/importer/dedup_test.go with tests:
|
||||
- TestFindingKey_Stable: same finding twice -> identical key.
|
||||
- TestFindingKey_DiffersByProvider / ByMasked / BySource / ByLine.
|
||||
- TestDedup_PreservesOrder: input [A, B, A, C, B] -> output [A, B, C], dropped=2.
|
||||
- TestDedup_Empty: nil slice -> empty slice, 0 dropped.
|
||||
- TestDedup_IgnoresUnrelatedFields: two findings identical except DetectedAt and Confidence -> one kept.
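The properties these tests target can be exercised in isolation with a sketch like the one below. `finding`, `findingKey`, and `dedup` are hypothetical lowercase stand-ins that copy the logic of FindingKey and Dedup so the example compiles without the engine package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// finding is a hypothetical stand-in for engine.Finding.
type finding struct {
	ProviderName, KeyMasked, Source, Confidence string
	LineNumber                                  int
}

// findingKey hashes the identity fields, separated by NUL bytes so that
// adjacent fields cannot run together ("a"+"bc" versus "ab"+"c").
func findingKey(f finding) string {
	payload := fmt.Sprintf("%s\x00%s\x00%s\x00%d", f.ProviderName, f.KeyMasked, f.Source, f.LineNumber)
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// dedup keeps the first occurrence of each key and counts the rest.
func dedup(in []finding) ([]finding, int) {
	seen := make(map[string]struct{}, len(in))
	out := make([]finding, 0, len(in))
	dropped := 0
	for _, f := range in {
		k := findingKey(f)
		if _, ok := seen[k]; ok {
			dropped++
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out, dropped
}

func main() {
	a := finding{ProviderName: "openai", KeyMasked: "sk-p...", Source: "a.py", LineNumber: 1}
	b := finding{ProviderName: "aws", KeyMasked: "AKIA...", Source: "b.tf", LineNumber: 2}
	c := finding{ProviderName: "openai", KeyMasked: "sk-p...", Source: "a.py", LineNumber: 9}
	unique, dropped := dedup([]finding{a, b, a, c, b})
	fmt.Println(len(unique), dropped)
}
```

Because Confidence is not part of the hashed payload, two records that differ only in that field collapse into one, which is the behavior TestDedup_IgnoresUnrelatedFields is meant to pin down.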
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go test ./pkg/importer/... -run 'Dedup|FindingKey' -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- FindingKey + Dedup implemented
|
||||
- All FindingKey and Dedup tests listed above pass
|
||||
</done>
|
||||
</task>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 2: SARIF GitHub code scanning validation test</name>
|
||||
<files>pkg/output/sarif_github_test.go, testdata/sarif/sarif-2.1.0-minimal-schema.json</files>
|
||||
<action>
|
||||
Create testdata/sarif/sarif-2.1.0-minimal-schema.json — a minimal JSON document listing GitHub's required SARIF fields for code scanning upload. Not the full schema (would be 500KB); the required-fields subset documented at https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning. Content:
|
||||
```json
{
  "required_top_level": ["$schema", "version", "runs"],
  "required_run": ["tool", "results"],
  "required_tool_driver": ["name", "version"],
  "required_result": ["ruleId", "level", "message", "locations"],
  "required_location_physical": ["artifactLocation", "region"],
  "required_region": ["startLine"],
  "allowed_levels": ["error", "warning", "note", "none"]
}
```
|
||||
|
||||
Create pkg/output/sarif_github_test.go (package `output`):
|
||||
- TestSARIFGitHubValidation:
|
||||
1. Build a []engine.Finding of 3 findings spanning high/medium/low confidence with realistic values (ProviderName, KeyValue, KeyMasked, Source, LineNumber).
|
||||
2. Render via SARIFFormatter.Format into a bytes.Buffer with Options{ToolName: "keyhunter", ToolVersion: "test"}.
|
||||
3. json.Unmarshal into map[string]any.
|
||||
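4. Load testdata/sarif/sarif-2.1.0-minimal-schema.json via os.ReadFile. `go test` runs with the package directory as the working directory, so from pkg/output the relative path is ../../testdata/sarif/sarif-2.1.0-minimal-schema.json.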
|
||||
5. Assert every key in required_top_level exists at root.
|
||||
6. Assert doc["version"] == "2.1.0".
|
||||
7. Assert doc["$schema"] is a non-empty string starting with "https://".
|
||||
8. runs := doc["runs"].([]any); require len(runs) == 1.
|
||||
9. For the single run, assert tool.driver.name == "keyhunter", version non-empty, results is a slice.
|
||||
10. For each result: assert ruleId non-empty string, level in allowed_levels, message.text non-empty, locations is non-empty slice.
|
||||
11. For each location: assert physicalLocation.artifactLocation.uri non-empty and physicalLocation.region.startLine >= 1.
|
||||
12. Assert startLine is always >= 1 even when input LineNumber is 0 (test one finding with LineNumber: 0 and confirm startLine in output == 1 — matches Phase 6 floor behavior).
|
||||
- TestSARIFGitHubValidation_EmptyFindings: empty findings slice still produces a valid document with runs[0].results == [] (not null), tool.driver present.
|
||||
|
||||
Use standard library only (bytes, encoding/json, os, path/filepath, strings, testing). No schema validation library.
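The required-field walk in the steps above could be sketched with plain `map[string]any` navigation. This is a hedged illustration, not the test itself: `dig` and `validateSARIF` are hypothetical helpers, the field lists are hard-coded rather than loaded from the fixture, and the message and artifactLocation checks are omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// allowedLevels mirrors the "allowed_levels" list in the fixture.
var allowedLevels = map[string]bool{"error": true, "warning": true, "note": true, "none": true}

// dig walks nested JSON objects and returns nil if any step is missing.
func dig(v any, path ...string) any {
	for _, key := range path {
		obj, ok := v.(map[string]any)
		if !ok {
			return nil
		}
		v = obj[key]
	}
	return v
}

// validateSARIF checks a subset of the required fields described above.
func validateSARIF(raw []byte) error {
	var doc map[string]any
	if err := json.Unmarshal(raw, &doc); err != nil {
		return fmt.Errorf("decoding sarif: %w", err)
	}
	for _, key := range []string{"$schema", "version", "runs"} {
		if _, ok := doc[key]; !ok {
			return fmt.Errorf("missing top-level key %q", key)
		}
	}
	if doc["version"] != "2.1.0" {
		return fmt.Errorf("unexpected version %v", doc["version"])
	}
	runs, ok := doc["runs"].([]any)
	if !ok || len(runs) != 1 {
		return fmt.Errorf("expected exactly one run")
	}
	if name, _ := dig(runs[0], "tool", "driver", "name").(string); name == "" {
		return fmt.Errorf("tool.driver.name is empty")
	}
	results, ok := dig(runs[0], "results").([]any)
	if !ok {
		return fmt.Errorf("results must be an array, not null")
	}
	for i, r := range results {
		if id, _ := dig(r, "ruleId").(string); id == "" {
			return fmt.Errorf("result %d: empty ruleId", i)
		}
		if level, _ := dig(r, "level").(string); !allowedLevels[level] {
			return fmt.Errorf("result %d: level %q not allowed", i, level)
		}
		locs, _ := dig(r, "locations").([]any)
		if len(locs) == 0 {
			return fmt.Errorf("result %d: no locations", i)
		}
		for _, loc := range locs {
			// encoding/json decodes every JSON number into float64.
			line, ok := dig(loc, "physicalLocation", "region", "startLine").(float64)
			if !ok || line < 1 {
				return fmt.Errorf("result %d: startLine must be >= 1", i)
			}
		}
	}
	return nil
}

func main() {
	doc := `{"$schema":"https://example.invalid/sarif.json","version":"2.1.0","runs":[{"tool":{"driver":{"name":"keyhunter","version":"test"}},"results":[]}]}`
	fmt.Println(validateSARIF([]byte(doc)))
}
```

Using comma-ok type assertions throughout means a missing or mistyped field produces a descriptive error instead of a panic, which keeps test failures readable.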
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go test ./pkg/output/... -run SARIFGitHub -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- testdata/sarif/sarif-2.1.0-minimal-schema.json committed
|
||||
- pkg/output/sarif_github_test.go passes
|
||||
- SARIFFormatter output provably satisfies GitHub Code Scanning required fields
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
go test ./pkg/importer/... ./pkg/output/... passes.
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
Dedup helper usable by the import command (07-04). SARIF output validated against GitHub's required-field surface with no external dependencies, proving CICD-02 end-to-end.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-03-SUMMARY.md`.
|
||||
</output>
|
||||
263
.planning/phases/07-import-cicd/07-04-PLAN.md
Normal file
@@ -0,0 +1,263 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 04
|
||||
type: execute
|
||||
wave: 2
|
||||
depends_on: ["07-01", "07-02", "07-03"]
|
||||
files_modified:
|
||||
- cmd/import.go
|
||||
- cmd/stubs.go
|
||||
- cmd/import_test.go
|
||||
autonomous: true
|
||||
requirements: [IMP-01, IMP-02, IMP-03]
|
||||
must_haves:
|
||||
truths:
|
||||
- "keyhunter import --format=trufflehog <file> inserts findings into the SQLite database"
|
||||
- "keyhunter import --format=gitleaks <file> inserts findings"
|
||||
- "keyhunter import --format=gitleaks-csv <file> inserts findings"
|
||||
- "Duplicate findings across repeated imports are skipped with reported count"
|
||||
- "Summary 'Imported N findings (M new, K duplicates)' is printed to stdout"
|
||||
artifacts:
|
||||
- path: cmd/import.go
|
||||
provides: "keyhunter import command implementation"
|
||||
contains: "var importCmd"
|
||||
key_links:
|
||||
- from: cmd/import.go
|
||||
to: pkg/importer
|
||||
via: "dispatches by format flag to Importer implementations"
|
||||
pattern: "importer\\.(TruffleHog|Gitleaks|GitleaksCSV)Importer"
|
||||
- from: cmd/import.go
|
||||
to: pkg/storage
|
||||
via: "calls db.SaveFinding for each deduped record"
|
||||
pattern: "SaveFinding"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Replace the cmd/import stub with a fully wired command that parses external scanner output (via pkg/importer), deduplicates, and persists findings to the KeyHunter SQLite database.
|
||||
|
||||
Purpose: Delivers IMP-01/02/03 end-to-end from CLI. Users can consolidate TruffleHog and Gitleaks scans into the unified KeyHunter database.
|
||||
Output: Working `keyhunter import` command with tests.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@cmd/stubs.go
|
||||
@cmd/root.go
|
||||
@pkg/storage/findings.go
|
||||
|
||||
<interfaces>
|
||||
From pkg/importer (Plans 07-01, 07-02, 07-03):
|
||||
```go
type Importer interface {
	Name() string
	Import(r io.Reader) ([]engine.Finding, error)
}

type TruffleHogImporter struct{}
type GitleaksImporter struct{}
type GitleaksCSVImporter struct{}

func FindingKey(f engine.Finding) string
func Dedup(in []engine.Finding) (unique []engine.Finding, duplicates int)
```
|
||||
From pkg/storage/findings.go:
|
||||
```go
func (db *DB) SaveFinding(f storage.Finding, encKey []byte) (int64, error)
```
|
||||
storage.Finding fields: ProviderName, KeyValue, KeyMasked, Confidence, SourcePath, SourceType, LineNumber, Verified, VerifyStatus, VerifyHTTPCode, VerifyMetadata, ScanID.
|
||||
Note field name difference: storage uses SourcePath; engine uses Source. Conversion required.
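The conversion mentioned in the note might look like the sketch below. Both struct types are hypothetical stand-ins limited to the fields named above; ScanID is left out because its type is not shown here, and the real engineToStorage helper may differ.

```go
package main

import "fmt"

// engineFinding is a hypothetical stand-in for engine.Finding.
type engineFinding struct {
	ProviderName, KeyValue, KeyMasked, Confidence string
	Source, SourceType                            string
	LineNumber                                    int
	Verified                                      bool
	VerifyStatus                                  string
	VerifyHTTPCode                                int
	VerifyMetadata                                map[string]string
}

// storageFinding is a hypothetical stand-in for storage.Finding.
type storageFinding struct {
	ProviderName, KeyValue, KeyMasked, Confidence string
	SourcePath, SourceType                        string
	LineNumber                                    int
	Verified                                      bool
	VerifyStatus                                  string
	VerifyHTTPCode                                int
	VerifyMetadata                                map[string]string
}

// engineToStorage copies fields one to one; the only rename is
// Source (engine) to SourcePath (storage).
func engineToStorage(f engineFinding) storageFinding {
	return storageFinding{
		ProviderName:   f.ProviderName,
		KeyValue:       f.KeyValue,
		KeyMasked:      f.KeyMasked,
		Confidence:     f.Confidence,
		SourcePath:     f.Source,
		SourceType:     f.SourceType,
		LineNumber:     f.LineNumber,
		Verified:       f.Verified,
		VerifyStatus:   f.VerifyStatus,
		VerifyHTTPCode: f.VerifyHTTPCode,
		VerifyMetadata: f.VerifyMetadata,
	}
}

func main() {
	sf := engineToStorage(engineFinding{ProviderName: "openai", Source: "src/config.py", LineNumber: 42})
	fmt.Println(sf.SourcePath, sf.LineNumber)
}
```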
|
||||
</interfaces>
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 1: Implement cmd/import.go with format dispatch and dedup</name>
|
||||
<files>cmd/import.go, cmd/stubs.go, cmd/import_test.go</files>
|
||||
<action>
|
||||
Remove the `importCmd` stub from cmd/stubs.go (delete the `var importCmd = &cobra.Command{...}` block). Leave all other stubs intact.
|
||||
|
||||
Create cmd/import.go:
|
||||
```go
package cmd

import (
	"fmt"
	"io"
	"os"
	"time"

	"github.com/spf13/cobra"

	"github.com/salvacybersec/keyhunter/pkg/engine"
	"github.com/salvacybersec/keyhunter/pkg/importer"
	"github.com/salvacybersec/keyhunter/pkg/storage"
)

var (
	importFormat string
)

var importCmd = &cobra.Command{
	Use:   "import <file>",
	Short: "Import findings from TruffleHog or Gitleaks output",
	Long:  `Import scan output from external secret scanners into the KeyHunter database. Supported formats: trufflehog (v3 JSON), gitleaks (JSON), gitleaks-csv.`,
	Args:  cobra.ExactArgs(1),
	RunE:  runImport,
}

func init() {
	importCmd.Flags().StringVar(&importFormat, "format", "", "input format: trufflehog | gitleaks | gitleaks-csv (required)")
	_ = importCmd.MarkFlagRequired("format")
}

func runImport(cmd *cobra.Command, args []string) error {
	path := args[0]
	imp, err := selectImporter(importFormat)
	if err != nil {
		return err
	}

	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("opening %s: %w", path, err)
	}
	defer f.Close()

	findings, err := imp.Import(f)
	if err != nil {
		return fmt.Errorf("parsing %s output: %w", imp.Name(), err)
	}

	unique, dupes := importer.Dedup(findings)

	db, encKey, err := openDBForImport()
	if err != nil {
		return err
	}
	defer db.Close()

	newCount := 0
	dbDupes := 0
	for _, finding := range unique {
		sf := engineToStorage(finding)
		// Defense against cross-import duplicates already in DB:
		exists, err := findingExistsInDB(db, finding)
		if err != nil {
			return err
		}
		if exists {
			dbDupes++
			continue
		}
		if _, err := db.SaveFinding(sf, encKey); err != nil {
			return fmt.Errorf("saving finding: %w", err)
		}
		newCount++
	}

	totalDupes := dupes + dbDupes
	fmt.Fprintf(cmd.OutOrStdout(), "Imported %d findings (%d new, %d duplicates)\n", len(findings), newCount, totalDupes)
	return nil
}

func selectImporter(format string) (importer.Importer, error) {
	switch format {
	case "trufflehog":
		return importer.TruffleHogImporter{}, nil
	case "gitleaks":
|
||||
return importer.GitleaksImporter{}, nil
|
||||
case "gitleaks-csv":
|
||||
return importer.GitleaksCSVImporter{}, nil
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown format %q (want trufflehog | gitleaks | gitleaks-csv)", format)
|
||||
}
|
||||
}
|
||||
|
||||
func engineToStorage(f engine.Finding) storage.Finding {
|
||||
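// Note: f is passed by value and the storage.Finding literal below copies no
// timestamp, so this default has no effect unless a timestamp field is mapped.
if f.DetectedAt.IsZero() {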
|
||||
f.DetectedAt = time.Now()
|
||||
}
|
||||
return storage.Finding{
|
||||
ProviderName: f.ProviderName,
|
||||
KeyValue: f.KeyValue,
|
||||
KeyMasked: f.KeyMasked,
|
||||
Confidence: f.Confidence,
|
||||
SourcePath: f.Source,
|
||||
SourceType: f.SourceType,
|
||||
LineNumber: f.LineNumber,
|
||||
Verified: f.Verified,
|
||||
VerifyStatus: f.VerifyStatus,
|
||||
VerifyHTTPCode: f.VerifyHTTPCode,
|
||||
VerifyMetadata: f.VerifyMetadata,
|
||||
}
|
||||
}
|
||||
|
||||
// openDBForImport opens the configured DB using the same helpers as scan/keys.
|
||||
// Reuse whatever helper already exists in cmd/ (e.g., openDBWithKey from keys.go).
|
||||
// If no shared helper exists, extract one from cmd/scan.go.
|
||||
func openDBForImport() (*storage.DB, []byte, error) {
|
||||
// TODO-executor: reuse existing DB-open helper from cmd/scan.go or cmd/keys.go.
|
||||
// Do NOT duplicate encryption key derivation — call into the existing helper.
|
||||
return nil, nil, fmt.Errorf("not yet wired")
|
||||
}
|
||||
|
||||
// findingExistsInDB checks if a finding with the same provider + masked key + source + line
|
||||
// already exists. Uses importer.FindingKey-style logic via a DB query against findings table.
|
||||
func findingExistsInDB(db *storage.DB, f engine.Finding) (bool, error) {
|
||||
// Executor: add a storage helper or use db.SQL() with:
|
||||
// SELECT 1 FROM findings WHERE provider_name=? AND key_masked=? AND source_path=? AND line_number=? LIMIT 1
|
||||
return false, nil
|
||||
}
|
||||
```
|
||||
|
||||
CRITICAL executor notes:
|
||||
1. Inspect cmd/scan.go and cmd/keys.go to find the existing DB-open + passphrase helper (e.g., `openDBWithPassphrase` or similar). Use that helper — do not reimplement encryption key derivation. Replace the `openDBForImport` body accordingly.
|
||||
2. Inspect pkg/storage for an existing "find by key" helper. If none exists, add a thin method `func (db *DB) FindingExistsByKey(provider, masked, sourcePath string, line int) (bool, error)` to pkg/storage/queries.go that runs the SELECT above, call it from `findingExistsInDB`, and add a test in pkg/storage (simple in-memory round trip).
|
||||
3. Register importCmd: it's already added in cmd/root.go via `rootCmd.AddCommand(importCmd)`. Since you removed the stub, your new `var importCmd` declaration takes over the identifier — no root.go change needed.
|
||||
|
||||
Create cmd/import_test.go:
|
||||
- TestSelectImporter: table — {"trufflehog", TruffleHogImporter}, {"gitleaks", GitleaksImporter}, {"gitleaks-csv", GitleaksCSVImporter}, {"bogus", error}.
|
||||
- TestEngineToStorage: converts engine.Finding (with Source="a.yml", LineNumber=5, Verified=true) to storage.Finding (SourcePath="a.yml", LineNumber=5, Verified=true).
|
||||
- TestRunImport_EndToEnd (integration-style):
|
||||
* Create a temp DB via existing test helpers (look for one in cmd/*_test.go or pkg/storage/*_test.go).
|
||||
* Write a tiny TruffleHog JSON file to a temp path.
|
||||
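* Set the args on the root command with `rootCmd.SetArgs([]string{"import", "--format=trufflehog", tmpPath})`, capture output with `rootCmd.SetOut(&buf)`, and call `rootCmd.Execute()`. Calling `importCmd.Execute()` delegates to the root command anyway, so the leading "import" argument is only meaningful there.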
|
||||
* Assert stdout contains "Imported" and "new".
|
||||
* Assert db.ListFindings returns at least 1 finding with ProviderName set.
|
||||
* Re-run the same command → assert output reports "0 new" and dupe count equals prior insert count.
|
||||
* If a shared test DB helper is not discoverable, mark this subtest with t.Skip("needs shared test DB helper") but still ship TestSelectImporter and TestEngineToStorage.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go build ./... && go test ./cmd/... -run Import -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- cmd/import.go replaces the stub; stub removed from cmd/stubs.go
|
||||
- `keyhunter import --format=trufflehog sample.json` inserts findings
|
||||
- Re-running the same import reports all as duplicates
|
||||
- Unit tests pass; build succeeds
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
Manual smoke test:
|
||||
```
|
||||
go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
|
||||
# Expect: "Imported 3 findings (3 new, 0 duplicates)"
|
||||
go run ./cmd/keyhunter import --format=trufflehog pkg/importer/testdata/trufflehog-sample.json
|
||||
# Expect: "Imported 3 findings (0 new, 3 duplicates)"
|
||||
```
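The expected counts above follow from the two dedup stages in `runImport`. A minimal sketch of that counting, assuming a hypothetical in-memory `store` in place of the findings table and plain string keys in place of `importer.FindingKey` output (both are illustrations, not the real storage API); for brevity the in-batch and DB-side checks are collapsed into one lookup:

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the findings table.
type store map[string]bool

// importOnce mimics the counting in runImport: a key that is already known,
// whether from earlier in the batch or from a previous import, is a duplicate.
func importOnce(db store, keys []string) string {
	newCount, dupes := 0, 0
	for _, k := range keys {
		if db[k] {
			dupes++
			continue
		}
		db[k] = true
		newCount++
	}
	return fmt.Sprintf("Imported %d findings (%d new, %d duplicates)", len(keys), newCount, dupes)
}

func main() {
	db := store{}
	batch := []string{"k1", "k2", "k3"}
	fmt.Println(importOnce(db, batch)) // Imported 3 findings (3 new, 0 duplicates)
	fmt.Println(importOnce(db, batch)) // Imported 3 findings (0 new, 3 duplicates)
}
```

Note that the first number is the count of parsed findings, not the count written, which is why it stays at 3 on the second run.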
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
IMP-01, IMP-02, IMP-03 delivered end-to-end: external scanner output can be imported, deduped, and persisted; repeat imports are idempotent.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-04-SUMMARY.md`.
|
||||
</output>
|
||||
240
.planning/phases/07-import-cicd/07-05-PLAN.md
Normal file
@@ -0,0 +1,240 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 05
|
||||
type: execute
|
||||
wave: 2
|
||||
depends_on: ["07-04"]
|
||||
files_modified:
|
||||
- cmd/hook.go
|
||||
- cmd/stubs.go
|
||||
- cmd/hook_script.sh
|
||||
- cmd/hook_test.go
|
||||
autonomous: true
|
||||
requirements: [CICD-01]
|
||||
must_haves:
|
||||
truths:
|
||||
- "keyhunter hook install writes an executable .git/hooks/pre-commit"
|
||||
- "The installed hook calls keyhunter scan on staged files and propagates the exit code"
|
||||
- "keyhunter hook uninstall removes a KeyHunter-owned hook and refuses to remove a non-KeyHunter hook unless --force is given; install --force backs up an existing non-KeyHunter hook before overwriting"
|
||||
- "Both commands error cleanly when run outside a git repository"
|
||||
artifacts:
|
||||
- path: cmd/hook.go
|
||||
provides: "keyhunter hook install/uninstall implementation"
|
||||
contains: "var hookCmd"
|
||||
- path: cmd/hook_script.sh
|
||||
provides: "embedded pre-commit shell script"
|
||||
contains: "keyhunter scan"
|
||||
key_links:
|
||||
- from: cmd/hook.go
|
||||
to: cmd/hook_script.sh
|
||||
via: "go:embed compile-time bundling"
|
||||
pattern: "//go:embed hook_script.sh"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Replace the cmd/hook stub with working install/uninstall logic. The install subcommand writes a pre-commit script (embedded via go:embed) that invokes `keyhunter scan` on staged files and exits with scan's exit code.
|
||||
|
||||
Purpose: CICD-01 — git pre-commit integration prevents leaked keys from being committed. First line of defense for developer workflows.
|
||||
Output: Working `keyhunter hook install` / `keyhunter hook uninstall` subcommands, embedded script, tests.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@cmd/stubs.go
|
||||
@cmd/root.go
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 1: cmd/hook.go with install/uninstall subcommands + embedded script</name>
|
||||
<files>cmd/hook.go, cmd/stubs.go, cmd/hook_script.sh, cmd/hook_test.go</files>
|
||||
<action>
|
||||
Remove the `hookCmd` stub block from cmd/stubs.go. Keep all other stubs.
|
||||
|
||||
Create cmd/hook_script.sh (exact contents below — trailing newline important):
|
||||
```sh
|
||||
#!/usr/bin/env bash
|
||||
# KEYHUNTER-HOOK v1 — managed by `keyhunter hook install`
|
||||
# Remove via `keyhunter hook uninstall`.
|
||||
set -e
|
||||
|
||||
files=$(git diff --cached --name-only --diff-filter=ACMR)
|
||||
if [ -z "$files" ]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Run keyhunter against each staged file. Exit code 1 from keyhunter
|
||||
# means findings present; 2 means scan error. Either blocks the commit.
|
||||
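# `set -e` would abort on a failing pipeline before the message below is
# printed, so capture the status explicitly. Note that GNU xargs exits 123
# when any invocation exits 1-125, so $status may differ from keyhunter's code.
status=0
echo "$files" | xargs -r keyhunter scan --exit-code || status=$?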
|
||||
if [ $status -ne 0 ]; then
|
||||
echo "keyhunter: pre-commit blocked (exit $status). Run 'git commit --no-verify' to bypass." >&2
|
||||
exit $status
|
||||
fi
|
||||
exit 0
|
||||
```
|
||||
|
||||
Create cmd/hook.go:
|
||||
```go
|
||||
package cmd
|
||||
|
||||
import (
|
||||
_ "embed"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/spf13/cobra"
|
||||
)
|
||||
|
||||
//go:embed hook_script.sh
|
||||
var hookScript string
|
||||
|
||||
// hookMarker identifies a KeyHunter-managed hook. Uninstall refuses to
|
||||
// delete pre-commit files that don't contain this marker unless --force.
|
||||
const hookMarker = "KEYHUNTER-HOOK v1"
|
||||
|
||||
var (
|
||||
hookForce bool
|
||||
)
|
||||
|
||||
var hookCmd = &cobra.Command{
|
||||
Use: "hook",
|
||||
Short: "Install or manage git pre-commit hooks",
|
||||
}
|
||||
|
||||
var hookInstallCmd = &cobra.Command{
|
||||
Use: "install",
|
||||
Short: "Install the keyhunter pre-commit hook into .git/hooks/",
|
||||
RunE: runHookInstall,
|
||||
}
|
||||
|
||||
var hookUninstallCmd = &cobra.Command{
|
||||
Use: "uninstall",
|
||||
Short: "Remove the keyhunter pre-commit hook",
|
||||
RunE: runHookUninstall,
|
||||
}
|
||||
|
||||
func init() {
|
||||
hookInstallCmd.Flags().BoolVar(&hookForce, "force", false, "overwrite any existing pre-commit hook without prompt")
|
||||
hookUninstallCmd.Flags().BoolVar(&hookForce, "force", false, "delete pre-commit even if it is not KeyHunter-managed")
|
||||
hookCmd.AddCommand(hookInstallCmd)
|
||||
hookCmd.AddCommand(hookUninstallCmd)
|
||||
}
|
||||
|
||||
func hookPath() (string, error) {
|
||||
gitDir := ".git"
|
||||
info, err := os.Stat(gitDir)
|
||||
if err != nil || !info.IsDir() {
|
||||
return "", fmt.Errorf("not a git repository (no .git/ in current directory)")
|
||||
}
|
||||
return filepath.Join(gitDir, "hooks", "pre-commit"), nil
|
||||
}
|
||||
|
||||
func runHookInstall(cmd *cobra.Command, args []string) error {
|
||||
target, err := hookPath()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if _, err := os.Stat(target); err == nil {
|
||||
existing, _ := os.ReadFile(target)
|
||||
if strings.Contains(string(existing), hookMarker) {
|
||||
// Already ours — overwrite silently to update script.
|
||||
} else if !hookForce {
|
||||
return fmt.Errorf("pre-commit hook already exists at %s (use --force to overwrite; a .bak backup will be kept)", target)
|
||||
} else {
|
||||
backup := target + ".bak." + time.Now().Format("20060102150405")
|
||||
if err := os.Rename(target, backup); err != nil {
|
||||
return fmt.Errorf("backing up existing hook: %w", err)
|
||||
}
|
||||
fmt.Fprintf(cmd.OutOrStdout(), "Backed up existing hook to %s\n", backup)
|
||||
}
|
||||
}
|
||||
if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
|
||||
return fmt.Errorf("creating hooks dir: %w", err)
|
||||
}
|
||||
if err := os.WriteFile(target, []byte(hookScript), 0o755); err != nil {
|
||||
return fmt.Errorf("writing hook: %w", err)
|
||||
}
|
||||
fmt.Fprintf(cmd.OutOrStdout(), "Installed pre-commit hook at %s\n", target)
|
||||
return nil
|
||||
}
|
||||
|
||||
func runHookUninstall(cmd *cobra.Command, args []string) error {
|
||||
target, err := hookPath()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
data, err := os.ReadFile(target)
|
||||
if err != nil {
|
||||
if os.IsNotExist(err) {
|
||||
fmt.Fprintln(cmd.OutOrStdout(), "No pre-commit hook to remove.")
|
||||
return nil
|
||||
}
|
||||
return fmt.Errorf("reading hook: %w", err)
|
||||
}
|
||||
if !strings.Contains(string(data), hookMarker) && !hookForce {
|
||||
return fmt.Errorf("pre-commit at %s is not KeyHunter-managed (use --force to remove anyway)", target)
|
||||
}
|
||||
if err := os.Remove(target); err != nil {
|
||||
return fmt.Errorf("removing hook: %w", err)
|
||||
}
|
||||
fmt.Fprintf(cmd.OutOrStdout(), "Removed pre-commit hook at %s\n", target)
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
hookCmd is already registered in cmd/root.go via `rootCmd.AddCommand(hookCmd)`. Removing the stub declaration lets this new `var hookCmd` take over the same identifier. Verify cmd/root.go still compiles.
|
||||
|
||||
Create cmd/hook_test.go:
|
||||
- Each test uses t.TempDir(), chdirs into it (t.Chdir(tmp) in Go 1.24+, else os.Chdir with cleanup), creates .git/ subdirectory.
|
||||
- TestHookInstall_FreshRepo: run install, assert .git/hooks/pre-commit exists, is 0o755 executable, and contains the hookMarker string.
|
||||
- TestHookInstall_NotAGitRepo: no .git/ dir → install returns error containing "not a git repository".
|
||||
- TestHookInstall_ExistingNonKeyhunterRefuses: pre-create pre-commit with "# my hook"; install without --force returns error; file unchanged.
|
||||
- TestHookInstall_ForceBackupsExisting: same as above with --force=true; assert original moved to *.bak.*, new hook installed.
|
||||
- TestHookInstall_ExistingKeyhunterOverwrites: pre-create pre-commit containing the marker; install succeeds without --force, file updated.
|
||||
- TestHookUninstall_RemovesKeyhunter: install then uninstall → file gone.
|
||||
- TestHookUninstall_RefusesForeign: pre-create foreign pre-commit; uninstall without --force errors; file unchanged.
|
||||
- TestHookUninstall_Force: same with --force → file removed.
|
||||
- TestHookUninstall_Missing: no pre-commit → succeeds with "No pre-commit hook to remove." output.
|
||||
- TestHookScript_ContainsRequired: the embedded hookScript variable contains "keyhunter scan" and "git diff --cached".
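The install cases listed above reduce to a small decision table. A minimal sketch, assuming a hypothetical pure helper `decideInstall` (not part of the planned cmd/hook.go, which inlines this logic in `runHookInstall`); factoring it out this way is one option for testing the branches without touching the filesystem:

```go
package main

import (
	"fmt"
	"strings"
)

const hookMarker = "KEYHUNTER-HOOK v1"

type installAction string

const (
	actionWrite  installAction = "write"
	actionRefuse installAction = "refuse"
	actionBackup installAction = "backup-then-write"
)

// decideInstall is a hypothetical helper mirroring the branches in
// runHookInstall: no hook or our own hook is overwritten, a foreign hook is
// refused unless force is set, in which case it is backed up first.
func decideInstall(exists bool, content string, force bool) installAction {
	switch {
	case !exists:
		return actionWrite
	case strings.Contains(content, hookMarker):
		return actionWrite
	case force:
		return actionBackup
	default:
		return actionRefuse
	}
}

func main() {
	cases := []struct {
		name    string
		exists  bool
		content string
		force   bool
	}{
		{"fresh repo", false, "", false},
		{"foreign hook", true, "# my hook", false},
		{"foreign hook, --force", true, "# my hook", true},
		{"keyhunter hook", true, "# " + hookMarker, false},
	}
	for _, c := range cases {
		fmt.Printf("%-22s -> %s\n", c.name, decideInstall(c.exists, c.content, c.force))
	}
}
```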
|
||||
</action>
|
||||
<verify>
|
||||
<automated>cd /home/salva/Documents/apikey && go build ./... && go test ./cmd/... -run Hook -v</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- cmd/hook.go implements install/uninstall
|
||||
- hook_script.sh embedded via go:embed
|
||||
- All hook tests pass
|
||||
- Stub removed from cmd/stubs.go
|
||||
- go build succeeds
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
In a scratch git repo:
|
||||
```
|
||||
keyhunter hook install # creates .git/hooks/pre-commit
|
||||
cat .git/hooks/pre-commit # shows embedded script
|
||||
git add file-with-key.txt && git commit -m test # should block
|
||||
keyhunter hook uninstall # removes hook
|
||||
```
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
CICD-01 delivered. Hook lifecycle (install → trigger on commit → uninstall) works on a real git repo; tests cover edge cases (non-repo, existing hook, force flag, missing marker).
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-05-SUMMARY.md`.
|
||||
</output>
|
||||
164
.planning/phases/07-import-cicd/07-06-PLAN.md
Normal file
@@ -0,0 +1,164 @@
|
||||
---
|
||||
phase: 07-import-cicd
|
||||
plan: 06
|
||||
type: execute
|
||||
wave: 2
|
||||
depends_on: ["07-04", "07-05"]
|
||||
files_modified:
|
||||
- docs/CI-CD.md
|
||||
- README.md
|
||||
autonomous: true
|
||||
requirements: [CICD-01, CICD-02]
|
||||
must_haves:
|
||||
truths:
|
||||
- "Users have a documented GitHub Actions workflow example that runs keyhunter and uploads SARIF"
|
||||
- "Pre-commit hook setup is documented with install/uninstall commands"
|
||||
- "README references the new CI/CD document"
|
||||
artifacts:
|
||||
- path: docs/CI-CD.md
|
||||
provides: "CI/CD integration guide (GitHub Actions + pre-commit hook)"
|
||||
contains: "github/codeql-action/upload-sarif"
|
||||
- path: README.md
|
||||
provides: "Top-level project README (updated to link CI/CD guide)"
|
||||
key_links:
|
||||
- from: README.md
|
||||
to: docs/CI-CD.md
|
||||
via: "markdown link"
|
||||
pattern: "docs/CI-CD\\.md"
|
||||
---
|
||||
|
||||
<objective>
|
||||
Document the Phase 7 deliverables: import command usage, pre-commit hook lifecycle, and GitHub Actions workflow for SARIF upload.
|
||||
|
||||
Purpose: CICD-01 and CICD-02 require the integration to be discoverable by users. Code alone is not enough — a working workflow example and hook setup walkthrough are part of the requirement.
|
||||
Output: docs/CI-CD.md, README section linking to it.
|
||||
</objective>
|
||||
|
||||
<execution_context>
|
||||
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
|
||||
@$HOME/.claude/get-shit-done/templates/summary.md
|
||||
</execution_context>
|
||||
|
||||
<context>
|
||||
@.planning/phases/07-import-cicd/07-CONTEXT.md
|
||||
@README.md
|
||||
</context>
|
||||
|
||||
<tasks>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 1: Write docs/CI-CD.md with GitHub Actions + pre-commit guide</name>
|
||||
<files>docs/CI-CD.md</files>
|
||||
<action>
|
||||
Create docs/CI-CD.md with the following sections (markdown):
|
||||
|
||||
1. **Title & intro** — "KeyHunter CI/CD Integration" — one paragraph explaining scope: pre-commit hooks, GitHub Actions SARIF upload, importing external scanner output.
|
||||
|
||||
2. **Pre-commit Hook** section:
|
||||
- Install: `keyhunter hook install` (explain what file is written, where).
|
||||
- Override: `--force` flag backs up existing pre-commit as `pre-commit.bak.<timestamp>`.
|
||||
- Bypass a single commit: `git commit --no-verify`.
|
||||
- Uninstall: `keyhunter hook uninstall`.
|
||||
- Note: only scans staged files via `git diff --cached --name-only --diff-filter=ACMR`.
|
||||
|
||||
3. **GitHub Actions (SARIF upload to Code Scanning)** section, with a full working workflow example saved as a fenced yaml block:
|
||||
```yaml
|
||||
name: KeyHunter
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
pull_request:
|
||||
jobs:
|
||||
scan:
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: read
|
||||
security-events: write
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
with:
|
||||
fetch-depth: 0
|
||||
- name: Install KeyHunter
|
||||
run: |
|
||||
curl -sSL https://github.com/salvacybersec/keyhunter/releases/latest/download/keyhunter_linux_amd64.tar.gz | tar -xz
|
||||
sudo mv keyhunter /usr/local/bin/
|
||||
- name: Scan repository
|
||||
run: keyhunter scan . --output sarif > keyhunter.sarif
|
||||
continue-on-error: true
|
||||
- name: Upload SARIF to GitHub Code Scanning
|
||||
uses: github/codeql-action/upload-sarif@v3
|
||||
with:
|
||||
sarif_file: keyhunter.sarif
|
||||
category: keyhunter
|
||||
```
|
||||
- Explain `continue-on-error: true` — scan exits 1 on findings; we want the SARIF upload step to still run. The findings show up in the Security tab.
|
||||
- Explain the required `security-events: write` permission.
|
||||
|
||||
4. **Importing External Scanner Output** section:
|
||||
- Running TruffleHog then importing:
|
||||
```
|
||||
trufflehog filesystem . --json > trufflehog.json
|
||||
keyhunter import --format=trufflehog trufflehog.json
|
||||
```
|
||||
- Gitleaks JSON:
|
||||
```
|
||||
gitleaks detect -f json -r gitleaks.json
|
||||
keyhunter import --format=gitleaks gitleaks.json
|
||||
```
|
||||
- Gitleaks CSV:
|
||||
```
|
||||
gitleaks detect -f csv -r gitleaks.csv
|
||||
keyhunter import --format=gitleaks-csv gitleaks.csv
|
||||
```
|
||||
- Dedup guarantee: re-running the same import is idempotent.
|
||||
|
||||
5. **Exit Codes** section — table of 0/1/2 semantics for CI integration.
|
||||
|
||||
Keep the whole file under ~200 lines. No emojis.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>test -f docs/CI-CD.md && grep -q "upload-sarif" docs/CI-CD.md && grep -q "keyhunter hook install" docs/CI-CD.md && grep -q "keyhunter import --format=trufflehog" docs/CI-CD.md</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- docs/CI-CD.md exists with all 5 sections
|
||||
- Required strings present (upload-sarif, hook install, import --format=trufflehog)
|
||||
</done>
|
||||
</task>
|
||||
|
||||
<task type="auto">
|
||||
<name>Task 2: Update README.md with CI/CD integration link</name>
|
||||
<files>README.md</files>
|
||||
<action>
|
||||
Read current README.md first.
|
||||
|
||||
Add (or update if a stub section exists) a "CI/CD Integration" H2 section that:
|
||||
- Contains 2-4 sentences summarizing pre-commit hook + GitHub SARIF upload support.
|
||||
- Links to `docs/CI-CD.md` for the full guide.
|
||||
- Mentions `keyhunter import` for TruffleHog/Gitleaks consolidation.
|
||||
|
||||
Place the section after any existing "Installation" / "Usage" section and before "Development" or "License" sections. If those anchors don't exist, append near the end but before "License".
|
||||
|
||||
Do not rewrite unrelated parts of the README.
|
||||
</action>
|
||||
<verify>
|
||||
<automated>grep -q "docs/CI-CD.md" README.md && grep -q "CI/CD" README.md</automated>
|
||||
</verify>
|
||||
<done>
|
||||
- README.md references docs/CI-CD.md
|
||||
- CI/CD Integration section exists
|
||||
</done>
|
||||
</task>
|
||||
|
||||
</tasks>
|
||||
|
||||
<verification>
|
||||
grep -q "upload-sarif" docs/CI-CD.md && grep -q "docs/CI-CD.md" README.md
|
||||
</verification>
|
||||
|
||||
<success_criteria>
|
||||
CICD-01 and CICD-02 are discoverable end-to-end: a user landing on the README can find CI/CD guidance, follow it to docs/CI-CD.md, and copy a working GitHub Actions workflow + pre-commit setup.
|
||||
</success_criteria>
|
||||
|
||||
<output>
|
||||
After completion, create `.planning/phases/07-import-cicd/07-06-SUMMARY.md`.
|
||||
</output>
|
||||