2026-04-05 23:53:14 +03:00

---
phase: 07-import-cicd
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/importer/gitleaks.go
- pkg/importer/gitleaks_test.go
- pkg/importer/testdata/gitleaks-sample.json
- pkg/importer/testdata/gitleaks-sample.csv
autonomous: true
requirements: [IMP-02]
must_haves:
  truths:
    - "Gitleaks JSON output parses to []engine.Finding"
    - "Gitleaks CSV output parses to []engine.Finding"
    - "Gitleaks RuleID normalizes to KeyHunter lowercase provider names"
  artifacts:
    - path: pkg/importer/gitleaks.go
      provides: "GitleaksImporter (JSON) + GitleaksCSVImporter"
      contains: "func (GitleaksImporter) Import"
    - path: pkg/importer/testdata/gitleaks-sample.json
      provides: "Gitleaks JSON fixture"
    - path: pkg/importer/testdata/gitleaks-sample.csv
      provides: "Gitleaks CSV fixture"
  key_links:
    - from: pkg/importer/gitleaks.go
      to: pkg/engine/finding.go
      via: "constructs engine.Finding from Gitleaks records"
      pattern: "engine\\.Finding\\{"
---
<objective>
Add Gitleaks adapters (JSON + CSV) to pkg/importer implementing the Importer interface from Plan 07-01.
Purpose: Gitleaks is the second major secret scanner; ingesting its output (both JSON and CSV flavors) lets users unify findings (IMP-02).
Output: GitleaksImporter, GitleaksCSVImporter, tests, fixtures.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/phases/07-import-cicd/07-CONTEXT.md
@pkg/engine/finding.go
<interfaces>
Contract defined by Plan 07-01 in pkg/importer/importer.go:
```go
type Importer interface {
	Name() string
	Import(r io.Reader) ([]engine.Finding, error)
}
```
NOTE: If pkg/importer/importer.go does not yet exist at execution time (wave 1 plans run in parallel), this plan's executor MUST first create that file with the interface above. The TruffleHog plan (07-01) will reuse the same file.
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Gitleaks JSON + CSV parsers with fixtures</name>
<files>pkg/importer/gitleaks.go, pkg/importer/gitleaks_test.go, pkg/importer/testdata/gitleaks-sample.json, pkg/importer/testdata/gitleaks-sample.csv</files>
<behavior>
- GitleaksImporter.Import parses a JSON array of records with fields: Description, StartLine, EndLine, StartColumn, EndColumn, Match, Secret, File, SymlinkFile, Commit, Entropy (float), Author, Email, Date, Message, Tags ([]string), RuleID, Fingerprint
- Each JSON record maps to engine.Finding:
    ProviderName = normalizeGitleaksRuleID(RuleID) // "openai-api-key" -> "openai", "aws-access-token" -> "aws", "generic-api-key" -> "generic"
    KeyValue = Secret
    KeyMasked = engine.MaskKey(Secret)
    Confidence = "medium" (Gitleaks doesn't verify)
    Source = File (fallback SymlinkFile)
    SourceType = "import:gitleaks"
    LineNumber = StartLine
    DetectedAt = time.Now()
    Verified = false, VerifyStatus = "unverified"
- GitleaksCSVImporter.Import reads CSV via encoding/csv. The header row is mandatory; the Gitleaks default column order is: RuleID,Commit,File,SymlinkFile,Secret,Match,StartLine,EndLine,StartColumn,EndColumn,Author,Message,Date,Email,Fingerprint,Tags. Parse the header into a name-to-index map so the importer does not depend on column order; header names must match exactly.
- normalizeGitleaksRuleID lowercases the ID, then trims the first matching suffix, checked in this order: "-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key". E.g., "openai-api-key" -> "openai", "anthropic-api-key" -> "anthropic", "aws-access-token" -> "aws", "github-pat" -> "github-pat" (no suffix match, so only lowercased).
- Empty array / empty CSV (header only): returns an empty slice and a nil error.
- Malformed JSON or CSV: returns wrapped error.
- Name() methods return "gitleaks" and "gitleaks-csv" respectively.
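
The JSON mapping above can be sketched as a standalone program. This is only an illustration under stated assumptions: `Finding` and `maskKey` are hypothetical stand-ins for `engine.Finding` and `engine.MaskKey` (the real field set and masking rule may differ), and `importGitleaksJSON` and `importFromString` are illustrative names rather than the planned method.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"time"
)

// Finding is a hypothetical stand-in for engine.Finding with only the fields this plan names.
type Finding struct {
	ProviderName, KeyValue, KeyMasked string
	Confidence, Source, SourceType    string
	LineNumber                        int
	DetectedAt                        time.Time
	Verified                          bool
	VerifyStatus                      string
}

// gitleaksRecord holds only the JSON fields the mapping reads.
type gitleaksRecord struct {
	RuleID      string `json:"RuleID"`
	Secret      string `json:"Secret"`
	File        string `json:"File"`
	SymlinkFile string `json:"SymlinkFile"`
	StartLine   int    `json:"StartLine"`
}

// maskKey is a placeholder for engine.MaskKey (assumption: real masking differs).
func maskKey(s string) string {
	if len(s) <= 8 {
		return "****"
	}
	return s[:4] + "..." + s[len(s)-4:]
}

// normalizeRuleID mirrors normalizeGitleaksRuleID as described in this plan.
func normalizeRuleID(id string) string {
	id = strings.ToLower(id)
	for _, s := range []string{"-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key"} {
		if strings.HasSuffix(id, s) {
			return strings.TrimSuffix(id, s)
		}
	}
	return id
}

// importGitleaksJSON decodes a JSON array and maps each record to a Finding.
func importGitleaksJSON(r io.Reader) ([]Finding, error) {
	var records []gitleaksRecord
	if err := json.NewDecoder(r).Decode(&records); err != nil {
		return nil, fmt.Errorf("gitleaks: decode json: %w", err)
	}
	findings := make([]Finding, 0, len(records))
	for _, rec := range records {
		source := rec.File
		if source == "" {
			source = rec.SymlinkFile // fall back when File is empty
		}
		findings = append(findings, Finding{
			ProviderName: normalizeRuleID(rec.RuleID),
			KeyValue:     rec.Secret,
			KeyMasked:    maskKey(rec.Secret),
			Confidence:   "medium", // Gitleaks does not verify secrets
			Source:       source,
			SourceType:   "import:gitleaks",
			LineNumber:   rec.StartLine,
			DetectedAt:   time.Now(),
			VerifyStatus: "unverified",
		})
	}
	return findings, nil
}

// importFromString is a convenience wrapper for quick experiments.
func importFromString(s string) ([]Finding, error) {
	return importGitleaksJSON(strings.NewReader(s))
}

func main() {
	in := `[{"RuleID":"openai-api-key","Secret":"sk-proj-1234567890abcdef1234","File":"config/app.yml","StartLine":12}]`
	findings, err := importFromString(in)
	if err != nil {
		panic(err)
	}
	for _, f := range findings {
		fmt.Println(f.ProviderName, f.Source, f.LineNumber)
	}
}
```

Allocating the result with `make` keeps the empty-array case returning an empty, non-nil slice, which matches the "empty slice and nil error" behavior listed above.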
</behavior>
<action>
If pkg/importer/importer.go does not exist yet (parallel execution with 07-01), create it first with the Importer interface (see <interfaces> above).
Create pkg/importer/gitleaks.go:
- Package `importer`; imports: encoding/csv, encoding/json, fmt, io, strconv, strings, time, engine pkg.
- Define `type GitleaksImporter struct{}` and `type GitleaksCSVImporter struct{}`.
- Define `type gitleaksRecord struct` with JSON tags matching the Gitleaks schema above.
- Implement `(GitleaksImporter) Name() string` -> "gitleaks"; Import decodes JSON array, loops building engine.Finding.
- Implement `(GitleaksCSVImporter) Name() string` -> "gitleaks-csv"; Import uses csv.NewReader(r), reads header row, builds `map[string]int` of column index, then loops reading records. Parses StartLine via strconv.Atoi; swallows parse errors by setting LineNumber=0.
- Implement `normalizeGitleaksRuleID(id string) string`:
```go
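// normalizeGitleaksRuleID lowercases a Gitleaks rule ID and strips the first
// matching suffix. Longer suffixes come first so "-secret-key" wins over "-key".
func normalizeGitleaksRuleID(id string) string {
	id = strings.ToLower(id)
	suffixes := []string{"-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key"}
	for _, s := range suffixes {
		if strings.HasSuffix(id, s) {
			return strings.TrimSuffix(id, s)
		}
	}
	return id
}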
```
- Helper `buildGitleaksFinding(ruleID, secret, file, symlink string, startLine int) engine.Finding` shared between JSON and CSV paths:
  - source := file; if source == "" { source = symlink }
  - returns engine.Finding{ProviderName: normalizeGitleaksRuleID(ruleID), KeyValue: secret, KeyMasked: engine.MaskKey(secret), Confidence: "medium", Source: source, SourceType: "import:gitleaks", LineNumber: startLine, DetectedAt: time.Now(), VerifyStatus: "unverified"}
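
The CSV path (header row to index map, then lookup by column name) might look roughly like the sketch below. Several things here are assumptions rather than part of the plan: the `csvRow` intermediate type, the names `readGitleaksCSV` and `parseCSVString`, and the choice to treat RuleID, Secret, File, and StartLine as the required columns. In the real importer each row would be passed to buildGitleaksFinding instead of being returned directly.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// csvRow is a hypothetical intermediate holding the columns the plan maps.
type csvRow struct {
	RuleID, Secret, File, SymlinkFile string
	StartLine                         int
}

// readGitleaksCSV reads the header into a name-to-index map, then pulls the
// columns of interest out of each record by name rather than by position.
func readGitleaksCSV(r io.Reader) ([]csvRow, error) {
	cr := csv.NewReader(r)
	header, err := cr.Read()
	if err != nil {
		// io.EOF here means the mandatory header row is missing.
		return nil, fmt.Errorf("gitleaks-csv: read header: %w", err)
	}
	idx := make(map[string]int, len(header))
	for i, name := range header {
		idx[strings.TrimSpace(name)] = i
	}
	// Assumption: these four columns are the minimum needed for a Finding.
	for _, col := range []string{"RuleID", "Secret", "File", "StartLine"} {
		if _, ok := idx[col]; !ok {
			return nil, fmt.Errorf("gitleaks-csv: missing column %q", col)
		}
	}
	// get returns "" for optional columns that are absent from the header.
	get := func(rec []string, name string) string {
		if i, ok := idx[name]; ok && i < len(rec) {
			return rec[i]
		}
		return ""
	}
	rows := []csvRow{}
	for {
		rec, err := cr.Read()
		if errors.Is(err, io.EOF) {
			break
		}
		if err != nil {
			return nil, fmt.Errorf("gitleaks-csv: read record: %w", err)
		}
		line, convErr := strconv.Atoi(get(rec, "StartLine"))
		if convErr != nil {
			line = 0 // swallow parse errors, as the plan specifies
		}
		rows = append(rows, csvRow{
			RuleID:      get(rec, "RuleID"),
			Secret:      get(rec, "Secret"),
			File:        get(rec, "File"),
			SymlinkFile: get(rec, "SymlinkFile"),
			StartLine:   line,
		})
	}
	return rows, nil
}

// parseCSVString is a convenience wrapper for quick experiments.
func parseCSVString(s string) ([]csvRow, error) {
	return readGitleaksCSV(strings.NewReader(s))
}

func main() {
	// Columns deliberately appear out of the Gitleaks default order.
	in := "File,StartLine,Secret,RuleID\nconfig/app.yml,12,sk-proj-1234567890abcdef1234,openai-api-key\n"
	rows, err := parseCSVString(in)
	if err != nil {
		panic(err)
	}
	for _, row := range rows {
		fmt.Printf("%+v\n", row)
	}
}
```

By default, `csv.Reader` rejects records whose field count differs from the header, so rows with the wrong number of fields come back as wrapped errors without extra checks.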
Create pkg/importer/testdata/gitleaks-sample.json — JSON array with 3 records covering:
- {"RuleID":"openai-api-key","Secret":"sk-proj-1234567890abcdef1234","File":"config/app.yml","StartLine":12, ...}
- {"RuleID":"aws-access-token","Secret":"AKIAIOSFODNN7EXAMPLE","File":"terraform/main.tf","StartLine":55, ...}
- {"RuleID":"generic-api-key","Secret":"xoxp-abcdefghijklmnopqrstuvwxyz","File":"scripts/deploy.sh","StartLine":3, ...}
Create pkg/importer/testdata/gitleaks-sample.csv with header row and the same 3 rows (in Gitleaks default column order).
Create pkg/importer/gitleaks_test.go:
- TestGitleaksImporter_JSON: loads fixture, expects 3 findings, findings[0].ProviderName=="openai", findings[1].ProviderName=="aws", Source/LineNumber correct.
- TestGitleaksImporter_CSV: loads CSV fixture, same 3 findings, same assertions.
- TestGitleaksImporter_NormalizeRuleID: table — {"openai-api-key","openai"}, {"aws-access-token","aws"}, {"anthropic-api-key","anthropic"}, {"generic-api-key","generic"}, {"github-pat","github-pat"}.
- TestGitleaksImporter_EmptyArray, TestGitleaksImporter_EmptyCSV (header only).
- TestGitleaksImporter_InvalidJSON returns error.
- Name() assertions for both importers.
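
As a rough illustration of the normalization table, the cases listed for TestGitleaksImporter_NormalizeRuleID can be checked with a plain loop. The sketch is written as a standalone program so it runs on its own; in the repository the same table would presumably sit in a `testing.T`-based test in gitleaks_test.go. The `ruleIDCases` and `failedCases` names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeGitleaksRuleID follows the definition given in this plan.
func normalizeGitleaksRuleID(id string) string {
	id = strings.ToLower(id)
	suffixes := []string{"-api-key", "-access-token", "-secret-key", "-secret", "-token", "-key"}
	for _, s := range suffixes {
		if strings.HasSuffix(id, s) {
			return strings.TrimSuffix(id, s)
		}
	}
	return id
}

// ruleIDCases mirrors the table listed for TestGitleaksImporter_NormalizeRuleID.
var ruleIDCases = []struct{ in, want string }{
	{"openai-api-key", "openai"},
	{"aws-access-token", "aws"},
	{"anthropic-api-key", "anthropic"},
	{"generic-api-key", "generic"},
	{"github-pat", "github-pat"},
}

// failedCases returns a description of every case whose output differs from want.
func failedCases() []string {
	var failures []string
	for _, c := range ruleIDCases {
		if got := normalizeGitleaksRuleID(c.in); got != c.want {
			failures = append(failures, fmt.Sprintf("%q: got %q, want %q", c.in, got, c.want))
		}
	}
	return failures
}

func main() {
	if f := failedCases(); len(f) > 0 {
		fmt.Println(strings.Join(f, "\n"))
		return
	}
	fmt.Println("all rule ID cases pass")
}
```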
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/importer/... -run Gitleaks -v</automated>
</verify>
<done>
- GitleaksImporter + GitleaksCSVImporter implemented
- JSON + CSV fixtures committed
- All Gitleaks tests pass
- go build ./pkg/importer/... succeeds
</done>
</task>
</tasks>
<verification>
go test ./pkg/importer/... passes. Both JSON and CSV paths produce equivalent Finding slices from equivalent fixtures (DetectedAt is set at import time, so it is excluded from the comparison).
</verification>
<success_criteria>
Gitleaks output (JSON and CSV) ingests into normalized engine.Finding records with correct provider name mapping and line number extraction.
</success_criteria>
<output>
After completion, create `.planning/phases/07-import-cicd/07-02-SUMMARY.md`.
</output>