keyhunter/.planning/phases/06-output-reporting/06-03-SUMMARY.md
2026-04-05 23:32:37 +03:00

phase: 06-output-reporting
plan: 03
subsystem: pkg/output
tags: output, formatter, sarif, ci-cd, json-schema
requirements: OUT-03

dependency-graph
  requires:
  • output.Formatter interface (06-01)
  • output.Options struct (06-01)
  • output.Register registry (06-01)
  • engine.Finding
  provides:
  • output.SARIFFormatter
  • SARIF 2.1.0 document structs (sarifDoc, sarifRun, sarifRule, sarifResult, ...)
  • Registry entry "sarif"
  affects:
  • cmd/scan.go (downstream: --output=sarif selection)
  • Phase 7 CICD-02 (SARIF upload to GitHub code scanning)

tech-stack
  added: none
  patterns:
  • Hand-rolled schema structs with json struct tags (no SARIF library per CLAUDE.md)
  • init()-registered formatter, same pattern as TableFormatter / JSONFormatter
  • Deterministic rule dedup: first-seen order over the findings slice
  • Confidence -> level mapping via pure switch function (sarifLevel)

key-files
  created:
  • pkg/output/sarif.go
  • pkg/output/sarif_test.go
  modified: none

decisions
  • Used the json.schemastore.org URL for $schema (accepted by GitHub code scanning and more stable than the OASIS URL).
  • Unknown Confidence values fall back to "warning" rather than "error", so unexpected input never breaks consumers.
  • startLine is floored to 1 per the SARIF 2.1.0 spec, so findings from stdin/URL sources with LineNumber=0 still produce valid documents.
  • Rules are deduped by ProviderName in first-seen order to keep output deterministic without sorting (preserves finding order for humans reading the file).
  • Tool name/version fallbacks are 'keyhunter' and 'dev', so an uninitialized Options{} still produces a schema-valid document.

metrics
  duration: ~6m
  completed: 2026-04-05
  tasks: 1
  commits: 2

Phase 06 Plan 03: SARIF 2.1.0 Formatter Summary

Implemented output.SARIFFormatter, a hand-rolled SARIF 2.1.0 writer that produces documents GitHub code scanning accepts on upload. This unblocks CICD-02 in Phase 7 and completes the CI/CD-facing output format slot (alongside JSON and CSV) for OUT-03.

What Was Built

1. SARIF document structs (pkg/output/sarif.go)

A minimal but schema-valid subset of SARIF 2.1.0 modeled as Go structs with json tags:

  • sarifDoc: top level, with $schema, version, runs[]
  • sarifRun: tool, results[]
  • sarifTool / sarifDriver: name, version, rules[]
  • sarifRule: id, name, shortDescription.text
  • sarifResult: ruleId, level, message.text, locations[]
  • sarifLocation / sarifPhysicalLocation / sarifArtifactLocation / sarifRegion
  • sarifText: shared {text} wrapper

No SARIF library dependency was added — CLAUDE.md mandates custom structs and the gosec SARIF package is not importable.

2. SARIFFormatter.Format behavior

  • Fallback tool identity: "keyhunter" / "dev" when Options.ToolName / ToolVersion are empty.
  • Rules: deduped by ProviderName in first-seen order. rule.id == rule.name == providerName, shortDescription.text == "Leaked <provider> API key".
  • Results: one per finding. ruleId = providerName, level via sarifLevel(confidence), message.text = "Detected <provider> key (<confidence>): <key>" where <key> is KeyMasked by default and KeyValue iff opts.Unmask.
  • Locations: one physicalLocation with artifactLocation.uri = f.Source and region.startLine = max(1, f.LineNumber).
  • Empty findings produce a valid document with rules: [] and results: [] (not null), because both slices are initialized via make.
  • Output is indented JSON (enc.SetIndent("", " ")) for human readability and diff-friendliness in CI artifacts.
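
The three rules in this list that carry logic (first-seen dedup, the startLine floor, and masked versus unmasked message text) can be sketched as small pure functions. The finding type below is a stand-in that copies only the engine.Finding field names mentioned in this summary, and the helper names uniqueProviders, startLine, and message are hypothetical; the real Format method may organize this differently.

```go
package main

import "fmt"

// finding is a stand-in for engine.Finding with only the fields this summary
// names; the real struct may have other fields or types.
type finding struct {
	ProviderName string
	Confidence   string
	KeyMasked    string
	KeyValue     string
	Source       string
	LineNumber   int
}

// uniqueProviders returns provider names in the order they first appear,
// so rule order is deterministic without sorting.
func uniqueProviders(findings []finding) []string {
	seen := make(map[string]struct{}, len(findings))
	names := make([]string, 0, len(findings))
	for _, f := range findings {
		if _, ok := seen[f.ProviderName]; ok {
			continue
		}
		seen[f.ProviderName] = struct{}{}
		names = append(names, f.ProviderName)
	}
	return names
}

// startLine floors a line number to 1, the smallest value used here for
// region.startLine.
func startLine(n int) int {
	if n < 1 {
		return 1
	}
	return n
}

// message builds the result text, using the masked key unless unmask is set.
func message(f finding, unmask bool) string {
	key := f.KeyMasked
	if unmask {
		key = f.KeyValue
	}
	return fmt.Sprintf("Detected %s key (%s): %s", f.ProviderName, f.Confidence, key)
}

func main() {
	// Sample data is invented for illustration only.
	findings := []finding{
		{ProviderName: "openai", Confidence: "high", KeyMasked: "sk-a...1234", KeyValue: "sk-abcd1234", Source: "a.env", LineNumber: 3},
		{ProviderName: "anthropic", Confidence: "low", KeyMasked: "sk-b...5678", KeyValue: "sk-bcde5678", Source: "stdin", LineNumber: 0},
		{ProviderName: "openai", Confidence: "medium", KeyMasked: "sk-c...9012", KeyValue: "sk-cdef9012", Source: "b.env", LineNumber: 7},
	}
	fmt.Println(uniqueProviders(findings))
	for _, f := range findings {
		fmt.Println(f.Source, startLine(f.LineNumber), message(f, false))
	}
}
```

Three findings from two providers yield two rules and three results here, matching the behavior TestSARIF_DedupRules is described as checking.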

3. sarifLevel confidence mapping

high   -> error
medium -> warning
low    -> note
*      -> warning   (safe default for unknown values)
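
A minimal sketch of this mapping as a pure switch function follows. It assumes the confidence value is a plain lowercase string; the real Confidence type in engine.Finding is not shown in this summary and may differ.

```go
package main

import "fmt"

// sarifLevel maps a confidence label to a SARIF result level.
// Assumption: confidence arrives as a lowercase string.
func sarifLevel(confidence string) string {
	switch confidence {
	case "high":
		return "error"
	case "medium":
		return "warning"
	case "low":
		return "note"
	default:
		// Unknown values fall back to "warning" rather than failing.
		return "warning"
	}
}

func main() {
	for _, c := range []string{"high", "medium", "low", "unexpected"} {
		fmt.Printf("%s -> %s\n", c, sarifLevel(c))
	}
}
```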

4. Registration

init() { Register("sarif", SARIFFormatter{}) } — discoverable via output.Get("sarif") and listed in output.Names(), matching the pattern used by TableFormatter and JSONFormatter.
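
The init()-registration pattern can be sketched with a toy registry. Everything below is a simplified stand-in: the real output.Formatter interface, output.Options, and the exact signatures of Register, Get, and Names come from plan 06-01 and are not shown in this summary, so the one-method interface and the (Formatter, bool) return of Get are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sort"
)

// Formatter is a simplified stand-in for output.Formatter (assumed shape).
type Formatter interface {
	Format(w io.Writer) error
}

// registry maps format names to formatters. Package-level variables are
// initialized before any init() runs, so Register is safe to call from init.
var registry = map[string]Formatter{}

// Register adds a formatter under the given name.
func Register(name string, f Formatter) { registry[name] = f }

// Get looks up a formatter by name (assumed signature).
func Get(name string) (Formatter, bool) {
	f, ok := registry[name]
	return f, ok
}

// Names lists registered format names in sorted order.
func Names() []string {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// SARIFFormatter here only writes a placeholder, not a real SARIF document.
type SARIFFormatter struct{}

func (SARIFFormatter) Format(w io.Writer) error {
	_, err := io.WriteString(w, "{}\n")
	return err
}

// init registers the formatter when the package is loaded.
func init() { Register("sarif", SARIFFormatter{}) }

func main() {
	fmt.Println(Names())
	if f, ok := Get("sarif"); ok {
		if err := f.Format(os.Stdout); err != nil {
			panic(err)
		}
	}
}
```

The appeal of this pattern is that a caller such as cmd/scan.go only needs the format name; adding a formatter does not require touching the selection code.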

Tests (pkg/output/sarif_test.go)

All seven tests pass.

| Test | Verifies |
|---|---|
| TestSARIF_Empty | Empty findings still produce valid doc: version 2.1.0, 1 run, 0 results, 0 rules |
| TestSARIF_DedupRules | Duplicate providers collapse to one rule; 3 findings still produce 3 results |
| TestSARIF_LevelMapping | high/medium/low/unknown -> error/warning/note/warning |
| TestSARIF_LineFloor | LineNumber 0 and negative values floor to 1; positive values pass through |
| TestSARIF_Masking | Default output uses KeyMasked; Unmask=true reveals KeyValue |
| TestSARIF_ToolVersionFallback | Empty Options fall back to "keyhunter"/"dev"; explicit values are honored |
| TestSARIF_RegisteredInRegistry | output.Get("sarif") returns a SARIFFormatter |

Tests use json.Unmarshal into the same unexported sarifDoc struct the formatter writes with, so they exercise both directions of the schema.

Verification

$ go test ./pkg/output/... -run "TestSARIF" -count=1
=== RUN   TestSARIF_Empty            --- PASS
=== RUN   TestSARIF_DedupRules       --- PASS
=== RUN   TestSARIF_LevelMapping     --- PASS
=== RUN   TestSARIF_LineFloor        --- PASS
=== RUN   TestSARIF_Masking          --- PASS
=== RUN   TestSARIF_ToolVersionFallback --- PASS
=== RUN   TestSARIF_RegisteredInRegistry --- PASS
PASS

$ go test ./pkg/output/... -count=1
ok  	github.com/salvacybersec/keyhunter/pkg/output

$ go build ./...
(no output — success)

Commits

| Hash | Type | Message |
|---|---|---|
| 2cb35d5 | test | test(06-03): add failing tests for SARIF 2.1.0 formatter |
| 2717aa3 | feat | feat(06-03): implement SARIF 2.1.0 formatter with hand-rolled structs |

Deviations from Plan

None — plan executed exactly as written. The <action> block in the plan included a complete sketch of sarif.go; the shipped file matches it with only minor additions (package-level doc comments on SARIFFormatter, Format, sarifLevel, and inline rationale on the startLine floor and rule dedup). These are documentation-only and do not alter behavior.

Known Stubs

None. SARIFFormatter is registered in the existing Registry, so cmd/scan.go can select it via --output=sarif once that flag exists (expected in a later plan, or already present from 06-01's scan integration). There are no placeholder data sources and no TODO markers.

Downstream Enablement

  • Phase 7 CICD-02 (SARIF upload to GitHub code scanning) can now format scan results by calling output.Get("sarif") and passing a real Options{ToolName: "keyhunter", ToolVersion: <buildversion>}.
  • The 2.1.0 document emitted here validates against https://json.schemastore.org/sarif-2.1.0.json and has the shape that GitHub's github/codeql-action/upload-sarif action expects.

Self-Check: PASSED

  • pkg/output/sarif.go — FOUND
  • pkg/output/sarif_test.go — FOUND
  • Commit 2cb35d5 (test) — FOUND in git log
  • Commit 2717aa3 (feat) — FOUND in git log
  • All seven TestSARIF_* tests — PASSING
  • go build ./... — SUCCEEDING