Commit Graph

171 Commits

Author SHA1 Message Date
salvacybersec
c504cbd5d3 feat(08-04): add 10 FOFA + 10 GitLab + 5 Bing dorks
- 10 FOFA queries using title=/body=/port=/cert= syntax (8 infrastructure
  + 2 frontier: Azure OpenAI cert, OpenAI proxy api_key leak)
- 10 GitLab code search dorks across frontier/specialized/infrastructure/
  emerging categories (OpenAI, Anthropic, Google AI, Groq, Cohere, HF,
  OpenRouter, Perplexity, DeepSeek, Pinecone)
- 5 Bing dorks using site:/filetype:/intitle:/inbody: operators
  (3 frontier + 1 specialized + 1 infrastructure)
- Brings grand total across all 8 sources to 150 dorks, satisfying DORK-02
- Dual-located under pkg/dorks/definitions/ and dorks/
2026-04-06 00:21:41 +03:00
salvacybersec
1c86800c14 feat(08-04): add 15 Censys + 10 ZoomEye dorks
- 15 Censys Search 2.0 queries for Ollama, vLLM, LocalAI, Open WebUI,
  LM Studio, Triton, TGI, LiteLLM, Portkey, LangServe, FastChat,
  text-generation-webui, Azure OpenAI certs, Bedrock certs, and OpenAI
  proxies (12 infrastructure + 3 frontier)
- 10 ZoomEye app/title/port/service queries covering the same LLM
  infrastructure surface (9 infrastructure + 1 frontier)
- Dual-located under pkg/dorks/definitions/ (embedded) and dorks/ (repo root)
2026-04-06 00:21:34 +03:00
salvacybersec
56c11e39a0 feat(08-03): add 20 Shodan dorks for exposed LLM infrastructure
- frontier.yaml: 6 dorks (OpenAI/Anthropic proxies, Azure OpenAI certs, AWS Bedrock, LiteLLM)
- infrastructure.yaml: 14 dorks (Ollama, vLLM, LocalAI, LM Studio, text-generation-webui, Open WebUI, Triton, TGI, LangServe, FastChat, OpenRouter/Portkey/Helicone gateways)
- Real Shodan query syntax: http.title, http.html, ssl.cert.subject.cn, product, port, http.component
- Dual-located: pkg/dorks/definitions/shodan/ + dorks/shodan/
2026-04-06 00:21:03 +03:00
salvacybersec
348d1c057b feat(08-03): add 30 Google dorks across 3 categories
- frontier.yaml: 12 dorks (OpenAI, Anthropic, Google AI, Groq, Cohere, Mistral, xAI, Replicate)
- specialized.yaml: 10 dorks (Perplexity, HF, ElevenLabs, Deepgram, AssemblyAI, Stability, Jina, Voyage)
- infrastructure.yaml: 8 dorks (OpenRouter, LiteLLM, Helicone, Portkey, Ollama, vLLM, LocalAI)
- Real site:/filetype:/intitle:/inurl: operators, no templating
- Dual-located: pkg/dorks/definitions/google/ (go:embed) + dorks/google/ (user-visible)
2026-04-06 00:20:56 +03:00
salvacybersec
9755b3756a feat(08-02): add 25 GitHub dorks for infrastructure, emerging, enterprise categories
- infrastructure.yaml: 10 dorks covering Tier 5 gateways (OpenRouter,
  LiteLLM, Portkey, Helicone, Cloudflare AI, Vercel AI) and Tier 8
  self-hosted (Ollama, vLLM, LocalAI)
- emerging.yaml: 10 dorks covering Tier 4 Chinese providers (DeepSeek,
  Moonshot, Qwen, Zhipu, MiniMax) and Tier 6 vector DBs (Pinecone,
  Weaviate, Qdrant, Chroma) plus Writer.com
- enterprise.yaml: 5 dorks covering Tier 7 dev tools (Codeium, Tabnine)
  and Tier 9 enterprise (Databricks, Snowflake Cortex, IBM watsonx)
- Registry now loads 50 total GitHub dorks across all 5 categories,
  mirrored in both dorks/github/ and pkg/dorks/definitions/github/
2026-04-06 00:20:52 +03:00
salvacybersec
09722eaec4 feat(08-02): add 25 GitHub dorks for frontier and specialized categories
- frontier.yaml: 15 dorks covering Tier 1/2 providers (OpenAI, Anthropic,
  Google AI, Azure OpenAI, AWS Bedrock, xAI, Cohere, Mistral, Groq,
  Together, Replicate)
- specialized.yaml: 10 dorks covering Tier 3 providers (Perplexity,
  Voyage, Jina, AssemblyAI, Deepgram, ElevenLabs, Stability, HuggingFace)
- Extend loader to accept YAML list format in addition to single-dork
  mapping, enabling multi-dork files for Wave 2+ plans
- Mirror all YAMLs into dorks/github/ (user-visible) and
  pkg/dorks/definitions/github/ (go:embed target)
2026-04-06 00:20:43 +03:00
salvacybersec
2dc7078708 docs(08-01): complete dork engine foundation plan
SUMMARY, STATE, ROADMAP, and REQUIREMENTS updates for pkg/dorks
foundation + custom_dorks storage (DORK-01, DORK-03).
2026-04-06 00:17:53 +03:00
salvacybersec
01062b88b1 feat(08-01): add custom_dorks table and CRUD for user-authored dorks
- schema.sql: CREATE TABLE IF NOT EXISTS custom_dorks with unique dork_id,
  source/category indexes, and tags stored as JSON TEXT
- custom_dorks.go: Save/List/Get/GetByDorkID/Delete with JSON tag round-trip
- Tests: round-trip, newest-first ordering, not-found, unique constraint,
  delete no-op, schema migration idempotency
2026-04-06 00:16:33 +03:00
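The "tags stored as JSON TEXT" round-trip in the commit above could look roughly like the sketch below. The helper names `encodeTags` and `decodeTags` are hypothetical, and storing a nil slice as `[]` rather than `null` is an assumption, not something the commit states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeTags serialises tags for a TEXT column. A nil slice is stored as "[]"
// rather than "null" (an assumption, chosen so reads never see null).
func encodeTags(tags []string) (string, error) {
	if tags == nil {
		tags = []string{}
	}
	b, err := json.Marshal(tags)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeTags parses the stored TEXT value back into a slice.
func decodeTags(text string) ([]string, error) {
	var tags []string
	if err := json.Unmarshal([]byte(text), &tags); err != nil {
		return nil, fmt.Errorf("decode tags: %w", err)
	}
	return tags, nil
}

func main() {
	text, err := encodeTags([]string{"frontier", "openai"})
	if err != nil {
		panic(err)
	}
	tags, err := decodeTags(text)
	if err != nil {
		panic(err)
	}
	fmt.Println(text, tags)
}
```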
salvacybersec
fd6efbb4c2 feat(08-01): add pkg/dorks foundation (schema, loader, registry, executor)
- Dork schema with Validate() mirroring provider YAML pattern
- go:embed loader tolerating empty definitions tree
- Registry with List/Get/Stats/ListBySource/ListByCategory
- Executor interface + Runner dispatch + ErrSourceNotImplemented
- Placeholder definitions/.gitkeep and repo-root dorks/.gitkeep
- Full unit test coverage for registry, validation, and runner dispatch
2026-04-06 00:15:32 +03:00
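A minimal sketch of the Executor interface, Runner dispatch, and ErrSourceNotImplemented sentinel listed above. Only those three names come from the commit; the `Dork` fields, the method signatures, `echoExecutor`, and the sample queries are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Dork is a hypothetical, trimmed-down dork definition.
type Dork struct {
	ID     string
	Source string
	Query  string
}

// Executor runs dorks for one source. The method set is an assumption.
type Executor interface {
	Source() string
	Execute(d Dork) ([]string, error)
}

// ErrSourceNotImplemented is returned when no executor is registered for a source.
var ErrSourceNotImplemented = errors.New("dork source not implemented")

// Runner dispatches a dork to the executor registered for its source.
type Runner struct {
	executors map[string]Executor
}

func NewRunner() *Runner { return &Runner{executors: map[string]Executor{}} }

func (r *Runner) Register(e Executor) { r.executors[e.Source()] = e }

func (r *Runner) Run(d Dork) ([]string, error) {
	e, ok := r.executors[d.Source]
	if !ok {
		// Wrap the sentinel so callers can still match it with errors.Is.
		return nil, fmt.Errorf("%w: %q", ErrSourceNotImplemented, d.Source)
	}
	return e.Execute(d)
}

// echoExecutor is a stand-in that performs no network calls.
type echoExecutor struct{ source string }

func (e echoExecutor) Source() string { return e.source }

func (e echoExecutor) Execute(d Dork) ([]string, error) {
	return []string{e.source + " would run: " + d.Query}, nil
}

func main() {
	r := NewRunner()
	r.Register(echoExecutor{source: "github"})

	out, err := r.Run(Dork{ID: "demo-1", Source: "github", Query: "filename:.env"})
	fmt.Println(out, err)

	_, err = r.Run(Dork{ID: "demo-2", Source: "shodan", Query: "port:11434"})
	fmt.Println(errors.Is(err, ErrSourceNotImplemented))
}
```

Wrapping the sentinel with `%w` keeps the source name in the message while letting callers branch on the error kind.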
salvacybersec
46cf55ad37 docs(08): create phase plan
2026-04-06 00:13:13 +03:00
salvacybersec
4c2081821f docs(08): dork engine context
2026-04-06 00:05:59 +03:00
salvacybersec
436791f263 docs(phase-07): complete phase execution
2026-04-06 00:05:04 +03:00
salvacybersec
ca526d8e32 docs(07-04): complete import command plan
2026-04-06 00:00:24 +03:00
salvacybersec
9dbb0b87d4 feat(07-04): wire keyhunter import command with dedup and DB persist
- Replace import stub with cmd/import.go dispatching to pkg/importer
  (trufflehog, gitleaks, gitleaks-csv) via --format flag
- Reuse openDBWithKey helper so encryption + path resolution match scan/keys
- engineToStorage converts engine.Finding -> storage.Finding (Source -> SourcePath)
- Add pkg/storage.FindingExistsByKey for idempotent cross-import dedup
  keyed on (provider, masked key, source path, line number)
- cmd/import_test.go: selector table, field conversion, end-to-end trufflehog
  import with re-run duplicate assertion, unknown-format + missing-file errors
- pkg/storage queries_test: FindingExistsByKey hit and four miss cases

Delivers IMP-01/02/03 end-to-end.
2026-04-05 23:59:39 +03:00
salvacybersec
b3db22ac93 docs(07-05): complete hook install/uninstall plan
2026-04-05 23:59:32 +03:00
salvacybersec
7f2f42804d docs(07-06): complete CI/CD documentation plan
2026-04-05 23:59:11 +03:00
salvacybersec
aa8daf8de2 feat(07-05): implement keyhunter hook install/uninstall with embedded pre-commit script
- cmd/hook.go: install/uninstall subcommands with --force flag
- cmd/hook_script.sh: embedded via go:embed, runs keyhunter scan on staged files
- KEYHUNTER-HOOK v1 marker prevents accidental deletion of non-owned hooks
- Backup existing hooks on --force install
- cmd/hook_test.go: 10 tests covering fresh install, non-repo, force/backup, overwrite, uninstall lifecycle
- Remove hookCmd stub from cmd/stubs.go
2026-04-05 23:58:44 +03:00
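One way the marker check described above might gate install and uninstall is sketched here. The marker string is taken from the commit; the decision table (`planInstall`, `canUninstall`) is an inferred reading of "--force", "backup", and "non-owned hooks", not the actual logic in cmd/hook.go.

```go
package main

import (
	"fmt"
	"strings"
)

// hookMarker is the ownership marker named in the commit message.
const hookMarker = "KEYHUNTER-HOOK v1"

type installPlan int

const (
	planWrite           installPlan = iota // no hook yet, or the hook is ours
	planBackupThenWrite                    // foreign hook and --force given
	planRefuse                             // foreign hook and no --force
)

// owned reports whether an existing hook file carries the marker.
func owned(content string) bool { return strings.Contains(content, hookMarker) }

// planInstall decides what to do with the current pre-commit hook content.
// An empty string stands for "no hook file present" (an assumption).
func planInstall(existing string, force bool) installPlan {
	switch {
	case existing == "" || owned(existing):
		return planWrite
	case force:
		return planBackupThenWrite
	default:
		return planRefuse
	}
}

// canUninstall only allows deleting hooks that carry the marker.
func canUninstall(existing string) bool { return owned(existing) }

func main() {
	foreign := "#!/bin/sh\nnpm test\n"
	ours := "#!/bin/sh\n# " + hookMarker + "\nkeyhunter scan\n"

	fmt.Println(planInstall("", false) == planWrite)
	fmt.Println(planInstall(foreign, false) == planRefuse)
	fmt.Println(planInstall(foreign, true) == planBackupThenWrite)
	fmt.Println(canUninstall(ours), canUninstall(foreign))
}
```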
salvacybersec
87c5a00203 docs(07-06): link README CI/CD section to full guide
- Expand CI/CD Integration section with import examples
- Link to docs/CI-CD.md for full walkthrough
2026-04-05 23:58:31 +03:00
salvacybersec
e4a71bb0de docs(07-06): add CI/CD integration guide
- Pre-commit hook install/force/uninstall lifecycle
- GitHub Actions workflow example with SARIF upload
- External scanner import walkthrough (trufflehog, gitleaks)
- Exit-code table for CI gating
2026-04-05 23:58:31 +03:00
salvacybersec
1a4d520b4f docs(07-03): complete dedup + SARIF github validation plan
2026-04-05 23:56:40 +03:00
salvacybersec
5ce2d4945e docs(07-02): complete Gitleaks importer plan
2026-04-05 23:56:12 +03:00
salvacybersec
75becce3dd docs(07-01): complete importer trufflehog adapter plan
2026-04-05 23:55:58 +03:00
salvacybersec
bd8eb9b611 test(07-03): SARIF GitHub code scanning validation
- Minimal required-fields fixture for GitHub SARIF upload schema
- TestSARIFGitHubValidation: asserts $schema/version/runs, tool.driver.name,
  per-result ruleId/level/message/locations, physicalLocation.region.startLine >= 1
- Covers startLine floor for LineNumber=0 inputs
- TestSARIFGitHubValidation_EmptyFindings: empty input still yields a valid
  document with results: [] (not null)
2026-04-05 23:55:38 +03:00
salvacybersec
83640ac200 feat(07-02): add Gitleaks JSON + CSV importers
- GitleaksImporter parses native JSON array output to []engine.Finding
- GitleaksCSVImporter parses CSV with header-based column resolution
- normalizeGitleaksRuleID strips suffixes (-api-key, -access-token, ...)
- Shared buildGitleaksFinding helper keeps JSON/CSV paths in lockstep
- Test fixtures + 8 tests covering happy path, empty, invalid, symlink fallback
2026-04-05 23:55:36 +03:00
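The suffix stripping attributed to normalizeGitleaksRuleID can be sketched as follows. Only the two suffixes spelled out in the commit are included; the rest of the list is elided in the source and is deliberately not guessed here. The function name `normalizeRuleID` and the sample rule IDs are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// ruleSuffixes holds only the suffixes named in the commit message;
// the real list is longer but is not shown in the source.
var ruleSuffixes = []string{"-api-key", "-access-token"}

// normalizeRuleID strips the first matching suffix, leaving other IDs untouched.
func normalizeRuleID(id string) string {
	for _, s := range ruleSuffixes {
		if strings.HasSuffix(id, s) {
			return strings.TrimSuffix(id, s)
		}
	}
	return id
}

func main() {
	for _, id := range []string{"openai-api-key", "github-access-token", "generic"} {
		fmt.Printf("%s -> %s\n", id, normalizeRuleID(id))
	}
}
```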
salvacybersec
46eec328d2 feat(07-01): Importer interface and TruffleHog v3 JSON adapter
- pkg/importer/importer.go: shared Importer interface (Name, Import)
- pkg/importer/trufflehog.go: TruffleHogImporter with v3 JSON decoding,
  detector-name normalization (OpenAI/GithubV2/AWS -> canonical ids),
  SourceMetadata path+line extraction for Git/Filesystem/Github
- pkg/importer/testdata/trufflehog-sample.json: 3-record fixture
- pkg/importer/trufflehog_test.go: Name, Import, NormalizeName, EmptyArray,
  InvalidJSON tests -- all passing
2026-04-05 23:55:24 +03:00
salvacybersec
6a3d5b0cb7 feat(07-03): dedup helper for imported findings
- FindingKey: stable SHA-256 over provider+masked+source+line
- Dedup: preserves first-seen order, returns drop count
- 8 unit tests covering stability, field sensitivity, order preservation
2026-04-05 23:54:44 +03:00
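A minimal sketch of the dedup helper described above: a SHA-256 key over provider, masked key, source, and line, plus an order-preserving Dedup that reports how many items it dropped. The `Finding` struct is a trimmed-down stand-in, and the NUL field separator is an assumption (the commit does not say how the fields are joined).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Finding is a hypothetical stand-in for the project's finding type.
type Finding struct {
	Provider  string
	KeyMasked string
	Source    string
	Line      int
}

// FindingKey hashes the four identity fields. The NUL separator keeps
// ("ab", "c") distinct from ("a", "bc").
func FindingKey(f Finding) string {
	raw := fmt.Sprintf("%s\x00%s\x00%s\x00%d", f.Provider, f.KeyMasked, f.Source, f.Line)
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// Dedup keeps the first occurrence of each key in input order and
// returns the number of duplicates dropped.
func Dedup(in []Finding) ([]Finding, int) {
	seen := make(map[string]struct{}, len(in))
	out := make([]Finding, 0, len(in))
	dropped := 0
	for _, f := range in {
		k := FindingKey(f)
		if _, dup := seen[k]; dup {
			dropped++
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out, dropped
}

func main() {
	a := Finding{Provider: "openai", KeyMasked: "sk-a...1234", Source: ".env", Line: 3}
	b := Finding{Provider: "openai", KeyMasked: "sk-a...1234", Source: ".env", Line: 9}
	out, dropped := Dedup([]Finding{a, b, a})
	fmt.Println(len(out), dropped)
}
```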
salvacybersec
779c5b3d6f docs(07): create phase 7 import & CI/CD plans
2026-04-05 23:53:14 +03:00
salvacybersec
5c74c35a26 docs(07): import adapters and CI/CD context
2026-04-05 23:47:19 +03:00
salvacybersec
f6f6730ddb docs(phase-06): complete phase execution
2026-04-05 23:46:26 +03:00
salvacybersec
e5f93ef89c docs(06-06): complete scan output wiring plan
2026-04-05 23:42:57 +03:00
salvacybersec
cdf3c8ab4b test(06-06): cover scan output dispatch and unknown-format error
- Verify output.Names() exposes table, json, csv, sarif
- Assert renderScanOutput wraps output.ErrUnknownFormat and lists valid formats
- Smoke-test JSON and table dispatch paths through the registry
2026-04-05 23:42:01 +03:00
salvacybersec
c9114e4142 feat(06-06): wire scan --output to formatter registry and exit-code contract
- Replace inline jsonFinding switch with output.Get() dispatch
- Add renderScanOutput helper used by RunE and tests
- Introduce version var + versionString() for SARIF tool metadata
- Update --output help to list table, json, sarif, csv
- Change root Execute to os.Exit(2) on RunE errors per OUT-06
  (exit 0=clean, 1=findings, 2=tool error)
2026-04-05 23:41:38 +03:00
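The exit-code contract stated above (0 = clean, 1 = findings, 2 = tool error) can be expressed as a small pure function. The constant and function names below are hypothetical; only the three numeric codes come from the commit.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	exitClean     = 0 // scan ran, nothing found
	exitFindings  = 1 // scan ran, at least one finding
	exitToolError = 2 // the tool itself failed
)

// exitCode maps a scan outcome to the process exit code.
// A tool error takes precedence over any findings count.
func exitCode(findings int, err error) int {
	switch {
	case err != nil:
		return exitToolError
	case findings > 0:
		return exitFindings
	default:
		return exitClean
	}
}

func main() {
	fmt.Println(exitCode(0, nil))
	fmt.Println(exitCode(3, nil))
	fmt.Println(exitCode(0, errors.New("unknown output format")))
	// A real command would end with os.Exit(exitCode(n, err)).
}
```

Keeping tool errors on a separate code lets a CI job fail on leaked keys without confusing that case with a misconfigured run.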
salvacybersec
3b89bde38d docs(06-05): complete keys command tree plan
2026-04-05 23:40:36 +03:00
salvacybersec
e2394ec663 test(06-05): integration tests for keys list/show/export/delete
- Temp-file SQLite DB seeded with three findings (2 openai, 1 anthropic,
  one verified) via storage.SaveFinding + loadOrCreateEncKey
- RunE + cmd.SetOut buffers for hermetic stdout capture
- Covers: list default + provider filter, show hit (unmasked) + miss,
  export JSON stdout (parses + plaintext present), export CSV to file
  (header + 3 rows), delete --yes then list returns 2
- TestKeysCopy and TestKeysVerify are documented as intentionally skipped
  (clipboard backend unavailable headlessly; verify needs network)
2026-04-05 23:39:07 +03:00
salvacybersec
06594afc57 feat(06-05): implement keys command tree (list/show/export/copy/delete/verify)
- Add cmd/keys.go with six subcommands backed by the Plan 04 query layer
- keys list prints masked findings with id/provider/confidence/source columns
  and supports --provider/--verified/--limit/--unmask filters
- keys show <id> renders a finding fully unmasked with verify metadata
- keys export --format=json|csv reuses the formatter registry, atomic
  file writes when --output is set
- keys copy <id> uses atotto/clipboard for clipboard handoff
- keys delete <id> prompts via cmd.InOrStdin unless --yes is passed
- keys verify <id> gates on verify.EnsureConsent, then updates the stored
  row inline via UPDATE findings SET verify_* using db.SQL()
- Remove the keysCmd stub from cmd/stubs.go (single declaration)
- All subcommands read config via openDBWithKey() mirroring scan.go
2026-04-05 23:37:25 +03:00
salvacybersec
7a3822c22e docs(06-02): complete JSON + CSV formatter plan
2026-04-05 23:34:51 +03:00
salvacybersec
9546f80fab docs(06-03): complete SARIF 2.1.0 formatter plan
2026-04-05 23:32:37 +03:00
salvacybersec
35352ff3d0 docs(06-04): complete findings query layer plan
SUMMARY.md for Plan 06-04: Filters struct + ListFindingsFiltered +
GetFinding + DeleteFinding on pkg/storage. Foundation for keys command
tree in Plan 06-05.
2026-04-05 23:32:11 +03:00
salvacybersec
03249fb3d1 feat(06-02): implement CSVFormatter with Unmask support
- Fixed 9-column header: id,provider,confidence,key,source,line,detected_at,verified,verify_status
- Uses encoding/csv for automatic quoting of commas/quotes in source paths
- Honors Options.Unmask for key column
- Registers under "csv" in output registry
2026-04-05 23:32:07 +03:00
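To illustrate why encoding/csv is used for the fixed 9-column layout above, here is a small sketch. The header is copied from the commit; `renderCSV`, the sample row, and the masked-key format are assumptions.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// csvHeader is the fixed column order listed in the commit message.
var csvHeader = []string{
	"id", "provider", "confidence", "key", "source",
	"line", "detected_at", "verified", "verify_status",
}

// renderCSV writes the header followed by the given rows.
// encoding/csv quotes fields containing commas or quotes automatically.
func renderCSV(rows [][]string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write(csvHeader); err != nil {
		return "", err
	}
	if err := w.WriteAll(rows); err != nil { // WriteAll also flushes
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderCSV([][]string{
		{"1", "openai", "high", "sk-a...1234", "dir,with,commas/.env", "12", "2026-04-05", "false", ""},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```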
salvacybersec
b35881aaef test(06-02): add failing tests for CSVFormatter
2026-04-05 23:31:44 +03:00
salvacybersec
2717aa3196 feat(06-03): implement SARIF 2.1.0 formatter with hand-rolled structs
- SARIFFormatter emits schema-valid SARIF 2.1.0 JSON for CI ingestion
- One rule per distinct provider, deduped in first-seen order
- Confidence mapped high/medium/low to error/warning/note
- startLine floored to 1 per SARIF spec requirement
- Registered under name 'sarif' via init()
2026-04-05 23:31:15 +03:00
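The two mapping rules in the commit above (confidence to SARIF level, and the startLine floor) are small enough to sketch directly. The function names are hypothetical, and treating any unrecognised confidence as "note" is an assumption.

```go
package main

import "fmt"

// sarifLevel maps confidence to a SARIF result level:
// high -> error, medium -> warning, low -> note.
// Unrecognised values fall back to "note" (an assumption).
func sarifLevel(confidence string) string {
	switch confidence {
	case "high":
		return "error"
	case "medium":
		return "warning"
	default:
		return "note"
	}
}

// sarifStartLine floors line numbers to 1, since SARIF regions are 1-based
// and a finding with no known line would otherwise carry 0.
func sarifStartLine(line int) int {
	if line < 1 {
		return 1
	}
	return line
}

func main() {
	for _, c := range []string{"high", "medium", "low"} {
		fmt.Println(c, "->", sarifLevel(c))
	}
	fmt.Println(sarifStartLine(0), sarifStartLine(42))
}
```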
salvacybersec
b1e4dea51c feat(06-04): implement findings query layer for keys command
- Filters struct: Provider, Verified (*bool), Limit, Offset
- ListFindingsFiltered: optional WHERE + ORDER BY created_at DESC, id DESC
- GetFinding: single-row lookup, propagates sql.ErrNoRows on miss
- DeleteFinding: returns RowsAffected so caller can distinguish hit/miss
- Shared scan/hydrate helpers decrypt key_value via existing Decrypt
2026-04-05 23:31:15 +03:00
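The "optional WHERE + ORDER BY" shape of ListFindingsFiltered might be built along these lines. The `Filters` fields and the ORDER BY clause follow the commit; `buildQuery`, the selected columns, and the `verified` column name are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Filters mirrors the fields listed in the commit message.
type Filters struct {
	Provider string
	Verified *bool // nil means "do not filter on verification"
	Limit    int
	Offset   int
}

// buildQuery assembles a parameterised query; only set filters add clauses.
func buildQuery(f Filters) (string, []any) {
	q := "SELECT id, provider FROM findings" // column list is hypothetical
	var where []string
	var args []any
	if f.Provider != "" {
		where = append(where, "provider = ?")
		args = append(args, f.Provider)
	}
	if f.Verified != nil {
		where = append(where, "verified = ?")
		args = append(args, *f.Verified)
	}
	if len(where) > 0 {
		q += " WHERE " + strings.Join(where, " AND ")
	}
	q += " ORDER BY created_at DESC, id DESC"
	if f.Limit > 0 {
		q += " LIMIT ?"
		args = append(args, f.Limit)
		if f.Offset > 0 {
			q += " OFFSET ?"
			args = append(args, f.Offset)
		}
	}
	return q, args
}

func main() {
	verified := true
	q, args := buildQuery(Filters{Provider: "openai", Verified: &verified, Limit: 10})
	fmt.Println(q)
	fmt.Println(args...)
}
```

Using a `*bool` for Verified keeps "unset" distinct from "false", which a plain bool cannot express.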
salvacybersec
164477136c feat(06-02): implement JSONFormatter with Unmask support
- Renders findings as 2-space indented JSON array
- Honors Options.Unmask for key field exposure
- Omits empty verify fields via json omitempty
- Registers under "json" in output registry
2026-04-05 23:31:12 +03:00
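A sketch of the JSONFormatter behaviour listed above: a 2-space indented array, an Unmask switch, and empty verify fields omitted via `omitempty`. The struct fields, JSON names, and `renderJSON` signature are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// finding is a hypothetical stored finding with both key forms.
type finding struct {
	Provider     string
	KeyValue     string
	KeyMasked    string
	VerifyStatus string
}

// jsonFinding is the wire shape; verify_status is dropped when empty.
type jsonFinding struct {
	Provider     string `json:"provider"`
	Key          string `json:"key"`
	VerifyStatus string `json:"verify_status,omitempty"`
}

// renderJSON emits a 2-space indented array; unmask selects the full key.
func renderJSON(in []finding, unmask bool) (string, error) {
	out := make([]jsonFinding, 0, len(in)) // non-nil, so empty input renders as []
	for _, f := range in {
		key := f.KeyMasked
		if unmask {
			key = f.KeyValue
		}
		out = append(out, jsonFinding{Provider: f.Provider, Key: key, VerifyStatus: f.VerifyStatus})
	}
	b, err := json.MarshalIndent(out, "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := renderJSON([]finding{{Provider: "openai", KeyValue: "sk-example-full", KeyMasked: "sk-e...full"}}, false)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```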
salvacybersec
2cb35d50ac test(06-03): add failing tests for SARIF 2.1.0 formatter
2026-04-05 23:30:38 +03:00
salvacybersec
67763ec498 test(06-04): add failing tests for findings query layer
- Filters struct with provider, verified, limit/offset
- ListFindingsFiltered, GetFinding, DeleteFinding coverage
- Uses in-memory SQLite with seeded fixtures across 2 providers
2026-04-05 23:30:33 +03:00
salvacybersec
c933673ca9 test(06-02): add failing tests for JSONFormatter
2026-04-05 23:30:12 +03:00
salvacybersec
5292502000 docs(06-01): complete formatter interface + TableFormatter plan
2026-04-05 23:29:13 +03:00
salvacybersec
8e4db5db09 feat(06-01): refactor table output into TableFormatter
- TableFormatter implements Formatter interface, registered as "table"
- Writes to arbitrary io.Writer instead of hardcoded os.Stdout
- Strips ANSI colors when writer is not a TTY or NO_COLOR is set
- Uses bundled tableStyles so plain/colored paths share one renderer
- PrintFindings retained as backward-compat wrapper delegating to Format
2026-04-05 23:27:53 +03:00
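The "strip ANSI colors when the writer is not a TTY or NO_COLOR is set" rule can be approximated with the standard library alone, as in the sketch below. The project itself depends on github.com/mattn/go-isatty; the character-device check here is a rough stdlib substitute, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"regexp"
)

// ansiSGR matches colour/style escape sequences such as "\x1b[32m".
var ansiSGR = regexp.MustCompile(`\x1b\[[0-9;]*m`)

// stripANSI removes colour escape sequences from s.
func stripANSI(s string) string { return ansiSGR.ReplaceAllString(s, "") }

// isTTY is a stdlib approximation: true only for *os.File character devices.
// (It also reports true for devices like /dev/null, unlike go-isatty.)
func isTTY(w io.Writer) bool {
	f, ok := w.(*os.File)
	if !ok {
		return false
	}
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// colorsEnabled honours NO_COLOR (any non-empty value disables colour).
func colorsEnabled(w io.Writer) bool {
	if os.Getenv("NO_COLOR") != "" {
		return false
	}
	return isTTY(w)
}

// render writes s to w, stripping colours when they are not wanted.
func render(w io.Writer, s string) error {
	if !colorsEnabled(w) {
		s = stripANSI(s)
	}
	_, err := io.WriteString(w, s)
	return err
}

func main() {
	if err := render(os.Stdout, "\x1b[32mverified\x1b[0m\n"); err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", stripANSI("\x1b[1;31mleak\x1b[0m"))
}
```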
salvacybersec
8c37252c1b test(06-01): add failing tests for TableFormatter refactor
- Add TestTableFormatter_Empty, NoColorInBuffer, Unverified/VerifiedLayout
- Add TestTableFormatter_Masking, MetadataSorted, RegisteredUnderTable
- Keep legacy PrintFindings tests as backward-compat wrapper coverage
2026-04-05 23:27:03 +03:00
salvacybersec
291c97ed0b feat(06-01): add Formatter interface, Registry, and TTY color detection
- pkg/output/formatter.go: Formatter interface, Options, Registry with
  Register/Get/Names, ErrUnknownFormat sentinel
- pkg/output/colors.go: IsTTY + ColorsEnabled honoring NO_COLOR
- Promote github.com/mattn/go-isatty to direct dependency
- Unit tests cover registry round-trip, unknown lookup, sorted Names,
  non-TTY buffer, NO_COLOR override

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:41:23 +03:00
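A compact sketch of the formatter registry introduced in the commit above. Register, Get, Names, and ErrUnknownFormat are named in the commit; the `Formatter` method signature and the `lineFormatter` stand-in are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"sort"
)

// Formatter renders rows to a writer. The signature is hypothetical.
type Formatter interface {
	Format(w io.Writer, rows []string) error
}

// ErrUnknownFormat is the sentinel returned for unregistered names.
var ErrUnknownFormat = errors.New("unknown output format")

var registry = map[string]Formatter{}

// Register adds or replaces a formatter under name.
func Register(name string, f Formatter) { registry[name] = f }

// Get looks up a formatter, wrapping ErrUnknownFormat on a miss.
func Get(name string) (Formatter, error) {
	f, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownFormat, name)
	}
	return f, nil
}

// Names returns registered names in sorted order for stable help text.
func Names() []string {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// lineFormatter is a stand-in that prints one row per line.
type lineFormatter struct{}

func (lineFormatter) Format(w io.Writer, rows []string) error {
	for _, r := range rows {
		if _, err := fmt.Fprintln(w, r); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	Register("table", lineFormatter{})
	Register("json", lineFormatter{})
	fmt.Println(Names())

	_, err := Get("xml")
	fmt.Println(errors.Is(err, ErrUnknownFormat))

	f, _ := Get("table")
	_ = f.Format(os.Stdout, []string{"openai  sk-a...1234"})
}
```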