
KeyHunter

The most comprehensive API key scanner for LLM/AI providers. Detect, validate, and monitor leaked API keys across 108+ providers.



Why KeyHunter?

Existing tools like TruffleHog (~3 LLM detectors) and Gitleaks (~5 LLM rules) were built for general secret scanning. AI-related credential leaks grew 81% year-over-year in 2025, yet no tool covers more than ~15 LLM providers.

KeyHunter fills that gap with 108+ provider-specific detectors, active key validation, OSINT/recon capabilities, and real-time notifications.

How It Compares

Feature | KeyHunter | TruffleHog | Gitleaks | detect-secrets
LLM Providers | 108+ | ~3 | ~5 | ~1
Active Verification | 108+ endpoints | ~20 types | No | No
OSINT/Recon | Shodan, Censys, GitHub, GitLab, Paste, S3 | No | No | No
External Tool Import | TruffleHog + Gitleaks | - | - | -
Web Dashboard | Built-in | No | No | No
Telegram Bot | Built-in | No | No | No
Dork Engine | Built-in YAML dorks | No | No | No
Provider YAML Plugin | Community-extensible | Go code only | TOML rules | Python plugins
Scheduled Scanning | Cron-based | No | No | No

Features

Core Scanning

  • File/Directory scanning with recursive traversal and glob exclusions
  • Git-aware scanning — full history, branches, stash, delta-based diffs
  • stdin/pipe support — cat dump.txt | keyhunter scan stdin
  • URL fetching — scan any remote URL content
  • Clipboard scanning — instant clipboard content analysis

OSINT / Recon Engine (80+ Sources, 18 Categories)

IoT & Internet Scanners

  • Shodan — exposed LLM proxies, dashboards, API endpoints
  • Censys — HTTP body search for leaked credentials
  • ZoomEye — Chinese IoT scanner, different coverage perspective
  • FOFA — Asian infrastructure scanning, body content search
  • Netlas — HTTP response body keyword search
  • BinaryEdge — internet-wide scan data

Code Hosting & Snippets

  • GitHub / GitLab / Bitbucket — code search with automated dorks
  • Codeberg / Gitea instances — alternative Git platforms (Gitea auto-discovered via Shodan)
  • Replit / CodeSandbox / StackBlitz / Glitch — interactive dev environments with hardcoded keys
  • CodePen / JSFiddle / Observable — browser snippet platforms
  • HuggingFace — Spaces, repos, model configs (high-yield for LLM keys)
  • Kaggle — notebooks and datasets with API keys
  • Jupyter / nbviewer — shared notebooks
  • GitHub Gist — public gist search
  • Gitpod — workspace snapshots

Search Engine Dorking

  • Google — Custom Search API / SerpAPI, 100+ built-in dorks
  • Bing — Azure Cognitive Services search
  • DuckDuckGo / Yandex / Brave — alternative indexes for broader coverage

Paste Sites

  • Multi-paste aggregator — Pastebin, dpaste, paste.ee, rentry, hastebin, ix.io, and more

Package Registries

  • npm / PyPI / RubyGems / crates.io / Maven / NuGet / Packagist / Go modules — download packages, extract source, scan for key patterns

Container & Infrastructure

  • Docker Hub — image layer scanning, build arg extraction
  • Kubernetes — exposed dashboards, public Secret/ConfigMap YAML files
  • Terraform — state files (.tfstate with plaintext secrets), registry modules
  • Helm Charts / Ansible Galaxy — default values with credentials

Cloud Storage

  • AWS S3 / GCS / Azure Blob / DigitalOcean Spaces / Backblaze B2 — bucket enumeration and content scanning
  • MinIO — self-hosted instances discovered via Shodan
  • GrayHatWarfare — searchable database of public bucket objects

CI/CD Log Leaks

  • Travis CI / CircleCI — public build logs with leaked env vars
  • GitHub Actions — workflow run log scanning
  • Jenkins — exposed instances (Shodan-discovered), console output
  • GitLab CI/CD — public pipeline job traces

Web Archives

  • Wayback Machine — historical snapshots of removed .env files, config pages
  • CommonCrawl — massive web crawl data, WARC record scanning

Forums & Documentation

  • Stack Overflow — API + SEDE queries for code snippets with real keys
  • Reddit — programming subreddit scanning
  • Hacker News — Algolia API comment search
  • dev.to / Medium — tutorial articles with hardcoded keys
  • Telegram groups — public channels sharing configs and "free API keys"
  • Discord — indexed public server content

Collaboration Tools

  • Notion / Confluence — public pages and spaces with credentials
  • Trello — public boards with API key cards
  • Google Docs/Sheets — publicly shared documents

Frontend & JavaScript Leaks

  • JS Source Maps — original source recovery with inlined secrets
  • Webpack / Vite bundles: REACT_APP_*, NEXT_PUBLIC_*, VITE_* variable extraction
  • Exposed .env files — misconfigured web servers serving dotenv from root
  • Swagger / OpenAPI docs — real auth examples in API docs
  • Vercel / Netlify previews — deploy preview JS bundles with production secrets

Log Aggregators

  • Elasticsearch / Kibana — exposed instances with application logs containing API keys
  • Grafana — exposed dashboards with datasource configs
  • Sentry — error tracking capturing request headers with keys

Threat Intelligence

  • VirusTotal — uploaded files/scripts containing embedded keys
  • Intelligence X — aggregated paste, darknet, and leak search
  • URLhaus — malicious URLs with API keys in parameters

Mobile Apps

  • APK analysis — download, decompile, grep for key patterns (via apktool/jadx)

DNS / Subdomain Discovery

  • crt.sh — Certificate Transparency log for API subdomain discovery
  • Subdomain probing — config endpoint enumeration (.env, /api/config, /actuator/env)

API Marketplaces

  • Postman — public collections, workspaces, environments
  • SwaggerHub — published API definitions with example values

recon full — parallel sweep across all 80+ sources with deduplication and unified reporting
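The parallel sweep with deduplication can be pictured with a minimal Go sketch. The `finding` and `source` types and the `sweep` function are hypothetical names chosen for illustration; they are not taken from the KeyHunter codebase, and the real engine may deduplicate on different fields.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// finding is a hypothetical result record: where a key was seen and its value.
type finding struct {
	Source string
	Key    string
}

// source is a hypothetical recon source that returns whatever it discovered.
type source func() []finding

// sweep runs every source in its own goroutine and keeps one finding per key.
// Which source "wins" for a duplicated key depends on goroutine scheduling.
func sweep(sources []source) []finding {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen = map[string]finding{}
	)
	for _, src := range sources {
		wg.Add(1)
		go func(src source) {
			defer wg.Done()
			for _, f := range src() {
				mu.Lock()
				if _, dup := seen[f.Key]; !dup {
					seen[f.Key] = f
				}
				mu.Unlock()
			}
		}(src)
	}
	wg.Wait()

	// Sort by key so the unified report is stable between runs.
	out := make([]finding, 0, len(seen))
	for _, f := range seen {
		out = append(out, f)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}

func main() {
	github := func() []finding {
		return []finding{{Source: "github", Key: "sk-example-1"}, {Source: "github", Key: "sk-example-2"}}
	}
	gitlab := func() []finding {
		return []finding{{Source: "gitlab", Key: "sk-example-2"}}
	}
	for _, f := range sweep([]source{github, gitlab}) {
		fmt.Println(f.Key, "first reported by", f.Source)
	}
}
```

A mutex-guarded map keeps the sketch short; a channel feeding a single collector goroutine would work equally well.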

Active Verification

  • Lightweight API calls to verify if detected keys are active
  • Permission and scope extraction (org, rate limits, model access)
  • Configurable via --verify flag (off by default)
  • Provider-specific verification endpoints

External Tool Integration

  • Import TruffleHog JSON output — enrich with LLM-specific analysis
  • Import Gitleaks JSON output — cross-reference with 108+ providers
  • Generic CSV import for custom tool output

Notifications & Dashboard

  • Telegram Bot — scan triggers, key alerts, recon results
  • Web Dashboard — htmx + Tailwind, SQLite-backed, real-time scan viewer
  • Webhook — generic HTTP POST notifications
  • Slack — workspace notifications
  • Scheduled scans — cron-based recurring scans with auto-notify

Quick Start

Install

# From source
go install github.com/keyhunter/keyhunter@latest

# Binary release
curl -sSL https://get.keyhunter.dev | bash

# Docker
docker pull keyhunter/keyhunter:latest

Basic Usage

# Scan a directory
keyhunter scan path ./my-project/

# Scan with active verification
keyhunter scan path ./my-project/ --verify

# Scan git history (last 30 days)
keyhunter scan git . --since="30 days ago"

# Scan from pipe
cat secrets.txt | keyhunter scan stdin

# Scan only specific providers
keyhunter scan path . --providers=openai,anthropic,deepseek

# JSON output
keyhunter scan path . --output=json > results.json

OSINT / Recon

# ── IoT & Internet Scanners ──
keyhunter recon shodan --dork="http.title:\"LiteLLM\" port:4000"
keyhunter recon censys --query='services.http.response.body:"sk-proj-"'
keyhunter recon zoomeye --query='app:"Elasticsearch" +"api_key"'
keyhunter recon fofa --query='body="OPENAI_API_KEY"'
keyhunter recon netlas --query='http.body:"sk-ant-"'

# ── Code Hosting ──
keyhunter recon github --dork=auto               # All built-in GitHub dorks
keyhunter recon gitlab --dork=auto
keyhunter recon bitbucket --query="OPENAI_API_KEY"
keyhunter recon replit --query="sk-proj-"         # Public repls
keyhunter recon huggingface --spaces --query="api_key"  # HF Spaces
keyhunter recon kaggle --notebooks --query="openai"
keyhunter recon codesandbox --query="sk-ant-"
keyhunter recon glitch --query="ANTHROPIC_API_KEY"
keyhunter recon gitea --instances-from=shodan     # Auto-discover Gitea instances

# ── Search Engine Dorking ──
keyhunter recon google --dork=auto                # 100+ built-in Google dorks
keyhunter recon google --dork='"sk-proj-" -github.com filetype:env'
keyhunter recon bing --dork=auto
keyhunter recon brave --query="OPENAI_API_KEY filetype:yaml"

# ── Package Registries ──
keyhunter recon npm --recent --query="openai"     # Scan new packages
keyhunter recon pypi --recent --query="llm"
keyhunter recon crates --query="api_key"

# ── Cloud Storage ──
keyhunter recon s3 --domain=targetcorp            # S3 bucket enumeration
keyhunter recon gcs --domain=targetcorp           # GCS buckets
keyhunter recon azure --domain=targetcorp         # Azure Blob
keyhunter recon minio --shodan                    # Exposed MinIO instances
keyhunter recon grayhat --query="openai api_key"  # GrayHatWarfare search

# ── CI/CD Logs ──
keyhunter recon ghactions --org=targetcorp        # GitHub Actions logs
keyhunter recon travis --org=targetcorp
keyhunter recon jenkins --shodan                  # Exposed Jenkins instances
keyhunter recon circleci --org=targetcorp

# ── Web Archives ──
keyhunter recon wayback --domain=targetcorp.com   # Wayback Machine
keyhunter recon commoncrawl --domain=targetcorp.com

# ── Frontend & JS ──
keyhunter recon dotenv --domain-list=targets.txt  # Exposed .env files
keyhunter recon sourcemaps --domain=app.target.com  # JS source maps
keyhunter recon webpack --url=https://app.target.com/main.js
keyhunter recon swagger --shodan                  # Exposed Swagger UIs
keyhunter recon deploys --domain=targetcorp       # Vercel/Netlify previews

# ── Forums ──
keyhunter recon stackoverflow --query="sk-proj-"
keyhunter recon reddit --subreddit=openai --query="api key"
keyhunter recon hackernews --query="leaked api key"
keyhunter recon telegram-groups --query="free api key"

# ── Collaboration ──
keyhunter recon notion --query="API_KEY"          # Google dorked
keyhunter recon confluence --shodan               # Exposed instances
keyhunter recon trello --query="openai api key"

# ── Log Aggregators ──
keyhunter recon elasticsearch --shodan            # Exposed ES instances
keyhunter recon grafana --shodan
keyhunter recon sentry --shodan

# ── Threat Intelligence ──
keyhunter recon virustotal --query="sk-proj-"
keyhunter recon intelx --query="sk-ant-api03"     # Intelligence X
keyhunter recon urlhaus --query="openai"

# ── Mobile Apps ──
keyhunter recon apk --query="ai chatbot"          # APK download + decompile

# ── DNS/Subdomain ──
keyhunter recon crtsh --domain=targetcorp.com     # Cert transparency
keyhunter recon subdomain --domain=targetcorp.com --probe-configs

# ── Full Sweep ──
keyhunter recon full --providers=openai,anthropic  # ALL 80+ sources parallel
keyhunter recon full --categories=code,cloud       # Category-filtered sweep

# ── Dork Management ──
keyhunter dorks list                               # All dorks across all sources
keyhunter dorks list --source=github
keyhunter dorks list --source=google
keyhunter dorks add github 'filename:.env "GROQ_API_KEY"'
keyhunter dorks run google --category=frontier     # Run Google dorks for frontier providers
keyhunter dorks export

Viewing Full API Keys

By default, keys are masked in the terminal (shoulder-surfing protection). Ways to access the full key:

# 1. Show full keys in the CLI with the --unmask flag
keyhunter scan path . --unmask
#  Provider    | Key                                          | Confidence | File          | Line | Status
# ─────────────┼──────────────────────────────────────────────┼────────────┼───────────────┼──────┼────────
#  OpenAI      | sk-proj-abc123def456ghi789jkl012mno345pqr678 | HIGH       | src/config.py | 42   | ACTIVE

# 2. JSON export: always contains the full key
keyhunter scan path . --output=json > results.json

# 3. Key management command: manage all discovered keys
keyhunter keys list                   # Masked list
keyhunter keys list --unmask          # List with full keys
keyhunter keys show <id>              # Full details for a single key (always unmasked)
keyhunter keys copy <id>              # Copy the key to the clipboard
keyhunter keys export --format=json   # Export all keys with their full values
keyhunter keys verify <id>            # Verify the key and show full details

# 4. Web Dashboard: "Reveal Key" button on the /keys/:id page
# 5. Telegram Bot: full key via the /key <id> command

Example keyhunter keys show output:

 ID:          a3f7b2c1
 Provider:    OpenAI
 Pattern:     OpenAI Project Key
 Key:         sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234
 Confidence:  HIGH
 Source:      src/config.py:42
 Found:       2026-04-04 14:32:01
 Scan ID:     scan_001
 Status:      ACTIVE (verified 2026-04-04 14:32:05)
 Org:         my-org
 Rate Limit:  500 req/min
 Revoke URL:  https://platform.openai.com/api-keys

Verify a Single Key

keyhunter verify sk-proj-abc123...
# Output:
# Provider:  OpenAI
# Status:    ACTIVE
# Org:       my-org
# Rate Limit: 500 req/min
# Revoke:    https://platform.openai.com/api-keys

Import External Tools

# Run TruffleHog, then enrich with KeyHunter
trufflehog git . --json > trufflehog.json
keyhunter import trufflehog trufflehog.json --verify

# Run Gitleaks, then enrich
gitleaks detect -r gitleaks.json
keyhunter import gitleaks gitleaks.json

Web Dashboard & Telegram Bot

# Start web dashboard
keyhunter serve --port=8080

# Start with Telegram bot
keyhunter serve --port=8080 --telegram

# Configure Telegram
keyhunter config set telegram.token "YOUR_BOT_TOKEN"
keyhunter config set telegram.chat_id "YOUR_CHAT_ID"

CI/CD Integration

KeyHunter ships with a git pre-commit hook that blocks leaks before they land in history, a GitHub Actions integration that uploads SARIF findings directly into the repository's Code Scanning tab, and an import command that consolidates TruffleHog and Gitleaks output into one normalized database.

# Install pre-commit hook (scans staged files only)
keyhunter hook install

# GitHub Actions (SARIF output for Code Scanning upload)
keyhunter scan . --output sarif > keyhunter.sarif

# Import findings from other scanners
keyhunter import --format=trufflehog trufflehog.json
keyhunter import --format=gitleaks   gitleaks.json

# Exit codes: 0 = clean, 1 = keys found, 2 = error
keyhunter scan . && echo "Clean" || echo "Keys found or scan error (exit code $?)"

See docs/CI-CD.md for the full guide, including a copy-paste GitHub Actions workflow and the pre-commit hook install/uninstall lifecycle.

Scheduled Scanning

# Daily GitHub recon at 09:00
keyhunter schedule add \
  --name="daily-github" \
  --cron="0 9 * * *" \
  --command="recon github --dork=auto" \
  --notify=telegram

# Hourly paste site monitoring
keyhunter schedule add \
  --name="hourly-paste" \
  --cron="0 * * * *" \
  --command="recon paste --sources=pastebin" \
  --notify=telegram

keyhunter schedule list
keyhunter schedule remove daily-github

Configuration

# Initialize config
keyhunter config init
# Creates ~/.keyhunter.yaml

# Set API keys for recon sources
keyhunter config set shodan.apikey "YOUR_SHODAN_KEY"
keyhunter config set censys.api_id "YOUR_CENSYS_ID"
keyhunter config set censys.api_secret "YOUR_CENSYS_SECRET"
keyhunter config set github.token "YOUR_GITHUB_TOKEN"
keyhunter config set gitlab.token "YOUR_GITLAB_TOKEN"
keyhunter config set zoomeye.apikey "YOUR_ZOOMEYE_KEY"
keyhunter config set fofa.email "YOUR_FOFA_EMAIL"
keyhunter config set fofa.apikey "YOUR_FOFA_KEY"
keyhunter config set netlas.apikey "YOUR_NETLAS_KEY"
keyhunter config set binaryedge.apikey "YOUR_BINARYEDGE_KEY"
keyhunter config set google.cx "YOUR_GOOGLE_CX_ID"
keyhunter config set google.apikey "YOUR_GOOGLE_API_KEY"
keyhunter config set bing.apikey "YOUR_BING_API_KEY"
keyhunter config set brave.apikey "YOUR_BRAVE_API_KEY"
keyhunter config set virustotal.apikey "YOUR_VT_KEY"
keyhunter config set intelx.apikey "YOUR_INTELX_KEY"
keyhunter config set grayhat.apikey "YOUR_GRAYHAT_KEY"
keyhunter config set reddit.client_id "YOUR_REDDIT_ID"
keyhunter config set reddit.client_secret "YOUR_REDDIT_SECRET"
keyhunter config set stackoverflow.apikey "YOUR_SO_KEY"
keyhunter config set kaggle.username "YOUR_KAGGLE_USER"
keyhunter config set kaggle.apikey "YOUR_KAGGLE_KEY"

# Set notification channels
keyhunter config set telegram.token "YOUR_BOT_TOKEN"
keyhunter config set telegram.chat_id "YOUR_CHAT_ID"
keyhunter config set webhook.url "https://your-webhook.com/alert"

# Database encryption
keyhunter config set db.password "YOUR_DB_PASSWORD"

Config File (~/.keyhunter.yaml)

scan:
  workers: 8
  verify_timeout: 10s
  default_output: table
  respect_robots: true

recon:
  stealth: false
  rate_limits:
    github: 30        # req/min
    shodan: 1         # req/sec
    censys: 5         # req/sec
    zoomeye: 10       # req/sec
    fofa: 1           # req/sec
    netlas: 1         # req/sec
    google: 100       # req/day (Custom Search API)
    bing: 3           # req/sec
    stackoverflow: 30 # req/sec
    hackernews: 100   # req/min
    paste: 0.5        # req/sec
    npm: 10           # req/sec
    pypi: 5           # req/sec
    virustotal: 4     # req/min (free tier)
    intelx: 10        # req/day (free tier)
    grayhat: 5        # req/sec
    wayback: 15       # req/min
    trello: 10        # req/sec
    devto: 1          # req/sec

telegram:
  token: "encrypted:..."
  chat_id: "123456789"
  auto_notify: true

web:
  port: 8080
  auth:
    enabled: false
    username: admin
    password: "encrypted:..."

db:
  path: ~/.keyhunter/keyhunter.db
  encrypted: true

Supported Providers (108)

Tier 1 — Frontier

Provider | Key Pattern | Confidence | Verify
OpenAI | sk-proj-*, sk-svcacct-* | High | GET /v1/models
Anthropic | sk-ant-api03-* | High | GET /v1/models
Google AI (Gemini) | AIza* | High | GET /v1/models
Google Vertex AI | OAuth token | Medium | GET /v1/models
AWS Bedrock | AKIA* | High | GetFoundationModel
Azure OpenAI | 32-char hex | Medium | GET /openai/deployments
Meta AI | meta-llama-* | Medium | GET /v1/models
xAI (Grok) | xai-* | High | GET /v1/models
Cohere | co-* | High | GET /v1/models
Mistral AI | 32-char generic | Low | GET /v1/models
Inflection AI | Generic UUID | Low | GET /api/models
AI21 Labs | Generic key | Low | GET /v1/models

Tier 2 — Inference Platforms

Provider | Key Pattern | Confidence | Verify
Together AI | Generic key | Low | GET /v1/models
Fireworks AI | fw_* | High | GET /v1/models
Groq | gsk_* | High | GET /openai/v1/models
Replicate | r8_* | High | GET /v1/predictions
Anyscale | Generic key | Low | GET /v1/models
DeepInfra | Generic key | Low | GET /v1/models
Lepton AI | lpt_* | High | GET /v1/models
Modal | Generic token | Low | GET /api/apps
Baseten | Generic key | Low | GET /v1/models
Cerebrium | Generic key | Low | GET /v1/models
NovitaAI | Generic key | Low | GET /v1/models
Sambanova | Generic key | Low | GET /v1/models
OctoAI | Generic key | Low | GET /v1/models
Friendli AI | Generic key | Low | GET /v1/models

Tier 3 — Specialized/Vertical

Provider | Key Pattern | Confidence | Verify
Perplexity | pplx-* | High | GET /chat/completions
You.com | Generic key | Low | GET /v1/search
Voyage AI | voy-* | High | GET /v1/models
Jina AI | jina_* | High | GET /v1/models
Unstructured | Generic key | Low | GET /general/v0/general
AssemblyAI | Generic key | Low | GET /v2/transcript
Deepgram | Generic key | Low | GET /v1/projects
ElevenLabs | el_* | High | GET /v1/user
Stability AI | sk-* | Medium | GET /v1/engines/list
Runway ML | Generic key | Low | GET /v1/models
Midjourney | Generic key | Low | N/A
HuggingFace | hf_* | High | GET /api/whoami
Tier 4 — Chinese/Regional

Provider | Key Pattern | Confidence | Verify
DeepSeek | sk-* | Medium | GET /v1/models
Baichuan | Generic key | Low | GET /v1/models
Zhipu AI (GLM) | Generic key | Low | POST /api/paas/v4/chat
Moonshot AI (Kimi) | sk-* | Medium | GET /v1/models
Yi (01.AI) | Generic key | Low | GET /v1/models
Qwen (Alibaba) | sk-* | Medium | GET /v1/models
Baidu (ERNIE) | API Key + Secret | Medium | Token endpoint
ByteDance (Doubao) | Generic key | Low | GET /v1/models
SenseTime | Generic key | Low | GET /v1/models
iFlytek (Spark) | API Key + Secret | Medium | WebSocket handshake
MiniMax | Generic key | Low | GET /v1/models
Stepfun | Generic key | Low | GET /v1/models
360 AI | Generic key | Low | GET /v1/models
Kuaishou (Kling) | Generic key | Low | GET /v1/models
Tencent Hunyuan | SecretId + SecretKey | Medium | DescribeModels
SiliconFlow | sf_* | High | GET /v1/models

Tier 5 — Infrastructure/Gateway

Provider | Key Pattern | Confidence | Verify
Cloudflare AI | Cloudflare API token | Medium | GET /ai/models
Vercel AI | vercel_* | High | GET /v1/models
LiteLLM | Generic key | Low | GET /v1/models
Portkey | Generic key | Low | GET /v1/models
Helicone | sk-helicone-* | High | GET /v1/models
OpenRouter | sk-or-* | High | GET /api/v1/models
Martian | Generic key | Low | GET /v1/models
AI Gateway (Kong) | Generic key | Low | Health endpoint
BricksAI | Generic key | Low | GET /v1/models
Aether | Generic key | Low | GET /v1/models
Not Diamond | Generic key | Low | GET /v1/models

Tier 6 — Emerging/Niche

Provider | Key Pattern | Confidence | Verify
Reka AI | Generic key | Low | GET /v1/models
Aleph Alpha | Generic key | Low | GET /models
Writer | Generic key | Low | GET /v1/models
Jasper AI | Generic key | Low | N/A
Typeface | Generic key | Low | N/A
Comet ML | Generic key | Low | GET /api/rest/v2
Weights & Biases | Generic key | Low | GET /api/v1/viewer
LangSmith | ls__* | High | GET /api/v1/info
Pinecone | Generic key | Low | GET /databases
Weaviate | Generic key | Low | GET /v1/meta
Qdrant | Generic key | Low | GET /collections
Chroma | Generic key | Low | GET /api/v1/heartbeat
Milvus | Generic key | Low | GET /v1/vector/collections
Neon AI | Generic key | Low | N/A
Lamini | Generic key | Low | GET /v1/models
Tier 7 — Code & Dev Tools

Provider | Key Pattern | Confidence | Verify
GitHub Copilot | ghu_*, ghp_* | High | GET /user
Cursor | Generic key | Low | N/A
Tabnine | Generic key | Low | N/A
Codeium/Windsurf | Generic key | Low | N/A
Sourcegraph Cody | sgp_* | High | GET /.api/current-user
Amazon CodeWhisperer | AKIA* | High | STS GetCallerIdentity
Replit AI | Generic key | Low | N/A
Codestral (Mistral) | Generic key | Low | GET /v1/models
IBM watsonx.ai | ibm_* | Medium | IAM token endpoint
Oracle AI | Generic key | Low | N/A

Tier 8 — Self-Hosted/Open Infra

Provider | Key Pattern | Confidence | Verify
Ollama | N/A (local) | N/A | GET /api/tags
vLLM | Generic key | Low | GET /v1/models
LocalAI | Generic key | Low | GET /v1/models
LM Studio | N/A (local) | N/A | GET /v1/models
llama.cpp | N/A (local) | N/A | GET /health
GPT4All | N/A (local) | N/A | N/A
text-generation-webui | Generic key | Low | GET /v1/models
TensorRT-LLM | N/A | N/A | Health endpoint
Triton Inference Server | N/A | N/A | GET /v2/health/ready
Jan AI | N/A (local) | N/A | GET /v1/models

Tier 9 — Enterprise/Legacy

Provider | Key Pattern | Confidence | Verify
Salesforce Einstein | Generic token | Low | REST API
ServiceNow AI | Generic token | Low | REST API
SAP AI Core | OAuth token | Low | Token endpoint
Palantir AIP | Generic token | Low | REST API
Databricks (DBRX) | dapi* | High | GET /api/2.0/clusters
Snowflake Cortex | JWT token | Medium | SQL endpoint
Oracle Generative AI | Generic key | Low | REST API
HPE GreenLake AI | Generic token | Low | REST API
HPE GreenLake AI Generic token Low REST API

Architecture

                    +------------------+
                    |   CLI (Cobra)    |
                    +--------+---------+
                             |
              +--------------+--------------+
              |              |              |
     +--------v--+   +------v-----+  +-----v------+
     | Input      |   | Recon      |  | Import     |
     | Adapters   |   | Engine     |  | Adapters   |
     | - file     |   | (80+ src)  |  | - trufflehog|
     | - git      |   | - IoT (6)  |  | - gitleaks |
     | - stdin    |   | - Code(16) |  | - generic  |
     | - url      |   | - Search(5)|  +-----+------+
     | - clipboard|   | - Paste(8+)|        |
     +--------+---+   | - Pkg (8)  |        |
              |        | - Cloud(7) |        |
              |        | - CI/CD(5) |        |
              |        | - Arch (2) |        |
              |        | - Forum(7) |        |
              |        | - Collab(4)|        |
              |        | - JS/FE(5) |        |
              |        | - Logs (3) |        |
              |        | - Intel(3) |        |
              |        | - Mobile(1)|        |
              |        | - DNS (2)  |        |
              |        | - API (3)  |        |
              |        +------+-----+        |
              |               |              |
              +-------+-------+--------------+
                      |
              +-------v--------+
              | Scanner Engine |
              | - matcher.go   |
              | - verifier.go  |
              +-------+--------+
                      |
         +------------+-------------+
         |            |             |
   +-----v----+ +----v-----+ +----v-------+
   | Output   | | Notify   | | Web        |
   | - table  | | - telegram| | Dashboard  |
   | - json   | | - webhook| | - htmx     |
   | - sarif  | | - slack  | | - REST API |
   | - csv    | +----------+ | - SQLite   |
   +----------+              +------------+

   +------------------------------------------+
   | Provider Registry (108+ YAML providers)  |
   | Dork Registry (50+ YAML dorks)           |
   +------------------------------------------+

Key Design Decisions

  • YAML Providers — Adding a new provider = adding a YAML file. No recompile needed for pattern-only changes (when using external provider dir). Built-in providers are embedded at compile time.
  • Keyword Pre-filtering — Before running regex, files are scanned for keywords. This provides ~10x speedup on large codebases.
  • Worker Pool — Parallel scanning with configurable worker count. Default: CPU count.
  • Delta-based Git Scanning — Only scans changes between commits, not entire trees.
  • SQLite Storage — All scan results persisted with AES-256 encryption.
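Keyword pre-filtering can be illustrated with a small Go sketch: a cheap literal search decides whether the regex runs at all. The `detector` type, `newDetector`, and `scan` are hypothetical names, and the pattern and keywords are borrowed from the example provider YAML in the Contributing section; none of this is taken from KeyHunter's matcher.go.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// detector pairs cheap literal keywords with a more expensive regex.
type detector struct {
	name     string
	keywords [][]byte
	pattern  *regexp.Regexp
}

// newDetector compiles the pattern once and stores the keywords as bytes.
func newDetector(name, pattern string, keywords ...string) detector {
	d := detector{name: name, pattern: regexp.MustCompile(pattern)}
	for _, kw := range keywords {
		d.keywords = append(d.keywords, []byte(kw))
	}
	return d
}

// scan runs the regex only when at least one keyword occurs in data.
// It returns nil when the pre-filter rejects the input.
func (d detector) scan(data []byte) []string {
	hit := false
	for _, kw := range d.keywords {
		if bytes.Contains(data, kw) {
			hit = true
			break
		}
	}
	if !hit {
		return nil // no keyword present: skip the regex entirely
	}
	var out []string
	for _, m := range d.pattern.FindAll(data, -1) {
		out = append(out, string(m))
	}
	return out
}

func main() {
	d := newDetector("your-provider", `\byp_[A-Za-z0-9]{32}\b`, "yp_", "YOUR_PROVIDER_API_KEY")

	files := map[string][]byte{
		"README.md": []byte("Nothing sensitive in this file.\n"),
		".env":      []byte("YOUR_PROVIDER_API_KEY=yp_0123456789abcdef0123456789abcdef\n"),
	}
	for name, data := range files {
		fmt.Println(name, "matches:", len(d.scan(data)))
	}
}
```

The speedup depends on how rarely keywords occur in the scanned tree; files without any keyword cost one substring search per keyword instead of a full regex pass.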

Security & Ethics

Built-in Protections

  • Key values masked by default in terminal (first 8 + last 4 chars) — use --unmask for full keys
  • Full keys always available via: --unmask, --output=json, keyhunter keys show, web dashboard, Telegram bot
  • Database is AES-256 encrypted (full keys stored encrypted)
  • API tokens stored encrypted in config
  • No key values written to logs during --verify
  • Web dashboard supports basic auth / token auth
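The masking rule (first 8 plus last 4 characters) is simple enough to sketch in Go. The `maskKey` name, the `...` placeholder, and the handling of very short keys are assumptions for illustration; the README does not say how KeyHunter renders the hidden middle section.

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey keeps the first 8 and last 4 characters of a key and hides the rest.
// Keys too short to mask meaningfully are hidden completely.
// It slices bytes, which is fine for ASCII API keys.
func maskKey(key string) string {
	const head, tail = 8, 4
	if len(key) <= head+tail {
		return strings.Repeat("*", len(key))
	}
	return key[:head] + "..." + key[len(key)-tail:]
}

func main() {
	fmt.Println(maskKey("sk-proj-abc123def456ghi789jkl012mno345pqr678"))
	fmt.Println(maskKey("short"))
}
```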

Rate Limiting

Source | Rate Limit
GitHub API (auth) | 30 req/min
GitHub API (unauth) | 10 req/min
Shodan | Per API plan
Censys | 250 queries/day (free)
ZoomEye | 10,000 results/month (free)
FOFA | 100 results/query (free)
Netlas | 50 queries/day (free)
Google Custom Search | 100/day free, 10K/day paid
Bing Search | 1,000/month (free)
Stack Overflow | 300/day (no key), 10K/day (key)
HN Algolia | 10,000 req/hour
VirusTotal | 4 req/min (free)
IntelX | 10 searches/day (free)
GrayHatWarfare | Per plan
Wayback Machine | ~15 req/min
Paste sites | 1 req/2sec
npm/PyPI | Generous, be respectful
Trello | 100 req/10sec
Docker Hub | 100 pulls/6hr (unauth)

Stealth & Ethics Flags

--stealth           # User-agent rotation, increased request spacing
--respect-robots    # Respect robots.txt (default: on)

Use Cases

Red Team / Pentest

# Full multi-source recon against a target org
keyhunter recon github --query="targetcorp OPENAI_API_KEY"
keyhunter recon gitlab --query="targetcorp api_key"
keyhunter recon shodan --dork='http.html:"targetcorp" "sk-"'
keyhunter recon censys --query='services.http.response.body:"targetcorp" AND "api_key"'
keyhunter recon zoomeye --query='site:targetcorp.com +"api_key"'
keyhunter recon elasticsearch --shodan   # Find exposed ES with leaked keys
keyhunter recon jenkins --shodan         # Exposed Jenkins with build logs
keyhunter recon dotenv --domain-list=targetcorp-subdomains.txt  # .env exposure
keyhunter recon wayback --domain=targetcorp.com  # Historical leaks
keyhunter recon sourcemaps --domain=app.targetcorp.com  # JS source maps
keyhunter recon crtsh --domain=targetcorp.com  # Discover API subdomains
keyhunter recon full --providers=openai,anthropic  # Everything at once

DevSecOps / CI Pipeline

# Pre-commit hook
keyhunter hook install

# GitHub Actions step
- name: KeyHunter Scan
  run: |
    keyhunter scan path . --output=sarif > keyhunter.sarif
    # Upload to GitHub Security tab

Bug Bounty

# Comprehensive target recon
keyhunter recon github --org=targetcorp --dork=auto --verify
keyhunter recon gist --query="targetcorp"
keyhunter recon paste --sources=all --query="targetcorp"
keyhunter recon postman --query="targetcorp"
keyhunter recon trello --query="targetcorp api key"
keyhunter recon notion --query="targetcorp API_KEY"
keyhunter recon confluence --shodan
keyhunter recon npm --query="targetcorp"   # Check their published packages
keyhunter recon pypi --query="targetcorp"
keyhunter recon docker --query="targetcorp" --layers  # Docker image layer scan
keyhunter recon apk --query="targetcorp"   # Mobile app decompile
keyhunter recon swagger --domain=api.targetcorp.com

Monitoring / Alerting

# Continuous monitoring with Telegram alerts
keyhunter schedule add \
  --name="monitor-github" \
  --cron="*/30 * * * *" \
  --command="recon github --dork=auto --providers=openai" \
  --notify=telegram

keyhunter serve --telegram

Dork Examples (150+ Built-in)

GitHub

filename:.env "OPENAI_API_KEY"
filename:.env "ANTHROPIC_API_KEY"
filename:config.yaml "api_key" "sk-"
"sk-proj-" language:python
"sk-ant-api03" language:javascript
filename:docker-compose "API_KEY"
"api_key" extension:ipynb
filename:.toml "api_key" "sk-"
filename:terraform.tfvars "api_key"
"kind: Secret" "data:" filename:*.yaml          # K8s secrets
filename:.npmrc "_authToken"                     # npm tokens
filename:requirements.txt "openai" path:.env     # Python projects

GitLab

"OPENAI_API_KEY" filename:.env
"sk-ant-" filename:*.py
"api_key" filename:settings.json

Google Dorking

"sk-proj-" -github.com -stackoverflow.com        # Outside known code sites
"sk-ant-api03-" filetype:env
"OPENAI_API_KEY" filetype:yml
"ANTHROPIC_API_KEY" filetype:json
inurl:.env "API_KEY"
intitle:"index of" .env
site:pastebin.com "sk-proj-"
site:replit.com "OPENAI_API_KEY"
site:codesandbox.io "sk-ant-"
site:notion.so "API_KEY"
site:trello.com "openai"
site:docs.google.com "sk-proj-"
site:medium.com "ANTHROPIC_API_KEY"
site:dev.to "sk-proj-"
site:huggingface.co "OPENAI_API_KEY"
site:kaggle.com "api_key" "sk-"
intitle:"Swagger UI" "api_key"
inurl:graphql "authorization" "Bearer sk-"
filetype:tfstate "api_key"                       # Terraform state
filetype:ipynb "sk-proj-"                        # Jupyter notebooks

Shodan

http.html:"openai" "api_key" port:8080
http.title:"LiteLLM" port:4000
http.html:"ollama" port:11434
http.title:"Kubernetes Dashboard"
"X-Jenkins" "200 OK"
http.title:"Kibana" port:5601
http.title:"Grafana"
http.title:"Swagger UI"
http.title:"Gitea" port:3000
http.html:"PrivateBin"
http.title:"MinIO Browser"
http.title:"Sentry"
http.title:"Confluence"
port:6443 "kube-apiserver"
http.html:"langchain" port:8000

Censys

services.http.response.body:"openai" and services.http.response.body:"sk-"
services.http.response.body:"langchain" and services.port:8000
services.http.response.body:"OPENAI_API_KEY"
services.http.response.body:"sk-ant-api03"

ZoomEye

app:"Elasticsearch" +"api_key"
app:"Jenkins" +openai
app:"Grafana" +anthropic
app:"Gitea"

FOFA

body="sk-proj-"
body="OPENAI_API_KEY"
body="sk-ant-api03"
title="LiteLLM"
title="Swagger UI" && body="api_key"
title="Kibana" && body="authorization"

Contributing

Adding a New Provider

  1. Create providers/your-provider.yaml:
id: your-provider
name: Your Provider
category: emerging
website: https://api.yourprovider.com
confidence: medium

patterns:
  - id: your-provider-key
    name: "Your Provider API Key"
    regex: '\byp_[A-Za-z0-9]{32}\b'
    confidence: high
    description: "Your Provider API key with yp_ prefix"

keywords:
  - "yp_"
  - "YOUR_PROVIDER_API_KEY"

verify:
  enabled: true
  method: GET
  url: "https://api.yourprovider.com/v1/models"
  headers:
    Authorization: "Bearer {{key}}"
  success_codes: [200]
  failure_codes: [401, 403]

metadata:
  docs: "https://docs.yourprovider.com"
  key_url: "https://dashboard.yourprovider.com/keys"
  env_vars: ["YOUR_PROVIDER_API_KEY"]
  2. Run tests: go test ./pkg/provider/...
  3. Submit a PR
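To make the verify block of the provider YAML concrete, here is a minimal Go sketch of how such a spec could drive a verification request. The `verifySpec` type, the helper functions, and the ACTIVE/INACTIVE/UNKNOWN labels are hypothetical and not KeyHunter's actual verifier; a local `httptest` server stands in for the provider API so the example runs offline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// verifySpec mirrors the fields of the YAML verify block (hypothetical type).
type verifySpec struct {
	Method       string
	URL          string
	Headers      map[string]string
	SuccessCodes []int
	FailureCodes []int
}

// renderHeader substitutes the candidate key into a header template.
func renderHeader(tmpl, key string) string {
	return strings.ReplaceAll(tmpl, "{{key}}", key)
}

// classify maps an HTTP status code onto a verification result.
func classify(spec verifySpec, status int) string {
	for _, c := range spec.SuccessCodes {
		if status == c {
			return "ACTIVE"
		}
	}
	for _, c := range spec.FailureCodes {
		if status == c {
			return "INACTIVE"
		}
	}
	return "UNKNOWN"
}

// check sends one request for the key and classifies the response status.
func check(client *http.Client, spec verifySpec, key string) (string, error) {
	req, err := http.NewRequest(spec.Method, spec.URL, nil)
	if err != nil {
		return "", err
	}
	for name, tmpl := range spec.Headers {
		req.Header.Set(name, renderHeader(tmpl, key))
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return classify(spec, resp.StatusCode), nil
}

func main() {
	// Local stand-in for the provider API: accepts exactly one bearer token.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "Bearer yp_valid" {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusUnauthorized)
	}))
	defer srv.Close()

	spec := verifySpec{
		Method:       "GET",
		URL:          srv.URL + "/v1/models",
		Headers:      map[string]string{"Authorization": "Bearer {{key}}"},
		SuccessCodes: []int{200},
		FailureCodes: []int{401, 403},
	}
	for _, key := range []string{"yp_valid", "yp_revoked"} {
		status, err := check(srv.Client(), spec, key)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		fmt.Println(key, "->", status)
	}
}
```

Status codes listed in neither success_codes nor failure_codes (a 429 or 500, for example) fall through to UNKNOWN in this sketch rather than being treated as a dead key.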

Adding a New Dork

  1. Edit dorks/<source>.yaml and add your dork entry
  2. Submit a PR

Roadmap

  • Core scanning engine (file, git, stdin)
  • 108 provider YAML definitions
  • Active verification for all providers
  • CLI with Cobra (scan, verify, import, recon, serve)
  • TruffleHog & Gitleaks import adapters
  • OSINT/Recon engine (Shodan, Censys, GitHub, GitLab, Paste, S3)
  • Built-in dork engine with 50+ dorks
  • Web dashboard (htmx + Tailwind + SQLite)
  • Telegram bot with auto-notifications
  • Scheduled scanning (cron-based)
  • Pre-commit hook & CI/CD integration (SARIF)
  • Docker image
  • Homebrew formula

Disclaimer

KeyHunter is designed for authorized security testing, defensive security, bug bounty programs, and educational purposes only. Always ensure you have proper authorization before scanning any target. Unauthorized access to computer systems is illegal.


License

MIT License - see LICENSE for details.
