# KeyHunter

> The most comprehensive API key scanner for LLM/AI providers. Detect, validate, and monitor leaked API keys across 108+ providers.

[](https://golang.org)
[](LICENSE)
[](providers/)

---

## Why KeyHunter?

Existing tools like TruffleHog (~3 LLM detectors) and Gitleaks (~5 LLM rules) were built for general secret scanning. AI-related credential leaks grew **81% year-over-year** in 2025, yet no tool covers more than ~15 LLM providers.

**KeyHunter fills that gap** with 108+ provider-specific detectors, active key validation, OSINT/recon capabilities, and real-time notifications.

### How It Compares

| Feature | KeyHunter | TruffleHog | Gitleaks | detect-secrets |
|---------|-----------|------------|----------|----------------|
| LLM Providers | **108+** | ~3 | ~5 | ~1 |
| Active Verification | **108+ endpoints** | ~20 types | No | No |
| OSINT/Recon | **Shodan, Censys, GitHub, GitLab, Paste, S3** | No | No | No |
| External Tool Import | **TruffleHog + Gitleaks** | - | - | - |
| Web Dashboard | **Built-in** | No | No | No |
| Telegram Bot | **Built-in** | No | No | No |
| Dork Engine | **Built-in YAML dorks** | No | No | No |
| Provider YAML Plugin | **Community-extensible** | Go code only | TOML rules | Python plugins |
| Scheduled Scanning | **Cron-based** | No | No | No |

---

## Features

### Core Scanning
- **File/Directory scanning** with recursive traversal and glob exclusions
- **Git-aware scanning** — full history, branches, stash, delta-based diffs
- **stdin/pipe** support — `cat dump.txt | keyhunter scan stdin`
- **URL fetching** — scan any remote URL content
- **Clipboard scanning** — instant clipboard content analysis

### OSINT / Recon Engine (80+ Sources, 18 Categories)

**IoT & Internet Scanners**
- **Shodan** — exposed LLM proxies, dashboards, API endpoints
- **Censys** — HTTP body search for leaked credentials
- **ZoomEye** — Chinese IoT scanner, different coverage perspective
- **FOFA** — Asian infrastructure scanning, body content search
- **Netlas** — HTTP response body keyword search
- **BinaryEdge** — internet-wide scan data

**Code Hosting & Snippets**
- **GitHub / GitLab / Bitbucket** — code search with automated dorks
- **Codeberg / Gitea instances** — alternative Git platforms (Gitea auto-discovered via Shodan)
- **Replit / CodeSandbox / StackBlitz / Glitch** — interactive dev environments with hardcoded keys
- **CodePen / JSFiddle / Observable** — browser snippet platforms
- **HuggingFace** — Spaces, repos, model configs (high-yield for LLM keys)
- **Kaggle** — notebooks and datasets with API keys
- **Jupyter / nbviewer** — shared notebooks
- **GitHub Gist** — public gist search
- **Gitpod** — workspace snapshots

**Search Engine Dorking**
- **Google** — Custom Search API / SerpAPI, 100+ built-in dorks
- **Bing** — Azure Cognitive Services search
- **DuckDuckGo / Yandex / Brave** — alternative indexes for broader coverage

**Paste Sites**
- **Multi-paste aggregator** — Pastebin, dpaste, paste.ee, rentry, hastebin, ix.io, and more

**Package Registries**
- **npm / PyPI / RubyGems / crates.io / Maven / NuGet / Packagist / Go modules** — download packages, extract source, scan for key patterns

**Container & Infrastructure**
- **Docker Hub** — image layer scanning, build arg extraction
- **Kubernetes** — exposed dashboards, public Secret/ConfigMap YAML files
- **Terraform** — state files (`.tfstate` with plaintext secrets), registry modules
- **Helm Charts / Ansible Galaxy** — default values with credentials

**Cloud Storage**
- **AWS S3 / GCS / Azure Blob / DigitalOcean Spaces / Backblaze B2** — bucket enumeration and content scanning
- **MinIO** — self-hosted instances discovered via Shodan
- **GrayHatWarfare** — searchable database of public bucket objects

**CI/CD Log Leaks**
- **Travis CI / CircleCI** — public build logs with leaked env vars
- **GitHub Actions** — workflow run log scanning
- **Jenkins** — exposed instances (Shodan-discovered), console output
- **GitLab CI/CD** — public pipeline job traces

**Web Archives**
- **Wayback Machine** — historical snapshots of removed `.env` files, config pages
- **CommonCrawl** — massive web crawl data, WARC record scanning

**Forums & Documentation**
- **Stack Overflow** — API + SEDE queries for code snippets with real keys
- **Reddit** — programming subreddit scanning
- **Hacker News** — Algolia API comment search
- **dev.to / Medium** — tutorial articles with hardcoded keys
- **Telegram groups** — public channels sharing configs and "free API keys"
- **Discord** — indexed public server content

**Collaboration Tools**
- **Notion / Confluence** — public pages and spaces with credentials
- **Trello** — public boards with API key cards
- **Google Docs/Sheets** — publicly shared documents

**Frontend & JavaScript Leaks**
- **JS Source Maps** — original source recovery with inlined secrets
- **Webpack / Vite bundles** — `REACT_APP_*`, `NEXT_PUBLIC_*`, `VITE_*` variable extraction
- **Exposed `.env` files** — misconfigured web servers serving dotenv from root
- **Swagger / OpenAPI docs** — real auth examples in API docs
- **Vercel / Netlify previews** — deploy preview JS bundles with production secrets

**Log Aggregators**
- **Elasticsearch / Kibana** — exposed instances with application logs containing API keys
- **Grafana** — exposed dashboards with datasource configs
- **Sentry** — error tracking capturing request headers with keys

**Threat Intelligence**
- **VirusTotal** — uploaded files/scripts containing embedded keys
- **Intelligence X** — aggregated paste, darknet, and leak search
- **URLhaus** — malicious URLs with API keys in parameters

**Mobile Apps**
- **APK analysis** — download, decompile, grep for key patterns (via apktool/jadx)

**DNS / Subdomain Discovery**
- **crt.sh** — Certificate Transparency log for API subdomain discovery
- **Subdomain probing** — config endpoint enumeration (`.env`, `/api/config`, `/actuator/env`)

**API Marketplaces**
- **Postman** — public collections, workspaces, environments
- **SwaggerHub** — published API definitions with example values

**`recon full`** — parallel sweep across all 80+ sources with deduplication and unified reporting

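
The "parallel sweep with deduplication" idea can be sketched in plain bash: start every source in the background, wait for all of them, then merge the results and drop duplicates. `parallel_sweep` is a hypothetical helper written for this illustration and is not part of KeyHunter; it assumes each source command prints one finding per line.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run several source commands concurrently,
# then merge their output and remove duplicate findings.
parallel_sweep() {
  local tmp src i=0
  tmp=$(mktemp -d)
  for src in "$@"; do
    # Each argument is a shell command that prints one finding per line.
    bash -c "$src" > "$tmp/$i.out" 2>/dev/null &
    i=$((i + 1))
  done
  wait                          # block until every background source finishes
  cat "$tmp"/*.out | sort -u    # unified, de-duplicated result list
  rm -rf "$tmp"
}

# Two fake sources that both report "key2"; it appears once in the output.
parallel_sweep "echo key1; echo key2" "echo key2"
```

Sorting the merged output is the simplest way to de-duplicate; a real implementation would more likely key on provider plus key hash so that the same key found in two places keeps both locations.
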
### Active Verification
- Lightweight API calls to verify if detected keys are active
- Permission and scope extraction (org, rate limits, model access)
- Configurable via `--verify` flag (off by default)
- Provider-specific verification endpoints

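
As a rough illustration of a "lightweight API call", the sketch below sends one authenticated `GET` to OpenAI's `/v1/models` endpoint and maps the HTTP status to a key state. The function names and the status mapping (200 active, 401 invalid, 403 forbidden, 429 rate-limited) are assumptions made for this example, not KeyHunter's actual verifier logic.

```bash
#!/usr/bin/env bash
# Hypothetical mapping from HTTP status code to key state (an assumption).
classify_status() {
  case "$1" in
    200) echo "ACTIVE" ;;
    401) echo "INVALID" ;;
    403) echo "FORBIDDEN" ;;
    429) echo "RATE_LIMITED" ;;
    *)   echo "UNKNOWN" ;;
  esac
}

# Send a single GET and keep only the status code; the response body is
# discarded and the key itself is never printed.
check_openai_key() {
  local key="$1" code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer ${key}" \
    https://api.openai.com/v1/models)
  classify_status "$code"
}

# Example (performs a real network request):
#   check_openai_key "sk-proj-..."
```

Listing models is a read-only request, which is why it suits verification: it confirms the key is accepted without generating any billable completion.
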
### External Tool Integration
- **Import TruffleHog** JSON output — enrich with LLM-specific analysis
- **Import Gitleaks** JSON output — cross-reference with 108+ providers
- Generic CSV import for custom tool output

### Notifications & Dashboard
- **Telegram Bot** — scan triggers, key alerts, recon results
- **Web Dashboard** — htmx + Tailwind, SQLite-backed, real-time scan viewer
- **Webhook** — generic HTTP POST notifications
- **Slack** — workspace notifications
- **Scheduled scans** — cron-based recurring scans with auto-notify

---

## Quick Start

### Install

```bash
# From source
go install github.com/keyhunter/keyhunter@latest

# Binary release
curl -sSL https://get.keyhunter.dev | bash

# Docker
docker pull keyhunter/keyhunter:latest
```

### Basic Usage

```bash
# Scan a directory
keyhunter scan path ./my-project/

# Scan with active verification
keyhunter scan path ./my-project/ --verify

# Scan git history (last 30 days)
keyhunter scan git . --since="30 days ago"

# Scan from pipe
cat secrets.txt | keyhunter scan stdin

# Scan only specific providers
keyhunter scan path . --providers=openai,anthropic,deepseek

# JSON output
keyhunter scan path . --output=json > results.json
```

### OSINT / Recon

```bash
# ── IoT & Internet Scanners ──
keyhunter recon shodan --dork="http.title:\"LiteLLM\" port:4000"
keyhunter recon censys --query='services.http.response.body:"sk-proj-"'
keyhunter recon zoomeye --query='app:"Elasticsearch" +"api_key"'
keyhunter recon fofa --query='body="OPENAI_API_KEY"'
keyhunter recon netlas --query='http.body:"sk-ant-"'

# ── Code Hosting ──
keyhunter recon github --dork=auto  # All built-in GitHub dorks
keyhunter recon gitlab --dork=auto
keyhunter recon bitbucket --query="OPENAI_API_KEY"
keyhunter recon replit --query="sk-proj-"  # Public repls
keyhunter recon huggingface --spaces --query="api_key"  # HF Spaces
keyhunter recon kaggle --notebooks --query="openai"
keyhunter recon codesandbox --query="sk-ant-"
keyhunter recon glitch --query="ANTHROPIC_API_KEY"
keyhunter recon gitea --instances-from=shodan  # Auto-discover Gitea instances

# ── Search Engine Dorking ──
keyhunter recon google --dork=auto  # 100+ built-in Google dorks
keyhunter recon google --dork='"sk-proj-" -github.com filetype:env'
keyhunter recon bing --dork=auto
keyhunter recon brave --query="OPENAI_API_KEY filetype:yaml"

# ── Package Registries ──
keyhunter recon npm --recent --query="openai"  # Scan new packages
keyhunter recon pypi --recent --query="llm"
keyhunter recon crates --query="api_key"

# ── Cloud Storage ──
keyhunter recon s3 --domain=targetcorp  # S3 bucket enumeration
keyhunter recon gcs --domain=targetcorp  # GCS buckets
keyhunter recon azure --domain=targetcorp  # Azure Blob
keyhunter recon minio --shodan  # Exposed MinIO instances
keyhunter recon grayhat --query="openai api_key"  # GrayHatWarfare search

# ── CI/CD Logs ──
keyhunter recon ghactions --org=targetcorp  # GitHub Actions logs
keyhunter recon travis --org=targetcorp
keyhunter recon jenkins --shodan  # Exposed Jenkins instances
keyhunter recon circleci --org=targetcorp

# ── Web Archives ──
keyhunter recon wayback --domain=targetcorp.com  # Wayback Machine
keyhunter recon commoncrawl --domain=targetcorp.com

# ── Frontend & JS ──
keyhunter recon dotenv --domain-list=targets.txt  # Exposed .env files
keyhunter recon sourcemaps --domain=app.target.com  # JS source maps
keyhunter recon webpack --url=https://app.target.com/main.js
keyhunter recon swagger --shodan  # Exposed Swagger UIs
keyhunter recon deploys --domain=targetcorp  # Vercel/Netlify previews

# ── Forums ──
keyhunter recon stackoverflow --query="sk-proj-"
keyhunter recon reddit --subreddit=openai --query="api key"
keyhunter recon hackernews --query="leaked api key"
keyhunter recon telegram-groups --query="free api key"

# ── Collaboration ──
keyhunter recon notion --query="API_KEY"  # Google dorked
keyhunter recon confluence --shodan  # Exposed instances
keyhunter recon trello --query="openai api key"

# ── Log Aggregators ──
keyhunter recon elasticsearch --shodan  # Exposed ES instances
keyhunter recon grafana --shodan
keyhunter recon sentry --shodan

# ── Threat Intelligence ──
keyhunter recon virustotal --query="sk-proj-"
keyhunter recon intelx --query="sk-ant-api03"  # Intelligence X
keyhunter recon urlhaus --query="openai"

# ── Mobile Apps ──
keyhunter recon apk --query="ai chatbot"  # APK download + decompile

# ── DNS/Subdomain ──
keyhunter recon crtsh --domain=targetcorp.com  # Cert transparency
keyhunter recon subdomain --domain=targetcorp.com --probe-configs

# ── Full Sweep ──
keyhunter recon full --providers=openai,anthropic  # ALL 80+ sources in parallel
keyhunter recon full --categories=code,cloud  # Category-filtered sweep

# ── Dork Management ──
keyhunter dorks list  # All dorks across all sources
keyhunter dorks list --source=github
keyhunter dorks list --source=google
keyhunter dorks add github 'filename:.env "GROQ_API_KEY"'
keyhunter dorks run google --category=frontier  # Run Google dorks for frontier providers
keyhunter dorks export
```

### Viewing Full API Keys

By default, keys are masked in the terminal (protection against shoulder surfing). Ways to access the full key:

```bash
# 1. Show the full key in the CLI with the --unmask flag
keyhunter scan path . --unmask
# Provider     | Key                                          | Confidence | File          | Line | Status
# ─────────────┼──────────────────────────────────────────────┼────────────┼───────────────┼──────┼────────
# OpenAI       | sk-proj-abc123def456ghi789jkl012mno345pqr678 | HIGH       | src/config.py | 42   | ACTIVE

# 2. JSON export: always contains the full key
keyhunter scan path . --output=json > results.json

# 3. Key management command: manage all discovered keys
keyhunter keys list  # Masked list
keyhunter keys list --unmask  # List with full keys
keyhunter keys show <id>  # Full detail for a single key (always unmasked)
keyhunter keys copy <id>  # Copy the key to the clipboard
keyhunter keys export --format=json  # Export all keys with their full values
keyhunter keys verify <id>  # Verify the key and show full detail

# 4. Web Dashboard: "Reveal Key" button on the /keys/:id page
# 5. Telegram Bot: full key via the /key <id> command
```

**Example `keyhunter keys show` output:**
```
ID:          a3f7b2c1
Provider:    OpenAI
Pattern:     OpenAI Project Key
Key:         sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234
Confidence:  HIGH
Source:      src/config.py:42
Found:       2026-04-04 14:32:01
Scan ID:     scan_001
Status:      ACTIVE (verified 2026-04-04 14:32:05)
Org:         my-org
Rate Limit:  500 req/min
Revoke URL:  https://platform.openai.com/api-keys
```

### Verify a Single Key

```bash
keyhunter verify sk-proj-abc123...
# Output:
# Provider: OpenAI
# Status: ACTIVE
# Org: my-org
# Rate Limit: 500 req/min
# Revoke: https://platform.openai.com/api-keys
```

### Import External Tools

```bash
# Run TruffleHog, then enrich with KeyHunter
# (TruffleHog's git scanner expects a URI, so a local repo is file://.)
trufflehog git file://. --json > trufflehog.json
keyhunter import trufflehog trufflehog.json --verify

# Run Gitleaks, then enrich (-r is --report-path; the report format defaults to JSON)
gitleaks detect -r gitleaks.json
keyhunter import gitleaks gitleaks.json
```

### Web Dashboard & Telegram Bot

```bash
# Start web dashboard
keyhunter serve --port=8080

# Start with Telegram bot
keyhunter serve --port=8080 --telegram

# Configure Telegram
keyhunter config set telegram.token "YOUR_BOT_TOKEN"
keyhunter config set telegram.chat_id "YOUR_CHAT_ID"
```

### CI/CD Integration

KeyHunter ships with a git **pre-commit hook** that blocks leaks before they land in
history, a **GitHub Actions** integration that uploads SARIF findings directly into
the repository's Code Scanning tab, and an `import` command that consolidates
TruffleHog and Gitleaks output into one normalized database.

```bash
# Install pre-commit hook (scans staged files only)
keyhunter hook install

# GitHub Actions (SARIF output for Code Scanning upload)
keyhunter scan . --output sarif > keyhunter.sarif

# Import findings from other scanners
keyhunter import --format=trufflehog trufflehog.json
keyhunter import --format=gitleaks gitleaks.json

# Exit codes: 0 = clean, 1 = keys found, 2 = error
keyhunter scan . && echo "Clean" || echo "Keys found!"
```

See [docs/CI-CD.md](docs/CI-CD.md) for the full guide, including a copy-paste
GitHub Actions workflow and the pre-commit hook install/uninstall lifecycle.

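
Note that a `cmd && echo "Clean" || echo "Keys found!"` one-liner prints "Keys found!" for exit code 2 (error) as well as for exit code 1. A CI step that needs to tell the three documented exit codes apart could use a small wrapper like the sketch below; `describe_exit` and `run_scan` are hypothetical helper names, and only the exit-code meanings come from the documentation above.

```bash
#!/usr/bin/env bash
# Translate the documented exit codes (0 = clean, 1 = keys found, 2 = error)
# into a message. Hypothetical helper, not part of KeyHunter.
describe_exit() {
  case "$1" in
    0) echo "clean" ;;
    1) echo "keys found" ;;
    2) echo "scanner error" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Run any scan command, report the outcome on stderr, and pass the
# original exit code through so the CI job can still fail on it.
run_scan() {
  local rc=0
  "$@" || rc=$?
  echo "scan result: $(describe_exit "$rc")" >&2
  return "$rc"
}

# Example (requires keyhunter on PATH):
#   run_scan keyhunter scan . || exit $?
```

Passing the exit code through unchanged lets the pipeline decide separately whether "keys found" should block a merge and whether "scanner error" should trigger a retry.
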
### Scheduled Scanning

```bash
# Daily GitHub recon at 09:00
keyhunter schedule add \
  --name="daily-github" \
  --cron="0 9 * * *" \
  --command="recon github --dork=auto" \
  --notify=telegram

# Hourly paste site monitoring
keyhunter schedule add \
  --name="hourly-paste" \
  --cron="0 * * * *" \
  --command="recon paste --sources=pastebin" \
  --notify=telegram

keyhunter schedule list
keyhunter schedule remove daily-github
```

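
The `--cron` values above use the classic five-field layout (minute, hour, day of month, month, day of week). A quick sanity check before registering a schedule might look like the sketch below; `is_five_field_cron` is a hypothetical helper that only counts fields and does not validate their ranges.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: succeed only if the expression has exactly
# five whitespace-separated fields (minute hour day-of-month month weekday).
is_five_field_cron() {
  local -a fields
  # read splits on whitespace without expanding "*" as a glob.
  read -r -a fields <<< "$1"
  (( ${#fields[@]} == 5 ))
}

if is_five_field_cron "0 9 * * *"; then
  echo "looks like a valid five-field expression"
fi
```
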

---

## Configuration

```bash
# Initialize config
keyhunter config init
# Creates ~/.keyhunter.yaml

# Set API keys for recon sources
keyhunter config set shodan.apikey "YOUR_SHODAN_KEY"
keyhunter config set censys.api_id "YOUR_CENSYS_ID"
keyhunter config set censys.api_secret "YOUR_CENSYS_SECRET"
keyhunter config set github.token "YOUR_GITHUB_TOKEN"
keyhunter config set gitlab.token "YOUR_GITLAB_TOKEN"
keyhunter config set zoomeye.apikey "YOUR_ZOOMEYE_KEY"
keyhunter config set fofa.email "YOUR_FOFA_EMAIL"
keyhunter config set fofa.apikey "YOUR_FOFA_KEY"
keyhunter config set netlas.apikey "YOUR_NETLAS_KEY"
keyhunter config set binaryedge.apikey "YOUR_BINARYEDGE_KEY"
keyhunter config set google.cx "YOUR_GOOGLE_CX_ID"
keyhunter config set google.apikey "YOUR_GOOGLE_API_KEY"
keyhunter config set bing.apikey "YOUR_BING_API_KEY"
keyhunter config set brave.apikey "YOUR_BRAVE_API_KEY"
keyhunter config set virustotal.apikey "YOUR_VT_KEY"
keyhunter config set intelx.apikey "YOUR_INTELX_KEY"
keyhunter config set grayhat.apikey "YOUR_GRAYHAT_KEY"
keyhunter config set reddit.client_id "YOUR_REDDIT_ID"
keyhunter config set reddit.client_secret "YOUR_REDDIT_SECRET"
keyhunter config set stackoverflow.apikey "YOUR_SO_KEY"
keyhunter config set kaggle.username "YOUR_KAGGLE_USER"
keyhunter config set kaggle.apikey "YOUR_KAGGLE_KEY"

# Set notification channels
keyhunter config set telegram.token "YOUR_BOT_TOKEN"
keyhunter config set telegram.chat_id "YOUR_CHAT_ID"
keyhunter config set webhook.url "https://your-webhook.com/alert"

# Database encryption
keyhunter config set db.password "YOUR_DB_PASSWORD"
```

### Config File (`~/.keyhunter.yaml`)

```yaml
scan:
  workers: 8
  verify_timeout: 10s
  default_output: table
  respect_robots: true

recon:
  stealth: false
  rate_limits:
    github: 30  # req/min
    shodan: 1  # req/sec
    censys: 5  # req/sec
    zoomeye: 10  # req/sec
    fofa: 1  # req/sec
    netlas: 1  # req/sec
    google: 100  # req/day (Custom Search API)
    bing: 3  # req/sec
    stackoverflow: 30  # req/sec
    hackernews: 100  # req/min
    paste: 0.5  # req/sec
    npm: 10  # req/sec
    pypi: 5  # req/sec
    virustotal: 4  # req/min (free tier)
    intelx: 10  # req/day (free tier)
    grayhat: 5  # req/sec
    wayback: 15  # req/min
    trello: 10  # req/sec
    devto: 1  # req/sec

telegram:
  token: "encrypted:..."
  chat_id: "123456789"
  auto_notify: true

web:
  port: 8080
  auth:
    enabled: false
    username: admin
    password: "encrypted:..."

db:
  path: ~/.keyhunter/keyhunter.db
  encrypted: true
```

---

## Supported Providers (108)

### Tier 1 — Frontier

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| OpenAI | `sk-proj-*`, `sk-svcacct-*` | High | `GET /v1/models` |
| Anthropic | `sk-ant-api03-*` | High | `GET /v1/models` |
| Google AI (Gemini) | `AIza*` | High | `GET /v1/models` |
| Google Vertex AI | OAuth token | Medium | `GET /v1/models` |
| AWS Bedrock | `AKIA*` | High | `GetFoundationModel` |
| Azure OpenAI | 32-char hex | Medium | `GET /openai/deployments` |
| Meta AI | `meta-llama-*` | Medium | `GET /v1/models` |
| xAI (Grok) | `xai-*` | High | `GET /v1/models` |
| Cohere | `co-*` | High | `GET /v1/models` |
| Mistral AI | 32-char generic | Low | `GET /v1/models` |
| Inflection AI | Generic UUID | Low | `GET /api/models` |
| AI21 Labs | Generic key | Low | `GET /v1/models` |

### Tier 2 — Inference Platforms

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Together AI | Generic key | Low | `GET /v1/models` |
| Fireworks AI | `fw_*` | High | `GET /v1/models` |
| Groq | `gsk_*` | High | `GET /openai/v1/models` |
| Replicate | `r8_*` | High | `GET /v1/predictions` |
| Anyscale | Generic key | Low | `GET /v1/models` |
| DeepInfra | Generic key | Low | `GET /v1/models` |
| Lepton AI | `lpt_*` | High | `GET /v1/models` |
| Modal | Generic token | Low | `GET /api/apps` |
| Baseten | Generic key | Low | `GET /v1/models` |
| Cerebrium | Generic key | Low | `GET /v1/models` |
| NovitaAI | Generic key | Low | `GET /v1/models` |
| Sambanova | Generic key | Low | `GET /v1/models` |
| OctoAI | Generic key | Low | `GET /v1/models` |
| Friendli AI | Generic key | Low | `GET /v1/models` |

### Tier 3 — Specialized/Vertical

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Perplexity | `pplx-*` | High | `GET /chat/completions` |
| You.com | Generic key | Low | `GET /v1/search` |
| Voyage AI | `voy-*` | High | `GET /v1/models` |
| Jina AI | `jina_*` | High | `GET /v1/models` |
| Unstructured | Generic key | Low | `GET /general/v0/general` |
| AssemblyAI | Generic key | Low | `GET /v2/transcript` |
| Deepgram | Generic key | Low | `GET /v1/projects` |
| ElevenLabs | `el_*` | High | `GET /v1/user` |
| Stability AI | `sk-*` | Medium | `GET /v1/engines/list` |
| Runway ML | Generic key | Low | `GET /v1/models` |
| Midjourney | Generic key | Low | N/A |
| HuggingFace | `hf_*` | High | `GET /api/whoami` |

### Tier 4 — Chinese/Regional

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| DeepSeek | `sk-*` | Medium | `GET /v1/models` |
| Baichuan | Generic key | Low | `GET /v1/models` |
| Zhipu AI (GLM) | Generic key | Low | `POST /api/paas/v4/chat` |
| Moonshot AI (Kimi) | `sk-*` | Medium | `GET /v1/models` |
| Yi (01.AI) | Generic key | Low | `GET /v1/models` |
| Qwen (Alibaba) | `sk-*` | Medium | `GET /v1/models` |
| Baidu (ERNIE) | API Key + Secret | Medium | Token endpoint |
| ByteDance (Doubao) | Generic key | Low | `GET /v1/models` |
| SenseTime | Generic key | Low | `GET /v1/models` |
| iFlytek (Spark) | API Key + Secret | Medium | WebSocket handshake |
| MiniMax | Generic key | Low | `GET /v1/models` |
| Stepfun | Generic key | Low | `GET /v1/models` |
| 360 AI | Generic key | Low | `GET /v1/models` |
| Kuaishou (Kling) | Generic key | Low | `GET /v1/models` |
| Tencent Hunyuan | SecretId + SecretKey | Medium | `DescribeModels` |
| SiliconFlow | `sf_*` | High | `GET /v1/models` |

### Tier 5 — Infrastructure/Gateway

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Cloudflare AI | Cloudflare API token | Medium | `GET /ai/models` |
| Vercel AI | `vercel_*` | High | `GET /v1/models` |
| LiteLLM | Generic key | Low | `GET /v1/models` |
| Portkey | Generic key | Low | `GET /v1/models` |
| Helicone | `sk-helicone-*` | High | `GET /v1/models` |
| OpenRouter | `sk-or-*` | High | `GET /api/v1/models` |
| Martian | Generic key | Low | `GET /v1/models` |
| AI Gateway (Kong) | Generic key | Low | Health endpoint |
| BricksAI | Generic key | Low | `GET /v1/models` |
| Aether | Generic key | Low | `GET /v1/models` |
| Not Diamond | Generic key | Low | `GET /v1/models` |

### Tier 6 — Emerging/Niche

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Reka AI | Generic key | Low | `GET /v1/models` |
| Aleph Alpha | Generic key | Low | `GET /models` |
| Writer | Generic key | Low | `GET /v1/models` |
| Jasper AI | Generic key | Low | N/A |
| Typeface | Generic key | Low | N/A |
| Comet ML | Generic key | Low | `GET /api/rest/v2` |
| Weights & Biases | Generic key | Low | `GET /api/v1/viewer` |
| LangSmith | `ls__*` | High | `GET /api/v1/info` |
| Pinecone | Generic key | Low | `GET /databases` |
| Weaviate | Generic key | Low | `GET /v1/meta` |
| Qdrant | Generic key | Low | `GET /collections` |
| Chroma | Generic key | Low | `GET /api/v1/heartbeat` |
| Milvus | Generic key | Low | `GET /v1/vector/collections` |
| Neon AI | Generic key | Low | N/A |
| Lamini | Generic key | Low | `GET /v1/models` |

### Tier 7 — Code & Dev Tools

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| GitHub Copilot | `ghu_*`, `ghp_*` | High | `GET /user` |
| Cursor | Generic key | Low | N/A |
| Tabnine | Generic key | Low | N/A |
| Codeium/Windsurf | Generic key | Low | N/A |
| Sourcegraph Cody | `sgp_*` | High | `GET /.api/current-user` |
| Amazon CodeWhisperer | `AKIA*` | High | STS GetCallerIdentity |
| Replit AI | Generic key | Low | N/A |
| Codestral (Mistral) | Generic key | Low | `GET /v1/models` |
| IBM watsonx.ai | `ibm_*` | Medium | IAM token endpoint |
| Oracle AI | Generic key | Low | N/A |

### Tier 8 — Self-Hosted/Open Infra

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Ollama | N/A (local) | N/A | `GET /api/tags` |
| vLLM | Generic key | Low | `GET /v1/models` |
| LocalAI | Generic key | Low | `GET /v1/models` |
| LM Studio | N/A (local) | N/A | `GET /v1/models` |
| llama.cpp | N/A (local) | N/A | `GET /health` |
| GPT4All | N/A (local) | N/A | N/A |
| text-generation-webui | Generic key | Low | `GET /v1/models` |
| TensorRT-LLM | N/A | N/A | Health endpoint |
| Triton Inference Server | N/A | N/A | `GET /v2/health/ready` |
| Jan AI | N/A (local) | N/A | `GET /v1/models` |

### Tier 9 — Enterprise/Legacy

| Provider | Key Pattern | Confidence | Verify |
|----------|-------------|------------|--------|
| Salesforce Einstein | Generic token | Low | REST API |
| ServiceNow AI | Generic token | Low | REST API |
| SAP AI Core | OAuth token | Low | Token endpoint |
| Palantir AIP | Generic token | Low | REST API |
| Databricks (DBRX) | `dapi*` | High | `GET /api/2.0/clusters` |
| Snowflake Cortex | JWT token | Medium | SQL endpoint |
| Oracle Generative AI | Generic key | Low | REST API |
| HPE GreenLake AI | Generic token | Low | REST API |

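
The Confidence column follows from how distinctive a key's prefix is. The sketch below, which uses only prefixes listed in the tables above, shows why `sk-proj-*` can be attributed with high confidence while a bare `sk-*` cannot: several providers (Stability AI, DeepSeek, Moonshot AI, Qwen) share it. `guess_provider` is a hypothetical helper for illustration; real detection would also check key length and character set.

```bash
#!/usr/bin/env bash
# Hypothetical prefix-based provider guess. Specific prefixes must be
# tested before the generic "sk-*" pattern, otherwise they never match.
guess_provider() {
  case "$1" in
    sk-proj-*|sk-svcacct-*) echo "OpenAI" ;;
    sk-ant-api03-*)         echo "Anthropic" ;;
    sk-or-*)                echo "OpenRouter" ;;
    sk-helicone-*)          echo "Helicone" ;;
    gsk_*)                  echo "Groq" ;;
    hf_*)                   echo "HuggingFace" ;;
    sk-*)                   echo "ambiguous (sk-* is shared by several providers)" ;;
    *)                      echo "unknown" ;;
  esac
}

guess_provider "sk-ant-api03-example"
guess_provider "sk-example"
```
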

---

## Architecture

```
               +------------------+
               |   CLI (Cobra)    |
               +--------+---------+
                        |
         +--------------+--------------+
         |              |              |
+--------v---+   +------v-----+  +-----v-------+
| Input      |   | Recon      |  | Import      |
| Adapters   |   | Engine     |  | Adapters    |
| - file     |   | (80+ src)  |  | - trufflehog|
| - git      |   | - IoT (6)  |  | - gitleaks  |
| - stdin    |   | - Code(16) |  | - generic   |
| - url      |   | - Search(5)|  +-----+-------+
| - clipboard|   | - Paste(8+)|        |
+--------+---+   | - Pkg (8)  |        |
         |       | - Cloud(7) |        |
         |       | - CI/CD(5) |        |
         |       | - Archive2 |        |
         |       | - Forum(7) |        |
         |       | - Collab(4)|        |
         |       | - JS/FE(5) |        |
         |       | - Logs (3) |        |
         |       | - Intel(3) |        |
         |       | - Mobile(1)|        |
         |       | - DNS (2)  |        |
         |       | - API (3)  |        |
         |       +------+-----+        |
         |              |              |
         +--------------+--------------+
                        |
                +-------v--------+
                | Scanner Engine |
                | - matcher.go   |
                | - verifier.go  |
                +-------+--------+
                        |
          +-------------+--------------+
          |             |              |
    +-----v----+   +----v------+  +----v-------+
    | Output   |   | Notify    |  | Web        |
    | - table  |   | - telegram|  | Dashboard  |
    | - json   |   | - webhook |  | - htmx     |
    | - sarif  |   | - slack   |  | - REST API |
    | - csv    |   +-----------+  | - SQLite   |
    +----------+                  +------------+

  +------------------------------------------+
  | Provider Registry (108+ YAML providers)  |
  | Dork Registry (50+ YAML dorks)           |
  +------------------------------------------+
```

### Key Design Decisions

- **YAML Providers** — Adding a new provider = adding a YAML file. No recompile needed for pattern-only changes (when using external provider dir). Built-in providers are embedded at compile time.
- **Keyword Pre-filtering** — Before running regex, files are scanned for keywords. This provides ~10x speedup on large codebases.
- **Worker Pool** — Parallel scanning with configurable worker count. Default: CPU count.
- **Delta-based Git Scanning** — Only scans changes between commits, not entire trees.
- **SQLite Storage** — All scan results persisted with AES-256 encryption.

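
The keyword pre-filtering idea can be illustrated with two `grep` passes: a cheap fixed-string search selects candidate files, and the more expensive regex runs only on those. The keywords, the regex, and the `prefilter_scan` name are illustrative assumptions, not KeyHunter's actual provider definitions or matcher.

```bash
#!/usr/bin/env bash
# Hypothetical two-stage scan over a directory tree.
prefilter_scan() {
  local dir="$1"
  # Stage 1: list files containing any keyword (-F = plain string match).
  # Stage 2: run the costlier extended regex only on those candidates,
  #          printing file:line:match for every hit.
  grep -rlF -e 'sk-proj-' -e 'gsk_' -- "$dir" 2>/dev/null |
    while IFS= read -r file; do
      grep -HnoE '(sk-proj-|gsk_)[A-Za-z0-9_-]{20,}' -- "$file"
    done
}

# Example: prefilter_scan ./my-project
```
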

---

## Security & Ethics

### Built-in Protections
- Key values **masked by default** in terminal (first 8 + last 4 chars) — use `--unmask` for full keys
- **Full keys always available** via: `--unmask`, `--output=json`, `keyhunter keys show`, web dashboard, Telegram bot
- Database is **AES-256 encrypted** (full keys stored encrypted)
- API tokens stored **encrypted** in config
- No key values written to logs during `--verify`
- Web dashboard supports **basic auth / token auth**

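
The "first 8 + last 4 chars" masking rule is easy to reproduce when post-processing exported results. The sketch below is a hypothetical helper, not KeyHunter's implementation; the `...` separator and the choice to fully hide keys of 12 characters or fewer are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep the first 8 and last 4 characters of a key.
mask_key() {
  local key="$1"
  local len=${#key}
  # Assumption: a key this short would be fully revealed by 8 + 4,
  # so hide it completely instead.
  if (( len <= 12 )); then
    printf '%s\n' '************'
    return
  fi
  printf '%s...%s\n' "${key:0:8}" "${key:len-4}"
}

mask_key "sk-proj-abc123def456ghi789jkl012mno345pqr678"
# prints: sk-proj-...r678
```
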
### Rate Limiting
| Source | Rate Limit |
|--------|-----------|
| GitHub API (auth) | 30 req/min |
| GitHub API (unauth) | 10 req/min |
| Shodan | Per API plan |
| Censys | 250 queries/day (free) |
| ZoomEye | 10,000 results/month (free) |
| FOFA | 100 results/query (free) |
| Netlas | 50 queries/day (free) |
| Google Custom Search | 100/day free, 10K/day paid |
| Bing Search | 1,000/month (free) |
| Stack Overflow | 300/day (no key), 10K/day (key) |
| HN Algolia | 10,000 req/hour |
| VirusTotal | 4 req/min (free) |
| IntelX | 10 searches/day (free) |
| GrayHatWarfare | Per plan |
| Wayback Machine | ~15 req/min |
| Paste sites | 1 req/2sec |
| npm/PyPI | Generous, be respectful |
| Trello | 100 req/10sec |
| Docker Hub | 100 pulls/6hr (unauth) |

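
Because the limits above are quoted over different windows (per minute, per 10 seconds, per day), it can help to convert them into a minimum delay between requests. The sketch below does that arithmetic; `min_delay` is a hypothetical helper and assumes requests are spaced evenly across the window.

```bash
#!/usr/bin/env bash
# Hypothetical helper: seconds to wait between requests so that
# <requests> calls fit evenly into <window_seconds>.
min_delay() {
  local requests="$1" window_seconds="$2"
  LC_ALL=C awk -v n="$requests" -v w="$window_seconds" \
    'BEGIN { printf "%.2f\n", w / n }'
}

min_delay 30 60     # GitHub API (auth), 30 req/min    -> 2.00
min_delay 4 60      # VirusTotal free tier, 4 req/min  -> 15.00
min_delay 100 10    # Trello, 100 req/10sec            -> 0.10
```

Even spacing is the most conservative strategy; a token-bucket limiter would allow short bursts while keeping the same average rate.
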
### Stealth & Ethics Flags
```bash
--stealth  # User-agent rotation, increased request spacing
--respect-robots  # Respect robots.txt (default: on)
```

---

## Use Cases

### Red Team / Pentest
```bash
# Full multi-source recon against a target org
keyhunter recon github --query="targetcorp OPENAI_API_KEY"
keyhunter recon gitlab --query="targetcorp api_key"
keyhunter recon shodan --dork='http.html:"targetcorp" "sk-"'
keyhunter recon censys --query='services.http.response.body:"targetcorp" AND "api_key"'
keyhunter recon zoomeye --query='site:targetcorp.com +"api_key"'
keyhunter recon elasticsearch --shodan  # Find exposed ES with leaked keys
keyhunter recon jenkins --shodan  # Exposed Jenkins with build logs
keyhunter recon dotenv --domain-list=targetcorp-subdomains.txt  # .env exposure
keyhunter recon wayback --domain=targetcorp.com  # Historical leaks
keyhunter recon sourcemaps --domain=app.targetcorp.com  # JS source maps
keyhunter recon crtsh --domain=targetcorp.com  # Discover API subdomains
keyhunter recon full --providers=openai,anthropic  # Everything at once
```

### DevSecOps / CI Pipeline

```bash
# Pre-commit hook
keyhunter hook install

# GitHub Actions step
- name: KeyHunter Scan
  run: |
    keyhunter scan path . --output=sarif > keyhunter.sarif
    # Upload to GitHub Security tab
```

### Bug Bounty

```bash
# Comprehensive target recon
keyhunter recon github --org=targetcorp --dork=auto --verify
keyhunter recon gist --query="targetcorp"
keyhunter recon paste --sources=all --query="targetcorp"
keyhunter recon postman --query="targetcorp"
keyhunter recon trello --query="targetcorp api key"
keyhunter recon notion --query="targetcorp API_KEY"
keyhunter recon confluence --shodan
keyhunter recon npm --query="targetcorp"              # Check their published packages
keyhunter recon pypi --query="targetcorp"
keyhunter recon docker --query="targetcorp" --layers  # Docker image layer scan
keyhunter recon apk --query="targetcorp"              # Mobile app decompile
keyhunter recon swagger --domain=api.targetcorp.com
```

### Monitoring / Alerting

```bash
# Continuous monitoring with Telegram alerts
keyhunter schedule add \
  --name="monitor-github" \
  --cron="*/30 * * * *" \
  --command="recon github --dork=auto --providers=openai" \
  --notify=telegram

keyhunter serve --telegram
```

---

## Dork Examples (150+ Built-in)

### GitHub

```
filename:.env "OPENAI_API_KEY"
filename:.env "ANTHROPIC_API_KEY"
filename:config.yaml "api_key" "sk-"
"sk-proj-" language:python
"sk-ant-api03" language:javascript
filename:docker-compose "API_KEY"
"api_key" extension:ipynb
filename:.toml "api_key" "sk-"
filename:terraform.tfvars "api_key"
"kind: Secret" "data:" filename:*.yaml        # K8s secrets
filename:.npmrc "_authToken"                  # npm tokens
filename:requirements.txt "openai" path:.env  # Python projects
```

### GitLab

```
"OPENAI_API_KEY" filename:.env
"sk-ant-" filename:*.py
"api_key" filename:settings.json
```

### Google Dorking

```
"sk-proj-" -github.com -stackoverflow.com    # Outside known code sites
"sk-ant-api03-" filetype:env
"OPENAI_API_KEY" filetype:yml
"ANTHROPIC_API_KEY" filetype:json
inurl:.env "API_KEY"
intitle:"index of" .env
site:pastebin.com "sk-proj-"
site:replit.com "OPENAI_API_KEY"
site:codesandbox.io "sk-ant-"
site:notion.so "API_KEY"
site:trello.com "openai"
site:docs.google.com "sk-proj-"
site:medium.com "ANTHROPIC_API_KEY"
site:dev.to "sk-proj-"
site:huggingface.co "OPENAI_API_KEY"
site:kaggle.com "api_key" "sk-"
intitle:"Swagger UI" "api_key"
inurl:graphql "authorization" "Bearer sk-"
filetype:tfstate "api_key"                   # Terraform state
filetype:ipynb "sk-proj-"                    # Jupyter notebooks
```

### Shodan

```
http.html:"openai" "api_key" port:8080
http.title:"LiteLLM" port:4000
http.html:"ollama" port:11434
http.title:"Kubernetes Dashboard"
"X-Jenkins" "200 OK"
http.title:"Kibana" port:5601
http.title:"Grafana"
http.title:"Swagger UI"
http.title:"Gitea" port:3000
http.html:"PrivateBin"
http.title:"MinIO Browser"
http.title:"Sentry"
http.title:"Confluence"
port:6443 "kube-apiserver"
http.html:"langchain" port:8000
```

### Censys

```
services.http.response.body:"openai" and services.http.response.body:"sk-"
services.http.response.body:"langchain" and services.port:8000
services.http.response.body:"OPENAI_API_KEY"
services.http.response.body:"sk-ant-api03"
```

### ZoomEye

```
app:"Elasticsearch" +"api_key"
app:"Jenkins" +openai
app:"Grafana" +anthropic
app:"Gitea"
```

### FOFA

```
body="sk-proj-"
body="OPENAI_API_KEY"
body="sk-ant-api03"
title="LiteLLM"
title="Swagger UI" && body="api_key"
title="Kibana" && body="authorization"
```

---

## Contributing

### Adding a New Provider

1. Create `providers/your-provider.yaml`:

```yaml
id: your-provider
name: Your Provider
category: emerging
website: https://api.yourprovider.com
confidence: medium

patterns:
  - id: your-provider-key
    name: "Your Provider API Key"
    regex: '\byp_[A-Za-z0-9]{32}\b'
    confidence: high
    description: "Your Provider API key with yp_ prefix"

keywords:
  - "yp_"
  - "YOUR_PROVIDER_API_KEY"

verify:
  enabled: true
  method: GET
  url: "https://api.yourprovider.com/v1/models"
  headers:
    Authorization: "Bearer {{key}}"
  success_codes: [200]
  failure_codes: [401, 403]

metadata:
  docs: "https://docs.yourprovider.com"
  key_url: "https://dashboard.yourprovider.com/keys"
  env_vars: ["YOUR_PROVIDER_API_KEY"]
```

2. Run tests: `go test ./pkg/provider/...`
3. Submit a PR

### Adding a New Dork

1. Edit `dorks/<source>.yaml` and add your dork entry
2. Submit a PR

---

## Roadmap

- [ ] Core scanning engine (file, git, stdin)
- [ ] 108 provider YAML definitions
- [ ] Active verification for all providers
- [ ] CLI with Cobra (scan, verify, import, recon, serve)
- [ ] TruffleHog & Gitleaks import adapters
- [ ] OSINT/Recon engine (Shodan, Censys, GitHub, GitLab, Paste, S3)
- [ ] Built-in dork engine with 50+ dorks
- [ ] Web dashboard (htmx + Tailwind + SQLite)
- [ ] Telegram bot with auto-notifications
- [ ] Scheduled scanning (cron-based)
- [ ] Pre-commit hook & CI/CD integration (SARIF)
- [ ] Docker image
- [ ] Homebrew formula

---

## Disclaimer

KeyHunter is designed for **authorized security testing**, **defensive security**, **bug bounty programs**, and **educational purposes** only. Always ensure you have proper authorization before scanning any target. Unauthorized access to computer systems is illegal.

---

## License

MIT License - see [LICENSE](LICENSE) for details.