KeyHunter
The most comprehensive API key scanner for LLM/AI providers. Detect, validate, and monitor leaked API keys across 108+ providers.
Why KeyHunter?
Existing tools like TruffleHog (~3 LLM detectors) and Gitleaks (~5 LLM rules) were built for general secret scanning. AI-related credential leaks grew 81% year-over-year in 2025, yet no tool covers more than ~15 LLM providers.
KeyHunter fills that gap with 108+ provider-specific detectors, active key validation, OSINT/recon capabilities, and a growing set of internet sources for leak discovery.
How It Compares
| Feature | KeyHunter | TruffleHog | Gitleaks | detect-secrets |
|---|---|---|---|---|
| LLM Providers | 108+ | ~3 | ~5 | ~1 |
| Active Verification | 108+ endpoints | ~20 types | No | No |
| OSINT/Recon Sources | 18 live (80+ planned) | No | No | No |
| External Tool Import | TruffleHog + Gitleaks | - | - | - |
| Dork Engine | 150 built-in YAML dorks | No | No | No |
| Pre-commit Hook | Built-in | Yes | Yes | Yes |
| SARIF Output | Yes | Yes | Yes | No |
| Provider YAML Plugin | Community-extensible | Go code only | TOML rules | Python plugins |
| Web Dashboard | Coming soon | No | No | No |
| Telegram Bot | Coming soon | No | No | No |
| Scheduled Scanning | Coming soon | No | No | No |
Features
Implemented
Core Scanning Engine
- 3-stage pipeline -- AC pre-filter, regex match, entropy scoring
- ants worker pool for parallel scanning with configurable worker count
- 108 provider YAML definitions (Tier 1-9), dual-located with go:embed
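The three stages listed above can be sketched roughly as follows. This is a minimal illustration, not KeyHunter's actual code: the `rule` type, `demoRule`, `scanLine`, the simplified `sk-proj-` regex, and the 3.0 bits/char threshold are all assumptions, and `strings.Contains` stands in for a real Aho-Corasick automaton.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strings"
)

// rule is a hypothetical provider rule: cheap keywords gate the costlier regex.
type rule struct {
	keywords []string
	pattern  *regexp.Regexp
}

// demoRule returns an illustrative rule; the regex is simplified on purpose.
func demoRule() rule {
	return rule{
		keywords: []string{"sk-proj-"},
		pattern:  regexp.MustCompile(`sk-proj-[A-Za-z0-9]{20,}`),
	}
}

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := map[rune]int{}
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	h := 0.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// scanLine runs the three stages and returns candidates that pass all of them.
func scanLine(line string, r rule, minEntropy float64) []string {
	// Stage 1: keyword pre-filter (stand-in for Aho-Corasick).
	hit := false
	for _, kw := range r.keywords {
		if strings.Contains(line, kw) {
			hit = true
			break
		}
	}
	if !hit {
		return nil
	}
	// Stage 2: regex match. Stage 3: entropy threshold.
	var out []string
	for _, m := range r.pattern.FindAllString(line, -1) {
		if shannonEntropy(m) >= minEntropy {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	r := demoRule()
	fmt.Println(scanLine(`OPENAI_API_KEY="sk-proj-abc123def456ghi789jkl012"`, r, 3.0))
	fmt.Println(scanLine(`KEY="sk-proj-aaaaaaaaaaaaaaaaaaaaaaaa"`, r, 3.0)) // low entropy, dropped
}
```

The second call matches the regex but is discarded by the entropy stage, which is the kind of placeholder value the third stage is meant to filter out.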
Input Sources
- File scanning -- single file analysis
- Directory scanning -- recursive traversal with glob exclusions and mmap
- Git history scanning -- full commit history analysis
- stdin/pipe support -- echo "sk-proj-..." | keyhunter scan stdin
- URL fetching -- scan any remote URL content
- Clipboard scanning -- instant clipboard content analysis
Active Verification
- YAML-driven HTTPVerifier -- lightweight API calls to verify if detected keys are active
- Permission and scope extraction (org, rate limits, model access)
- Consent prompt and LEGAL.md for legal safety
- Configurable via --verify flag (off by default)
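As a rough sketch of how a YAML-driven verifier might interpret a provider's verify block, the snippet below substitutes the {{key}} placeholder into headers and maps an HTTP status code to a result. The field names follow the example provider YAML in the Contributing section; the Go type, function names, and the "active"/"inactive"/"unknown" labels are assumptions, and no network call is made.

```go
package main

import (
	"fmt"
	"strings"
)

// verifySpec mirrors the fields of a provider's verify block.
// The Go names are hypothetical.
type verifySpec struct {
	Method       string
	URL          string
	Headers      map[string]string
	SuccessCodes []int
	FailureCodes []int
}

// demoSpec reproduces the verify block of the example provider.
func demoSpec() verifySpec {
	return verifySpec{
		Method:       "GET",
		URL:          "https://api.yourprovider.com/v1/models",
		Headers:      map[string]string{"Authorization": "Bearer {{key}}"},
		SuccessCodes: []int{200},
		FailureCodes: []int{401, 403},
	}
}

// renderHeaders substitutes the {{key}} placeholder in every header value.
func renderHeaders(spec verifySpec, key string) map[string]string {
	out := make(map[string]string, len(spec.Headers))
	for name, tmpl := range spec.Headers {
		out[name] = strings.ReplaceAll(tmpl, "{{key}}", key)
	}
	return out
}

// classify maps a response status code to a verification result.
func classify(spec verifySpec, status int) string {
	for _, c := range spec.SuccessCodes {
		if c == status {
			return "active"
		}
	}
	for _, c := range spec.FailureCodes {
		if c == status {
			return "inactive"
		}
	}
	return "unknown" // e.g. 429 or 5xx: neither confirmed nor refuted
}

func main() {
	spec := demoSpec()
	fmt.Println(renderHeaders(spec, "yp_example")["Authorization"])
	fmt.Println(classify(spec, 200), classify(spec, 401), classify(spec, 503))
}
```

Treating unlisted status codes as "unknown" rather than "inactive" avoids reporting a key as dead when the provider is merely rate limiting or down.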
Output Formats
- Table -- colored terminal output with key masking (default)
- JSON -- full key values for programmatic consumption
- CSV -- spreadsheet-compatible export
- SARIF 2.1.0 -- CI/CD integration (GitHub Code Scanning, etc.)
- Exit codes: 0 (clean), 1 (findings), 2 (error)
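For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log containing one result, using only the standard library. The struct subset, the rule ID, and the message text are illustrative assumptions; KeyHunter's real output may carry more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the SARIF 2.1.0 object model needed for one result.
type sarifLog struct {
	Schema  string `json:"$schema"`
	Version string `json:"version"`
	Runs    []run  `json:"runs"`
}

type run struct {
	Tool    tool     `json:"tool"`
	Results []result `json:"results"`
}

type tool struct {
	Driver driver `json:"driver"`
}

type driver struct {
	Name string `json:"name"`
}

type result struct {
	RuleID    string     `json:"ruleId"`
	Level     string     `json:"level"`
	Message   message    `json:"message"`
	Locations []location `json:"locations"`
}

type message struct {
	Text string `json:"text"`
}

type location struct {
	PhysicalLocation physicalLocation `json:"physicalLocation"`
}

type physicalLocation struct {
	ArtifactLocation artifactLocation `json:"artifactLocation"`
	Region           region           `json:"region"`
}

type artifactLocation struct {
	URI string `json:"uri"`
}

type region struct {
	StartLine int `json:"startLine"`
}

// newLog wraps a single finding in a SARIF log (hypothetical rule ID and text).
func newLog(ruleID, file string, line int) sarifLog {
	return sarifLog{
		Schema:  "https://json.schemastore.org/sarif-2.1.0.json",
		Version: "2.1.0",
		Runs: []run{{
			Tool: tool{Driver: driver{Name: "KeyHunter"}},
			Results: []result{{
				RuleID:  ruleID,
				Level:   "error",
				Message: message{Text: "Possible leaked API key"},
				Locations: []location{{PhysicalLocation: physicalLocation{
					ArtifactLocation: artifactLocation{URI: file},
					Region:           region{StartLine: line},
				}}},
			}},
		}},
	}
}

// render serializes the log as indented JSON.
func render(l sarifLog) (string, error) {
	b, err := json.MarshalIndent(l, "", "  ")
	return string(b), err
}

func main() {
	out, err := render(newLog("openai-project-key", "src/config.py", 42))
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```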
Key Management
- keyhunter keys list -- list all discovered keys (masked by default)
- keyhunter keys show <id> -- full key details
- keyhunter keys export -- export in JSON/CSV format
- keyhunter keys copy <id> -- copy key to clipboard
- keyhunter keys delete <id> -- remove a key from the database
- keyhunter keys verify <id> -- verify a specific key
External Tool Import
- TruffleHog v3 JSON import with LLM-specific enrichment
- Gitleaks JSON and CSV import
- Deduplication across imports via (provider, masked_key, source) hashing
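One way to picture the (provider, masked_key, source) deduplication is the sketch below. It assumes SHA-256 over the three fields joined by a NUL separator; the actual hash function, field encoding, and type names in KeyHunter are not stated in this README and are hypothetical here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// finding holds the three fields that identify an imported result.
type finding struct {
	Provider  string
	MaskedKey string
	Source    string
}

// dedupKey hashes the tuple. The NUL separator keeps ("ab", "c") and
// ("a", "bc") from producing the same input bytes.
func dedupKey(f finding) string {
	sum := sha256.Sum256([]byte(f.Provider + "\x00" + f.MaskedKey + "\x00" + f.Source))
	return hex.EncodeToString(sum[:])
}

// dedupe keeps the first occurrence of each tuple and preserves order.
func dedupe(in []finding) []finding {
	seen := make(map[string]struct{}, len(in))
	var out []finding
	for _, f := range in {
		k := dedupKey(f)
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	findings := []finding{
		{"openai", "sk-proj-...x234", "src/config.py:42"}, // from TruffleHog
		{"openai", "sk-proj-...x234", "src/config.py:42"}, // same hit from Gitleaks
		{"groq", "gsk_abcd...9f0e", ".env:3"},
	}
	fmt.Println(len(dedupe(findings))) // 2
}
```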
Git Pre-commit Hook
- keyhunter hook install -- embedded shell script, blocks leaks before commit
- keyhunter hook uninstall -- clean removal
- Backup of existing hooks with --force
Dork Engine
- 150 built-in YAML dorks across 8 source types (GitHub, GitLab, Google, Shodan, Censys, ZoomEye, FOFA, Bing)
- GitHub live executor with authenticated API
- CLI management: keyhunter dorks list, keyhunter dorks list --source=github, keyhunter dorks add, keyhunter dorks run, keyhunter dorks export
OSINT / Recon Engine (18 Sources Live)
The recon framework provides a ReconSource interface with per-source rate limiting, stealth mode, robots.txt compliance, parallel sweep, and result deduplication.
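A possible shape for such an interface, together with a parallel sweep that deduplicates results, is sketched below. The method set, the `staticSource` stub, and `sweep` are assumptions made for illustration; rate limiting, stealth mode, and robots.txt handling are left out.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// ReconSource is a hypothetical shape for the interface named above.
type ReconSource interface {
	Name() string
	Search(ctx context.Context, query string) ([]string, error)
}

// staticSource is a stub that returns canned hits instead of calling an API.
type staticSource struct {
	name string
	hits []string
}

func (s staticSource) Name() string { return s.name }

func (s staticSource) Search(ctx context.Context, query string) ([]string, error) {
	return s.hits, nil
}

// sweep queries all sources in parallel and returns deduplicated, sorted hits.
func sweep(ctx context.Context, sources []ReconSource, query string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen = map[string]struct{}{}
	)
	for _, src := range sources {
		wg.Add(1)
		go func(src ReconSource) {
			defer wg.Done()
			hits, err := src.Search(ctx, query)
			if err != nil {
				return // one failing source should not abort the sweep
			}
			mu.Lock()
			defer mu.Unlock()
			for _, h := range hits {
				seen[h] = struct{}{}
			}
		}(src)
	}
	wg.Wait()
	out := make([]string, 0, len(seen))
	for h := range seen {
		out = append(out, h)
	}
	sort.Strings(out)
	return out
}

// demoSweep runs a sweep over two stub sources with one overlapping hit.
func demoSweep() []string {
	sources := []ReconSource{
		staticSource{"github", []string{"repo-a/.env", "repo-b/.env"}},
		staticSource{"gitlab", []string{"repo-b/.env", "repo-c/.env"}},
	}
	return sweep(context.Background(), sources, `"sk-proj-"`)
}

func main() {
	fmt.Println(demoSweep())
}
```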
Code Hosting & Snippets (live)
- GitHub -- code search with automated dorks
- GitLab -- code search
- Bitbucket -- code search
- GitHub Gist -- public gist search
- Codeberg -- alternative Git platform search
- HuggingFace -- Spaces, repos, model configs (high-yield for LLM keys)
- Replit -- public repl search
- CodeSandbox -- sandbox search
- StackBlitz Sandboxes -- sandbox search
- Kaggle -- notebooks and datasets with API keys
Search Engine Dorking (live)
- Google -- Custom Search API / SerpAPI
- Bing -- Azure Cognitive Services search
- DuckDuckGo -- HTML scraping fallback
- Yandex -- XML API search
- Brave -- Brave Search API
Paste Sites (live)
- Pastebin -- scraping API
- GistPaste -- paste search
- PasteSites -- multi-paste aggregator
recon full -- parallel sweep across all 18 live sources with deduplication and unified reporting.
CLI Commands
| Command | Status |
|---|---|
| keyhunter scan | Implemented |
| keyhunter providers list/info/stats | Implemented |
| keyhunter config init/set/get | Implemented |
| keyhunter keys list/show/export/copy/delete/verify | Implemented |
| keyhunter import | Implemented |
| keyhunter hook install/uninstall | Implemented |
| keyhunter dorks list/add/run/export | Implemented |
| keyhunter recon full/list | Implemented |
| keyhunter legal | Implemented |
| keyhunter verify | Stub |
| keyhunter serve | Stub |
| keyhunter schedule | Stub |
Coming Soon
The following features are on the roadmap but not yet implemented:
Phase 12 -- IoT Scanners & Cloud Storage
- Shodan -- exposed LLM proxies, dashboards, API endpoints
- Censys -- HTTP body search for leaked credentials
- ZoomEye -- IoT scanner
- FOFA -- Asian infrastructure scanning
- Netlas -- HTTP response body search
- BinaryEdge -- internet-wide scan data
- AWS S3 / GCS / Azure Blob / DigitalOcean Spaces -- bucket enumeration and scanning
Phase 13 -- Package Registries, Containers & IaC
- npm / PyPI / RubyGems / crates.io / Maven / NuGet -- package source scanning
- Docker Hub -- image layer scanning
- Terraform / Helm Charts / Ansible -- IaC scanning
Phase 14 -- CI/CD Logs, Web Archives & Frontend Leaks
- GitHub Actions / Travis CI / CircleCI / Jenkins / GitLab CI -- public build log scanning
- Wayback Machine / CommonCrawl -- historical web archive scanning
- JS Source Maps / Webpack bundles / exposed .env -- frontend leak detection
Phase 15 -- Forums & Collaboration
- Stack Overflow / Reddit / Hacker News / dev.to / Medium -- forum scanning
- Notion / Confluence / Trello -- collaboration tool scanning
- Elasticsearch / Grafana / Sentry -- exposed log aggregators
- Telegram groups / Discord -- public channel scanning
Phase 16 -- Threat Intel, Mobile, DNS & API Marketplaces
- VirusTotal / Intelligence X / URLhaus -- threat intelligence
- APK analysis -- mobile app decompilation
- crt.sh / subdomain probing -- DNS/subdomain discovery
- Postman / SwaggerHub -- API marketplace scanning
Phase 17 -- Telegram Bot & Scheduler
- Telegram Bot -- scan triggers, key alerts, recon results
- Scheduled scanning -- cron-based recurring scans with auto-notify
Phase 18 -- Web Dashboard
- Web Dashboard -- htmx + Tailwind, SQLite-backed, real-time scan viewer
Quick Start
Install
# From source
go install github.com/salvacybersec/keyhunter@latest
# Binary release (when available)
curl -sSL https://github.com/salvacybersec/keyhunter/releases/latest/download/keyhunter_linux_amd64.tar.gz | tar -xz
sudo mv keyhunter /usr/local/bin/
Basic Usage
# Scan a directory
keyhunter scan ./my-project/
# Scan with active verification
keyhunter scan ./my-project/ --verify
# Scan git history
keyhunter scan --git .
# Scan from pipe
cat secrets.txt | keyhunter scan stdin
# Scan only specific providers
keyhunter scan . --providers=openai,anthropic,deepseek
# JSON output
keyhunter scan . --output=json > results.json
# SARIF output for CI/CD
keyhunter scan . --output=sarif > keyhunter.sarif
# CSV output
keyhunter scan . --output=csv > results.csv
OSINT / Recon
# Full sweep across all 18 live sources
keyhunter recon full
# Sweep specific sources only
keyhunter recon full --sources=github,gitlab,gist
# List available recon sources
keyhunter recon list
# Code hosting sources
keyhunter recon full --sources=github
keyhunter recon full --sources=gitlab
keyhunter recon full --sources=bitbucket
keyhunter recon full --sources=gist
keyhunter recon full --sources=codeberg
keyhunter recon full --sources=huggingface
keyhunter recon full --sources=replit
keyhunter recon full --sources=codesandbox
keyhunter recon full --sources=sandboxes
keyhunter recon full --sources=kaggle
# Search engine dorking
keyhunter recon full --sources=google
keyhunter recon full --sources=bing
keyhunter recon full --sources=duckduckgo
keyhunter recon full --sources=yandex
keyhunter recon full --sources=brave
# Paste sites
keyhunter recon full --sources=pastebin
keyhunter recon full --sources=gistpaste
keyhunter recon full --sources=pastesites
Dork Management
keyhunter dorks list # All dorks across all sources
keyhunter dorks list --source=github # GitHub dorks only
keyhunter dorks list --source=google # Google dorks only
keyhunter dorks add github 'filename:.env "GROQ_API_KEY"'
keyhunter dorks run google --category=frontier
keyhunter dorks export
Key Management
Keys are masked by default in terminal output (shoulder surfing protection). Ways to access full key values:
# Show full keys in scan output
keyhunter scan . --unmask
# JSON export always includes full keys
keyhunter scan . --output=json > results.json
# Key management commands
keyhunter keys list # Masked list
keyhunter keys list --unmask # Full key list
keyhunter keys show <id> # Single key full details (always unmasked)
keyhunter keys copy <id> # Copy key to clipboard
keyhunter keys export --format=json # Export all keys with full values
keyhunter keys verify <id> # Verify key + show full details
keyhunter keys delete <id> # Remove key from database
Example keyhunter keys show output:
ID: a3f7b2c1
Provider: OpenAI
Pattern: OpenAI Project Key
Key: sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234
Confidence: HIGH
Source: src/config.py:42
Found: 2026-04-04 14:32:01
Scan ID: scan_001
Status: ACTIVE (verified 2026-04-04 14:32:05)
Org: my-org
Rate Limit: 500 req/min
Revoke URL: https://platform.openai.com/api-keys
Import External Tools
# Run TruffleHog, then enrich with KeyHunter
trufflehog git . --json > trufflehog.json
keyhunter import --format=trufflehog trufflehog.json
# Run Gitleaks, then enrich
gitleaks detect -f json -r gitleaks.json
keyhunter import --format=gitleaks gitleaks.json
# Gitleaks CSV
gitleaks detect -f csv -r gitleaks.csv
keyhunter import --format=gitleaks-csv gitleaks.csv
CI/CD Integration
KeyHunter ships with a git pre-commit hook that blocks leaks before they land in
history, a GitHub Actions integration that uploads SARIF findings directly into
the repository's Code Scanning tab, and an import command that consolidates
TruffleHog and Gitleaks output into one normalized database.
# Install pre-commit hook (scans staged files only)
keyhunter hook install
# GitHub Actions (SARIF output for Code Scanning upload)
keyhunter scan . --output=sarif > keyhunter.sarif
# Import findings from other scanners
keyhunter import --format=trufflehog trufflehog.json
keyhunter import --format=gitleaks gitleaks.json
# Exit codes: 0 = clean, 1 = keys found, 2 = error
keyhunter scan .
case $? in
  0) echo "Clean" ;;
  1) echo "Keys found!" ;;
  *) echo "Scan error" ;;
esac
See docs/CI-CD.md for the full guide, including a copy-paste GitHub Actions workflow and the pre-commit hook install/uninstall lifecycle.
Configuration
# Initialize config
keyhunter config init
# Creates ~/.keyhunter.yaml
# Set API tokens for recon sources (currently supported)
keyhunter config set recon.github.token "YOUR_GITHUB_TOKEN"
keyhunter config set recon.gitlab.token "YOUR_GITLAB_TOKEN"
keyhunter config set recon.bitbucket.token "YOUR_BITBUCKET_TOKEN"
keyhunter config set recon.huggingface.token "YOUR_HF_TOKEN"
keyhunter config set recon.kaggle.token "YOUR_KAGGLE_TOKEN"
keyhunter config set recon.google.apikey "YOUR_GOOGLE_API_KEY"
keyhunter config set recon.google.cx "YOUR_GOOGLE_CX_ID"
keyhunter config set recon.bing.apikey "YOUR_BING_API_KEY"
keyhunter config set recon.brave.apikey "YOUR_BRAVE_API_KEY"
keyhunter config set recon.yandex.apikey "YOUR_YANDEX_API_KEY"
keyhunter config set recon.yandex.user "YOUR_YANDEX_USER"
# View current config
keyhunter config get recon.github.token
Config File (~/.keyhunter.yaml)
scan:
  workers: 8
  verify_timeout: 10s
  default_output: table
recon:
  stealth: false
  respect_robots: true
  github:
    token: ""
  gitlab:
    token: ""
  bitbucket:
    token: ""
  huggingface:
    token: ""
  kaggle:
    token: ""
  google:
    apikey: ""
    cx: ""
  bing:
    apikey: ""
  brave:
    apikey: ""
  yandex:
    apikey: ""
    user: ""
Stealth & Ethics Flags
--stealth # User-agent rotation, increased request spacing
--respect-robots # Respect robots.txt (default: on)
Supported Providers (108)
Tier 1 -- Frontier
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| OpenAI | sk-proj-*, sk-svcacct-* | High | GET /v1/models |
| Anthropic | sk-ant-api03-* | High | GET /v1/models |
| Google AI (Gemini) | AIza* | High | GET /v1/models |
| Google Vertex AI | OAuth token | Medium | GET /v1/models |
| AWS Bedrock | AKIA* | High | GetFoundationModel |
| Azure OpenAI | 32-char hex | Medium | GET /openai/deployments |
| Meta AI | meta-llama-* | Medium | GET /v1/models |
| xAI (Grok) | xai-* | High | GET /v1/models |
| Cohere | co-* | High | GET /v1/models |
| Mistral AI | 32-char generic | Low | GET /v1/models |
| Inflection AI | Generic UUID | Low | GET /api/models |
| AI21 Labs | Generic key | Low | GET /v1/models |
Tier 2 -- Inference Platforms
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Together AI | Generic key | Low | GET /v1/models |
| Fireworks AI | fw_* | High | GET /v1/models |
| Groq | gsk_* | High | GET /openai/v1/models |
| Replicate | r8_* | High | GET /v1/predictions |
| Anyscale | Generic key | Low | GET /v1/models |
| DeepInfra | Generic key | Low | GET /v1/models |
| Lepton AI | lpt_* | High | GET /v1/models |
| Modal | Generic token | Low | GET /api/apps |
| Baseten | Generic key | Low | GET /v1/models |
| Cerebrium | Generic key | Low | GET /v1/models |
| NovitaAI | Generic key | Low | GET /v1/models |
| Sambanova | Generic key | Low | GET /v1/models |
| OctoAI | Generic key | Low | GET /v1/models |
| Friendli AI | Generic key | Low | GET /v1/models |
Tier 3 -- Specialized/Vertical
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Perplexity | pplx-* | High | GET /chat/completions |
| You.com | Generic key | Low | GET /v1/search |
| Voyage AI | voy-* | High | GET /v1/models |
| Jina AI | jina_* | High | GET /v1/models |
| Unstructured | Generic key | Low | GET /general/v0/general |
| AssemblyAI | Generic key | Low | GET /v2/transcript |
| Deepgram | Generic key | Low | GET /v1/projects |
| ElevenLabs | el_* | High | GET /v1/user |
| Stability AI | sk-* | Medium | GET /v1/engines/list |
| Runway ML | Generic key | Low | GET /v1/models |
| Midjourney | Generic key | Low | N/A |
| HuggingFace | hf_* | High | GET /api/whoami |
Tier 4 -- Chinese/Regional
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| DeepSeek | sk-* | Medium | GET /v1/models |
| Baichuan | Generic key | Low | GET /v1/models |
| Zhipu AI (GLM) | Generic key | Low | POST /api/paas/v4/chat |
| Moonshot AI (Kimi) | sk-* | Medium | GET /v1/models |
| Yi (01.AI) | Generic key | Low | GET /v1/models |
| Qwen (Alibaba) | sk-* | Medium | GET /v1/models |
| Baidu (ERNIE) | API Key + Secret | Medium | Token endpoint |
| ByteDance (Doubao) | Generic key | Low | GET /v1/models |
| SenseTime | Generic key | Low | GET /v1/models |
| iFlytek (Spark) | API Key + Secret | Medium | WebSocket handshake |
| MiniMax | Generic key | Low | GET /v1/models |
| Stepfun | Generic key | Low | GET /v1/models |
| 360 AI | Generic key | Low | GET /v1/models |
| Kuaishou (Kling) | Generic key | Low | GET /v1/models |
| Tencent Hunyuan | SecretId + SecretKey | Medium | DescribeModels |
| SiliconFlow | sf_* | High | GET /v1/models |
Tier 5 -- Infrastructure/Gateway
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Cloudflare AI | Cloudflare API token | Medium | GET /ai/models |
| Vercel AI | vercel_* | High | GET /v1/models |
| LiteLLM | Generic key | Low | GET /v1/models |
| Portkey | Generic key | Low | GET /v1/models |
| Helicone | sk-helicone-* | High | GET /v1/models |
| OpenRouter | sk-or-* | High | GET /api/v1/models |
| Martian | Generic key | Low | GET /v1/models |
| AI Gateway (Kong) | Generic key | Low | Health endpoint |
| BricksAI | Generic key | Low | GET /v1/models |
| Aether | Generic key | Low | GET /v1/models |
| Not Diamond | Generic key | Low | GET /v1/models |
Tier 6 -- Emerging/Niche
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Reka AI | Generic key | Low | GET /v1/models |
| Aleph Alpha | Generic key | Low | GET /models |
| Writer | Generic key | Low | GET /v1/models |
| Jasper AI | Generic key | Low | N/A |
| Typeface | Generic key | Low | N/A |
| Comet ML | Generic key | Low | GET /api/rest/v2 |
| Weights & Biases | Generic key | Low | GET /api/v1/viewer |
| LangSmith | ls__* | High | GET /api/v1/info |
| Pinecone | Generic key | Low | GET /databases |
| Weaviate | Generic key | Low | GET /v1/meta |
| Qdrant | Generic key | Low | GET /collections |
| Chroma | Generic key | Low | GET /api/v1/heartbeat |
| Milvus | Generic key | Low | GET /v1/vector/collections |
| Neon AI | Generic key | Low | N/A |
| Lamini | Generic key | Low | GET /v1/models |
Tier 7 -- Code & Dev Tools
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| GitHub Copilot | ghu_*, ghp_* | High | GET /user |
| Cursor | Generic key | Low | N/A |
| Tabnine | Generic key | Low | N/A |
| Codeium/Windsurf | Generic key | Low | N/A |
| Sourcegraph Cody | sgp_* | High | GET /.api/current-user |
| Amazon CodeWhisperer | AKIA* | High | STS GetCallerIdentity |
| Replit AI | Generic key | Low | N/A |
| Codestral (Mistral) | Generic key | Low | GET /v1/models |
| IBM watsonx.ai | ibm_* | Medium | IAM token endpoint |
| Oracle AI | Generic key | Low | N/A |
Tier 8 -- Self-Hosted/Open Infra
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Ollama | N/A (local) | N/A | GET /api/tags |
| vLLM | Generic key | Low | GET /v1/models |
| LocalAI | Generic key | Low | GET /v1/models |
| LM Studio | N/A (local) | N/A | GET /v1/models |
| llama.cpp | N/A (local) | N/A | GET /health |
| GPT4All | N/A (local) | N/A | N/A |
| text-generation-webui | Generic key | Low | GET /v1/models |
| TensorRT-LLM | N/A | N/A | Health endpoint |
| Triton Inference Server | N/A | N/A | GET /v2/health/ready |
| Jan AI | N/A (local) | N/A | GET /v1/models |
Tier 9 -- Enterprise/Legacy
| Provider | Key Pattern | Confidence | Verify |
|---|---|---|---|
| Salesforce Einstein | Generic token | Low | REST API |
| ServiceNow AI | Generic token | Low | REST API |
| SAP AI Core | OAuth token | Low | Token endpoint |
| Palantir AIP | Generic token | Low | REST API |
| Databricks (DBRX) | dapi* | High | GET /api/2.0/clusters |
| Snowflake Cortex | JWT token | Medium | SQL endpoint |
| Oracle Generative AI | Generic key | Low | REST API |
| HPE GreenLake AI | Generic token | Low | REST API |
Architecture
               +------------------+
               |   CLI (Cobra)    |
               +--------+---------+
                        |
       +----------------+----------------+
       |                |                |
+------v-------+ +------v-------+ +------v-------+
| Input        | | Recon        | | Import       |
| Adapters     | | Engine       | | Adapters     |
| - file       | | (18 live)    | | - trufflehog |
| - dir        | | - Code (10)  | | - gitleaks   |
| - git        | | - Search (5) | +------+-------+
| - stdin      | | - Paste (3)  |        |
| - url        | +------+-------+        |
| - clipboard  |        |                |
+------+-------+        |                |
       |                |                |
       +----------------+----------------+
                        |
                +-------v--------+
                | Scanner Engine |
                | - matcher.go   |
                | - verifier.go  |
                +-------+--------+
                        |
       +----------------+----------------+
       |                |                |
+------v-------+ +------v-------+ +------v-------+
| Output       | | Dork         | | Key          |
| - table      | | Engine       | | Management   |
| - json       | | - 150 dorks  | | - list       |
| - sarif      | | - 8 sources  | | - show       |
| - csv        | |              | | - export     |
+--------------+ +--------------+ +--------------+

+------------------------------------------------+
| Provider Registry (108+ YAML providers)        |
| Dork Registry (150 YAML dorks)                 |
+------------------------------------------------+
Key Design Decisions
- YAML Providers -- Adding a new provider = adding a YAML file. No recompile needed for pattern-only changes (when using external provider dir). Built-in providers are embedded at compile time.
- Keyword Pre-filtering -- Before running regex, files are scanned for keywords via Aho-Corasick. This provides ~10x speedup on large codebases.
- Worker Pool -- Parallel scanning with configurable worker count via ants. Default: CPU count.
- Delta-based Git Scanning -- Only scans changes between commits, not entire trees.
- SQLite Storage -- All scan results persisted with AES-256 encryption.
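The worker-pool decision above can be illustrated with a small standard-library sketch. KeyHunter uses the ants library; plain goroutines and channels are swapped in here so the example has no third-party dependency, and `scanAll` with its callback signature is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// scanAll fans paths out to a fixed number of workers and sums the number
// of findings each scan reports.
func scanAll(paths []string, workers int, scan func(string) int) int {
	if workers <= 0 {
		workers = runtime.NumCPU() // default: CPU count
	}
	jobs := make(chan string)
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				results <- scan(p)
			}
		}()
	}

	// Feed jobs, then close the channel so workers exit their loops.
	go func() {
		for _, p := range paths {
			jobs <- p
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for n := range results {
		total += n
	}
	return total
}

func main() {
	paths := []string{"a.env", "config.yaml", "main.go"}
	// Stub scan function: pretend every file yields one finding.
	fmt.Println(scanAll(paths, 2, func(string) int { return 1 })) // 3
}
```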
Dork Examples (150 Built-in)
GitHub
filename:.env "OPENAI_API_KEY"
filename:.env "ANTHROPIC_API_KEY"
filename:config.yaml "api_key" "sk-"
"sk-proj-" language:python
"sk-ant-api03" language:javascript
filename:docker-compose "API_KEY"
"api_key" extension:ipynb
filename:.toml "api_key" "sk-"
filename:terraform.tfvars "api_key"
Google Dorking
"sk-proj-" -github.com -stackoverflow.com
"sk-ant-api03-" filetype:env
"OPENAI_API_KEY" filetype:yml
"ANTHROPIC_API_KEY" filetype:json
inurl:.env "API_KEY"
intitle:"index of" .env
site:pastebin.com "sk-proj-"
site:replit.com "OPENAI_API_KEY"
Shodan (for future IoT recon sources)
http.html:"openai" "api_key" port:8080
http.title:"LiteLLM" port:4000
http.html:"ollama" port:11434
http.title:"Kubernetes Dashboard"
Use Cases
Red Team / Pentest
# Multi-source recon against a target org
keyhunter recon full --sources=github,gitlab,gist,pastebin
# Scan a cloned repository
keyhunter scan ./target-repo/ --verify
# Scan git history for rotated keys
keyhunter scan --git ./target-repo/
DevSecOps / CI Pipeline
# Pre-commit hook
keyhunter hook install
# GitHub Actions step
- name: KeyHunter Scan
  run: keyhunter scan . --output=sarif > keyhunter.sarif
Bug Bounty
# Search code hosting platforms for leaked keys
keyhunter recon full --sources=github,gitlab,bitbucket,gist,codeberg
keyhunter recon full --sources=huggingface,kaggle,replit,codesandbox
# Search engine dorking
keyhunter recon full --sources=google,bing,duckduckgo,brave
# Paste site monitoring
keyhunter recon full --sources=pastebin,pastesites,gistpaste
Security & Ethics
Built-in Protections
- Key values masked by default in terminal (first 8 + last 4 chars) -- use --unmask for full keys
- Full keys always available via: --unmask, --output=json, keyhunter keys show
- Database is AES-256 encrypted (full keys stored encrypted)
- API tokens stored encrypted in config
- No key values written to logs during --verify
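The "first 8 + last 4 chars" masking rule can be sketched as below. The "..." filler and the handling of keys too short to mask safely are assumptions; the README only states which characters stay visible.

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey keeps the first 8 and last 4 characters and hides the rest.
// Keys of 12 characters or fewer are fully masked so nothing leaks.
// Byte indexing assumes ASCII keys, which holds for typical API key formats.
func maskKey(key string) string {
	const head, tail = 8, 4
	if len(key) <= head+tail {
		return strings.Repeat("*", len(key))
	}
	return key[:head] + "..." + key[len(key)-tail:]
}

func main() {
	fmt.Println(maskKey("sk-proj-abc123def456ghi789")) // sk-proj-...i789
	fmt.Println(maskKey("short"))                      // *****
}
```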
Rate Limiting (Recon Sources)
| Source | Rate Limit |
|---|---|
| GitHub API (auth) | 30 req/min |
| GitHub API (unauth) | 10 req/min |
| Google Custom Search | 100/day free, 10K/day paid |
| Bing Search | 1,000/month (free) |
| Brave Search | Per API plan |
| Paste sites | 1 req/2sec |
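To make the table concrete, the sketch below converts a request budget into the minimum spacing between consecutive requests. The function names are hypothetical, and a real limiter would also handle bursts and server-sent retry hints.

```go
package main

import (
	"fmt"
	"time"
)

// minInterval converts "requests per window" into the minimum spacing
// between consecutive requests.
func minInterval(requests int, window time.Duration) time.Duration {
	if requests <= 0 {
		return 0
	}
	return window / time.Duration(requests)
}

// perMinute is a convenience wrapper for per-minute budgets.
func perMinute(requests int) time.Duration {
	return minInterval(requests, time.Minute)
}

func main() {
	fmt.Println(perMinute(30))                  // GitHub API (auth): 2s
	fmt.Println(perMinute(10))                  // GitHub API (unauth): 6s
	fmt.Println(minInterval(1, 2*time.Second))  // Paste sites: 2s
	fmt.Println(minInterval(100, 24*time.Hour)) // Google free tier: 14m24s
}
```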
Contributing
Adding a New Provider
- Create providers/your-provider.yaml:
id: your-provider
name: Your Provider
category: emerging
website: https://api.yourprovider.com
confidence: medium
patterns:
  - id: your-provider-key
    name: "Your Provider API Key"
    regex: '\byp_[A-Za-z0-9]{32}\b'
    confidence: high
    description: "Your Provider API key with yp_ prefix"
keywords:
  - "yp_"
  - "YOUR_PROVIDER_API_KEY"
verify:
  enabled: true
  method: GET
  url: "https://api.yourprovider.com/v1/models"
  headers:
    Authorization: "Bearer {{key}}"
  success_codes: [200]
  failure_codes: [401, 403]
metadata:
  docs: "https://docs.yourprovider.com"
  key_url: "https://dashboard.yourprovider.com/keys"
  env_vars: ["YOUR_PROVIDER_API_KEY"]
- Run tests: go test ./pkg/provider/...
- Submit a PR
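Before submitting, it can help to sanity-check the pattern on its own. The sketch below compiles the regex from the example provider above with Go's standard regexp package and tries it on a few inputs; `findKeys` and the sample strings are made up for illustration and are not part of KeyHunter's test suite.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the example provider definition.
var ypKey = regexp.MustCompile(`\byp_[A-Za-z0-9]{32}\b`)

// findKeys returns every substring of text that matches the pattern.
func findKeys(text string) []string {
	return ypKey.FindAllString(text, -1)
}

func main() {
	valid := "YOUR_PROVIDER_API_KEY=yp_0123456789abcdefghijklmnopqrstuv"
	tooShort := "YOUR_PROVIDER_API_KEY=yp_tooshort"
	tooLong := "YOUR_PROVIDER_API_KEY=yp_0123456789abcdefghijklmnopqrstuvw"

	fmt.Println(findKeys(valid))         // one match
	fmt.Println(len(findKeys(tooShort))) // 0
	fmt.Println(len(findKeys(tooLong)))  // 0: the trailing \b rejects 33 characters
}
```

The word boundaries matter: without the trailing \b, the 33-character value would match on its first 32 characters and produce a truncated key.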
Adding a New Dork
- Edit dorks/<source>.yaml and add your dork entry
- Submit a PR
Roadmap
- Core scanning engine (file, dir, git, stdin, url, clipboard)
- 108 provider YAML definitions (Tier 1-9)
- Active verification (YAML-driven HTTPVerifier)
- Output formats: table, JSON, CSV, SARIF 2.1.0
- CLI with Cobra (scan, providers, config, keys, import, hook, dorks, recon, legal)
- TruffleHog & Gitleaks import adapters
- Key management (list, show, export, copy, delete, verify)
- Git pre-commit hook (install/uninstall)
- Dork engine with 150 built-in dorks across 8 sources
- OSINT recon framework with 18 live sources
- IoT scanners (Shodan, Censys, ZoomEye, FOFA, Netlas, BinaryEdge)
- Cloud storage scanning (S3, GCS, Azure, DigitalOcean)
- Package registries (npm, PyPI, RubyGems, crates.io, Maven, NuGet)
- Container & IaC scanning (Docker Hub, Terraform, Helm, Ansible)
- CI/CD log scanning (GitHub Actions, Travis, CircleCI, Jenkins, GitLab CI)
- Web archives (Wayback Machine, CommonCrawl)
- Frontend leak detection (source maps, webpack, .env exposure)
- Forums & collaboration tools (Stack Overflow, Reddit, Notion, Trello)
- Threat intel (VirusTotal, Intelligence X, URLhaus)
- Telegram bot with auto-notifications
- Scheduled scanning (cron-based)
- Web dashboard (htmx + Tailwind + SQLite)
- Docker image
- Homebrew formula
Disclaimer
KeyHunter is designed for authorized security testing, defensive security, bug bounty programs, and educational purposes only. Always ensure you have proper authorization before scanning any target. Unauthorized access to computer systems is illegal.
License
MIT License - see LICENSE for details.