merge: phase 14-03 frontend leaks
docs/superpowers/specs/2026-04-04-keyhunter-design.md
@@ -0,0 +1,556 @@

# KeyHunter - Design Specification

## Overview

KeyHunter is a comprehensive, modular API key scanner built in Go, focused on detecting and validating API keys from 100+ LLM/AI providers. It combines native scanning capabilities with external tool integration (TruffleHog, Gitleaks), OSINT/recon modules, a web dashboard, and Telegram bot notifications.

## Architecture

**Approach:** Plugin-based architecture. A core scanner engine loads providers defined as YAML files that are embedded at compile time, and everything ships as a single binary.

### Directory Structure

```
keyhunter/
├── cmd/keyhunter/              # CLI entrypoint (cobra)
├── pkg/
│   ├── engine/                 # Core scanning engine
│   │   ├── scanner.go          # Orchestrator: takes input, runs providers
│   │   ├── matcher.go          # Regex + entropy matching
│   │   └── verifier.go         # Active key verification (--verify flag)
│   ├── provider/               # Provider registry & loader
│   │   ├── registry.go         # Loads and manages providers
│   │   ├── types.go            # Provider interface definitions
│   │   └── builtin/            # Compile-time embedded provider YAML files
│   ├── input/                  # Input source adapters
│   │   ├── file.go             # File/directory scanning
│   │   ├── git.go              # Git history/diff scanning
│   │   ├── stdin.go            # Pipe/stdin support
│   │   ├── url.go              # URL fetch
│   │   └── remote.go           # GitHub/GitLab API, paste sites
│   ├── output/                 # Output formatters
│   │   ├── table.go            # Colored terminal table
│   │   ├── json.go             # JSON export
│   │   ├── sarif.go            # SARIF (CI/CD compatible)
│   │   └── csv.go              # CSV export
│   ├── adapter/                # External tool parsers
│   │   ├── trufflehog.go       # TruffleHog JSON output parser
│   │   └── gitleaks.go         # Gitleaks JSON output parser
│   ├── recon/                  # OSINT/Recon engine (80+ sources)
│   │   ├── engine.go           # Recon orchestrator
│   │   ├── ratelimit.go        # Rate limiting & politeness
│   │   │
│   │   │   # --- IoT & Internet Search Engines ---
│   │   ├── shodan.go           # Shodan API client
│   │   ├── censys.go           # Censys API client
│   │   ├── zoomeye.go          # ZoomEye (Chinese IoT scanner)
│   │   ├── fofa.go             # FOFA (Chinese IoT scanner)
│   │   ├── netlas.go           # Netlas.io (HTTP body search)
│   │   ├── binaryedge.go       # BinaryEdge scanner
│   │   │
│   │   │   # --- Code Hosting & Snippets ---
│   │   ├── github.go           # GitHub code search / dorks
│   │   ├── gitlab.go           # GitLab search
│   │   ├── gist.go             # GitHub Gist search
│   │   ├── bitbucket.go        # Bitbucket code search
│   │   ├── codeberg.go         # Codeberg/Gitea search
│   │   ├── gitea.go            # Self-hosted Gitea instances
│   │   ├── replit.go           # Replit public repls
│   │   ├── codesandbox.go      # CodeSandbox projects
│   │   ├── stackblitz.go       # StackBlitz projects
│   │   ├── codepen.go          # CodePen pens
│   │   ├── jsfiddle.go         # JSFiddle snippets
│   │   ├── glitch.go           # Glitch public projects
│   │   ├── observable.go       # Observable notebooks
│   │   ├── huggingface.go      # HuggingFace Spaces/repos
│   │   ├── kaggle.go           # Kaggle notebooks/datasets
│   │   ├── jupyter.go          # nbviewer / Jupyter notebooks
│   │   ├── gitpod.go           # Gitpod workspace snapshots
│   │   │
│   │   │   # --- Search Engine Dorking ---
│   │   ├── google.go           # Google Custom Search / SerpAPI dorking
│   │   ├── bing.go             # Bing Web Search API dorking
│   │   ├── duckduckgo.go       # DuckDuckGo search
│   │   ├── yandex.go           # Yandex XML Search
│   │   ├── brave.go            # Brave Search API
│   │   │
│   │   │   # --- Paste Sites ---
│   │   ├── paste.go            # Multi-paste aggregator (pastebin, dpaste, paste.ee, rentry, hastebin, ix.io, etc.)
│   │   │
│   │   │   # --- Package Registries ---
│   │   ├── npm.go              # npm registry scanning
│   │   ├── pypi.go             # PyPI package scanning
│   │   ├── rubygems.go         # RubyGems scanning
│   │   ├── crates.go           # crates.io (Rust)
│   │   ├── maven.go            # Maven Central (Java)
│   │   ├── nuget.go            # NuGet (.NET)
│   │   ├── packagist.go        # Packagist (PHP)
│   │   ├── goproxy.go          # Go module proxy
│   │   │
│   │   │   # --- Container & Infra ---
│   │   ├── docker.go           # Docker Hub image/layer scanning
│   │   ├── kubernetes.go       # Exposed K8s dashboards & configs
│   │   ├── terraform.go        # Terraform state files & registry
│   │   ├── helm.go             # Artifact Hub / Helm charts
│   │   ├── ansible.go          # Ansible Galaxy collections
│   │   │
│   │   │   # --- Cloud Storage ---
│   │   ├── s3.go               # AWS S3 bucket enumeration
│   │   ├── gcs.go              # Google Cloud Storage buckets
│   │   ├── azureblob.go        # Azure Blob Storage
│   │   ├── spaces.go           # DigitalOcean Spaces
│   │   ├── backblaze.go        # Backblaze B2
│   │   ├── minio.go            # Self-hosted MinIO instances
│   │   ├── grayhat.go          # GrayHatWarfare (bucket search engine)
│   │   │
│   │   │   # --- CI/CD Log Leaks ---
│   │   ├── travisci.go         # Travis CI public build logs
│   │   ├── circleci.go         # CircleCI build logs
│   │   ├── ghactions.go        # GitHub Actions workflow logs
│   │   ├── jenkins.go          # Exposed Jenkins instances
│   │   ├── gitlabci.go         # GitLab CI/CD pipeline logs
│   │   │
│   │   │   # --- Web Archives ---
│   │   ├── wayback.go          # Wayback Machine CDX API
│   │   ├── commoncrawl.go      # CommonCrawl index & WARC
│   │   │
│   │   │   # --- Forums & Documentation ---
│   │   ├── stackoverflow.go    # Stack Overflow / Stack Exchange API
│   │   ├── reddit.go           # Reddit search
│   │   ├── hackernews.go       # HN Algolia API
│   │   ├── devto.go            # dev.to articles
│   │   ├── medium.go           # Medium articles
│   │   ├── telegram_recon.go   # Telegram public channels
│   │   ├── discord.go          # Discord indexed content
│   │   │
│   │   │   # --- Collaboration Tools ---
│   │   ├── notion.go           # Notion public pages
│   │   ├── confluence.go       # Confluence public spaces
│   │   ├── trello.go           # Trello public boards
│   │   ├── googledocs.go       # Google Docs/Sheets public
│   │   │
│   │   │   # --- Frontend & JS Leaks ---
│   │   ├── sourcemaps.go       # JS source map extraction
│   │   ├── webpack.go          # Webpack/Vite bundle scanning
│   │   ├── dotenv_web.go       # Exposed .env files on web servers
│   │   ├── swagger.go          # Exposed Swagger/OpenAPI docs
│   │   ├── deploys.go          # Vercel/Netlify preview deployments
│   │   │
│   │   │   # --- Log Aggregators ---
│   │   ├── elasticsearch.go    # Exposed Elasticsearch/Kibana
│   │   ├── grafana.go          # Exposed Grafana dashboards
│   │   ├── sentry.go           # Exposed Sentry instances
│   │   │
│   │   │   # --- Threat Intelligence ---
│   │   ├── virustotal.go       # VirusTotal file/URL search
│   │   ├── intelx.go           # Intelligence X aggregated search
│   │   ├── urlhaus.go          # URLhaus abuse.ch
│   │   │
│   │   │   # --- Mobile Apps ---
│   │   ├── apk.go              # APK download & decompile scanning
│   │   │
│   │   │   # --- DNS/Subdomain ---
│   │   ├── crtsh.go            # Certificate Transparency (crt.sh)
│   │   ├── subdomain.go        # Subdomain config endpoint probing
│   │   │
│   │   │   # --- API Marketplaces ---
│   │   ├── postman.go          # Postman public collections/workspaces
│   │   ├── swaggerhub.go       # SwaggerHub published APIs
│   │   └── rapidapi.go         # RapidAPI public endpoints
│   │
│   ├── dorks/                  # Dork management
│   │   ├── loader.go           # YAML dork loader
│   │   ├── runner.go           # Dork execution engine
│   │   └── builtin/            # Embedded dork YAML files
│   ├── notify/                 # Notification modules
│   │   ├── telegram.go         # Telegram bot
│   │   ├── webhook.go          # Generic webhook
│   │   └── slack.go            # Slack
│   └── web/                    # Web dashboard
│       ├── server.go           # Embedded HTTP server
│       ├── api.go              # REST API
│       └── static/             # Frontend assets (htmx + tailwind)
├── providers/                  # Provider YAML definitions (embedded)
│   ├── openai.yaml
│   ├── anthropic.yaml
│   └── ... (108 providers)
├── dorks/                      # Dork YAML definitions (embedded)
│   ├── github.yaml             # GitHub code search dorks
│   ├── gitlab.yaml             # GitLab search dorks
│   ├── shodan.yaml             # Shodan IoT dorks
│   ├── censys.yaml             # Censys dorks
│   ├── zoomeye.yaml            # ZoomEye dorks
│   ├── fofa.yaml               # FOFA dorks
│   ├── google.yaml             # Google dorking queries
│   ├── bing.yaml               # Bing dorking queries
│   └── generic.yaml            # Multi-source keyword dorks
├── configs/                    # Example config files
└── docs/
```
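
The tree lists `matcher.go` as "Regex + entropy matching" without saying how entropy is used. One plausible reading, sketched below, is a Shannon-entropy check applied to each regex candidate so that low-randomness strings are discarded. The function names and the `minEntropy` cutoff are assumptions made for illustration; the spec does not define them.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per byte.
// Random-looking key material scores high; repetitive text scores low.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// minEntropy is a hypothetical cutoff, not a value taken from the spec.
const minEntropy = 3.0

// looksRandom reports whether a regex candidate clears the entropy cutoff.
func looksRandom(candidate string) bool {
	return shannonEntropy(candidate) >= minEntropy
}

func main() {
	for _, s := range []string{"aaaaaaaaaaaaaaaa", "q8Zr1LxT0vNc5KdW"} {
		fmt.Printf("%-18s entropy=%.2f random=%v\n", s, shannonEntropy(s), looksRandom(s))
	}
}
```

A real cutoff would likely need tuning per provider, since key alphabets (hex, base62, base64) have different maximum entropies.
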

### Data Flow

```
Input Source -> Scanner Engine -> Provider Matcher -> (optional) Verifier -> Output Formatter + Notifier
                                                                          -> SQLite DB (persist)
                                                                          -> Web Dashboard (serve)
```

## Provider YAML Schema

```yaml
id: string              # Unique provider ID
name: string            # Display name
category: enum          # frontier | mid-tier | emerging | chinese | infrastructure | gateway | self-hosted
website: string         # API base URL
confidence: enum        # high | medium | low

patterns:
  - id: string          # Unique pattern ID
    name: string        # Human-readable name
    regex: string       # Detection regex
    confidence: enum    # high | medium | low
    description: string # Pattern description

keywords: []string      # Pre-filtering keywords (performance optimization)

verify:
  enabled: bool
  method: string        # HTTP method
  url: string           # Verification endpoint
  headers: map          # Headers with {{key}} template
  success_codes: []int
  failure_codes: []int
  extract:              # Additional info extraction on success
    - field: string
      path: string      # JSON path

metadata:
  docs: string          # API docs URL
  key_url: string       # Key management URL
  env_vars: []string    # Common environment variable names
  revoke_url: string    # Key revocation URL
```
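
To make the `verify` block more concrete, here is a minimal Go sketch of how the `{{key}}` header template and the `success_codes` / `failure_codes` lists might be applied. The `VerifySpec` struct, the function names, the "valid/invalid/unknown" labels, and the example endpoint are all assumptions for illustration; only the YAML keys come from the schema above.

```go
package main

import (
	"fmt"
	"strings"
)

// VerifySpec mirrors a subset of the `verify` block in the schema.
// The Go field names are hypothetical.
type VerifySpec struct {
	Method       string
	URL          string
	Headers      map[string]string
	SuccessCodes []int
	FailureCodes []int
}

// renderHeaders substitutes the {{key}} placeholder in every header value.
func renderHeaders(headers map[string]string, key string) map[string]string {
	out := make(map[string]string, len(headers))
	for name, value := range headers {
		out[name] = strings.ReplaceAll(value, "{{key}}", key)
	}
	return out
}

// classify maps an HTTP status code to "valid", "invalid", or "unknown".
// Treating unlisted codes as "unknown" is an assumption of this sketch.
func classify(spec VerifySpec, status int) string {
	for _, c := range spec.SuccessCodes {
		if c == status {
			return "valid"
		}
	}
	for _, c := range spec.FailureCodes {
		if c == status {
			return "invalid"
		}
	}
	return "unknown"
}

func main() {
	spec := VerifySpec{
		Method:       "GET",
		URL:          "https://api.example.com/v1/models", // placeholder endpoint
		Headers:      map[string]string{"Authorization": "Bearer {{key}}"},
		SuccessCodes: []int{200},
		FailureCodes: []int{401, 403},
	}
	headers := renderHeaders(spec.Headers, "sk-demo-key")
	fmt.Println(headers["Authorization"])
	fmt.Println(classify(spec, 401), classify(spec, 429))
}
```

Keeping a third "unknown" outcome avoids marking a key as revoked when the provider merely rate-limits or errors out.
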

## CLI Command Structure

### Core Commands

```bash
# Scanning
keyhunter scan path <dir>
keyhunter scan file <file>
keyhunter scan git <repo> [--since=<duration>]
keyhunter scan stdin
keyhunter scan url <url>
keyhunter scan clipboard

# Verification
keyhunter verify <key>
keyhunter verify --file <keyfile>

# External Tool Import
keyhunter import trufflehog <json>
keyhunter import gitleaks <json>
keyhunter import generic --format=csv <file>

# OSINT/Recon - IoT & Internet Scanners
keyhunter recon shodan [--query|--dork]
keyhunter recon censys [--query]
keyhunter recon zoomeye [--query]
keyhunter recon fofa [--query]
keyhunter recon netlas [--query]
keyhunter recon binaryedge [--query]

# OSINT/Recon - Code Hosting & Snippets
keyhunter recon github [--dork=auto|custom]
keyhunter recon gitlab [--dork=auto|custom]
keyhunter recon gist [--query]
keyhunter recon bitbucket [--query|--workspace]
keyhunter recon codeberg [--query]
keyhunter recon gitea [--instances-from=shodan|file]
keyhunter recon replit [--query]
keyhunter recon codesandbox [--query]
keyhunter recon stackblitz [--query]
keyhunter recon codepen [--query]
keyhunter recon jsfiddle [--query]
keyhunter recon glitch [--query]
keyhunter recon huggingface [--query|--spaces|--repos]
keyhunter recon kaggle [--query|--notebooks]
keyhunter recon jupyter [--query]
keyhunter recon observable [--query]

# OSINT/Recon - Search Engine Dorking
keyhunter recon google [--dork=auto|custom]
keyhunter recon bing [--dork=auto|custom]
keyhunter recon duckduckgo [--query]
keyhunter recon yandex [--query]
keyhunter recon brave [--query]

# OSINT/Recon - Paste Sites
keyhunter recon paste [--sources=pastebin,dpaste,paste.ee,rentry,hastebin,ix.io,all]

# OSINT/Recon - Package Registries
keyhunter recon npm [--query|--recent]
keyhunter recon pypi [--query|--recent]
keyhunter recon rubygems [--query]
keyhunter recon crates [--query]
keyhunter recon maven [--query]
keyhunter recon nuget [--query]
keyhunter recon packagist [--query]
keyhunter recon goproxy [--query]

# OSINT/Recon - Container & Infrastructure
keyhunter recon docker [--query|--image|--layers]
keyhunter recon kubernetes [--shodan|--github]
keyhunter recon terraform [--github|--registry]
keyhunter recon helm [--query]
keyhunter recon ansible [--query]

# OSINT/Recon - Cloud Storage
keyhunter recon s3 [--wordlist|--domain]
keyhunter recon gcs [--wordlist|--domain]
keyhunter recon azure [--wordlist|--domain]
keyhunter recon spaces [--wordlist]
keyhunter recon minio [--shodan]
keyhunter recon grayhat [--query]              # GrayHatWarfare bucket search

# OSINT/Recon - CI/CD Logs
keyhunter recon travis [--org|--repo]
keyhunter recon circleci [--org|--repo]
keyhunter recon ghactions [--org|--repo]
keyhunter recon jenkins [--shodan|--url]
keyhunter recon gitlabci [--project]

# OSINT/Recon - Web Archives
keyhunter recon wayback [--domain|--url]
keyhunter recon commoncrawl [--domain|--pattern]

# OSINT/Recon - Forums & Documentation
keyhunter recon stackoverflow [--query]
keyhunter recon reddit [--query|--subreddit]
keyhunter recon hackernews [--query]
keyhunter recon devto [--query|--tag]
keyhunter recon medium [--query]
keyhunter recon telegram-groups [--channel|--query]

# OSINT/Recon - Collaboration Tools
keyhunter recon notion [--query]               # via Google dorking
keyhunter recon confluence [--shodan|--url]
keyhunter recon trello [--query]
keyhunter recon googledocs [--query]           # via Google dorking

# OSINT/Recon - Frontend & JS Leaks
keyhunter recon sourcemaps [--domain|--url]
keyhunter recon webpack [--domain|--url]
keyhunter recon dotenv [--domain-list|--url]   # Exposed .env files
keyhunter recon swagger [--shodan|--domain]
keyhunter recon deploys [--domain]             # Vercel/Netlify previews

# OSINT/Recon - Log Aggregators
keyhunter recon elasticsearch [--shodan|--url]
keyhunter recon grafana [--shodan|--url]
keyhunter recon sentry [--shodan|--url]

# OSINT/Recon - Threat Intelligence
keyhunter recon virustotal [--query]
keyhunter recon intelx [--query]
keyhunter recon urlhaus [--query]

# OSINT/Recon - Mobile Apps
keyhunter recon apk [--package|--query|--file]

# OSINT/Recon - DNS/Subdomain
keyhunter recon crtsh [--domain]
keyhunter recon subdomain [--domain] [--probe-configs]

# OSINT/Recon - API Marketplaces
keyhunter recon postman [--query|--workspace]
keyhunter recon swaggerhub [--query]

# OSINT/Recon - Full Sweep
keyhunter recon full [--providers] [--categories=all|code|cloud|forums|cicd|...]

# Dork Management
keyhunter dorks list [--source]
keyhunter dorks add <source> <query>
keyhunter dorks run <source> [--category]
keyhunter dorks export

# Key Management (full key access)
keyhunter keys list [--unmask] [--provider=X] [--status=active|revoked]
keyhunter keys show <id>
keyhunter keys export --format=json|csv
keyhunter keys copy <id>
keyhunter keys verify <id>
keyhunter keys delete <id>

# Provider Management
keyhunter providers list [--category]
keyhunter providers info <id>
keyhunter providers stats

# Web Dashboard & Telegram
keyhunter serve [--port] [--telegram]

# Scheduled Scanning
keyhunter schedule add --name --cron --command --notify
keyhunter schedule list
keyhunter schedule remove <name>

# Config & Hooks
keyhunter config init
keyhunter config set <key> <value>
keyhunter hook install
keyhunter hook uninstall
```

### Scan Flags

```
--providers=<list>      Filter by provider IDs
--category=<cat>        Filter by provider category
--confidence=<level>    Minimum confidence level
--exclude=<patterns>    Exclude file patterns
--verify                Enable active key verification
--verify-timeout=<dur>  Verification timeout (default: 10s)
--workers=<n>           Parallel workers (default: CPU count)
--output=<format>       Output format: table|json|sarif|csv
--unmask                Show full API keys without masking (default: masked)
--notify=<channel>      Send results to: telegram|webhook|slack
--stealth               Stealth mode: UA rotation, increased delays
--respect-robots        Respect robots.txt (default: true)
```
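
`--confidence=<level>` is described as a minimum, which implies an ordering over the `high | medium | low` values from the provider schema. The sketch below shows one way that comparison could work; the function names and the choice to reject unrecognized levels are assumptions, not part of the spec.

```go
package main

import "fmt"

// confidenceRank orders the three schema levels; unknown levels rank 0.
func confidenceRank(level string) int {
	switch level {
	case "high":
		return 3
	case "medium":
		return 2
	case "low":
		return 1
	}
	return 0
}

// meetsMinimum reports whether a finding's confidence satisfies the
// threshold given via --confidence. Unknown finding levels never pass.
func meetsMinimum(finding, minimum string) bool {
	rank := confidenceRank(finding)
	return rank > 0 && rank >= confidenceRank(minimum)
}

func main() {
	for _, level := range []string{"low", "medium", "high"} {
		fmt.Printf("%-6s passes --confidence=medium: %v\n", level, meetsMinimum(level, "medium"))
	}
}
```
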

### Exit Codes

- `0` - Clean, no keys found
- `1` - Keys found
- `2` - Error
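
A small Go sketch of how the three exit codes might be derived from a scan result, useful when wiring the tool into CI. The constant and function names are hypothetical, and the rule that an error takes precedence over findings is an assumption the spec does not state.

```go
package main

import (
	"errors"
	"fmt"
)

// Exit codes as listed in the spec.
const (
	exitClean     = 0
	exitKeysFound = 1
	exitError     = 2
)

// errScanFailed is a stand-in error used only by this example.
var errScanFailed = errors.New("scan failed")

// exitCode picks the process exit status for a finished scan.
// Assumption: an error wins over findings, so partial results
// from a failed run still exit with 2.
func exitCode(findings int, err error) int {
	if err != nil {
		return exitError
	}
	if findings > 0 {
		return exitKeysFound
	}
	return exitClean
}

func main() {
	fmt.Println(exitCode(0, nil), exitCode(3, nil), exitCode(3, errScanFailed))
}
```

In a real entrypoint the returned value would be passed to `os.Exit`, so a pre-commit hook or CI job fails whenever keys are found.
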

## Dork YAML Schema

```yaml
source: string          # github | gitlab | shodan | censys
dorks:
  - id: string
    query: string       # Search query
    description: string
    providers: []string # Optional: related provider IDs
```

Built-in dork categories: GitHub (code search, filename, language), GitLab (snippets, projects), Shodan (exposed proxies, dashboards), Censys (HTTP body search).

## Web Dashboard

**Stack:** Go embed + htmx + Tailwind CSS (zero JS framework dependency)

**Pages:**
- `/` - Dashboard overview with summary statistics
- `/scans` - Scan history list
- `/scans/:id` - Scan detail with found keys
- `/keys` - All found keys (filterable table)
- `/keys/:id` - Key detail (provider, confidence, verify status)
- `/recon` - OSINT scan launcher and results
- `/providers` - Provider list and statistics
- `/dorks` - Dork management
- `/settings` - Configuration (tokens, API keys)
- `/api/v1/*` - REST API for programmatic access

**Storage:** SQLite (embedded, AES-256 encrypted)

## Telegram Bot

**Commands:**
- `/scan <url|path>` - Remote scan trigger
- `/verify <key>` - Key verification
- `/recon github <dork>` - GitHub dork execution
- `/key <id>` - Full key detail (private chat only)
- `/status` - Active scan status
- `/stats` - General statistics
- `/subscribe` - Auto-notification on new key findings
- `/unsubscribe` - Disable notifications
- `/providers` - Provider list
- `/help` - Help

**Auto-notifications:** New key found, recon complete, scheduled scan results, verify results.

## LLM Provider Coverage (108 Providers)

### Tier 1 - Frontier (12)
OpenAI, Anthropic, Google AI (Gemini), Google Vertex AI, AWS Bedrock, Azure OpenAI, Meta AI (Llama API), xAI (Grok), Cohere, Mistral AI, Inflection AI, AI21 Labs

### Tier 2 - Inference Platforms (14)
Together AI, Fireworks AI, Groq, Replicate, Anyscale, DeepInfra, Lepton AI, Modal, Baseten, Cerebrium, NovitaAI, SambaNova, OctoAI, Friendli AI

### Tier 3 - Specialized/Vertical (12)
Perplexity, You.com, Voyage AI, Jina AI, Unstructured, AssemblyAI, Deepgram, ElevenLabs, Stability AI, Runway ML, Midjourney, HuggingFace

### Tier 4 - Chinese/Regional (16)
DeepSeek, Baichuan, Zhipu AI (GLM), Moonshot AI (Kimi), Yi (01.AI), Qwen (Alibaba Cloud), Baidu (ERNIE/Wenxin), ByteDance (Doubao), SenseTime, iFlytek (Spark), MiniMax, Stepfun, 360 AI, Kuaishou (Kling), Tencent Hunyuan, SiliconFlow

### Tier 5 - Infrastructure/Gateway (11)
Cloudflare AI, Vercel AI, LiteLLM, Portkey, Helicone, OpenRouter, Martian, AI Gateway (Kong), BricksAI, Aether, Not Diamond

### Tier 6 - Emerging/Niche (15)
Reka AI, Aleph Alpha, Writer, Jasper AI, Typeface, Comet ML, Weights & Biases, LangSmith (LangChain), Pinecone, Weaviate, Qdrant, Chroma, Milvus, Neon AI, Lamini

### Tier 7 - Code & Dev Tools (10)
GitHub Copilot, Cursor, Tabnine, Codeium/Windsurf, Sourcegraph Cody, Amazon CodeWhisperer, Replit AI, Codestral (Mistral), IBM watsonx.ai, Oracle AI

### Tier 8 - Self-Hosted/Open Infra (10)
Ollama, vLLM, LocalAI, LM Studio, llama.cpp, GPT4All, text-generation-webui, TensorRT-LLM, Triton Inference Server, Jan AI

### Tier 9 - Enterprise/Legacy (8)
Salesforce Einstein, ServiceNow AI, SAP AI Core, Palantir AIP, Databricks (DBRX), Snowflake Cortex, Oracle Generative AI, HPE GreenLake AI

## Performance

- Worker pool: parallel scanning (default: CPU count, configurable via `--workers=N`)
- Keyword pre-filtering before regex, so the regex only runs on inputs that contain a provider keyword (targeting roughly a 10x speedup on large files)
- `mmap` for large file reading
- Delta-based git scanning (only changed files between commits)
- Per-source rate limiting in the recon module
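
The keyword pre-filter in the list above can be sketched as a cheap `bytes.Contains` pass that gates the more expensive regex. The `pattern` type, `findKeys`, and the `sk-demo-` pattern are made up for illustration and do not correspond to any real provider definition.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// pattern pairs a provider's pre-filter keywords with its detection regex,
// loosely following the `keywords` and `patterns` fields of the schema.
type pattern struct {
	keywords [][]byte
	re       *regexp.Regexp
}

// findKeys runs the regex only when at least one keyword occurs in data.
func findKeys(data []byte, p pattern) []string {
	hit := false
	for _, kw := range p.keywords {
		if bytes.Contains(data, kw) {
			hit = true
			break
		}
	}
	if !hit {
		return nil // skip the regex entirely
	}
	var out []string
	for _, m := range p.re.FindAll(data, -1) {
		out = append(out, string(m))
	}
	return out
}

// examplePattern is a hypothetical provider pattern used for the demo.
var examplePattern = pattern{
	keywords: [][]byte{[]byte("sk-demo-")},
	re:       regexp.MustCompile(`sk-demo-[A-Za-z0-9]{16}`),
}

func main() {
	fmt.Println(findKeys([]byte("token=sk-demo-ABCDEFGH12345678 end"), examplePattern))
	fmt.Println(len(findKeys([]byte("nothing to see here"), examplePattern)))
}
```

With 108 providers, most files contain none of the keywords, so the substring scan rejects them before any regex is evaluated.
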

## Key Visibility & Access

Full (unmasked) API keys are accessible through multiple channels:

1. **CLI `--unmask` flag** - `keyhunter scan path . --unmask` shows full keys in the terminal table
2. **JSON/CSV/SARIF export** - Always contains full keys: `keyhunter scan path . --output=json`
3. **`keyhunter keys` command** - Dedicated key management:
   - `keyhunter keys list` - all found keys (masked by default)
   - `keyhunter keys list --unmask` - all found keys (full)
   - `keyhunter keys show <id>` - single key full detail (always unmasked)
   - `keyhunter keys export --format=json` - export all keys with full values
   - `keyhunter keys copy <id>` - copy full key to clipboard
   - `keyhunter keys verify <id>` - verify and show full detail
4. **Web Dashboard** - `/keys/:id` detail page with "Reveal Key" toggle button (auth required)
5. **Telegram Bot** - `/key <id>` returns full key detail in private chat
6. **SQLite DB** - Full keys always stored (encrypted), queryable via API

Default behavior: keys are masked in the terminal to protect against shoulder surfing.
When you need the real key (to test, verify, or report), use `--unmask`, a JSON export, or `keys show`.

## Security

- Key masking in terminal output by default (first 8 + last 4 chars, middle `***`)
- `--unmask` flag to reveal full keys when needed
- SQLite database AES-256 encrypted (full keys stored encrypted)
- Telegram/Shodan tokens encrypted in config
- No key values written to logs during `--verify`
- Optional basic auth / token auth for web dashboard
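
The masking rule in the first bullet (first 8 and last 4 characters kept, `***` in the middle) could be implemented roughly as follows. The `maskKey` name and the handling of keys too short to mask safely are assumptions; the spec does not say what happens to short keys.

```go
package main

import "fmt"

// maskKey keeps the first 8 and last 4 characters and replaces the middle
// with "***". Keys of 12 characters or fewer are fully masked, since showing
// head and tail would reveal the whole value (an assumption of this sketch).
// Keys are treated as ASCII, so byte offsets equal character offsets.
func maskKey(key string) string {
	const head, tail = 8, 4
	if len(key) <= head+tail {
		return "***"
	}
	return key[:head] + "***" + key[len(key)-tail:]
}

func main() {
	fmt.Println(maskKey("sk-demo-ABCDEFGH12345678"))
	fmt.Println(maskKey("short"))
}
```
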

## Rate Limiting & Ethics

- GitHub API: 30 req/min (auth), 10 req/min (unauth)
- Shodan/Censys: respect API plan limits
- Paste sites: 1 request per 2 seconds politeness delay
- `--stealth` flag: User-Agent rotation, increased request spacing
- `--respect-robots`: robots.txt compliance (default: on)
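
As an illustration of the politeness delay, the sketch below spaces requests to a single source at a fixed interval. The `politeLimiter` type and its methods are hypothetical; the clock is passed in explicitly so the behavior can be checked without sleeping.

```go
package main

import (
	"fmt"
	"time"
)

// politeLimiter spaces requests to one source at a fixed interval.
type politeLimiter struct {
	interval time.Duration
	next     time.Time // earliest time the next request may be sent
}

// newPasteLimiter uses the 1 request / 2 seconds delay listed for paste sites.
func newPasteLimiter() *politeLimiter {
	return &politeLimiter{interval: 2 * time.Second}
}

// reserve books the next free slot and returns how long the caller
// should wait, measured from now, before sending its request.
func (l *politeLimiter) reserve(now time.Time) time.Duration {
	if l.next.Before(now) {
		l.next = now
	}
	wait := l.next.Sub(now)
	l.next = l.next.Add(l.interval)
	return wait
}

// demoStart is a fixed timestamp so the example is deterministic.
var demoStart = time.Unix(0, 0)

func main() {
	l := newPasteLimiter()
	for i := 0; i < 3; i++ {
		fmt.Println(l.reserve(demoStart)) // three requests arriving at once
	}
}
```

A caller would `time.Sleep` for the returned duration before each request. This sketch is not safe for concurrent use; a shared limiter would need a mutex.
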

## Error Handling

- Verify timeout: 10s default, configurable
- Network errors: 3 retries with exponential backoff
- Partial results: failed sources don't block others
- Graceful degradation on all external dependencies
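
The retry policy above can be sketched as one initial attempt followed by up to three retries with doubling delays. The `withRetry` signature, the injected `sleep` function, and the 500 ms base delay in the demo are assumptions; the spec gives only the retry count and the backoff strategy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a stand-in for a recoverable network error.
var errTransient = errors.New("transient network error")

// withRetry runs op once, then retries up to maxRetries times while it
// keeps failing, sleeping base, 2*base, 4*base, ... between attempts.
func withRetry(maxRetries int, base time.Duration, sleep func(time.Duration), op func() error) error {
	err := op()
	for i := 0; i < maxRetries && err != nil; i++ {
		sleep(base << i)
		err = op()
	}
	return err
}

// recordedDelays and recordSleep replace time.Sleep so the demo finishes
// instantly; real code would pass time.Sleep instead.
var recordedDelays []time.Duration

func recordSleep(d time.Duration) { recordedDelays = append(recordedDelays, d) }

func main() {
	calls := 0
	err := withRetry(3, 500*time.Millisecond, recordSleep, func() error {
		calls++
		if calls < 3 {
			return errTransient // fail twice, then succeed
		}
		return nil
	})
	fmt.Println(calls, recordedDelays, err)
}
```

Production code would usually add jitter to the delay and retry only on errors known to be transient.
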