Bellingcat Toolkit — Own Repos (Tools They Built)
The bellingcat-osint-toolkit skill's main catalog (data/all-tools.csv + 12
category refs) lists tools Bellingcat curates. This reference covers
tools Bellingcat built and ships as code — 46 active non-fork repos
across the GitHub org, sorted by use case.
Source: https://github.com/orgs/bellingcat/repositories Updated: 2026-05-02. Re-pull with
curl -s "https://api.github.com/orgs/bellingcat/repos?per_page=100".
Power tools — install and use
auto-archiver — bulk web/social-media preservation
1073★ Python. Personas: scribe, oracle, herald, sentinel.
Multi-source archiver: pulls URLs from CSV / Google Sheets / CLI, archives videos, images, social-media posts, webpages, and writes status back to the source spreadsheet. Storage backends: local, S3, Google Drive.
# Pip
pip install auto-archiver
auto-archiver --help
# Docker (preferred for heavy enrichers)
docker pull bellingcat/auto-archiver
# Bind-mount ./secrets (a bare "secrets:" would create an empty named volume)
docker run -it --rm -v "$PWD/secrets:/app/secrets" bellingcat/auto-archiver \
  --config secrets/orchestration.yaml
Companions:
- auto-archiver-api (13★): REST API to manage users / sheets / URLs and dispatch workers
- auto-archiver-extension (3★): browser extension front-end
- auto-archiver-setup-tool (11★): Vue front-end for the API
Docs: https://auto-archiver.readthedocs.io/.
When to reach for it: an investigation needs durable preservation of dozens-to-thousands of URLs (Telegram channels going dark, breaking-news videos before takedown, court-evidence chain).
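The read, archive, write-back loop described above can be pictured with a small sketch. This is not auto-archiver's code: the `process_sheet` helper, the `url` / `status` column names, and the `fake_archiver` stub are hypothetical stand-ins chosen for illustration.

```python
import csv
import io
from typing import Callable


def process_sheet(csv_text: str, archive_url: Callable[[str], str]) -> str:
    """Archive every row that has no status yet and return the updated CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if "status" not in fieldnames:
        fieldnames.append("status")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if not row.get("status"):  # skip rows already processed
            try:
                row["status"] = archive_url(row["url"])
            except Exception as exc:  # record the failure instead of aborting the batch
                row["status"] = f"error: {exc}"
        writer.writerow(row)
    return out.getvalue()


def fake_archiver(url: str) -> str:
    """Hypothetical stand-in for a real archiving call."""
    if "bad" in url:
        raise ValueError("unreachable")
    return "archived"


sheet = (
    "url,status\n"
    "https://example.org/a,\n"
    "https://example.org/bad,\n"
    "https://example.org/c,archived\n"
)
updated = process_sheet(sheet, fake_archiver)
print(updated)
```

Writing the status next to each URL makes the run resumable: a second pass only touches rows that are still empty.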
octosuite — GitHub OSINT CLI + Python lib
1892★ Python. Personas: oracle, sentinel, neo.
Terminal toolkit for GitHub data analysis. CLI + interactive TUI + Python library all from the same package.
pip install octosuite
# CLI
octosuite user torvalds # profile
octosuite user torvalds --repos --per-page 50 # all repos
octosuite user torvalds --followers --json
octosuite repo torvalds/linux --commits
octosuite repo torvalds/linux --stargazers --export ./data
octosuite org github --members --json
octosuite search "supply chain attack" --repos
octosuite -t # interactive TUI
# Library
import octosuite
user = octosuite.User("torvalds")
exists, profile = user.exists()
if exists:
    repos = user.repos(page=1, per_page=100)
When to reach for it: profiling a threat-actor's GitHub footprint, finding unpublished commits in an org, supply-chain audit on a maintainer, triangulating an alias across GH events.
telegram-phone-number-checker — phone → Telegram correlation
1695★ Python. Personas: oracle, wraith, frodo.
Given a phone number (or batch), check whether it's bound to a Telegram account. A useful people-search pivot when a lead's full number, including country code, is known.
pip install telegram-phone-number-checker
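# Single number, then a comma-separated batch written to JSON (confirm flags with --help)
telegram-phone-number-checker --phone-numbers +12025550101
telegram-phone-number-checker --phone-numbers +12025550101,+12025550102 --output results.json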
Requires Telegram API credentials (api_id + api_hash from https://my.telegram.org). Rate-limited; use moderately to avoid bans.
snscrape — multi-platform social network scraper
346★ Python. Personas: oracle, frodo, ghost.
Covers Twitter (deprecated), Mastodon, Telegram, Reddit, Facebook, VK, Instagram, Weibo, etc. Bellingcat maintains a fork; many platforms broke after upstream changes, so check repo status before relying on it.
pip install snscrape
snscrape twitter-user elonmusk
snscrape --max-results 100 telegram-channel durov   # global options go before the scraper name
vk-url-scraper — VKontakte (Russian social) scraping
53★ Python. Personas: oracle, ghost, frodo (russia).
pip install vk-url-scraper
vk-url-scraper --help
Library API also available. Useful for VK posts, photos, geotagged content, group enumeration.
whisperbox-transcribe — Whisper audio/video transcription API
67★ Python. Personas: scribe, herald, oracle.
Deploy Whisper as a service. Submit a video/audio URL, get a transcript + translation. Useful when an investigation accumulates hours of foreign-language broadcasts or Telegram voice notes.
git clone https://github.com/bellingcat/whisperbox-transcribe
cd whisperbox-transcribe
docker compose up -d
curl -X POST -F "url=https://..." http://localhost:8000/jobs
EDGAR — SEC corporate-data Python lib
203★ Python. Personas: ledger, frodo.
Programmatic interface to SEC EDGAR — public filings, financials, ownership.
pip install edgar-tool
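edgar-tool --help
# Full-text search of filings; flags per the project README, confirm with --help
edgar-tool text_search "Apple Inc" --start_date "2024-01-01" --output results.csv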
Used for sanction-screening, insider trading patterns, beneficial-ownership chains, ESG.
Geolocation toolbox
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| ShadowFinder | 570 | Python | Map locations where a shadow of given length could occur at date/time | oracle, frodo, centurion |
| instagram-location-search | 679 | Python | Find Instagram location IDs near (lat, lon) | oracle, frodo |
| osm-search | 207 | Vue | OpenStreetMap proximity search UI | oracle, frodo |
| geoclustering | 45 | Python | Cluster a list of (lat,lon) points; CLI | oracle, frodo, marshal |
| search-grid-generator | 13 | Vue | Quickly generate KML search grids for area-of-interest mapping | oracle, marshal |
| ColourHighlighter | 4 | TypeScript | WebGL color filters / LUTs for screen-share geolocation | oracle (geo-analyst) |
| rgb-viz | 4 | JavaScript | Interactive viz of an image's R/G/B channels | oracle (forensics) |
pip install shadowfinder
# Arguments: object height, shadow length, date, time (confirm with --help)
shadowfinder find 10 5 2024-02-29 12:00:00
pip install bellingcat-instagram-location-search
ig-location-search --lat 40.7128 --lon -74.0060
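The geometry behind shadow-based geolocation is short enough to sketch. An object's height and its shadow length fix the sun's elevation angle; a tool like ShadowFinder then has to find every point on Earth where the sun sits at that elevation at the given date and time. The sketch below covers only the first step, and `sun_elevation_deg` is a hypothetical helper, not part of ShadowFinder's API.

```python
import math


def sun_elevation_deg(object_height: float, shadow_length: float) -> float:
    """Sun elevation (degrees above the horizon) implied by a vertical object and its shadow.

    Both lengths must use the same unit; only their ratio matters.
    """
    if object_height <= 0 or shadow_length <= 0:
        raise ValueError("height and shadow length must be positive")
    # tan(elevation) = height / shadow length
    return math.degrees(math.atan2(object_height, shadow_length))


# A 10 m pole casting a 5 m shadow implies a high sun.
print(round(sun_elevation_deg(10, 5), 2))
```

Because only the ratio enters the formula, relative measurements taken from a photo (in pixels, say) are enough, provided the object is vertical and the ground is level.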
Satellite / Earth Engine
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| sar-interference-tracker | 556 | JavaScript | GEE script to detect SAR satellite radar interference | warden, marshal, centurion |
| cloud-free-subregion | 59 | JavaScript | GEE app: find cloud-free Sentinel-2 imagery for an AOI | oracle, marshal |
| Multispectral-Satellite-Imagery-Explorer | 13 | JavaScript | GEE app to explore Landsat-8 multispectral bands | oracle, marshal |
| umbra-open-data-tracker | 33 | Python | Monitor Umbra SAR open-data catalogue, emit KML coverage | warden, marshal |
| ee_forest_area_tracker | 4 | (?) | Forest-area tracking via Earth Engine | oracle, scholar |
GEE scripts: copy-paste into https://code.earthengine.google.com/.
Social-media scrapers (live status varies — verify before relying)
| Repo | ★ | Lang | Platform / use | Personas |
|---|---|---|---|---|
| tiktok-hashtag-analysis | 358 | Python | Analyze hashtag co-occurrence + post stats | oracle, herald |
| tiktok-timestamp | 58 | HTML | Tiny client-side TikTok video timestamp retriever | oracle |
| polyphemus | 18 | Python | Odysee (alt-tech video) scraper | oracle, ghost |
| gogettr | 13 | Python | GETTR public API client for archival | oracle, ghost |
| facebook-downloader | 40 | Python | Public FB video downloader | oracle |
| reddit-post-scraping-tool | 92 | Python | Subreddit + keyword → top posts containing keyword | oracle, ghost |
| youtube-comment-scraper | 27 | Python | Scrape YT comments, find users commenting on N videos | oracle |
| cisticola | 20 | Python | Coordinator for multiple scrapers + DB layer | oracle (heavy) |
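Hashtag co-occurrence, the analysis listed for tiktok-hashtag-analysis, boils down to counting how often two tags appear on the same post. The following is a minimal sketch of that idea, not the tool's implementation; the `cooccurrence` helper and the sample posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(posts):
    """Count unordered hashtag pairs that appear together on a post.

    posts: iterable of hashtag lists, one list per post.
    """
    pairs = Counter()
    for tags in posts:
        # Normalise case and the leading '#', and drop duplicates within a post.
        unique = sorted({t.lower().lstrip("#") for t in tags})
        pairs.update(combinations(unique, 2))
    return pairs


posts = [
    ["#osint", "#geolocation"],
    ["#OSINT", "#geolocation", "#maps"],
    ["#maps"],
]
counts = cooccurrence(posts)
for pair, n in counts.most_common():
    print(pair, n)
```

Sorting the tags before pairing keeps each pair in one canonical order, so ("geolocation", "osint") and ("osint", "geolocation") are never counted separately.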
People search / aliases
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| name-variant-search | 50 | JavaScript | Generate search variations of a human name | oracle, wraith |
| alias-generator | 22 | JavaScript | Node module: likely aliases for a given name | oracle, wraith |
npm install -g @bellingcat/alias-generator
alias-generator "John Smith" # produces J. Smith, Smith John, etc.
Use both in tandem: feed the name through name-variant-search for
cultural/transliteration variants, then pipe each variant through your
people-search stack (Sherlock, WhatsMyName, etc.).
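To make the variant-then-search workflow concrete, here is a rough sketch of the kind of expansion such tools perform. It is not the algorithm of name-variant-search or alias-generator: `name_variants` is a hypothetical helper that handles only first and last name and ignores middle names and transliteration.

```python
def name_variants(full_name: str) -> list[str]:
    """Return simple orderings and handle-style forms of a two-part name."""
    parts = full_name.split()
    if len(parts) < 2:
        return parts
    first, last = parts[0], parts[-1]  # middle names are ignored in this sketch
    candidates = [
        f"{first} {last}",
        f"{last} {first}",
        f"{last}, {first}",
        f"{first[0]}. {last}",
        f"{first[0]}{last}".lower(),
        f"{first}.{last}".lower(),
        f"{first}_{last}".lower(),
    ]
    # dict.fromkeys removes duplicates while keeping the original order
    return list(dict.fromkeys(candidates))


for variant in name_variants("John Smith"):
    print(variant)
```

Each variant can then be fed to the username and people-search tools as a separate query.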
Telegram-specific
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| telegram-phone-number-checker | 1695 | Python | Phone → Telegram presence check | oracle, wraith, frodo |
| telegram-group-joiner | 55 | (web) | Auto-join public/private Telegram groups | oracle, ghost |
| gesara-entity-viz | 4 | Python | Entity viz over a GESARA-conspiracy Telegram corpus | ghost, herald |
Pair with this repo's own telegram skill (custom WAHA scraper) for
operational-scale Telegram archival.
Companies / finance
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| EDGAR | 203 | Python | SEC EDGAR Python lib (filings, ownership, financials) | ledger, frodo |
| sugartrail | 76 | HTML | UK Companies House network viz: companies, officers, addresses | ledger |
sugartrail is browser-based; deploy locally for big networks. Pair with
OpenCorporates / OpenSanctions in references/companies-and-finance.md.
Aircraft / transport intel
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| adsb-history | 72 | Vue | Collect & query ADS-B aircraft history by region/altitude/type | warden, echo, frodo |
git clone https://github.com/bellingcat/adsb-history
cd adsb-history
docker compose up -d
# Then visit http://localhost:5173 for the Vue front-end
Use it to backfill investigations into private-jet movements, military transport patterns, and surveillance flights.
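Querying aircraft history by region and altitude is, at its core, a bounding-box filter over position reports. The sketch below illustrates that filter on in-memory records. It does not reflect adsb-history's schema or API: the `Position` fields and the `in_region` helper are hypothetical, and the box test does not handle regions that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Position:
    icao_hex: str       # aircraft identifier
    lat: float
    lon: float
    altitude_ft: float


def in_region(p: Position, south: float, west: float, north: float, east: float,
              max_altitude_ft: Optional[float] = None) -> bool:
    """True if the report lies inside the box and at or below the altitude ceiling."""
    inside = south <= p.lat <= north and west <= p.lon <= east
    low_enough = max_altitude_ft is None or p.altitude_ft <= max_altitude_ft
    return inside and low_enough


reports = [
    Position("abc123", 50.45, 30.52, 3000),
    Position("def456", 50.40, 30.60, 36000),
    Position("0a1b2c", 48.85, 2.35, 2500),
]
# Low-altitude traffic inside a box around 50-51 N, 30-31 E.
hits = [p.icao_hex for p in reports if in_region(p, 50.0, 30.0, 51.0, 31.0, max_altitude_ft=10000)]
print(hits)
```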
Image / media triage
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| smart-image-sorter | 62 | Jupyter Notebook | Zero-shot image classification via HuggingFace open-source models | oracle, sentinel |
Use case: triage thousands of OSINT-collected images by content (e.g. "weapon", "uniformed personnel", "vehicle"), then deep-dive the hits.
Web-history forensics
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| wayback-google-analytics | 234 | Python | Scrape current AND historic Google Analytics tags from sites | oracle, sentinel |
| uniform-timezone | 33 | Browser | Standardize timestamps across social-media UIs | scribe, oracle |
wayback-google-analytics is gold for de-anonymizing networks of related
sites: GA tag IDs reused across domains often link sister-sites that
hide ownership.
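The linking step can be sketched in a few lines: extract tag IDs from each site's HTML, then keep the IDs that show up on more than one domain. This is an illustration of the idea rather than wayback-google-analytics' code; the `GA_ID` pattern is an approximation of common UA-, G- and GTM- formats, and `shared_tags` and the sample pages are hypothetical.

```python
import re
from collections import defaultdict

# Approximate patterns for Universal Analytics, GA4 and Tag Manager IDs (an assumption).
GA_ID = re.compile(r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12}|GTM-[A-Z0-9]{4,8})\b")


def shared_tags(pages):
    """Map each tag ID seen on two or more domains to the sorted list of those domains.

    pages: mapping of domain -> HTML text (current or archived snapshot).
    """
    domains_by_tag = defaultdict(set)
    for domain, html in pages.items():
        for tag in GA_ID.findall(html):
            domains_by_tag[tag].add(domain)
    return {tag: sorted(d) for tag, d in domains_by_tag.items() if len(d) > 1}


pages = {
    "a.example": "<script>ga('create', 'UA-12345678-1')</script>",
    "b.example": "gtag('config', 'UA-12345678-1'); gtag('config', 'G-ABCDE12345')",
    "c.example": "gtag('config', 'G-ZZZZZ99999')",
}
print(shared_tags(pages))
```

A shared ID is a lead, not proof: agencies and site templates sometimes reuse one tag across unrelated clients, so corroborate with hosting or registration data.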
Conflict / civilian-harm tracking
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| ukraine-timemap | 287 | JavaScript | TimeMap instance for Civilian Harm in Ukraine | centurion, frodo, marshal |
| iran-conflict-damage-proxy-map | 6 | JavaScript | Iran conflict damage tracking | centurion, frodo, marshal |
| vis-tj-kg-map-2022 | 3 | (?) | Tajikistan-Kyrgyzstan border-clash interactive map | centurion, frodo |
These are public TimeMap front-ends backing Bellingcat's published investigations. Reference architectures for building your own conflict trackers — fork + adapt.
Specialized / research methodologies
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| RS4OSINT | 45 | TeX | Guide to Remote Sensing for OSINT (PDF + LaTeX source) | oracle, marshal, scholar |
| open-source-research-notebooks | 298 | Jupyter Notebook | Tutorial notebooks for command-line + code OSINT investigations | scholar, oracle |
| open-questions | 360 | Jupyter Notebook | Difficult research projects waiting for contributors | scholar, all-osint |
| o9a-product-scripts | 8 | Python | Scripts from Order of Nine Angles investigation | wraith (HUMINT) |
| quitobaquito | 14 | Python | Hydrology change methodology with GEE | scholar, oracle |
| twitter-geocode-searches | 26 | Python | Methodology for geofenced Twitter search | oracle |
| coronavirus-aid-data | 5 | Python | Data for Covid-19 relief-fund analysis article | ledger |
| who-killed-abelardo | 4 | (web) | Audio map visualization (single-investigation viz) | wraith, herald |
| avoc | 59 | CSS | 2024 Tech Fellowship working repo | (browse before reach) |
open-source-research-notebooks is the best entry point for a researcher
new to Bellingcat methodology — it teaches the toolkit-via-Jupyter workflow.
Council / government records
| Repo | ★ | Lang | Use | Personas |
|---|---|---|---|---|
| CouncilSearcher | 13 | Python | Find verbatim quotes from UK + Ireland council meetings | scribe, frodo |
Niche but powerful for any UK municipal-level investigation. Drop a search term, get back transcript-grounded matches.
Infrastructure / supporting
| Repo | ★ | Notes |
|---|---|---|
| toolkit | 539 | The GitBook / curated-tools repo (this skill's source) |
| hackathon-submission-template | 11 | Template for Bellingcat Global Hackathon |
| bcat-discord-bot | 5 | Bellingcat's own Discord bot |
| challenge-framework | 5 | Static-site challenge framework |
| google-apps-script | 31 | Handy GAS snippets |
| datasheet-server | 32 | CSV → dynamic API server |
| 4-year-anniversary-network | 2 | Anniversary visualization |
Persona affinity quick-pivot
| Persona | Top repos to know |
|---|---|
| Oracle | octosuite, telegram-phone-number-checker, auto-archiver, snscrape, ShadowFinder, instagram-location-search, osm-search, name-variant-search, alias-generator, smart-image-sorter, wayback-google-analytics |
| Frodo | telegram-phone-number-checker, vk-url-scraper, ShadowFinder, instagram-location-search, EDGAR, ukraine-timemap, snscrape |
| Wraith | telegram-phone-number-checker, name-variant-search, alias-generator, o9a-product-scripts |
| Sentinel | octosuite, wayback-google-analytics, auto-archiver, smart-image-sorter |
| Scribe | auto-archiver, whisperbox-transcribe, uniform-timezone, CouncilSearcher |
| Ledger | EDGAR, sugartrail, coronavirus-aid-data |
| Centurion | sar-interference-tracker, ukraine-timemap, iran-conflict-damage-proxy-map, search-grid-generator, geoclustering |
| Marshal | sar-interference-tracker, umbra-open-data-tracker, cloud-free-subregion, search-grid-generator, ukraine-timemap |
| Warden | sar-interference-tracker, umbra-open-data-tracker, adsb-history |
| Echo | adsb-history (movement signatures), snscrape (signals correlation) |
| Herald | tiktok-hashtag-analysis, gesara-entity-viz, auto-archiver, whisperbox-transcribe |
| Ghost | gesara-entity-viz, polyphemus, gogettr, snscrape, telegram-group-joiner |
| Scholar | open-source-research-notebooks, RS4OSINT, open-questions, quitobaquito |
Install patterns
Most Python repos follow:
pip install <repo-name>
<repo-name> --help
# OR
git clone https://github.com/bellingcat/<repo-name>
cd <repo-name> && pip install -e .
Earth Engine repos: open https://code.earthengine.google.com/ and paste the script. Some require enabling specific imagery collections.
Vue/JavaScript apps: npm install && npm run dev for local; docker compose up if a docker-compose.yml is present.
Update freshness
Run periodically:
curl -s "https://api.github.com/orgs/bellingcat/repos?per_page=100&sort=updated" \
| jq -r '.[] | select(.fork==false and .archived==false)
| "\(.stargazers_count)\t\(.language // "?")\t\(.name)\t\(.description // "")"' \
| sort -rn -k1 \
| head -50
Watch the org page directly: https://github.com/orgs/bellingcat/repositories?type=all.
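For anyone who prefers to script the refresh rather than shell out to jq, the same filter and sort can be sketched in Python. The field names (`fork`, `archived`, `stargazers_count`, `language`, `name`, `description`) are those returned by the GitHub REST API; `fetch_repos` and `active_repos` are hypothetical helper names, and the example runs on made-up sample data so it needs no network access.

```python
import json
import urllib.request

ORG_REPOS_URL = "https://api.github.com/orgs/bellingcat/repos?per_page=100&sort=updated"


def fetch_repos(url: str = ORG_REPOS_URL) -> list:
    """Download one page of repo metadata (unauthenticated, so rate-limited)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def active_repos(repos: list, limit: int = 50) -> list:
    """Drop forks and archived repos, then sort by stars, highest first."""
    rows = [r for r in repos if not r["fork"] and not r["archived"]]
    rows.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return [
        (r["stargazers_count"], r.get("language") or "?", r["name"], r.get("description") or "")
        for r in rows[:limit]
    ]


# Made-up sample records; call fetch_repos() instead to use live data.
sample = [
    {"name": "tool-a", "fork": False, "archived": False, "stargazers_count": 10,
     "language": "Python", "description": "first"},
    {"name": "tool-b", "fork": True, "archived": False, "stargazers_count": 99,
     "language": "Go", "description": "a fork"},
    {"name": "tool-c", "fork": False, "archived": False, "stargazers_count": 25,
     "language": None, "description": None},
    {"name": "tool-d", "fork": False, "archived": True, "stargazers_count": 70,
     "language": "Vue", "description": "archived"},
]
for stars, lang, name, desc in active_repos(sample):
    print(f"{stars}\t{lang}\t{name}\t{desc}")
```

A single page of 100 results covers the org at its current size; if the repo count grows past that, the `page` query parameter would need to be iterated.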
Pitfalls
- Bellingcat's social-media scrapers (snscrape, vk-url-scraper, etc.) break frequently after platform API changes. Always run --version and read recent issues before relying on output for an investigation.
- auto-archiver enrichers (Wayback, video DL, Telegram) each have their own auth + rate limits. The full pipeline is heavy; start with one enricher to validate flow before scaling.
- telegram-phone-number-checker requires a Telegram developer account (api_id/api_hash). Excessive use will rate-limit or ban the account used; rotate.
- GEE scripts need a Google account with Earth Engine access (free for research/non-profit). Quotas apply on heavy queries.
- Several repos are unlicensed or have ambiguous licenses — for derivative work check the LICENSE file. Bellingcat's official repos are predominantly MIT.
- Archived repos (12 of 62) are NOT included here — those are read-only historical references, not actively maintained.