feat(bellingcat-osint-toolkit): add references/bellingcat-own-repos.md

Comprehensive reference for the 46 active non-fork repos in the
github.com/bellingcat org: tools Bellingcat ships as code (vs the external
tools they curate, which the existing 12 category refs cover).

Sections:
- Power tools (auto-archiver, octosuite, telegram-phone-number-checker,
  snscrape, vk-url-scraper, whisperbox-transcribe, EDGAR) with install
  commands + key invocations
- Geolocation toolbox (ShadowFinder, instagram-location-search, osm-search,
  geoclustering, search-grid-generator, ColourHighlighter, rgb-viz)
- Satellite / Earth Engine (sar-interference-tracker, cloud-free-subregion,
  Multispectral Imagery Explorer, umbra-open-data-tracker,
  ee_forest_area_tracker)
- Social-media scrapers (TikTok, Reddit, YouTube, Odysee, GETTR, Facebook,
  cisticola coordinator)
- People search (name-variant-search, alias-generator)
- Telegram (phone-checker, group-joiner, gesara-entity-viz)
- Companies / finance (EDGAR, sugartrail)
- Aircraft tracking (adsb-history)
- Image triage (smart-image-sorter via HuggingFace)
- Web-history forensics (wayback-google-analytics, uniform-timezone)
- Conflict tracking (ukraine-timemap, iran-conflict-damage-proxy-map,
  vis-tj-kg-map-2022)
- Research methodologies (RS4OSINT, open-source-research-notebooks,
  open-questions, quitobaquito, twitter-geocode-searches)
- Council / government records (CouncilSearcher)
- Persona affinity quick-pivot table for all 13 personas

Each entry has stars, language, use case, persona affinity, and (where
useful) the exact install + first-use commands.

SKILL.md updated to reference the new file in the layout tree and
"when to load" table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -30,18 +30,22 @@ bellingcat-osint-toolkit/
 │   ├── refresh.sh                   pull fresh CSV from upstream nightly release
 │   └── regenerate-references.py     rebuild references/*.md tables from CSV
 └── references/
-    ├── archiving.md                 8 tools
-    ├── companies-and-finance.md     26 tools
-    ├── conflict.md                  6 tools
-    ├── data-org-and-analysis.md     11 tools
-    ├── environment-and-wildlife.md  24 tools
-    ├── geolocation.md               9 tools
-    ├── image-video.md               35 tools
-    ├── maps-and-satellites.md       83 tools
-    ├── people.md                    33 tools
-    ├── social-media.md              63 tools
-    ├── transport.md                 27 tools
-    └── websites.md                  17 tools
+    ├── archiving.md                 8 tools (curated externals)
+    ├── companies-and-finance.md     26 tools (curated externals)
+    ├── conflict.md                  6 tools (curated externals)
+    ├── data-org-and-analysis.md     11 tools (curated externals)
+    ├── environment-and-wildlife.md  24 tools (curated externals)
+    ├── geolocation.md               9 tools (curated externals)
+    ├── image-video.md               35 tools (curated externals)
+    ├── maps-and-satellites.md       83 tools (curated externals)
+    ├── people.md                    33 tools (curated externals)
+    ├── social-media.md              63 tools (curated externals)
+    ├── transport.md                 27 tools (curated externals)
+    ├── websites.md                  17 tools (curated externals)
+    └── bellingcat-own-repos.md      46 active repos Bellingcat ships
+                                     (octosuite, auto-archiver, EDGAR,
+                                     ShadowFinder, telegram-phone-checker,
+                                     sar-interference-tracker, etc.)
 ```

 For ad-hoc queries the agent can grep the CSV directly:
@@ -73,6 +77,7 @@ bash scripts/refresh.sh && python3 scripts/regenerate-references.py
 | Wildlife trafficking, environmental crime, terrain | `references/environment-and-wildlife.md` | 24 |
 | Preserve a webpage, video, social post | `references/archiving.md` | 8 |
 | Clean / merge / publish data; build the investigation file | `references/data-org-and-analysis.md` | 11 |
+| Bellingcat's OWN open-source tools (octosuite, auto-archiver, EDGAR, ShadowFinder, etc.) | `references/bellingcat-own-repos.md` | 46 active repos |

 ## Persona affinity

@@ -0,0 +1,374 @@
# Bellingcat Toolkit: Own Repos (Tools They Built)

The `bellingcat-osint-toolkit` skill's main catalog (`data/all-tools.csv` + 12
category refs) lists tools Bellingcat **curates**. This reference covers
tools Bellingcat **built and ships** as code: 46 active non-fork repos
across the GitHub org, grouped by use case.

> Source: <https://github.com/orgs/bellingcat/repositories>
> Updated: 2026-05-02. Re-pull with
> `curl -s "https://api.github.com/orgs/bellingcat/repos?per_page=100"`.

## Power tools: install and use

### auto-archiver: bulk web/social-media preservation

1073★ Python. **Personas: scribe, oracle, herald, sentinel.**

Multi-source archiver: pulls URLs from CSV / Google Sheets / CLI, archives
videos, images, social-media posts, and webpages, and writes status back to
the source spreadsheet. Storage backends: local, S3, Google Drive.

```bash
# Pip
pip install auto-archiver
auto-archiver --help

# Docker (preferred for heavy enrichers)
docker pull bellingcat/auto-archiver
# Bind-mount the local ./secrets directory; a bare "secrets:" would create
# an empty named volume instead of exposing the config file.
docker run -it --rm -v "$PWD/secrets":/app/secrets bellingcat/auto-archiver \
  --config secrets/orchestration.yaml
```

Companions:
- `auto-archiver-api` (13★): REST API to manage users / sheets / URLs and dispatch workers
- `auto-archiver-extension` (3★): browser extension front-end
- `auto-archiver-setup-tool` (11★): Vue front-end for the API

Docs: <https://auto-archiver.readthedocs.io/>.

When to reach for it: an investigation needs durable preservation of
dozens to thousands of URLs (Telegram channels going dark, breaking-news
videos before takedown, court-evidence chain).

### octosuite: GitHub OSINT CLI + Python lib

1892★ Python. **Personas: oracle, sentinel, neo.**

Terminal toolkit for GitHub data analysis. The CLI, the interactive TUI, and
the Python library all ship in the same package.

```bash
pip install octosuite

# CLI
octosuite user torvalds                           # profile
octosuite user torvalds --repos --per-page 50     # all repos
octosuite user torvalds --followers --json
octosuite repo torvalds/linux --commits
octosuite repo torvalds/linux --stargazers --export ./data
octosuite org github --members --json
octosuite search "supply chain attack" --repos
octosuite -t                                      # interactive TUI
```

```python
# Library
import octosuite

user = octosuite.User("torvalds")
exists, profile = user.exists()
if exists:
    repos = user.repos(page=1, per_page=100)
```

When to reach for it: profiling a threat actor's GitHub footprint, finding
unpublished commits in an org, running a supply-chain audit on a maintainer,
or triangulating an alias across GitHub events.

### telegram-phone-number-checker: phone → Telegram correlation

1695★ Python. **Personas: oracle, wraith, frodo.**

Given a phone number (or a batch), check whether it is bound to a Telegram
account. A useful people-search pivot when a lead's full number, including
country code, is known.

```bash
pip install telegram-phone-number-checker
telegram-phone-number-checker check +12025550101
telegram-phone-number-checker batch numbers.txt
```

Requires Telegram API credentials (`api_id` + `api_hash` from
<https://my.telegram.org>). Rate-limited; use sparingly to avoid bans.

### snscrape: multi-platform social network scraper

346★ Python. **Personas: oracle, frodo, ghost.**

Scrapers for Twitter (deprecated), Mastodon, Telegram, Reddit, Facebook, VK,
Instagram, Weibo, etc. Bellingcat maintains a fork; many platform modules
broke after platform-side changes, so check the repo status before relying
on it.

```bash
pip install snscrape
snscrape twitter-user elonmusk
snscrape telegram-channel durov --max-results 100
```

### vk-url-scraper: VKontakte (Russian social) scraping

53★ Python. **Personas: oracle, ghost, frodo (russia).**

```bash
pip install vk-url-scraper
vk-url-scraper --help
```

Library API also available. Useful for VK posts, photos, geotagged
content, group enumeration.

### whisperbox-transcribe: Whisper audio/video transcription API

67★ Python. **Personas: scribe, herald, oracle.**

Deploy Whisper as a service. Drop a video/audio URL, get transcript +
translation. Useful when an investigation accumulates hours of
foreign-language broadcast / Telegram voice notes.

```bash
git clone https://github.com/bellingcat/whisperbox-transcribe
cd whisperbox-transcribe
docker compose up -d
curl -X POST -F "url=https://..." http://localhost:8000/jobs
```

### EDGAR: SEC corporate-data Python lib

203★ Python. **Personas: ledger, frodo.**

Programmatic interface to SEC EDGAR: public filings, financials, ownership.

```bash
pip install edgar-tool
edgar --help
edgar 10-K AAPL --year 2024
```

Used for sanction screening, insider-trading patterns, beneficial-ownership
chains, and ESG research.

## Geolocation toolbox

| Repo | ★ | Lang | Use | Personas |
| --------------------------- | --- | ---------- | --------------------------------------------------------------------- | ------------------------ |
| `ShadowFinder` | 570 | Python | Map locations where a shadow of given length could occur at date/time | oracle, frodo, centurion |
| `instagram-location-search` | 679 | Python | Find Instagram location IDs near (lat, lon) | oracle, frodo |
| `osm-search` | 207 | Vue | OpenStreetMap proximity search UI | oracle, frodo |
| `geoclustering` | 45 | Python | Cluster a list of (lat,lon) points; CLI | oracle, frodo, marshal |
| `search-grid-generator` | 13 | Vue | Quickly generate KML search grids for area-of-interest mapping | oracle, marshal |
| `ColourHighlighter` | 4 | TypeScript | WebGL color filters / LUTs for screen-share geolocation | oracle (geo-analyst) |
| `rgb-viz` | 4 | JavaScript | Interactive viz of an image's R/G/B channels | oracle (forensics) |

```bash
pip install bellingcat-shadowfinder
shadowfinder 1.5 --datetime "2024-03-15T14:00:00" --output map.html

pip install bellingcat-instagram-location-search
ig-location-search --lat 40.7128 --lon -74.0060
```

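The geometry behind a shadow-based search can be sketched in a few lines. This is a minimal illustration, not ShadowFinder's implementation: the function names `sun_altitude_deg` and `shadow_length` are hypothetical, and the object is assumed to be vertical on flat ground.

```python
import math


def sun_altitude_deg(object_height: float, shadow_len: float) -> float:
    """Sun altitude (degrees above the horizon) implied by a vertical object's shadow.

    Assumes flat ground and a vertical object; both lengths share one unit.
    """
    if object_height <= 0 or shadow_len <= 0:
        raise ValueError("height and shadow length must be positive")
    return math.degrees(math.atan2(object_height, shadow_len))


def shadow_length(object_height: float, altitude_deg: float) -> float:
    """Inverse relation: the shadow length cast at a given sun altitude."""
    if not 0 < altitude_deg < 90:
        raise ValueError("altitude must be between 0 and 90 degrees")
    return object_height / math.tan(math.radians(altitude_deg))
```

For example, a 1 m post casting a 1.5 m shadow implies a sun altitude of about 33.7 degrees. Every point on Earth where the sun sits at that altitude at the given date and time is a candidate location, which is why the result of such a search is a band on the map rather than a single point.
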
## Satellite / Earth Engine

| Repo | ★ | Lang | Use | Personas |
| ------------------------------------------ | --- | ---------- | -------------------------------------------------------- | -------------------------- |
| `sar-interference-tracker` | 556 | JavaScript | GEE script to detect SAR satellite radar interference | warden, marshal, centurion |
| `cloud-free-subregion` | 59 | JavaScript | GEE app: find cloud-free Sentinel-2 imagery for an AOI | oracle, marshal |
| `Multispectral-Satellite-Imagery-Explorer` | 13 | JavaScript | GEE app to explore Landsat-8 multispectral bands | oracle, marshal |
| `umbra-open-data-tracker` | 33 | Python | Monitor Umbra SAR open-data catalogue, emit KML coverage | warden, marshal |
| `ee_forest_area_tracker` | 4 | (?) | Forest-area tracking via Earth Engine | oracle, scholar |

GEE scripts: copy-paste into <https://code.earthengine.google.com/>.

## Social-media scrapers (live status varies; verify before relying)

| Repo | ★ | Lang | Platform / use | Personas |
| --------------------------- | --- | ------ | ----------------------------------------------------- | -------------- |
| `tiktok-hashtag-analysis` | 358 | Python | Analyze hashtag co-occurrence + post stats | oracle, herald |
| `tiktok-timestamp` | 58 | HTML | Tiny client-side TikTok video timestamp retriever | oracle |
| `polyphemus` | 18 | Python | Odysee (alt-tech video) scraper | oracle, ghost |
| `gogettr` | 13 | Python | GETTR public API client for archival | oracle, ghost |
| `facebook-downloader` | 40 | Python | Public FB video downloader | oracle |
| `reddit-post-scraping-tool` | 92 | Python | Subreddit + keyword → top posts containing keyword | oracle, ghost |
| `youtube-comment-scraper` | 27 | Python | Scrape YT comments, find users commenting on N videos | oracle |
| `cisticola` | 20 | Python | Coordinator for multiple scrapers + DB layer | oracle (heavy) |

## People search / aliases

| Repo | ★ | Lang | Use | Personas |
| --------------------- | --- | ---------- | ------------------------------------------- | -------------- |
| `name-variant-search` | 50 | JavaScript | Generate search variations of a human name | oracle, wraith |
| `alias-generator` | 22 | JavaScript | Node module: likely aliases for a given name | oracle, wraith |

```bash
npm install -g @bellingcat/alias-generator
alias-generator "John Smith"   # produces J. Smith, Smith John, etc.
```

Use both in tandem: feed the name through `name-variant-search` for
cultural/transliteration variants, then run each variant through your
people-search stack (Sherlock, WhatsMyName, etc.).

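To make the idea of name variants concrete, the sketch below generates a few simple reorderings and initial-based forms. It is an illustration only and does not reproduce the logic of `alias-generator` or `name-variant-search`; the function name `simple_name_variants` is hypothetical.

```python
def simple_name_variants(full_name: str) -> list[str]:
    """Return basic reorderings and initial-based variants of a multi-part name."""
    parts = full_name.split()
    if len(parts) < 2:
        # Single-token (or empty) input: nothing to reorder.
        return [full_name.strip()] if full_name.strip() else []
    first, last = parts[0], parts[-1]
    candidates = [
        " ".join(parts),              # John Smith
        f"{last} {first}",            # Smith John
        f"{last}, {first}",           # Smith, John
        f"{first[0]}. {last}",        # J. Smith
        f"{first[0]}{last}".lower(),  # jsmith (username style)
        f"{first}.{last}".lower(),    # john.smith
        f"{first}{last}".lower(),     # johnsmith
    ]
    # Deduplicate while preserving order.
    return list(dict.fromkeys(candidates))
```

Each returned string can then be passed to a username or people-search tool in turn. Transliteration and culture-specific variants are out of scope for this sketch.
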
## Telegram-specific

| Repo | ★ | Lang | Use | Personas |
| ------------------------------- | ---- | ------ | -------------------------------------------------- | --------------------- |
| `telegram-phone-number-checker` | 1695 | Python | Phone → Telegram presence check | oracle, wraith, frodo |
| `telegram-group-joiner` | 55 | (web) | Auto-join public/private Telegram groups | oracle, ghost |
| `gesara-entity-viz` | 4 | Python | Entity viz over a GESARA-conspiracy Telegram corpus | ghost, herald |

Pair with this repo's own `telegram` skill (custom WAHA scraper) for
operational-scale Telegram archival.

## Companies / finance

| Repo | ★ | Lang | Use | Personas |
| ------------ | --- | ------ | -------------------------------------------------------------- | ------------- |
| `EDGAR` | 203 | Python | SEC EDGAR Python lib (filings, ownership, financials) | ledger, frodo |
| `sugartrail` | 76 | HTML | UK Companies House network viz: companies, officers, addresses | ledger |

`sugartrail` is browser-based; deploy locally for big networks. Pair with
OpenCorporates / OpenSanctions in `references/companies-and-finance.md`.

## Aircraft / transport intel

| Repo | ★ | Lang | Use | Personas |
| -------------- | -- | ---- | -------------------------------------------------------------- | ------------------- |
| `adsb-history` | 72 | Vue | Collect & query ADS-B aircraft history by region/altitude/type | warden, echo, frodo |

```bash
git clone https://github.com/bellingcat/adsb-history
cd adsb-history
docker compose up -d
# Then visit http://localhost:5173 for the Vue front-end
```

Useful for backfilling investigations into private-jet movements, military
transport patterns, and surveillance flights.

## Image / media triage

| Repo | ★ | Lang | Use | Personas |
| -------------------- | -- | ---------------- | ----------------------------------------------------------------- | ---------------- |
| `smart-image-sorter` | 62 | Jupyter Notebook | Zero-shot image classification via HuggingFace open-source models | oracle, sentinel |

Use case: triage thousands of OSINT-collected images by content (e.g.
"weapon", "uniformed personnel", "vehicle"), then deep-dive the hits.

## Web-history forensics

| Repo | ★ | Lang | Use | Personas |
| -------------------------- | --- | ------- | ------------------------------------------------------------ | ---------------- |
| `wayback-google-analytics` | 234 | Python | Scrape current AND historic Google Analytics tags from sites | oracle, sentinel |
| `uniform-timezone` | 33 | Browser | Standardize timestamps across social-media UIs | scribe, oracle |

`wayback-google-analytics` is especially valuable for de-anonymizing
networks of related sites: GA tag IDs reused across domains often link
sister sites that hide their ownership.

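The correlation step described above can be sketched as follows. This is a simplified illustration, not the tool's own code: it works on HTML you have already collected, the ID patterns are assumptions about common tag formats (Universal Analytics, GA4, Google Tag Manager), and the names `extract_tag_ids` and `shared_tags` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed ID shapes: UA-123456-1, G-XXXXXXXXXX, GTM-XXXXXXX.
# Real pages may use other forms, and the pattern can produce false positives.
TAG_PATTERN = re.compile(
    r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12}|GTM-[A-Z0-9]{4,10})\b"
)


def extract_tag_ids(html: str) -> set[str]:
    """Return every analytics-style ID found in a page's HTML."""
    return set(TAG_PATTERN.findall(html))


def shared_tags(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each tag ID seen on two or more domains to those domains (sorted)."""
    domains_by_tag = defaultdict(set)
    for domain, html in pages.items():
        for tag in extract_tag_ids(html):
            domains_by_tag[tag].add(domain)
    return {
        tag: sorted(domains)
        for tag, domains in domains_by_tag.items()
        if len(domains) > 1
    }
```

Given a mapping of domain to HTML, `shared_tags` returns only the IDs that appear on more than one domain. A shared ID is a lead to verify, not proof of common ownership, since agencies and site templates can reuse tags across unrelated clients.
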
## Conflict / civilian-harm tracking

| Repo | ★ | Lang | Use | Personas |
| -------------------------------- | --- | ---------- | -------------------------------------------------- | ------------------------- |
| `ukraine-timemap` | 287 | JavaScript | TimeMap instance for Civilian Harm in Ukraine | centurion, frodo, marshal |
| `iran-conflict-damage-proxy-map` | 6 | JavaScript | Iran conflict damage tracking | centurion, frodo, marshal |
| `vis-tj-kg-map-2022` | 3 | (?) | Tajikistan-Kyrgyzstan border-clash interactive map | centurion, frodo |

These are public TimeMap front-ends backing Bellingcat's published
investigations. They also serve as reference architectures for building
your own conflict trackers: fork and adapt.

## Specialized / research methodologies

| Repo | ★ | Lang | Use | Personas |
| -------------------------------- | --- | ---------------- | --------------------------------------------------------------- | ------------------------ |
| `RS4OSINT` | 45 | TeX | Guide to Remote Sensing for OSINT (PDF + LaTeX source) | oracle, marshal, scholar |
| `open-source-research-notebooks` | 298 | Jupyter Notebook | Tutorial notebooks for command-line + code OSINT investigations | scholar, oracle |
| `open-questions` | 360 | Jupyter Notebook | Difficult research projects waiting for contributors | scholar, all-osint |
| `o9a-product-scripts` | 8 | Python | Scripts from Order of Nine Angles investigation | wraith (HUMINT) |
| `quitobaquito` | 14 | Python | Hydrology change methodology with GEE | scholar, oracle |
| `twitter-geocode-searches` | 26 | Python | Methodology for geofenced Twitter search | oracle |
| `coronavirus-aid-data` | 5 | Python | Data for Covid-19 relief-fund analysis article | ledger |
| `who-killed-abelardo` | 4 | (web) | Audio map visualization (single-investigation viz) | wraith, herald |
| `avoc` | 59 | CSS | 2024 Tech Fellowship working repo | (browse before reach) |

`open-source-research-notebooks` is the best entry point for a researcher
new to Bellingcat methodology: it teaches the toolkit-via-Jupyter workflow.

## Council / government records

| Repo | ★ | Lang | Use | Personas |
| ----------------- | -- | ------ | ------------------------------------------------------ | ------------- |
| `CouncilSearcher` | 13 | Python | Find verbatim quotes from UK + Ireland council meetings | scribe, frodo |

Niche but powerful for any UK or Irish municipal-level investigation. Drop
in a search term, get back transcript-grounded matches.

## Infrastructure / supporting

| Repo | ★ | Notes |
| ------------------------------- | --- | ------------------------------------------------------ |
| `toolkit` | 539 | The GitBook / curated-tools repo (this skill's source) |
| `hackathon-submission-template` | 11 | Template for Bellingcat Global Hackathon |
| `bcat-discord-bot` | 5 | Bellingcat's own Discord bot |
| `challenge-framework` | 5 | Static-site challenge framework |
| `google-apps-script` | 31 | Handy GAS snippets |
| `datasheet-server` | 32 | CSV → dynamic API server |
| `4-year-anniversary-network` | 2 | Anniversary visualization |

## Persona affinity quick-pivot

| Persona | Top repos to know |
| ------------- | ---------------------------------------------------------------------- |
| **Oracle** | octosuite, telegram-phone-number-checker, auto-archiver, snscrape, ShadowFinder, instagram-location-search, osm-search, name-variant-search, alias-generator, smart-image-sorter, wayback-google-analytics |
| **Frodo** | telegram-phone-number-checker, vk-url-scraper, ShadowFinder, instagram-location-search, EDGAR, ukraine-timemap, snscrape |
| **Wraith** | telegram-phone-number-checker, name-variant-search, alias-generator, o9a-product-scripts |
| **Sentinel** | octosuite, wayback-google-analytics, auto-archiver, smart-image-sorter |
| **Scribe** | auto-archiver, whisperbox-transcribe, uniform-timezone, CouncilSearcher |
| **Ledger** | EDGAR, sugartrail, coronavirus-aid-data |
| **Centurion** | sar-interference-tracker, ukraine-timemap, iran-conflict-damage-proxy-map, search-grid-generator, geoclustering |
| **Marshal** | sar-interference-tracker, umbra-open-data-tracker, cloud-free-subregion, search-grid-generator, ukraine-timemap |
| **Warden** | sar-interference-tracker, umbra-open-data-tracker, adsb-history |
| **Echo** | adsb-history (movement signatures), snscrape (signals correlation) |
| **Herald** | tiktok-hashtag-analysis, gesara-entity-viz, auto-archiver, whisperbox-transcribe |
| **Ghost** | gesara-entity-viz, polyphemus, gogettr, snscrape, telegram-group-joiner |
| **Scholar** | open-source-research-notebooks, RS4OSINT, open-questions, quitobaquito |

## Install patterns

Most Python repos follow:

```bash
pip install <repo-name>
<repo-name> --help
# OR
git clone https://github.com/bellingcat/<repo-name>
cd <repo-name> && pip install -e .
```

Earth Engine repos: open <https://code.earthengine.google.com/> and paste
the script. Some require enabling specific imagery collections.

Vue/JavaScript apps: `npm install && npm run dev` for local;
`docker compose up` if a `docker-compose.yml` is present.

## Update freshness

Run periodically:

```bash
curl -s "https://api.github.com/orgs/bellingcat/repos?per_page=100&sort=updated" \
  | jq -r '.[] | select(.fork==false and .archived==false)
           | "\(.stargazers_count)\t\(.language // "?")\t\(.name)\t\(.description // "")"' \
  | sort -rn -k1 \
  | head -50
```

Watch the org page directly: <https://github.com/orgs/bellingcat/repositories?type=all>.

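For callers who prefer to post-process the API response in Python, the same filter can be sketched as below. It mirrors the `jq` expression above and uses only the fields that expression already references; the function name `active_repos` is hypothetical, and fetching and parsing the JSON is left to the caller.

```python
def active_repos(repos: list[dict], limit: int = 50) -> list[tuple[int, str, str, str]]:
    """Keep non-fork, non-archived repos, sorted by star count (descending).

    Each row is (stars, language or "?", name, description or "").
    """
    rows = [
        (
            repo["stargazers_count"],
            repo.get("language") or "?",
            repo["name"],
            repo.get("description") or "",
        )
        for repo in repos
        if not repo.get("fork") and not repo.get("archived")
    ]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

Save the `curl` output to a file, load it with `json.load`, and pass the resulting list to `active_repos` to get the rows used in the tables of this reference.
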
## Pitfalls

- Bellingcat's social-media scrapers (`snscrape`, `vk-url-scraper`, etc.)
  break frequently after platform API changes. Always check the installed
  version (`<tool> --version`) and read recent issues before relying on
  output for an investigation.
- `auto-archiver` enrichers (Wayback, video DL, Telegram) each have their
  own auth + rate limits. The full pipeline is heavy, so start with one
  enricher to validate the flow before scaling.
- `telegram-phone-number-checker` requires a Telegram developer account
  (`api_id`/`api_hash`). Excessive use will rate-limit or ban the account
  used; rotate.
- GEE scripts need a Google account with Earth Engine access (free for
  research/non-profit). Quotas apply on heavy queries.
- Several repos are unlicensed or have ambiguous licenses; for derivative
  work, check each repo's LICENSE file. Bellingcat's official repos are
  predominantly MIT.
- Archived repos (12 of 62) are NOT included here; those are read-only
  historical references, not actively maintained.