Commit Graph

1510 Commits

Author SHA1 Message Date
Miguel Sozinho Ramalho
52a7cabaf1 Merge pull request #402 from bellingcat/dev
bug fix: wacz screenshots leak in shared session
v1.2.2
2026-02-25 10:39:54 +00:00
msramalho
a739361e12 bug fix: wacz screenshots leak in shared session 2026-02-23 16:26:36 +00:00
Miguel Sozinho Ramalho
9a97fede43 Merge pull request #401 from bellingcat/dev
Dependencies maintenance.
v1.2.1
2026-02-23 13:27:51 +00:00
msramalho
2d13077fad bumping ruff version 2026-02-23 12:36:53 +00:00
msramalho
8a4a314cf9 ruff python version to dev version 2026-02-23 12:32:24 +00:00
msramalho
75e8b788ae revert ruff workflow changes 2026-02-23 12:31:20 +00:00
msramalho
defe2315bf docs updates 2026-02-23 12:28:25 +00:00
msramalho
ba0dffdd5e Merge branch 'dev' of github.com:bellingcat/auto-archiver into dev 2026-02-23 12:18:58 +00:00
msramalho
a09927c507 minor docs fix 2026-02-23 12:18:47 +00:00
Miguel Sozinho Ramalho
6c938c489a Merge pull request #392 from bellingcat/dependabot/github_actions/actions-bc0df0c757
Bump the actions group with 5 updates
2026-02-23 11:28:24 +00:00
msramalho
0e39768da9 version bumping settings script 2026-02-23 11:27:12 +00:00
msramalho
1e5d6ec4a6 version bump: minor 2026-02-23 11:23:40 +00:00
msramalho
3385d004cf yt-dlp to latest version 2026-02-23 11:23:26 +00:00
msramalho
7f27f7fce0 closes #383 fixing browsertrix-crawler at 1.11.4 2026-02-23 11:23:06 +00:00
msramalho
a6e3240af1 closes #399 and global dependency updates 2026-02-23 11:13:31 +00:00
dependabot[bot]
bf4c196cc2 Bump the actions group with 5 updates
Bumps the actions group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [actions/checkout](https://github.com/actions/checkout) | `4` | `6` |
| [docker/login-action](https://github.com/docker/login-action) | `3.4.0` | `3.7.0` |
| [docker/metadata-action](https://github.com/docker/metadata-action) | `5.7.0` | `5.10.0` |
| [actions/setup-python](https://github.com/actions/setup-python) | `5` | `6` |
| [actions/cache](https://github.com/actions/cache) | `4` | `5` |


Updates `actions/checkout` from 4 to 6
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)

Updates `docker/login-action` from 3.4.0 to 3.7.0
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](74a5d14239...c94ce9fb46)

Updates `docker/metadata-action` from 5.7.0 to 5.10.0
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](902fa8ec7d...c299e40c65)

Updates `actions/setup-python` from 5 to 6
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)

Updates `actions/cache` from 4 to 5
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: docker/login-action
  dependency-version: 3.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: docker/metadata-action
  dependency-version: 5.10.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/cache
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-01 20:17:43 +00:00
Miguel Sozinho Ramalho
c640cc898a Merge pull request #385 from bellingcat/dev
1.2.0 dependencies, small bugs, 1st time contributors
v1.2.0
2026-01-08 15:55:40 +00:00
msramalho
3e2c0b564b wiki fix 2026-01-08 15:49:42 +00:00
msramalho
5fd23baa55 this is ruff 2026-01-08 15:48:08 +00:00
msramalho
8a450310c7 version bump for new release 2026-01-08 15:41:27 +00:00
msramalho
bef8a14089 pyperclip version bump closes #339 2026-01-08 15:40:17 +00:00
msramalho
cd0b093e7a browsertrix-crawler to 1.9.2 see #383 2026-01-08 15:33:40 +00:00
msramalho
096c9d09ef fix for unexpected types for json.dump 2026-01-08 15:18:19 +00:00
Miguel Sozinho Ramalho
df3521e9ca Merge pull request #377 from m4cd4r4/fix/improve-deleted-post-detection
Fix #335: Add comprehensive deletion detection for removed/unavailable content
2026-01-08 15:06:21 +00:00
msramalho
a89d0193e4 removes patch file 2026-01-08 15:02:00 +00:00
msramalho
536cbd905f puts tests file in correct directory 2026-01-08 14:55:40 +00:00
msramalho
a936921c4e updates new utils file and test 2026-01-08 14:54:06 +00:00
Miguel Sozinho Ramalho
68f672a4fa Merge branch 'dev' into fix/improve-deleted-post-detection 2026-01-08 14:36:17 +00:00
Miguel Sozinho Ramalho
4ee0ad1cf8 Merge pull request #359 from mjgaughan/specify-medatada-feature
implementing default metadata omission/user metadata selection
2026-01-08 14:34:50 +00:00
msramalho
bac809451c expands tests to include non-predefined metadata keys 2026-01-08 14:33:16 +00:00
msramalho
53dc9904ce refactors PR to obey standard code approach 2026-01-08 14:30:26 +00:00
Miguel Sozinho Ramalho
c1f312d42a Merge branch 'dev' into specify-medatada-feature 2026-01-08 14:04:42 +00:00
msramalho
23c9dfe717 updating dependencies 2026-01-08 13:53:44 +00:00
m4cd4r4
d02e7e0f02 Add comprehensive deletion detection for removed/unavailable content
Implements issue #335: improve detection of deleted/missing posts

## Changes

### New Deletion Detection System
- Created `deletion_detection.py` utility module with platform-specific
  indicators for Twitter, Facebook, Instagram, TikTok, YouTube, Reddit,
  VK, and Telegram
- Detects deletion via HTML content, page titles, error messages, and
  video metadata
- Stores detailed deletion context (indicator, source, platform) in
  metadata for investigators

### Integration Points
- **Antibot Extractor**: Checks HTML and page titles after page load;
  resolves TODO about detecting deleted videos
- **Generic Extractor**: Checks yt-dlp video data and error messages
  for deletion indicators
- **Twitter Dropin**: Enhanced detection when user/created_at fields
  are missing

### Test Coverage
- Comprehensive test suite covering all platforms
- Tests for HTML, title, error message, and metadata detection
- Validates that normal content is not falsely flagged

## Impact for Conflict Documentation

This fix is critical for evidence preservation in war-torn regions:
- Investigators can now document that evidence existed but was deleted
- Prevents wasted archival attempts on deleted content
- Tracks patterns of content removal
- Preserves metadata about what was deleted and when

Twitter example: Detects "Hmm...this page doesn't exist. Try searching
for something else" and flags content as deleted_or_unavailable.
2025-12-17 18:40:58 +08:00
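
The deletion detection described in commit d02e7e0f02 above can be illustrated with a minimal sketch. This is not the code from `deletion_detection.py`: the names `DELETION_INDICATORS`, `DeletionMatch`, and `detect_deletion` are hypothetical, and the YouTube phrase is an illustrative assumption. Only the Twitter phrase and the `deleted_or_unavailable` status come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

# Status value named in the commit message.
DELETED_STATUS = "deleted_or_unavailable"

# Hypothetical indicator table, keyed by platform. Phrases are stored lowercase.
# Only the Twitter phrase is taken from the commit message; the YouTube one
# is an illustrative assumption.
DELETION_INDICATORS: dict[str, list[str]] = {
    "twitter": ["this page doesn't exist"],
    "youtube": ["video unavailable"],
}


@dataclass
class DeletionMatch:
    """Context about a detected deletion: what matched, where, on which platform."""
    indicator: str
    source: str
    platform: str
    status: str = DELETED_STATUS


def _normalize(text: str) -> str:
    # Fold typographic apostrophes and case so phrase matching is tolerant.
    return text.replace("\u2019", "'").lower()


def detect_deletion(
    platform: str, *, html: str = "", title: str = "", error: str = ""
) -> Optional[DeletionMatch]:
    """Return the first indicator found in the HTML, title, or error text, else None."""
    key = platform.lower()
    indicators = DELETION_INDICATORS.get(key, [])
    for source, text in (("html", html), ("title", title), ("error", error)):
        haystack = _normalize(text)
        for indicator in indicators:
            if indicator in haystack:
                return DeletionMatch(indicator=indicator, source=source, platform=key)
    return None
```

A caller could pass whichever of the page HTML, page title, or extractor error message is available and store the returned fields in the archived item's metadata. Content that matches no indicator returns `None`, so normal pages are not flagged.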
Miguel Sozinho Ramalho
56526a9ac7 Merge pull request #365 from bellingcat/dev
Facebook reels fix
v1.1.6
2025-10-23 10:40:43 +01:00
msramalho
3a22cc28c0 skip tiktok antibot test in CI 2025-10-23 10:17:14 +01:00
msramalho
dbb3dfa04f fixes wikipedia test 2025-10-23 10:04:44 +01:00
msramalho
01bdb35f5d version bump 2025-10-23 09:51:31 +01:00
msramalho
43cbc6ac56 generic extractor improvements 2025-10-23 09:51:14 +01:00
msramalho
9c7cab1ae2 dependencies update 2025-10-22 21:07:12 +01:00
msramalho
a9a0bae083 dependencies update v1.1.5 2025-10-22 18:11:36 +01:00
Miguel Sozinho Ramalho
97d133ce79 Merge pull request #357 from bellingcat/dev
small improvements on tiktok and version bumps
v1.1.4
2025-10-22 16:02:26 +01:00
msramalho
432ee3dcfd version bump 2025-10-22 15:50:50 +01:00
mgaughan
94e0803fb3 implementing default metadata omission/user metadata selection 2025-09-22 20:16:40 -05:00
msramalho
794b4f6052 Merge branch 'dev' of https://github.com/bellingcat/auto-archiver into dev 2025-09-11 15:06:27 +01:00
msramalho
965d7d41dd dependency updates 2025-09-11 15:06:25 +01:00
Miguel Sozinho Ramalho
e73faa70cc Merge pull request #352 from mjgaughan/developer-documentation-updates
updating the style-checking code in the documentation
2025-08-11 10:42:53 +01:00
mgaughan
80beab9f23 ruff-fix -> ruff-clean; there is no ruff-fix in the Makefile. Maybe the command /should/ be ruff-fix to align with the underlying ruff command; for later discussion. This at least reconciles the documentation to the Makefile 2025-08-05 21:36:32 -04:00
Miguel Sozinho Ramalho
200cea4e12 Merge pull request #345 from mjgaughan/main
Correction of small documentation typos
2025-07-29 09:36:10 +01:00
mgaughan
1256fde159 updating location of .env.test.example in documentation 2025-07-23 13:04:48 -04:00