Patrick Robertson
c41d93a634
Use already implemented helper to get version
2025-01-21 17:53:37 +01:00
Patrick Robertson
cd2ae3763f
Minor adjustments
...
Co-authored-by: Miguel Sozinho Ramalho <19508417+msramalho@users.noreply.github.com>
2025-01-21 16:24:37 +00:00
Patrick Robertson
d3e3eb7639
unit tests for loading dropins
2025-01-21 16:59:45 +01:00
Patrick Robertson
9dde9b26d0
Patch in upstream changes to ytdlp for now
...
Seems like ytdlp may not merge https://github.com/yt-dlp/yt-dlp/pull/12098 anytime soon
2025-01-21 16:49:49 +01:00
Patrick Robertson
7c0dcbfd81
Re-add doc string to generic_archiver
...
(renamed from youtube_archiver)
2025-01-21 16:49:30 +01:00
Patrick Robertson
6388983815
Merge branch 'main' into youtubedlp-rewrite
2025-01-21 16:43:14 +01:00
Patrick Robertson
4bb4ebdf82
Further cleanup, abstracts 'dropins' out into generic files
2025-01-21 16:36:45 +01:00
erinhmclark
e83ccc0d7f
Cleaning up configs reference and module level.
2025-01-21 09:48:46 +00:00
Patrick Robertson
dff0105659
Small fixups + implement Truth code for posts with multiple media
2025-01-20 18:40:46 +01:00
Patrick Robertson
fd2e7f973b
Further tidy-ups, also adds some ytdlp utils to 'utils'
2025-01-20 16:31:28 +01:00
Patrick Robertson
9c5a9e1bcd
Rename BaseArchiver to GenericArchiver + some other tidyups
2025-01-17 17:06:04 +01:00
Patrick Robertson
5b20288d06
Add a 'version' arg to get the current running version
2025-01-17 16:59:57 +01:00
Patrick Robertson
394bcd8d47
Further refactoring of youtubedl_archiver->base_archiver
...
* Keep twitter_api_archiver
* Remove unit tests for obsolete archivers
* Guess filename of media using the 'Content-Type' header
* Add mechanism to run 'expensive' tests last (see conftest.py) and also flag expensive tests to fail straight off (pytest.mark.incremental)
2025-01-17 11:56:08 +01:00
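Note on "Guess filename of media using the 'Content-Type' header" in the commit above: the log does not show how the project does this. The following is only a minimal sketch of the general technique using the Python standard library. The function name `guess_filename` and its parameters are hypothetical and are not taken from the repository.

```python
import mimetypes
from typing import Optional


def guess_filename(stem: str, content_type: Optional[str]) -> str:
    """Append a file extension to `stem` based on a Content-Type header value.

    Hypothetical helper for illustration; not the project's actual code.
    """
    if not content_type:
        # No header available: leave the name unchanged.
        return stem
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    ext = mimetypes.guess_extension(mime)
    # guess_extension returns None for MIME types it does not know.
    return stem + ext if ext else stem


if __name__ == "__main__":
    print(guess_filename("photo", "image/png"))
    print(guess_filename("blob", None))
```

A header-based guess can be wrong when a server sends a generic type, which is presumably why an earlier commit also sniffs the file type of the downloaded media.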
erinhmclark
6fabe2a189
Fixed twitter_archiver.py changes.
2025-01-16 09:56:54 +00:00
erinhmclark
bbb3269c2b
Changes from main.
2025-01-16 09:30:32 +00:00
erinhmclark
d3eec5d90f
Basic docs structure for RTD
2025-01-15 21:45:29 +00:00
Patrick Robertson
74cf1f5f23
Merge branch 'main' into youtubedlp-rewrite
2025-01-15 17:47:23 +01:00
Patrick Robertson
4f2b9baa73
refactor youtubedlp archiver to work for all valid websites
...
1. Extract more metadata
2. Better extract thumbnail
3. Setup framework for specific sites to provide more granular metadata processing
2025-01-15 17:46:47 +01:00
Patrick Robertson
c3dd19f309
Sniff filetype of downloaded media and add extension
...
Also download in chunks - fixes 2 x TODOs
2025-01-15 17:46:47 +01:00
Patrick Robertson
306df62a98
Fix all instances of utcnow()
2025-01-14 17:51:41 +01:00
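Note on the utcnow() fixes: `datetime.utcnow()` is deprecated as of Python 3.12 because it returns a naive datetime. The usual replacement is a timezone-aware call, sketched below. The helper name `utc_now` is hypothetical; the log does not show how the repository itself spells the replacement.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime.

    Hypothetical helper for illustration; replaces datetime.utcnow(),
    which returns a naive datetime and is deprecated since Python 3.12.
    """
    return datetime.now(timezone.utc)


if __name__ == "__main__":
    now = utc_now()
    # The ISO string of an aware UTC datetime carries an explicit "+00:00" offset.
    print(now.isoformat())
```

Aware and naive datetimes cannot be compared or subtracted, so code that stored naive timestamps earlier may need converting when it meets values from this helper.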
Patrick Robertson
20726c1116
Remove tiktok-downloader - getting info is broken
...
TODO: switch to using youtube-dlp
2025-01-14 17:40:45 +01:00
Patrick Robertson
2eb2ab9ac9
Merge branch 'main' into remove_dependencies
2025-01-14 17:39:20 +01:00
Patrick Robertson
080f474d49
Remove minify_html package - HTML file is no longer minified
...
Savings were 5K (~15KB vs ~20KB) for the generated .html file, but minify_html is currently not compatible with python3.13+
2025-01-14 11:36:10 +01:00
Patrick Robertson
4e13a09a87
Fix deprecation warning about utcnow
2025-01-14 11:01:40 +01:00
Patrick Robertson
1b1af2f0b1
Revert change to twitter_archiver
...
As per discussion at: https://github.com/bellingcat/auto-archiver/pull/165#discussion_r1905930837
2025-01-14 10:30:41 +01:00
Patrick Robertson
bdfedfcf61
Merge branch 'main' into feat/unittest
2025-01-13 19:50:47 +01:00
Erin Clark
9cdaea873b
Merge pull request #164 from bellingcat/ec_add_poetry
...
Migrate to Poetry
2025-01-13 18:49:15 +00:00
Patrick Robertson
528b78db85
Flag tombstone tweets for twitter_syndication method
2025-01-13 18:17:24 +01:00
Patrick Robertson
57eacdc24a
Merge branch 'main' into feat/unittest
2025-01-13 18:06:55 +01:00
Patrick Robertson
63973e2ce7
switch to pytest and pytest-recording
2025-01-13 16:23:20 +01:00
erinhmclark
d80b4b7557
Remove snscrape and Python 3.12 restriction.
2025-01-12 12:15:56 +00:00
erinhmclark
6d5b0090d9
Pull version from pyproject.toml file.
2025-01-12 12:15:56 +00:00
erinhmclark
6da837b374
Add note to update dynamic versioning and references to version.
2025-01-12 12:15:56 +00:00
Patrick Robertson
3546d4ad79
Fix 'download_syndication' method for tweet archiving (now requires a token)
...
Plus add in unit tests for token generation + download syndication
2025-01-12 12:55:00 +01:00
Patrick Robertson
c932fb7416
Improved logging when attempting to download an invalid/deleted tweet
...
Plus: unit tests for non-existent tweet + invalid tweet ID
2025-01-12 12:00:45 +01:00
Patrick Robertson
f29950905c
Merge branch 'main' into small_issues
2025-01-12 11:47:55 +01:00
Patrick Robertson
add83c9650
Remove snscrape from twitter_archiver
...
1. snscrape twitter downloader no longer works (ref: https://github.com/JustAnotherArchivist/snscrape/issues/1045)
2. snscrape limits python to versions <3.12
2025-01-07 19:40:19 +01:00
Miguel Sozinho Ramalho
a697f0a212
adds an unauthenticated Bluesky archiver (#160)
...
* adds a TODO for next code iterations
* implements bsky archiver
* adds new archiver to example orchestration file
* Fix downloading media for posts with multiple images
(Images are stored in media/images)
* Setup a basic framework for unit tests
Use 'python -m unittest' from the project root to run
---------
Co-authored-by: Patrick Robertson <robertson.patrick@gmail.com>
2025-01-07 10:28:07 +00:00
Patrick Robertson
bffa3a6254
Merge pull request #159 from bellingcat/print_pdf
...
Add 'print_pdf' option to the screenshot enricher. Fixes #132
2025-01-06 18:13:38 +01:00
Miguel Sozinho Ramalho
ef471f41e1
adds better debug for wayback failures (#161)
2025-01-06 16:49:11 +00:00
Patrick Robertson
928518cda7
Allow setting cookies for yt-dl (#158)
2025-01-06 16:19:53 +00:00
Patrick Robertson
0c803f15a5
Fix showing preview images in the .html file when using local storage
...
Local storage media urls are prefixed with '/', previously only http(s) media preview src were displayed
2024-12-31 09:29:31 +01:00
Patrick Robertson
a46f9997ea
Better logging when there's a timestamp parse error
2024-12-31 09:28:08 +01:00
msramalho
83da9ae089
adds pdf preview support for html formatter
2024-12-23 18:19:26 +00:00
Patrick Robertson
663c8ad93a
Add 'print_pdf' option to the screenshot enricher. Fixes #132
2024-12-20 07:14:03 +01:00
msramalho
e49550163f
adds proxy_server option to wacz
2024-10-06 10:45:34 +06:00
msramalho
e6f5981afc
numpy version downgrade
2024-10-06 10:10:04 +06:00
msramalho
c62bf1a34d
yt-dlp version bump
2024-10-05 17:43:07 +06:00
msramalho
b166d57e61
v0.12.0 bump
2024-08-21 13:34:34 +01:00
msramalho
004143a58a
version bump v0.11.6
2024-07-18 11:27:39 +01:00