Commit Graph

710 Commits

Author SHA1 Message Date
Patrick Robertson
3168bed0d9 Add (skipped) test for twitter extraction with youtubedlp 2025-01-15 19:00:57 +01:00
Patrick Robertson
5626bba815 Add test on bluesky and note on why it doesn't work 2025-01-15 18:31:20 +01:00
Patrick Robertson
3ff7a9444d Update yt-dlp to latest version (2025.1.12) to add bsky support 2025-01-15 17:58:07 +01:00
Patrick Robertson
74cf1f5f23 Merge branch 'main' into youtubedlp-rewrite 2025-01-15 17:47:23 +01:00
Patrick Robertson
4f2b9baa73 refactor youtubedlp archiver to work for all valid websites
1. Extract more metadata
2. Better extract thumbnail
3. Setup framework for specific sites to provide more granular metadata processing
2025-01-15 17:46:47 +01:00
Patrick Robertson
c3dd19f309 Sniff filetype of downloaded media and add extension
Also download in chunks - fixes 2 x TODOs
2025-01-15 17:46:47 +01:00
Patrick Robertson
05e0c9de93 Merge pull request #169 from bellingcat/remove_dependencies
Tidy up and remove dependencies
2025-01-15 17:16:30 +01:00
Patrick Robertson
73b1a3902c Merge pull request #172 from bellingcat/docker_compose
Add docker-compose for easy building and running of docker image in dev
2025-01-15 17:16:22 +01:00
Patrick Robertson
100996f1e5 Add docker-compose for easy building and running of docker image in dev
Just use docker compose up
2025-01-15 14:36:02 +01:00
Patrick Robertson
74a4a24a23 Remove toml - unused
(pytest etc. use tomli, which is installed)
2025-01-14 18:13:27 +01:00
Patrick Robertson
306df62a98 Fix all instances of utcnow() 2025-01-14 17:51:41 +01:00
Patrick Robertson
20726c1116 Remove tiktok-downloader - getting info is broken
TODO: switch to using youtube-dlp
2025-01-14 17:40:45 +01:00
Patrick Robertson
2eb2ab9ac9 Merge branch 'main' into remove_dependencies 2025-01-14 17:39:20 +01:00
Patrick Robertson
eebd040e13 Merge pull request #163 from bellingcat/feat/unittest
CI Unit tests
2025-01-14 17:26:34 +01:00
Patrick Robertson
6f10270baf Remove unittest and switch to pytest fully 2025-01-14 16:28:39 +01:00
Patrick Robertson
080f474d49 Remove minify_html package - HTML file is no longer minified
Savings were ~5KB (~15KB vs ~20KB) for the generated .html file, but minify_html is currently not compatible with Python 3.13+
2025-01-14 11:36:10 +01:00
Patrick Robertson
cef4037ad5 Add documentation on running tests to the readme 2025-01-14 11:30:06 +01:00
Patrick Robertson
4e13a09a87 Fix deprecation warning about utcnow 2025-01-14 11:01:40 +01:00
Patrick Robertson
6329b72ee8 Remove argparse from dependency list
Argparse is installed by default on Python >= 3.2, and we only support Python 3.10+
Ref: https://pypi.org/project/argparse/#description
2025-01-14 10:50:47 +01:00
Patrick Robertson
1b1af2f0b1 Revert change to twitter_archiver
As per discussion at: https://github.com/bellingcat/auto-archiver/pull/165#discussion_r1905930837
2025-01-14 10:30:41 +01:00
Patrick Robertson
8f17a235f3 Switch to ubuntu-22.04 for CI tests
An issue with oscrypto means it currently does not work on 24.04. Ref: https://github.com/wbond/oscrypto/issues/78#issuecomment-2565688091
2025-01-14 10:24:14 +01:00
Patrick Robertson
ab2eb3c7f5 Add dev dependencies to poetry 2025-01-13 20:42:08 +01:00
Patrick Robertson
bdfedfcf61 Merge branch 'main' into feat/unittest 2025-01-13 19:50:47 +01:00
Erin Clark
9cdaea873b Merge pull request #164 from bellingcat/ec_add_poetry
Migrate to Poetry
2025-01-13 18:49:15 +00:00
erinhmclark
84ee1b422f Update and restrict versions of Poetry and Python. 2025-01-13 17:42:51 +00:00
Patrick Robertson
b9aea99de8 Prettify pytest output 2025-01-13 18:41:24 +01:00
Patrick Robertson
52f064908e Add unit test badges to readme 2025-01-13 18:33:22 +01:00
Patrick Robertson
9b596e59d6 Run expensive download tests once per week, on a month at 2:35pm
(time is offset from the hour to alleviate high load on GitHub)
2025-01-13 18:33:02 +01:00
Patrick Robertson
528b78db85 Flag tombstone tweets for twitter_syndication method 2025-01-13 18:17:24 +01:00
Patrick Robertson
57eacdc24a Merge branch 'main' into feat/unittest 2025-01-13 18:06:55 +01:00
Patrick Robertson
bbef80de4c Add unit tests for html_formatter, csv_db 2025-01-13 17:58:10 +01:00
Patrick Robertson
930d78096a Merge pull request #162 from bellingcat/small_issues
Fix two small issues
2025-01-13 16:39:59 +01:00
Patrick Robertson
2353f9d6a5 Separate CI for download tests and core tests 2025-01-13 16:27:46 +01:00
Patrick Robertson
63973e2ce7 switch to pytest and pytest-recording 2025-01-13 16:23:20 +01:00
erinhmclark
e9a7f435a3 Add package dist directory to .gitignore 2025-01-13 13:33:23 +00:00
Patrick Robertson
e2bc84ccb9 Merge branch 'main' into feat/unittest 2025-01-13 13:15:13 +01:00
erinhmclark
72a8e76fbb Update README.md for usage with Poetry. 2025-01-12 20:21:23 +00:00
erinhmclark
c69a5fa1c9 Refactor Dockerfile for multi-stage builds.
Combining environment and runtime stages due to Poetry's dependency on source code.
2025-01-12 12:38:12 +00:00
erinhmclark
d80b4b7557 Remove snscrape and Python 3.12 restriction. 2025-01-12 12:15:56 +00:00
erinhmclark
cc490f9c10 Updated Dockerfile (not optimised yet) 2025-01-12 12:15:56 +00:00
erinhmclark
08e83eb94e Update pyproject.toml configuration for Poetry version 2.0.0. 2025-01-12 12:15:56 +00:00
erinhmclark
dd822b8b44 Update poetry.lock 2025-01-12 12:15:56 +00:00
erinhmclark
4a63ca7753 Update PyPi workflow to read python version from pyproject.toml. 2025-01-12 12:15:56 +00:00
erinhmclark
6d5b0090d9 Pull version from pyproject.toml file. 2025-01-12 12:15:56 +00:00
erinhmclark
26abd6f7ae Added TODO comment for adding a version restriction. 2025-01-12 12:15:56 +00:00
erinhmclark
dba8f46016 Replaced comments for python-publish.yaml workflow. 2025-01-12 12:15:56 +00:00
erinhmclark
50e8c93477 Updated workflow for python-publish.yaml to use poetry (untested), and cleanup of pipenv files. 2025-01-12 12:15:56 +00:00
erinhmclark
6da837b374 Add note to update dynamic versioning and references to version. 2025-01-12 12:15:56 +00:00
erinhmclark
660ee82c67 Update Dockerfile for poetry.
Note: Review security with curl installation. Currently locked to known version, but additional checks could be added.
2025-01-12 12:15:56 +00:00
erinhmclark
5490947657 Add packaging to Poetry. 2025-01-12 12:15:56 +00:00