Commit Graph

  • aa7ca93a43 Update manifests and modules erinhmclark 2025-01-24 12:58:16 +00:00
  • ba4b330881 Merge remote-tracking branch 'origin/more_mainifests' into more_mainifests erinhmclark 2025-01-24 08:04:27 +00:00
  • cbafbfab3f Revert Dockerfile changes erinhmclark 2025-01-24 08:04:09 +00:00
  • 9befb9776c Fix loading modules when entry_point isn't set Patrick Robertson 2025-01-23 21:08:54 +01:00
  • 06f6e34d9d Revert changes to orchestrator to avoid merge conflicts Patrick Robertson 2025-01-23 20:38:36 +01:00
  • b27bf8ffeb Fix up loading/storing configs + unit tests Patrick Robertson 2025-01-23 20:32:19 +01:00
  • 50f4ebcdc3 Move storage configs into individual manifests, assert format on usage. erinhmclark 2025-01-23 17:01:30 +00:00
  • c3403ced26 Rename storages for clarity erinhmclark 2025-01-23 16:51:17 +00:00
  • 1274a1b231 More manifests, base modules and rename from archiver to extractor. erinhmclark 2025-01-23 16:40:48 +00:00
  • 9db26cdfc2 Merge branch 'load_modules' into more_mainifests erinhmclark 2025-01-23 09:19:54 +00:00
  • 79684f8348 Set up feeder manifests (not merged by source yet) erinhmclark 2025-01-23 09:16:42 +00:00
  • 65ef46d01e Fix loading already loaded modules - don't load them twice Patrick Robertson 2025-01-23 00:09:39 +01:00
  • 550097ab7b Get module loading working properly Patrick Robertson 2025-01-22 23:54:21 +01:00
  • c517d35bdf Merge branch 'load_modules' into more_mainifests erinhmclark 2025-01-22 18:19:43 +00:00
  • 99c8c69085 Manifests for databases erinhmclark 2025-01-22 18:18:13 +00:00
  • ade5ea0f6f Tidy up imports + start on loading modules - program now starts much faster Patrick Robertson 2025-01-22 18:45:58 +01:00
  • b6b085854c Switch back to using yaml with dot notation Patrick Robertson 2025-01-22 17:40:51 +01:00
  • 54995ad6ab Further tweaks based on __manifest__.py files Patrick Robertson 2025-01-22 13:11:43 +01:00
  • 7b3a1468cd Create manifest files for archiver modules. erinhmclark 2025-01-21 22:29:50 +00:00
  • 4830f99300 Get parsing of manifest and combining with config file working Patrick Robertson 2025-01-21 20:03:10 +01:00
  • 241b35002c Initial changes to move to '__manifest__' format Patrick Robertson 2025-01-21 19:02:38 +01:00
  • 03f3770223 Add __manifest__.py for generic_extractor Patrick Robertson 2025-01-21 18:00:45 +01:00
  • bdfc855297 Ignore pylint statements for manifest files Patrick Robertson 2025-01-21 17:59:52 +01:00
  • c41d93a634 Use already implemented helper to get version Patrick Robertson 2025-01-21 17:53:37 +01:00
  • d4fff0b6eb Merge pull request #175 from bellingcat/youtubedlp-rewrite Patrick Robertson 2025-01-21 17:33:39 +01:00
  • cd2ae3763f Minor adjustments Patrick Robertson 2025-01-21 16:24:37 +00:00
  • d3e3eb7639 unit tests for loading dropins Patrick Robertson 2025-01-21 16:58:18 +01:00
  • 9dde9b26d0 Patch in upstream changes to ytdlp for now Patrick Robertson 2025-01-21 16:49:49 +01:00
  • 7c0dcbfd81 Re-add doc string to generic_archiver Patrick Robertson 2025-01-21 16:49:30 +01:00
  • 6388983815 Merge branch 'main' into youtubedlp-rewrite Patrick Robertson 2025-01-21 16:43:14 +01:00
  • 4bb4ebdf82 Further cleanup, abstracts 'dropins' out into generic files Patrick Robertson 2025-01-21 16:36:45 +01:00
  • 113a4db251 Merge pull request #177 from bellingcat/feat/documentation Erin Clark 2025-01-21 09:54:41 +00:00
  • e83ccc0d7f Cleaning up configs reference and module level. erinhmclark 2025-01-21 09:48:46 +00:00
  • dff0105659 Small fixups + implement Truth code for posts with multiple media Patrick Robertson 2025-01-20 18:40:46 +01:00
  • fd2e7f973b Further tidy-ups, also adds some ytdlp utils to 'utils' Patrick Robertson 2025-01-20 16:17:57 +01:00
  • befc92deb4 Further unit test tidy ups Patrick Robertson 2025-01-17 17:29:13 +01:00
  • d4893ee05e Fix unit tests for base_archiver->generic_archiver rename Patrick Robertson 2025-01-17 17:08:00 +01:00
  • 9c5a9e1bcd Rename BaseArchiver to GenericArchiver + some other tidyups Patrick Robertson 2025-01-17 17:06:04 +01:00
  • 5aa717452e Quick test that the app actually runs in core tests Patrick Robertson 2025-01-17 17:02:54 +01:00
  • 5b20288d06 Add a 'version' arg to get the current running version Patrick Robertson 2025-01-17 16:59:57 +01:00
  • 59eb8f7520 Add TWITTER_BEARER_TOKEN to env for running download tests Patrick Robertson 2025-01-17 12:04:40 +01:00
  • 17c1c9c360 Fix up core unit tests when a twitter api key isn't provided Patrick Robertson 2025-01-17 12:02:38 +01:00
  • 394bcd8d47 Further refactoring of youtubedl_archiver->base_archiver Patrick Robertson 2025-01-17 11:56:08 +01:00
  • 170f8d18a6 Add instructions to README.md, include build directories in .gitignore and do a bit more tidying. erinhmclark 2025-01-16 20:46:10 +00:00
  • f03ec42026 Merge pull request #174 from bellingcat/version_updates Erin Clark 2025-01-16 14:31:26 +00:00
  • 6fabe2a189 Fixed twitter_archiver.py changes. erinhmclark 2025-01-16 09:56:54 +00:00
  • a6aacfa3fb Add example pre-generated configs.rst erinhmclark 2025-01-16 09:31:50 +00:00
  • bbb3269c2b Changes from main. erinhmclark 2025-01-16 09:30:32 +00:00
  • 235da33a1a Update .readthedocs.yaml path erinhmclark 2025-01-16 09:24:46 +00:00
  • d3eec5d90f Basic docs structure for RTD erinhmclark 2025-01-15 21:45:29 +00:00
  • 3168bed0d9 Add (skipped) test for twitter extraction with youtubedlp Patrick Robertson 2025-01-15 19:00:57 +01:00
  • 33686ea851 Update versions for GH Actions and Geckodriver. erinhmclark 2025-01-15 17:35:42 +00:00
  • 5626bba815 Add test on bluesky and note on why it doesn't work Patrick Robertson 2025-01-15 18:31:20 +01:00
  • 3ff7a9444d Update yt-dlp to latest version (2025.1.12) to add bsky support Patrick Robertson 2025-01-15 17:58:07 +01:00
  • 74cf1f5f23 Merge branch 'main' into youtubedlp-rewrite Patrick Robertson 2025-01-15 17:47:23 +01:00
  • 4f2b9baa73 refactor youtubedlp archiver to work for all valid websites Patrick Robertson 2025-01-15 17:39:47 +01:00
  • c3dd19f309 Sniff filetype of downloaded media and add extension Patrick Robertson 2025-01-15 17:02:19 +01:00
  • 05e0c9de93 Merge pull request #169 from bellingcat/remove_dependencies Patrick Robertson 2025-01-15 17:16:30 +01:00
  • 73b1a3902c Merge pull request #172 from bellingcat/docker_compose Patrick Robertson 2025-01-15 17:16:22 +01:00
  • 100996f1e5 Add docker-compose for easy building and running of docker image in dev Patrick Robertson 2025-01-15 14:27:04 +01:00
  • 74a4a24a23 Remove toml - unused Patrick Robertson 2025-01-14 18:13:27 +01:00
  • 306df62a98 Fix all instances of utcnow() Patrick Robertson 2025-01-14 17:50:44 +01:00
  • 20726c1116 Remove tiktok-downloader - getting info is broken Patrick Robertson 2025-01-14 17:40:45 +01:00
  • 2eb2ab9ac9 Merge branch 'main' into remove_dependencies Patrick Robertson 2025-01-14 17:39:20 +01:00
  • eebd040e13 Merge pull request #163 from bellingcat/feat/unittest Patrick Robertson 2025-01-14 17:26:34 +01:00
  • 6f10270baf Remove unittest and switch to pytest fully feat/unittest Patrick Robertson 2025-01-14 16:28:39 +01:00
  • 080f474d49 Remove minify_html package - HTML file is no longer minified Patrick Robertson 2025-01-14 11:36:10 +01:00
  • cef4037ad5 Add documentation on running tests to the readme Patrick Robertson 2025-01-14 11:30:06 +01:00
  • 4e13a09a87 Fix deprecation warning about utcnow Patrick Robertson 2025-01-14 11:00:27 +01:00
  • 6329b72ee8 Remove argparse from dependency list Patrick Robertson 2025-01-14 10:50:47 +01:00
  • 1b1af2f0b1 Revert change to twitter_archiver Patrick Robertson 2025-01-14 10:30:41 +01:00
  • 8f17a235f3 Switch to ubuntu-22.04 for CI tests Patrick Robertson 2025-01-14 10:06:13 +01:00
  • ab2eb3c7f5 Add dev dependencies to poetry Patrick Robertson 2025-01-13 20:40:12 +01:00
  • bdfedfcf61 Merge branch 'main' into feat/unittest Patrick Robertson 2025-01-13 19:50:47 +01:00
  • 9cdaea873b Merge pull request #164 from bellingcat/ec_add_poetry Erin Clark 2025-01-13 18:49:15 +00:00
  • 84ee1b422f Update and restrict versions of Poetry and Python. erinhmclark 2025-01-13 17:42:51 +00:00
  • b9aea99de8 Prettify pytest output Patrick Robertson 2025-01-13 18:41:24 +01:00
  • 52f064908e Add unit test badges to readme Patrick Robertson 2025-01-13 18:33:22 +01:00
  • 9b596e59d6 Run expensive download tests once per week, on a month at 2:35pm Patrick Robertson 2025-01-13 18:33:02 +01:00
  • 528b78db85 Flag tombstone tweets for twitter_syndication method Patrick Robertson 2025-01-13 18:17:24 +01:00
  • 57eacdc24a Merge branch 'main' into feat/unittest Patrick Robertson 2025-01-13 18:06:55 +01:00
  • bbef80de4c Add unit tests for html_formatter, csv_db Patrick Robertson 2025-01-13 17:58:10 +01:00
  • 930d78096a Merge pull request #162 from bellingcat/small_issues Patrick Robertson 2025-01-13 16:39:59 +01:00
  • 2353f9d6a5 Separate CI for download tests and core tests Patrick Robertson 2025-01-13 16:27:46 +01:00
  • 63973e2ce7 switch to pytest and pytest-recording Patrick Robertson 2025-01-13 14:31:29 +01:00
  • e9a7f435a3 Add package dist directory to .gitignore erinhmclark 2025-01-13 13:33:23 +00:00
  • e2bc84ccb9 Merge branch 'main' into feat/unittest Patrick Robertson 2025-01-13 13:15:13 +01:00
  • 72a8e76fbb Update README.md for usage with Poetry. erinhmclark 2025-01-12 20:21:23 +00:00
  • c69a5fa1c9 Refactor Dockerfile for multi-stage builds. erinhmclark 2025-01-12 12:38:12 +00:00
  • d80b4b7557 Remove snscrape and Python 3.12 restriction. erinhmclark 2025-01-08 20:46:22 +00:00
  • cc490f9c10 Updated Dockerfile (not optimised yet) erinhmclark 2025-01-07 20:34:16 +00:00
  • 08e83eb94e Update pyproject.toml configuration for Poetry version 2.0.0. erinhmclark 2025-01-07 16:37:51 +00:00
  • dd822b8b44 Update poetry.lock erinhmclark 2025-01-06 22:31:54 +00:00
  • 4a63ca7753 Update PyPi workflow to read python version from pyproject.toml. erinhmclark 2025-01-06 20:17:54 +00:00
  • 6d5b0090d9 Pull version from pyproject.toml file. erinhmclark 2025-01-06 20:17:04 +00:00
  • 26abd6f7ae Added TODO comment for adding a version restriction. erinhmclark 2025-01-02 16:46:47 +00:00
  • dba8f46016 Replaced comments for python-publish.yaml workflow. erinhmclark 2024-12-31 15:08:20 +00:00
  • 50e8c93477 Updated workflow for python-publish.yaml to use poetry (untested), and cleanup of pipenv files. erinhmclark 2024-12-31 15:05:46 +00:00
  • 6da837b374 Add note to update dynamic versioning and references to version. erinhmclark 2024-12-31 14:11:51 +00:00
  • 660ee82c67 Update Dockerfile for poetry. erinhmclark 2024-12-31 14:10:52 +00:00