Commit Graph

30 Commits

Author SHA1 Message Date
msramalho
ce4d7ac649 WIP refactor logging 2025-06-21 15:54:51 +01:00
Patrick Robertson
25f1f5dc93 Merge pull request #279 from bellingcat/telethon_tweaks
Fix calling extractor.cleanup (fixes telethon issue) + tidy up telethon extractor session file naming
2025-03-28 14:13:26 +04:00
Patrick Robertson
a448e2532c Code tweak for clarity 2025-03-27 15:20:52 +04:00
Patrick Robertson
17d2d14680 Fix running 'cleanup' method on extractors that fail to start 2025-03-26 22:52:52 +04:00
Patrick Robertson
76e90dd23a Small code tidy ups 2025-03-26 15:34:33 +04:00
Patrick Robertson
e6c5705f70 Merge pull request #261 from bellingcat/wacz_separate_profile
Wacz minor adjustments
2025-03-20 15:51:56 +00:00
Patrick Robertson
5e5e1c43a1 When loading modules, check they have been added to the right 'step' in the config
Fixes an issue seen on Discord where a user accidentally set up metadata_enricher under 'extractors'
2025-03-20 18:09:26 +04:00
Patrick Robertson
f22af5e123 Tweak WACZ enricher docs + add comment on WACZ_ENABLE_DOCKER 2025-03-20 16:48:30 +04:00
Patrick Robertson
244341d22c Skip check for 'docker' bin dependency if already running in docker 2025-03-19 18:08:04 +04:00
Patrick Robertson
b8da7607e8 Merge branch 'main' into opentimestamps 2025-03-14 12:36:03 +00:00
erinhmclark
6e52a534e7 More fixes from Bugbear suggestions 2025-03-12 16:07:05 +00:00
Patrick Robertson
1423c10363 Finish off timestamping module 2025-03-12 10:24:57 +00:00
erinhmclark
e7fa88f1c7 Implementing ruff suggestions. 2025-03-10 21:45:30 +00:00
erinhmclark
85abe1837a Ruff format with defaults. 2025-03-10 18:44:54 +00:00
Patrick Robertson
0b5a0fcb32 Better error logs if users have XXXX_archiver modules enabled in config 2025-03-03 19:57:09 +00:00
Patrick Robertson
ca1ed418aa Throw an error for invalid __manifest__ syntax + fix: allow default values of False/None 2025-02-24 21:46:24 +00:00
Patrick Robertson
49b6c32058 Fix the 'full' mode which creates a complete config file 2025-02-20 11:34:05 +00:00
Patrick Robertson
a9802dd004 Remove the global _LAZY_LOADED_MODULES and allow each instance of ArchivingOrchestrator to load its own modules 2025-02-19 12:25:35 +00:00
erinhmclark
e97ccf8a73 Separate setup() and module_setup(). 2025-02-10 18:07:47 +00:00
erinhmclark
2c3d1f591f Separate setup() and module_setup(). 2025-02-10 17:25:15 +00:00
msramalho
7c848046e8 adds better info about wrong/missing modules 2025-02-10 14:59:32 +00:00
Patrick Robertson
c25d5cae84 Remove ArchivingContext completely
Context for a specific URL/item is now passed around via the metadata (metadata.set_context('key', 'val') and metadata.get_context('key', default='something'))
The only other thing that was passed around in ArchivingContext was the storage info, which is already accessible now via self.config
2025-01-30 17:50:54 +01:00
Patrick Robertson
d6b4b7a932 Further cleanup
* Removes (partly) the ArchivingOrchestrator
* Removes the cli_feeder module, and makes it the 'default', allowing you to pass URLs directly on the command line, without having to use the cumbersome --cli_feeder.urls. Just do auto-archiver https://my.url.com
* More unit tests
* Improved error handling
2025-01-30 16:44:40 +01:00
Patrick Robertson
b7d9145f6c Further tidyups + refactoring for new structure
* Add implementation tests for orchestrator + logging tests
* Standardise method/class vars for extractors to see if they are suitable
* Fix bugs with removing default loguru logger (allows further customisation)
* Fix bug loading required fields from file
2025-01-30 13:21:10 +01:00
Patrick Robertson
00a7018f36 Fix up dependency checking (use 'dependencies' instead of 'external_dependencies' -> simpler/easier to remember) 2025-01-29 19:25:22 +01:00
Patrick Robertson
3d37c494aa Tidy ups + unit tests:
1. Allow loading modules from --module_paths=/extra/path/here
2. Improved unit tests for module loading
3. Further small tidy ups/clean ups
2025-01-29 18:42:49 +01:00
Patrick Robertson
7a4871db6b Fix up unit tests for new structure 2025-01-28 14:40:12 +01:00
erinhmclark
e1a9373336 Refactoring for new config setup 2025-01-27 19:03:02 +00:00
Patrick Robertson
e3074013d0 Fix loading/saving to orchestration file with comments 2025-01-27 14:28:04 +01:00
Patrick Robertson
f68e2726f2 Refactor loader + step into module, use LazyBaseModule and BaseModule 2025-01-27 14:01:36 +01:00