msramalho
ce4d7ac649
WIP refactor logging
2025-06-21 15:54:51 +01:00
Patrick Robertson
25f1f5dc93
Merge pull request #279 from bellingcat/telethon_tweaks
Fix calling extractor.cleanup (fixes telethon issue) + tidy up telethon extractor session file naming
2025-03-28 14:13:26 +04:00
Patrick Robertson
a448e2532c
Code tweak for clarity
2025-03-27 15:20:52 +04:00
Patrick Robertson
17d2d14680
Fix running 'cleanup' method on extractors that fail to start
2025-03-26 22:52:52 +04:00
Patrick Robertson
76e90dd23a
Small code tidy ups
2025-03-26 15:34:33 +04:00
Patrick Robertson
e6c5705f70
Merge pull request #261 from bellingcat/wacz_separate_profile
Wacz minor adjustments
2025-03-20 15:51:56 +00:00
Patrick Robertson
5e5e1c43a1
When loading modules, check they have been added to the right 'step' in the config
Fixes an issue seen on Discord where a user accidentally set up metadata_enricher under 'extractors'
2025-03-20 18:09:26 +04:00
Patrick Robertson
f22af5e123
Tweak WACZ enricher docs + add comment on WACZ_ENABLE_DOCKER
2025-03-20 16:48:30 +04:00
Patrick Robertson
244341d22c
Skip check for 'docker' bin dependency if already running in docker
2025-03-19 18:08:04 +04:00
Patrick Robertson
b8da7607e8
Merge branch 'main' into opentimestamps
2025-03-14 12:36:03 +00:00
erinhmclark
6e52a534e7
More fixes from Bugbear suggestions
2025-03-12 16:07:05 +00:00
Patrick Robertson
1423c10363
Finish off timestamping module
2025-03-12 10:24:57 +00:00
erinhmclark
e7fa88f1c7
Implementing ruff suggestions.
2025-03-10 21:45:30 +00:00
erinhmclark
85abe1837a
Ruff format with defaults.
2025-03-10 18:44:54 +00:00
Patrick Robertson
0b5a0fcb32
Better error logs if users have XXXX_archiver modules enabled in config
2025-03-03 19:57:09 +00:00
Patrick Robertson
ca1ed418aa
Throw an error for invalid __manifest__ syntax + fix: allow default values of False/None
2025-02-24 21:46:24 +00:00
Patrick Robertson
49b6c32058
Fix the 'full' mode which creates a complete config file
2025-02-20 11:34:05 +00:00
Patrick Robertson
a9802dd004
Remove the global _LAZY_LOADED_MODULES and allow each instance of ArchivingOrchestrator to load its own modules
2025-02-19 12:25:35 +00:00
erinhmclark
e97ccf8a73
Separate setup() and module_setup().
2025-02-10 18:07:47 +00:00
erinhmclark
2c3d1f591f
Separate setup() and module_setup().
2025-02-10 17:25:15 +00:00
msramalho
7c848046e8
adds better info about wrong/missing modules
2025-02-10 14:59:32 +00:00
Patrick Robertson
c25d5cae84
Remove ArchivingContext completely
Context for a specific URL/item is now passed around via the metadata (metadata.set_context('key', 'val') and metadata.get_context('key', default='something')).
The only other thing that was passed around in ArchivingContext was the storage info, which is now accessible via self.config.
2025-01-30 17:50:54 +01:00
Patrick Robertson
d6b4b7a932
Further cleanup
* Removes (partly) the ArchivingOrchestrator
* Removes the cli_feeder module, and makes it the 'default', allowing you to pass URLs directly on the command line, without having to use the cumbersome --cli_feeder.urls. Just do auto-archiver https://my.url.com
* More unit tests
* Improved error handling
2025-01-30 16:44:40 +01:00
Patrick Robertson
b7d9145f6c
Further tidyups + refactoring for new structure
* Add implementation tests for orchestrator + logging tests
* Standardise method/class vars for extractors to see if they are suitable
* Fix bugs with removing default loguru logger (allows further customisation)
* Fix bug loading required fields from file
2025-01-30 13:21:10 +01:00
Patrick Robertson
00a7018f36
Fix up dependency checking (use 'dependencies' instead of 'external_dependencies' -> simpler/easier to remember)
2025-01-29 19:25:22 +01:00
Patrick Robertson
3d37c494aa
Tidy ups + unit tests:
1. Allow loading modules from --module_paths=/extra/path/here
2. Improved unit tests for module loading
3. Further small tidy ups/clean ups
2025-01-29 18:42:49 +01:00
Patrick Robertson
7a4871db6b
Fix up unit tests for new structure
2025-01-28 14:40:12 +01:00
erinhmclark
e1a9373336
Refactoring for new config setup
2025-01-27 19:03:02 +00:00
Patrick Robertson
e3074013d0
Fix loading/saving to orchestration file with comments
2025-01-27 14:28:04 +01:00
Patrick Robertson
f68e2726f2
Refactor loader + step into module, use LazyBaseModule and BaseModule
2025-01-27 14:01:36 +01:00