Commit Graph

27 Commits

Author          SHA1        Message                                                Date
msramalho       a0971fc601  final code review changes                              2023-06-26 17:32:19 +01:00
msramalho       3bbfdf6eba  fix: excluding screenshots                             2023-06-26 17:27:49 +01:00
msramalho       b4f86d0e8d  refactor to hash all images and save hex string        2023-06-26 17:06:30 +01:00
Emiel de Heij   9fc09c724b  add module for perceptual hashing with pdq             2023-06-26 15:25:55 +02:00
Logan Williams  c47da0a46f  Fix issue with profiles in browsertrix                 2023-05-11 15:08:27 +02:00
msramalho       e11be449e8  fix: delete completed whisper tasks                    2023-05-10 18:57:17 +01:00
Logan Williams  ac82764ffc  Working, but some cleanup still necessary              2023-05-09 19:34:16 +02:00
Logan Williams  0fae7d96fb  Detect running in docker container in WACZ enricher    2023-05-09 19:34:16 +02:00
msramalho       875e1de589  feat: re-enable HASH on gsheet                         2023-05-09 11:17:44 +01:00
msramalho       8f3d4e05c3  fixing bug in whisper enricher                         2023-05-04 09:36:10 +01:00
msramalho       5b0bff612e  whisper transcripts to content                         2023-05-02 19:05:32 +01:00
msramalho       5fdaa6c739  whisper improvements                                   2023-04-18 19:28:36 +01:00
msramalho       3d389ee05b  add url info                                           2023-04-18 19:14:47 +01:00
msramalho       69bcfea2eb  to_json fix                                            2023-04-18 18:48:51 +01:00
msramalho       2e2e695444  whisper enricher                                       2023-03-23 18:50:37 +00:00
msramalho       493055a8d9  cleanup                                                2023-03-23 18:50:30 +00:00
msramalho       906ed0f6e0  creating global context and refactoring tmp_dir logic  2023-03-23 11:17:38 +00:00
msramalho       0654e8c5c6  hash calculation in chunks to avoid exhausting RAM     2023-03-10 11:34:29 +00:00
msramalho       0e3c427371  Bump version to v0.4.3 for release                     2023-02-27 10:30:06 +01:00
msramalho       cd81cae559  auth wall for WACZ                                     2023-02-20 16:08:45 +00:00
msramalho       5505255ea3  url auth wall detect                                   2023-02-17 15:45:58 +00:00
msramalho       e758bd076b  test                                                   2023-02-02 12:43:23 +00:00
msramalho       9bcca427a0  wacz in gsheets                                        2023-02-02 12:41:06 +00:00
msramalho       39bfde2026  thumbnails bug fix                                     2023-02-01 00:35:48 +00:00
msramalho       d1e4dde3f6  fixing imports                                         2023-01-27 00:19:58 +00:00
msramalho       b763fc4188  final naming cleanup + new feeders/dbs                 2023-01-21 19:44:12 +00:00
msramalho       753039240f  pyproject                                              2023-01-21 19:01:02 +00:00