msramalho | a0971fc601 | final code review changes | 2023-06-26 17:32:19 +01:00
msramalho | 3bbfdf6eba | fix: excluding screenshots | 2023-06-26 17:27:49 +01:00
msramalho | b4f86d0e8d | refactor to hash all images and save hex string | 2023-06-26 17:06:30 +01:00
Emiel de Heij | 9fc09c724b | add module for perceptual hashing with pdq | 2023-06-26 15:25:55 +02:00
Logan Williams | c47da0a46f | Fix issue with profiles in browsertrix | 2023-05-11 15:08:27 +02:00
msramalho | e11be449e8 | fix: delete completed whisper tasks | 2023-05-10 18:57:17 +01:00
Logan Williams | ac82764ffc | Working, but some cleanup still necessary | 2023-05-09 19:34:16 +02:00
Logan Williams | 0fae7d96fb | Detect running in docker container in WACZ enricher | 2023-05-09 19:34:16 +02:00
msramalho | 875e1de589 | feat: re-enable HASH on gsheet | 2023-05-09 11:17:44 +01:00
msramalho | 8f3d4e05c3 | fixing bug in whisper enricher | 2023-05-04 09:36:10 +01:00
msramalho | 5b0bff612e | whisper transcripts to content | 2023-05-02 19:05:32 +01:00
msramalho | 5fdaa6c739 | whisper improvements | 2023-04-18 19:28:36 +01:00
msramalho | 3d389ee05b | add url info | 2023-04-18 19:14:47 +01:00
msramalho | 69bcfea2eb | to_json fix | 2023-04-18 18:48:51 +01:00
msramalho | 2e2e695444 | whisper enricher | 2023-03-23 18:50:37 +00:00
msramalho | 493055a8d9 | cleanup | 2023-03-23 18:50:30 +00:00
msramalho | 906ed0f6e0 | creating global context and refactoring tmp_dir logic | 2023-03-23 11:17:38 +00:00
msramalho | 0654e8c5c6 | hash calculation in chunks to avoid exhausting RAM | 2023-03-10 11:34:29 +00:00
msramalho | 0e3c427371 | Bump version to v0.4.3 for release | 2023-02-27 10:30:06 +01:00
msramalho | cd81cae559 | auth wall for WACZ | 2023-02-20 16:08:45 +00:00
msramalho | 5505255ea3 | url auth wall detect | 2023-02-17 15:45:58 +00:00
msramalho | e758bd076b | test | 2023-02-02 12:43:23 +00:00
msramalho | 9bcca427a0 | wacz in gsheets | 2023-02-02 12:41:06 +00:00
msramalho | 39bfde2026 | thumbnails bug fix | 2023-02-01 00:35:48 +00:00
msramalho | d1e4dde3f6 | fixing imports | 2023-01-27 00:19:58 +00:00
msramalho | b763fc4188 | final naming cleanup + new feeders/dbs | 2023-01-21 19:44:12 +00:00
msramalho | 753039240f | pyproject | 2023-01-21 19:01:02 +00:00