R. Miles McCain
f603400d0d
Add direct Atlos integration (#137)
* Add Atlos feeder
* Add Atlos db
* Add Atlos storage
* Fix Atlos storages
* Fix Atlos feeder
* Only include URLs in Atlos feeder once they're processed
* Remove print
* Add Atlos documentation to README
* Formatting fixes
* Don't archive existing material
* avoid KeyError in atlos_db
* version bump
---------
Co-authored-by: msramalho <19508417+msramalho@users.noreply.github.com>
2024-04-15 19:25:17 +01:00
msramalho
75497f5773
minor bug fix when using an archiver_enricher in enrichers only
2024-04-15 19:02:40 +01:00
msramalho
601572d76e
strip url
2024-02-29 11:54:01 +00:00
msramalho
d21e79a272
general security updates
2024-02-29 11:40:30 +00:00
msramalho
5324d562ba
cleanup wacz patch
2024-02-21 18:14:30 +00:00
Miguel Sozinho Ramalho
7a21ae96af
V0.9.0 - closes several open issues: new enrichers and bug fixes (#133)
* clean orchestrator code, add archiver cleanup logic
* improves documentation for database.py
* telethon archivers isolate sessions into copied files
* closes #127
* closes #125
* closes #84
* meta enricher applies to all media
* closes #61 adds subtitles and comments
* minor update
* minor fixes to yt-dlp subtitles and comments
* closes #17 but logic is imperfect.
* closes #85 ssl enhancer
* minifies HTML, JS refactor for preview of certificates
* closes #91 adds freetsa timestamp authority
* version bump
* simplify download_url method
* skip ssl if nothing archived
* html preview improvements
* adds retrying lib
* manual download archiver improvements
* meta only runs when relevant data available
* new metadata convenience method
* html template improvements
* removes debug message
* does not close #91 yet, will need more certificate chain logging
* adds verbosity config
* new instagram api archiver
* adds proxy support we
* adds proxy/end support and bug fix for yt-dlp
* proxy support for webdriver
* adds socks proxy to wacz_enricher
* refactor recursivity in inner media and display
* infinite recursive display
* foolproofing timestamping authorities
* version to 0.9.0
* minor fixes from code-review
2024-02-20 18:05:29 +00:00
msramalho
499832d146
fix datetime parsing
2023-12-13 18:41:48 +00:00
Miguel Sozinho Ramalho
a786d4bb0e
chooses most complete result from api (#116)
2023-12-13 11:26:46 +00:00
Miguel Sozinho Ramalho
98fb574d89
fixing older db entries formats (#114)
2023-12-12 22:47:54 +00:00
Miguel Sozinho Ramalho
6f36e92e02
enables api_db cache queries if configured with new option (#113)
2023-12-12 19:20:26 +00:00
msramalho
a1742b5565
fixing whisper enricher
2023-08-05 13:57:09 +01:00
msramalho
bd231488ff
parameter fix
2023-07-28 13:10:06 +01:00
msramalho
aa71c85a98
improving ignored content from waczs
2023-07-28 12:19:14 +01:00
msramalho
7a5c9c65bd
detects duplicates before storing, eg: wacz getting media already fetched by another archiver
2023-07-28 10:51:48 +01:00
msramalho
fc93ebaba0
cleanup
2023-07-28 10:49:39 +01:00
msramalho
3dd3775cbd
removes rearchiving logic
2023-07-27 20:14:50 +01:00
msramalho
e8f44b652e
minor improvements
2023-07-27 15:42:23 +01:00
msramalho
a0971fc601
final code review changes
2023-06-26 17:32:19 +01:00
msramalho
0cba2c25c6
get all media method
2023-06-26 17:28:19 +01:00
msramalho
6cf3e109ed
refactor discovery of inner media elements
2023-06-26 17:05:25 +01:00
msramalho
0a91863212
typing fixes
2023-05-24 11:18:39 +01:00
msramalho
2768225cd1
fix: generator not called
2023-05-23 19:05:47 +01:00
msramalho
1a5797d0f8
feat: orchestrator fed returns archive result
2023-05-23 18:12:04 +01:00
msramalho
613b1f1e50
properly overwrite configs
2023-05-19 12:35:19 +01:00
msramalho
a655b3c987
gsheet accepts ID too
2023-05-19 12:17:34 +01:00
msramalho
68e9d2a2ce
allows yaml config to be overwritten
2023-05-19 11:49:02 +01:00
msramalho
9c25b33f1c
fix: multiple storages with folder column
2023-05-09 12:14:07 +01:00
msramalho
c1a60fde8a
fix: deprecates duration column
2023-05-09 11:26:19 +01:00
msramalho
9d44f4b207
content append instead of replace
2023-05-02 19:06:00 +01:00
msramalho
ae7ceba0e5
better debug
2023-05-02 19:05:18 +01:00
msramalho
97821a81bc
log cleanup
2023-05-02 19:05:06 +01:00
msramalho
8c22a9df72
fixes "url-not-found"
2023-05-02 14:30:07 +01:00
msramalho
3d389ee05b
add url info
2023-04-18 19:14:47 +01:00
msramalho
69bcfea2eb
to_json fix
2023-04-18 18:48:51 +01:00
msramalho
493055a8d9
cleanup
2023-03-23 18:50:30 +00:00
msramalho
6f6eb2db7a
Archiving Context refactor complete
2023-03-23 14:28:45 +00:00
msramalho
906ed0f6e0
creating global context and refactoring tmp_dir logic
2023-03-23 11:17:38 +00:00
msramalho
aa5430451e
instagram archiver via telegram bot
2023-02-17 15:46:29 +00:00
msramalho
2a7ece5dcc
cleanups and docs
2023-02-08 22:13:19 +00:00
msramalho
4854929a1d
thumbnail and bot token
2023-02-02 13:49:56 +00:00
msramalho
d8a79b930b
improve logs
2023-02-02 11:55:22 +00:00
msramalho
d1e4dde3f6
fixing imports
2023-01-27 00:19:58 +00:00
msramalho
ac000d5943
cleanup
2023-01-27 00:03:30 +00:00
msramalho
f5b7c3a5ea
mute formatter and docker
2023-01-26 23:38:58 +00:00
msramalho
c261361ac8
try/catch enrichers
2023-01-26 23:03:51 +00:00
msramalho
2508bb8a1b
cleanup + rearchivable logic
2023-01-26 23:01:34 +00:00
msramalho
9dd8afed8c
minor improvements
2023-01-22 23:15:54 +00:00
msramalho
092ffdb6d8
replaywebpage
2023-01-22 00:48:09 +00:00
msramalho
746f6a333e
further cleanup
2023-01-21 19:57:54 +00:00
msramalho
b763fc4188
final naming cleanup + new feeders/dbs
2023-01-21 19:44:12 +00:00