msramalho
085376f63f
telegram archiver
2023-01-18 21:14:20 +00:00
msramalho
63d1abbe4b
tiktok archiver though info is no longer working
2023-01-18 16:56:35 +00:00
msramalho
1def8bb03d
instagram archiver
2023-01-18 16:16:23 +00:00
msramalho
725bab8240
twitter archivers
2023-01-18 00:15:18 +00:00
msramalho
f1bc83818d
template updates
2023-01-17 17:01:25 +00:00
msramalho
47dc788143
thumbnails enricher
2023-01-17 16:29:27 +00:00
msramalho
74e50eccf1
hash enricher and media refactor
2023-01-13 02:12:08 +00:00
msramalho
6ca46417fe
local storage + multiple storage support
2023-01-12 02:09:39 +00:00
msramalho
0cb593fd21
wayback enricher ready
2023-01-11 00:03:47 +00:00
msramalho
d4825196f1
html template working with jinja templates
2023-01-10 00:22:16 +00:00
msramalho
aac16fa8c2
minor comments
2023-01-09 22:24:44 +00:00
msramalho
1cdc006b27
s3 storage + WIP gsheets DB
2023-01-04 18:02:44 +00:00
msramalho
bb512b36c9
gsheet feeder + db WIP
2023-01-04 16:37:36 +00:00
msramalho
96845305a3
media concept implemented
2022-12-14 19:01:20 +00:00
msramalho
9c056d001c
merge logic started
2022-12-14 16:11:06 +00:00
msramalho
53ffa2d4ae
telethon_archiver working for multiple media
2022-12-14 15:37:34 +00:00
msramalho
b3860cfec1
telethon join channels working
2022-12-14 14:01:39 +00:00
msramalho
955891a411
WIP feeder
2022-12-10 12:03:46 +00:00
msramalho
9dc709d3b9
demo feeder logic working
2022-11-24 15:44:25 +00:00
msramalho
618e7ed0a3
subproperties in config
2022-11-24 11:53:21 +00:00
msramalho
65dd155c90
WIP refactor logic
2022-11-15 15:00:52 +00:00
msramalho
6a0ce5ced1
orchestrator design structure
2022-11-11 02:08:48 +00:00
msramalho
04263094ad
WIP docker changes for cli and auto_archiver
2022-11-10 17:46:40 +00:00
msramalho
390b84eb22
dockerization complete
2022-11-08 15:55:33 +00:00
msramalho
81eadd4672
disable browsertrix on docker, see #66
2022-11-08 14:22:13 +00:00
msramalho
a8f7055696
reduces uncontrolled exceptions
2022-11-08 13:59:59 +00:00
msramalho
09f47383a3
dockerfile improvements
2022-11-08 13:59:35 +00:00
msramalho
629cd586db
adds session_file for missing archivers
2022-11-08 13:59:09 +00:00
msramalho
889eb1d270
Merge branch 'dev' into dockerize
2022-11-02 17:01:00 +00:00
msramalho
50e03ba565
closes #65 with simpler solution
2022-11-02 16:59:44 +00:00
msramalho
a9df992f66
WiP
2022-11-02 16:51:32 +00:00
msramalho
c8fa077df7
docker initial files
2022-10-31 17:10:55 +00:00
msramalho
29e1872e87
fix: rm stopped containers only
2022-10-31 10:41:27 +00:00
msramalho
7a700acd8e
hotfix for #65
2022-10-31 10:35:01 +00:00
msramalho
22363cb8b9
adds information on browsertrix usage
2022-10-20 11:59:23 +01:00
msramalho
ac4f1b6132
readme updates
2022-10-19 11:37:04 +01:00
msramalho
4d2b7b4040
reverse order of login attempts
2022-10-19 11:27:17 +01:00
msramalho
54c572258c
fix tty
2022-10-18 17:46:40 +01:00
msramalho
6c80a5b82d
session file logic
2022-10-18 17:35:59 +01:00
msramalho
63f53358d3
adds traceback
2022-10-18 16:38:12 +01:00
msramalho
3f121d800e
catch bad instagram login
2022-10-18 16:36:27 +01:00
msramalho
93be1af93f
adds instagram post/profile
2022-10-18 15:45:10 +01:00
msramalho
f0f844a569
improves browsertrix configurations
2022-10-18 11:21:10 +01:00
msramalho
df502f3bde
updates yt-dlp
2022-10-18 11:20:53 +01:00
msramalho
26903190fd
adds wacz link
2022-10-17 14:41:34 +01:00
Miguel Sozinho Ramalho
683f2d7500
Merge pull request #64 from bellingcat/dev
2022-10-17 14:40:15 +01:00
Miguel Sozinho Ramalho
23a4dc20c5
Merge pull request #63 from edsu/browsertrix-crawler
2022-10-17 14:39:34 +01:00
msramalho
57464f1506
refactors for edges in browsertrix and s3 upload, adds timeout parameter
2022-10-17 14:07:31 +01:00
msramalho
dc0ca8bdd6
adds browsertrix to all archivers flows
2022-10-17 14:06:50 +01:00
Ed Summers
20ca50dc90
Clean up browsertrix-crawler files
Remove any local browsertrix-crawler files after the WACZ has been
copied to storage. Note: until the fix for the issue below is released
on DockerHub, the local files cannot be deleted, because Docker on Linux
creates them as root:
https://github.com/webrecorder/browsertrix-crawler/issues/170
The code catches this exception and logs a warning instead of failing
and losing the work that has been completed.
2022-10-11 16:49:19 -04:00