Commit Graph

  • 24544b0fe8 library updates msramalho 2022-06-07 17:28:47 +02:00
  • e32c0788b7 minor update msramalho 2022-06-03 19:40:26 +02:00
  • e2d1a5d6be import cleanups msramalho 2022-06-03 18:30:12 +02:00
  • 5e495b713f minor update msramalho 2022-06-03 18:23:53 +02:00
  • 66e214afa4 minor improvements and cleanup archivers msramalho 2022-06-03 18:23:45 +02:00
  • 5135e97d3f cleanup auto_archive and config msramalho 2022-06-03 18:03:49 +02:00
  • aaa1d299da started cleaning auto_archive msramalho 2022-06-03 17:32:55 +02:00
  • a2fdfacb26 config refactor and cleanup msramalho 2022-06-03 17:32:25 +02:00
  • c679e02c73 updated storages init msramalho 2022-06-03 17:32:02 +02:00
  • d33daabee1 refactoring storages msramalho 2022-06-03 15:46:00 +02:00
  • 10f03cb888 Merge branch 'dev' into refactor-configs msramalho 2022-06-02 17:30:47 +02:00
  • 19049cea46 Merge pull request #35 from djhmateer/folders Miguel Sozinho Ramalho 2022-05-26 18:23:09 +01:00
  • 159adf9afe refactoring filenumber into subfolder msramalho 2022-05-26 19:18:29 +02:00
  • 03aa02e88b diagram msramalho 2022-05-25 12:23:59 +02:00
  • b895def432 method customization to children msramalho 2022-05-25 12:23:52 +02:00
  • 93cf3a8937 remove vscode files for now msramalho 2022-05-25 12:19:37 +02:00
  • b58cbd2e85 package management msramalho 2022-05-25 12:19:29 +02:00
  • 0c1cb6d6af improve comments msramalho 2022-05-25 12:19:18 +02:00
  • c802a15160 ignoring new files msramalho 2022-05-25 12:19:04 +02:00
  • ea261635a2 cleanup msramalho 2022-05-25 10:32:26 +02:00
  • dbac5accbd Save to folders for S3 and GD. Google Drive (GD) storage Dave Mateer 2022-05-11 15:39:44 +01:00
  • b3599dee71 working Dave Mateer 2022-05-11 14:01:22 +01:00
  • 2a01038c0c memleak msramalho 2022-05-11 00:14:42 +02:00
  • d7f44b948f wayback fix msramalho 2022-05-10 23:15:58 +02:00
  • bca960b228 merge from master and fixes msramalho 2022-05-10 23:09:33 +02:00
  • f6e8da34b8 Merge remote-tracking branch 'origin/main' into refactor-configs msramalho 2022-05-10 22:37:09 +02:00
  • d469967c03 fix index out of range for empty sheets msramalho 2022-05-10 22:24:21 +02:00
  • f6bc45361a ignore custom configs msramalho 2022-05-10 20:48:40 +02:00
  • 94b37b02ba telethon refactor for failures msramalho 2022-05-10 20:23:44 +02:00
  • 39f27ec1bc reenable telethon msramalho 2022-05-10 20:23:13 +02:00
  • b459f36dda C msramalho 2022-05-09 18:23:01 +02:00
  • e0276dfab1 additional cleanup msramalho 2022-05-09 18:19:38 +02:00
  • 6bd6f88b46 refactor msramalho 2022-05-09 17:45:54 +02:00
  • bba510b8c2 Merge pull request #30 from djhmateer/logging Miguel Sozinho Ramalho 2022-05-09 15:59:49 +01:00
  • 6e8eccefd8 self-documenting info message Miguel Sozinho Ramalho 2022-05-09 15:59:35 +01:00
  • 05a3adfc36 Merge pull request #31 from djhmateer/firefox Miguel Sozinho Ramalho 2022-05-09 15:59:02 +01:00
  • 3a3d3c6690 Merge pull request #29 from djhmateer/fb-cookie Miguel Sozinho Ramalho 2022-05-09 15:53:07 +01:00
  • 3c0811a80d Merge pull request #28 from djhmateer/yt-sub Miguel Sozinho Ramalho 2022-05-09 15:52:10 +01:00
  • 16f4093fbf Merge pull request #27 from djhmateer/bar Miguel Sozinho Ramalho 2022-05-09 13:56:10 +01:00
  • 0d65798308 wip: configurations and logic msramalho 2022-05-09 14:54:48 +02:00
  • bb599f702d Reload firefox driver on every spreadsheet row Dave Mateer 2022-05-09 12:16:18 +01:00
  • f52d8cdef8 add back in d_dotenv() Dave Mateer 2022-05-09 12:02:43 +01:00
  • e3c0ae1d45 dotenv Dave Mateer 2022-05-09 11:57:54 +01:00
  • e18a9779db added log directory and file creation Dave Mateer 2022-05-09 11:55:10 +01:00
  • 07b82f0e82 update .example.env Dave Mateer 2022-05-09 11:46:19 +01:00
  • 51f635ce50 get env variable FACEBOOK_COOKIE patch through from auto_archive Dave Mateer 2022-05-09 11:44:25 +01:00
  • 7ae6e0c6f8 fb cookie in ytd Dave Mateer 2022-05-09 11:38:08 +01:00
  • bd235347ac Added catch in youtubedl_archiver for twitter.com to see if a linked video is in there eg vk.com Dave Mateer 2022-05-09 11:23:42 +01:00
  • cb18289e4f Get Twitter original size image quality Dave Mateer 2022-05-09 10:55:38 +01:00
  • f592c7fcfe refactor to use config.py msramalho 2022-05-03 20:34:04 +02:00
  • f00e31c23d introduce config.py msramalho 2022-05-03 20:33:54 +02:00
  • ac9ed1a0d7 extract wayback config msramalho 2022-05-03 20:33:38 +02:00
  • a7948ac768 extract telegram config msramalho 2022-05-03 20:33:19 +02:00
  • 03a6611c86 adds local storage msramalho 2022-05-03 20:33:02 +02:00
  • 24340190af s3 storage config refactor msramalho 2022-05-03 20:32:53 +02:00
  • b680700b22 ignoring config file msramalho 2022-05-03 20:32:23 +02:00
  • 8f62e8b7c6 Update README.md Miguel Sozinho Ramalho 2022-05-03 14:45:18 +01:00
  • 77700060a4 Merge pull request #25 from djhmateer/foo Miguel Sozinho Ramalho 2022-05-03 14:37:31 +01:00
  • fec380e93d Fixed wwww (4 w's) to www in youtubedl Dave Mateer 2022-04-27 10:18:10 +01:00
  • 1b020b5856 Merge pull request #21 from bellingcat/resolve-telethon-issues Logan Williams 2022-04-26 10:13:14 +02:00
  • 8358ab0bfc assert post is not None msramalho 2022-03-30 11:12:06 +02:00
  • 3bdeec1d2f fix deprecation warning for selenium msramalho 2022-03-30 11:05:31 +02:00
  • e5168fa07c removing TODO msramalho 2022-03-30 10:55:57 +02:00
  • 576f1a8f68 fix the UTF-8 issue for Cyrillic msramalho 2022-03-30 10:55:33 +02:00
  • 398f296789 Fix Selenium driver issues with telegram links Logan Williams 2022-03-18 11:10:27 +01:00
  • 538bb05395 Merge branch 'main' of github.com:bellingcat/auto-archiver into main Logan Williams 2022-03-18 09:53:29 +01:00
  • 050b04e31d Add flag for storage privacy Logan Williams 2022-03-18 09:53:21 +01:00
  • d611aa1e14 Some videos don't render a duration for some reason Logan Williams 2022-03-18 09:44:17 +01:00
  • 0fe455558a Merge pull request #20 from bellingcat/telegram-through-api Logan Williams 2022-03-18 09:38:44 +01:00
  • 450065b6fb removes print msramalho 2022-03-16 19:56:18 +01:00
  • 516db483d6 telethon archiver working for 0,1,1+ media objects msramalho 2022-03-16 19:51:02 +01:00
  • c2ae382a4e isolates html page generation logic so it can be reused msramalho 2022-03-16 19:50:44 +01:00
  • 30787506a1 additional logging msramalho 2022-03-16 19:50:29 +01:00
  • 0035603bfb telethon-poc msramalho 2022-03-15 18:45:53 +01:00
  • 3b9b42b854 minor code cleanup msramalho 2022-03-15 11:32:39 +01:00
  • 0304860bce Don't check status for empty URL rows Logan Williams 2022-03-14 11:10:51 +01:00
  • aaca6efac1 Merge pull request #19 from bellingcat/screenshots Logan Williams 2022-03-14 09:51:57 +01:00
  • 8d06eae96a Merge pull request #18 from bellingcat/hasing-and-multiple-names Logan Williams 2022-03-14 09:49:32 +01:00
  • 07bbf443ca improves documentation msramalho 2022-03-13 12:05:09 +01:00
  • 4c54926548 offset fix msramalho 2022-03-12 20:29:43 +01:00
  • d8d9cf17dc fix offset msramalho 2022-03-12 20:25:52 +01:00
  • f121c9dab7 enable tolower msramalho 2022-03-12 20:14:16 +01:00
  • 67b16064bb offby1 msramalho 2022-03-12 20:11:38 +01:00
  • ec4ae84487 case-insensitive is a bad idea msramalho 2022-03-12 20:06:31 +01:00
  • 69483d432c adds logs msramalho 2022-03-12 20:04:08 +01:00
  • 6e5e7212c2 fixes header offset msramalho 2022-03-12 19:56:00 +01:00
  • 486c3295b5 log msramalho 2022-03-12 19:54:10 +01:00
  • 6c5d6f521e implements fresh status retrieval if needed msramalho 2022-03-10 19:00:02 +01:00
  • d30115935e Merge pull request #16 from bellingcat/screenshots Logan Williams 2022-03-09 14:59:07 +01:00
  • 52333874c9 making column names configurable through the command line msramalho 2022-03-09 12:38:04 +01:00
  • 077c71f941 fixes index out of range bug msramalho 2022-03-09 12:18:06 +01:00
  • ff874fe0d3 simplifies access to google sheets, single get_values msramalho 2022-03-09 12:17:51 +01:00
  • 544e7578a6 removes duplicate code msramalho 2022-03-09 11:46:14 +01:00
  • 59027ac477 simplification msramalho 2022-03-09 11:44:19 +01:00
  • 39ec190e56 adds README instructions for geckodriver msramalho 2022-03-09 11:44:05 +01:00
  • 82ca6792c4 Fix issue with extracting time from Telegram media posts Logan Williams 2022-03-02 14:45:36 +01:00
  • aa4b175dea Fix issue with timestamps being converted to user format Logan Williams 2022-02-28 12:54:58 +01:00
  • c6b159905b Switch to headless Firefox Logan Williams 2022-02-28 11:45:32 +01:00
  • 6ebce974f0 WIP: Make timezones more consistent in UTC Logan Williams 2022-02-28 08:42:59 +01:00
  • 2d50703489 Generate archivers for Telegram posts with images; move generation to function in base_archiver Logan Williams 2022-02-28 08:41:45 +01:00