Commit Graph

20 Commits

Author SHA1 Message Date
Logan Williams
fa516da763 Rename TransformedResult to the clearer Post 2022-03-22 11:41:55 +01:00
Logan Williams
c0a094eefa Load channels from google sheet in test.py 2022-03-22 11:37:47 +01:00
Logan Williams
fd4b617743 Add TwitterTransformer test 2022-03-14 13:39:10 +01:00
Logan Williams
fa5037d67c Implement transformer for TwitterScraper that handles media; implement image OCR and EXIF extraction 2022-03-10 15:34:24 +01:00
Tristan Lee
6cf3b8842d renamed 'archive_media' and 'media' to avoid name collision, changed scope of test fixture controller to 'function' so that db is fresh for each executed test 2022-03-09 13:19:35 -06:00
Tristan Lee
739e1d8484 added capability of running scraper without archiving media, and implemented prototype Telethon scraper for Telegram 2022-03-09 12:12:01 -06:00
Tristan Lee
c21e43ddfa refactored import structure 2022-03-04 10:55:54 -06:00
Tristan Lee
ee4d64750b added prototype Rumble scraper 2022-02-28 18:38:33 -06:00
Tristan Lee
bc840e631d added Gab scraper 2022-02-28 12:11:21 -06:00
Tristan Lee
47dad8fb00 added odysee scraper, minor refactoring of url_to_blob method (added url_to_key method that can be overridden by child classes while still using the parent url_to_blob method) and changed test file to include only channels with a relatively small number of posts, to make testing faster 2022-02-25 20:28:00 -06:00
Tristan Lee
ef83cc4b0a converted bitchute to yield, got video archiving working on bitchute and gettr, added url_to_blob method that downloads media bytes blob from url and converted archive_media to take in the media bytes blob instead of the media url. 2022-02-25 13:43:30 -06:00
Logan Williams
0b1c175dd9 Modify GettrScraper to yield results, archive media (videos incomplete) 2022-02-24 20:25:14 +01:00
Logan Williams
e64d845002 Archive media in Twitter scraper 2022-02-24 18:48:48 +01:00
Logan Williams
214287b7a8 Archive media in dictionary 2022-02-24 17:35:24 +01:00
Logan Williams
6092e4caa5 Add method for archiving media, reorganize scraper base classes 2022-02-24 16:36:55 +01:00
Tristan Lee
139459e3b2 implemented Bitchute scraper 2022-02-18 12:45:10 -06:00
Tristan Lee
4668d4df11 implemented Gettr scraper 2022-02-18 10:13:37 -06:00
Logan Williams
b824b98a95 Reorganize transformer definition location 2022-02-18 14:57:10 +01:00
Logan Williams
c5d49ef521 Reorganize class definitions slightly 2022-02-18 14:14:25 +01:00
Logan Williams
82ad210b8e Initial commit 2022-02-18 14:01:49 +01:00