Tristan Lee
|
fab65a5d67
|
formatted with black, added pre-commit hook, pegged typing_extensions package version to fix spaCy issue
|
2023-08-04 14:51:00 -05:00 |
|
Tristan Lee
|
e2142966e7
|
refactored tests to reduce redundancy, got tests working for Telegram, Bitchute, Gettr, and Rumble
|
2023-08-03 00:53:38 -05:00 |
|
Tristan Lee
|
249f411a1d
|
fixed some issues with Telegram tests
|
2023-07-27 13:07:44 -05:00 |
|
Tristan Lee
|
a2a7882f1c
|
fixed Gettr and Bitchute info transformers, added missing and fixed incorrect TelegramTransformer fields, added Telegram mentions to the transformer
|
2022-06-13 13:42:33 -05:00 |
|
Tristan Lee
|
282f33eff3
|
implemented deferred media archiving for all scrapers, and implemented tests for them. Refactored archiving methods of Instagram and Gettr scrapers to be able to use default archiving method
|
2022-04-01 01:30:49 -05:00 |
|
Logan Williams
|
94cf6c3d84
|
TelegramTelethonScraper: Use channel_id when channel has been previously encountered
|
2022-03-31 16:37:54 +02:00 |
|
Tristan Lee
|
b7871b060d
|
added capability to scrape Gab group posts
|
2022-03-30 09:11:07 -05:00 |
|
Logan Williams
|
571b019137
|
Fix tests for Twitter transformer
|
2022-03-22 11:33:27 +01:00 |
|
Tristan Lee
|
e287fd03d9
|
merged scraper into main and fixed minor merge conflict
|
2022-03-15 09:12:12 -05:00 |
|
Tristan Lee
|
750f0cc887
|
added scraper for Instagram
|
2022-03-14 10:28:10 -05:00 |
|
Logan Williams
|
fd4b617743
|
Add TwitterTransformer test
|
2022-03-14 13:39:10 +01:00 |
|
Tristan Lee
|
965bf1e2dc
|
added youtube scraper, moved from official youtube-dl repo to using yt-dlp because download speed for youtube videos is much better
|
2022-03-11 17:19:52 -06:00 |
|
Tristan Lee
|
821c39004b
|
incorporated vkontakte scraper
|
2022-03-10 22:32:39 -06:00 |
|
Tristan Lee
|
5783206ad8
|
implemented method to reset database, to enable the 'controller' fixture scope to be shared across the whole package, which will enable the transformer tests to be run without re-running the scrapers
|
2022-03-10 10:20:49 -06:00 |
|
Tristan Lee
|
6cf3b8842d
|
renamed 'archive_media' and 'media' to avoid name collision, changed scope of test fixture controller to 'function' so that db is fresh for each executed test
|
2022-03-09 13:19:35 -06:00 |
|
Tristan Lee
|
739e1d8484
|
added capability of running scraper without archiving media, and implemented prototype Telethon scraper for Telegram
|
2022-03-09 12:12:01 -06:00 |
|
Tristan Lee
|
cd5f68e9e5
|
added basic unit tests
|
2022-03-04 12:36:09 -06:00 |
|