| Author | Commit | Message | Date |
|---|---|---|---|
| JustAnotherArchivist | 5d156c6a15 | Detect and raise error on redirect from GraphQL endpoint to login; #165 | 2022-04-03 02:34:30 +00:00 |
| JustAnotherArchivist | 694657ef80 | Fix broken exception references | 2022-03-09 01:01:47 +00:00 |
| JustAnotherArchivist | 1ab0f4fccb | Fix missing quoted tweet reference in certain buggy cases | 2022-03-07 22:16:58 +00:00 |
| JustAnotherArchivist | 3a92b5bf0d | Add log message for guest token file deletion | 2022-02-26 19:32:55 +00:00 |
| JustAnotherArchivist | 2480b173f4 | Fix crash on race condition in CLI guest token manager resets; Fixes #414 | 2022-02-26 19:31:08 +00:00 |
| JustAnotherArchivist | 77bbb9f61f | Remove useless pass | 2022-02-20 18:54:51 +00:00 |
| JustAnotherArchivist | 57a624c618 | Merge pull request #410 from AccentuSoft/master; Fix Vkontakte-user module crash on users with millions of followers | 2022-02-18 06:01:35 +00:00 |
| AccentuSoft | b1cfd51121 | Implementing changes | 2022-02-17 21:52:15 +02:00 |
| AccentuSoft | ace2c16f54 | Fix Vkontakte-user module crash on users with millions of followers | 2022-02-17 15:42:46 +02:00 |
| JustAnotherArchivist | 2f9c0457df | Convert t.co card URLs to unshortened when possible | 2022-02-17 01:50:15 +00:00 |
| JustAnotherArchivist | 878f2a3c7a | Handle cards without descriptions and thumbnails; Fixes #407 | 2022-02-17 01:49:32 +00:00 |
| JustAnotherArchivist | 25ee014e29 | Extract cards | 2022-02-16 02:59:21 +00:00 |
| JustAnotherArchivist | a192dc6236 | Handle TweetWithVisibilityResults; Fixes #400 | 2022-02-14 18:08:59 +00:00 |
| JustAnotherArchivist | a7242f340b | Remove obsolete TODO; There is no retweetedTweetRef in Twitter's JS. | 2022-02-14 18:08:29 +00:00 |
| JustAnotherArchivist | 359cc25cdf | Fix crash on entity attribute when scraping suspended users; Fixes #396 | 2022-02-10 04:22:59 +00:00 |
| JustAnotherArchivist | 01799a7391 | Detect when CLI guest token from file has expired | 2022-02-08 19:38:45 +00:00 |
| JustAnotherArchivist | b0753c34ed | Fix forgotten method name changes in 7d939c11; Fixes #393 | 2022-02-08 15:35:49 +00:00 |
| JustAnotherArchivist | 7f78fa0bc0 | Recurse through all tweets encountered, not only ones with a positive replyCount; Fixes #266 | 2022-02-07 18:13:56 +00:00 |
| JustAnotherArchivist | 8702a9c7e2 | Add Reddit submission scraper; Closes #312 | 2022-02-07 04:43:54 +00:00 |
| JustAnotherArchivist | 8ac1fd3ea8 | Refactor Pushshift code to separate the general things from the search | 2022-02-07 04:43:19 +00:00 |
| JustAnotherArchivist | 9235890f9a | Fix KeyError crash on attempting to scrape inexistent tweet ID | 2022-02-07 04:04:21 +00:00 |
| JustAnotherArchivist | 7d939c110c | Port profile and tweet scrapers to GraphQL API; Fixes #367 | 2022-02-07 03:49:14 +00:00 |
| JustAnotherArchivist | 8e95e9a9a7 | Fix crash on places without a bounding box; Fixes #374 | 2022-02-07 00:38:22 +00:00 |
| JustAnotherArchivist | aa7d7d3dc3 | Refactor automatic importing in snscrape.modules to something less hacky; Cf. #357 | 2022-02-05 03:22:55 +00:00 |
| JustAnotherArchivist | 560c78c5cf | Make all optional scraper arguments keyword-only and fix Mastodon argument style to conform with the other scrapers; Cf. #376 | 2022-01-30 00:21:18 +00:00 |
| JustAnotherArchivist | 107c3c71c2 | Remove unnecessary f-strings; Cf. #370 | 2022-01-28 21:22:13 +00:00 |
| JustAnotherArchivist | 7f88678253 | Merge pull request #359 from own3dh2so4/master; Added proxy option to Scraper base | 2022-01-13 23:08:28 +00:00 |
| David Garcia Alvarez | 52e4f9fb69 | Added proxy option to Scraper base | 2022-01-13 16:56:00 +01:00 |
| JustAnotherArchivist | eebdfc1c55 | Refactor username vs ID mess; Closes #354 | 2022-01-12 22:36:26 +00:00 |
| JustAnotherArchivist | e6076353c8 | Fix user ID being a string instead of an int on the entity | 2022-01-12 22:35:50 +00:00 |
| JustAnotherArchivist | a32d79fab2 | Fix crash on certain mblogs that lack the raw_text attribute | 2022-01-12 22:31:49 +00:00 |
| JustAnotherArchivist | 65391297f6 | Move CLI methods to end of class definition for consistent code style | 2022-01-12 21:09:38 +00:00 |
| JustAnotherArchivist | deb2659dd6 | Prefix CLI-related methods with an underscore; Closes #355 | 2022-01-12 21:07:10 +00:00 |
| JustAnotherArchivist | 93e62744d7 | Fix missing timezone info | 2022-01-07 00:42:09 +00:00 |
| JustAnotherArchivist | 3f3632d341 | Add support for Mastodon profile and toot scrapes; Closes #43 | 2022-01-06 03:25:06 +00:00 |
| JustAnotherArchivist | 5070953feb | Skip private fields and properties on dataclass-to-JSON conversion | 2022-01-06 02:08:48 +00:00 |
| JustAnotherArchivist | 853848ed5d | ScrollDirection is not part of the public API | 2022-01-05 19:43:19 +00:00 |
| JustAnotherArchivist | 0b4abdc43f | Fix baseUrl on tweet scrapes | 2022-01-05 02:39:54 +00:00 |
| JustAnotherArchivist | 267b7d0e32 | Rename CLI classmethods | 2022-01-05 02:27:09 +00:00 |
| JustAnotherArchivist | acb7f10a4f | Cache Twitter tokens on disk from the CLI for reuse between scrapes; Closes #339 | 2022-01-05 02:20:40 +00:00 |
| JustAnotherArchivist | ca00b480b1 | Fix AssertionError on quoted comments; Fixes #340 | 2022-01-04 01:15:08 +00:00 |
| JustAnotherArchivist | f189ab4241 | Prefix all private API names with an underscore; Cf. #328 | 2022-01-03 17:51:23 +00:00 |
| JustAnotherArchivist | c6e1e33a23 | Fix crashing typos | 2022-01-03 17:49:55 +00:00 |
| JustAnotherArchivist | a37ea528d3 | Refactor Reddit scrapers again to merge RedditPushshiftScraper and RedditScraper; Cf. #328 | 2022-01-03 17:48:35 +00:00 |
| JustAnotherArchivist | eee06d8593 | Refactor Reddit scrapers into a more reasonable code structure; Cf. #328 | 2021-12-24 04:58:32 +00:00 |
| JustAnotherArchivist | 4dd3ee6e47 | Refactor Instagram scrapers to get rid of the awkward mode parameter; Cf. #328 | 2021-12-24 04:50:53 +00:00 |
| JustAnotherArchivist | 0336ce13ed | Add support for fetching a guest token from the API | 2021-12-23 04:26:50 +00:00 |
| JustAnotherArchivist | 193d4f80d6 | Fix user agent in API headers staying constant | 2021-12-23 04:25:23 +00:00 |
| JustAnotherArchivist | e7d35ec1eb | Fix date parsing on quoted posts | 2021-12-15 16:55:14 +00:00 |
| JustAnotherArchivist | 8540045658 | Fix typo | 2021-12-15 16:36:28 +00:00 |