JustAnotherArchivist
50899c01f3
Fix crash on malformed guest token cache file
...
Fixes #494
2022-06-16 17:12:04 +00:00
JustAnotherArchivist
bcad6923c2
Rename Tweet.content to rawContent and User.description to renderedDescription for consistency
...
Closes #479
2022-06-14 00:35:02 +00:00
JustAnotherArchivist
0d361685ff
Fix AttributeError crash on scrapers using the default CLI constructor
...
Introduced by 267b7d0e
Fixes #483
2022-06-01 17:35:38 +00:00
JustAnotherArchivist
530f4fa122
Fix KeyErrors on display_url and expanded_url for certain users with broken profile links
...
Fixes #480
2022-05-29 17:23:43 +00:00
JustAnotherArchivist
dc6bc9bf9d
Refactor how links on Twitter are handled
...
All links in text (tweets, profile descriptions, and profile links) are now represented by TextLink objects, which contain all relevant information: the displayed text (if available), the URL, the short t.co URL, and the indices in the text at which it appears.
Closes #478
2022-05-29 07:16:04 +00:00
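The link representation described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the field names (`text`, `url`, `tcourl`, `indices`) and the helper `link_from_entity` are assumptions chosen to match the four pieces of information the commit message lists, and the entity keys are assumed to follow Twitter's usual URL entity layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TextLink:
    """Hypothetical container for one link found in a piece of text."""
    text: Optional[str]        # displayed text, if available
    url: str                   # full (expanded) URL
    tcourl: Optional[str]      # short t.co URL
    indices: Tuple[int, int]   # (start, end) offsets of the link in the text


def link_from_entity(entity: dict) -> TextLink:
    """Build a TextLink from a URL entity dict (assumed key names).

    display_url and expanded_url are read with .get() because some
    entities lack them; the short URL is used as a fallback.
    """
    return TextLink(
        text=entity.get('display_url'),
        url=entity.get('expanded_url', entity['url']),
        tcourl=entity['url'],
        indices=tuple(entity['indices']),
    )
```

Reading the optional keys defensively also covers the case of profiles with broken links, where an entity may carry only the short URL and its indices.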
JustAnotherArchivist
01cf6a09b3
Fix type of description URL objects
2022-05-29 05:08:23 +00:00
JustAnotherArchivist
ef7c4fad3e
Fix AttributeError for DescriptionURL on from-import
2022-05-29 05:08:23 +00:00
JustAnotherArchivist
faeffe2603
Merge pull request #474 from GeraniumKF/GeraniumKF-reddit-since-crash
...
Fix crash using --since with Reddit
2022-05-23 23:06:16 +00:00
Geranium
e3bdc02a7c
Reddit: deprecate 'created' property for 'date'
...
This fixes a crash when using --since with the Reddit scraper,
as the CLI code expects items to have a date property.
2022-05-23 23:31:44 +01:00
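One common way to deprecate an attribute name in favour of another, as the commit above does for 'created' and 'date', is a property that warns and forwards to the new name. The sketch below is a generic illustration under that assumption; the `Submission` class here is a hypothetical stand-in, and the project may implement the deprecation differently.

```python
import datetime
import warnings


class Submission:
    """Hypothetical item class with 'date' as the canonical timestamp."""

    def __init__(self, date: datetime.datetime):
        self.date = date

    @property
    def created(self) -> datetime.datetime:
        # Old name kept for backwards compatibility; it forwards to 'date'.
        warnings.warn(
            'created is deprecated, use date instead',
            DeprecationWarning,
            stacklevel=2,
        )
        return self.date
```

Code that filters items by `item.date` (for example a `--since` cutoff) then works uniformly, while callers still using `created` keep working and receive a warning.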
JustAnotherArchivist
ed3ea944d1
Fix newsletter issue cards without an issue description
...
Fixes #456
2022-04-16 19:44:36 +00:00
JustAnotherArchivist
e7a6d38a5f
Add support for community_details cards
2022-04-15 20:07:01 +00:00
JustAnotherArchivist
6c50eee31b
Fix proxies not being applied correctly due to missing merge with environment settings
...
Fixes #447
2022-04-15 19:23:54 +00:00
JustAnotherArchivist
5103a33afa
Fix t.co card URL replacement on retweets
...
Fixes #411
2022-04-15 03:18:45 +00:00
JustAnotherArchivist
247bd82d79
Refactor to tweetId variable
2022-04-15 03:14:29 +00:00
JustAnotherArchivist
5fc67f2bcf
Add support for 'message me' cards
2022-04-15 02:52:37 +00:00
JustAnotherArchivist
65e7d8bd24
Fix warning on card URL translation to include the tweet ID
2022-04-15 02:52:03 +00:00
JustAnotherArchivist
3870282a42
Fix broadcast and event card crashes
2022-04-12 20:53:38 +00:00
JustAnotherArchivist
7c0fcdec43
Fix Periscope card crashes
2022-04-12 18:29:51 +00:00
JustAnotherArchivist
9af1f19034
Properly support all card types
...
Fixes #407
2022-04-12 18:11:26 +00:00
JustAnotherArchivist
5fc3c0e290
Fix crash in locals dumping on module-less frames
2022-04-12 18:03:36 +00:00
JustAnotherArchivist
5d156c6a15
Detect and raise error on redirect from GraphQL endpoint to login
...
#165
2022-04-03 02:34:30 +00:00
JustAnotherArchivist
694657ef80
Fix broken exception references
2022-03-09 01:01:47 +00:00
JustAnotherArchivist
1ab0f4fccb
Fix missing quoted tweet reference in certain buggy cases
2022-03-07 22:16:58 +00:00
JustAnotherArchivist
3a92b5bf0d
Add log message for guest token file deletion
2022-02-26 19:32:55 +00:00
JustAnotherArchivist
2480b173f4
Fix crash on race condition in CLI guest token manager resets
...
Fixes #414
2022-02-26 19:31:08 +00:00
JustAnotherArchivist
77bbb9f61f
Remove useless pass
2022-02-20 18:54:51 +00:00
JustAnotherArchivist
57a624c618
Merge pull request #410 from AccentuSoft/master
...
Fix Vkontakte-user module crash on users with millions of followers
2022-02-18 06:01:35 +00:00
AccentuSoft
b1cfd51121
Implementing changes
2022-02-17 21:52:15 +02:00
AccentuSoft
ace2c16f54
Fix Vkontakte-user module crash on users with millions of followers
2022-02-17 15:42:46 +02:00
JustAnotherArchivist
2f9c0457df
Convert t.co card URLs to unshortened when possible
2022-02-17 01:50:15 +00:00
JustAnotherArchivist
878f2a3c7a
Handle cards without descriptions and thumbnails
...
Fixes #407
2022-02-17 01:49:32 +00:00
JustAnotherArchivist
25ee014e29
Extract cards
2022-02-16 02:59:21 +00:00
JustAnotherArchivist
a192dc6236
Handle TweetWithVisibilityResults
...
Fixes #400
2022-02-14 18:08:59 +00:00
JustAnotherArchivist
a7242f340b
Remove obsolete TODO
...
There is no retweetedTweetRef in Twitter's JS.
2022-02-14 18:08:29 +00:00
JustAnotherArchivist
359cc25cdf
Fix crash on entity attribute when scraping suspended users
...
Fixes #396
2022-02-10 04:22:59 +00:00
JustAnotherArchivist
01799a7391
Detect when CLI guest token from file has expired
2022-02-08 19:38:45 +00:00
JustAnotherArchivist
b0753c34ed
Fix forgotten method name changes in 7d939c11
...
Fixes #393
2022-02-08 15:35:49 +00:00
JustAnotherArchivist
7f78fa0bc0
Recurse through all tweets encountered, not only ones with a positive replyCount
...
Fixes #266
2022-02-07 18:13:56 +00:00
JustAnotherArchivist
8702a9c7e2
Add Reddit submission scraper
...
Closes #312
2022-02-07 04:43:54 +00:00
JustAnotherArchivist
8ac1fd3ea8
Refactor Pushshift code to separate the general things from the search
2022-02-07 04:43:19 +00:00
JustAnotherArchivist
9235890f9a
Fix KeyError crash when attempting to scrape a nonexistent tweet ID
2022-02-07 04:04:21 +00:00
JustAnotherArchivist
7d939c110c
Port profile and tweet scrapers to GraphQL API
...
Fixes #367
2022-02-07 03:49:14 +00:00
JustAnotherArchivist
8e95e9a9a7
Fix crash on places without a bounding box
...
Fixes #374
2022-02-07 00:38:22 +00:00
JustAnotherArchivist
aa7d7d3dc3
Refactor automatic importing in snscrape.modules to something less hacky
...
Cf. #357
2022-02-05 03:22:55 +00:00
JustAnotherArchivist
560c78c5cf
Make all optional scraper arguments keyword-only and fix Mastodon argument style to conform with the other scrapers
...
Cf. #376
2022-01-30 00:21:18 +00:00
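Keyword-only arguments, as mentioned in the commit above, are declared in Python by placing a bare `*` in the signature. The sketch below uses a hypothetical `ExampleScraper` class with made-up parameter names purely to show the mechanism; it does not reproduce any real scraper's signature.

```python
class ExampleScraper:
    """Hypothetical scraper: one required argument, optional ones keyword-only."""

    def __init__(self, name, *, retries=3, proxies=None):
        # Everything after the bare '*' must be passed by keyword.
        self.name = name
        self.retries = retries
        self.proxies = proxies


# Accepted: optional arguments are named explicitly.
scraper = ExampleScraper('example', retries=5)

# Rejected: passing an optional argument positionally raises TypeError.
try:
    ExampleScraper('example', 5)
    positional_accepted = True
except TypeError:
    positional_accepted = False
```

The benefit of this design is that optional parameters can later be added or reordered without silently changing the meaning of existing positional calls.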
JustAnotherArchivist
107c3c71c2
Remove unnecessary f-strings
...
Cf. #370
2022-01-28 21:22:13 +00:00
JustAnotherArchivist
7f88678253
Merge pull request #359 from own3dh2so4/master
...
Added proxy option to Scraper base
2022-01-13 23:08:28 +00:00
David Garcia Alvarez
52e4f9fb69
Added proxy option to Scraper base
2022-01-13 16:56:00 +01:00
JustAnotherArchivist
eebdfc1c55
Refactor username vs ID mess
...
Closes #354
2022-01-12 22:36:26 +00:00
JustAnotherArchivist
e6076353c8
Fix user ID being a string instead of an int on the entity
2022-01-12 22:35:50 +00:00