369 Commits

Author SHA1 Message Date
JustAnotherArchivist
23ebdd2a3c Fix YAML syntax 2023-02-02 21:03:52 +00:00
JustAnotherArchivist
35c0c32c38 Refine bug report template 2023-02-02 21:02:16 +00:00
JustAnotherArchivist
b515a66b93 Fix crash in recursive tweet scraping
Introduced by 3e297c9a

Fixes #684
2023-01-19 16:18:15 +00:00
JustAnotherArchivist
36e85c54c1 Log response headers for debugging 2023-01-16 03:48:21 +00:00
JustAnotherArchivist
49270f6d3a Fix debug messages for redirects to report the correct status code and redirect location 2023-01-16 03:47:46 +00:00
JustAnotherArchivist
d0fb9ab8a9 Log TLS connection details for debugging 2023-01-16 02:39:05 +00:00
JustAnotherArchivist
5d3f27bc2b Fix title-less BroadcastCard crash 2023-01-15 16:36:04 +00:00
JustAnotherArchivist
b7cb270b6e Fix crash on empty user objects 2023-01-15 12:31:28 +00:00
JustAnotherArchivist
8ad26fc7d1 Switch from setup.py to pyproject.toml 2023-01-13 18:52:03 +00:00
JustAnotherArchivist
1fb5c39168 Add Python 3.11 classifier 2023-01-13 10:12:39 +00:00
JustAnotherArchivist
d81d247a87 Port Reddit scraper to new Pushshift API
Fixes #619
2023-01-13 10:07:58 +00:00
JustAnotherArchivist
564a5eca77 Fix crash on unavailable users in cards 2023-01-13 09:12:16 +00:00
JustAnotherArchivist
bf0e720b5a Fix crash on empty tweet entries in timelines
Fixes #620
2023-01-13 09:01:15 +00:00
JustAnotherArchivist
27374285a2 Fix crash on missing source label data
It was announced in mid-November that this data would disappear, but the API still always returned it until very recently.
2023-01-13 08:32:02 +00:00
JustAnotherArchivist
238bdcd560 Reduce warnings about duplicate users on cards 2023-01-13 08:28:52 +00:00
JustAnotherArchivist
e846a6a4cd Fix KeyError in card user handling 2023-01-13 08:06:57 +00:00
JustAnotherArchivist
cbeb65d5c9 Fix KeyError crash on some tweets with AmplifyCards
Fixes #601
2023-01-13 07:57:31 +00:00
JustAnotherArchivist
3e19f8f84b Add support for image_collection_website unified cards 2023-01-13 07:36:53 +00:00
JustAnotherArchivist
28f5a45825 Fix empty page counter not getting reset on results 2023-01-13 06:59:51 +00:00
JustAnotherArchivist
2196bdf3e8 Extract vibe 2023-01-13 04:09:00 +00:00
JustAnotherArchivist
faf09b2f5e Extract tweet view counts
Closes #629
2023-01-13 04:00:50 +00:00
JustAnotherArchivist
3e297c9a42 Update GraphQL API parameters 2023-01-13 04:00:31 +00:00
JustAnotherArchivist
a0414d92cf Extract alt text for media on Twitter
Closes #588
2023-01-13 03:13:10 +00:00
JustAnotherArchivist
ff5e2d61ee Update search API parameters 2023-01-13 03:01:48 +00:00
JustAnotherArchivist
129ad3fc34 Add --max-empty-pages option to stop long (potentially infinite) empty pagination
Fixes #636
2023-01-13 02:35:48 +00:00
JustAnotherArchivist
7de8d734e9 Override TLS ciphers to get past Twitter's new fingerprinting
Fixes #647
2023-01-13 02:25:39 +00:00
JustAnotherArchivist
ceb06664f0 Clarify descriptions of issue templates 2023-01-11 22:52:52 +00:00
JustAnotherArchivist
996cf882cc Expose status code for non-200 Twitter responses 2023-01-11 20:01:05 +00:00
JustAnotherArchivist
e449d5cdbe Expose individual error messages when all request retries fail 2023-01-11 20:01:05 +00:00
JustAnotherArchivist
cbdaee6864 Merge pull request #343 from TheTechRobo/master
Add issue templates for snscrape
2022-12-19 23:25:17 +00:00
JustAnotherArchivist
a3bee057b1 Merge pull request #615 from engkimo/fix-return-twitter-place-ids
Add returning Twitter Place IDs
2022-12-19 22:57:40 +00:00
JustAnotherArchivist
6f9a0e6534 Merge pull request #590 from caseyho/UnifiedCardApp_no_category
Handle tweets that contain card info with no category
2022-12-19 22:55:36 +00:00
engkimo
4ff4af13cf Add returning Twitter Place IDs 2022-12-06 11:23:01 +09:00
JustAnotherArchivist
e09aea70e7 Fix Twitter username length limit
Although 15 characters is the official, current limit, there are accounts with longer usernames. 20 is the longest observed example, but it's unclear what the true limit is.
2022-12-03 06:36:52 +00:00
Casey Ho
aa325fa1a5 Handle UnifiedCardApp with no category 2022-11-14 17:38:03 -08:00
JustAnotherArchivist
46a603053c Handle users with extensions but no label
Fixes #559
2022-10-16 21:13:46 +00:00
JustAnotherArchivist
59abeaf04c Make newsletter card images optional
Fixes #546
2022-09-04 15:04:20 +00:00
JustAnotherArchivist
e13033fea0 Fix AttributeError on certain videos included from other platforms 2022-08-24 15:53:21 +00:00
JustAnotherArchivist
9294c26ffa Make PeriscopeBroadcastCard.thumbnailUrl optional to handle tweets without a thumbnail
Fixes #507
2022-08-21 01:58:41 +00:00
JustAnotherArchivist
d6bce5b1d6 Merge pull request #518 from hgrsd/fix/vkontakte-photo-scrape
fix(vkontakte): update photo detection
2022-08-21 01:49:59 +00:00
JustAnotherArchivist
2c7a85a620 Add warning on unknown page_info types 2022-08-21 01:40:49 +00:00
JustAnotherArchivist
ff18f6f771 Fix video extraction on Weibo
Fixes #509
2022-08-21 01:40:31 +00:00
JustAnotherArchivist
da3d870e10 Drop app icons when Twitter didn't actually include them in the response
Fixes #470
2022-08-13 21:17:55 +00:00
hgrsd
279d1cf4a1 fix(vkontakte): update photo detection 2022-07-16 18:27:02 +01:00
JustAnotherArchivist
d72b51953f Fix missing r prefix on string with regex backslashes 2022-06-24 23:12:50 +00:00
JustAnotherArchivist
d5b406bc1b Update API parameters to what Twitter currently uses
The `count` reduction does not affect anything as Twitter ignores that parameter now. Cf. #481
2022-06-23 19:50:17 +00:00
JustAnotherArchivist
50899c01f3 Fix crash on malformed guest token cache file
Fixes #494
2022-06-16 17:12:04 +00:00
JustAnotherArchivist
bcad6923c2 Rename Tweet.content to rawContent and User.description to renderedDescription for consistency
Closes #479
2022-06-14 00:35:02 +00:00
JustAnotherArchivist
0d361685ff Fix AttributeError crash on scrapers using the default CLI constructor
Introduced by 267b7d0e

Fixes #483
2022-06-01 17:35:38 +00:00
JustAnotherArchivist
530f4fa122 Fix KeyErrors on display_url and expanded_url for certain users with broken profile links
Fixes #480
2022-05-29 17:23:43 +00:00