KΞVIN KΞLCHΞ
0942beedd6
fix: code style line spacing
2023-03-02 19:08:53 +00:00
KΞVIN KΞLCHΞ
3545837637
fix: code style line spacing
2023-03-02 19:05:16 +00:00
KΞVIN KΞLCHΞ
aa8d93e07c
Merge branch 'JustAnotherArchivist:master' into master
2023-03-01 22:49:43 +03:00
kelche
7061ad2eb5
fix: code style
2023-03-01 18:09:34 +03:00
JustAnotherArchivist
03ef3debaf
Fix behaviour on SIGPIPE/BrokenPipeError
2023-02-28 20:20:28 +00:00
JustAnotherArchivist
42cb6d8170
Fix crash on quotedRefResult without an actual result
Fixes #740
2023-02-28 20:16:55 +00:00
JustAnotherArchivist
ea7c6786c2
Handle TweetWithVisibilityResults on quoted tweets
Fixes #604
2023-02-28 20:16:07 +00:00
kelche
61dbbba6b1
feat: cashtag func
2023-02-27 22:39:31 +03:00
kelche
d1592177ab
feat: cashtag func
2023-02-27 22:35:21 +03:00
JustAnotherArchivist
21cf626803
Update list of scrapers
2023-02-21 22:10:33 +00:00
JustAnotherArchivist
f329b69ed4
Add support for scraping Twitter's user search
#263
2023-02-21 22:07:40 +00:00
JustAnotherArchivist
f109f3fd46
Fix forgotten warning name change (cf. 7327a013)
2023-02-21 21:59:06 +00:00
JustAnotherArchivist
7330e0a9a0
Rename private logger variable
2023-02-21 21:26:00 +00:00
JustAnotherArchivist
4e6956e564
Remove dead code
2023-02-21 21:25:01 +00:00
JustAnotherArchivist
4e70306f99
Deprecate Entity type
There is no meaningful distinction from Items, and it complicates the integration of scrapers for user searches
2023-02-21 21:24:00 +00:00
JustAnotherArchivist
7327a01397
Refactor module-level deprecation code
2023-02-21 21:23:12 +00:00
JustAnotherArchivist
880a0a7f55
Handle TweetUnavailable results
Fixes #433
2023-02-21 20:16:23 +00:00
JustAnotherArchivist
57b126c656
Add support for scraping Twitter Communities
Closes #614
2023-02-21 20:15:57 +00:00
JustAnotherArchivist
82f64a6472
Remove dead code
2023-02-21 06:22:13 +00:00
JustAnotherArchivist
6a6b02cb28
Handle tombstones
Closes #392
Fixes #603
2023-02-21 04:23:47 +00:00
JustAnotherArchivist
3d6cd63a00
Fix more logger typos
2023-02-21 04:23:47 +00:00
JustAnotherArchivist
9a2f1524c2
Remove dead code
2023-02-21 04:23:47 +00:00
JustAnotherArchivist
b5694e01a2
Fix logger typo
2023-02-21 04:23:47 +00:00
JustAnotherArchivist
280b972f22
Fix extraction of tweets behind 'offensive' replies button
2023-02-21 04:23:47 +00:00
JustAnotherArchivist
6ba478657b
Merge pull request #733 from mrunderline/fix/telegram_channel_members_count
fix: telegram channel members count
2023-02-20 19:16:03 +00:00
Ali Madihi
71fb33af70
fix: telegram channel members count
2023-02-20 22:14:34 +03:30
JustAnotherArchivist
c65e36a094
Bump GraphQL endpoints
2023-02-19 06:21:40 +00:00
JustAnotherArchivist
206907612d
Fix double dump on exceptions with --dump-locals
2023-02-19 05:12:47 +00:00
JustAnotherArchivist
fe5d90b748
Fix tweets behind 'Show more replies' button getting missed
Fixes #572
2023-02-19 03:29:39 +00:00
JustAnotherArchivist
f1cb96b685
Merge pull request #724 from quentinwolf/patch-1
Twitter: change fullUrl to use 'orig' instead of 'large'
2023-02-19 02:55:27 +00:00
JustAnotherArchivist
8709282ba0
Add deprecated properties to JSON
Cf. #611
2023-02-19 02:51:47 +00:00
quentinwolf
0933a30e37
change fullUrl to use 'orig' instead of 'large'
Change fullUrl from '&name=large' to '&name=orig', since large is capped at half the resolution of orig, which may not be ideal for scraping/archiving.
Large images are 2048px × 1365px
Original images are up to 4096px × 2730px
Alternatively, one could add largeUrl for downloading the large image and use fullUrl as above for the original image, for those who wish to save either version, but I see no reason to save the middle-resolution image.
2023-02-13 16:45:44 -07:00
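The fullUrl change described in commit 0933a30e37 can be sketched roughly as follows. This is only an illustration of swapping the 'name' query parameter from 'large' to 'orig', not the project's actual code: the helper name with_image_size and the example.invalid URL are hypothetical, and only the '&name=large' / '&name=orig' parameter values come from the commit message.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_image_size(url: str, size: str = 'orig') -> str:
    """Return url with its 'name' query parameter set to size.

    Hypothetical helper for illustration; it is not part of the scraper.
    """
    parts = urlsplit(url)
    # Keep every query parameter except an existing 'name', preserving order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != 'name']
    # Append the requested size variant, e.g. 'orig' instead of 'large'.
    query.append(('name', size))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; only the 'name' parameter mirrors the commit message.
example = 'https://example.invalid/media/EXAMPLE?format=jpg&name=large'
print(with_image_size(example))
```

Passing size='large' to the same helper would give the half-resolution variant, which corresponds to the separate largeUrl alternative the commit message mentions.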
JustAnotherArchivist
d60ce38b6a
Make (most) consistency errors in unified cards non-fatal
Fixes #703
2023-02-10 02:39:06 +00:00
JustAnotherArchivist
23ebdd2a3c
Fix YAML syntax
2023-02-02 21:03:52 +00:00
JustAnotherArchivist
35c0c32c38
Refine bug report template
2023-02-02 21:02:16 +00:00
JustAnotherArchivist
b515a66b93
Fix crash in recursive tweet scraping
Introduced by 3e297c9a
Fixes #684
2023-01-19 16:18:15 +00:00
JustAnotherArchivist
36e85c54c1
Log response headers for debugging
2023-01-16 03:48:21 +00:00
JustAnotherArchivist
49270f6d3a
Fix debug messages for redirects to report the correct status code and redirect location
2023-01-16 03:47:46 +00:00
JustAnotherArchivist
d0fb9ab8a9
Log TLS connection details for debugging
2023-01-16 02:39:05 +00:00
JustAnotherArchivist
5d3f27bc2b
Fix title-less BroadcastCard crash
2023-01-15 16:36:04 +00:00
JustAnotherArchivist
b7cb270b6e
Fix crash on empty user objects
2023-01-15 12:31:28 +00:00
JustAnotherArchivist
8ad26fc7d1
Switch from setup.py to pyproject.toml
2023-01-13 18:52:03 +00:00
JustAnotherArchivist
1fb5c39168
Add Python 3.11 classifier
2023-01-13 10:12:39 +00:00
JustAnotherArchivist
d81d247a87
Port Reddit scraper to new Pushshift API
Fixes #619
2023-01-13 10:07:58 +00:00
JustAnotherArchivist
564a5eca77
Fix crash on unavailable users in cards
2023-01-13 09:12:16 +00:00
JustAnotherArchivist
bf0e720b5a
Fix crash on empty tweet entries in timelines
Fixes #620
2023-01-13 09:01:15 +00:00
JustAnotherArchivist
27374285a2
Fix crash on missing source label data
In mid-November, it was announced that this data would disappear, but the API still always returned it until very recently.
2023-01-13 08:32:02 +00:00
JustAnotherArchivist
238bdcd560
Reduce warnings about duplicate users on cards
2023-01-13 08:28:52 +00:00
JustAnotherArchivist
e846a6a4cd
Fix KeyError in card user handling
2023-01-13 08:06:57 +00:00
JustAnotherArchivist
cbeb65d5c9
Fix KeyError crash on some tweets with AmplifyCards
Fixes #601
2023-01-13 07:57:31 +00:00