JustAnotherArchivist
aa7d7d3dc3
Refactor automatic importing in snscrape.modules to something less hacky
Cf. #357
2022-02-05 03:22:55 +00:00
JustAnotherArchivist
560c78c5cf
Make all optional scraper arguments keyword-only and fix Mastodon argument style to conform with the other scrapers
Cf. #376
2022-01-30 00:21:18 +00:00
JustAnotherArchivist
107c3c71c2
Remove unnecessary f-strings
Cf. #370
2022-01-28 21:22:13 +00:00
JustAnotherArchivist
7f88678253
Merge pull request #359 from own3dh2so4/master
Added proxy option to Scraper base
2022-01-13 23:08:28 +00:00
David Garcia Alvarez
52e4f9fb69
Added proxy option to Scraper base
2022-01-13 16:56:00 +01:00
JustAnotherArchivist
eebdfc1c55
Refactor username vs ID mess
Closes #354
2022-01-12 22:36:26 +00:00
JustAnotherArchivist
e6076353c8
Fix user ID being a string instead of an int on the entity
2022-01-12 22:35:50 +00:00
JustAnotherArchivist
a32d79fab2
Fix crash on certain mblogs that lack the raw_text attribute
2022-01-12 22:31:49 +00:00
JustAnotherArchivist
65391297f6
Move CLI methods to end of class definition for consistent code style
2022-01-12 21:09:38 +00:00
JustAnotherArchivist
deb2659dd6
Prefix CLI-related methods with an underscore
Closes #355
2022-01-12 21:07:10 +00:00
JustAnotherArchivist
93e62744d7
Fix missing timezone info
2022-01-07 00:42:09 +00:00
JustAnotherArchivist
3f3632d341
Add support for Mastodon profile and toot scrapes
Closes #43
2022-01-06 03:25:06 +00:00
JustAnotherArchivist
5070953feb
Skip private fields and properties on dataclass-to-JSON conversion
2022-01-06 02:08:48 +00:00
JustAnotherArchivist
853848ed5d
ScrollDirection is not part of the public API
2022-01-05 19:43:19 +00:00
JustAnotherArchivist
0b4abdc43f
Fix baseUrl on tweet scrapes
2022-01-05 02:39:54 +00:00
JustAnotherArchivist
267b7d0e32
Rename CLI classmethods
2022-01-05 02:27:09 +00:00
JustAnotherArchivist
acb7f10a4f
Cache Twitter tokens on disk from the CLI for reuse between scrapes
Closes #339
2022-01-05 02:20:40 +00:00
JustAnotherArchivist
ca00b480b1
Fix AssertionError on quoted comments
Fixes #340
2022-01-04 01:15:08 +00:00
JustAnotherArchivist
f189ab4241
Prefix all private API names with an underscore
Cf. #328
2022-01-03 17:51:23 +00:00
JustAnotherArchivist
c6e1e33a23
Fix crashing typos
2022-01-03 17:49:55 +00:00
JustAnotherArchivist
a37ea528d3
Refactor Reddit scrapers again to merge RedditPushshiftScraper and RedditScraper
Cf. #328
2022-01-03 17:48:35 +00:00
JustAnotherArchivist
eee06d8593
Refactor Reddit scrapers into a more reasonable code structure
Cf. #328
2021-12-24 04:58:32 +00:00
JustAnotherArchivist
4dd3ee6e47
Refactor Instagram scrapers to get rid of the awkward mode parameter
Cf. #328
2021-12-24 04:50:53 +00:00
JustAnotherArchivist
0336ce13ed
Add support for fetching a guest token from the API
2021-12-23 04:26:50 +00:00
JustAnotherArchivist
193d4f80d6
Fix user agent in API headers staying constant
2021-12-23 04:25:23 +00:00
JustAnotherArchivist
e7d35ec1eb
Fix date parsing on quoted posts
2021-12-15 16:55:14 +00:00
JustAnotherArchivist
8540045658
Fix typo
2021-12-15 16:36:28 +00:00
JustAnotherArchivist
1f1c1bd8af
Fix docstring style
2021-12-14 20:05:51 +00:00
JustAnotherArchivist
7fdc8bcb53
Randomise user agent when the guest token can't be found
2021-12-14 20:04:46 +00:00
JustAnotherArchivist
4b3c6aefe7
Add default values to user and tweet scrapers for more intuitive usage
2021-12-12 04:57:16 +00:00
JustAnotherArchivist
525cd71225
Retry guest token retrieval
Fixes #325 (hopefully)
2021-12-12 00:10:59 +00:00
JustAnotherArchivist
72abff9e5c
Reuse guest tokens across scrapes
Cf. #326
2021-12-11 23:18:42 +00:00
JustAnotherArchivist
bcaa477b3d
Update list of scrapers
2021-12-08 08:29:02 +00:00
JustAnotherArchivist
66d4c99f82
Remove dev version notice
2021-12-08 08:25:21 +00:00
JustAnotherArchivist
0ac50f1383
Add README to package metadata
2021-12-08 08:18:25 +00:00
JustAnotherArchivist
c2257ad16e
Add Python 3.10 classifier
2021-12-08 08:15:05 +00:00
JustAnotherArchivist
58f654405f
Add --citation
Closes #229
2021-12-08 07:51:28 +00:00
JustAnotherArchivist
35fb61a327
Fix crash on dumping scopes which have a variable pointing to a dataclass
2021-11-24 03:39:06 +00:00
JustAnotherArchivist
a6b6f3faaa
Throw an error on empty arguments
Fixes #290
2021-10-10 17:43:27 +00:00
JustAnotherArchivist
5e829e2541
Refactor class instantiation to remove the need to repeat 'retries' everywhere
2021-09-30 09:58:10 +00:00
JustAnotherArchivist
d4567da23c
Improve list of scrapers on --help output
Don't list all scrapers in the usage line, and provide a sorted readable list instead.
2021-09-30 09:35:17 +00:00
JustAnotherArchivist
e5e0da25a0
Remove unused imports
2021-09-30 09:24:18 +00:00
JustAnotherArchivist
821326bcfb
Fix a few f-strings
2021-09-30 09:23:56 +00:00
JustAnotherArchivist
4bf9ef239c
Restructure usage section
2021-09-30 09:18:43 +00:00
JustAnotherArchivist
e382891642
Fix Twitter trends not having a str representation
2021-09-21 21:40:50 +00:00
JustAnotherArchivist
e5f4389464
Add Twitter trend scraper
Due to restrictions on Twitter's side, it is not possible to get trends from a custom location as that would require using an account and/or their API.
Closes #206
2021-09-21 21:28:41 +00:00
JustAnotherArchivist
d91f971f51
Refactor user label implementation and add support for bot accounts
Closes #281
2021-09-21 19:39:40 +00:00
JustAnotherArchivist
67e8295293
Merge pull request #280 from edsu/master
User Labels
2021-09-19 03:35:49 +00:00
JustAnotherArchivist
5fc2562642
Add user label support on entity retrieval
2021-09-19 03:32:35 +00:00
JustAnotherArchivist
2825bd0a73
Remove accidental empty line
2021-09-19 03:31:56 +00:00