mirror of https://github.com/bellingcat/snscrape.git, synced 2026-06-08 02:28:29 +03:00
User label url
Each label also has a URL which is used for learning more about the label. While there are more label descriptions than label URLs the URLs do seem to group language variants of the same label. For example https://help.twitter.com/rules-and-policies/state-affiliated-china is used for all of the following label descriptions:

* Média affilié à un État, Chine
* China state-affiliated media
* 中国官方媒体
* Çin devletine bağlı medya
* China government official

In some analysis contexts it could be useful to group these together.
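The grouping described above could be sketched roughly as follows. This is only an illustration: `LabelledUser` is a hypothetical stand-in for the scraper's `User` class that carries just the `label` and `labelUrl` fields, and `group_by_label_url` is a helper invented for this example, not part of snscrape.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class LabelledUser:
    # Hypothetical stand-in for the scraped User; only the fields needed here.
    username: str
    label: Optional[str] = None
    labelUrl: Optional[str] = None


def group_by_label_url(users: Iterable[LabelledUser]) -> Dict[str, Set[str]]:
    """Map each label URL to the set of label descriptions seen with it."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for user in users:
        # Users without a label URL cannot be grouped this way, so skip them.
        if user.labelUrl is not None and user.label is not None:
            groups[user.labelUrl].add(user.label)
    return dict(groups)


CHINA_URL = 'https://help.twitter.com/rules-and-policies/state-affiliated-china'
users = [
    LabelledUser('a', 'China state-affiliated media', CHINA_URL),
    LabelledUser('b', '中国官方媒体', CHINA_URL),
    LabelledUser('c'),  # unlabelled account
]
groups = group_by_label_url(users)
```

Keying on the URL rather than the description means the language variants of one label land in the same bucket without any translation table.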
@@ -137,6 +137,7 @@ class User(snscrape.base.Entity):
 	profileImageUrl: typing.Optional[str] = None
 	profileBannerUrl: typing.Optional[str] = None
 	label: typing.Optional[str] = None
+	labelUrl: typing.Optional[str] = None
 
 	@property
 	def url(self):
@@ -460,6 +461,7 @@ class TwitterAPIScraper(snscrape.base.Scraper):
 		kwargs['profileBannerUrl'] = user.get('profile_banner_url')
 		if 'label' in user['ext']['highlightedLabel']['r']['ok']:
 			kwargs['label'] = user['ext']['highlightedLabel']['r']['ok']['label']['description']
+			kwargs['labelUrl'] = user['ext']['highlightedLabel']['r']['ok']['label']['url']['url']
 
 		return User(**kwargs)
 
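The change reads the label URL through a chain of subscripts, which would raise if any intermediate key were absent. Whether the API ever omits those keys is not stated here, so the following is only a defensive sketch under that assumption. `extract_label` is a hypothetical helper; the key path is taken from the diff.

```python
from typing import Any, Optional, Tuple


def extract_label(user: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (description, url) for a user's highlighted label, or Nones."""
    node: Any = user
    # Walk the same key path the diff uses, stopping at the first missing key.
    for key in ('ext', 'highlightedLabel', 'r', 'ok', 'label'):
        if not isinstance(node, dict) or key not in node:
            return None, None
        node = node[key]
    if not isinstance(node, dict):
        return None, None
    description = node.get('description')
    url_node = node.get('url')
    # Assumption: a label may lack the nested url object.
    label_url = url_node.get('url') if isinstance(url_node, dict) else None
    return description, label_url


CHINA_URL = 'https://help.twitter.com/rules-and-policies/state-affiliated-china'
full = {'ext': {'highlightedLabel': {'r': {'ok': {'label': {
    'description': 'China state-affiliated media',
    'url': {'url': CHINA_URL},
}}}}}}
description, label_url = extract_label(full)
```

A payload with no label, or with a label that has no `url` entry, yields `None` for the missing parts instead of a `KeyError`.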