User label url

Each label also has a URL that points to more information about the
label. There are more label descriptions than label URLs, and the URLs
appear to group language variants of the same label. For example,
https://help.twitter.com/rules-and-policies/state-affiliated-china is
used for all of the following label descriptions:

* Média affilié à un État, Chine
* China state-affiliated media
* 中国官方媒体
* Çin devletine bağlı medya
* China government official

In some analysis contexts it could be useful to group these together.
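
A minimal sketch of that grouping, assuming user records that carry the
label and labelUrl fields this commit deals with. LabeledUser,
group_labels_by_url and the usernames are hypothetical names made up for
illustration; they are not part of snscrape.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class LabeledUser:
    # Hypothetical stand-in for a scraped user; only the two label fields matter here.
    username: str
    label: Optional[str] = None
    labelUrl: Optional[str] = None


def group_labels_by_url(users: Iterable[LabeledUser]) -> Dict[str, Set[str]]:
    """Map each label URL to the set of label descriptions seen with it."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for user in users:
        # Skip users without a label or without a label URL.
        if user.label is not None and user.labelUrl is not None:
            groups[user.labelUrl].add(user.label)
    return dict(groups)


CHINA_URL = "https://help.twitter.com/rules-and-policies/state-affiliated-china"

# Placeholder usernames; the descriptions come from the list above.
users = [
    LabeledUser("a", "China state-affiliated media", CHINA_URL),
    LabeledUser("b", "中国官方媒体", CHINA_URL),
    LabeledUser("c"),  # unlabeled user, ignored by the grouping
]
grouped = group_labels_by_url(users)
```

Keying on the URL rather than the description collapses the language
variants into one group while keeping the original descriptions available.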
Ed Summers
2021-09-16 13:04:57 -04:00
parent 3fb731ade1
commit a11eef6b06

@@ -137,6 +137,7 @@ class User(snscrape.base.Entity):
     profileImageUrl: typing.Optional[str] = None
     profileBannerUrl: typing.Optional[str] = None
     label: typing.Optional[str] = None
+    labelUrl: typing.Optional[str] = None
 
     @property
     def url(self):
@@ -460,6 +461,7 @@ class TwitterAPIScraper(snscrape.base.Scraper):
         kwargs['profileBannerUrl'] = user.get('profile_banner_url')
         if 'label' in user['ext']['highlightedLabel']['r']['ok']:
             kwargs['label'] = user['ext']['highlightedLabel']['r']['ok']['label']['description']
+            kwargs['labelUrl'] = user['ext']['highlightedLabel']['r']['ok']['label']['url']['url']
         return User(**kwargs)
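
The nested lookup in the hunk above can be sketched as a standalone
helper. This is an illustration only: extract_label is a hypothetical
name, the payload shape is inferred solely from the keys visible in the
diff, and the tolerance for missing keys is an addition here rather than
the commit's behavior.

```python
from typing import Any, Dict, Optional, Tuple


def extract_label(user: Dict[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (description, url) for a user's highlighted label, or (None, None)."""
    # Walk the same key path as the diff, defaulting to empty dicts when a level is absent.
    ok = user.get('ext', {}).get('highlightedLabel', {}).get('r', {}).get('ok', {})
    label = ok.get('label')
    if not label:
        return None, None
    # The URL sits one level deeper than the description: label['url']['url'].
    return label.get('description'), label.get('url', {}).get('url')


# Sample payload built from the key path in the diff and the example in the commit message.
sample = {
    'ext': {'highlightedLabel': {'r': {'ok': {'label': {
        'description': 'China state-affiliated media',
        'url': {'url': 'https://help.twitter.com/rules-and-policies/state-affiliated-china'},
    }}}}}
}
description, label_url = extract_label(sample)
```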