cisticola.scraper package
Submodules
cisticola.scraper.bitchute module
- class cisticola.scraper.bitchute.BitchuteScraper
Bases: cisticola.scraper.Scraper
An implementation of a Scraper for BitChute, using classes from the 4cat library
- can_handle(channel)
- get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
- get_username_from_url()
- cisticola.scraper.bitchute.append_details(video, detail)
Append extra metadata to video data
Fetches the BitChute video detail page to scrape extra data for the given video.
- Parameters
video (dict) – Video details as scraped so far
detail (str) – Detail level. If ‘comments’, also scrape video comments.
- Returns
Tuple; first item: updated video data (dict), second item: list of comments
- cisticola.scraper.bitchute.get_about(user)
Extract fields from channel’s “About” tab
- cisticola.scraper.bitchute.get_videos_user(session, user, csrftoken, detail)
Scrape videos for given BitChute user
- Parameters
session – HTTP Session to use
user (str) – Username to scrape videos for
csrftoken (str) – CSRF token to use for requests
detail (str) – Detail level to scrape, basic/detail/comments
- Returns
Video data dictionaries, as a generator
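Because the videos come back as a generator, a caller can stop early without scraping a user's whole history. The sketch below illustrates that pattern only: fake_videos_user is a hypothetical stand-in that makes no network requests, and the dictionary keys it yields are assumptions rather than the real field names.

```python
from itertools import islice


def fake_videos_user(user, detail="basic"):
    """Hypothetical stand-in for a video generator; yields dicts lazily."""
    for index in range(1000):
        # The keys below are illustrative assumptions, not real BitChute fields.
        yield {"id": f"{user}-{index}", "detail": detail}


# Only the first three items are ever produced; the rest are never computed.
first_three = list(islice(fake_videos_user("someuser"), 3))
```

The same islice call should work on any generator, so it would apply unchanged to a real video generator if only the most recent items are needed.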
- cisticola.scraper.bitchute.request_from_bitchute(session, method, url, headers=None, data=None)
Send a request to BitChute, whether to an API endpoint or a regular page
To avoid repeating the same error checking at every call site, this function takes care of retrying on failure and similar concerns
- Parameters
session – Requests session
method (str) – GET or POST
url (str) – URL to fetch
headers (dict) – Headers to pass with the request
data (dict) – Data/params to send with the request
- Returns
Requests response
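The retry behaviour described above could look roughly like the following. This is a minimal sketch, not the cisticola implementation: the name request_with_retries, the retries and wait parameters, and the rule that only HTTP 200 counts as success are all assumptions made for illustration. It expects a session object exposing get and post in the style of a Requests session.

```python
import time


def request_with_retries(session, method, url, headers=None, data=None,
                         retries=3, wait=1.0):
    """Hypothetical retry wrapper; retries and wait are assumed parameters."""
    if retries < 1:
        raise ValueError("retries must be at least 1")

    last_error = None
    for attempt in range(retries):
        try:
            if method.upper() == "GET":
                response = session.get(url, headers=headers, params=data)
            else:
                response = session.post(url, headers=headers, data=data)
            # Assumption: anything other than HTTP 200 is treated as a failure.
            if response.status_code == 200:
                return response
            last_error = RuntimeError(f"HTTP {response.status_code} for {url}")
        except OSError as exc:
            # requests.RequestException derives from IOError (OSError),
            # so this also catches network errors raised by Requests.
            last_error = exc
        if attempt < retries - 1:
            time.sleep(wait)  # pause before the next attempt

    raise last_error
```

Centralising the loop this way keeps call sites down to a single line and makes the retry policy easy to change in one place.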
- cisticola.scraper.bitchute.strip_tags(html, convert_newlines=True)
Strip HTML from a string
- Parameters
html – HTML to strip
convert_newlines – Convert <br> and </p> tags to newlines (\n) before stripping
- Returns
Stripped HTML
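As an illustration of the behaviour described for convert_newlines, here is a minimal regex-based sketch. It is not the function documented above, whose internals are not shown on this page; strip_tags_sketch is a hypothetical name, and a regular expression is a simplification that will not handle every malformed document.

```python
import re
from html import unescape


def strip_tags_sketch(html, convert_newlines=True):
    """Hypothetical approximation of tag stripping; not the library code."""
    if convert_newlines:
        # Turn <br>, <br/>, <br /> and </p> into newlines first.
        html = re.sub(r"<br\s*/?>|</p>", "\n", html, flags=re.IGNORECASE)
    # Drop every remaining tag, then decode entities such as &amp;.
    text = re.sub(r"<[^>]+>", "", html)
    return unescape(text).strip()


with_newlines = strip_tags_sketch("<p>Hello<br>world</p>")
without_newlines = strip_tags_sketch("<p>Hello<br>world</p>", convert_newlines=False)
```

Converting the line-breaking tags before stripping preserves paragraph boundaries that would otherwise be lost when the markup is removed.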
cisticola.scraper.gettr module
- class cisticola.scraper.gettr.GettrScraper
Bases: cisticola.scraper.Scraper
An implementation of a Scraper for Gettr, using the gogettr library
- can_handle(channel)
- get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
- get_username_from_url()
cisticola.scraper.twitter module
- class cisticola.scraper.twitter.TwitterScraper
Bases: cisticola.scraper.Scraper
An implementation of a Scraper for Twitter, using the snscrape library
- can_handle(channel)
- get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
- get_username_from_url()
Module contents
- class cisticola.scraper.Scraper
Bases: object
- can_handle(channel: cisticola.base.Channel) → bool
- get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
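The two-method interface above suggests a simple dispatch pattern: ask each scraper whether it can handle a channel, then call get_posts on the first that can. The sketch below is self-contained and makes several assumptions: Channel and ScraperResult are minimal stand-ins with invented fields (url, content), and ExampleScraper and scrape are hypothetical names that do not appear in cisticola.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    """Stand-in for cisticola.base.Channel; the url field is an assumption."""
    url: str


@dataclass
class ScraperResult:
    """Stand-in for cisticola.base.ScraperResult; content is an assumption."""
    content: str


class Scraper:
    """Mirror of the documented interface, with no behaviour of its own."""

    def can_handle(self, channel: Channel) -> bool:
        raise NotImplementedError

    def get_posts(self, channel: Channel,
                  since: Optional[ScraperResult] = None) -> List[ScraperResult]:
        raise NotImplementedError


class ExampleScraper(Scraper):
    """Hypothetical scraper that claims channels on example.com."""

    def can_handle(self, channel: Channel) -> bool:
        return channel.url.startswith("https://example.com/")

    def get_posts(self, channel: Channel,
                  since: Optional[ScraperResult] = None) -> List[ScraperResult]:
        # A real scraper would fetch posts newer than `since` here.
        return [ScraperResult(content=f"post from {channel.url}")]


def scrape(channel: Channel, scrapers: List[Scraper]) -> List[ScraperResult]:
    """Use the first scraper that reports it can handle the channel."""
    for scraper in scrapers:
        if scraper.can_handle(channel):
            return scraper.get_posts(channel)
    return []


handled = scrape(Channel("https://example.com/user"), [ExampleScraper()])
unhandled = scrape(Channel("https://other.invalid/user"), [ExampleScraper()])
```

Keeping can_handle separate from get_posts lets a caller route a mixed list of channels without knowing which platform each one belongs to.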