cisticola.scraper package

Submodules

cisticola.scraper.bitchute module

class cisticola.scraper.bitchute.BitchuteScraper

Bases: cisticola.scraper.Scraper

An implementation of a Scraper for BitChute, using classes from the 4cat library

can_handle(channel)
get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
get_username_from_url()
cisticola.scraper.bitchute.append_details(video, detail)

Append extra metadata to video data

Fetches the BitChute video detail page to scrape extra data for the given video.

Parameters
  • video (dict) – Video details as scraped so far

  • detail (str) – Detail level. If ‘comments’, also scrape video comments.

Returns

Tuple whose first item is the updated video data (dict) and whose second item is the list of comments

cisticola.scraper.bitchute.get_about(user)

Extract fields from channel’s “About” tab

cisticola.scraper.bitchute.get_videos_user(session, user, csrftoken, detail)

Scrape videos for given BitChute user

Parameters
  • session – HTTP Session to use

  • user (str) – Username to scrape videos for

  • csrftoken (str) – CSRF token to use for requests

  • detail (str) – Detail level to scrape, basic/detail/comments

Returns

Video data dictionaries, as a generator

cisticola.scraper.bitchute.request_from_bitchute(session, method, url, headers=None, data=None)

Request something via the BitChute API (or non-API)

To avoid repeating the same error checking at every call site, this function handles retries on failure and similar request errors

Parameters
  • session – Requests session

  • method (str) – GET or POST

  • url (str) – URL to fetch

  • headers (dict) – Headers to pass with the request

  • data (dict) – Data/params to send with the request

Returns

Requests response
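
The retry behaviour described above can be sketched as follows. This is a minimal illustration and not the module's implementation: the name request_with_retries, the max_attempts and delay parameters, and the rule "retry on 5xx or on a connection-level error" are all assumptions made for the example. The only call assumed on the session is session.request(method, url, headers=..., data=...), which a requests session provides.

```python
import time


def request_with_retries(session, method, url, headers=None, data=None,
                         max_attempts=3, delay=1.0):
    """Hypothetical retry wrapper; the retry policy is assumed, not documented."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            response = session.request(method, url, headers=headers, data=data)
            # Assumption: anything below 500 is a final answer for the caller.
            if response.status_code < 500:
                return response
            last_error = RuntimeError(f"HTTP {response.status_code} from {url}")
        except OSError as exc:
            # requests' exceptions derive from IOError/OSError, so this also
            # covers connection failures and timeouts raised by a real session.
            last_error = exc
        if attempt < max_attempts:
            time.sleep(delay)
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error
```

Keeping the loop in one helper means each scraping function can call it and treat any returned response as usable, which is the motivation the description above gives.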

cisticola.scraper.bitchute.strip_tags(html, convert_newlines=True)

Strip HTML from a string

Parameters
  • html – HTML to strip

  • convert_newlines – Convert <br> and </p> tags to \n before stripping

Returns

Stripped HTML
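
To make the effect of convert_newlines concrete, here is a small standard-library sketch of the same idea. It is an assumption-laden stand-in, not the function documented above: the name strip_tags_sketch and the regex-based approach are chosen for illustration only, and the real function may rely on an HTML parser instead.

```python
import re
from html import unescape


def strip_tags_sketch(html, convert_newlines=True):
    """Hypothetical stand-in for strip_tags; not the module's implementation."""
    if not html:
        return ""
    if convert_newlines:
        # Turn <br>, <br/> and </p> into newlines so line breaks survive.
        html = re.sub(r"<br\s*/?>|</p\s*>", "\n", html, flags=re.IGNORECASE)
    # Drop every remaining tag, then decode entities such as &amp;.
    text = re.sub(r"<[^>]+>", "", html)
    return unescape(text).strip()
```

With convert_newlines left at its default, "<p>Hello<br>world</p>" keeps a line break between the two words; with convert_newlines=False the words run together.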

cisticola.scraper.gettr module

class cisticola.scraper.gettr.GettrScraper

Bases: cisticola.scraper.Scraper

An implementation of a Scraper for Gettr, using the gogettr library

can_handle(channel)
get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
get_username_from_url()

cisticola.scraper.twitter module

class cisticola.scraper.twitter.TwitterScraper

Bases: cisticola.scraper.Scraper

An implementation of a Scraper for Twitter, using the snscrape library

can_handle(channel)
get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
get_username_from_url()

Module contents

class cisticola.scraper.Scraper

Bases: object

can_handle(channel: cisticola.base.Channel) → bool
get_posts(channel: cisticola.base.Channel, since: Optional[cisticola.base.ScraperResult] = None) → List[cisticola.base.ScraperResult]
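
The two-method interface above suggests how a platform-specific scraper plugs in. The sketch below is self-contained and uses stand-in classes, so nothing in it is the real cisticola code: the Channel.url and ScraperResult.post_id fields, the ExampleScraper class, the pick_scraper helper, and the reading of since as "return only posts newer than this result" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Channel:
    """Stand-in for cisticola.base.Channel; the url field is assumed."""
    url: str


@dataclass
class ScraperResult:
    """Stand-in for cisticola.base.ScraperResult; both fields are assumed."""
    post_id: int
    content: str


class Scraper:
    """Stand-in base class mirroring the two documented methods."""

    def can_handle(self, channel: Channel) -> bool:
        raise NotImplementedError

    def get_posts(self, channel: Channel,
                  since: Optional[ScraperResult] = None) -> List[ScraperResult]:
        raise NotImplementedError


class ExampleScraper(Scraper):
    """Hypothetical scraper that serves three canned posts for one host."""

    def can_handle(self, channel: Channel) -> bool:
        return channel.url.startswith("https://example.invalid/")

    def get_posts(self, channel, since=None):
        posts = [ScraperResult(i, f"post {i}") for i in range(1, 4)]
        if since is not None:
            # Assumed meaning of `since`: keep only posts newer than it.
            posts = [p for p in posts if p.post_id > since.post_id]
        return posts


def pick_scraper(scrapers, channel):
    """Return the first scraper that accepts the channel, or None."""
    return next((s for s in scrapers if s.can_handle(channel)), None)
```

A caller would try each registered scraper's can_handle in turn and then call get_posts on the first match, passing the last stored result as since to fetch only newer posts.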