cisticola package

Subpackages

Submodules

cisticola.base module

class cisticola.base.Channel(id: int, name: str, platform_id: str, category: str, followers: int, platform: str, url: str, country: str, influencer: str, public: bool, chat: bool, notes: str)

Bases: object

category: str
chat: bool
country: str
followers: int
id: int
influencer: str
name: str
notes: str
platform: str
platform_id: str
public: bool
url: str
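
The constructor signature above can be exercised as in the following minimal sketch. It assumes that Channel behaves like a plain data container; a local stand-in dataclass mirrors the documented fields so the snippet runs without cisticola installed, and every field value is invented for illustration.

```python
from dataclasses import dataclass


# Stand-in that mirrors the documented signature of cisticola.base.Channel.
# Assumption: the real class is a simple container for these twelve fields.
@dataclass
class Channel:
    id: int
    name: str
    platform_id: str
    category: str
    followers: int
    platform: str
    url: str
    country: str
    influencer: str
    public: bool
    chat: bool
    notes: str


# All values below are made up for illustration.
channel = Channel(
    id=1,
    name="Example Channel",
    platform_id="example_channel",
    category="example",
    followers=1200,
    platform="ExamplePlatform",
    url="https://example.com/example_channel",
    country="US",
    influencer="Example Person",
    public=True,
    chat=False,
    notes="",
)

print(channel.name, channel.followers)
```

With the package installed, the stand-in would presumably be replaced by `from cisticola.base import Channel`; keyword arguments keep the twelve positional fields from being mixed up.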

class cisticola.base.ScraperResult(scraper, platform, channel, platform_id, date, raw_data, date_archived)

Bases: object

A minimally processed result from a scraper.

channel: int
date: datetime.datetime
date_archived: datetime.datetime
id
platform: str
platform_id: str
raw_data: str
scraper: str
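
As a rough illustration of the fields above, the sketch below builds one result by hand. It rests on several assumptions that this reference does not confirm: that raw_data holds a JSON-serialized payload, that timestamps are timezone-aware UTC values, and that id is assigned by the database rather than passed to the constructor (it is absent from the signature). The stand-in dataclass and the payload are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


# Stand-in mirroring the documented constructor of cisticola.base.ScraperResult.
# The documented `id` attribute is omitted because it is not a constructor argument.
@dataclass
class ScraperResult:
    scraper: str
    platform: str
    channel: int
    platform_id: str
    date: datetime
    raw_data: str
    date_archived: datetime


# Hypothetical payload as a scraper might receive it from a platform.
post = {"id": "12345", "text": "hello world"}

result = ScraperResult(
    scraper="ExampleScraper 0.0.1",       # hypothetical scraper identifier
    platform="ExamplePlatform",
    channel=1,                            # integer id of the owning Channel
    platform_id=post["id"],
    date=datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc),
    raw_data=json.dumps(post),            # assumption: raw_data is a JSON string
    date_archived=datetime.now(timezone.utc),
)

print(result.platform_id, result.raw_data)
```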

class cisticola.base.TransformedResult(raw_id, scraper, transformer, platform, channel, date, date_archived, url, content, author_id, author_username)

Bases: object

An object whose fields correspond to the columns of the analysis table.

author_id: str
author_username: str
channel: str
content: str
date: datetime.datetime
date_archived: datetime.datetime
id
platform: str
raw_id: int
scraper: str
transformer: str
url: str
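
The relationship between a raw result and an analysis row might look like the following sketch. The function `transform_example`, the transformer name, and the payload layout are all hypothetical, as is the assumption that raw_data is JSON; the stand-in dataclass only mirrors the documented TransformedResult fields. The channel value is converted to a string solely because the two classes document different types for it.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from types import SimpleNamespace


# Stand-in mirroring the documented constructor of cisticola.base.TransformedResult.
@dataclass
class TransformedResult:
    raw_id: int
    scraper: str
    transformer: str
    platform: str
    channel: str
    date: datetime
    date_archived: datetime
    url: str
    content: str
    author_id: str
    author_username: str


def transform_example(raw) -> TransformedResult:
    """Hypothetical transform: unpack a JSON payload into flat analysis columns."""
    payload = json.loads(raw.raw_data)  # assumption: raw_data is JSON
    return TransformedResult(
        raw_id=raw.id,                  # links the row back to the raw result
        scraper=raw.scraper,
        transformer="ExampleTransformer 0.0.1",  # hypothetical identifier
        platform=raw.platform,
        channel=str(raw.channel),       # documented as str here, int on the raw side
        date=raw.date,
        date_archived=raw.date_archived,
        url=payload["url"],
        content=payload["text"],
        author_id=payload["author"]["id"],
        author_username=payload["author"]["username"],
    )


# A raw result with invented values, standing in for a stored ScraperResult.
raw = SimpleNamespace(
    id=7,
    scraper="ExampleScraper 0.0.1",
    platform="ExamplePlatform",
    channel=1,
    platform_id="12345",
    date=datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc),
    date_archived=datetime(2022, 1, 2, 8, 30, tzinfo=timezone.utc),
    raw_data=json.dumps({
        "url": "https://example.com/posts/12345",
        "text": "hello world",
        "author": {"id": "42", "username": "example_user"},
    }),
)

row = transform_example(raw)
print(row.content, row.author_username)
```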

Module contents

class cisticola.ETLController

Bases: object

Transforms the raw_data tables into a format more conducive to analysis.

class cisticola.ScraperController

Bases: object

Registers scrapers and uses them to generate ScraperResults. Synchronizes everything with the database via the ORM.

connect_to_db(engine)
register_scraper(scraper: cisticola.scraper.Scraper)
scrape_channels(channels: List[cisticola.base.Channel])
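
The three methods above suggest a call order of connect, register, then scrape. The sketch below illustrates that order with a toy stand-in, not the real class: `ToyScraperController` only borrows the documented method names, `ExampleScraper` with its `can_handle` and `get_posts` methods is hypothetical (the cisticola.scraper.Scraper interface is not described in this reference), and an in-memory list takes the place of ORM-backed storage.

```python
from types import SimpleNamespace
from typing import List


class ExampleScraper:
    """Hypothetical scraper; method names are assumptions, not cisticola's API."""

    def can_handle(self, channel) -> bool:
        return channel.platform == "ExamplePlatform"

    def get_posts(self, channel):
        yield {"channel": channel.id, "text": "post from " + channel.name}


class ToyScraperController:
    """Toy stand-in exposing the three documented method names."""

    def __init__(self):
        self.scrapers = []
        self.engine = None
        self.results = []  # stands in for rows written through the ORM

    def connect_to_db(self, engine):
        self.engine = engine

    def register_scraper(self, scraper):
        self.scrapers.append(scraper)

    def scrape_channels(self, channels: List):
        # Assumption: a database connection is required before scraping.
        if self.engine is None:
            raise RuntimeError("call connect_to_db() first")
        for channel in channels:
            # Use the first registered scraper that accepts the channel.
            for scraper in self.scrapers:
                if scraper.can_handle(channel):
                    self.results.extend(scraper.get_posts(channel))
                    break


controller = ToyScraperController()
controller.connect_to_db(engine=object())  # placeholder for a database engine
controller.register_scraper(ExampleScraper())
controller.scrape_channels([
    SimpleNamespace(id=1, name="Example Channel", platform="ExamplePlatform"),
    SimpleNamespace(id=2, name="Other Channel", platform="UnhandledPlatform"),
])

print(len(controller.results))
```

In this toy version a channel with no matching scraper is skipped silently; whether the real controller skips, logs, or raises in that case is not stated here.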