Merge pull request #278 from bellingcat/dependabot_fix

This force-pins cryptography to >44.0.1 to fix dependabot warning
Lock poetry file
2026-06-11 20:58:29 +03:00 · 2025-03-26 11:57:35 +00:00 · 2025-03-26 15:43:03 +04:00 · 2025-03-26 15:39:53 +04:00 · 2025-03-26 11:25:55 +00:00 · 2025-03-26 15:11:25 +04:00
49 changed files with 1946 additions and 772 deletions
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -0,0 +1,40 @@
 # To get started with Dependabot version updates, you'll need to specify which
 # package ecosystems to update and where the package manifests are located.
 # Please see the documentation for all configuration options:
 # https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
 version: 2
 updates:
  - package-ecosystem: "pip"
    directory: "/"
    groups:
      python:
        patterns:
          - "*"
    schedule:
      interval: "weekly"
  - package-ecosystem: "github-actions"
    directory: "/"
    groups:
      actions:
        patterns:
          - "*"
    schedule:
      interval: "weekly"
  - package-ecosystem: "npm"
    directory: "/scripts/settings/"
    groups:
      actions:
        patterns:
          - "*"
    schedule:
      interval: "weekly"
  - package-ecosystem: "docker"
    # Look for a `Dockerfile` in the `root` directory
    directory: "/"
    # Check for updates once a week
    schedule:
      interval: "weekly"
--- a/.github/workflows/ruff.yaml
+++ b/.github/workflows/ruff.yaml
@@ -3,8 +3,18 @@ name: Ruff Formatting & Linting
 on:
  push:
    branches: [ main ]
    paths-ignore:
      - "README.md"
      - ".github"
      - "poetry.lock"
      - "scripts/settings"
  pull_request:
    branches: [ main ]
    paths-ignore:
      - "README.md"
      - ".github"
      - "poetry.lock"
      - "scripts/settings"
 jobs:
  build:
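A rough Python sketch of how the `paths-ignore` lists above decide whether the workflow runs: the run is skipped only when every changed file matches an ignore pattern. It assumes patterns are matched against the whole repository-relative path, as GitHub's filter-pattern documentation describes, and uses `fnmatch` only as an approximation of GitHub's glob syntax. `is_ignored` and `should_run` are hypothetical helpers, not part of GitHub Actions.

```python
from fnmatch import fnmatchcase

# Patterns copied from the workflow's paths-ignore lists.
PATHS_IGNORE = ["README.md", ".github", "poetry.lock", "scripts/settings"]


def is_ignored(path, patterns=PATHS_IGNORE):
    """True if the whole path matches any ignore pattern (assumed semantics)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def should_run(changed_files, patterns=PATHS_IGNORE):
    """Run unless every changed file is ignored."""
    return any(not is_ignored(path, patterns) for path in changed_files)


print(should_run(["poetry.lock", "README.md"]))     # only ignored files changed
print(should_run(["poetry.lock", "src/main.py"]))   # one non-ignored file changed
print(should_run([".github/workflows/ruff.yaml"]))  # file inside an ignored directory name
```

Under the whole-path assumption, a bare directory name such as `.github` does not match files inside that directory; a pattern like `.github/**` would be needed for that. This is worth checking against GitHub's documentation before relying on it.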
--- a/.gitignore
+++ b/.gitignore
@@ -4,6 +4,7 @@ temp/
 .DS_Store
 expmt/
 service_account.json
 service_account-*.json
 __pycache__/
 ._*
 anu.html
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-FROM webrecorder/browsertrix-crawler:1.4.2 AS base
+FROM webrecorder/browsertrix-crawler:1.5.8 AS base
 ENV RUNNING_IN_DOCKER=1 \
    LANG=C.UTF-8 \
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@@ -1,4 +1,3 @@
 version: '3.8'
 services:
  auto-archiver:
@@ -10,7 +9,4 @@ services:
    volumes:
      - ./secrets:/app/secrets
      - ./local_archive:/app/local_archive
    environment:
      - WACZ_ENABLE_DOCKER=true
      - RUNNING_IN_DOCKER=true
    command: --config secrets/orchestration.yaml
--- a/docs/source/how_to/gsheets_setup.md
+++ b/docs/source/how_to/gsheets_setup.md
@@ -6,12 +6,43 @@ This guide explains how to set up Google Sheets to process URLs automatically an
 2. Setting up a service account so Auto Archiver can access the sheet
 3. Setting the Auto Archiver settings
 ### 1. Setting up your Google Sheet
-Any Google sheet must have at least *one* column, with the name 'link' (you can change this name afterwards). This is the column with the URLs that you want the Auto Archiver to archive. 
+## 1. Setting up a Google Service Account
 Your sheet can have many other columns that the Auto Archiver can use, and you can also include any additional columns for your own personal use. The order of the columns does not matter, the naming just needs to be correctly assigned to its corresponding value in the configuration file.
-We recommend copying [this template Google Sheet](https://docs.google.com/spreadsheets/d/1NJZo_XZUBKTI1Ghlgi4nTPVvCfb0HXAs6j5tNGas72k/edit?usp=sharing) as a starting point for your project, as this matches the default column names.
+Once your Google Sheet is set up, you need to create what's called a 'service account' that will allow the Auto Archiver to access it.
 To do this, you can either:
 * a) follow the steps in [this guide](https://gspread.readthedocs.io/en/latest/oauth2.html) all the way up until step 8. You should have downloaded a file called `service_account.json` and should save it in the `secrets/` folder
 * b) run the following script to automatically generate the file:
 ```{code} bash
 curl -fsSL https://raw.githubusercontent.com/bellingcat/auto-archiver/refs/heads/main/scripts/generate_google_services.sh | bash -s --
 ```
 This uses gcloud to create a new project, a new user and downloads the service account automatically for you. The service account file will have the name `service_account-XXXXXXX.json` where XXXXXXX is a random 16 letter/digit string for the project created.
 ```{note}
 To save the generated file to a different folder, pass an argument as follows:
 ```{code} bash
 curl -fsSL https://raw.githubusercontent.com/bellingcat/auto-archiver/refs/heads/main/scripts/generate_google_services.sh | bash -s -- /path/to/secrets
 ```
 ----------
 Once you've downloaded the file, you can save it to `secrets/service_account.json` (the default name), or to another file and then change the location in the settings (see step 4).
 Also make sure to **note down** the email address for this service account. You'll need that for step 3.
 ```{note}
 The email address created in this step can be found either by opening the `service_account.json` file, or if you used b) the `generate_google_services.sh` script, then the script will have printed it out for you.
 The email address will look something like `user@project-name.iam.gserviceaccount.com`
 ```
 ## 2. Setting up your Google Sheet
 We recommend copying [this template Google Sheet](https://docs.google.com/spreadsheets/d/1NJZo_XZUBKTI1Ghlgi4nTPVvCfb0HXAs6j5tNGas72k/edit?usp=sharing) as a starting point for your project, as this matches all the columns required.
 But if you like, you can also create your own custom sheet. The only columns required are 'link', 'archive status', and 'archive location'. 'link' is the column with the URLs that you want the Auto Archiver to archive, the other two record the archival status and result. 
 Here's an overview of all the columns, and what a complete sheet would look like.
@@ -46,21 +77,18 @@ In this example the Ghseet Feeder and Gsheet DB are being used, and the archive
 ![A screenshot of a Google Spreadsheet with column headers defined as above, and several Youtube and Twitter URLs in the "Link" column](../../demo-before.png)
-We'll change the name of the 'Destination Folder' column in step 3.
+We'll change the name of the 'Destination Folder' column in Step 4a.
-## 2. Setting up your Service Account
+## 3. Share your Google Sheet with your Service Account email address
-Once your Google Sheet is set up, you need to create what's called a 'service account' that will allow the Auto Archiver to access it.
+Remember that email address you copied in Step 1? Now that you've set up your Google sheet, click 'Share' in the top
 right hand corner and enter the email address. Make sure to give the account **Editor** access. Here's how that looks:
-To do this, follow the steps in [this guide](https://gspread.readthedocs.io/en/latest/oauth2.html) all the way up until step 8. You should have downloaded a file called `service_account.json` and shared the Google Sheet with the log 'client_email' email address in this file.
+![Share sheet](share_sheet.png)
-Once you've downloaded the file, save it to `secrets/service_account.json`
+## 4. Setting up the configuration file
-## 3. Setting up the configuration file
+The final step is to set your configuration. First, make sure you have `gsheet_feeder_db` set in the `steps.feeders` section of your config. If you wish to store the results of the archiving process back in your Google sheet, make sure to also add the `gsheet_feeder_db` setting to the `steps.databases` section. Here's how this might look:
 Now that you've set up your Google sheet, and you've set up the service account so Auto Archiver can access the sheet, the final step is to set your configuration.
 First, make sure you have `gsheet_feeder_db` set in the `steps.feeders` section of your config. If you wish to store the results of the archiving process back in your Google sheet, make sure to also set the `gsheet_db` setting in the `steps.databases` section. Here's how this might look:
 ```{code} yaml
 steps:
@@ -75,12 +103,15 @@ steps:
 Next, set up the `gsheet_feeder_db` configuration settings in the 'Configurations' part of the config `orchestration.yaml` file. Open up the file, and set the `gsheet_feeder_db.sheet` setting or the `gsheet_feeder_db.sheet_id` setting. The `sheet` should be the name of your sheet, as it shows in the top left of the sheet. 
 For example, the sheet [here](https://docs.google.com/spreadsheets/d/1NJZo_XZUBKTI1Ghlgi4nTPVvCfb0HXAs6j5tNGas72k/edit?gid=0#gid=0) is called 'Public Auto Archiver template'.
 If you saved your `service_account.json` file to anywhere other than the default location (`secrets/service_account.json`), then also make sure to change that now:
 Here's how this might look:
 ```{code} yaml
 ...
 gsheet_feeder_db:
    sheet: 'My Awesome Sheet'
    service_account: secrets/service_account-XXXXX.json # or leave as secrets/service_account.json
    ...
 ```
@@ -90,7 +121,7 @@ You can also pass these settings directly on the command line without having to
 Here, the sheet name has been overridden/specified in the command line invocation.
-### 3a. (Optional) Changing the column names
+### 4a. (Optional) Changing the column names
 In step 2, we said we would change the name of the 'Destination Folder' column. Perhaps you don't like this name, or already have a sheet with a different name. In our example here, we want to name this column 'Save Folder'. To do this, we need to edit the `gsheet_feeder_db.column` setting in the configuration file. 
 For more information on this setting, see the [Gsheet Feeder Database docs](../modules/autogen/feeder/gsheet_feeder_db.md#configuration-options). We will first copy the default settings from the Gsheet Feeder docs for the 'column' settings, and then edit the 'Destination Folder' section to rename it 'Save Folder'. Our final configuration section looks like:
--- a/docs/source/how_to/share_sheet.png
+++ b/docs/source/how_to/share_sheet.png
--- a/docs/source/installation/authentication.md
+++ b/docs/source/installation/authentication.md
@@ -6,6 +6,15 @@ There are two main use cases for authentication:
 * Some websites require some kind of authentication in order to view the content. Examples include Facebook, Telegram etc.
 * Some websites use anti-bot systems to block bot-like tools from accessing the website. Adding real login information to auto-archiver can sometimes bypass this.
 ```{note}
 The Authentication framework currently only works with the following modules:
 * Generic Extractor
 * Screenshot Enricher
 To authenticate for WACZ archiving, see the instructions on the [](../modules/autogen/enricher/wacz_extractor_enricher.md) page.
 ```
 ## The Authentication Config
 You can save your authentication information directly inside your orchestration config file, or as a separate file (for security/multi-deploy purposes). Whether storing your settings inside the orchestration file, or as a separate file, the configuration format is the same. Currently, auto-archiver supports the following authentication types:
@@ -27,7 +36,7 @@ You can save your authentication information directly inside your orchestration
 The Username & Password, and API settings only work with the Generic Extractor. Other modules (like the screenshot enricher) can only use the `cookies` options. Furthermore, many sites can still detect bots and block username/password logins. Twitter/X and YouTube are two prominent ones that block username/password logins.
-One of the 'Cookies' options is recommended for the most robust archiving.
+One of the 'Cookies' options is recommended for the most robust archiving, but it still isn't guaranteed to work.
 ```
 ```{code} yaml
--- a/poetry.lock
+++ b/poetry.lock
@@ -33,14 +33,14 @@ files = [
 [[package]]
 name = "anyio"
-version = "4.8.0"
+version = "4.9.0"
 description = "High level compatibility layer for multiple asynchronous event loop implementations"
 optional = false
 python-versions = ">=3.9"
 groups = ["docs"]
 files = [
-    {file = "anyio-4.8.0-py3-none-any.whl", hash = "sha256:b5011f270ab5eb0abf13385f851315585cc37ef330dd88e27ec3d34d651fd47a"},
+    {file = "anyio-4.9.0-py3-none-any.whl", hash = "sha256:9f76d541cad6e36af7beb62e978876f3b41e3e04f2c1fbf0884604c0a9c4d93c"},
-    {file = "anyio-4.8.0.tar.gz", hash = "sha256:1d9fe889df5212298c0c0723fa20479d1b94883a2df44bd3897aa91083316f7a"},
+    {file = "anyio-4.9.0.tar.gz", hash = "sha256:673c0c244e15788651a4ff38710fea9675823028a6f08a5eda409e0c9840a028"},
 ]
 [package.dependencies]
@@ -50,32 +50,20 @@ sniffio = ">=1.1"
 typing_extensions = {version = ">=4.5", markers = "python_version < \"3.13\""}
 [package.extras]
-doc = ["Sphinx (>=7.4,<8.0)", "packaging", "sphinx-autodoc-typehints (>=1.2.0)", "sphinx_rtd_theme"]
+doc = ["Sphinx (>=8.2,<9.0)", "packaging", "sphinx-autodoc-typehints (>=1.2.0)", "sphinx_rtd_theme"]
-test = ["anyio[trio]", "coverage[toml] (>=7)", "exceptiongroup (>=1.2.0)", "hypothesis (>=4.0)", "psutil (>=5.9)", "pytest (>=7.0)", "trustme", "truststore (>=0.9.1) ; python_version >= \"3.10\"", "uvloop (>=0.21) ; platform_python_implementation == \"CPython\" and platform_system != \"Windows\" and python_version < \"3.14\""]
+test = ["anyio[trio]", "blockbuster (>=1.5.23)", "coverage[toml] (>=7)", "exceptiongroup (>=1.2.0)", "hypothesis (>=4.0)", "psutil (>=5.9)", "pytest (>=7.0)", "trustme", "truststore (>=0.9.1) ; python_version >= \"3.10\"", "uvloop (>=0.21) ; platform_python_implementation == \"CPython\" and platform_system != \"Windows\" and python_version < \"3.14\""]
 trio = ["trio (>=0.26.1)"]
 [[package]]
 name = "asn1crypto"
 version = "1.5.1"
 description = "Fast ASN.1 parser and serializer with definitions for private keys, public keys, certificates, CRL, OCSP, CMS, PKCS#3, PKCS#7, PKCS#8, PKCS#12, PKCS#5, X.509 and TSP"
 optional = false
 python-versions = "*"
 groups = ["main"]
 files = [
    {file = "asn1crypto-1.5.1-py2.py3-none-any.whl", hash = "sha256:db4e40728b728508912cbb3d44f19ce188f218e9eba635821bb4b68564f8fd67"},
    {file = "asn1crypto-1.5.1.tar.gz", hash = "sha256:13ae38502be632115abf8a24cbe5f4da52e3b5231990aff31123c805306ccb9c"},
 ]
 [[package]]
 name = "astroid"
-version = "3.3.8"
+version = "3.3.9"
 description = "An abstract syntax tree for Python with inference support."
 optional = false
 python-versions = ">=3.9.0"
 groups = ["docs"]
 files = [
-    {file = "astroid-3.3.8-py3-none-any.whl", hash = "sha256:187ccc0c248bfbba564826c26f070494f7bc964fd286b6d9fff4420e55de828c"},
+    {file = "astroid-3.3.9-py3-none-any.whl", hash = "sha256:d05bfd0acba96a7bd43e222828b7d9bc1e138aaeb0649707908d3702a9831248"},
-    {file = "astroid-3.3.8.tar.gz", hash = "sha256:a88c7994f914a4ea8572fac479459f4955eeccc877be3f2d959a33273b0cf40b"},
+    {file = "astroid-3.3.9.tar.gz", hash = "sha256:622cc8e3048684aa42c820d9d218978021c3c3d174fb03a9f0d615921744f550"},
 ]
 [package.dependencies]
@@ -83,21 +71,21 @@ typing-extensions = {version = ">=4.0.0", markers = "python_version < \"3.11\""}
 [[package]]
 name = "attrs"
-version = "25.1.0"
+version = "25.3.0"
 description = "Classes Without Boilerplate"
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 files = [
-    {file = "attrs-25.1.0-py3-none-any.whl", hash = "sha256:c75a69e28a550a7e93789579c22aa26b0f5b83b75dc4e08fe092980051e1090a"},
+    {file = "attrs-25.3.0-py3-none-any.whl", hash = "sha256:427318ce031701fea540783410126f03899a97ffc6f61596ad581ac2e40e3bc3"},
-    {file = "attrs-25.1.0.tar.gz", hash = "sha256:1c97078a80c814273a76b2a298a932eb681c87415c11dee0a6921de7f1b02c3e"},
+    {file = "attrs-25.3.0.tar.gz", hash = "sha256:75d7cefc7fb576747b2c81b4442d4d4a1ce0900973527c011d1030fd3bf4af1b"},
 ]
 [package.extras]
 benchmark = ["cloudpickle ; platform_python_implementation == \"CPython\"", "hypothesis", "mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pympler", "pytest (>=4.3.0)", "pytest-codspeed", "pytest-mypy-plugins ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pytest-xdist[psutil]"]
 cov = ["cloudpickle ; platform_python_implementation == \"CPython\"", "coverage[toml] (>=5.3)", "hypothesis", "mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pytest-xdist[psutil]"]
 dev = ["cloudpickle ; platform_python_implementation == \"CPython\"", "hypothesis", "mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pre-commit-uv", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pytest-xdist[psutil]"]
-docs = ["cogapp", "furo", "myst-parser", "sphinx", "sphinx-notfound-page", "sphinxcontrib-towncrier", "towncrier (<24.7)"]
+docs = ["cogapp", "furo", "myst-parser", "sphinx", "sphinx-notfound-page", "sphinxcontrib-towncrier", "towncrier"]
 tests = ["cloudpickle ; platform_python_implementation == \"CPython\"", "hypothesis", "mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pympler", "pytest (>=4.3.0)", "pytest-mypy-plugins ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pytest-xdist[psutil]"]
 tests-mypy = ["mypy (>=1.11.1) ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\"", "pytest-mypy-plugins ; platform_python_implementation == \"CPython\" and python_version >= \"3.10\""]
@@ -172,18 +160,18 @@ lxml = ["lxml"]
 [[package]]
 name = "boto3"
-version = "1.37.8"
+version = "1.37.18"
 description = "The AWS SDK for Python"
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 files = [
-    {file = "boto3-1.37.8-py3-none-any.whl", hash = "sha256:b9f506e08c9f54687d6c073ef1c550a24a62cc2d1e0bc7cda9f13112a38818bf"},
+    {file = "boto3-1.37.18-py3-none-any.whl", hash = "sha256:1545c943f36db41853cdfdb6ff09c4eda9220dd95bd2fae76fc73091603525d1"},
-    {file = "boto3-1.37.8.tar.gz", hash = "sha256:9448f4a079189e19c3253cfdc5b8ef6dc51a3b82431e8347a51f4c1b2d9dab42"},
+    {file = "boto3-1.37.18.tar.gz", hash = "sha256:9b272268794172b0b8bb9fb1f3c470c3b6c0ffb92fbd4882465cc740e40fbdcd"},
 ]
 [package.dependencies]
-botocore = ">=1.37.8,<1.38.0"
+botocore = ">=1.37.18,<1.38.0"
 jmespath = ">=0.7.1,<2.0.0"
 s3transfer = ">=0.11.0,<0.12.0"
@@ -192,14 +180,14 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]
 [[package]]
 name = "botocore"
-version = "1.37.8"
+version = "1.37.18"
 description = "Low-level, data-driven core of boto 3."
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 files = [
-    {file = "botocore-1.37.8-py3-none-any.whl", hash = "sha256:a6c94f33de12f4b10b10684019e554c980469b8394c6d82448a738cbd8452cef"},
+    {file = "botocore-1.37.18-py3-none-any.whl", hash = "sha256:a8b97d217d82b3c4f6bcc906e264df7ebb51e2c6a62b3548a97cd173fb8759a1"},
-    {file = "botocore-1.37.8.tar.gz", hash = "sha256:b5825e08dd3e25642aa22a0d7d92bf81fef1ef857117e4155f923bbccf5aba63"},
+    {file = "botocore-1.37.18.tar.gz", hash = "sha256:99e8eefd5df6347ead15df07ce55f4e62a51ea7b54de1127522a08597923b726"},
 ]
 [package.dependencies]
@@ -385,22 +373,6 @@ files = [
    {file = "certifi-2025.1.31.tar.gz", hash = "sha256:3d5da6925056f6f18f119200434a4780a94263f10d1c21d032a6f6b2baa20651"},
 ]
 [[package]]
 name = "certvalidator"
 version = "0.11.1"
 description = "Validates X.509 certificates and paths"
 optional = false
 python-versions = "*"
 groups = ["main"]
 files = [
    {file = "certvalidator-0.11.1-py2.py3-none-any.whl", hash = "sha256:77520b269f516d4fb0902998d5bd0eb3727fe153b659aa1cb828dcf12ea6b8de"},
    {file = "certvalidator-0.11.1.tar.gz", hash = "sha256:922d141c94393ab285ca34338e18dd4093e3ae330b1f278e96c837cb62cffaad"},
 ]
 [package.dependencies]
 asn1crypto = ">=0.18.1"
 oscrypto = ">=0.16.1"
 [[package]]
 name = "cffi"
 version = "1.17.1"
@@ -408,6 +380,7 @@ description = "Foreign Function Interface for Python calling C code."
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 markers = "os_name == \"nt\" and implementation_name != \"pypy\" or platform_python_implementation != \"PyPy\""
 files = [
    {file = "cffi-1.17.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:df8b1c11f177bc2313ec4b2d46baec87a5f3e71fc8b45dab2ee7cae86d9aba14"},
    {file = "cffi-1.17.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:8f2cdc858323644ab277e9bb925ad72ae0e67f69e804f4898c070998d50b1a67"},
@@ -625,48 +598,60 @@ markers = {main = "sys_platform == \"win32\" or platform_system == \"Windows\"",
 [[package]]
 name = "cryptography"
-version = "41.0.7"
+version = "44.0.2"
 description = "cryptography is a package which provides cryptographic recipes and primitives to Python developers."
 optional = false
-python-versions = ">=3.7"
+python-versions = "!=3.9.0,!=3.9.1,>=3.7"
 groups = ["main"]
 files = [
-    {file = "cryptography-41.0.7-cp37-abi3-macosx_10_12_universal2.whl", hash = "sha256:3c78451b78313fa81607fa1b3f1ae0a5ddd8014c38a02d9db0616133987b9cdf"},
+    {file = "cryptography-44.0.2-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:efcfe97d1b3c79e486554efddeb8f6f53a4cdd4cf6086642784fa31fc384e1d7"},
-    {file = "cryptography-41.0.7-cp37-abi3-macosx_10_12_x86_64.whl", hash = "sha256:928258ba5d6f8ae644e764d0f996d61a8777559f72dfeb2eea7e2fe0ad6e782d"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:29ecec49f3ba3f3849362854b7253a9f59799e3763b0c9d0826259a88efa02f1"},
-    {file = "cryptography-41.0.7-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5a1b41bc97f1ad230a41657d9155113c7521953869ae57ac39ac7f1bb471469a"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc821e161ae88bfe8088d11bb39caf2916562e0a2dc7b6d56714a48b784ef0bb"},
-    {file = "cryptography-41.0.7-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:841df4caa01008bad253bce2a6f7b47f86dc9f08df4b433c404def869f590a15"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:3c00b6b757b32ce0f62c574b78b939afab9eecaf597c4d624caca4f9e71e7843"},
-    {file = "cryptography-41.0.7-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:5429ec739a29df2e29e15d082f1d9ad683701f0ec7709ca479b3ff2708dae65a"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:7bdcd82189759aba3816d1f729ce42ffded1ac304c151d0a8e89b9996ab863d5"},
-    {file = "cryptography-41.0.7-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:43f2552a2378b44869fe8827aa19e69512e3245a219104438692385b0ee119d1"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:4973da6ca3db4405c54cd0b26d328be54c7747e89e284fcff166132eb7bccc9c"},
-    {file = "cryptography-41.0.7-cp37-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:af03b32695b24d85a75d40e1ba39ffe7db7ffcb099fe507b39fd41a565f1b157"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:4e389622b6927d8133f314949a9812972711a111d577a5d1f4bee5e58736b80a"},
-    {file = "cryptography-41.0.7-cp37-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:49f0805fc0b2ac8d4882dd52f4a3b935b210935d500b6b805f321addc8177406"},
+    {file = "cryptography-44.0.2-cp37-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:f514ef4cd14bb6fb484b4a60203e912cfcb64f2ab139e88c2274511514bf7308"},
-    {file = "cryptography-41.0.7-cp37-abi3-win32.whl", hash = "sha256:f983596065a18a2183e7f79ab3fd4c475205b839e02cbc0efbbf9666c4b3083d"},
+    {file = "cryptography-44.0.2-cp37-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:1bc312dfb7a6e5d66082c87c34c8a62176e684b6fe3d90fcfe1568de675e6688"},
-    {file = "cryptography-41.0.7-cp37-abi3-win_amd64.whl", hash = "sha256:90452ba79b8788fa380dfb587cca692976ef4e757b194b093d845e8d99f612f2"},
+    {file = "cryptography-44.0.2-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:3b721b8b4d948b218c88cb8c45a01793483821e709afe5f622861fc6182b20a7"},
-    {file = "cryptography-41.0.7-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:079b85658ea2f59c4f43b70f8119a52414cdb7be34da5d019a77bf96d473b960"},
+    {file = "cryptography-44.0.2-cp37-abi3-win32.whl", hash = "sha256:51e4de3af4ec3899d6d178a8c005226491c27c4ba84101bfb59c901e10ca9f79"},
-    {file = "cryptography-41.0.7-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:b640981bf64a3e978a56167594a0e97db71c89a479da8e175d8bb5be5178c003"},
+    {file = "cryptography-44.0.2-cp37-abi3-win_amd64.whl", hash = "sha256:c505d61b6176aaf982c5717ce04e87da5abc9a36a5b39ac03905c4aafe8de7aa"},
-    {file = "cryptography-41.0.7-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:e3114da6d7f95d2dee7d3f4eec16dacff819740bbab931aff8648cb13c5ff5e7"},
+    {file = "cryptography-44.0.2-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:8e0ddd63e6bf1161800592c71ac794d3fb8001f2caebe0966e77c5234fa9efc3"},
-    {file = "cryptography-41.0.7-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:d5ec85080cce7b0513cfd233914eb8b7bbd0633f1d1703aa28d1dd5a72f678ec"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:81276f0ea79a208d961c433a947029e1a15948966658cf6710bbabb60fcc2639"},
-    {file = "cryptography-41.0.7-pp38-pypy38_pp73-macosx_10_12_x86_64.whl", hash = "sha256:7a698cb1dac82c35fcf8fe3417a3aaba97de16a01ac914b89a0889d364d2f6be"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9a1e657c0f4ea2a23304ee3f964db058c9e9e635cc7019c4aa21c330755ef6fd"},
-    {file = "cryptography-41.0.7-pp38-pypy38_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:37a138589b12069efb424220bf78eac59ca68b95696fc622b6ccc1c0a197204a"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:6210c05941994290f3f7f175a4a57dbbb2afd9273657614c506d5976db061181"},
-    {file = "cryptography-41.0.7-pp38-pypy38_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:68a2dec79deebc5d26d617bfdf6e8aab065a4f34934b22d3b5010df3ba36612c"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_28_armv7l.manylinux_2_31_armv7l.whl", hash = "sha256:d1c3572526997b36f245a96a2b1713bf79ce99b271bbcf084beb6b9b075f29ea"},
-    {file = "cryptography-41.0.7-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:09616eeaef406f99046553b8a40fbf8b1e70795a91885ba4c96a70793de5504a"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:b042d2a275c8cee83a4b7ae30c45a15e6a4baa65a179a0ec2d78ebb90e4f6699"},
-    {file = "cryptography-41.0.7-pp39-pypy39_pp73-macosx_10_12_x86_64.whl", hash = "sha256:48a0476626da912a44cc078f9893f292f0b3e4c739caf289268168d8f4702a39"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:d03806036b4f89e3b13b6218fefea8d5312e450935b1a2d55f0524e2ed7c59d9"},
-    {file = "cryptography-41.0.7-pp39-pypy39_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:c7f3201ec47d5207841402594f1d7950879ef890c0c495052fa62f58283fde1a"},
+    {file = "cryptography-44.0.2-cp39-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:c7362add18b416b69d58c910caa217f980c5ef39b23a38a0880dfd87bdf8cd23"},
-    {file = "cryptography-41.0.7-pp39-pypy39_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:c5ca78485a255e03c32b513f8c2bc39fedb7f5c5f8535545bdc223a03b24f248"},
+    {file = "cryptography-44.0.2-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:8cadc6e3b5a1f144a039ea08a0bdb03a2a92e19c46be3285123d32029f40a922"},
-    {file = "cryptography-41.0.7-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:d6c391c021ab1f7a82da5d8d0b3cee2f4b2c455ec86c8aebbc84837a631ff309"},
+    {file = "cryptography-44.0.2-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:6f101b1f780f7fc613d040ca4bdf835c6ef3b00e9bd7125a4255ec574c7916e4"},
-    {file = "cryptography-41.0.7.tar.gz", hash = "sha256:13f93ce9bea8016c253b34afc6bd6a75993e5c40672ed5405a9c832f0d4a00bc"},
+    {file = "cryptography-44.0.2-cp39-abi3-win32.whl", hash = "sha256:3dc62975e31617badc19a906481deacdeb80b4bb454394b4098e3f2525a488c5"},
    {file = "cryptography-44.0.2-cp39-abi3-win_amd64.whl", hash = "sha256:5f6f90b72d8ccadb9c6e311c775c8305381db88374c65fa1a68250aa8a9cb3a6"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:af4ff3e388f2fa7bff9f7f2b31b87d5651c45731d3e8cfa0944be43dff5cfbdb"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:0529b1d5a0105dd3731fa65680b45ce49da4d8115ea76e9da77a875396727b41"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:7ca25849404be2f8e4b3c59483d9d3c51298a22c1c61a0e84415104dacaf5562"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:268e4e9b177c76d569e8a145a6939eca9a5fec658c932348598818acf31ae9a5"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:9eb9d22b0a5d8fd9925a7764a054dca914000607dff201a24c791ff5c799e1fa"},
    {file = "cryptography-44.0.2-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:2bf7bf75f7df9715f810d1b038870309342bff3069c5bd8c6b96128cb158668d"},
    {file = "cryptography-44.0.2-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:909c97ab43a9c0c0b0ada7a1281430e4e5ec0458e6d9244c0e821bbf152f061d"},
    {file = "cryptography-44.0.2-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:96e7a5e9d6e71f9f4fca8eebfd603f8e86c5225bb18eb621b2c1e50b290a9471"},
    {file = "cryptography-44.0.2-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:d1b3031093a366ac767b3feb8bcddb596671b3aaff82d4050f984da0c248b615"},
    {file = "cryptography-44.0.2-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:04abd71114848aa25edb28e225ab5f268096f44cf0127f3d36975bdf1bdf3390"},
    {file = "cryptography-44.0.2.tar.gz", hash = "sha256:c63454aa261a0cf0c5b4718349629793e9e634993538db841165b3df74f37ec0"},
 ]
 [package.dependencies]
-cffi = ">=1.12"
+cffi = {version = ">=1.12", markers = "platform_python_implementation != \"PyPy\""}
 [package.extras]
-docs = ["sphinx (>=5.3.0)", "sphinx-rtd-theme (>=1.1.1)"]
+docs = ["sphinx (>=5.3.0)", "sphinx-rtd-theme (>=3.0.0) ; python_version >= \"3.8\""]
-docstest = ["pyenchant (>=1.6.11)", "sphinxcontrib-spelling (>=4.0.1)", "twine (>=1.12.0)"]
+docstest = ["pyenchant (>=3)", "readme-renderer (>=30.0)", "sphinxcontrib-spelling (>=7.3.1)"]
-nox = ["nox"]
+nox = ["nox (>=2024.4.15)", "nox[uv] (>=2024.3.2) ; python_version >= \"3.8\""]
-pep8test = ["black", "check-sdist", "mypy", "ruff"]
+pep8test = ["check-sdist ; python_version >= \"3.8\"", "click (>=8.0.1)", "mypy (>=1.4)", "ruff (>=0.3.6)"]
-sdist = ["build"]
+sdist = ["build (>=1.0.0)"]
 ssh = ["bcrypt (>=3.1.5)"]
-test = ["pretend", "pytest (>=6.2.0)", "pytest-benchmark", "pytest-cov", "pytest-xdist"]
+test = ["certifi (>=2024)", "cryptography-vectors (==44.0.2)", "pretend (>=0.7)", "pytest (>=7.4.0)", "pytest-benchmark (>=4.0)", "pytest-cov (>=2.10.1)", "pytest-xdist (>=3.5.0)"]
 test-randomorder = ["pytest-randomly"]
 [[package]]
@@ -768,14 +753,14 @@ dev = ["Sphinx (==2.1.0)", "future (==0.17.1)", "numpy (==1.16.4)", "pytest (==4
 [[package]]
 name = "filelock"
-version = "3.17.0"
+version = "3.18.0"
 description = "A platform independent file lock."
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "filelock-3.17.0-py3-none-any.whl", hash = "sha256:533dc2f7ba78dc2f0f531fc6c4940addf7b70a481e269a5a3b93be94ffbe8338"},
+    {file = "filelock-3.18.0-py3-none-any.whl", hash = "sha256:c401f4f8377c4464e6db25fff06205fd89bdd83b65eb0488ed1b160f780e21de"},
-    {file = "filelock-3.17.0.tar.gz", hash = "sha256:ee4e77401ef576ebb38cd7f13b9b28893194acc20a8e68e18730ba9c0e54660e"},
+    {file = "filelock-3.18.0.tar.gz", hash = "sha256:adbc88eabb99d2fec8c9c1b229b171f18afa655400173ddc653d5d01501fb9f2"},
 ]
 [package.extras]
@@ -797,22 +782,22 @@ files = [
 [[package]]
 name = "google-api-core"
-version = "2.24.1"
+version = "2.24.2"
 description = "Google API client core library"
 optional = false
 python-versions = ">=3.7"
 groups = ["main"]
 files = [
-    {file = "google_api_core-2.24.1-py3-none-any.whl", hash = "sha256:bc78d608f5a5bf853b80bd70a795f703294de656c096c0968320830a4bc280f1"},
+    {file = "google_api_core-2.24.2-py3-none-any.whl", hash = "sha256:810a63ac95f3c441b7c0e43d344e372887f62ce9071ba972eacf32672e072de9"},
-    {file = "google_api_core-2.24.1.tar.gz", hash = "sha256:f8b36f5456ab0dd99a1b693a40a31d1e7757beea380ad1b38faaf8941eae9d8a"},
+    {file = "google_api_core-2.24.2.tar.gz", hash = "sha256:81718493daf06d96d6bc76a91c23874dbf2fac0adbbf542831b805ee6e974696"},
 ]
 [package.dependencies]
-google-auth = ">=2.14.1,<3.0.dev0"
+google-auth = ">=2.14.1,<3.0.0"
-googleapis-common-protos = ">=1.56.2,<2.0.dev0"
+googleapis-common-protos = ">=1.56.2,<2.0.0"
-proto-plus = ">=1.22.3,<2.0.0dev"
+proto-plus = ">=1.22.3,<2.0.0"
-protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<6.0.0.dev0"
+protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0"
-requests = ">=2.18.0,<3.0.0.dev0"
+requests = ">=2.18.0,<3.0.0"
 [package.extras]
 async-rest = ["google-auth[aiohttp] (>=2.35.0,<3.0.dev0)"]
@@ -822,21 +807,21 @@ grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.dev0)"]
 [[package]]
 name = "google-api-python-client"
-version = "2.163.0"
+version = "2.165.0"
 description = "Google API Client Library for Python"
 optional = false
 python-versions = ">=3.7"
 groups = ["main"]
 files = [
-    {file = "google_api_python_client-2.163.0-py2.py3-none-any.whl", hash = "sha256:080e8bc0669cb4c1fb8efb8da2f5b91a2625d8f0e7796cfad978f33f7016c6c4"},
+    {file = "google_api_python_client-2.165.0-py2.py3-none-any.whl", hash = "sha256:4eaab7d4a20be0d3d1dde462fa95e9e0ccc2a3e177a656701bf73fe738ddef7d"},
-    {file = "google_api_python_client-2.163.0.tar.gz", hash = "sha256:88dee87553a2d82176e2224648bf89272d536c8f04dcdda37ef0a71473886dd7"},
+    {file = "google_api_python_client-2.165.0.tar.gz", hash = "sha256:0d2aee76727a104705630bebbc43669c864b766924e9329051ef7b7e2468eb72"},
 ]
 [package.dependencies]
-google-api-core = ">=1.31.5,<2.0.dev0 || >2.3.0,<3.0.0.dev0"
+google-api-core = ">=1.31.5,<2.0.dev0 || >2.3.0,<3.0.0"
-google-auth = ">=1.32.0,<2.24.0 || >2.24.0,<2.25.0 || >2.25.0,<3.0.0.dev0"
+google-auth = ">=1.32.0,<2.24.0 || >2.24.0,<2.25.0 || >2.25.0,<3.0.0"
 google-auth-httplib2 = ">=0.2.0,<1.0.0"
-httplib2 = ">=0.19.0,<1.dev0"
+httplib2 = ">=0.19.0,<1.0.0"
 uritemplate = ">=3.0.1,<5"
 [[package]]
@@ -901,21 +886,21 @@ tool = ["click (>=6.0.0)"]
 [[package]]
 name = "googleapis-common-protos"
-version = "1.69.1"
+version = "1.69.2"
 description = "Common protobufs used in Google APIs"
 optional = false
 python-versions = ">=3.7"
 groups = ["main"]
 files = [
-    {file = "googleapis_common_protos-1.69.1-py2.py3-none-any.whl", hash = "sha256:4077f27a6900d5946ee5a369fab9c8ded4c0ef1c6e880458ea2f70c14f7b70d5"},
+    {file = "googleapis_common_protos-1.69.2-py3-none-any.whl", hash = "sha256:0b30452ff9c7a27d80bfc5718954063e8ab53dd3697093d3bc99581f5fd24212"},
-    {file = "googleapis_common_protos-1.69.1.tar.gz", hash = "sha256:e20d2d8dda87da6fe7340afbbdf4f0bcb4c8fae7e6cadf55926c31f946b0b9b1"},
+    {file = "googleapis_common_protos-1.69.2.tar.gz", hash = "sha256:3e1b904a27a33c821b4b749fd31d334c0c9c30e6113023d495e48979a3dc9c5f"},
 ]
 [package.dependencies]
-protobuf = ">=3.20.2,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<6.0.0.dev0"
+protobuf = ">=3.20.2,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0"
 [package.extras]
-grpc = ["grpcio (>=1.44.0,<2.0.0.dev0)"]
+grpc = ["grpcio (>=1.44.0,<2.0.0)"]
 [[package]]
 name = "gspread"
@@ -1004,14 +989,14 @@ files = [
 [[package]]
 name = "iniconfig"
-version = "2.0.0"
+version = "2.1.0"
 description = "brain-dead simple config-ini parsing"
 optional = false
-python-versions = ">=3.7"
+python-versions = ">=3.8"
 groups = ["dev"]
 files = [
-    {file = "iniconfig-2.0.0-py3-none-any.whl", hash = "sha256:b6a85871a79d2e3b22d2d1b94ac2824226a63c6b741c88f7ae975f18b6778374"},
+    {file = "iniconfig-2.1.0-py3-none-any.whl", hash = "sha256:9deba5723312380e77435581c6bf4935c94cbfab9b1ed33ef8d238ea168eb760"},
-    {file = "iniconfig-2.0.0.tar.gz", hash = "sha256:2d91e135bf72d31a410b17c16da610a82cb55f6b0477d1a902134b24a455b8b3"},
+    {file = "iniconfig-2.1.0.tar.gz", hash = "sha256:3abbd2e30b36733fee78f9c7f7308f2d0050e88f0087fd25c2645f63c773e1c7"},
 ]
 [[package]]
@@ -1445,21 +1430,6 @@ files = [
 pycryptodomex = ">=3.3.1"
 python-bitcoinlib = ">=0.9.0,<0.13.0"
-[[package]]
-name = "oscrypto"
-version = "1.3.0"
-description = "TLS (SSL) sockets, key generation, encryption, decryption, signing, verification and KDFs using the OS crypto libraries. Does not require a compiler, and relies on the OS for patching. Works on Windows, OS X and Linux/BSD."
-optional = false
-python-versions = "*"
-groups = ["main"]
-files = [
-    {file = "oscrypto-1.3.0-py2.py3-none-any.whl", hash = "sha256:2b2f1d2d42ec152ca90ccb5682f3e051fb55986e1b170ebde472b133713e7085"},
-    {file = "oscrypto-1.3.0.tar.gz", hash = "sha256:6f5fef59cb5b3708321db7cca56aed8ad7e662853351e7991fcf60ec606d47a4"},
-]
-[package.dependencies]
-asn1crypto = ">=1.5.1"
 [[package]]
 name = "outcome"
 version = "1.3.0.post0"
@@ -1599,20 +1569,20 @@ xmp = ["defusedxml"]
 [[package]]
 name = "platformdirs"
-version = "4.3.6"
+version = "4.3.7"
 description = "A small Python package for determining appropriate platform-specific dirs, e.g. a `user data dir`."
 optional = false
-python-versions = ">=3.8"
+python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "platformdirs-4.3.6-py3-none-any.whl", hash = "sha256:73e575e1408ab8103900836b97580d5307456908a03e92031bab39e4554cc3fb"},
+    {file = "platformdirs-4.3.7-py3-none-any.whl", hash = "sha256:a03875334331946f13c549dbd8f4bac7a13a50a895a0eb1e8c6a8ace80d40a94"},
-    {file = "platformdirs-4.3.6.tar.gz", hash = "sha256:357fb2acbc885b0419afd3ce3ed34564c13c9b95c89360cd9563f73aa5e2b907"},
+    {file = "platformdirs-4.3.7.tar.gz", hash = "sha256:eb437d586b6a0986388f0d6f74aa0cde27b48d0e3d66843640bfb6bdcdb6e351"},
 ]
 [package.extras]
-docs = ["furo (>=2024.8.6)", "proselint (>=0.14)", "sphinx (>=8.0.2)", "sphinx-autodoc-typehints (>=2.4)"]
+docs = ["furo (>=2024.8.6)", "proselint (>=0.14)", "sphinx (>=8.1.3)", "sphinx-autodoc-typehints (>=3)"]
-test = ["appdirs (==1.4.4)", "covdefaults (>=2.3)", "pytest (>=8.3.2)", "pytest-cov (>=5)", "pytest-mock (>=3.14)"]
+test = ["appdirs (==1.4.4)", "covdefaults (>=2.3)", "pytest (>=8.3.4)", "pytest-cov (>=6)", "pytest-mock (>=3.14)"]
-type = ["mypy (>=1.11.2)"]
+type = ["mypy (>=1.14.1)"]
 [[package]]
 name = "pluggy"
@@ -1632,14 +1602,14 @@ testing = ["pytest", "pytest-benchmark"]
 [[package]]
 name = "pre-commit"
-version = "4.1.0"
+version = "4.2.0"
 description = "A framework for managing and maintaining multi-language pre-commit hooks."
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "pre_commit-4.1.0-py2.py3-none-any.whl", hash = "sha256:d29e7cb346295bcc1cc75fc3e92e343495e3ea0196c9ec6ba53f49f10ab6ae7b"},
+    {file = "pre_commit-4.2.0-py2.py3-none-any.whl", hash = "sha256:a009ca7205f1eb497d10b845e52c838a98b6cdd2102a6c8e4540e94ee75c58bd"},
-    {file = "pre_commit-4.1.0.tar.gz", hash = "sha256:ae3f018575a588e30dfddfab9a05448bfbd6b73d78709617b5a2b853549716d4"},
+    {file = "pre_commit-4.2.0.tar.gz", hash = "sha256:601283b9757afd87d40c4c4a9b2b5de9637a8ea02eaff7adc2d0fb4e04841146"},
 ]
 [package.dependencies]
@@ -1651,41 +1621,39 @@ virtualenv = ">=20.10.0"
 [[package]]
 name = "proto-plus"
-version = "1.26.0"
+version = "1.26.1"
 description = "Beautiful, Pythonic protocol buffers"
 optional = false
 python-versions = ">=3.7"
 groups = ["main"]
 files = [
-    {file = "proto_plus-1.26.0-py3-none-any.whl", hash = "sha256:bf2dfaa3da281fc3187d12d224c707cb57214fb2c22ba854eb0c105a3fb2d4d7"},
+    {file = "proto_plus-1.26.1-py3-none-any.whl", hash = "sha256:13285478c2dcf2abb829db158e1047e2f1e8d63a077d94263c2b88b043c75a66"},
-    {file = "proto_plus-1.26.0.tar.gz", hash = "sha256:6e93d5f5ca267b54300880fff156b6a3386b3fa3f43b1da62e680fc0c586ef22"},
+    {file = "proto_plus-1.26.1.tar.gz", hash = "sha256:21a515a4c4c0088a773899e23c7bbade3d18f9c66c73edd4c7ee3816bc96a012"},
 ]
 [package.dependencies]
-protobuf = ">=3.19.0,<6.0.0dev"
+protobuf = ">=3.19.0,<7.0.0"
 [package.extras]
 testing = ["google-api-core (>=1.31.5)"]
 [[package]]
 name = "protobuf"
-version = "5.29.3"
+version = "6.30.1"
 description = ""
 optional = false
-python-versions = ">=3.8"
+python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "protobuf-5.29.3-cp310-abi3-win32.whl", hash = "sha256:3ea51771449e1035f26069c4c7fd51fba990d07bc55ba80701c78f886bf9c888"},
+    {file = "protobuf-6.30.1-cp310-abi3-win32.whl", hash = "sha256:ba0706f948d0195f5cac504da156d88174e03218d9364ab40d903788c1903d7e"},
-    {file = "protobuf-5.29.3-cp310-abi3-win_amd64.whl", hash = "sha256:a4fa6f80816a9a0678429e84973f2f98cbc218cca434abe8db2ad0bffc98503a"},
+    {file = "protobuf-6.30.1-cp310-abi3-win_amd64.whl", hash = "sha256:ed484f9ddd47f0f1bf0648806cccdb4fe2fb6b19820f9b79a5adf5dcfd1b8c5f"},
-    {file = "protobuf-5.29.3-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:a8434404bbf139aa9e1300dbf989667a83d42ddda9153d8ab76e0d5dcaca484e"},
+    {file = "protobuf-6.30.1-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:aa4f7dfaed0d840b03d08d14bfdb41348feaee06a828a8c455698234135b4075"},
-    {file = "protobuf-5.29.3-cp38-abi3-manylinux2014_aarch64.whl", hash = "sha256:daaf63f70f25e8689c072cfad4334ca0ac1d1e05a92fc15c54eb9cf23c3efd84"},
+    {file = "protobuf-6.30.1-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:47cd320b7db63e8c9ac35f5596ea1c1e61491d8a8eb6d8b45edc44760b53a4f6"},
-    {file = "protobuf-5.29.3-cp38-abi3-manylinux2014_x86_64.whl", hash = "sha256:c027e08a08be10b67c06bf2370b99c811c466398c357e615ca88c91c07f0910f"},
+    {file = "protobuf-6.30.1-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:e3083660225fa94748ac2e407f09a899e6a28bf9c0e70c75def8d15706bf85fc"},
-    {file = "protobuf-5.29.3-cp38-cp38-win32.whl", hash = "sha256:84a57163a0ccef3f96e4b6a20516cedcf5bb3a95a657131c5c3ac62200d23252"},
+    {file = "protobuf-6.30.1-cp39-cp39-win32.whl", hash = "sha256:554d7e61cce2aa4c63ca27328f757a9f3867bce8ec213bf09096a8d16bcdcb6a"},
-    {file = "protobuf-5.29.3-cp38-cp38-win_amd64.whl", hash = "sha256:b89c115d877892a512f79a8114564fb435943b59067615894c3b13cd3e1fa107"},
+    {file = "protobuf-6.30.1-cp39-cp39-win_amd64.whl", hash = "sha256:b510f55ce60f84dc7febc619b47215b900466e3555ab8cb1ba42deb4496d6cc0"},
-    {file = "protobuf-5.29.3-cp39-cp39-win32.whl", hash = "sha256:0eb32bfa5219fc8d4111803e9a690658aa2e6366384fd0851064b963b6d1f2a7"},
+    {file = "protobuf-6.30.1-py3-none-any.whl", hash = "sha256:3c25e51e1359f1f5fa3b298faa6016e650d148f214db2e47671131b9063c53be"},
-    {file = "protobuf-5.29.3-cp39-cp39-win_amd64.whl", hash = "sha256:6ce8cc3389a20693bfde6c6562e03474c40851b44975c9b2bf6df7d8c4f864da"},
+    {file = "protobuf-6.30.1.tar.gz", hash = "sha256:535fb4e44d0236893d5cf1263a0f706f1160b689a7ab962e9da8a9ce4050b780"},
-    {file = "protobuf-5.29.3-py3-none-any.whl", hash = "sha256:0a18ed4a24198528f2333802eb075e59dea9d679ab7a6c5efb017a59004d849f"},
-    {file = "protobuf-5.29.3.tar.gz", hash = "sha256:5da0f41edaf117bde316404bad1a486cb4ededf8e4a54891296f648e8e076620"},
 ]
 [[package]]
@@ -1745,6 +1713,7 @@ description = "C parser in Python"
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
+markers = "os_name == \"nt\" and implementation_name != \"pypy\" or platform_python_implementation != \"PyPy\""
 files = [
    {file = "pycparser-2.22-py3-none-any.whl", hash = "sha256:c3702b6d3dd8c7abc1afa565d7e63d53a1d0bd86cdc24edd75470f4de499cfcc"},
    {file = "pycparser-2.22.tar.gz", hash = "sha256:491c8be9c040f5390f5bf44a5b07752bd07f56edf992381b05c701439eec10f6"},
@@ -1752,44 +1721,41 @@ files = [
 [[package]]
 name = "pycryptodomex"
-version = "3.21.0"
+version = "3.22.0"
 description = "Cryptographic library for Python"
 optional = false
-python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,>=2.7"
+python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,>=2.7"
 groups = ["main"]
 files = [
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:dbeb84a399373df84a69e0919c1d733b89e049752426041deeb30d68e9867822"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:41673e5cc39a8524557a0472077635d981172182c9fe39ce0b5f5c19381ffaff"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-manylinux2010_i686.whl", hash = "sha256:a192fb46c95489beba9c3f002ed7d93979423d1b2a53eab8771dbb1339eb3ddd"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27m-manylinux2010_i686.whl", hash = "sha256:276be1ed006e8fd01bba00d9bd9b60a0151e478033e86ea1cb37447bbc057edc"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-manylinux2010_x86_64.whl", hash = "sha256:1233443f19d278c72c4daae749872a4af3787a813e05c3561c73ab0c153c7b0f"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27m-manylinux2010_x86_64.whl", hash = "sha256:813e57da5ceb4b549bab96fa548781d9a63f49f1d68fdb148eeac846238056b7"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bbb07f88e277162b8bfca7134b34f18b400d84eac7375ce73117f865e3c80d4c"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27m-win32.whl", hash = "sha256:d7beeacb5394765aa8dabed135389a11ee322d3ee16160d178adc7f8ee3e1f65"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-musllinux_1_1_aarch64.whl", hash = "sha256:e859e53d983b7fe18cb8f1b0e29d991a5c93be2c8dd25db7db1fe3bd3617f6f9"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27mu-manylinux2010_i686.whl", hash = "sha256:b3746dedf74787da43e4a2f85bd78f5ec14d2469eb299ddce22518b3891f16ea"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-win32.whl", hash = "sha256:ef046b2e6c425647971b51424f0f88d8a2e0a2a63d3531817968c42078895c00"},
+    {file = "pycryptodomex-3.22.0-cp27-cp27mu-manylinux2010_x86_64.whl", hash = "sha256:5ebc09b7d8964654aaf8a4f5ac325f2b0cc038af9bea12efff0cd4a5bb19aa42"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27m-win_amd64.whl", hash = "sha256:da76ebf6650323eae7236b54b1b1f0e57c16483be6e3c1ebf901d4ada47563b6"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:aef4590263b9f2f6283469e998574d0bd45c14fb262241c27055b82727426157"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27mu-manylinux2010_i686.whl", hash = "sha256:c07e64867a54f7e93186a55bec08a18b7302e7bee1b02fd84c6089ec215e723a"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-macosx_10_9_x86_64.whl", hash = "sha256:5ac608a6dce9418d4f300fab7ba2f7d499a96b462f2b9b5c90d8d994cd36dcad"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27mu-manylinux2010_x86_64.whl", hash = "sha256:56435c7124dd0ce0c8bdd99c52e5d183a0ca7fdcd06c5d5509423843f487dd0b"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7a24f681365ec9757ccd69b85868bbd7216ba451d0f86f6ea0eed75eeb6975db"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27mu-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:65d275e3f866cf6fe891411be9c1454fb58809ccc5de6d3770654c47197acd65"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:259664c4803a1fa260d5afb322972813c5fe30ea8b43e54b03b7e3a27b30856b"},
-    {file = "pycryptodomex-3.21.0-cp27-cp27mu-musllinux_1_1_aarch64.whl", hash = "sha256:5241bdb53bcf32a9568770a6584774b1b8109342bd033398e4ff2da052123832"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7127d9de3c7ce20339e06bcd4f16f1a1a77f1471bcf04e3b704306dde101b719"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-macosx_10_9_universal2.whl", hash = "sha256:34325b84c8b380675fd2320d0649cdcbc9cf1e0d1526edbe8fce43ed858cdc7e"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:ee75067b35c93cc18b38af47b7c0664998d8815174cfc66dd00ea1e244eb27e6"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:103c133d6cd832ae7266feb0a65b69e3a5e4dbbd6f3a3ae3211a557fd653f516"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-musllinux_1_2_i686.whl", hash = "sha256:1a8b0c5ba061ace4bcd03496d42702c3927003db805b8ec619ea6506080b381d"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:77ac2ea80bcb4b4e1c6a596734c775a1615d23e31794967416afc14852a639d3"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:bfe4fe3233ef3e58028a3ad8f28473653b78c6d56e088ea04fe7550c63d4d16b"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9aa0cf13a1a1128b3e964dc667e5fe5c6235f7d7cfb0277213f0e2a783837cc2"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-win32.whl", hash = "sha256:2cac9ed5c343bb3d0075db6e797e6112514764d08d667c74cb89b931aac9dddd"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:46eb1f0c8d309da63a2064c28de54e5e614ad17b7e2f88df0faef58ce192fc7b"},
+    {file = "pycryptodomex-3.22.0-cp37-abi3-win_amd64.whl", hash = "sha256:ff46212fda7ee86ec2f4a64016c994e8ad80f11ef748131753adb67e9b722ebd"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:cc7e111e66c274b0df5f4efa679eb31e23c7545d702333dfd2df10ab02c2a2ce"},
+    {file = "pycryptodomex-3.22.0-pp27-pypy_73-manylinux2010_x86_64.whl", hash = "sha256:5bf3ce9211d2a9877b00b8e524593e2209e370a287b3d5e61a8c45f5198487e2"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-musllinux_1_2_i686.whl", hash = "sha256:770d630a5c46605ec83393feaa73a9635a60e55b112e1fb0c3cea84c2897aa0a"},
+    {file = "pycryptodomex-3.22.0-pp27-pypy_73-win32.whl", hash = "sha256:684cb57812cd243217c3d1e01a720c5844b30f0b7b64bb1a49679f7e1e8a54ac"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:52e23a0a6e61691134aa8c8beba89de420602541afaae70f66e16060fdcd677e"},
+    {file = "pycryptodomex-3.22.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:c8cffb03f5dee1026e3f892f7cffd79926a538c67c34f8b07c90c0bd5c834e27"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-win32.whl", hash = "sha256:a3d77919e6ff56d89aada1bd009b727b874d464cb0e2e3f00a49f7d2e709d76e"},
+    {file = "pycryptodomex-3.22.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:140b27caa68a36d0501b05eb247bd33afa5f854c1ee04140e38af63c750d4e39"},
-    {file = "pycryptodomex-3.21.0-cp36-abi3-win_amd64.whl", hash = "sha256:b0e9765f93fe4890f39875e6c90c96cb341767833cfa767f41b490b506fa9ec0"},
+    {file = "pycryptodomex-3.22.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:644834b1836bb8e1d304afaf794d5ae98a1d637bd6e140c9be7dd192b5374811"},
-    {file = "pycryptodomex-3.21.0-pp27-pypy_73-manylinux2010_x86_64.whl", hash = "sha256:feaecdce4e5c0045e7a287de0c4351284391fe170729aa9182f6bd967631b3a8"},
+    {file = "pycryptodomex-3.22.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:72c506aba3318505dbeecf821ed7b9a9f86f422ed085e2d79c4fba0ae669920a"},
-    {file = "pycryptodomex-3.21.0-pp27-pypy_73-win32.whl", hash = "sha256:365aa5a66d52fd1f9e0530ea97f392c48c409c2f01ff8b9a39c73ed6f527d36c"},
+    {file = "pycryptodomex-3.22.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:7cd39f7a110c1ab97ce9ee3459b8bc615920344dc00e56d1b709628965fba3f2"},
-    {file = "pycryptodomex-3.21.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:3efddfc50ac0ca143364042324046800c126a1d63816d532f2e19e6f2d8c0c31"},
+    {file = "pycryptodomex-3.22.0-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:e4eaaf6163ff13788c1f8f615ad60cdc69efac6d3bf7b310b21e8cfe5f46c801"},
-    {file = "pycryptodomex-3.21.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0df2608682db8279a9ebbaf05a72f62a321433522ed0e499bc486a6889b96bf3"},
+    {file = "pycryptodomex-3.22.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:eac39e237d65981554c2d4c6668192dc7051ad61ab5fc383ed0ba049e4007ca2"},
-    {file = "pycryptodomex-3.21.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5823d03e904ea3e53aebd6799d6b8ec63b7675b5d2f4a4bd5e3adcb512d03b37"},
+    {file = "pycryptodomex-3.22.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1ab0d89d1761959b608952c7b347b0e76a32d1a5bb278afbaa10a7f3eaef9a0a"},
-    {file = "pycryptodomex-3.21.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:27e84eeff24250ffec32722334749ac2a57a5fd60332cd6a0680090e7c42877e"},
+    {file = "pycryptodomex-3.22.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5e64164f816f5e43fd69f8ed98eb28f98157faf68208cd19c44ed9d8e72d33e8"},
-    {file = "pycryptodomex-3.21.0-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:8ef436cdeea794015263853311f84c1ff0341b98fc7908e8a70595a68cefd971"},
+    {file = "pycryptodomex-3.22.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:f005de31efad6f9acefc417296c641f13b720be7dbfec90edeaca601c0fab048"},
-    {file = "pycryptodomex-3.21.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7a1058e6dfe827f4209c5cae466e67610bcd0d66f2f037465daa2a29d92d952b"},
+    {file = "pycryptodomex-3.22.0.tar.gz", hash = "sha256:a1da61bacc22f93a91cbe690e3eb2022a03ab4123690ab16c46abb693a9df63d"},
-    {file = "pycryptodomex-3.21.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:9ba09a5b407cbb3bcb325221e346a140605714b5e880741dc9a1e9ecf1688d42"},
-    {file = "pycryptodomex-3.21.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:8a9d8342cf22b74a746e3c6c9453cb0cfbb55943410e3a2619bd9164b48dc9d9"},
-    {file = "pycryptodomex-3.21.0.tar.gz", hash = "sha256:222d0bd05381dd25c32dd6065c071ebf084212ab79bab4599ba9e6a3e0009e6c"},
 ]
 [[package]]
@@ -1836,35 +1802,16 @@ files = [
 [package.extras]
 windows-terminal = ["colorama (>=0.4.6)"]
-[[package]]
-name = "pyopenssl"
-version = "24.2.1"
-description = "Python wrapper module around the OpenSSL library"
-optional = false
-python-versions = ">=3.7"
-groups = ["main"]
-files = [
-    {file = "pyOpenSSL-24.2.1-py3-none-any.whl", hash = "sha256:967d5719b12b243588573f39b0c677637145c7a1ffedcd495a487e58177fbb8d"},
-    {file = "pyopenssl-24.2.1.tar.gz", hash = "sha256:4247f0dbe3748d560dcbb2ff3ea01af0f9a1a001ef5f7c4c647956ed8cbf0e95"},
-]
-[package.dependencies]
-cryptography = ">=41.0.5,<44"
-[package.extras]
-docs = ["sphinx (!=5.2.0,!=5.2.0.post0,!=7.2.5)", "sphinx-rtd-theme"]
-test = ["pretend", "pytest (>=3.0.1)", "pytest-rerunfailures"]
 [[package]]
 name = "pyparsing"
-version = "3.2.1"
+version = "3.2.2"
 description = "pyparsing module - Classes and methods to define and execute parsing grammars"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "pyparsing-3.2.1-py3-none-any.whl", hash = "sha256:506ff4f4386c4cec0590ec19e6302d3aedb992fdc02c761e90416f158dacf8e1"},
+    {file = "pyparsing-3.2.2-py3-none-any.whl", hash = "sha256:6ab05e1cb111cc72acc8ed811a3ca4c2be2af8d7b6df324347f04fd057d8d793"},
-    {file = "pyparsing-3.2.1.tar.gz", hash = "sha256:61980854fd66de3a90028d679a954d5f2623e83144b5afe5ee86f43d762e5f0a"},
+    {file = "pyparsing-3.2.2.tar.gz", hash = "sha256:2a857aee851f113c2de9d4bfd9061baea478cb0f1c7ca6cbf594942d6d111575"},
 ]
 [package.extras]
@@ -2252,6 +2199,37 @@ files = [
 [package.dependencies]
 six = ">=1.7.0"
+[[package]]
+name = "rfc3161-client"
+version = "1.0.1"
+description = ""
+optional = false
+python-versions = ">=3.9"
+groups = ["main"]
+files = [
+    {file = "rfc3161_client-1.0.1-cp39-abi3-macosx_10_12_x86_64.whl", hash = "sha256:75d8c9d255fa79b9ae4aa27cee519893599efd79f9e6c24a1194dd296ce1c210"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-macosx_11_0_arm64.whl", hash = "sha256:0d3db059fe08d8b6b06aff89e133fcc352ffea1a1dafadb116dda9dae59d0689"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fdef0c9d3213ca5b79d7f76ada48ae10c5011cb25abed2f6df07b344d16d1c28"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7c34ce4d7d2bf5207c54de3a771e757f1f8bb04a8469d3cef6aefe074841064d"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:e4809f2fcfb5f8b42261a7b831929f62a297b584c8d1f4d242eae5e9447674b6"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a644b220b7f0f0be7856f49b043651982bd76e7aa9eb17b3e4e303fde36ed5a1"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:bb03a5a77b07adf766b7daac6cb8b7a8337ffc8f6d6046af74469973f52df8e1"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-musllinux_1_2_armv7l.whl", hash = "sha256:d6c6e4626780b1c531d32d6a126d6c27865b1eb59c65e8b0f1f8f94aa3205285"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-musllinux_1_2_i686.whl", hash = "sha256:912c2f049ce23d0f1c173b6fbd8673f964a27ad97907064dbc74f86dd0d95d15"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:081211a1b602b6dff7feb314d39ca2229c8db4e8cf55eef0c35b460470f4b2bb"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-win32.whl", hash = "sha256:59efa8fddf72a15e397276fe512dbfb99c0dc95032b495815bfc4f8f16302f2c"},
+    {file = "rfc3161_client-1.0.1-cp39-abi3-win_amd64.whl", hash = "sha256:5381a63d5ed5b3c257cb18aacf3f737b1a1ad6df634290fe689b6d601c61cd24"},
+    {file = "rfc3161_client-1.0.1.tar.gz", hash = "sha256:1c951f3912b90c6d3f3505e644b74ee08543387253647b86459addbffb16f63f"},
+]
+[package.dependencies]
+cryptography = ">=43,<45"
+[package.extras]
+dev = ["maturin (>=1.7,<2.0)", "rfc3161-client[doc,lint,test]"]
+lint = ["interrogate", "ruff (>=0.7,<0.12)"]
+test = ["coverage[toml]", "pretend", "pytest", "pytest-cov"]
 [[package]]
 name = "rich"
 version = "13.9.4"
@@ -2287,6 +2265,23 @@ files = [
 [package.dependencies]
 rich = ">=11.0.0"
+[[package]]
+name = "roman-numerals-py"
+version = "3.1.0"
+description = "Manipulate well-formed Roman numerals"
+optional = false
+python-versions = ">=3.9"
+groups = ["docs"]
+markers = "python_version >= \"3.12\""
+files = [
+    {file = "roman_numerals_py-3.1.0-py3-none-any.whl", hash = "sha256:9da2ad2fb670bcf24e81070ceb3be72f6c11c440d73bd579fbeca1e9f330954c"},
+    {file = "roman_numerals_py-3.1.0.tar.gz", hash = "sha256:be4bf804f083a4ce001b5eb7e3c0862479d10f94c936f6c4e5f250aa5ff5bd2d"},
+]
+[package.extras]
+lint = ["mypy (==1.15.0)", "pyright (==1.1.394)", "ruff (==0.9.7)"]
+test = ["pytest (>=8)"]
 [[package]]
 name = "rsa"
 version = "4.9"
@@ -2426,14 +2421,14 @@ crt = ["botocore[crt] (>=1.37.4,<2.0a.0)"]
 [[package]]
 name = "selenium"
-version = "4.29.0"
+version = "4.30.0"
 description = "Official Python bindings for Selenium WebDriver"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "selenium-4.29.0-py3-none-any.whl", hash = "sha256:ce5d26f1ddc1111641113653af33694c13947dd36c2df09cdd33f554351d372e"},
+    {file = "selenium-4.30.0-py3-none-any.whl", hash = "sha256:90bcd3be86a1762100a093b33e5e4530b328226da94208caadb15ce13243dffd"},
-    {file = "selenium-4.29.0.tar.gz", hash = "sha256:3a62f7ec33e669364a6c0562a701deb69745b569c50d55f1a912bf8eb33358ba"},
+    {file = "selenium-4.30.0.tar.gz", hash = "sha256:16ab890fc7cb21a01e1b1e9a0fbaa9445fe30837eabc66e90b3bacf12138126a"},
 ]
 [package.dependencies]
@@ -2511,6 +2506,7 @@ description = "Python documentation generator"
 optional = false
 python-versions = ">=3.10"
 groups = ["docs"]
+markers = "python_version < \"3.12\""
 files = [
    {file = "sphinx-8.1.3-py3-none-any.whl", hash = "sha256:09719015511837b76bf6e03e42eb7595ac8c2e41eeb9c29c5b755c6b677992a2"},
    {file = "sphinx-8.1.3.tar.gz", hash = "sha256:43c1911eecb0d3e161ad78611bc905d1ad0e523e4ddc202a58a821773dc4c927"},
@@ -2540,6 +2536,43 @@ docs = ["sphinxcontrib-websupport"]
 lint = ["flake8 (>=6.0)", "mypy (==1.11.1)", "pyright (==1.1.384)", "pytest (>=6.0)", "ruff (==0.6.9)", "sphinx-lint (>=0.9)", "tomli (>=2)", "types-Pillow (==10.2.0.20240822)", "types-Pygments (==2.18.0.20240506)", "types-colorama (==0.4.15.20240311)", "types-defusedxml (==0.7.0.20240218)", "types-docutils (==0.21.0.20241005)", "types-requests (==2.32.0.20240914)", "types-urllib3 (==1.26.25.14)"]
 test = ["cython (>=3.0)", "defusedxml (>=0.7.1)", "pytest (>=8.0)", "setuptools (>=70.0)", "typing_extensions (>=4.9)"]
+[[package]]
+name = "sphinx"
+version = "8.2.3"
+description = "Python documentation generator"
+optional = false
+python-versions = ">=3.11"
+groups = ["docs"]
+markers = "python_version >= \"3.12\""
+files = [
+    {file = "sphinx-8.2.3-py3-none-any.whl", hash = "sha256:4405915165f13521d875a8c29c8970800a0141c14cc5416a38feca4ea5d9b9c3"},
+    {file = "sphinx-8.2.3.tar.gz", hash = "sha256:398ad29dee7f63a75888314e9424d40f52ce5a6a87ae88e7071e80af296ec348"},
+]
+[package.dependencies]
+alabaster = ">=0.7.14"
+babel = ">=2.13"
+colorama = {version = ">=0.4.6", markers = "sys_platform == \"win32\""}
+docutils = ">=0.20,<0.22"
+imagesize = ">=1.3"
+Jinja2 = ">=3.1"
+packaging = ">=23.0"
+Pygments = ">=2.17"
+requests = ">=2.30.0"
+roman-numerals-py = ">=1.0.0"
+snowballstemmer = ">=2.2"
+sphinxcontrib-applehelp = ">=1.0.7"
+sphinxcontrib-devhelp = ">=1.0.6"
+sphinxcontrib-htmlhelp = ">=2.0.6"
+sphinxcontrib-jsmath = ">=1.0.1"
+sphinxcontrib-qthelp = ">=1.0.6"
+sphinxcontrib-serializinghtml = ">=1.1.9"
+[package.extras]
+docs = ["sphinxcontrib-websupport"]
+lint = ["betterproto (==2.0.0b6)", "mypy (==1.15.0)", "pypi-attestations (==0.0.21)", "pyright (==1.1.395)", "pytest (>=8.0)", "ruff (==0.9.9)", "sphinx-lint (>=0.9)", "types-Pillow (==10.2.0.20240822)", "types-Pygments (==2.19.0.20250219)", "types-colorama (==0.4.15.20240311)", "types-defusedxml (==0.7.0.20240218)", "types-docutils (==0.21.0.20241128)", "types-requests (==2.32.0.20241016)", "types-urllib3 (==1.26.25.14)"]
+test = ["cython (>=3.0)", "defusedxml (>=0.7.1)", "pytest (>=8.0)", "pytest-xdist[psutil] (>=3.4)", "setuptools (>=70.0)", "typing_extensions (>=4.9)"]
 [[package]]
 name = "sphinx-autoapi"
 version = "3.6.0"
@@ -2745,14 +2778,14 @@ test = ["pytest"]
 [[package]]
 name = "starlette"
-version = "0.46.0"
+version = "0.46.1"
 description = "The little ASGI library that shines."
 optional = false
 python-versions = ">=3.9"
 groups = ["docs"]
 files = [
-    {file = "starlette-0.46.0-py3-none-any.whl", hash = "sha256:913f0798bd90ba90a9156383bcf1350a17d6259451d0d8ee27fc0cf2db609038"},
+    {file = "starlette-0.46.1-py3-none-any.whl", hash = "sha256:77c74ed9d2720138b25875133f3a2dae6d854af2ec37dceb56aef370c1d8a227"},
-    {file = "starlette-0.46.0.tar.gz", hash = "sha256:b359e4567456b28d473d0193f34c0de0ed49710d75ef183a74a5ce0499324f50"},
+    {file = "starlette-0.46.1.tar.gz", hash = "sha256:3c88d58ee4bd1bb807c0d1acb381838afc7752f9ddaec81bbe4383611d833230"},
 ]
 [package.dependencies]
@@ -2896,26 +2929,6 @@ outcome = ">=1.2.0"
 trio = ">=0.11"
 wsproto = ">=0.14"
-[[package]]
-name = "tsp-client"
-version = "0.2.0"
-description = "An IETF Time-Stamp Protocol (TSP) (RFC 3161) client"
-optional = false
-python-versions = "*"
-groups = ["main"]
-files = [
-    {file = "tsp-client-0.2.0.tar.gz", hash = "sha256:6e66148dd116322eb44a7484e5ad33bbe640b997343c443de9cc70fc5eb19987"},
-    {file = "tsp_client-0.2.0-py3-none-any.whl", hash = "sha256:0b790d10a68d66782c13f1d7cc7f5206df26b49826c1da80944b7c05b1731784"},
-]
-[package.dependencies]
-asn1crypto = ">=0.24.0"
-pyOpenSSL = ">=20.0.0"
-requests = ">=2.18.4"
-[package.extras]
-tests = ["build", "coverage", "mypy", "ruff", "wheel"]
 [[package]]
 name = "typing-extensions"
 version = "4.12.2"
@@ -2946,15 +2959,15 @@ typing-extensions = ">=3.7.4"
 [[package]]
 name = "tzdata"
-version = "2025.1"
+version = "2025.2"
 description = "Provider of IANA time zone data"
 optional = false
 python-versions = ">=2"
 groups = ["main"]
 markers = "platform_system == \"Windows\""
 files = [
-    {file = "tzdata-2025.1-py2.py3-none-any.whl", hash = "sha256:7e127113816800496f027041c570f50bcd464a020098a3b6b199517772303639"},
+    {file = "tzdata-2025.2-py2.py3-none-any.whl", hash = "sha256:1a403fada01ff9221ca8044d701868fa132215d84beb92242d9acd2147f667a8"},
-    {file = "tzdata-2025.1.tar.gz", hash = "sha256:24894909e88cdb28bd1636c6887801df64cb485bd593f2fd83ef29075a81d694"},
+    {file = "tzdata-2025.2.tar.gz", hash = "sha256:b60a638fcc0daffadf82fe0f57e53d06bdec2f36c4df66280ae79bce6bd6f2b9"},
 ]
 [[package]]
@@ -3343,27 +3356,27 @@ h11 = ">=0.9.0,<1"
 [[package]]
 name = "yt-dlp"
-version = "2025.2.19"
+version = "2025.3.21"
 description = "A feature-rich command-line audio/video downloader"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "yt_dlp-2025.2.19-py3-none-any.whl", hash = "sha256:3ed218eaeece55e9d715afd41abc450dc406ee63bf79355169dfde312d38fdb8"},
+    {file = "yt_dlp-2025.3.21-py3-none-any.whl", hash = "sha256:80d5ce15f9223e0c27020b861a4c5b72c6ba5d6c957c1b8fd2a022a69783f482"},
-    {file = "yt_dlp-2025.2.19.tar.gz", hash = "sha256:f33ca76df2e4db31880f2fe408d44f5058d9f135015b13e50610dfbe78245bea"},
+    {file = "yt_dlp-2025.3.21.tar.gz", hash = "sha256:5bcf47b2897254ea3816935a8dde47d243bff556782cced6b16a2b85e6b682ba"},
 ]
 [package.extras]
 build = ["build", "hatchling", "pip", "setuptools (>=71.0.2)", "wheel"]
 curl-cffi = ["curl-cffi (==0.5.10) ; os_name == \"nt\" and implementation_name == \"cpython\"", "curl-cffi (>=0.5.10,!=0.6.*,<0.7.2) ; os_name != \"nt\" and implementation_name == \"cpython\""]
 default = ["brotli ; implementation_name == \"cpython\"", "brotlicffi ; implementation_name != \"cpython\"", "certifi", "mutagen", "pycryptodomex", "requests (>=2.32.2,<3)", "urllib3 (>=1.26.17,<3)", "websockets (>=13.0)"]
-dev = ["autopep8 (>=2.0,<3.0)", "pre-commit", "pytest (>=8.1,<9.0)", "pytest-rerunfailures (>=14.0,<15.0)", "ruff (>=0.9.0,<0.10.0)"]
+dev = ["autopep8 (>=2.0,<3.0)", "pre-commit", "pytest (>=8.1,<9.0)", "pytest-rerunfailures (>=14.0,<15.0)", "ruff (>=0.11.0,<0.12.0)"]
 pyinstaller = ["pyinstaller (>=6.11.1)"]
 secretstorage = ["cffi", "secretstorage"]
-static-analysis = ["autopep8 (>=2.0,<3.0)", "ruff (>=0.9.0,<0.10.0)"]
+static-analysis = ["autopep8 (>=2.0,<3.0)", "ruff (>=0.11.0,<0.12.0)"]
 test = ["pytest (>=8.1,<9.0)", "pytest-rerunfailures (>=14.0,<15.0)"]
 [metadata]
 lock-version = "2.1"
 python-versions = ">=3.10,<3.13"
-content-hash = "beb354960b8d8af491a13e09cb565c7e3099a2b150167c16147aa0438e970018"
+content-hash = "ac5d473189adbadb3ee5d8a36e1898a39725755704e0677768303ae46bc246c8"
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
 [project]
 name = "auto-archiver"
-version = "0.13.6"
+version = "0.13.9"
 description = "Automatically archive links to videos, images, and social media content from Google Sheets (and more)."
 requires-python = ">=3.10,<3.13"
@@ -41,11 +41,9 @@ dependencies = [
     "instaloader (>=0.0.0)",
     "tqdm (>=0.0.0)",
     "jinja2 (>=0.0.0)",
-    "pyOpenSSL (==24.2.1)",
-    "cryptography (>=41.0.0,<42.0.0)",
    "boto3 (>=1.28.0,<2.0.0)",
    "dataclasses-json (>=0.0.0)",
-    "yt-dlp (>=2025.1.26,<2026.0.0)",
+    "yt-dlp (>=2025.3.21,<2026.0.0)",
    "numpy (==2.1.3)",
    "vk-url-scraper (>=0.0.0)",
    "requests[socks] (>=0.0.0)",
@@ -53,10 +51,10 @@ dependencies = [
     "jsonlines (>=0.0.0)",
     "pysubs2 (>=0.0.0)",
     "retrying (>=0.0.0)",
-    "tsp-client (>=0.0.0)",
     "certvalidator (>=0.0.0)",
     "rich-argparse (>=1.6.0,<2.0.0)",
     "ruamel-yaml (>=0.18.10,<0.19.0)",
     "rfc3161-client (>=1.0.1,<2.0.0)",
+    "cryptography (>44.0.1,<45.0.0)",
    "opentimestamps (>=0.4.5,<0.5.0)",
 ]
--- a/scripts/generate_google_services.sh
+++ b/scripts/generate_google_services.sh
@@ -0,0 +1,135 @@
 #!/usr/bin/env bash
 set -e  # Exit on error
 UUID=$(LC_ALL=C tr -dc a-z0-9 </dev/urandom | head -c 16) 
 PROJECT_NAME="auto-archiver-$UUID"
 ACCOUNT_NAME="autoarchiver"
 KEY_FILE="service_account-$UUID.json"
 DEST_DIR="$1"
 echo "====================================================="
 echo "🔧 Auto-Archiver Google Services Setup Script"
 echo "====================================================="
 echo "This script will:"
 echo "  1. Install Google Cloud SDK if needed"
 echo "  2. Create a Google Cloud project named $PROJECT_NAME"
 echo "  3. Create a service account for Auto-Archiver"
 echo "  4. Generate a key file for API access"
 echo ""
 echo "  Tip: Pass a directory path as an argument to this script to move the key file there"
 echo "  e.g. ./generate_google_services.sh /path/to/secrets"
 echo "====================================================="
 # Check and install Google Cloud SDK based on platform
 install_gcloud_sdk() {
    if command -v gcloud &> /dev/null; then
        echo "✅ Google Cloud SDK is already installed"
        return 0
    fi
    echo "📦 Installing Google Cloud SDK..."
    # Detect OS
    case "$(uname -s)" in
        Darwin*)
            if command -v brew &> /dev/null; then
                echo "🍺 Installing via Homebrew..."
                brew install google-cloud-sdk --cask
            else
                echo "📥 Downloading Google Cloud SDK for macOS..."
                curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-latest-darwin-x86_64.tar.gz
                tar -xf google-cloud-cli-latest-darwin-x86_64.tar.gz
                ./google-cloud-sdk/install.sh --quiet
                rm google-cloud-cli-latest-darwin-x86_64.tar.gz
                echo "🔄 Please restart your terminal and run this script again"
                exit 0
            fi
            ;;
        Linux*)
            echo "📥 Downloading Google Cloud SDK for Linux..."
            curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-latest-linux-x86_64.tar.gz
            tar -xf google-cloud-cli-latest-linux-x86_64.tar.gz
            ./google-cloud-sdk/install.sh --quiet
            rm google-cloud-cli-latest-linux-x86_64.tar.gz
            echo "🔄 Please restart your terminal and run this script again"
            exit 0
            ;;
        CYGWIN*|MINGW*|MSYS*)
            echo "⚠️ Windows detected. Please follow manual installation instructions at:"
            echo "https://cloud.google.com/sdk/docs/install-sdk"
            exit 1
            ;;
        *)
            echo "⚠️ Unknown operating system. Please follow manual installation instructions at:"
            echo "https://cloud.google.com/sdk/docs/install-sdk"
            exit 1
            ;;
    esac
    echo "✅ Google Cloud SDK installed"
 }
 # Install Google Cloud SDK if needed
 install_gcloud_sdk
 # Login to Google Cloud
 if gcloud auth list --filter=status:ACTIVE --format="value(account)" | grep -q "@"; then
    echo "✅ Already authenticated with Google Cloud"
 else
    echo "🔑 Authenticating with Google Cloud..."
    gcloud auth login
 fi
 # Create project
 echo "🌟 Creating Google Cloud project: $PROJECT_NAME"
 gcloud projects create "$PROJECT_NAME"
 # Create service account
 echo "👤 Creating service account: $ACCOUNT_NAME"
 gcloud iam service-accounts create "$ACCOUNT_NAME" --project "$PROJECT_NAME"
 # Enable required APIs (add further APIs here as needed)
 echo "⬆️ Enabling required Google APIs..."
 gcloud services enable sheets.googleapis.com --project "$PROJECT_NAME"
 gcloud services enable drive.googleapis.com --project "$PROJECT_NAME"
 # Get the service account email
 echo "📧 Retrieving service account email..."
 ACCOUNT_EMAIL=$(gcloud iam service-accounts list --project "$PROJECT_NAME" --format="value(email)")
 # Create and download key
 echo "🔑 Generating service account key file: $KEY_FILE"
 gcloud iam service-accounts keys create "$KEY_FILE" --iam-account="$ACCOUNT_EMAIL"
 # Move the key file to DEST_DIR if provided
 if [[ -n "$DEST_DIR" ]]; then
     # Expand a leading `~` without eval, so the argument is never executed as shell code
     DEST_DIR="${DEST_DIR/#\~/$HOME}"
    # Ensure the directory exists
    if [[ ! -d "$DEST_DIR" ]]; then
        mkdir -p "$DEST_DIR"
    fi
    DEST_PATH="$DEST_DIR/$KEY_FILE"
    echo "🚚 Moving key file to: $DEST_PATH"
    mv "$KEY_FILE" "$DEST_PATH"
    KEY_FILE="$DEST_PATH"
 fi
 echo "====================================================="
 echo "✅ SETUP COMPLETE!"
 echo "====================================================="
 echo "📝 Important Information:"
 echo "  • Project Name: $PROJECT_NAME"
 echo "  • Service Account: $ACCOUNT_EMAIL"
 echo "  • Key File: $KEY_FILE"
 echo ""
 echo "📋 Next Steps:"
 echo "  1. Share any Google Sheets with this email address:"
 echo "     $ACCOUNT_EMAIL"
 echo "  2. Move $KEY_FILE to your auto-archiver secrets directory"
 echo "  3. Update your auto-archiver config to use this key file (if needed)"
 echo "====================================================="
--- a/scripts/settings/package-lock.json
+++ b/scripts/settings/package-lock.json
--- a/src/auto_archiver/core/base_module.py
+++ b/src/auto_archiver/core/base_module.py
@@ -71,7 +71,16 @@ class BaseModule(ABC):
        :param site: the domain of the site to get authentication information for
        :param extract_cookies: whether or not to extract cookies from the given browser/file and return the cookie jar (disabling can speed up processing if you don't actually need the cookies jar).
-        :returns: authdict dict of login information for the given site
+        :returns: authdict dict -> {
            "username": str,
            "password": str,
            "api_key": str,
            "api_secret": str,
            "cookie": str,
            "cookies_file": str,
            "cookies_from_browser": str,
            "cookies_jar": CookieJar
        }
        **Global options:**\n
         * cookies_from_browser: str - the name of the browser to extract cookies from (e.g. 'chrome', 'firefox'); uses yt-dlp under the hood to extract them\n
@@ -85,6 +94,7 @@ class BaseModule(ABC):
        * cookie: str - a cookie string to use for login (specific to this site)\n
        * cookies_file: str - the path to a cookies file to use for login (specific to this site)\n
         * cookies_from_browser: str - the name of the browser to extract cookies from (specific to this site)\n
        """
        # TODO: think about if/how we can deal with sites that have multiple domains (main one is x.com/twitter.com)
        # for now the user must enter them both, like "x.com,twitter.com" in their config. Maybe we just hard-code?
--- a/src/auto_archiver/core/module.py
+++ b/src/auto_archiver/core/module.py
@@ -5,6 +5,7 @@ by handling user configuration, validating the steps properties, and implementin
 """
 from __future__ import annotations
 import subprocess
 from dataclasses import dataclass
 from typing import List, TYPE_CHECKING, Type
@@ -17,7 +18,7 @@ import os
 from os.path import join
 from loguru import logger
 import auto_archiver
-from auto_archiver.core.consts import DEFAULT_MANIFEST, MANIFEST_FILE
+from auto_archiver.core.consts import DEFAULT_MANIFEST, MANIFEST_FILE, SetupError
 if TYPE_CHECKING:
    from .base_module import BaseModule
@@ -85,7 +86,11 @@ class ModuleFactory:
        if not available:
            message = f"Module '{module_name}' not found. Are you sure it's installed/exists?"
            if "archiver" in module_name:
-                message += f" Did you mean {module_name.replace('archiver', 'extractor')}?"
+                message += f" Did you mean '{module_name.replace('archiver', 'extractor')}'?"
            elif "gsheet" in module_name:
                message += " Did you mean 'gsheet_feeder_db'?"
            elif "atlos" in module_name:
                message += " Did you mean 'atlos_feeder_db_storage'?"
            raise IndexError(message)
        return available[0]
@@ -216,9 +221,9 @@ class LazyBaseModule:
                if not check(dep):
                    logger.error(
                        f"Module '{self.name}' requires external dependency '{dep}' which is not available/setup. \
-                                 Have you installed the required dependencies for the '{self.name}' module? See the README for more information."
+                                 Have you installed the required dependencies for the '{self.name}' module? See the documentation for more information."
                    )
-                    exit(1)
+                    raise SetupError()
        def check_python_dep(dep):
            # first check if it's a module:
@@ -237,8 +242,22 @@ class LazyBaseModule:
            return find_spec(dep)
        def check_bin_dep(dep):
            dep_exists = shutil.which(dep)
            if dep == "docker":
                if os.environ.get("RUNNING_IN_DOCKER"):
                    # this is only for the WACZ enricher, which requires docker
                    # if we're already running in docker then we don't need docker
                    return True
                # check if docker daemon is running
                return dep_exists and subprocess.run(["docker", "ps", "-q"]).returncode == 0
            return dep_exists
        check_deps(self.dependencies.get("python", []), check_python_dep)
-        check_deps(self.dependencies.get("bin", []), lambda dep: shutil.which(dep))
+        check_deps(self.dependencies.get("bin", []), check_bin_dep)
        logger.debug(f"Loading module '{self.display_name}'...")
--- a/src/auto_archiver/core/orchestrator.py
+++ b/src/auto_archiver/core/orchestrator.py
@@ -373,9 +373,17 @@ Here's how that would look: \n\nsteps:\n  extractors:\n  - [your_extractor_name_
                if module in invalid_modules:
                    continue
                # check to make sure that we're trying to load it as the correct type - i.e. make sure the user hasn't put it under the wrong 'step'
                lazy_module: LazyBaseModule = self.module_factory.get_module_lazy(module)
                if module_type not in lazy_module.type:
                    types = ",".join(f"'{t}'" for t in lazy_module.type)
                    raise SetupError(
                        f"Configuration Error: Module '{module}' is not a {module_type}, but has the types: {types}. Please check you set this module up under the right step in your orchestration file."
                    )
                loaded_module = None
                try:
-                    loaded_module: BaseModule = self.module_factory.get_module(module, self.config)
+                    loaded_module: BaseModule = lazy_module.load(self.config)
                except (KeyboardInterrupt, Exception) as e:
                    if not isinstance(e, KeyboardInterrupt) and not isinstance(e, SetupError):
                        logger.error(f"Error during setup of modules: {e}\n{traceback.format_exc()}")
--- a/src/auto_archiver/modules/generic_extractor/manifest.py
+++ b/src/auto_archiver/modules/generic_extractor/manifest.py
@@ -74,6 +74,11 @@ If you are having issues with the extractor, you can review the version of `yt-d
            "default": "inf",
            "help": "Use to limit the number of videos to download when a channel or long page is being extracted. 'inf' means no limit.",
        },
        "extractor_args": {
            "default": {},
            "help": "Additional arguments to pass to the yt-dlp extractor. See https://github.com/yt-dlp/yt-dlp/blob/master/README.md#extractor-arguments.",
            "type": "json_loader",
        },
        "ytdlp_update_interval": {
            "default": 5,
            "help": "How often to check for yt-dlp updates (days). If positive, will check and update yt-dlp every [num] days. Set it to -1 to disable, or 0 to always update on every run.",
--- a/src/auto_archiver/modules/generic_extractor/dropin.py
+++ b/src/auto_archiver/modules/generic_extractor/dropin.py
@@ -1,3 +1,4 @@
 from typing import Type
 from yt_dlp.extractor.common import InfoExtractor
 from auto_archiver.core.metadata import Metadata
 from auto_archiver.core.extractor import Extractor
@@ -24,6 +25,8 @@ class GenericDropin:
    """
    extractor: Type[Extractor] = None
    def extract_post(self, url: str, ie_instance: InfoExtractor):
        """
        This method should return the post data from the url.
@@ -55,3 +58,19 @@ class GenericDropin:
        This method should download any additional media from the post.
        """
        return metadata
    def suitable(self, url, info_extractor: InfoExtractor):
        """
        A method to allow dropins to override their InfoExtractor's 'suitable' method.
        Dropins should override this method and return True if the url is suitable for the extractor
        (based on being able to parse other URLs). See the `suitable_extractors` method in the
        `GenericExtractor` class for how this is implemented.
        The default behaviour of this method is to return the result of the InfoExtractor's 'suitable' method.
        ### Example: An example of where this is useful is for the FacebookIE extractor in yt-dlp. By default,
         its 'suitable' method only returns True for video URLs. However, we can override this method in the
        Facebook dropin to return True for all Facebook URLs (photo/post types). This way, the Facebook dropin
        can be used for all Facebook URLs.
        """
        return info_extractor.suitable(url)
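
Reviewer sketch (not part of the patch): the override described in the docstring above, together with the dropin-first precedence used by `suitable_extractors`, can be illustrated with stand-in classes. `StubVideoIE`, `BaseDropin`, `ExampleDropin`, `pick` and the `example.com` URLs are all hypothetical names chosen for illustration; the stub only imitates the `suitable` classmethod of a yt-dlp `InfoExtractor` and does not import yt-dlp.

```python
import re


class StubVideoIE:
    """Hypothetical stand-in for an InfoExtractor that only accepts video URLs."""

    _VALID_URL = r"https?://(?:www\.)?example\.com/videos/\d+"

    @classmethod
    def suitable(cls, url):
        return re.match(cls._VALID_URL, url) is not None


class BaseDropin:
    """Default behaviour from the docstring: defer to the extractor's own check."""

    def suitable(self, url, info_extractor):
        return info_extractor.suitable(url)


class ExampleDropin(BaseDropin):
    """Hypothetical dropin that claims every example.com URL, not only videos."""

    def suitable(self, url, info_extractor):
        return re.match(r"https?://(?:[\w-]+\.)?example\.com/", url) is not None


def pick(url, info_extractor, dropin=None):
    """Ask the dropin first (if any), then fall back to the extractor."""
    if dropin and dropin.suitable(url, info_extractor):
        return True
    return info_extractor.suitable(url)
```

With no dropin, a photo URL on the hypothetical site is rejected because the stub extractor only matches video URLs; passing `ExampleDropin()` makes the same URL acceptable, which is the effect the Facebook dropin is after.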
--- a/src/auto_archiver/modules/generic_extractor/facebook.py
+++ b/src/auto_archiver/modules/generic_extractor/facebook.py
@@ -1,17 +1,154 @@
 import re
 from .dropin import GenericDropin
 from auto_archiver.core.metadata import Metadata
 from yt_dlp.extractor.facebook import FacebookIE
 # TODO: Remove if / when  https://github.com/yt-dlp/yt-dlp/pull/12275 is merged
 from yt_dlp.utils import (
    clean_html,
    get_element_by_id,
    traverse_obj,
    get_first,
    merge_dicts,
    int_or_none,
    parse_count,
 )
 def _extract_metadata(self, webpage, video_id):
    post_data = [
        self._parse_json(j, video_id, fatal=False)
        for j in re.findall(r"data-sjs>({.*?ScheduledServerJS.*?})</script>", webpage)
    ]
    post = (
        traverse_obj(
            post_data,
            (..., "require", ..., ..., ..., "__bbox", "require", ..., ..., ..., "__bbox", "result", "data"),
            expected_type=dict,
        )
        or []
    )
    media = traverse_obj(
        post,
        (
            ...,
            "attachments",
            ...,
            lambda k, v: (k == "media" and str(v["id"]) == video_id and v["__typename"] == "Video"),
        ),
        expected_type=dict,
    )
    title = get_first(media, ("title", "text"))
    description = get_first(media, ("creation_story", "comet_sections", "message", "story", "message", "text"))
    page_title = title or self._html_search_regex(
        (
            r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>(?P<content>[^<]*)</h2>',
            r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(?P<content>.*?)</span>',
            self._meta_regex("og:title"),
            self._meta_regex("twitter:title"),
            r"<title>(?P<content>.+?)</title>",
        ),
        webpage,
        "title",
        default=None,
        group="content",
    )
    description = description or self._html_search_meta(
        ["description", "og:description", "twitter:description"], webpage, "description", default=None
    )
    uploader_data = (
        get_first(media, ("owner", {dict}))
        or get_first(
            post, ("video", "creation_story", "attachments", ..., "media", lambda k, v: k == "owner" and v["name"])
        )
        or get_first(post, (..., "video", lambda k, v: k == "owner" and v["name"]))
        or get_first(post, ("node", "actors", ..., {dict}))
        or get_first(post, ("event", "event_creator", {dict}))
        or get_first(post, ("video", "creation_story", "short_form_video_context", "video_owner", {dict}))
        or {}
    )
    uploader = uploader_data.get("name") or (
        clean_html(get_element_by_id("fbPhotoPageAuthorName", webpage))
        or self._search_regex(
            (r'ownerName\s*:\s*"([^"]+)"', *self._og_regexes("title")), webpage, "uploader", fatal=False
        )
    )
    timestamp = int_or_none(self._search_regex(r'<abbr[^>]+data-utime=["\'](\d+)', webpage, "timestamp", default=None))
    thumbnail = self._html_search_meta(["og:image", "twitter:image"], webpage, "thumbnail", default=None)
    # some webpages contain unretrievable thumbnail urls
    # like https://lookaside.fbsbx.com/lookaside/crawler/media/?media_id=10155168902769113&get_thumbnail=1
    # in https://www.facebook.com/yaroslav.korpan/videos/1417995061575415/
    if thumbnail and not re.search(r"\.(?:jpg|png)", thumbnail):
        thumbnail = None
    info_dict = {
        "description": description,
        "uploader": uploader,
        "uploader_id": uploader_data.get("id"),
        "timestamp": timestamp,
        "thumbnail": thumbnail,
        "view_count": parse_count(
            self._search_regex(
                (r'\bviewCount\s*:\s*["\']([\d,.]+)', r'video_view_count["\']\s*:\s*(\d+)'),
                webpage,
                "view count",
                default=None,
            )
        ),
        "concurrent_view_count": get_first(
            post, (("video", (..., ..., "attachments", ..., "media")), "liveViewerCount", {int_or_none})
        ),
        **traverse_obj(
            post,
            (
                lambda _, v: video_id in v["url"],
                "feedback",
                {
                    "like_count": ("likers", "count", {int}),
                    "comment_count": ("total_comment_count", {int}),
                    "repost_count": ("share_count_reduced", {parse_count}),
                },
            ),
            get_all=False,
        ),
    }
    info_json_ld = self._search_json_ld(webpage, video_id, default={})
    info_json_ld["title"] = (
        re.sub(r"\s*\|\s*Facebook$", "", title or info_json_ld.get("title") or page_title or "")
        or (description or "").replace("\n", " ")
        or f"Facebook video #{video_id}"
    )
    return merge_dicts(info_json_ld, info_dict)
 class Facebook(GenericDropin):
-    def extract_post(self, url: str, ie_instance):
+    def extract_post(self, url: str, ie_instance: FacebookIE):
-        video_id = ie_instance._match_valid_url(url).group("id")
+        post_id_regex = r"(?P<id>pfbid[A-Za-z0-9]+|\d+|t\.(\d+\/\d+))"
-        ie_instance._download_webpage(url.replace("://m.facebook.com/", "://www.facebook.com/"), video_id)
+        post_id = re.search(post_id_regex, url).group("id")
-        webpage = ie_instance._download_webpage(url, ie_instance._match_valid_url(url).group("id"))
+        webpage = ie_instance._download_webpage(url.replace("://m.facebook.com/", "://www.facebook.com/"), post_id)
-        # TODO: fix once https://github.com/yt-dlp/yt-dlp/pull/12275 is merged
+        # TODO: For long posts, this _extract_metadata only seems to return the first 100 or so characters, followed by ...
-        post_data = ie_instance._extract_metadata(webpage)
+
        # TODO: If/when https://github.com/yt-dlp/yt-dlp/pull/12275 is merged, uncomment next line and delete the one after
        # post_data = ie_instance._extract_metadata(webpage, post_id)
        post_data = _extract_metadata(ie_instance, webpage, post_id)
        return post_data
-    def create_metadata(self, post: dict, ie_instance, archiver, url):
+    def create_metadata(self, post: dict, ie_instance: FacebookIE, archiver, url):
-        metadata = archiver.create_metadata(url)
+        result = Metadata()
-        metadata.set_title(post.get("title")).set_content(post.get("description")).set_post_data(post)
+        result.set_content(post.get("description", ""))
-        return metadata
+        result.set_title(post.get("title", ""))
        result.set("author", post.get("uploader", ""))
        result.set_url(url)
        return result
    def suitable(self, url, info_extractor: FacebookIE):
         regex = r"(?:https?://(?:[\w-]+\.)?(?:facebook\.com|facebookwkhpilnemxj7asaniu7vnjjbiltxjqhye3mhbshg7kx5tfyd\.onion)/)"
        return re.match(regex, url)
     def skip_ytdlp_download(self, url: str, ie_instance: FacebookIE):
        """
        Skip using the ytdlp download method for Facebook *photo* posts, they have a URL with an id of t.XXXXX/XXXXX
        """
         if re.search(r"/t\.\d+/\d+", url):
            return True
--- a/src/auto_archiver/modules/generic_extractor/generic_extractor.py
+++ b/src/auto_archiver/modules/generic_extractor/generic_extractor.py
@@ -1,3 +1,4 @@
 import sys
 import datetime
 import os
 import importlib
@@ -13,6 +14,8 @@ from loguru import logger
 from auto_archiver.core.extractor import Extractor
 from auto_archiver.core import Metadata, Media
 from auto_archiver.utils import get_datetime_from_str
 from .dropin import GenericDropin
 class SkipYtdlp(Exception):
@@ -35,12 +38,24 @@ class GenericExtractor(Extractor):
                next_update_check = datetime.datetime.fromisoformat(f.read())
        if not next_update_check or next_update_check < datetime.datetime.now():
-            self.update_ytdlp()
+            updated = self.update_ytdlp()
            next_update_check = datetime.datetime.now() + datetime.timedelta(days=self.ytdlp_update_interval)
            with open(path, "w") as f:
                f.write(next_update_check.isoformat())
            if not updated:
                return
            if os.environ.get("AUTO_ARCHIVER_ALLOW_RESTART", "1") != "1":
                logger.warning(
                    "yt-dlp has been updated. Auto archiver should be restarted for these changes to take effect"
                )
            else:
                logger.warning("Restarting auto-archiver to apply yt-dlp update")
                logger.warning(" ======= RESTARTING ======= ")
                os.execv(sys.executable, [sys.executable] + sys.argv)
    def update_ytdlp(self):
        logger.info("Checking and updating yt-dlp...")
        logger.info(
@@ -56,18 +71,27 @@ class GenericExtractor(Extractor):
            if "Successfully installed yt-dlp" in result.stdout.decode():
                new_version = importlib.metadata.version("yt-dlp")
                 logger.info(f"yt-dlp successfully updated (from {old_version} to {new_version})")
-                importlib.reload(yt_dlp)
+                return True
            else:
                logger.info("yt-dlp already up to date")
                return False
        except Exception as e:
            logger.error(f"Error updating yt-dlp: {e}")
            return False
    def suitable_extractors(self, url: str) -> Generator[str, None, None]:
        """
         Yields each working extractor that is suitable for the given URL"""
        for info_extractor in yt_dlp.YoutubeDL()._ies.values():
-            if info_extractor.suitable(url) and info_extractor.working():
+            if not info_extractor.working():
                continue
            # check if there's a dropin and see if that declares whether it's suitable
            dropin: GenericDropin = self.dropin_for_name(info_extractor.ie_key())
            if dropin and dropin.suitable(url, info_extractor):
                yield info_extractor
            elif info_extractor.suitable(url):
                yield info_extractor
    def suitable(self, url: str) -> bool:
@@ -188,9 +212,13 @@ class GenericExtractor(Extractor):
        result = self.download_additional_media(video_data, info_extractor, result)
        # keep both 'title' and 'fulltitle', but prefer 'title', falling back to 'fulltitle' if it doesn't exist
-        result.set_title(video_data.pop("title", video_data.pop("fulltitle", "")))
+        if not result.get_title():
-        result.set_url(url)
+            result.set_title(video_data.pop("title", video_data.pop("fulltitle", "")))
-        if "description" in video_data:
+
        if not result.get("url"):
            result.set_url(url)
        if "description" in video_data and not result.get("content"):
            result.set_content(video_data["description"])
        # extract comments if enabled
        if self.comments:
@@ -207,11 +235,14 @@ class GenericExtractor(Extractor):
            )
        # then add the common metadata
-        if timestamp := video_data.pop("timestamp", None):
+        timestamp = video_data.pop("timestamp", None)
        if timestamp and not result.get("timestamp"):
            timestamp = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc).isoformat()
            result.set_timestamp(timestamp)
-        if upload_date := video_data.pop("upload_date", None):
+
-            upload_date = datetime.datetime.strptime(upload_date, "%Y%m%d").replace(tzinfo=datetime.timezone.utc)
+        upload_date = video_data.pop("upload_date", None)
        if upload_date and not result.get("upload_date"):
            upload_date = get_datetime_from_str(upload_date, "%Y%m%d").replace(tzinfo=datetime.timezone.utc)
            result.set("upload_date", upload_date)
        # then clean away any keys we don't want
@@ -240,7 +271,8 @@ class GenericExtractor(Extractor):
            return False
        post_data = dropin.extract_post(url, ie_instance)
-        return dropin.create_metadata(post_data, ie_instance, self, url)
+        result = dropin.create_metadata(post_data, ie_instance, self, url)
        return self.add_metadata(post_data, info_extractor, url, result)
    def get_metadata_for_video(
        self, data: dict, info_extractor: Type[InfoExtractor], url: str, ydl: yt_dlp.YoutubeDL
@@ -285,7 +317,7 @@ class GenericExtractor(Extractor):
        return self.add_metadata(data, info_extractor, url, result)
-    def dropin_for_name(self, dropin_name: str, additional_paths=[], package=__package__) -> Type[InfoExtractor]:
+    def dropin_for_name(self, dropin_name: str, additional_paths=[], package=__package__) -> GenericDropin:
        dropin_name = dropin_name.lower()
        if dropin_name == "generic":
@@ -296,6 +328,7 @@ class GenericExtractor(Extractor):
        def _load_dropin(dropin):
            dropin_class = getattr(dropin, dropin_class_name)()
            dropin.extractor = self
            return self._dropins.setdefault(dropin_name, dropin_class)
        try:
@@ -340,7 +373,7 @@ class GenericExtractor(Extractor):
        dropin_submodule = self.dropin_for_name(info_extractor.ie_key())
        try:
-            if dropin_submodule and dropin_submodule.skip_ytdlp_download(info_extractor, url):
+            if dropin_submodule and dropin_submodule.skip_ytdlp_download(url, info_extractor):
                logger.debug(f"Skipping using ytdlp to download files for {info_extractor.ie_key()}")
                raise SkipYtdlp()
@@ -359,7 +392,7 @@ class GenericExtractor(Extractor):
            if not isinstance(e, SkipYtdlp):
                logger.debug(
-                    f'Issue using "{info_extractor.IE_NAME}" extractor to download video (error: {repr(e)}), attempting to use extractor to get post data instead'
+                    f'Issue using "{info_extractor.IE_NAME}" extractor to download video (error: {repr(e)}), attempting to use dropin to get post data instead'
                )
            try:
@@ -404,16 +437,20 @@ class GenericExtractor(Extractor):
            "--write-subs" if self.subtitles else "--no-write-subs",
            "--write-auto-subs" if self.subtitles else "--no-write-auto-subs",
            "--live-from-start" if self.live_from_start else "--no-live-from-start",
            "--proxy",
            self.proxy if self.proxy else "",
            f"--max-downloads {self.max_downloads}" if self.max_downloads != "inf" else "",
            f"--playlist-end {self.max_downloads}" if self.max_downloads != "inf" else "",
        ]
        # proxy handling
        if self.proxy:
            ydl_options.extend(["--proxy", self.proxy])
        # max_downloads handling
        if self.max_downloads != "inf":
            ydl_options.extend(["--max-downloads", str(self.max_downloads)])
            ydl_options.extend(["--playlist-end", str(self.max_downloads)])
        # set up auth
        auth = self.auth_for_site(url, extract_cookies=False)
-
+        # order of importance: username/password -> api_key -> cookie -> cookies_from_browser -> cookies_file
        if auth:
            if "username" in auth and "password" in auth:
                logger.debug(f"Using provided auth username and password for {url}")
@@ -429,6 +466,16 @@ class GenericExtractor(Extractor):
                logger.debug(f"Using cookies from file {auth['cookies_file']} for {url}")
                ydl_options.extend(("--cookies", auth["cookies_file"]))
        # Applying user-defined extractor_args
        if self.extractor_args:
            for key, args in self.extractor_args.items():
                logger.debug(f"Setting extractor_args: {key}")
                if isinstance(args, dict):
                    arg_str = ";".join(f"{k}={v}" for k, v in args.items())
                else:
                    arg_str = str(args)
                ydl_options.extend(["--extractor-args", f"{key}:{arg_str}"])
        if self.ytdlp_args:
            logger.debug(f"Adding additional ytdlp arguments: {self.ytdlp_args}")
            ydl_options += self.ytdlp_args.split(" ")
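The `extractor_args` handling above can be read in isolation. The following is a minimal standalone sketch, not the module's code: `build_extractor_args` is a hypothetical helper name, and the sample extractor names and values are only illustrative inputs.

```python
def build_extractor_args(extractor_args: dict) -> list[str]:
    """Flatten a mapping of extractor name -> args into yt-dlp CLI options."""
    options: list[str] = []
    for key, args in extractor_args.items():
        if isinstance(args, dict):
            # a nested dict becomes "k1=v1;k2=v2"
            arg_str = ";".join(f"{k}={v}" for k, v in args.items())
        else:
            # anything else is passed through as a string
            arg_str = str(args)
        options.extend(["--extractor-args", f"{key}:{arg_str}"])
    return options


# illustrative input only
sample = {"youtube": {"player_client": "web", "skip": "dash"}, "twitter": "api=syndication"}
print(build_extractor_args(sample))
```

Each extractor gets its own `--extractor-args` flag, so the resulting list can be appended directly to the other command-line options.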
--- a/src/auto_archiver/modules/generic_extractor/tiktok.py
+++ b/src/auto_archiver/modules/generic_extractor/tiktok.py
@@ -1,5 +1,8 @@
 import requests
 from loguru import logger
 from yt_dlp.extractor.tiktok import TikTokIE, TikTokLiveIE, TikTokVMIE, TikTokUserIE
 from auto_archiver.core import Metadata, Media
 from datetime import datetime, timezone
 from .dropin import GenericDropin
@@ -13,6 +16,11 @@ class Tiktok(GenericDropin):
    TIKWM_ENDPOINT = "https://www.tikwm.com/api/?url={url}"
    def suitable(self, url, info_extractor) -> bool:
        """This dropin (which uses Tikwm) is suitable for *all* TikTok URL types: videos, lives, VMs, and users.
        Returns True if any of the yt-dlp TikTok extractors considers the URL suitable."""
        return any(extractor().suitable(url) for extractor in (TikTokIE, TikTokLiveIE, TikTokVMIE, TikTokUserIE))
    def extract_post(self, url: str, ie_instance):
        logger.debug(f"Using Tikwm API to attempt to download tiktok video from {url=}")
@@ -38,6 +46,9 @@ class Tiktok(GenericDropin):
        api_data["video_url"] = video_url
        return api_data
    def keys_to_clean(self, video_data: dict, info_extractor):
        return ["video_url", "title", "create_time", "author", "cover", "origin_cover", "ai_dynamic_cover", "duration"]
    def create_metadata(self, post: dict, ie_instance, archiver, url):
        # prepare result, start by downloading video
        result = Metadata()
@@ -54,17 +65,17 @@ class Tiktok(GenericDropin):
            logger.error(f"failed to download video from {video_url}")
            return False
        video_media = Media(video_downloaded)
-        if duration := post.pop("duration", None):
+        if duration := post.get("duration", None):
            video_media.set("duration", duration)
        result.add_media(video_media)
        # add remaining metadata
-        result.set_title(post.pop("title", ""))
+        result.set_title(post.get("title", ""))
-        if created_at := post.pop("create_time", None):
+        if created_at := post.get("create_time", None):
            result.set_timestamp(datetime.fromtimestamp(created_at, tz=timezone.utc))
-        if author := post.pop("author", None):
+        if author := post.get("author", None):
            result.set("author", author)
        result.set("api_data", post)
--- a/src/auto_archiver/modules/generic_extractor/twitter.py
+++ b/src/auto_archiver/modules/generic_extractor/twitter.py
@@ -1,13 +1,11 @@
 import re
 import mimetypes
 import json
 from datetime import datetime
 from loguru import logger
 from slugify import slugify
 from auto_archiver.core.metadata import Metadata, Media
-from auto_archiver.utils import url as UrlUtil
+from auto_archiver.utils import url as UrlUtil, get_datetime_from_str
 from auto_archiver.core.extractor import Extractor
 from .dropin import GenericDropin, InfoExtractor
@@ -33,19 +31,24 @@ class Twitter(GenericDropin):
        twid = ie_instance._match_valid_url(url).group("id")
        return ie_instance._extract_status(twid=twid)
    def keys_to_clean(self, video_data, info_extractor):
        return ["user", "created_at", "entities", "favorited", "translator_type"]
    def create_metadata(self, tweet: dict, ie_instance: InfoExtractor, archiver: Extractor, url: str) -> Metadata:
        result = Metadata()
        try:
            if not tweet.get("user") or not tweet.get("created_at"):
                raise ValueError("Error retrieving post. Are you sure it exists?")
-            timestamp = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
+            timestamp = get_datetime_from_str(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
        except (ValueError, KeyError) as ex:
            logger.warning(f"Unable to parse tweet: {str(ex)}\nRetrieved tweet data: {tweet}")
            return False
-        result.set_title(tweet.get("full_text", "")).set_content(json.dumps(tweet, ensure_ascii=False)).set_timestamp(
+        full_text = tweet.pop("full_text", "")
-            timestamp
+        author = tweet["user"].get("name", "")
-        )
+        result.set("author", author).set_url(url)
        result.set_title(f"{author} - {full_text}").set_content(full_text).set_timestamp(timestamp)
        if not tweet.get("entities", {}).get("media"):
            logger.debug("No media found, archiving tweet text only")
            result.status = "twitter-ytdl"
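The format string passed for `created_at` above can be checked on its own. This is a minimal sketch using only the standard library; it does not show how `get_datetime_from_str` is implemented (that helper is not part of this diff), and `parse_created_at` and the sample value are hypothetical.

```python
from datetime import datetime, timezone

# same format string as in the diff above
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value: str) -> datetime:
    """Parse a Twitter-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_FORMAT)


ts = parse_created_at("Wed Oct 10 20:19:24 +0000 2018")
print(ts.isoformat())
assert ts == datetime(2018, 10, 10, 20, 19, 24, tzinfo=timezone.utc)
```

Because the format includes `%z`, the parsed value carries its UTC offset instead of being a naive datetime.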
--- a/src/auto_archiver/modules/gsheet_feeder_db/manifest.py
+++ b/src/auto_archiver/modules/gsheet_feeder_db/manifest.py
@@ -70,10 +70,14 @@
    - Skips redundant updates for empty or invalid data fields.
    ### Setup
-    - Requires a Google Service Account JSON file for authentication, which should be stored in `secrets/gsheets_service_account.json`.
+    1. Requires a Google Service Account JSON file for authentication.
-    To set up a service account, follow the instructions [here](https://gspread.readthedocs.io/en/latest/oauth2.html).
+    To set up a service account, follow the instructions in the [how to](https://auto-archiver.readthedocs.io/en/latest/how_to/gsheets_setup.html),
-    - Define the `sheet` or `sheet_id` configuration to specify the sheet to archive.
+    or use the script:
-    - Customize the column names in your Google sheet using the `columns` configuration.
+    ```
-    - The Google Sheet can be used soley as a feeder or as a feeder and database, but note you can't currently feed into the database from an alternate feeder.
+    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/bellingcat/auto-archiver/refs/heads/main/scripts/generate_google_services.sh)"
    ```
    2. Create a Google sheet with the required column(s) and then define the `sheet` or `sheet_id` configuration to specify this sheet.
    3. Customize the column names in your Google sheet using the `columns` configuration.
    4. The Google Sheet can be used solely as a feeder or as a feeder and database, but note you can't currently feed into the database from an alternate feeder.
    """,
 }
--- a/src/auto_archiver/modules/instagram_extractor/instagram_extractor.py
+++ b/src/auto_archiver/modules/instagram_extractor/instagram_extractor.py
@@ -29,6 +29,9 @@ class InstagramExtractor(Extractor):
    # TODO: links to stories
    def setup(self) -> None:
        logger.warning("Instagram Extractor is not actively maintained, and may not work as expected.")
        logger.warning("Please consider using the Instagram Tbot Extractor or Instagram API Extractor instead.")
        self.insta = instaloader.Instaloader(
            download_geotags=True,
            download_comments=True,
--- a/src/auto_archiver/modules/local_storage/manifest.py
+++ b/src/auto_archiver/modules/local_storage/manifest.py
@@ -20,7 +20,7 @@
        "save_absolute": {
            "default": False,
            "type": "bool",
-            "help": "whether the path to the stored file is absolute or relative in the output result inc. formatters (WARN: leaks the file structure)",
+            "help": "whether the path to the stored file is absolute or relative in the output result inc. formatters (Warning: saving an absolute path will show your computer's file structure)",
        },
    },
    "description": """
--- a/src/auto_archiver/modules/screenshot_enricher/screenshot_enricher.py
+++ b/src/auto_archiver/modules/screenshot_enricher/screenshot_enricher.py
@@ -19,12 +19,21 @@ class ScreenshotEnricher(Enricher):
    def enrich(self, to_enrich: Metadata) -> None:
        url = to_enrich.get_url()
        if UrlUtil.is_auth_wall(url):
            logger.debug(f"[SKIP] SCREENSHOT since url is behind AUTH WALL: {url=}")
            return
        logger.debug(f"Enriching screenshot for {url=}")
        auth = self.auth_for_site(url)
        # screenshot enricher only supports cookie-type auth (selenium)
        has_valid_auth = auth and (auth.get("cookies") or auth.get("cookies_jar") or auth.get("cookie"))
        if UrlUtil.is_auth_wall(url) and not has_valid_auth:
            logger.warning(f"[SKIP] SCREENSHOT since url is behind AUTH WALL and no login details provided: {url=}")
            if auth and any(auth.get(key) for key in ["username", "password", "api_key", "api_secret"]):
                logger.warning(
                    f"Screenshot enricher only supports cookie-type authentication, you have provided {auth.keys()} which are not supported.\
                               Consider adding 'cookie', 'cookies_file' or 'cookies_from_browser' to your auth for this site."
                )
            return
        with self.webdriver_factory(
            self.width,
            self.height,
--- a/src/auto_archiver/modules/timestamping_enricher/manifest.py
+++ b/src/auto_archiver/modules/timestamping_enricher/manifest.py
@@ -3,30 +3,38 @@
    "type": ["enricher"],
    "requires_setup": True,
    "dependencies": {
-        "python": ["loguru", "slugify", "tsp_client", "asn1crypto", "certvalidator", "certifi"],
+        "python": ["loguru", "slugify", "cryptography", "rfc3161_client", "certifi"],
    },
    "configs": {
        "tsa_urls": {
            "default": [
-                # [Adobe Approved Trust List] and [Windows Cert Store]
+                # See https://github.com/trailofbits/rfc3161-client/issues/46 for a list of valid TSAs
-                "http://timestamp.digicert.com",
+                # Full list of TSAs: https://gist.github.com/Manouchehri/fd754e402d98430243455713efada710
-                "http://timestamp.identrust.com",
+                    "http://timestamp.identrust.com",
-                # "https://timestamp.entrust.net/TSS/RFC3161sha2TS", # not valid for timestamping
+                    "http://timestamp.ssl.trustwave.com",
-                # "https://timestamp.sectigo.com", # wait 15 seconds between each request.
+                    "http://zeitstempel.dfn.de",
-                # [Adobe: European Union Trusted Lists].
+                    "http://ts.ssl.com",
-                # "https://timestamp.sectigo.com/qualified", # wait 15 seconds between each request.
+                    # "http://tsa.izenpe.com", # self-signed
-                # [Windows Cert Store]
+                    "http://tsa.lex-persona.com/tsa",
-                "http://timestamp.globalsign.com/tsa/r6advanced1",
+                    # "http://ca.signfiles.com/TSAServer.aspx", # self-signed
-                # [Adobe: European Union Trusted Lists] and [Windows Cert Store]
+                    # "http://tsa.sinpe.fi.cr/tsaHttp/", # self-signed
-                # "http://ts.quovadisglobal.com/eu", # not valid for timestamping
+                    # "http://tsa.cra.ge/signserver/tsa?workerName=qtsa", # self-signed
-                # "http://tsa.belgium.be/connect", # self-signed certificate in certificate chain
+                    "http://tss.cnbs.gob.hn/TSS/HttpTspServer",
-                # "https://timestamp.aped.gov.gr/qtss", # self-signed certificate in certificate chain
+                    "http://dss.nowina.lu/pki-factory/tsa/good-tsa",
-                # "http://tsa.sep.bg", # self-signed certificate in certificate chain
+                    # "https://freetsa.org/tsr", # self-signed
-                # "http://tsa.izenpe.com", #unable to get local issuer certificate
+                ],
                # "http://kstamp.keynectis.com/KSign", # unable to get local issuer certificate
                "http://tss.accv.es:8318/tsa",
            ],
            "help": "List of RFC3161 Time Stamp Authorities to use, separate with commas if passed via the command line.",
        },
        "cert_authorities": {
            "default": None,
            "help": "Path to a file containing trusted Certificate Authorities (CAs) in PEM format. If empty, the default system authorities are used.",
            "type": "str",
        },
        "allow_selfsigned": {
            "default": False,
            "help": "Whether or not to allow and save self-signed Timestamping certificates. This allows for a greater range of timestamping servers to be used, \
 but they are not trusted authorities",
            "type": "bool"
        }
    },
    "description": """
--- a/src/auto_archiver/modules/timestamping_enricher/timestamping_enricher.py
+++ b/src/auto_archiver/modules/timestamping_enricher/timestamping_enricher.py
@@ -1,15 +1,22 @@
 import os
 from loguru import logger
 from tsp_client import TSPSigner, SigningSettings, TSPVerifier
 from tsp_client.algorithms import DigestAlgorithm
 from importlib.metadata import version
 from asn1crypto.cms import ContentInfo
 from certvalidator import CertificateValidator, ValidationContext
 from asn1crypto import pem
 import certifi
 from importlib.metadata import version
 import hashlib
 from slugify import slugify
 import requests
 from loguru import logger
 from rfc3161_client import (decode_timestamp_response, TimestampRequestBuilder, TimeStampResponse, VerifierBuilder)
 from rfc3161_client import VerificationError as Rfc3161VerificationError
 from rfc3161_client.base import HashAlgorithm
 from rfc3161_client.tsp import SignedData
 from cryptography import x509
 from cryptography.hazmat.primitives import serialization
 import certifi
 from auto_archiver.core import Enricher
 from auto_archiver.core import Metadata, Media
 from auto_archiver.version import __version__
 class TimestampingEnricher(Enricher):
@@ -21,6 +28,25 @@ class TimestampingEnricher(Enricher):
    See https://gist.github.com/Manouchehri/fd754e402d98430243455713efada710 for list of timestamp authorities.
    """
    session = None
    def setup(self):
        self.session = requests.Session()
        self.session.headers.update(
            {
                "Content-Type": "application/timestamp-query",
                "User-Agent": f"Auto-Archiver {__version__}",
                "Accept": "application/timestamp-reply",
            }
        )
    def cleanup(self) -> None:
        """
        Terminates the underlying network session.
        """
        if self.session:
            self.session.close()
    def enrich(self, to_enrich: Metadata) -> None:
        url = to_enrich.get_url()
        logger.debug(f"RFC3161 timestamping existing files for {url=}")
@@ -34,8 +60,8 @@ class TimestampingEnricher(Enricher):
            logger.warning(f"No hashes found in {url=}")
            return
-        tmp_dir = self.tmp_dir
+        
-        hashes_fn = os.path.join(tmp_dir, "hashes.txt")
+        hashes_fn = os.path.join(self.tmp_dir, "hashes.txt")
        data_to_sign = "\n".join(hashes)
        with open(hashes_fn, "w") as f:
@@ -43,62 +69,160 @@ class TimestampingEnricher(Enricher):
        hashes_media = Media(filename=hashes_fn)
        timestamp_tokens = []
        from slugify import slugify
        for tsa_url in self.tsa_urls:
            try:
-                signing_settings = SigningSettings(tsp_server=tsa_url, digest_algorithm=DigestAlgorithm.SHA256)
+                message = bytes(data_to_sign, encoding='utf8')
-                signer = TSPSigner()
+
-                message = bytes(data_to_sign, encoding="utf8")
+                logger.debug(f"Timestamping {url=} with {tsa_url=}")
-                # send TSQ and get TSR from the TSA server
+                signed: TimeStampResponse = self.sign_data(tsa_url, message)
-                signed = signer.sign(message=message, signing_settings=signing_settings)
+                
-                # fail if there's any issue with the certificates, uses certifi list of trusted CAs
+                # fail if there's any issue with the certificates, uses certifi list of trusted CAs or the user-defined `cert_authorities`
-                TSPVerifier(certifi.where()).verify(signed, message=message)
+                root_cert = self.verify_signed(signed, message)
-                # download and verify timestamping certificate
+
-                cert_chain = self.download_and_verify_certificate(signed)
+                if not root_cert:
-                # continue with saving the timestamp token
+                    if self.allow_selfsigned:
-                tst_fn = os.path.join(tmp_dir, f"timestamp_token_{slugify(tsa_url)}")
+                        logger.warning(f"Allowing self-signed certificate from TSA {tsa_url=}")
-                with open(tst_fn, "wb") as f:
+                    else:
-                    f.write(signed)
+                        raise ValueError(f"No valid root certificate found for {tsa_url=}. Are you sure it's a trusted TSA? Or define an alternative trusted root with `cert_authorities`. (tried: {self.cert_authorities or certifi.where()})")
-                timestamp_tokens.append(Media(filename=tst_fn).set("tsa", tsa_url).set("cert_chain", cert_chain))
+
                # save the timestamping certificate
                cert_chain = self.save_certificate(signed, root_cert)
                timestamp_token_path = self.save_timestamp_token(signed.time_stamp_token(), tsa_url)
                timestamp_tokens.append(Media(filename=timestamp_token_path).set("tsa", tsa_url).set("cert_chain", cert_chain))
            except Exception as e:
                logger.warning(f"Error while timestamping {url=} with {tsa_url=}: {e}")
        if len(timestamp_tokens):
            hashes_media.set("timestamp_authority_files", timestamp_tokens)
            hashes_media.set("certifi v", version("certifi"))
-            hashes_media.set("tsp_client v", version("tsp_client"))
+            hashes_media.set("rfc3161-client v", version("rfc3161_client"))
-            hashes_media.set("certvalidator v", version("certvalidator"))
+            hashes_media.set("cryptography v", version("cryptography"))
            to_enrich.add_media(hashes_media, id="timestamped_hashes")
            to_enrich.set("timestamped", True)
            logger.success(f"{len(timestamp_tokens)} timestamp tokens created for {url=}")
        else:
            logger.warning(f"No successful timestamps for {url=}")
-    def download_and_verify_certificate(self, signed: bytes) -> list[Media]:
+    def save_timestamp_token(self, timestamp_token: bytes, tsa_url: str) -> str:
        """
        Takes a timestamp token, and saves it to a file with the TSA URL as part of the filename.
        """
        tst_path = os.path.join(self.tmp_dir, f"timestamp_token_{slugify(tsa_url)}")
        with open(tst_path, "wb") as f:
            f.write(timestamp_token)
        return tst_path
    def verify_signed(self, timestamp_response: TimeStampResponse, message: bytes) -> x509.Certificate:
        """
        Verify a Signed Timestamp Response is trusted by a known Certificate Authority.
        Args:
            timestamp_response (TimeStampResponse): The signed timestamp response.
            message (bytes): The message that was timestamped.
        Returns:
            x509.Certificate: The trusted root certificate that verifies the timestamp response, or None if no trusted root verifies it.
        Raises:
            ValueError: If the trusted root store contains no certificates, or if the response uses an unsupported hash algorithm.
        """
        trusted_root_path = self.cert_authorities or certifi.where()
        cert_authorities = []
        with open(trusted_root_path, 'rb') as f:
            cert_authorities = x509.load_pem_x509_certificates(f.read())
        if not cert_authorities:
            raise ValueError(f"No trusted roots found in {trusted_root_path}.")
        timestamp_certs = self.tst_certs(timestamp_response)
        intermediate_certs = timestamp_certs[1:-1]
        message_hash = None
        hash_algorithm = timestamp_response.tst_info.message_imprint.hash_algorithm
        if hash_algorithm == x509.ObjectIdentifier(value="2.16.840.1.101.3.4.2.3"):
            message_hash = hashlib.sha512(message).digest()
        elif hash_algorithm == x509.ObjectIdentifier(value="2.16.840.1.101.3.4.2.1"):
            message_hash = hashlib.sha256(message).digest()
        else:
            raise ValueError(f"Unsupported hash algorithm: {hash_algorithm}")
        for certificate in cert_authorities:
            builder = VerifierBuilder()
            builder.add_root_certificate(certificate)
            for intermediate_cert in intermediate_certs:
                builder.add_intermediate_certificate(intermediate_cert)
            verifier = builder.build()
            try:
                verifier.verify(timestamp_response, message_hash)
                return certificate
            except Rfc3161VerificationError:
                continue
        return None
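The OID comparison in `verify_signed` could also be written as a lookup table. This is a minimal sketch, assuming the two OIDs shown in the diff (SHA-256 and SHA-512) are the only ones supported; `HASHES_BY_OID` and `hash_message` are hypothetical names, and the OID is passed as a dotted string rather than an `x509.ObjectIdentifier`.

```python
import hashlib

# dotted OID string -> hashlib constructor (only the two algorithms handled in the diff)
HASHES_BY_OID = {
    "2.16.840.1.101.3.4.2.1": hashlib.sha256,
    "2.16.840.1.101.3.4.2.3": hashlib.sha512,
}


def hash_message(oid: str, message: bytes) -> bytes:
    """Hash `message` with the algorithm named by `oid`, or raise ValueError."""
    try:
        hasher = HASHES_BY_OID[oid]
    except KeyError:
        raise ValueError(f"Unsupported hash algorithm: {oid}") from None
    return hasher(message).digest()


print(len(hash_message("2.16.840.1.101.3.4.2.1", b"hashes")))  # SHA-256 digests are 32 bytes
```

A table keeps the supported algorithms in one place, which makes adding another OID a one-line change.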
    def sign_data(self, tsa_url: str, bytes_data: bytes) -> TimeStampResponse:
        # see https://github.com/sigstore/sigstore-python/blob/99948d5b80525a5a104e904ffea58169dc6e0629/sigstore/_internal/timestamp.py#L84-L121
        timestamp_request = (
                TimestampRequestBuilder().data(bytes_data).nonce(nonce=True).build()
            )
        try:
            response = self.session.post(tsa_url, data=timestamp_request.as_bytes(), timeout=10)
            response.raise_for_status()
        except requests.RequestException as e:
            logger.error(f"Error while sending request to {tsa_url=}: {e}")
            raise
        # Check that we can parse the response but do not *verify* it
        try:
            timestamp_response = decode_timestamp_response(response.content)
        except ValueError as e:
            logger.error(f"Invalid timestamp response from server {tsa_url}: {e}")
            raise
        return timestamp_response
    def tst_certs(self, tsp_response: TimeStampResponse):
        signed_data: SignedData = tsp_response.signed_data
        certs = [x509.load_der_x509_certificate(c) for c in signed_data.certificates]
        # reorder the certs to be in the correct order
        ordered_certs = []
        if len(certs) == 1:
            return certs
        while len(ordered_certs) < len(certs):
            if len(ordered_certs) == 0:
                for cert in certs:
                    if not [c for c in certs if cert.subject == c.issuer]:
                        ordered_certs.append(cert)
                        break
            else:
                for cert in certs:
                    if cert.subject == ordered_certs[-1].issuer:
                        ordered_certs.append(cert)
                        break
        return ordered_certs
    def save_certificate(self, tsp_response: TimeStampResponse, verified_root_cert: x509.Certificate) -> list[Media]:
        # returns the leaf certificate URL, fails if not set
        tst = ContentInfo.load(signed)
-        trust_roots = []
+        certificates = self.tst_certs(tsp_response)
        with open(certifi.where(), "rb") as f:
            for _, _, der_bytes in pem.unarmor(f.read(), multiple=True):
                trust_roots.append(der_bytes)
        context = ValidationContext(trust_roots=trust_roots)
-        certificates = tst["content"]["certificates"]
+        if verified_root_cert:
-        first_cert = certificates[0].dump()
+            # add the verified root certificate (if there is one - self signed certs will have None here)
-        intermediate_certs = []
+            certificates += [verified_root_cert]
        for i in range(1, len(certificates)):  # cannot use list comprehension [1:]
            intermediate_certs.append(certificates[i].dump())
        validator = CertificateValidator(first_cert, intermediate_certs=intermediate_certs, validation_context=context)
        path = validator.validate_usage({"digital_signature"}, extended_key_usage={"time_stamping"})
        cert_chain = []
-        for cert in path:
+        for i, cert in enumerate(certificates):
-            cert_fn = os.path.join(self.tmp_dir, f"{str(cert.serial_number)[:20]}.crt")
+            cert_fn = os.path.join(self.tmp_dir, f"{i+1} – {str(cert.serial_number)[:20]}.crt")
            with open(cert_fn, "wb") as f:
-                f.write(cert.dump())
+                f.write(cert.public_bytes(encoding=serialization.Encoding.PEM))
-            cert_chain.append(Media(filename=cert_fn).set("subject", cert.subject.native["common_name"]))
+            cert_chain.append(Media(filename=cert_fn).set("subject", cert.subject.get_attributes_for_oid(x509.NameOID.COMMON_NAME)[0].value))
        return cert_chain
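The leaf-to-root ordering done in `tst_certs` can be illustrated without real certificates. This is a minimal sketch under stated assumptions: `Cert` is a hypothetical stand-in holding only subject and issuer names (not an `x509.Certificate`), and `order_chain` is a hypothetical helper. Unlike the loop in the diff, this version stops when the next issuer is missing, so an incomplete chain cannot loop forever.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Hypothetical stand-in for a certificate: just subject and issuer names."""
    subject: str
    issuer: str


def order_chain(certs: list[Cert]) -> list[Cert]:
    """Order certificates leaf first, each followed by its issuer."""
    if len(certs) <= 1:
        return list(certs)
    issuers = {c.issuer for c in certs}
    # the leaf is a certificate that did not issue any other certificate
    leaf = next((c for c in certs if c.subject not in issuers), None)
    if leaf is None:
        raise ValueError("no leaf certificate found")
    ordered = [leaf]
    remaining = [c for c in certs if c is not leaf]
    while remaining:
        parent = next((c for c in remaining if c.subject == ordered[-1].issuer), None)
        if parent is None:
            break  # chain is incomplete: stop instead of looping forever
        ordered.append(parent)
        remaining.remove(parent)
    return ordered


leaf = Cert("tsa.example", "Intermediate CA")
intermediate = Cert("Intermediate CA", "Root CA")
print([c.subject for c in order_chain([intermediate, leaf])])
```

With the chain ordered this way, the first element is the signing certificate and everything after it up to (but excluding) a trusted root is an intermediate.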
--- a/src/auto_archiver/modules/twitter_api_extractor/twitter_api_extractor.py
+++ b/src/auto_archiver/modules/twitter_api_extractor/twitter_api_extractor.py
@@ -2,7 +2,6 @@ import json
 import re
 import mimetypes
 import requests
 from datetime import datetime
 from loguru import logger
 from pytwitter import Api
@@ -10,6 +9,7 @@ from slugify import slugify
 from auto_archiver.core import Extractor
 from auto_archiver.core import Metadata, Media
 from auto_archiver.utils import get_datetime_from_str
 class TwitterApiExtractor(Extractor):
@@ -91,7 +91,7 @@ class TwitterApiExtractor(Extractor):
        result = Metadata()
        result.set_title(tweet.data.text)
-        result.set_timestamp(datetime.strptime(tweet.data.created_at, "%Y-%m-%dT%H:%M:%S.%fZ"))
+        result.set_timestamp(get_datetime_from_str(tweet.data.created_at, "%Y-%m-%dT%H:%M:%S.%fZ"))
        urls = []
        if tweet.includes:
--- a/src/auto_archiver/modules/wacz_extractor_enricher/manifest.py
+++ b/src/auto_archiver/modules/wacz_extractor_enricher/manifest.py
@@ -11,7 +11,7 @@
    "configs": {
        "profile": {
            "default": None,
-            "help": "browsertrix-profile (for profile generation see https://github.com/webrecorder/browsertrix-crawler#creating-and-using-browser-profiles).",
+            "help": "browsertrix-profile (for profile generation see https://crawler.docs.browsertrix.com/user-guide/browser-profiles/).",
        },
        "docker_commands": {"default": None, "help": "if a custom docker invocation is needed"},
        "timeout": {"default": 120, "help": "timeout for WACZ generation in seconds", "type": "int"},
@@ -40,14 +40,31 @@
    Creates .WACZ archives of web pages using the `browsertrix-crawler` tool, with options for media extraction and screenshot saving.
    [Browsertrix-crawler](https://crawler.docs.browsertrix.com/user-guide/) is a headless browser-based crawler that archives web pages in WACZ format.
-    ### Features
+    ## Features
    - Archives web pages into .WACZ format using Docker or direct invocation of `browsertrix-crawler`.
    - Supports custom profiles for archiving private or dynamic content.
    - Extracts media (images, videos, audio) and screenshots from the archive, optionally adding them to the enrichment pipeline.
    - Generates metadata from the archived page's content and structure (e.g., titles, text).
-    ### Notes
+    ## Setup
-    - Requires Docker for running `browsertrix-crawler` .
+
-    - Configurable via parameters for timeout, media extraction, screenshots, and proxy settings.
+    ### Using Docker
    If you are using the Auto Archiver [Docker image](https://auto-archiver.readthedocs.io/en/latest/installation/installation.html#installing-with-docker)
    to run Auto Archiver (recommended), then everything is set up and you can use WACZ out of the box!
    Otherwise, if you are using a local install of Auto Archiver (e.g. pip or dev install), then you will need to install Docker and run 
    the docker daemon to be able to run the `browsertrix-crawler` tool.
    ### Browsertrix Profiles
    A browsertrix profile is a custom browser profile (login information, browser extensions, etc.) that can be used to archive private or dynamic content.
    You can run the WACZ Enricher without a profile, but for more resilient archiving, it is recommended to create a profile.
    See the [Browsertrix documentation](https://crawler.docs.browsertrix.com/user-guide/browser-profiles/) for more information on how to use the `create-login-profile` tool.
    ### Docker in Docker
    If you are running Auto Archiver within a Docker container, you will need to enable Docker in Docker to run the `browsertrix-crawler` tool.
    This can be done by setting the `WACZ_ENABLE_DOCKER` environment variable to `1`.
    """,
 }
--- a/src/auto_archiver/modules/wacz_extractor_enricher/wacz_extractor_enricher.py
+++ b/src/auto_archiver/modules/wacz_extractor_enricher/wacz_extractor_enricher.py
@@ -24,7 +24,8 @@ class WaczExtractorEnricher(Enricher, Extractor):
        self.use_docker = os.environ.get("WACZ_ENABLE_DOCKER") or not os.environ.get("RUNNING_IN_DOCKER")
        self.docker_in_docker = os.environ.get("WACZ_ENABLE_DOCKER") and os.environ.get("RUNNING_IN_DOCKER")
-        self.cwd_dind = f"/crawls/crawls{random_str(8)}"
+        self.crawl_id = random_str(8)
        self.cwd_dind = f"/crawls/crawls{self.crawl_id}"
        self.browsertrix_home_host = os.environ.get("BROWSERTRIX_HOME_HOST")
        self.browsertrix_home_container = os.environ.get("BROWSERTRIX_HOME_CONTAINER") or self.browsertrix_home_host
        # create crawls folder if not exists, so it can be safely removed in cleanup
@@ -50,7 +51,7 @@ class WaczExtractorEnricher(Enricher, Extractor):
        url = to_enrich.get_url()
-        collection = random_str(8)
+        collection = self.crawl_id
        browsertrix_home_host = self.browsertrix_home_host or os.path.abspath(self.tmp_dir)
        browsertrix_home_container = self.browsertrix_home_container or browsertrix_home_host
@@ -85,6 +86,12 @@ class WaczExtractorEnricher(Enricher, Extractor):
        if self.docker_in_docker:
            cmd.extend(["--cwd", self.cwd_dind])
+        if self.auth_for_site(url):
+            # there's an auth for this site, but browsertrix only supports username/password auth
+            logger.warning(
+                "The WACZ enricher / Browsertrix does not support using the 'authentication' information for logging in. You should consider creating a Browser Profile for WACZ archiving. More information: https://auto-archiver.readthedocs.io/en/latest/modules/autogen/extractor/wacz_extractor_enricher.html#browsertrix-profiles"
+            )
        # call docker if explicitly enabled or we are running on the host (not in docker)
        if self.use_docker:
            logger.debug(f"generating WACZ in Docker for {url=}")
@@ -102,10 +109,11 @@ class WaczExtractorEnricher(Enricher, Extractor):
                ] + cmd
            if self.profile:
-                profile_fn = os.path.join(browsertrix_home_container, "profile.tar.gz")
+                profile_file = f"profile-{self.crawl_id}.tar.gz"
+                profile_fn = os.path.join(browsertrix_home_container, profile_file)
                logger.debug(f"copying {self.profile} to {profile_fn}")
                shutil.copyfile(self.profile, profile_fn)
-                cmd.extend(["--profile", os.path.join("/crawls", "profile.tar.gz")])
+                cmd.extend(["--profile", os.path.join("/crawls", profile_file)])
        else:
            logger.debug(f"generating WACZ without Docker for {url=}")
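Review note: the WACZ change above derives the working directory, the collection name, and the copied profile file from one per-instance identifier, so concurrent crawls should no longer overwrite each other's `profile.tar.gz`. A minimal sketch of that naming scheme follows. `random_str` here is a hypothetical stand-in for the project's helper, and `CrawlPaths` is an illustrative name that does not exist in the codebase.

```python
import os
import secrets
import string


def random_str(length: int = 8) -> str:
    """Hypothetical stand-in for the project's random_str helper."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


class CrawlPaths:
    """Illustrative only: groups every name derived from a single crawl id."""

    def __init__(self, browsertrix_home: str):
        self.crawl_id = random_str(8)
        # the same id is reused for the collection, the cwd and the profile copy
        self.collection = self.crawl_id
        self.cwd_dind = f"/crawls/crawls{self.crawl_id}"
        self.profile_file = f"profile-{self.crawl_id}.tar.gz"
        self.profile_fn = os.path.join(browsertrix_home, self.profile_file)
```

Two `CrawlPaths` instances sharing one `browsertrix_home` get different `profile_fn` values, apart from the negligible chance of an id collision.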
--- a/src/auto_archiver/utils/url.py
+++ b/src/auto_archiver/utils/url.py
@@ -4,8 +4,8 @@ from ipaddress import ip_address
 AUTHWALL_URLS = [
-    re.compile(r"https:\/\/t\.me(\/c)\/(.+)\/(\d+)"),  # telegram private channels
+    re.compile(r"https?:\/\/t\.me(\/c)\/(.+)\/(\d+)"),  # telegram private channels
-    re.compile(r"https:\/\/www\.instagram\.com"),  # instagram
+    re.compile(r"https?:\/\/(www\.)?instagram\.com"),  # instagram
 ]
@@ -81,56 +81,43 @@ def is_relevant_url(url: str) -> bool:
    """
    clean_url = remove_get_parameters(url)
-    # favicons
-    if "favicon" in url:
-        return False
-    # ifnore icons
-    if clean_url.endswith(".ico"):
-        return False
-    # ignore SVGs
-    if remove_get_parameters(url).endswith(".svg"):
-        return False
-    # twitter profile pictures
-    if "twimg.com/profile_images" in url:
-        return False
-    if "twimg.com" in url and "/default_profile_images" in url:
-        return False
-    # instagram profile pictures
-    if "https://scontent.cdninstagram.com/" in url and "150x150" in url:
-        return False
-    # instagram recurring images
-    if "https://static.cdninstagram.com/rsrc.php/" in url:
-        return False
-    # telegram
-    if "https://telegram.org/img/emoji/" in url:
-        return False
-    # youtube
-    if "https://www.youtube.com/s/gaming/emoji/" in url:
-        return False
-    if "https://yt3.ggpht.com" in url and "default-user=" in url:
-        return False
-    if "https://www.youtube.com/s/search/audio/" in url:
-        return False
-    # ok
-    if " https://ok.ru/res/i/" in url:
-        return False
-    # vk
-    if "https://vk.com/emoji/" in url:
-        return False
-    if "vk.com/images/" in url:
-        return False
-    if "vk.com/images/reaction/" in url:
-        return False
-    # wikipedia
-    if "wikipedia.org/static" in url:
-        return False
+    IRRELEVANT_URLS = [
+        # favicons
+        ("favicon",),
+        # twitter profile pictures
+        ("twimg.com/profile_images",),
+        ("twimg.com", "default_profile_images"),
+        # instagram profile pictures
+        ("https://scontent.cdninstagram.com/", "150x150"),
+        # instagram recurring images
+        ("https://static.cdninstagram.com/rsrc.php/",),
+        # telegram
+        ("https://telegram.org/img/emoji/",),
+        # youtube
+        ("https://www.youtube.com/s/gaming/emoji/",),
+        ("https://yt3.ggpht.com", "default-user="),
+        ("https://www.youtube.com/s/search/audio/",),
+        # ok
+        ("https://ok.ru/res/i/",),
+        ("https://vk.com/emoji/",),
+        ("vk.com/images/",),
+        ("vk.com/images/reaction/",),
+        # wikipedia
+        ("wikipedia.org/static",),
+    ]
+    IRRELEVANT_ENDS_WITH = [
+        ".svg",  # ignore SVGs
+        ".ico",  # ignore icons
+    ]
+    for end in IRRELEVANT_ENDS_WITH:
+        if clean_url.endswith(end):
+            return False
+    for parts in IRRELEVANT_URLS:
+        if all(part in clean_url for part in parts):
+            return False
    return True
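Review note: the refactor of `is_relevant_url` replaces a chain of `if` statements with two lookup tables, where each tuple lists substrings that must all appear in the cleaned URL. A self-contained sketch of the same idea, with a shortened table, is given below. The `remove_get_parameters` shown here is an assumed stand-in and may differ from the project's implementation.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_get_parameters(url: str) -> str:
    """Assumed stand-in: drop the query string and fragment from a URL."""
    return urlunsplit(urlsplit(url)._replace(query="", fragment=""))


# each tuple is a rule: the URL is irrelevant if ALL parts occur in it
IRRELEVANT_URLS = [
    ("favicon",),
    ("twimg.com", "default_profile_images"),
]
IRRELEVANT_ENDS_WITH = [".svg", ".ico"]


def is_relevant_url(url: str) -> bool:
    clean_url = remove_get_parameters(url)
    # suffix rules are checked against the URL without its query string
    if any(clean_url.endswith(end) for end in IRRELEVANT_ENDS_WITH):
        return False
    # substring rules: one fully matching tuple is enough to reject
    return not any(all(part in clean_url for part in parts) for parts in IRRELEVANT_URLS)
```

Adding a new rule then means appending one tuple rather than writing another branch.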
--- a/src/auto_archiver/utils/webdriver.py
+++ b/src/auto_archiver/utils/webdriver.py
@@ -22,35 +22,35 @@ from loguru import logger
 class CookieSettingDriver(webdriver.Firefox):
    facebook_accept_cookies: bool
-    cookies: str
+    cookie: str
-    cookiejar: MozillaCookieJar
+    cookie_jar: MozillaCookieJar
-    def __init__(self, cookies, cookiejar, facebook_accept_cookies, *args, **kwargs):
+    def __init__(self, cookie, cookie_jar, facebook_accept_cookies, *args, **kwargs):
        if os.environ.get("RUNNING_IN_DOCKER"):
            # Selenium doesn't support linux-aarch64 driver, we need to set this manually
            kwargs["service"] = webdriver.FirefoxService(executable_path="/usr/local/bin/geckodriver")
        super(CookieSettingDriver, self).__init__(*args, **kwargs)
-        self.cookies = cookies
+        self.cookie = cookie
-        self.cookiejar = cookiejar
+        self.cookie_jar = cookie_jar
        self.facebook_accept_cookies = facebook_accept_cookies
    def get(self, url: str):
-        if self.cookies or self.cookiejar:
+        if self.cookie_jar or self.cookie:
            # set up the driver to make it not 'cookie averse' (needs a context/URL)
            # get the 'robots.txt' file which should be quick and easy
            robots_url = urlunparse(urlparse(url)._replace(path="/robots.txt", query="", fragment=""))
            super(CookieSettingDriver, self).get(robots_url)
-            if self.cookies:
+            if self.cookie:
                # an explicit cookie is set for this site, use that first
                for cookie in self.cookies.split(";"):
                    for name, value in cookie.split("="):
                        self.driver.add_cookie({"name": name, "value": value})
-            elif self.cookiejar:
+            elif self.cookie_jar:
-                domain = urlparse(url).netloc
+                domain = urlparse(url).netloc.removeprefix("www.")
                regex = re.compile(f"(www)?.?{domain}$")
-                for cookie in self.cookiejar:
+                for cookie in self.cookie_jar:
                    if regex.match(cookie.domain):
                        try:
                            self.add_cookie(
@@ -145,8 +145,8 @@ class Webdriver:
        try:
            self.driver = CookieSettingDriver(
-                cookies=self.auth.get("cookies"),
+                cookie=self.auth.get("cookie"),
-                cookiejar=self.auth.get("cookies_jar"),
+                cookie_jar=self.auth.get("cookies_jar"),
                facebook_accept_cookies=self.facebook_accept_cookies,
                options=options,
            )
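Review note: the explicit-cookie branch of `CookieSettingDriver.get` works on a single `name=value; name2=value2` string. A minimal sketch of splitting such a string into the `{"name": ..., "value": ...}` dictionaries that Selenium's `add_cookie` accepts is shown here. `parse_cookie_string` is a hypothetical helper and is not part of this diff.

```python
def parse_cookie_string(cookie: str) -> list[dict[str, str]]:
    """Hypothetical helper: turn 'a=1; b=2' into a list of name/value dicts."""
    cookies = []
    for pair in cookie.split(";"):
        pair = pair.strip()
        if "=" not in pair:
            # skip empty segments and malformed entries
            continue
        # split only on the first '=' so that values may contain '=' themselves
        name, value = pair.split("=", 1)
        cookies.append({"name": name.strip(), "value": value.strip()})
    return cookies
```

Each resulting dictionary could then be passed to the driver's `add_cookie` once a page on the target domain has been loaded.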
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -10,7 +10,7 @@ from typing import Dict, Tuple
 import hashlib
 import pytest
-from auto_archiver.core.metadata import Metadata
+from auto_archiver.core.metadata import Metadata, Media
 from auto_archiver.core.module import ModuleFactory
 # Test names inserted into this list will be run last. This is useful for expensive/costly tests
@@ -118,7 +118,7 @@ def pytest_runtest_setup(item):
                pytest.xfail(f"previous test failed ({test_name})")
-@pytest.fixture()
+@pytest.fixture
 def unpickle():
    """
    Returns a helper function that unpickles a file
@@ -140,6 +140,14 @@ def mock_binary_dependencies(mocker):
    return mock_shutil_which
@pytest.fixture
 def sample_media(tmp_path) -> Media:
    """Fixture creating a Media object with temporary source file"""
    src_file = tmp_path / "source.txt"
    src_file.write_text("test content")
    return Media(_key="subdir/test.txt", filename=str(src_file))
@pytest.fixture
 def sample_datetime():
    return datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
--- a/tests/data/test_modules/example_extractor/manifest.py
+++ b/tests/data/test_modules/example_extractor/manifest.py
@@ -0,0 +1,11 @@
 {
    # Display Name of your module
    "name": "Example Extractor",
    # Optional version number, for your own versioning purposes
    "version": 2.0,
    # The type of the module, must be one (or more) of the built in module types
    "type": ["extractor"],
    # a boolean indicating whether or not a module requires additional user setup before it can be used
    # for example: adding API keys, installing additional software etc.
    "requires_setup": False,
 }
--- a/tests/data/test_modules/example_extractor/example_extractor.py
+++ b/tests/data/test_modules/example_extractor/example_extractor.py
@@ -0,0 +1,6 @@
 from auto_archiver.core import Extractor
 class ExampleExtractor(Extractor):
    def download(self, item):
        print("download")
--- a/tests/data/timestamping/digicert.tsr
+++ b/tests/data/timestamping/digicert.tsr
--- a/tests/data/timestamping/rfc3161-client-issue-104.tsr
+++ b/tests/data/timestamping/rfc3161-client-issue-104.tsr
--- a/tests/data/timestamping/self_signed.tsr
+++ b/tests/data/timestamping/self_signed.tsr
--- a/tests/data/timestamping/timestamp_token_http-timestamp-identrust-com
+++ b/tests/data/timestamping/timestamp_token_http-timestamp-identrust-com
--- a/tests/data/timestamping/valid_timestamp.tsr
+++ b/tests/data/timestamping/valid_timestamp.tsr
--- a/tests/enrichers/test_screenshot_enricher.py
+++ b/tests/enrichers/test_screenshot_enricher.py
@@ -85,8 +85,8 @@ def test_enrich_adds_screenshot(
    mock_driver, mock_driver_class, mock_options_instance = mock_selenium_env
    screenshot_enricher.enrich(metadata_with_video)
    mock_driver_class.assert_called_once_with(
-        cookies=None,
+        cookie=None,
-        cookiejar=None,
+        cookie_jar=None,
        facebook_accept_cookies=False,
        options=mock_options_instance,
    )
@@ -124,6 +124,38 @@ def test_enrich_auth_wall(
        assert metadata_with_video.media[1].properties.get("id") == "screenshot"
 def test_skip_authwall_no_cookies(screenshot_enricher, caplog):
    with caplog.at_level("WARNING"):
        screenshot_enricher.enrich(Metadata().set_url("https://instagram.com"))
    assert "[SKIP] SCREENSHOT since url" in caplog.text
@pytest.mark.parametrize(
    "auth",
    [
        {"cookie": "cookie"},
        {"cookies_jar": "cookie"},
    ],
 )
 def test_dont_skip_authwall_with_cookies(screenshot_enricher, caplog, mocker, mock_selenium_env, auth):
    mocker.patch("auto_archiver.utils.url.is_auth_wall", return_value=True)
    # patch the authentication dict:
    screenshot_enricher.authentication = {"example.com": auth}
    with caplog.at_level("WARNING"):
        screenshot_enricher.enrich(Metadata().set_url("https://example.com"))
    assert "[SKIP] SCREENSHOT since url" not in caplog.text
 def test_show_warning_wrong_auth_type(screenshot_enricher, caplog, mocker, mock_selenium_env):
    mock_driver, mock_driver_class, _ = mock_selenium_env
    mocker.patch("auto_archiver.utils.url.is_auth_wall", return_value=True)
    screenshot_enricher.authentication = {"example.com": {"username": "user", "password": "pass"}}
    with caplog.at_level("WARNING"):
        screenshot_enricher.enrich(Metadata().set_url("https://example.com"))
    assert "Screenshot enricher only supports cookie-type authentication" in caplog.text
 def test_handle_timeout_exception(screenshot_enricher, metadata_with_video, mock_selenium_env, mocker):
    mock_driver, mock_driver_class, mock_options_instance = mock_selenium_env
--- a/tests/enrichers/test_timestamping_enricher.py
+++ b/tests/enrichers/test_timestamping_enricher.py
@@ -0,0 +1,215 @@
 from pathlib import Path
 import pytest
 from rfc3161_client import (
    TimeStampResponse,
    decode_timestamp_response,
 )
 import requests
 from auto_archiver.modules.timestamping_enricher.timestamping_enricher import TimestampingEnricher
 from auto_archiver.core import Metadata
@pytest.fixture
 def timestamp_response() -> TimeStampResponse:
    with open("tests/data/timestamping/valid_timestamp.tsr", "rb") as f:
        return decode_timestamp_response(f.read())
@pytest.fixture
 def wrong_order_timestamp_response() -> TimeStampResponse:
    with open("tests/data/timestamping/rfc3161-client-issue-104.tsr", "rb") as f:
        return decode_timestamp_response(f.read())
@pytest.fixture
 def selfsigned_response() -> TimeStampResponse:
    with open("tests/data/timestamping/self_signed.tsr", "rb") as f:
        return decode_timestamp_response(f.read())
@pytest.fixture
 def digicert_response() -> TimeStampResponse:
    with open("tests/data/timestamping/digicert.tsr", "rb") as f:
        return f.read()
@pytest.fixture
 def filehash():
    return "4b7b4e39f12b8c725e6e603e6d4422500316df94211070682ef10260ff5759ef"
@pytest.mark.download
 def test_enriching(setup_module, sample_media):
    tsp: TimestampingEnricher = setup_module("timestamping_enricher")
    # tests the current TSAs set as default in the __manifest__ to make sure they are all still working
    # test the enrich method
    metadata = Metadata().set_url("https://example.com")
    sample_media.set("hash", "4b7b4e39f12b8c725e6e603e6d4422500316df94211070682ef10260ff5759ef")
    metadata.add_media(sample_media)
    tsp.enrich(metadata)
 def test_full_enriching_selfsigned(setup_module, sample_media, mocker, selfsigned_response, filehash):
    mock_post = mocker.patch("requests.sessions.Session.post")
    mock_post.return_value.status_code = 200
    mock_decode_timestamp_response = mocker.patch(
        "auto_archiver.modules.timestamping_enricher.timestamping_enricher.decode_timestamp_response"
    )
    mock_decode_timestamp_response.return_value = selfsigned_response
    tsp: TimestampingEnricher = setup_module("timestamping_enricher", {"tsa_urls": ["http://timestamp.identrust.com"]})
    metadata = Metadata().set_url("https://example.com")
    sample_media.set("hash", filehash)
    metadata.add_media(sample_media)
    tsp.enrich(metadata)
    assert len(metadata.media) == 1  # doesn't allow self-signed
    # set self-signed on tsp
    tsp.allow_selfsigned = True
    tsp.enrich(metadata)
    assert len(metadata.media) == 2
 def test_full_enriching(setup_module, sample_media, mocker, timestamp_response, filehash):
    mock_post = mocker.patch("requests.sessions.Session.post")
    mock_post.return_value.status_code = 200
    mock_decode_timestamp_response = mocker.patch(
        "auto_archiver.modules.timestamping_enricher.timestamping_enricher.decode_timestamp_response"
    )
    mock_decode_timestamp_response.return_value = timestamp_response
    tsp: TimestampingEnricher = setup_module("timestamping_enricher", {"tsa_urls": ["http://timestamp.identrust.com"]})
    metadata = Metadata().set_url("https://example.com")
    sample_media.set("hash", filehash)
    metadata.add_media(sample_media)
    tsp.enrich(metadata)
    assert metadata.get("timestamped") is True
    assert len(metadata.media) == 2  # the original 'sample_media' and the new 'timestamp_media'
    timestamp_media = metadata.media[1]
    assert timestamp_media.filename == f"{tsp.tmp_dir}/hashes.txt"
    assert Path(timestamp_media.filename).read_text() == filehash
    # we only have one authority file because we only used one TSA
    assert len(timestamp_media.get("timestamp_authority_files")) == 1
    timestamp_authority_file = timestamp_media.get("timestamp_authority_files")[0]
    assert Path(timestamp_authority_file.filename).read_bytes() == timestamp_response.time_stamp_token()
    cert_chain = timestamp_authority_file.get("cert_chain")
    assert len(cert_chain) == 3
    assert cert_chain[0].filename == f"{tsp.tmp_dir}/1 – 85078758028491331763.crt"
    assert cert_chain[1].filename == f"{tsp.tmp_dir}/2 – 85078371663472981624.crt"
    assert cert_chain[2].filename == f"{tsp.tmp_dir}/3 – 13298821034946342390.crt"
 def test_full_enriching_multiple_tsa(setup_module, sample_media, mocker, timestamp_response, filehash):
    mock_post = mocker.patch("requests.sessions.Session.post")
    mock_post.return_value.status_code = 200
    mock_decode_timestamp_response = mocker.patch(
        "auto_archiver.modules.timestamping_enricher.timestamping_enricher.decode_timestamp_response"
    )
    mock_decode_timestamp_response.return_value = timestamp_response
    tsp: TimestampingEnricher = setup_module(
        "timestamping_enricher", {"tsa_urls": ["http://example.com/timestamp1", "http://example.com/timestamp2"]}
    )
    metadata = Metadata().set_url("https://example.com")
    sample_media.set("hash", filehash)
    metadata.add_media(sample_media)
    tsp.enrich(metadata)
    assert metadata.get("timestamped") is True
    assert len(metadata.media) == 2  # the original 'sample_media' and the new 'timestamp_media'
    timestamp_media = metadata.media[1]
    assert len(timestamp_media.get("timestamp_authority_files")) == 2
    for timestamp_token_media in timestamp_media.get("timestamp_authority_files"):
        assert Path(timestamp_token_media.filename).read_bytes() == timestamp_response.time_stamp_token()
        assert len(timestamp_token_media.get("cert_chain")) == 3
 def test_fails_for_digicert(setup_module, mocker, digicert_response):
    """
    Digicert TSRs are not compliant with RFC 3161.
    See https://github.com/trailofbits/rfc3161-client/issues/104#issuecomment-2621960840
    """
    mocker.patch("requests.sessions.Session.post", return_value=requests.Response())
    mocker.patch("requests.Response.raise_for_status")
    mocker.patch("requests.Response.content", new_callable=mocker.PropertyMock, return_value=digicert_response)
    tsa_url = "http://timestamp.digicert.com"
    tsp: TimestampingEnricher = setup_module("timestamping_enricher")
    data = b"4b7b4e39f12b8c725e6e603e6d4422500316df94211070682ef10260ff5759ef"
    with pytest.raises(ValueError) as e:
        tsp.sign_data(tsa_url, data)
    assert "ASN.1 parse error: ParseError" in str(e.value)
@pytest.mark.download
 def test_download_tsr(setup_module):
    tsa_url = "http://timestamp.identrust.com"
    tsp: TimestampingEnricher = setup_module("timestamping_enricher")
    data = b"4b7b4e39f12b8c725e6e603e6d4422500316df94211070682ef10260ff5759ef"
    result: TimeStampResponse = tsp.sign_data(tsa_url, data)
    assert isinstance(result, TimeStampResponse)
    verified_root_cert = tsp.verify_signed(result, data)
    assert verified_root_cert.subject.rfc4514_string() == "CN=IdenTrust Commercial Root CA 1,O=IdenTrust,C=US"
    # test downloading the cert
    cert_chain = tsp.save_certificate(result, verified_root_cert)
    assert len(cert_chain) == 3
 def test_verify_save(setup_module, timestamp_response):
    tsp: TimestampingEnricher = setup_module("timestamping_enricher")
    verified_root_cert = tsp.verify_signed(
        timestamp_response, b"4b7b4e39f12b8c725e6e603e6d4422500316df94211070682ef10260ff5759ef"
    )
    assert verified_root_cert.subject.rfc4514_string() == "CN=IdenTrust Commercial Root CA 1,O=IdenTrust,C=US"
    cert_chain = tsp.save_certificate(timestamp_response, verified_root_cert)
    assert len(cert_chain) == 3
    assert cert_chain[0].filename == f"{tsp.tmp_dir}/1 – 85078758028491331763.crt"
    assert cert_chain[1].filename == f"{tsp.tmp_dir}/2 – 85078371663472981624.crt"
    assert cert_chain[2].filename == f"{tsp.tmp_dir}/3 – 13298821034946342390.crt"
 def test_order_crt_correctly(setup_module, wrong_order_timestamp_response):
    # reference: https://github.com/trailofbits/rfc3161-client/issues/104#issuecomment-2711244010
    tsp: TimestampingEnricher = setup_module("timestamping_enricher")
    # get the certificates, make sure the reordering is working:
    ordered_certs = tsp.tst_certs(wrong_order_timestamp_response)
    assert len(ordered_certs) == 2
    assert ordered_certs[0].subject.rfc4514_string() == "CN=TrustID Timestamp Authority,O=IdenTrust,C=US"
    assert ordered_certs[1].subject.rfc4514_string() == "CN=TrustID Timestamping CA 3,O=IdenTrust,C=US"
 def test_invalid_tsa_invalid_response(setup_module, mocker):
    mocker.patch("requests.sessions.Session.post", return_value=requests.Response())
    raise_for_status = mocker.patch("requests.Response.raise_for_status")
    raise_for_status.side_effect = requests.exceptions.HTTPError("404 Client Error")
    tsp = setup_module("timestamping_enricher")
    with pytest.raises(requests.exceptions.HTTPError, match="404 Client Error"):
        tsp.sign_data("http://bellingcat.com/page-not-found/", b"my-message")
 def test_fail_on_selfsigned_cert(setup_module, selfsigned_response):
    tsp = setup_module("timestamping_enricher")
    root_cert = tsp.verify_signed(selfsigned_response, b"my-message")
    assert root_cert is None
--- a/tests/enrichers/test_wacz_enricher.py
+++ b/tests/enrichers/test_wacz_enricher.py
@@ -4,6 +4,7 @@ from zipfile import ZipFile
 import pytest
 from auto_archiver.core import Metadata, Media
 from auto_archiver.core.consts import SetupError
@pytest.fixture
@@ -22,6 +23,15 @@ def wacz_enricher(setup_module, mock_binary_dependencies):
    return wacz
 def test_raises_error_without_docker_installed(setup_module, mocker, caplog):
    # pretend that docker isn't installed
    mocker.patch("shutil.which").return_value = None
    with pytest.raises(SetupError):
        setup_module("wacz_extractor_enricher", {})
    assert "requires external dependency 'docker' which is not available/setup" in caplog.text
 def test_setup_without_docker(wacz_enricher, mocker):
    mocker.patch.dict(os.environ, {"RUNNING_IN_DOCKER": "1"}, clear=True)
    wacz_enricher.setup()
--- a/tests/extractors/test_extractor_base.py
+++ b/tests/extractors/test_extractor_base.py
@@ -25,5 +25,5 @@ class TestExtractorBase(object):
        else:
            assert status == test_response.status
-        assert title == test_response.get_title()
+        assert title in test_response.get_title()
-        assert timestamp, test_response.get("timestamp")
+        assert timestamp == test_response.get("timestamp")
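Review note: the timestamp fix above matters because `assert a, b` treats `b` as the failure message, so the old line passed whenever `timestamp` was truthy. A small sketch of the difference follows; the function names are illustrative only.

```python
def message_form(expected, actual) -> bool:
    """Mirrors the old line: `actual` is only the assertion message."""
    try:
        assert expected, actual
    except AssertionError:
        return False
    return True


def comparison_form(expected, actual) -> bool:
    """Mirrors the new line: the two values are actually compared."""
    try:
        assert expected == actual
    except AssertionError:
        return False
    return True
```

With mismatching values, `message_form` still reports success while `comparison_form` reports failure, which is the behaviour the updated test relies on.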
--- a/tests/extractors/test_generic_extractor.py
+++ b/tests/extractors/test_generic_extractor.py
@@ -40,6 +40,22 @@ class TestGenericExtractor(TestExtractorBase):
        path = os.path.join(dirname(dirname(__file__)), "data/")
        assert self.extractor.dropin_for_name("dropin", additional_paths=[path])
    @pytest.mark.parametrize(
        "url, suitable_extractors",
        [
            ("https://www.youtube.com/watch?v=5qap5aO4i9A", ["youtube"]),
            ("https://www.tiktok.com/@funnycats0ftiktok/video/7345101300750748970?lang=en", ["tiktok"]),
            ("https://www.instagram.com/p/CU1J9JYJ9Zz/", ["instagram"]),
            ("https://www.facebook.com/nytimes/videos/10160796550110716", ["facebook"]),
            ("https://www.facebook.com/BylineFest/photos/t.100057299682816/927879487315946/", ["facebook"]),
        ],
    )
    def test_suitable_extractors(self, url, suitable_extractors):
        suitable_extractors = suitable_extractors + ["generic"]  # the generic is valid for all
        extractors = list(self.extractor.suitable_extractors(url))
        assert len(extractors) == len(suitable_extractors)
        assert [e.ie_key().lower() for e in extractors] == suitable_extractors
    @pytest.mark.parametrize(
        "url, is_suitable",
        [
@@ -55,7 +71,7 @@ class TestGenericExtractor(TestExtractorBase):
            ("https://google.com", True),
        ],
    )
-    def test_suitable_urls(self, make_item, url, is_suitable):
+    def test_suitable_urls(self, url, is_suitable):
        """
        Note: expected behaviour is to return True for all URLs, as YoutubeDLArchiver should be able to handle all URLs
        This behaviour may be changed in the future (e.g. if we want the youtubedl archiver to just handle URLs it has extractors for,
@@ -190,10 +206,11 @@ class TestGenericExtractor(TestExtractorBase):
        self.assertValidResponseMetadata(
            post,
-            "Onion rings are just vegetable donuts.",
+            "Cookie Monster - Onion rings are just vegetable donuts.",
            datetime.datetime(2023, 1, 24, 16, 25, 51, tzinfo=datetime.timezone.utc),
            "yt-dlp_Twitter: success",
        )
        assert post.get("content") == "Onion rings are just vegetable donuts."
    @pytest.mark.download
    def test_twitter_download_video(self, make_item):
@@ -201,7 +218,7 @@ class TestGenericExtractor(TestExtractorBase):
        post = self.extractor.download(make_item(url))
        self.assertValidResponseMetadata(
            post,
-            "Bellingcat - This month's Bellingchat Premium is with @KolinaKoltai. She reveals how she investigated a platform allowing users to create AI-generated child sexual abuse material and explains why it's crucial to investigate the people behind these services",
+            "Bellingcat - This month's Bellingchat Premium is with @KolinaKoltai",
            datetime.datetime(2024, 12, 24, 13, 44, 46, tzinfo=datetime.timezone.utc),
        )
@@ -245,3 +262,32 @@ class TestGenericExtractor(TestExtractorBase):
        self.assertValidResponseMetadata(post, title, timestamp)
        assert len(post.media) == 1
        assert post.media[0].hash == image_hash
    @pytest.mark.download
    def test_download_facebook_video(self, make_item):
        post = self.extractor.download(make_item("https://www.facebook.com/bellingcat/videos/588371253839133"))
        assert len(post.media) == 2
        assert post.media[0].filename.endswith("588371253839133.mp4")
        assert post.media[0].mimetype == "video/mp4"
        assert post.media[1].filename.endswith(".jpg")
        assert post.media[1].mimetype == "image/jpeg"
        assert "Bellingchat Premium is with Kolina Koltai" in post.get_title()
    @pytest.mark.download
    def test_download_facebook_image(self, make_item):
        post = self.extractor.download(
            make_item("https://www.facebook.com/BylineFest/photos/t.100057299682816/927879487315946/")
        )
        assert len(post.media) == 1
        assert post.media[0].filename.endswith(".png")
        assert "Byline Festival - BylineFest Partner" == post.get_title()
    @pytest.mark.download
    def test_download_facebook_text_only(self, make_item):
        url = "https://www.facebook.com/bellingcat/posts/pfbid02rzpwZxAZ8bLkAX8NvHv4DWAidFaqAUfJMbo9vWkpwxL7uMUWzWMiizXLWRSjwihVl"
        post = self.extractor.download(make_item(url))
        assert "Bellingcat researcher Kolina Koltai delves deeper into Clothoff" in post.get("content")
        assert post.get_title() == "Bellingcat"
--- a/tests/extractors/test_tiktok_tikwm_extractor.py
+++ b/tests/extractors/test_tiktok_tikwm_extractor.py
@@ -4,6 +4,8 @@ import pytest
 import yt_dlp
 from auto_archiver.modules.generic_extractor.generic_extractor import GenericExtractor
 from auto_archiver.modules.generic_extractor.tiktok import Tiktok, TikTokIE
 from .test_extractor_base import TestExtractorBase
@@ -17,11 +19,16 @@ def skip_ytdlp_own_methods(mocker):
    )
-@pytest.fixture()
+@pytest.fixture
 def mock_get(mocker):
    return mocker.patch("auto_archiver.modules.generic_extractor.tiktok.requests.get")
@pytest.fixture
 def tiktok_dropin() -> Tiktok:
    return Tiktok()
 class TestTiktokTikwmExtractor(TestExtractorBase):
    """
    Test suite for TestTiktokTikwmExtractor.
@@ -34,6 +41,25 @@ class TestTiktokTikwmExtractor(TestExtractorBase):
    VALID_EXAMPLE_URL = "https://www.tiktok.com/@example/video/1234"
    @pytest.mark.parametrize(
        "url, is_suitable",
        [
            ("https://bellingcat.com", False),
            ("https://youtube.com", False),
            ("https://tiktok.co/", False),
            ("https://tiktok.com/", False),
            ("https://www.tiktok.com/", False),
            ("https://api.cool.tiktok.com/", False),
            (VALID_EXAMPLE_URL, True),
            ("https://www.tiktok.com/@bbcnews/video/7478038212070411542", True),
            ("https://www.tiktok.com/@ggs68taiwan.official/video/7441821351142362375", True),
            ("https://www.tiktok.com/t/ZP8YQ8e5j/", True),
            ("https://vt.tiktok.com/ZSMTJeqRP/", True),
        ],
    )
    def test_is_suitable(self, url, is_suitable, tiktok_dropin):
        assert tiktok_dropin.suitable(url, TikTokIE()) == is_suitable
    def test_invalid_json_responses(self, mock_get, make_item, caplog):
        mock_get.return_value.status_code = 200
        mock_get.return_value.json.side_effect = ValueError
--- a/tests/test_modules.py
+++ b/tests/test_modules.py
@@ -1,6 +1,7 @@
 import pytest
 from auto_archiver.core.module import ModuleFactory, LazyBaseModule
 from auto_archiver.core.base_module import BaseModule
 from auto_archiver.core.consts import SetupError
@pytest.fixture
@@ -25,11 +26,9 @@ def test_python_dependency_check(example_module):
    # monkey patch the manifest to include a nonexistnet dependency
    example_module.manifest["dependencies"]["python"] = ["does_not_exist"]
-    with pytest.raises(SystemExit) as load_error:
+    with pytest.raises(SetupError):
        example_module.load({})
-    assert load_error.value.code == 1
 def test_binary_dependency_check(example_module):
    # example_module requires ffmpeg, which is not installed
@@ -81,8 +80,20 @@ def test_load_modules(module_name):
    # check that default settings are applied
    default_config = module.configs
    assert loaded_module.name in loaded_module.config.keys()
    defaults = {k for k in default_config}
    assert defaults in [loaded_module.config[module_name].keys()]
@pytest.mark.parametrize("module_name", ["local_storage", "generic_extractor", "html_formatter", "csv_db"])
 def test_config_defaults(module_name):
    # test the values of the default config values are set
    # Note: some modules can alter values in the setup() method, this test checks cases that don't
    module = ModuleFactory().get_module_lazy(module_name)
    loaded_module = module.load({})
    # check that default config values are set
    default_config = module.configs
    defaults = {k: v.get("default") for k, v in default_config.items()}
-    assert loaded_module.config[module_name] == defaults
+    assert defaults == loaded_module.config[module_name]
@pytest.mark.parametrize("module_name", ["local_storage", "generic_extractor", "html_formatter", "csv_db"])
--- a/tests/test_orchestrator.py
+++ b/tests/test_orchestrator.py
@@ -4,6 +4,7 @@ from auto_archiver.core.orchestrator import ArchivingOrchestrator
 from auto_archiver.version import __version__
 from auto_archiver.core.config import read_yaml, store_yaml
 from auto_archiver.core import Metadata
 from auto_archiver.core.consts import SetupError
 TEST_ORCHESTRATION = "tests/data/test_orchestration.yaml"
 TEST_MODULES = "tests/data/test_modules/"
@@ -224,3 +225,15 @@ def test_multiple_orchestrator(test_args):
    output: Metadata = list(o2.feed())
    assert len(output) == 1
    assert output[0].get_url() == "https://example.com"
 def test_wrong_step_type(test_args, caplog):
    args = test_args + [
        "--feeders",
        "example_extractor",  # example_extractor is not a valid feeder!
    ]
    orchestrator = ArchivingOrchestrator()
    with pytest.raises(SetupError) as err:
        orchestrator.setup(args)
        assert "Module 'example_extractor' is not a feeder" in str(err.value)
--- a/tests/utils/test_urls.py
+++ b/tests/utils/test_urls.py
@@ -0,0 +1,113 @@
 import pytest
 from auto_archiver.utils.url import (
    is_auth_wall,
    check_url_or_raise,
    domain_for_url,
    is_relevant_url,
    remove_get_parameters,
    twitter_best_quality_url,
 )
@pytest.mark.parametrize(
    "url, is_auth",
    [
        ("https://example.com", False),
        ("https://t.me/c/abc/123", True),
        ("https://t.me/not-private/", False),
        ("https://instagram.com", True),
        ("https://www.instagram.com", True),
        ("https://www.instagram.com/p/INVALID", True),
        ("https://www.instagram.com/p/C4QgLbrIKXG/", True),
    ],
 )
 def test_is_auth_wall(url, is_auth):
    assert is_auth_wall(url) == is_auth
@pytest.mark.parametrize(
    "url, raises",
    [
        ("http://example.com", False),
        ("https://example.com", False),
        ("ftp://example.com", True),
        ("http://localhost", True),
        ("http://", True),
    ],
 )
 def test_check_url_or_raise(url, raises):
    if raises:
        with pytest.raises(ValueError):
            check_url_or_raise(url)
    else:
        assert check_url_or_raise(url)
@pytest.mark.parametrize(
    "url, domain",
    [
        ("https://example.com", "example.com"),
        ("https://www.example.com", "www.example.com"),
        ("https://www.example.com/path", "www.example.com"),
        ("https://", ""),
        ("http://localhost", "localhost"),
    ],
 )
 def test_domain_for_url(url, domain):
    assert domain_for_url(url) == domain
@pytest.mark.parametrize(
    "url, without_get",
    [
        ("https://example.com", "https://example.com"),
        ("https://example.com?utm_source=example", "https://example.com"),
        ("https://example.com?utm_source=example&other=1", "https://example.com"),
        ("https://example.com/something", "https://example.com/something"),
        ("https://example.com/something?utm_source=example", "https://example.com/something"),
    ],
 )
 def test_remove_get_parameters(url, without_get):
    assert remove_get_parameters(url) == without_get
@pytest.mark.parametrize(
    "url, relevant",
    [
        ("https://example.com", True),
        ("https://example.com/favicon.ico", False),
        ("https://twimg.com/profile_images", False),
        ("https://twimg.com/something/default_profile_images", False),
        ("https://scontent.cdninstagram.com/username/150x150.jpg", False),
        ("https://static.cdninstagram.com/rsrc.php/", False),
        ("https://telegram.org/img/emoji/", False),
        ("https://www.youtube.com/s/gaming/emoji/", False),
        ("https://yt3.ggpht.com/default-user=", False),
        ("https://www.youtube.com/s/search/audio/", False),
        ("https://ok.ru/res/i/", False),
        ("https://vk.com/emoji/", False),
        ("https://vk.com/images/", False),
        ("https://vk.com/images/reaction/", False),
        ("https://wikipedia.org/static", False),
        ("https://example.com/file.svg", False),
        ("https://example.com/file.ico", False),
        ("https://example.com/file.mp4", True),
        ("https://example.com/150x150.jpg", True),
        ("https://example.com/rsrc.php/", True),
        ("https://example.com/img/emoji/", True),
    ],
 )
 def test_is_relevant_url(url, relevant):
    assert is_relevant_url(url) == relevant
@pytest.mark.parametrize(
    "url, best_quality",
    [
        ("https://twitter.com/some_image.jpg?name=small", "https://twitter.com/some_image.jpg?name=orig"),
        ("https://twitter.com/some_image.jpg", "https://twitter.com/some_image.jpg"),
        ("https://twitter.com/some_image.jpg?name=orig", "https://twitter.com/some_image.jpg?name=orig"),
    ],
 )
 def test_twitter_best_quality_url(url, best_quality):
    assert twitter_best_quality_url(url) == best_quality
Author	SHA1	Message	Date
Patrick Robertson	a9ff55a36e	Merge pull request #278 from bellingcat/dependabot_fix This force-pins cryptography to >44.0.1 to fix dependabot warning	2025-03-26 11:57:35 +00:00
Patrick Robertson	5bb0cbf3ff	Lock poetry file	2025-03-26 15:43:03 +04:00
Patrick Robertson	3eb9ffddfe	This force-pins cryptography to >44.0.1 to fix dependabot warning pyOpenSSL also no longer needed	2025-03-26 15:39:53 +04:00
Patrick Robertson	43a80dbcda	Merge pull request #224 from bellingcat/timestamping_rewrite Timestamping rewrite	2025-03-26 11:25:55 +00:00
Patrick Robertson	cb3ae055d6	Also remove certvalidator from poetry/project	2025-03-26 15:11:25 +04:00
Patrick Robertson	0073a08525	Update manifest dependencies to remove tsp_client et al.	2025-03-26 14:57:55 +04:00
Patrick Robertson	46e31808f6	Version bump	2025-03-26 14:54:33 +04:00
Patrick Robertson	4af23e13d1	Bump rfc3161-client to 1.0.1	2025-03-26 14:50:12 +04:00
Patrick Robertson	d6be1ff84f	Merge branch 'main' into timestamping_rewrite	2025-03-26 14:37:51 +04:00
Patrick Robertson	74974ef0ed	Merge pull request #268 from bellingcat/minor_improvements Minor improvements	2025-03-25 12:52:08 +00:00
Patrick Robertson	5c6005d843	Merge pull request #269 from bellingcat/update-dependabot Add explicit dependabots for pip/poetry, GH actions and npm	2025-03-25 06:30:24 +00:00
Patrick Robertson	d6a7f31248	Add note that authentication only works for some modules	2025-03-24 18:28:35 +04:00
Patrick Robertson	8aba663534	Update node module versions	2025-03-24 18:28:30 +04:00
Patrick Robertson	ace97ac7fd	Don't run ruff on non-python file changes	2025-03-24 18:00:14 +04:00
Patrick Robertson	ad373ae733	Add explicit dependabots for pip/poetry, GH actiona and npm	2025-03-24 17:57:53 +04:00
Patrick Robertson	260e76dd3d	Update dependencies	2025-03-24 17:48:25 +04:00
Patrick Robertson	a9fe959ea1	Fix unit tests for latest yt-dlp (Yt-dlp title is now truncated)	2025-03-24 17:48:15 +04:00
Patrick Robertson	beb7f3893d	Add comments/notes to WACZ enricher about browser profiles	2025-03-24 17:39:47 +04:00
Patrick Robertson	5055402c2a	Bump browsertrix version	2025-03-24 17:39:44 +04:00
Patrick Robertson	3c4625d708	Further ruff tweaks	2025-03-24 16:39:59 +04:00
Patrick Robertson	31fa7380f5	Fix up unit tests + issue when working with self-signed certs	2025-03-24 16:00:40 +04:00
Patrick Robertson	396ec03bae	Tidy up unit tests further + make more non-download	2025-03-24 15:26:22 +04:00
Patrick Robertson	e811196711	Ruff fixes	2025-03-24 15:10:46 +04:00
Patrick Robertson	dfde6f1995	Merge main into timestamping_enricher	2025-03-24 15:09:29 +04:00
Miguel Sozinho Ramalho	7b454baa02	Create dependabot.yml	2025-03-24 10:49:36 +00:00
Patrick Robertson	0f9c6a9a5c	Update yt-dlp to latest	2025-03-24 14:49:18 +04:00
Patrick Robertson	c980500978	Actually restart AA after updating yt-dlp. A simple 'importlib.reload()' doesn't take into account all imports	2025-03-24 14:33:59 +04:00
Patrick Robertson	01516724d3	Merge pull request #264 from bellingcat/minor_fixes Minor fixes	2025-03-21 10:49:39 +00:00
Patrick Robertson	a066bf4ca9	Clean up comments	2025-03-21 14:47:50 +04:00
Patrick Robertson	2233af81f7	Version bump	2025-03-21 14:33:08 +04:00
Patrick Robertson	aacb874b56	removeprefix for www. is required here	2025-03-21 12:23:45 +04:00
Patrick Robertson	4b5a8c0199	Add warning inside instagram_extractor that it's not actively maintained	2025-03-21 12:09:58 +04:00
Patrick Robertson	14c56f4916	Provide better logs for screenshot enricher when auth is/isn't supported (cookies only)	2025-03-21 12:05:47 +04:00
Patrick Robertson	5b131996c6	Add return type for auth_for_site	2025-03-21 11:55:12 +04:00
Patrick Robertson	168dfb6254	Unit tests for url utils	2025-03-21 11:53:47 +04:00
Patrick Robertson	42e16aebd6	Merge pull request #255 from bellingcat/autogenerate_services_account Script to auto-generate a service account	2025-03-20 18:00:45 +00:00
Patrick Robertson	d6d5a08204	Allow user to save downloaded keyfile to a different folder	2025-03-20 20:45:28 +04:00
Patrick Robertson	e6c5705f70	Merge pull request #261 from bellingcat/wacz_separate_profile Wacz minor adjustments	2025-03-20 15:51:56 +00:00
Erin Clark	613ba0c05d	Merge pull request #262 from bellingcat/generic_extractor_args Add flexible extractor_args to generic_extractor.py This allows users to pass any of the options listed [here](https://github.com/yt-dlp/yt-dlp/blob/master/README.md#extractor-arguments) to yt-dlp extractor_args. example usage: ``` generic_extractor: facebook_cookie: ... extractor_args: youtube: player_client: web,tv generic: is_live: true ```	2025-03-20 15:38:20 +00:00
Patrick Robertson	b997bbea2b	Merge pull request #263 from bellingcat/wrong_steps When loading modules, check they have been added to the right 'step' in the config	2025-03-20 15:31:38 +00:00
erinhmclark	54f53886ef	Update tests for default config values	2025-03-20 14:57:26 +00:00
Patrick Robertson	0a5ba3385e	Fix small bug in twitter dropin - previously the 'content' was being set to a json dump of the tweet, it should be set to full_text	2025-03-20 18:55:22 +04:00
Patrick Robertson	034857075d	Merge branch 'main' into wrong_steps	2025-03-20 18:44:19 +04:00
Patrick Robertson	6700250891	Add a test for checking module type on setup	2025-03-20 18:18:53 +04:00
Patrick Robertson	5e5e1c43a1	When loading modules, check they have been added to the right 'step' in the config Fixes an issue seen on discord where a user accidentally set up metadata_enricher under 'extractors'	2025-03-20 18:09:26 +04:00
Patrick Robertson	1e19ad77c6	Fix tests	2025-03-20 18:08:19 +04:00
Patrick Robertson	f22af5e123	Tweak WACZ enricher docs + add comment on WACZ_ENABLE_DOCKER	2025-03-20 16:48:30 +04:00
Patrick Robertson	799cef3a8c	Cleanup docker-compose	2025-03-20 16:48:30 +04:00
erinhmclark	2921061fde	Add flexible extractor_args to generic_extractor.py	2025-03-19 19:19:28 +00:00
Patrick Robertson	e531906d73	Create an independent profile file for each wacz_extractor_enricher instance	2025-03-19 18:08:24 +04:00
Patrick Robertson	244341d22c	Skip check for 'docker' bin dependency if already running in docker	2025-03-19 18:08:04 +04:00
Erin Clark	90932a7bc8	Merge pull request #259 from bellingcat/fix_youtube_generic Small fix for generic_extractor.py for general/ youtube extraction.	2025-03-19 11:52:56 +00:00
Patrick Robertson	488675056b	Download generate_google_services.sh script from GH - it's not packaged with the app	2025-03-19 15:52:39 +04:00
erinhmclark	a577228465	Update generic_extractor.py for general/ youtube extraction.	2025-03-18 21:10:06 +00:00
Miguel Sozinho Ramalho	f6863b8eb2	Update src/auto_archiver/modules/gsheet_feeder_db/__manifest__.py	2025-03-18 14:10:47 +00:00
Miguel Sozinho Ramalho	5c34ac1293	Update docs/source/how_to/gsheets_setup.md	2025-03-18 14:05:23 +00:00
Patrick Robertson	7d972ee9b8	Merge pull request #258 from bellingcat/version_bump Version bump	2025-03-18 12:18:09 +00:00
Patrick Robertson	b64826dc16	Merge pull request #257 from bellingcat/standardise_parsedates Standardise parse dates to get_datetime_from_str	2025-03-18 12:17:51 +00:00
Patrick Robertson	23e74803ee	Version bump	2025-03-18 10:52:23 +00:00
Patrick Robertson	d03ecdb037	Standardise parse dates to get_datetime_from_str	2025-03-18 10:22:58 +00:00
Patrick Robertson	a5ebbf4726	Merge pull request #256 from bellingcat/dropin_cleanup Refactor the dropin 'is_suitable' method + fix for tikwm	2025-03-18 10:08:24 +00:00
Patrick Robertson	89e387030d	Tests for suitable URLs for tikwm	2025-03-18 10:04:03 +00:00
Patrick Robertson	8ec053ed1b	Refactor the dropin 'is_suitable' method + fix tikwm implementation Makes it easier to maintain/understand.	2025-03-18 09:14:14 +00:00
Patrick Robertson	29db537fab	Docs on using the script to auto-generate service accounts	2025-03-17 18:11:18 +00:00
Patrick Robertson	c4a3a45bf7	Script to auto-generate a service account	2025-03-17 15:42:43 +00:00
Patrick Robertson	3ea02c115e	Merge pull request #254 from bellingcat/rtd_docs Add info on building RTD versions + automated building of tagged versions	2025-03-17 13:01:20 +00:00
Patrick Robertson	3d4056ef70	Merge pull request #223 from bellingcat/facebook_extractor Create facebook dropin - working for images + text.	2025-03-17 12:45:05 +00:00
Patrick Robertson	51041bf91e	Merge pull request #253 from bellingcat/settings_page Update material version, minify code	2025-03-17 11:59:37 +00:00
Patrick Robertson	0765640bff	Fix up tiktok dropin for slightly modified generic_extractor format	2025-03-17 10:31:22 +00:00
Patrick Robertson	06b1f4c0ca	Fix lingering merge conflict issues	2025-03-17 10:12:55 +00:00
Patrick Robertson	59b910ec30	Merge main	2025-03-17 10:05:11 +00:00
Patrick Robertson	7e360240bf	Copy ytdlp code into AA project - seems like ytdlp won't be merged anytime soon	2025-03-17 09:57:05 +00:00
Patrick Robertson	89ee6f19b6	List out all valid TSAs + option for users to allow self signed if they want	2025-03-11 16:12:13 +00:00
Patrick Robertson	294033f156	Fix bug ordering tsr that only have one cert + more unit tests	2025-03-11 15:44:04 +00:00
Patrick Robertson	2ffe124d95	Add unit test for invalid digicert tsrs	2025-03-11 11:13:36 +00:00
Patrick Robertson	1db8be91db	Improved unit tests for timestamping	2025-03-11 11:08:52 +00:00
Patrick Robertson	3f6acc0917	fully working timestamping enricher	2025-03-11 10:04:46 +00:00
Patrick Robertson	a0869bb3b2	Fixed up timestamp verifying - waiting on issue with rfc-client to be fixed Ref: https://github.com/trailofbits/rfc3161-client/issues/104#issuecomment-2693890607	2025-03-03 10:28:30 +00:00
Patrick Robertson	afc117a229	Get downloading certs working	2025-02-26 09:33:56 +00:00
Patrick Robertson	4dcb77c29f	Merge branch 'main' into timestamping_rewrite	2025-02-25 17:10:55 +00:00
Patrick Robertson	898faf6fe4	Further WIP - currently working on verify_signed	2025-02-25 12:08:08 +00:00
Patrick Robertson	6987a4827e	Set poetry packages - remove tsp_client and update cryptography	2025-02-25 11:57:20 +00:00
Patrick Robertson	f8e846d59a	Create facebook dropin - working for images + text. CAVEAT: only gets the first ~100 chars of the post at the moment	2025-02-25 11:44:35 +00:00
Patrick Robertson	01bf88a695	Merge branch 'main' into timestamping_rewrite	2025-02-24 12:03:14 +00:00
Patrick Robertson	d0c379a3ba	WIP - timestamping enricher	2025-02-11 18:18:19 +00:00
Patrick Robertson	3163cb793a	Fix timestamping enricher for new module structure (temp paths)	2025-02-11 15:26:40 +00:00
Patrick Robertson	7bb4d68a22	Merge branch 'load_modules' into timestamping_rewrite	2025-02-11 15:21:31 +00:00
Patrick Robertson	4c1c8953ca	Add unit tests for timestamping_enricher	2025-01-29 12:20:52 +01:00