173 Commits

Author SHA1 Message Date
Tristan Lee
2a4af674ef Merge pull request #30 from bellingcat/change-default-limit
Reduce video limit to avoid error
2024-03-11 18:54:51 -05:00
Galen Reich
3f4fd2606b Bump version 2024-03-08 14:50:07 +00:00
Galen Reich
d2fb0cc484 change video limit to avoid error 2024-03-08 14:44:55 +00:00
Tristan Lee
18e2c4de3e Merge pull request #26 from bellingcat/error-module
used updated Playwright error module in import
2023-12-08 05:19:17 -06:00
Tristan Lee
6369e9579a used updated Playwright error module in import v2.0.3 2023-12-08 05:13:03 -06:00
Tristan Lee
efa3a47984 changed publishing workflow file to hopefully avoid making a new PyPI package on every push to main, and incremented version patch number v2.0.2 2023-09-21 06:14:59 -05:00
Tristan Lee
8a416c098d Merge pull request #23 from bellingcat/workflow-tests
Workflow tests
2023-09-21 06:00:11 -05:00
Tristan Lee
c7f2db1f9d no way to robustly set Windows files to readonly, so removed those tests 2023-09-21 05:51:28 -05:00
Tristan Lee
89d89521fa added headed argument to more robustly handle issues with scraper's headless mode 2023-09-19 16:30:13 -05:00
Tristan Lee
0bd87f944e removed tmate from windows workflow 2023-09-19 00:52:12 -05:00
Tristan Lee
d2a0e3d5ad updated README from main 2023-09-19 00:51:05 -05:00
Tristan Lee
fee300d4d7 reorganized output directory parsing tests 2023-09-19 00:45:56 -05:00
Tristan Lee
e548b6fca9 Update README.md (fixed typo) 2023-09-15 16:15:19 -05:00
Tristan Lee
0b273cb7bd Updated README.md to include Playwright installation command 2023-09-15 16:14:58 -05:00
Tristan Lee
7603d9c769 attempting to debug Windows workflow 2023-09-15 02:55:12 -05:00
Tristan Lee
8e10c93e31 made the process_output_dir function more reliable on Windows 2023-09-15 02:43:30 -05:00
Tristan Lee
fc61489def added tests for Windows environment 2023-09-15 01:31:37 -05:00
Tristan Lee
847fcb55cb Merge branch 'main' into workflow-tests 2023-09-15 01:27:07 -05:00
Tristan Lee
4836fd93aa updated README with playwright installation command, added pytest workflow 2023-09-15 01:26:51 -05:00
Tristan Lee
ea4da1b700 Merge pull request #22 from bellingcat/adding-token-video
merged
2023-09-12 11:29:17 -05:00
Tristan Lee
92ae29c722 updated version 2023-09-12 11:26:07 -05:00
Tristan Lee
b916512bde removed auth module and authorization, since msToken isn't actually required to run scraper 2023-09-11 21:43:33 -05:00
Tristan Lee
92861e0e5d configured verbosity argument with logging level 2023-09-11 21:29:37 -05:00
Tristan Lee
6fa1e5026c made downloading more robust against transient and permanent errors, fixed issue where media file URLs weren't being updated after scraping 2023-09-09 00:42:56 -05:00
Tristan Lee
1f4b956ce9 made scraping more robust against transient Playwright exceptions, set order of hashtags to scrape based on file modified time 2023-09-07 11:18:22 -05:00
Tristan Lee
91a8aaef38 added video link to msToken input, improved handling of output directories without write permission (and added relevant unit test), removed unused requirements.txt things 2023-09-06 19:51:16 -05:00
Tristan Lee
6a56c354e1 Update README.md 2023-09-06 13:17:27 -05:00
Tristan Lee
900d6adc69 Merge pull request #20 from bellingcat/refactor
Refactor
2023-09-06 09:53:57 -05:00
Tristan Lee
10821e30f2 preparing for publishing (removed pipenv commands from workflow, added Contributing section on README, added functionality to pin dependency versions with requirements.txt) 2023-09-06 09:51:31 -05:00
Tristan Lee
8c32a3cf16 updated README, made yt-dlp downloading more robust against errors, changed name of videos folder to media (since images and audio files are also downloaded now) 2023-09-04 13:51:28 -05:00
Tristan Lee
5ae9624968 added tests, changed __main__ to cli 2023-09-04 13:26:38 -05:00
Tristan Lee
0f8e865bf3 added type hints for auth, incorporated auth into base module 2023-09-04 10:40:30 -05:00
Tristan Lee
cf575e6cf6 updated README and added authorization 2023-09-01 18:33:32 -05:00
Tristan Lee
a7bd023c21 simplified downloading logic (methods for keeping track of files less necessary since scraping can be done in Python), added functionality to use yt-dlp to download videos, added functionality to download TikTok image galleries 2023-09-01 17:05:13 -05:00
Miguel Sozinho Ramalho
06b4a74c7d Update README.md 2023-03-16 09:48:50 +00:00
msramalho
7b63b9f349 Bump version to v1.0.4 for release v1.0.4 2023-03-13 10:08:42 +00:00
msramalho
e1ac3b5057 fixing not founds 2023-03-13 10:08:35 +00:00
msramalho
f962878354 Bump version to v1.0.3 for release v1.0.3 2023-03-13 09:54:14 +00:00
msramalho
6b4ceaae61 attempts at fixing CLI issues #18 2023-03-13 09:54:06 +00:00
msramalho
c4aa5a6cc5 Bump version to v1.0.2 for release v1.0.2 2023-02-13 16:55:27 +00:00
msramalho
ad9cac8cdd readme fix 2023-02-13 16:54:20 +00:00
msramalho
2968ada6c8 Bump version to v1.0.1 for release v1.0.1 2023-02-13 16:52:23 +00:00
msramalho
4f81673f04 Bump version to v1.0.0 for release v1.0.0 2023-02-13 16:49:54 +00:00
Miguel Sozinho Ramalho
14eaae0f20 Merge pull request #17 from rly0nheart/main
WIP
2023-02-13 16:49:02 +00:00
msramalho
980a27ff96 pypi fixes 2023-02-13 16:48:26 +00:00
Richard Mwewa
83fe050c15 Update run_downloader.py 2023-01-19 03:41:10 +02:00
Richard Mwewa
f8c12a8d68 Update hashtag_frequencies.py 2023-01-19 03:40:37 +02:00
Richard Mwewa
99467f0e91 Update README.md 2023-01-19 03:35:34 +02:00
Richard Mwewa
fb4755244f Update main.py 2023-01-19 03:32:04 +02:00
Richard Mwewa
5df653ccef Create Dockerfile 2023-01-19 03:29:10 +02:00