Commit Graph

150 Commits

Author SHA1 Message Date
Tristan Lee
6fa1e5026c made downloading more robust against transient and permanent errors, fixed issue where media file URLs weren't being updated after scraping 2023-09-09 00:42:56 -05:00
Tristan Lee
1f4b956ce9 made scraping more robust against transient playwright exceptions, set order of hashtags to scrape based on file modified time 2023-09-07 11:18:22 -05:00
Tristan Lee
91a8aaef38 added video link to msToken input, improved handling of output directories without write permission (and added relevant unit test), removed unused requirements.txt things 2023-09-06 19:51:16 -05:00
Tristan Lee
6a56c354e1 Update README.md 2023-09-06 13:17:27 -05:00
Tristan Lee
900d6adc69 Merge pull request #20 from bellingcat/refactor: Refactor 2023-09-06 09:53:57 -05:00
Tristan Lee
10821e30f2 preparing for publishing (removed pipenv commands from workflow, added Contributing section on README, added functionality to pin dependency versions with requirements.txt) 2023-09-06 09:51:31 -05:00
Tristan Lee
8c32a3cf16 updated README, made yt-dlp downloading more robust against errors, changed name of videos folder to media (since images and audio files are also downloaded now) 2023-09-04 13:51:28 -05:00
Tristan Lee
5ae9624968 added tests, changed __main__ to cli 2023-09-04 13:26:38 -05:00
Tristan Lee
0f8e865bf3 added type hints for auth, incorporated auth into base module 2023-09-04 10:40:30 -05:00
Tristan Lee
cf575e6cf6 updated README and added authorization 2023-09-01 18:33:32 -05:00
Tristan Lee
a7bd023c21 simplified downloading logic (methods for keeping track of files less necessary since scraping can be done in Python), added functionality to use yt-dlp to download videos, added functionality to download TikTok image galleries 2023-09-01 17:05:13 -05:00
Miguel Sozinho Ramalho
06b4a74c7d Update README.md 2023-03-16 09:48:50 +00:00
msramalho
7b63b9f349 Bump version to v1.0.4 for release v1.0.4 2023-03-13 10:08:42 +00:00
msramalho
e1ac3b5057 fixing not founds 2023-03-13 10:08:35 +00:00
msramalho
f962878354 Bump version to v1.0.3 for release v1.0.3 2023-03-13 09:54:14 +00:00
msramalho
6b4ceaae61 attempts at fixing CLI issues #18 2023-03-13 09:54:06 +00:00
msramalho
c4aa5a6cc5 Bump version to v1.0.2 for release v1.0.2 2023-02-13 16:55:27 +00:00
msramalho
ad9cac8cdd readme fix 2023-02-13 16:54:20 +00:00
msramalho
2968ada6c8 Bump version to v1.0.1 for release v1.0.1 2023-02-13 16:52:23 +00:00
msramalho
4f81673f04 Bump version to v1.0.0 for release v1.0.0 2023-02-13 16:49:54 +00:00
Miguel Sozinho Ramalho
14eaae0f20 Merge pull request #17 from rly0nheart/main: WIP 2023-02-13 16:49:02 +00:00
msramalho
980a27ff96 pypi fixes 2023-02-13 16:48:26 +00:00
Richard Mwewa
83fe050c15 Update run_downloader.py 2023-01-19 03:41:10 +02:00
Richard Mwewa
f8c12a8d68 Update hashtag_frequencies.py 2023-01-19 03:40:37 +02:00
Richard Mwewa
99467f0e91 Update README.md 2023-01-19 03:35:34 +02:00
Richard Mwewa
fb4755244f Update main.py 2023-01-19 03:32:04 +02:00
Richard Mwewa
5df653ccef Create Dockerfile 2023-01-19 03:29:10 +02:00
Richard Mwewa
4c69f616e6 Create setup.py 2023-01-19 03:26:38 +02:00
Richard Mwewa
9dd22c90c7 Create main.py 2023-01-19 03:17:11 +02:00
Richard Mwewa
5f4eb9f2c8 Refactored for PyPI 2023-01-19 03:15:28 +02:00
Richard Mwewa
1409c50034 Refactored for PyPI 2023-01-19 03:13:20 +02:00
johannawild
a0f4320635 Update README.md 2022-05-16 13:14:40 +02:00
johannawild
c3d9b415c6 Update README.md 2022-05-16 13:13:57 +02:00
johannawild
db08aacab5 Update README.md 2022-05-16 13:12:54 +02:00
johannawild
99a0b16d66 Update README.md 2022-05-16 13:11:50 +02:00
johannawild
26b4bcc00d Update README.md 2022-05-10 16:34:24 +02:00
johannawild
5866763adc Update README.md 2022-05-10 16:34:05 +02:00
johannawild
2c345fa27a Update README.md 2022-05-10 16:33:38 +02:00
X
41007a8fa6 changed filehandler to debug level to capture logged data 2022-05-06 12:20:00 +02:00
X
280303f461 changed filehandler level to INFO and changed Logger to files 2022-05-06 11:54:00 +02:00
johannawild
161699d2b9 Update README.md 2022-05-06 11:24:42 +02:00
johannawild
474e39568b Merge pull request #5 from bellingcat/even_more_tristan_edits: Finishing touches 2022-05-06 10:42:40 +02:00
Tristan Lee
e0f55145e1 fixed typo in error message 2022-05-06 03:39:33 -05:00
X
52338d47de Add total posts to the hashtag_frequencies console printing 2022-05-06 10:25:10 +02:00
Tristan Lee
21b404ff57 renamed source directory 2022-05-06 03:13:40 -05:00
Tristan Lee
f377408960 updated README with new hashtag_frequencies table 2022-05-06 02:57:56 -05:00
Tristan Lee
6bddcfb238 modified formatting of print_occurrences function 2022-05-06 02:56:38 -05:00
Tristan Lee
f77214c71f fixed typo in hashtag_frequencies.plot 2022-05-06 02:49:46 -05:00
Tristan Lee
595a6e6535 specified filepath argument in tiktok-scraper to avoid chdir commands 2022-05-06 02:36:21 -05:00
Tristan Lee
0cb9d4b1b9 made docstrings more consistent, changed argument of hashtag_frequencies script to use the hashtag rather than the post_id file for the hashtag, to make it easier to use 2022-05-06 01:49:55 -05:00