mirror of
https://github.com/bellingcat/vk-url-scraper.git
synced 2026-06-07 19:08:38 +03:00
cleanup
2
.github/pull_request_template.md
vendored
@@ -10,7 +10,7 @@ Changes proposed in this pull request:
|
|||||||
## Before submitting
|
## Before submitting
|
||||||
|
|
||||||
<!-- Please complete this checklist BEFORE submitting your PR to speed along the review process. -->
|
<!-- Please complete this checklist BEFORE submitting your PR to speed along the review process. -->
|
||||||
- [ ] I've read and followed all steps in the [Making a pull request](https://github.com/bellingcat/vk-url-scraper/blob/main/CONTRIBUTING.md#making-a-pull-request)
|
- [ ] I've read and followed all steps in the [Making a pull request](https://github.com/bellingcat/vk-url-scraper/blob/main/CONTRIBUTING.md#making-a-pull-request)
|
||||||
section of the `CONTRIBUTING` docs.
|
section of the `CONTRIBUTING` docs.
|
||||||
- [ ] I've updated or added any relevant docstrings following the syntax described in the
|
- [ ] I've updated or added any relevant docstrings following the syntax described in the
|
||||||
[Writing docstrings](https://github.com/bellingcat/vk-url-scraper/blob/main/CONTRIBUTING.md#writing-docstrings) section of the `CONTRIBUTING` docs.
|
[Writing docstrings](https://github.com/bellingcat/vk-url-scraper/blob/main/CONTRIBUTING.md#writing-docstrings) section of the `CONTRIBUTING` docs.
|
||||||
|
|||||||
4
.github/workflows/main.yml
vendored
@@ -121,10 +121,6 @@ jobs:
|
|||||||
name: package
|
name: package
|
||||||
path: dist
|
path: dist
|
||||||
|
|
||||||
# - name: Generate release notes
|
|
||||||
# run: |
|
|
||||||
# python scripts/release_notes.py > ${{ github.workspace }}-RELEASE_NOTES.md
|
|
||||||
|
|
||||||
- name: Publish package to PyPI
|
- name: Publish package to PyPI
|
||||||
run: |
|
run: |
|
||||||
twine upload -u '${{ secrets.PYPI_USERNAME }}' -p '${{ secrets.PYPI_PASSWORD }}' dist/*
|
twine upload -u '${{ secrets.PYPI_USERNAME }}' -p '${{ secrets.PYPI_PASSWORD }}' dist/*
|
||||||
|
|||||||
27
.github/workflows/pr_checks.yml
vendored
@@ -1,27 +0,0 @@
|
|||||||
name: PR Checks
|
|
||||||
|
|
||||||
concurrency:
|
|
||||||
group: ${{ github.workflow }}-${{ github.ref }}
|
|
||||||
cancel-in-progress: true
|
|
||||||
|
|
||||||
on:
|
|
||||||
pull_request:
|
|
||||||
branches:
|
|
||||||
- main
|
|
||||||
paths:
|
|
||||||
- 'vk_url_scraper/**'
|
|
||||||
|
|
||||||
jobs:
|
|
||||||
changelog:
|
|
||||||
name: CHANGELOG
|
|
||||||
runs-on: ubuntu-latest
|
|
||||||
if: github.event_name == 'pull_request'
|
|
||||||
|
|
||||||
steps:
|
|
||||||
- uses: actions/checkout@v1
|
|
||||||
|
|
||||||
- name: Check that CHANGELOG has been updated
|
|
||||||
run: |
|
|
||||||
# If this step fails, this means you haven't updated the CHANGELOG.md
|
|
||||||
# file with notes on your contribution.
|
|
||||||
git diff --name-only $(git merge-base origin/main HEAD) | grep '^CHANGELOG.md$' && echo "Thanks for helping keep our CHANGELOG up-to-date!"
|
|
||||||
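The removed `pr_checks.yml` job failed a pull request unless `CHANGELOG.md` appeared among the files changed since the merge base with `origin/main`. A minimal Python sketch of that check follows, for illustration only. The function name `changelog_updated` is hypothetical and not part of the repository, and the list of changed paths is assumed to come from `git diff --name-only`.

```python
from typing import Iterable


def changelog_updated(changed_files: Iterable[str]) -> bool:
    """Return True if CHANGELOG.md is among the changed paths.

    Hypothetical helper: it mirrors the `grep '^CHANGELOG.md$'` filter from
    the removed workflow, but uses an exact string match instead of a regex.
    """
    return any(path.strip() == "CHANGELOG.md" for path in changed_files)


# Example input, shaped like `git diff --name-only` output (assumed data).
sample = ["README.md", "vk_url_scraper/version.py", "CHANGELOG.md"]
print(changelog_updated(sample))
```

Only a top-level `CHANGELOG.md` counts here, which matches the anchored pattern in the workflow; a path such as `docs/CHANGELOG.md` would not satisfy the check.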
13
CHANGELOG.md
@@ -1,13 +0,0 @@
|
|||||||
# Changelog
|
|
||||||
|
|
||||||
All notable changes to this project will be documented in this file.
|
|
||||||
|
|
||||||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
||||||
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
||||||
|
|
||||||
## Unreleased
|
|
||||||
|
|
||||||
## [0.1.2]
|
|
||||||
* Added wall scraper with tests
|
|
||||||
* Added photo scraper with tests
|
|
||||||
* Added scraper with tests
|
|
||||||
@@ -156,8 +156,6 @@ When you're ready to contribute code to address an open issue, please follow the
|
|||||||
|
|
||||||
If the build fails, it's most likely due to small formatting issues. If the error message isn't clear, feel free to comment on this in your pull request.
|
If the build fails, it's most likely due to small formatting issues. If the error message isn't clear, feel free to comment on this in your pull request.
|
||||||
|
|
||||||
And finally, please update the [CHANGELOG](https://github.com/bellingcat/vk-url-scraper/blob/main/CHANGELOG.md) with notes on your contribution in the "Unreleased" section at the top.
|
|
||||||
|
|
||||||
After all of the above checks have passed, you can now open [a new GitHub pull request](https://github.com/bellingcat/vk-url-scraper/pulls).
|
After all of the above checks have passed, you can now open [a new GitHub pull request](https://github.com/bellingcat/vk-url-scraper/pulls).
|
||||||
Make sure you have a clear description of the problem and the solution, and include a link to relevant issues.
|
Make sure you have a clear description of the problem and the solution, and include a link to relevant issues.
|
||||||
|
|
||||||
|
|||||||
30
README.md
@@ -2,10 +2,7 @@
|
|||||||
Library to scrape data and especially media links (videos and photos) from vk.com URLs.
|
Library to scrape data and especially media links (videos and photos) from vk.com URLs.
|
||||||
|
|
||||||
|
|
||||||
# TODO
|
## Quick usage API
|
||||||
* docs online from sphinx
|
|
||||||
|
|
||||||
## Quick usage
|
|
||||||
`pip install vk-url-scraper` to install.
|
`pip install vk-url-scraper` to install.
|
||||||
|
|
||||||
|
|
||||||
@@ -43,7 +40,13 @@ print(res[0]["text]) # eg: -> to get the text from code
|
|||||||
|
|
||||||
see [docs] for all available functions.
|
see [docs] for all available functions.
|
||||||
|
|
||||||
### Development
|
### TODO
|
||||||
|
* docs online from sphinx
|
||||||
|
|
||||||
|
## Development
|
||||||
|
(more info in [CONTRIBUTING.md](CONTRIBUTING.md)).
|
||||||
|
|
||||||
|
1. setup dev environment with `pip install -r dev-requirements` or `pipenv install -r dev-requirements`
|
||||||
1. setup environment with `pip install -r requirements` or `pipenv install -r requirements`
|
1. setup environment with `pip install -r requirements` or `pipenv install -r requirements`
|
||||||
2. To run all checks to `make run-checks` (fixes style) or individually
|
2. To run all checks to `make run-checks` (fixes style) or individually
|
||||||
1. To fix style: `black .` and `isort .` -> `flake8 .` to validate lint
|
1. To fix style: `black .` and `isort .` -> `flake8 .` to validate lint
|
||||||
@@ -51,7 +54,18 @@ see [docs] for all available functions.
|
|||||||
3. To test: `pytest .` (`pytest -v --color=yes --doctest-modules tests/ vk_url_scraper/` to user verbose, colors, and test docstring examples)
|
3. To test: `pytest .` (`pytest -v --color=yes --doctest-modules tests/ vk_url_scraper/` to user verbose, colors, and test docstring examples)
|
||||||
3. `make docs` to generate shpynx docs -> edit [config.py](docs/source/conf.py) if needed
|
3. `make docs` to generate shpynx docs -> edit [config.py](docs/source/conf.py) if needed
|
||||||
|
|
||||||
### Releasing new version
|
## Releasing new version
|
||||||
1. edit [version.py](vk_url_scraper/version.py) with proper versioning
|
1. edit [version.py](vk_url_scraper/version.py) with proper versioning
|
||||||
2. `git tag vx.y.z` to tag version
|
2. run `./scripts/release.sh` to create a tag and push, alternatively
|
||||||
3. `git push origin vx.y.z` -> this will trigger workflow and put project on [pypi](https://pypi.org/project/vk-url-scraper/)
|
1. `git tag vx.y.z` to tag version
|
||||||
|
2. `git push origin vx.y.z` -> this will trigger workflow and put project on [pypi](https://pypi.org/project/vk-url-scraper/)
|
||||||
|
|
||||||
|
### Fixing a failed release
|
||||||
|
|
||||||
|
If for some reason the GitHub Actions release workflow failed with an error that needs to be fixed, you'll have to delete both the tag and corresponding release from GitHub. After you've pushed a fix, delete the tag from your local clone with
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git tag -l | xargs git tag -d && git fetch -t
|
||||||
|
```
|
||||||
|
|
||||||
|
Then repeat the steps above.
|
||||||
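The manual release steps in the README hunk above (`git tag vx.y.z`, then `git push origin vx.y.z`) can be sketched in Python. This is an illustrative sketch, not the project's `scripts/release.sh`. The helper name `release_commands` is hypothetical, and the `v` prefix is assumed from the `"v" + VERSION` expression visible in the release script.

```python
from typing import List


def release_commands(version: str) -> List[List[str]]:
    """Build the git commands for tagging and pushing a release.

    Hypothetical helper: it only constructs the commands and runs nothing.
    """
    tag = f"v{version}"
    return [
        ["git", "tag", tag],             # create the local tag
        ["git", "push", "origin", tag],  # pushing the tag triggers the workflow
    ]


for cmd in release_commands("0.1.2"):
    print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch free of side effects; each list could be passed to `subprocess.run(cmd, check=True)` if actually executing them is desired.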
@@ -1,24 +0,0 @@
|
|||||||
# GitHub Release Process
|
|
||||||
|
|
||||||
## Steps
|
|
||||||
|
|
||||||
1. Update the version in `vk_url_scraper/version.py`.
|
|
||||||
|
|
||||||
3. Run the release script:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./scripts/release.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
This will commit the changes to the CHANGELOG and `version.py` files and then create a new tag in git
|
|
||||||
which will trigger a workflow on GitHub Actions that handles the rest.
|
|
||||||
|
|
||||||
## Fixing a failed release
|
|
||||||
|
|
||||||
If for some reason the GitHub Actions release workflow failed with an error that needs to be fixed, you'll have to delete both the tag and corresponding release from GitHub. After you've pushed a fix, delete the tag from your local clone with
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git tag -l | xargs git tag -d && git fetch -t
|
|
||||||
```
|
|
||||||
|
|
||||||
Then repeat the steps above.
|
|
||||||
@@ -1 +0,0 @@
|
|||||||
../../CHANGELOG.md
|
|
||||||
@@ -23,7 +23,6 @@ Contents
|
|||||||
|
|
||||||
installation
|
installation
|
||||||
overview
|
overview
|
||||||
CHANGELOG
|
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:hidden:
|
:hidden:
|
||||||
|
|||||||
@@ -1,39 +0,0 @@
|
|||||||
from datetime import datetime
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from vk_url_scraper.version import VERSION
|
|
||||||
|
|
||||||
|
|
||||||
def main():
|
|
||||||
changelog = Path("CHANGELOG.md")
|
|
||||||
|
|
||||||
with changelog.open() as f:
|
|
||||||
lines = f.readlines()
|
|
||||||
|
|
||||||
insert_index: int = -1
|
|
||||||
for i in range(len(lines)):
|
|
||||||
line = lines[i]
|
|
||||||
if line.startswith("## Unreleased"):
|
|
||||||
insert_index = i + 1
|
|
||||||
elif line.startswith(f"## [v{VERSION}]"):
|
|
||||||
print("CHANGELOG already up-to-date")
|
|
||||||
return
|
|
||||||
elif line.startswith("## [v"):
|
|
||||||
break
|
|
||||||
|
|
||||||
if insert_index < 0:
|
|
||||||
raise RuntimeError("Couldn't find 'Unreleased' section")
|
|
||||||
|
|
||||||
lines.insert(insert_index, "\n")
|
|
||||||
lines.insert(
|
|
||||||
insert_index + 1,
|
|
||||||
f"## [v{VERSION}](https://github.com/bellingcat/vk-url-scraper/releases/tag/v{VERSION}) - "
|
|
||||||
f"{datetime.now().strftime('%Y-%m-%d')}\n",
|
|
||||||
)
|
|
||||||
|
|
||||||
with changelog.open("w") as f:
|
|
||||||
f.writelines(lines)
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||
@@ -7,7 +7,6 @@ TAG=$(python -c 'from vk_url_scraper.version import VERSION; print("v" + VERSION
|
|||||||
read -p "Creating new release for $TAG. Do you want to continue? [Y/n] " prompt
|
read -p "Creating new release for $TAG. Do you want to continue? [Y/n] " prompt
|
||||||
|
|
||||||
if [[ $prompt == "y" || $prompt == "Y" || $prompt == "yes" || $prompt == "Yes" ]]; then
|
if [[ $prompt == "y" || $prompt == "Y" || $prompt == "yes" || $prompt == "Yes" ]]; then
|
||||||
python scripts/prepare_changelog.py
|
|
||||||
git add -A
|
git add -A
|
||||||
git commit -m "Bump version to $TAG for release" || true && git push
|
git commit -m "Bump version to $TAG for release" || true && git push
|
||||||
echo "Creating new git tag $TAG"
|
echo "Creating new git tag $TAG"
|
||||||
|
|||||||
@@ -1,78 +0,0 @@
|
|||||||
# encoding: utf-8
|
|
||||||
|
|
||||||
"""
|
|
||||||
Prepares markdown release notes for GitHub releases.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
from typing import List, Optional
|
|
||||||
|
|
||||||
import packaging.version
|
|
||||||
|
|
||||||
TAG = os.environ["TAG"]
|
|
||||||
|
|
||||||
ADDED_HEADER = "### Added 🎉"
|
|
||||||
CHANGED_HEADER = "### Changed ⚠️"
|
|
||||||
FIXED_HEADER = "### Fixed ✅"
|
|
||||||
REMOVED_HEADER = "### Removed 👋"
|
|
||||||
|
|
||||||
|
|
||||||
def get_change_log_notes() -> str:
|
|
||||||
in_current_section = False
|
|
||||||
current_section_notes: List[str] = []
|
|
||||||
with open("CHANGELOG.md") as changelog:
|
|
||||||
for line in changelog:
|
|
||||||
if line.startswith("## "):
|
|
||||||
if line.startswith("## Unreleased"):
|
|
||||||
continue
|
|
||||||
if line.startswith(f"## [{TAG}]"):
|
|
||||||
in_current_section = True
|
|
||||||
continue
|
|
||||||
break
|
|
||||||
if in_current_section:
|
|
||||||
if line.startswith("### Added"):
|
|
||||||
line = ADDED_HEADER + "\n"
|
|
||||||
elif line.startswith("### Changed"):
|
|
||||||
line = CHANGED_HEADER + "\n"
|
|
||||||
elif line.startswith("### Fixed"):
|
|
||||||
line = FIXED_HEADER + "\n"
|
|
||||||
elif line.startswith("### Removed"):
|
|
||||||
line = REMOVED_HEADER + "\n"
|
|
||||||
current_section_notes.append(line)
|
|
||||||
assert current_section_notes
|
|
||||||
return "## What's new\n\n" + "".join(current_section_notes).strip() + "\n"
|
|
||||||
|
|
||||||
|
|
||||||
def get_commit_history() -> str:
|
|
||||||
new_version = packaging.version.parse(TAG)
|
|
||||||
|
|
||||||
# Get all tags sorted by version, latest first.
|
|
||||||
all_tags = os.popen("git tag -l --sort=-version:refname 'v*'").read().split("\n")
|
|
||||||
|
|
||||||
# Out of `all_tags`, find the latest previous version so that we can collect all
|
|
||||||
# commits between that version and the new version we're about to publish.
|
|
||||||
# Note that we ignore pre-releases unless the new version is also a pre-release.
|
|
||||||
last_tag: Optional[str] = None
|
|
||||||
for tag in all_tags:
|
|
||||||
if not tag.strip(): # could be blank line
|
|
||||||
continue
|
|
||||||
version = packaging.version.parse(tag)
|
|
||||||
if new_version.pre is None and version.pre is not None:
|
|
||||||
continue
|
|
||||||
if version < new_version:
|
|
||||||
last_tag = tag
|
|
||||||
break
|
|
||||||
if last_tag is not None:
|
|
||||||
commits = os.popen(f"git log {last_tag}..{TAG}^ --oneline --first-parent").read()
|
|
||||||
else:
|
|
||||||
commits = os.popen("git log --oneline --first-parent").read()
|
|
||||||
return "## Commits\n\n" + commits
|
|
||||||
|
|
||||||
|
|
||||||
def main():
|
|
||||||
print(get_change_log_notes())
|
|
||||||
print(get_commit_history())
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||