From 78a38e6ec004b74ad80ac5bf606244020a581b6d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Felix=20Sp=C3=B6ttel?= <1682504+fspoettel@users.noreply.github.com>
Date: Tue, 27 Jun 2023 22:55:32 +0200
Subject: [PATCH] docs: improve deploy wording
---
 README.md | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/README.md b/README.md
index 3c5051f..fc3899a 100644
--- a/README.md
+++ b/README.md
@@ -6,30 +6,29 @@
 This project wraps OpenAI's `whisper` speech-to-text models with a HTTP API.
-The API design takes inspiration from the [rev.ai async speech-to-text API](https://docs.rev.ai/api/asynchronous/get-started/). Transcription jobs are submitted via a HTTP `POST` request. After the job is accepted, an id is returned, which can later be used to retrieve the transcription results from the service. Results are stored in an internal database until retrieved, and can optionally be deleted afterwards.
+The API design of this service draws inspiration from the [rev.ai async speech-to-text API](https://docs.rev.ai/api/asynchronous/get-started/). Transcription jobs are submitted by making a HTTP POST request to the service. Once the job is accepted, an ID is returned, which can be later utilized to retrieve the transcription results. These results are stored in an internal database until they are retrieved and can optionally be deleted afterwards.
 It is assumed that the service is used by exactly one consumer, so a pre-shared API key is used as authentication method.
 OpenAPI documentation for the service is available at `/docs`.
 ## Deploy
-### 0. Choose model & instance size
+
+0. Choose model & instance size
+Whisper offers a range of models in [different sizes](https://github.com/openai/whisper#available-models-and-languages). The model size affects factors such as accuracy, resource usage, and transcription speed. Smaller models are generally faster and consume fewer resources, but they may be less accurate, especially when working with non-English languages or translation tasks.
-Whisper provides [several sizes](https://github.com/openai/whisper#available-models-and-languages) of their model where size is a trade-off between model accuracy, resource usage and transcription speed. Smaller models are generally faster and lighter, but more inaccurate, especially for non-english languages and translation tasks.
+Whisper supports inference on both CPU and GPU, and this project includes slightly modified Docker Compose configurations to enable both options. CPU inference is slower but usually more cost-effective for hosting purposes. CPU inference performance typically scales well with the CPU speed.
-Whisper inference can be run on both CPU and GPU, and this project supports both via slightly altered docker compose configurations. CPU inference is slower, but easier and cheaper to host. CPU inference, while overall slower than the GPU, generally scales well with CPU speed.
+When selecting an instance for your application, it's important to consider the disk size. Media files need to be downloaded before they can be transcribed, so the disk must have sufficient free space to accommodate them.
-Another consideration when choosing your instance is disk size. In order to transcribe media, it first needs to be downloaded to a temporary file, so the HDD needs to have enough free space to allow for that. For some hosting environments (e.g. Digital Ocean), it can make sense to mount an additional disk in the VM instead of choosing a larger instance.
-
-As a baseline, the `small` model can run on a `4GB` Digital Ocean droplet, achieving roughly a 1-2x speedup over original audio when transcribing.
+As a starting point, the "small" model can run on a 4GB Digital Ocean droplet, achieving approximately a 1-2x speed-up over the original audio length when transcribing.
+
 ### 1. Prepare host environment
-This project is intended to be run via [docker compose](https://docs.docker.com/compose/). To get started:
- 1. [Install](https://docs.docker.com/engine/install/) docker engine.
- 2. Clone this repository to the machine.
+This project is intended to be run via [docker compose](https://docs.docker.com/compose/). In order to get started, [install](https://docs.docker.com/engine/install/) docker engine on your VPS. Then, clone this repository to the machine.
 > **Note**
- > If you want to use the GPU, uncomment the sections tagged with __ in docker-compose.prod.yml
+ > If you want to use a GPU, uncomment the sections tagged with __ in docker-compose.prod.yml
 ### 2. Configure service
@@ -41,7 +40,7 @@ This project is intended to be run via [docker compose](https://docs.docker.com/
 ### 3. Run service
-Run `make run` to start the server.4. To launch at system startup, wrap it in a systemd launch service.
+Run `make run` to start the server. To launch at system startup, wrap it in a systemd launch service.
 ## Develop
@@ -49,7 +48,7 @@ Run `make run` to start the server.4. To launch at system startup, wrap it in a
 It is recommended to setup a virtual environment for python tooling. To install dependencies in your virtual env, run `pip install -e .[tooling,web,worker]`.
-Copy `.env.test` to `.env` to configure the service.
+Copy `.env.dev` to `.env` to configure the service.
 ### Start
@@ -67,7 +66,7 @@ http://whisperbox-transcribe.localhost/docs => API docs
 ./whisperbox-transcribe.sqlite => Database
 ```
-## Destroy
+#### Clean
 This removes all containers and attached volumes.