Free Whisper API with GPU backend

This repository demonstrates how to create a free Whisper API with a GPU backend, so you can get transcripts more quickly. Here's a comparison of inference times on the different hardware options in Colab (source):

While not a robust/permanent solution, it can save you money on small projects. Here's the cost to transcribe 1000 hours of audio with Whisper on an A100 in GCP for different model and batch sizes (source):

If you're looking for a no-setup solution, you can also transcribe files in just a few minutes with Python using AssemblyAI. Grab a free API key to start transcribing, understanding, and prompting your audio files.

To understand how the API works, check out our companion article.

Initial setup

Create an Ngrok account if you do not already have one and verify your email
Make sure Python 3.9 or 3.10 is installed on your system

Start the API

To run with GPU

Go to the companion Colab:

Follow the instructions in the notebook to start the API

To run locally

You can also run the API locally. While this method doesn't offload the inference to Colab, you may want to do this while using the tiny model for testing, or with the large model if you have a GPU.

Local setup

Open a terminal and set your ngrok authtoken with ngrok authtoken YOUR-AUTHTOKEN-HERE. You can find your authtoken on your dashboard
Install ffmpeg if it is not already installed on your system
(Optional) Create a virtual environment for your project with python3 -m venv venv and then activate it with . venv/bin/activate on Linux/macOS or .\venv\Scripts\activate.bat on Windows. You may have to use python instead of python3
Install the required packages with pip install -r requirements.txt
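Before starting the API, it can help to confirm that the prerequisites listed above are in place. The following is a minimal sketch, not part of the repository: the function name check_prerequisites is hypothetical, and it assumes that both ffmpeg and the ngrok command-line tool are expected to be on your PATH.

```python
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite to True or False."""
    return {
        # The setup steps ask for Python 3.9 or 3.10.
        "python_3.9_or_3.10": sys.version_info[:2] in ((3, 9), (3, 10)),
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # Assumption: the ngrok CLI is installed, since the authtoken step uses it.
        "ngrok_on_path": shutil.which("ngrok") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry only means the check did not find the tool; a different Python version may still work, but it has not been verified here.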

Start the API

Run python3 api.py in order to start the Flask API at http://127.0.0.1:8008
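If you want to confirm that the server came up, one option is to check whether anything is accepting connections on port 8008. This is a rough sketch using only the standard library; the helper name is_port_open is hypothetical, and an open port only shows that some process is listening there, not that it is this API.

```python
import socket


def is_port_open(host="127.0.0.1", port=8008, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if the connection is refused or times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("API reachable" if is_port_open() else "Nothing listening on port 8008")
```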

Consume the API

Once your API is up and running, you can call it with any tool that can make POST requests. For example, you can use cURL in the terminal. Be sure to replace the URL below with your ngrok URL (or http://127.0.0.1:8008 if running locally), followed by the /transcribe endpoint:

curl -X POST "https://YOUR-URL.ngrok-free.app/transcribe" \
-H "Content-Type: application/json" \
-d '{"file": "https://storage.googleapis.com/aai-web-samples/Custom-Home-Builder.mp3", "model": "tiny"}'

You can also consume your API in Python with the requests library (you'll need to pip install requests if you haven't done so already):

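import requests

# Base URL printed by ngrok (or "http://127.0.0.1:8008" when running locally)
NGROK_URL = "https://YOUR-URL.ngrok-free.app/"
# Build the endpoint with plain string handling; os.path.join is meant for
# filesystem paths and would insert a backslash on Windows
TRANSCRIBE_ENDPOINT = NGROK_URL.rstrip("/") + "/transcribe"

json_data = {
    "file": "https://storage.googleapis.com/aai-web-samples/Custom-Home-Builder.mp3",
    "model": "tiny",
}
response = requests.post(TRANSCRIBE_ENDPOINT, json=json_data)
response.raise_for_status()  # stop here if the API returned an HTTP error

print(response.json()["transcript"])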
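Because the Colab session or the ngrok tunnel can go away at any time, a slightly more defensive call may be useful. The sketch below uses only the standard library so it also runs where requests is not installed. The function names, the 300-second timeout, and the error messages are all assumptions for illustration; only the /transcribe endpoint, the "file" and "model" fields, and the "transcript" response key come from the examples above.

```python
import json
import urllib.error
import urllib.request


def transcribe_endpoint(base_url):
    """Build the /transcribe URL whether or not base_url ends with a slash."""
    return base_url.rstrip("/") + "/transcribe"


def extract_transcript(body):
    """Pull the 'transcript' field out of a decoded JSON response."""
    if not isinstance(body, dict) or "transcript" not in body:
        raise ValueError(f"Unexpected response: {body!r}")
    return body["transcript"]


def transcribe(base_url, file_url, model="tiny", timeout=300):
    """POST a transcription request and return the transcript text."""
    payload = json.dumps({"file": file_url, "model": model}).encode("utf-8")
    request = urllib.request.Request(
        transcribe_endpoint(base_url),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status code.
        raise RuntimeError(f"API returned HTTP {err.code}") from err
    except urllib.error.URLError as err:
        # The server could not be reached at all (tunnel closed, wrong URL, ...).
        raise RuntimeError(f"Could not reach the API: {err.reason}") from err
    return extract_transcript(body)
```

You would call it as transcribe("https://YOUR-URL.ngrok-free.app", "https://storage.googleapis.com/aai-web-samples/Custom-Home-Builder.mp3"), or pass "http://127.0.0.1:8008" as the base URL when running locally.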

You can read or run basic-request.py to see how to make requests to the API for both remote and local files.

You can take a look at transcribe.py to see a more robust way of calling the API that abstracts away the calling details.
