Transcription Pipeline

This project handles voice transcription. It defines the Transcription Pipeline class, which transcribes an audio file into text.

Getting Started

Prerequisites

The prerequisites are listed in the requirements.txt file. You can install them with:

pip install -r requirements.txt

You can also run the pipeline in a container using the Dockerfile. To build the image, run:

docker build -t transcription-pipeline .

To run the container with your audio directory mounted as a volume, run:

docker run -it -v /path/to/audio:/audio transcription-pipeline

To run the container with the volume and GPU access, run:

docker run -it --gpus all -v /path/to/audio:/audio transcription-pipeline

Usage

To use the Transcription Pipeline, you can run the following command:

python transcription_pipeline.py --audio_path /path/to/audio --engine whisper

The output is saved in the same directory as the audio file, under the same name but with a .txt extension.
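
As a rough illustration of the command and the output location described above, the following is a minimal sketch, not the project's actual implementation. The function names `parse_args` and `output_path_for` are hypothetical, and treating "whisper" as the default engine is an assumption; only the `--audio_path` and `--engine` flags and the .txt naming rule come from this README.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags shown in the usage command (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file into text.")
    parser.add_argument("--audio_path", required=True, help="Path to the input audio file.")
    # Assumption: "whisper" as the default; the README only shows it passed explicitly.
    parser.add_argument("--engine", default="whisper", help="Transcription engine to use.")
    return parser.parse_args(argv)


def output_path_for(audio_path):
    """Return a path in the same directory, with the same base name and a .txt extension."""
    return Path(audio_path).with_suffix(".txt")


# Example invocation with an illustrative file name.
args = parse_args(["--audio_path", "/path/to/audio.wav", "--engine", "whisper"])
print(output_path_for(args.audio_path))
```

Using `Path.with_suffix` keeps the directory and base name intact and swaps only the extension, which matches the naming rule stated above.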

Output format

The output file will be formatted as follows:

{
    "audio_path": "/path/to/audio",
    "engine": "whisper",
    "language": "en",
    "transcription": "This is the transcription of the audio file"
}
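
Since the output is JSON stored in a .txt file, it can be written and read back with the standard `json` module. The sketch below assumes exactly the four fields shown above; `save_result`, `load_result`, and `EXPECTED_KEYS` are hypothetical names used for illustration and are not part of the project.

```python
import json
import tempfile
from pathlib import Path

# The four fields shown in the output format above.
EXPECTED_KEYS = {"audio_path", "engine", "language", "transcription"}


def save_result(result, output_path):
    """Write a result record as JSON, refusing records with missing fields."""
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    text = json.dumps(result, indent=4, ensure_ascii=False)
    Path(output_path).write_text(text, encoding="utf-8")


def load_result(output_path):
    """Read a result record back into a dictionary."""
    return json.loads(Path(output_path).read_text(encoding="utf-8"))


example = {
    "audio_path": "/path/to/audio",
    "engine": "whisper",
    "language": "en",
    "transcription": "This is the transcription of the audio file",
}

# Round trip through a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "audio.txt"
    save_result(example, out)
    loaded = load_result(out)

print(loaded["transcription"])
```

Passing `ensure_ascii=False` keeps non-English transcriptions readable in the file, which may matter once other languages are supported.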

TODO

Define Transcription Pipeline class
Implement with Whisper
Implement with Google Cloud Speech-to-Text API
Implement with OpenAI API
Add tests
Add support for other languages
