# speech-translation

Here are 55 public repositories matching this topic...

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Dec 25, 2024
Python

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Dec 24, 2024
Python

espnet / espnet

End-to-End Speech Processing Toolkit

text-to-speech deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speaker-diarization speech-separation speech-enhancement spoken-language-understanding speech-translation singing-voice-synthesis

Updated Dec 23, 2024
Python

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Dec 4, 2024
Python

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

speech-synthesis speech-recognition speech-translation speech-pretraining speecht5 speech2c speechlm speechut speech-text-pretraining vatlm vallex

Updated Apr 24, 2024
Python

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Aug 24, 2024
Python

zhangshaolei1998 / Awesome-Simultaneous-Translation

Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.

nlp natural-language-processing streaming awesome paper machine-translation text-translation paperlist speech-translation simultaneous-translation simultaneous-machine-translation

Updated Jun 7, 2024

Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and a free translation API. Interface built with Tkinter. Written entirely in Python.

python translate whisper tkinter-python speech-translation speech-transcription

Updated Jan 18, 2024
Python

double22a / speech_dataset

The dataset of Speech Recognition

audio text-to-speech deep-neural-networks deep-learning speech tts speech-synthesis dataset wav speech-recognition automatic-speech-recognition speech-to-text voice-conversion asr speech-separation speech-enhancement speech-segmentation speech-translation speech-diarization

Updated Jul 2, 2024

kahne / SpeechTransProgress

Tracking the progress in end-to-end speech translation

natural-language-processing machine-translation artificial-intelligence natural-language-generation speech-processing spoken-language-processing speech-translation spoken-language-translation

Updated Oct 25, 2023

echogarden-project / echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

text-to-speech command-line speech language-detection speech-synthesis speech-recognition node-js speech-to-text source-separation language-identification forced-alignment speech-translation speech-alignment voice-isolation

Updated Dec 24, 2024
TypeScript

MooreThreads / MooER

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not limited to end-to-end speech interaction, end-to-end speech translation and speech recognition.

speech-recognition speech-to-text speech-translation speech-to-speech large-language-models chatgpt gpt-4o speech-interaction

Updated Dec 18, 2024
Python

dqqcasia / awesome-speech-translation

natural-language-processing machine-translation speech speech-synthesis speech-recognition speech-processing text-translation disfluency-detection speech-translation multimodal-machine-learning multimodal-machine-translation punctuation-restoration speech-to-speech simultaneous-translation cascaded-speech-translation non-autoregressive-translation speech-to-subtitles

Updated Nov 10, 2021

bzhangGo / zero

Zero -- A neural machine translation system

transformer neural-machine-translation average-attention-network aan speech-translation depth-scaled-initialization deep-transformer l0drop adaptive-feature-selection massively-multilingual-translation opus-100 fast-bidirectional-decoder

Updated May 8, 2023
Python

ReneeYe / ConST

Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)

translation machine-translation pytorch transformer neural-machine-translation spoken-language-processing speec speech-translation contrastive-learning naacl2022

Updated May 25, 2022
Python

ictnlp / DASpeech

Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".

machine-translation speech-translation speech-to-speech speech-to-speech-translation

Updated Jul 22, 2024
Python

hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

deep-learning pytorch speech-to-text subtitling gender-bias speech-translation simultaneous-translation

Updated Jul 1, 2024
Python

mt-upc / SHAS

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

speech speech-to-text audio-segmentation speech-translation wav2vec2

Updated Feb 9, 2023
Python

Rongjiehuang / awesome-speech-to-speech-translation

List of direct speech-to-speech translation papers.

awesome awesome-list speech-translation speech-to-speech-translation s2st

Updated Jan 31, 2023

ictnlp / STEMM

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

machine-translation speech-to-text speech-translation

Updated Oct 25, 2023
Python
