Use OpenAI Whisper for text-to-speech transcription #1

Open
kevinjosethomas opened this issue May 30, 2024 · 0 comments
Labels
enhancement (New feature or request), expressive (English → ASL technology)

Comments

@kevinjosethomas
Owner

Description

The current interface uses the browser's built-in SpeechRecognition object through the react-speech-recognition library. While this is functional, it is less accurate than Whisper and often drops the first few words of an utterance.

Ideally, transcription would run client-side to reduce the load on the server once this service is publicly accessible. However, I would also prefer not to send voice recordings to OpenAI, so a locally hosted instance of Whisper on the Flask server might have to be the approach. I attempted this by transmitting audio from the client via WebSockets, but it didn't quite work out. The whisper library is not really designed for real-time transcription, but rather for uploaded files that are transcribed over time.
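One way to work around the file-oriented design is to buffer the streamed audio on the server and transcribe it in fixed-size windows. The following is only a rough sketch of that idea, not what was attempted here: the `ChunkedTranscriber` class, the 16-bit PCM client format, and the 5-second window are all assumptions, and the `transcribe` callable is a placeholder for whatever actually wraps the Whisper model.

```python
from typing import Callable, List

SAMPLE_RATE = 16_000   # Whisper models work on 16 kHz mono audio
BYTES_PER_SAMPLE = 2   # assumption: the client sends 16-bit PCM


class ChunkedTranscriber:
    """Hypothetical helper: collects streamed PCM bytes and transcribes them window by window."""

    def __init__(self, transcribe: Callable[[bytes], str], window_seconds: float = 5.0):
        # `transcribe` is a placeholder for a function that runs Whisper on raw audio bytes.
        self.transcribe = transcribe
        self.window_bytes = int(window_seconds * SAMPLE_RATE) * BYTES_PER_SAMPLE
        self.buffer = bytearray()

    def feed(self, chunk: bytes) -> List[str]:
        """Add one websocket message; return text for every window it completes."""
        self.buffer.extend(chunk)
        results = []
        while len(self.buffer) >= self.window_bytes:
            window = bytes(self.buffer[: self.window_bytes])
            del self.buffer[: self.window_bytes]
            results.append(self.transcribe(window))
        return results

    def flush(self) -> List[str]:
        """Transcribe whatever is left when the client stops speaking."""
        if not self.buffer:
            return []
        window = bytes(self.buffer)
        self.buffer.clear()
        return [self.transcribe(window)]
```

Shorter windows lower latency but give the model less context per call, so the window length would need tuning against real speech.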

The alternative is to use something like use-whisper and send the voice recordings to OpenAI for transcription. This would reduce the server load and also make the transcription more reliable. Letting users toggle between the two options, depending on their privacy preference, might be the go-to solution in the future.
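The toggle could be as simple as choosing a backend per request. This is a minimal sketch under assumptions: `select_transcriber`, the `"local"` and `"cloud"` keys, and the `allow_cloud` flag are hypothetical names, and the backends are stand-in callables rather than real Whisper or OpenAI clients.

```python
from typing import Callable, Dict

Transcriber = Callable[[bytes], str]


def select_transcriber(backends: Dict[str, Transcriber], allow_cloud: bool) -> Transcriber:
    """Return the cloud backend only when the user has opted in; otherwise stay local."""
    if allow_cloud and "cloud" in backends:
        return backends["cloud"]
    # Default to the privacy-preserving option (hypothetical local Whisper wrapper).
    return backends["local"]
```

Defaulting to the local backend means audio only leaves the server after an explicit opt-in.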

kevinjosethomas added the enhancement and expressive labels on May 30, 2024
kevinjosethomas changed the title from "Use OpenAI Whisper for Text-to-Speech Transcription" to "Use OpenAI Whisper for text-to-speech transcription" on May 30, 2024