less latency idea #857
Replies: 4 comments
-
Thanks for asking your question about Deepgram! If you didn't already include it in your post, please be sure to add as much detail as possible so we can assist you efficiently, such as:
-
Your settings, namely … This is a guide for measuring latency: https://developers.deepgram.com/docs/measuring-streaming-latency. It recommends only making these measurements on interim results, but it's actually also valid for final results if those results are also …
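As a rough illustration of what that guide describes, here is a minimal sketch (assuming the same deepgram connection and LiveTranscriptionEvents used in the snippet later in this thread) that estimates per-result latency by comparing how much audio you have sent against each result's end timestamp (start + duration). The audioCursorSeconds counter is an assumed variable you would maintain yourself as you stream audio.

```js
// Assumed: you keep track of how many seconds of audio you have sent so far.
let audioCursorSeconds = 0;
// e.g. advance it for every chunk you send:
// audioCursorSeconds += chunkBytes / (sampleRate * bytesPerSample);

deepgram.addListener(LiveTranscriptionEvents.Transcript, (data) => {
  // Seconds of audio this result covers, per the measuring-streaming-latency guide.
  const transcriptEnd = data.start + data.duration;
  const latencySeconds = audioCursorSeconds - transcriptEnd;
  const kind = data.is_final ? "final" : "interim";
  console.log(`${kind} result latency ≈ ${latencySeconds.toFixed(2)}s`);
});
```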
-
Hi Team, I have a clear understanding of the …

About the Utterance End Event
If we do not receive the … This logic helps us avoid noise while ensuring the LLM call is made.

Current Logic
We have two triggers for making an LLM call: …
One of these two triggers will always occur, so we are confident about our LLM call workflow.

Concern
I am exploring ways to eliminate interim results (…
Question
Additionally, how much latency improvement can we expect if we set …?

For reference, we are following the implementation at Deepgram-Twilio Streaming Voice Agent. I’d appreciate any suggestions or best practices to reduce latency while retaining functionality for interruption detection and maintaining reliable triggers for LLM calls. Thank you!

Code snippet (our endpointing is 50 ms; can we replace the interim-results logic with the other logic?):

deepgram.addListener(LiveTranscriptionEvents.Transcript, (data) => { /* … */ });
deepgram.addListener(LiveTranscriptionEvents.UtteranceEnd, () => { /* … */ });
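For illustration, here is a hedged sketch of the two-trigger idea described in the post above (not the poster's actual code): a final transcript with speech_final fires the first trigger, and UtteranceEnd acts as the fallback. callLLM and pendingTranscript are assumed, illustrative names.

```js
// Sketch of the two LLM-call triggers; callLLM and pendingTranscript are assumed names.
let pendingTranscript = "";

deepgram.addListener(LiveTranscriptionEvents.Transcript, (data) => {
  const text = data.channel.alternatives[0].transcript;
  if (!text) return;
  if (data.is_final) {
    pendingTranscript = (pendingTranscript + " " + text).trim();
    // Trigger 1: endpointing saw enough silence, so this final is also speech_final.
    if (data.speech_final) {
      callLLM(pendingTranscript);
      pendingTranscript = "";
    }
  }
  // Interim results (is_final === false) would be the place to detect interruptions.
});

deepgram.addListener(LiveTranscriptionEvents.UtteranceEnd, () => {
  // Trigger 2: no speech_final arrived, but UtteranceEnd says the utterance is over.
  if (pendingTranscript) {
    callLLM(pendingTranscript);
    pendingTranscript = "";
  }
});
```

One caveat worth noting: per Deepgram's docs, utterance_end_ms relies on interim results being enabled, so removing interim results entirely would also remove the UtteranceEnd fallback trigger.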
-
This is very true, utterance_end does catch important edge cases. I wanted to suggest maybe trying out Deepgram's new voice agent API in early access: https://deepgram.com/learn/introducing-ai-voice-agent-api. It handles these conversational flow issues with innovative techniques under the hood and can greatly simplify things. You can even bring your own LLM.
-
See, let me explain: by default the endpointing is 10 ms, so as soon as the user speaks, if the text is ready it will give the text after 10 ms. I am using STT, that is fine. I mostly don't need any interim results, and I don't want to use utterance_end_ms=1000. My first question is: if I DON'T use utterance_end_ms, why is it taking 1.2 seconds to give the output? If I do use it, it gives me the same latency. I just want the text to come after 10 ms and don't want anything else to be used. So how much latency will there be, i.e. incoming latency + outgoing latency + transcription latency? I want the STT to come in below 1 second, around 500 ms. Is that possible? What is the best possible number so that I can get a transcription? I don't care about the speaker, just tell me the best latency number I can get by using all those parameters.
The latency is 1.2 to 2 seconds and I want to bring it down to 0.5 to 0.9 seconds. Is it possible? Best latency figure, please?
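For reference, here is a minimal configuration sketch of the latency-related options discussed in this thread; the values are illustrative rather than a recommended recipe, and the exact connection call depends on the SDK version (the Deepgram-Twilio demo referenced earlier uses something like deepgram.transcription.live(options)).

```js
// Illustrative low-latency options; values are assumptions, not a guaranteed recipe.
const options = {
  model: "nova-2",        // example model name
  encoding: "mulaw",      // Twilio media streams send 8 kHz mulaw audio
  sample_rate: 8000,
  endpointing: 50,        // ms of silence before speech_final, as mentioned in the thread
  interim_results: true,  // UtteranceEnd / utterance_end_ms only work with interim results on
  utterance_end_ms: 1000, // fallback end-of-utterance signal
};
```

Roughly speaking, the total figure being asked about is network round trip plus Deepgram's processing plus whatever silence wait (endpointing or utterance_end_ms) is configured, so the end-of-utterance wait usually dominates rather than the raw transcription time.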