High Latency in TTS Synthesis on Android with Screen Readers #1337

Open
divineDev-dotcom opened this issue Sep 11, 2024 · 5 comments

Comments

@divineDev-dotcom

Hello,

I am using your TTS as an offline Android Text-to-Speech (TTS) engine, but I have run into a latency problem with audio synthesis. When a screen reader on Android requests speech, the onSynthesize method takes approximately 500 ms to speak the text on the screen.

Is there any option or workaround to reduce this latency? I am trying to build a TTS system for Android that blind and low-vision users can use effectively with their screen readers, so minimizing latency is critical for accessibility.

@yuyun2000

  1. Use a piper model, which has faster inference.
  2. Increase the number of inference threads if it is currently set to 1.
  3. Use a more powerful phone.

@csukuangfj
Collaborator

Which model are you using and which kind of android phone, i.e., the CPU of your phone, are you using?

@nanaghartey

From experience, latency depends on the kind of model, the device's processing power (CPU), and the length of the text being processed. Piper's "medium" quality models (around 60 MB) have lower latency than other models.

I was able to speed up inference by splitting the input text into batches, using punctuation as natural sentence boundaries. This allows quicker synthesis of smaller chunks of text. On more powerful devices this step may not be necessary, as they can handle larger texts efficiently. You can adjust the batching based on the device's capabilities.
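The batching idea above could be sketched roughly as follows. This is a minimal illustration, not the code used in the app: the class name `SentenceChunker` and the choice of boundary characters are assumptions, and the chunks would still have to be passed one by one to whatever synthesis call the engine exposes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of any TTS library): splits text into
// small chunks so each one can be synthesized and played as soon as it is ready.
final class SentenceChunker {
    // Assumption: a boundary is whitespace that follows one of . ! ? ; :
    // Commas are left out on purpose to avoid very short chunks.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=[.!?;:])\\s+");

    static List<String> split(String text) {
        List<String> chunks = new ArrayList<>();
        for (String part : BOUNDARY.split(text.trim())) {
            if (!part.isEmpty()) {
                chunks.add(part); // punctuation stays attached to its chunk
            }
        }
        return chunks;
    }
}
```

Calling `SentenceChunker.split("Battery low. Connect charger! OK?")` yields three chunks. A plain regex like this also splits after abbreviations such as "e.g." when they are followed by a space, so a real screen-reader engine would probably need a more careful boundary rule.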

@divineDev-dotcom
Author

Which model are you using and which kind of android phone, i.e., the CPU of your phone, are you using?

Here are the details:
Phone: Motorola Edge 40 (processor: MediaTek Dimensity 7030 (6nm))
Model: vits-piper-en_US-lessac-medium

@csukuangfj
Collaborator

Phone: Motorola Edge 40 (processor: MediaTek Dimensity 7030 (6nm))

This phone has 2 Cortex-A78 and 6 Cortex-A55 CPU cores.

If synthesis runs on the Cortex-A78 cores, it should be very fast.

If it runs on the Cortex-A55 cores, it will be slow.
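One way to act on this, sketched under assumptions: estimate how many "big" cores the device has and use that as a starting point for the inference thread count. The class `BigCoreCounter` below is hypothetical and is not part of any TTS library. It assumes the per-core maximum frequencies have already been read, for example from `/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_max_freq`, which may not be readable on every device.

```java
// Hypothetical heuristic: treat the cores sharing the highest maximum
// frequency as the "big" cluster and size the thread pool to match.
final class BigCoreCounter {
    // Counts the cores whose max frequency equals the highest one reported.
    static int countBigCores(long[] maxFreqKhz) {
        long top = 0;
        for (long f : maxFreqKhz) {
            top = Math.max(top, f);
        }
        int count = 0;
        for (long f : maxFreqKhz) {
            if (top > 0 && f == top) {
                count++;
            }
        }
        return count;
    }

    // Never suggests fewer than one thread, even if no frequencies are known.
    static int suggestedThreads(long[] maxFreqKhz) {
        return Math.max(1, countBigCores(maxFreqKhz));
    }
}
```

For a 2 + 6 layout like the one described above, this would suggest 2 threads. Note that choosing a thread count does not pin threads to the big cores; the Android scheduler still decides where they run, so the result should be confirmed by measuring synthesis time on the device.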
