Added speech to text capability #275

Open: wants to merge 1 commit into main

navyseal4000

Verify that your system default microphone is the one you're testing with; being tied to the default microphone is the primary limitation of this initial speech-to-text implementation for voice prompting.

@chrismahoney chrismahoney added the enhancement New feature or request label Nov 14, 2024
@chrismahoney
Collaborator

Awesome! Once we settle on some of the provider work, this is on my radar for when we've got room for feature adds. 👍

@wonderwhy-er
Collaborator

I actually pulled it and merged it with the provider work.
There are conflicts, but they are minimal, so it can proceed in parallel.

I tested it here and it seems to work.
There is one thing I would fix before merging:
https://www.youtube.com/watch?v=3Gc0yOgx-EQ

When the user submits, we should clear the ongoing transcript so that the next time they speak, it starts as new text.
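
A minimal sketch of that fix, for illustration only. `TranscriptBuffer` and its methods are hypothetical names, not code from this PR; the sketch assumes the recognizer delivers interim and final text chunks, and it only covers the state reset. With a real recognizer, the session may also need to be stopped or restarted on submit so that earlier results are not replayed.

```typescript
// Hypothetical helper (not from this PR): holds recognized speech until the prompt is submitted.
class TranscriptBuffer {
  private finalText = "";
  private interimText = "";

  // Interim results replace each other; final results accumulate.
  addResult(text: string, isFinal: boolean): void {
    if (isFinal) {
      this.finalText = `${this.finalText} ${text}`.trim();
      this.interimText = "";
    } else {
      this.interimText = text;
    }
  }

  // Text that should currently be shown in the prompt box.
  current(): string {
    return `${this.finalText} ${this.interimText}`.trim();
  }

  // Return the text to send and reset, so the next utterance starts from empty.
  submit(): string {
    const text = this.current();
    this.finalText = "";
    this.interimText = "";
    return text;
  }
}
```

Calling `submit()` from the existing send handler would be one way to wire this in, so that sending a message by keyboard or by voice clears the buffer the same way.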

There are some other potential UX changes.
I would love to experiment with letting it handle voice commands,
e.g. "stop/start/submit", so I can control it with my voice alone.

Maybe use "Hello Google/Alexa"-style wake-up and go-to-sleep commands?
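
One possible shape for the command idea, as a sketch under assumptions: `parseUtterance`, `VoiceCommand`, and the command list are hypothetical names chosen for illustration, and the rule that an utterance counts as a command only when it consists of the command word alone is a design guess, not something decided in this thread.

```typescript
// Hypothetical command set, taken from the "stop/start/submit" suggestion above.
type VoiceCommand = "start" | "stop" | "submit";

interface ParsedUtterance {
  command: VoiceCommand | null; // non-null when the whole utterance is a command
  text: string;                 // dictated text to append to the prompt otherwise
}

const COMMANDS: readonly VoiceCommand[] = ["start", "stop", "submit"];

// Treat an utterance as a command only if it is exactly one command word
// (ignoring case and trailing punctuation); anything else is dictation.
function parseUtterance(utterance: string): ParsedUtterance {
  const trimmed = utterance.trim();
  const normalized = trimmed.toLowerCase().replace(/[.!?]+$/, "");
  const match = COMMANDS.find((c) => c === normalized);
  return match !== undefined
    ? { command: match, text: "" }
    : { command: null, text: trimmed };
}
```

Matching the whole utterance keeps a phrase like "stop the server" as dictated text instead of halting recording. A wake-up or go-to-sleep phrase could be handled by the same kind of check, toggling a listening flag before any text is appended.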

Ugh so exciting, thanks for your great work @navyseal4000

@milutinke

milutinke commented Nov 14, 2024

Oof, I started working on #281 before this was created.

@milutinke

milutinke commented Nov 14, 2024

@wonderwhy-er Can we combine the two so it's not time wasted?
Maybe use this when available and mine when the browser doesn't support this.
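
A sketch of how that selection could look, assuming this PR relies on the browser's built-in Web Speech API recognition (the thread implies this but does not state it). `pickSpeechBackend` and `SpeechGlobals` are hypothetical names, and the "fallback" branch stands in for whatever #281 provides.

```typescript
type SpeechBackend = "browser-recognition" | "fallback";

// The subset of the global object this check needs. Some browsers expose the
// recognition constructor only under the webkit-prefixed name.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Prefer the browser's recognizer when present; otherwise use the fallback path.
function pickSpeechBackend(globals: SpeechGlobals): SpeechBackend {
  const hasNative = Boolean(
    globals.SpeechRecognition ?? globals.webkitSpeechRecognition,
  );
  return hasNative ? "browser-recognition" : "fallback";
}

// In the browser this would be called as (not executed here):
//   pickSpeechBackend(window as unknown as SpeechGlobals)
```

Taking the globals as a parameter keeps the check testable outside a browser.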

@wonderwhy-er
Collaborator

> Oof, I started working on #281 before this was created.

That is why I am in favor of posting PRs early, in draft mode, to get early feedback.
I shared in the group that my policy is to review whichever PR was opened first.

@navyseal4000
Author

I won't have time this morning to get the enhancements done; I'll try to get to them later tonight.

@navyseal4000
Author

In fact if you'd like, feel free to finish this @milutinke and if you can, feel free to take the author role. Just lmk if you do so we don't both work on it tonight

@milutinke

milutinke commented Nov 14, 2024

> In fact if you'd like, feel free to finish this @milutinke and if you can, feel free to take the author role. Just lmk if you do so we don't both work on it tonight

I had quite a busy day, sorry for the late reply. I won't have time to do anything until at least the 22nd of November.
I wrote a proposal to Eduard in my PR and am waiting for his reply, but I'd say you can finish this, and if he agrees, I will pull your changes and then add my option as the fallback. Great job btw.

Edit:
PS: Maybe add a better indicator while the user is recording, just to make it clear to them. You could even grab my code and adapt it to show that animation while speaking.

@chrismahoney
Collaborator

chrismahoney commented Nov 15, 2024

Voice support is an awesome feature to have. Full disclosure: I'm really biased toward it from a human-computer interface perspective.

If we can all work together towards integration of this and #281 I am more than happy to help out where I can. Cheers!

Quick Edit: I'll qualify this by saying this feature may end up on the roadmap at a certain priority, so please don't see that as a disappointment; just making sure all the wheels roll in the same direction. 🤓

@wonderwhy-er
Collaborator

wonderwhy-er commented Nov 15, 2024

I am pretty excited about this feature and have already tested @milutinke's solution; it works pretty well, with one exception.

I am honestly willing to merge this and then add those improvements as separate PRs to get the ball rolling.

Maybe I will get to this in the evening.

And we can add other things in separate PRs.


4 participants