Voice-to-text not working again #253
@hoshizorista Are you experiencing the same issue? If so, do you have any suggestions for a fix? Our weekly podcast relies on this heavily and we miss having it. Thank you!
@StudioDweller Yep, working on it! Sorry for not noticing; GitHub didn't notify me. I'm actually finishing a rework of the extension with autostart and some cool functions (such as using the base GPT voices). I'll have it ready late tomorrow night. Hang in there!
Thank you so much, Juan! You are the best. I tried using OpenAI's o1-preview to troubleshoot it. I was impressed with how it identified the issue with the prompt/text area, but it feels like this might be a deeper issue than the previous changes OpenAI has made to the page. I copied the response below in case it can be of help to you, but it sounds like you're already on top of it. Thanks again for keeping this alive! It's very much appreciated, and I will be singing your praises when we record our weekly podcast tomorrow night.
Cheers!
Larry
Certainly! Let's work together to identify and resolve why your browser extension is no longer populating the prompt area with the converted text on your custom GPT page. Based on the information and the provided HTML snippet, it appears that there have been changes in the page's structure that are affecting the extension's functionality.
## **1. Analyzing the Provided HTML**
Here's a simplified version of the relevant part of your HTML:
```html
<div class="ProseMirror" id="prompt-textarea" contenteditable="true">
  <p data-placeholder="Message RAINA" class="placeholder">
    <br class="ProseMirror-trailingBreak">
  </p>
</div>
```
**Key Observations:**
1. **Element Type Change:**
- **Before:** The `prompt-textarea` was likely a `<textarea>` element.
- **Now:** `prompt-textarea` is a `<div>` with `contenteditable="true"`, which means it's a rich text editor powered by **ProseMirror**.
2. **Hidden Textarea:**
- The full page markup also appears to contain a `<textarea>` element with `id="prompt-textarea"` that's set to `display: none;` (it is omitted from the simplified snippet above). This suggests that the actual input area is now the `div` and not the `textarea`.
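One way to confirm which of these cases a given page is in is to classify whatever `#prompt-textarea` resolves to. The helper below is only a sketch: the name `classifyPromptElement` is hypothetical (it is not part of the extension), and it reads nothing beyond the standard `tagName`, `isContentEditable`, and `classList` DOM properties.

```javascript
// Hypothetical helper: report what kind of input element we are dealing with.
// Uses only standard DOM properties, so it can be pasted into the page console.
function classifyPromptElement(el) {
  if (!el) return 'missing';
  var tag = String(el.tagName || '').toUpperCase();
  if (tag === 'TEXTAREA') return 'textarea';
  if (el.isContentEditable) {
    var isProseMirror = !!(el.classList && el.classList.contains('ProseMirror'));
    return isProseMirror ? 'prosemirror' : 'contenteditable';
  }
  return 'unknown';
}

// In the page console:
//   classifyPromptElement(document.querySelector('#prompt-textarea'));
```

If this reports `prosemirror` or `contenteditable`, code that writes to `.value` will not change what the user sees.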
## **2. Identifying the Issue**
Given the change from a `<textarea>` to a `contenteditable` `<div>`, the extension's selector and method for inserting text need to be updated. The extension likely attempts to:
1. **Select the Textarea:**
- Using `jQuery("#prompt-textarea")` to select the `<textarea>`.
2. **Set the Value:**
- Using `textarea.value = "converted text"` or a similar method to insert text.
However, since the `<textarea>` is now hidden and the actual input area is the `contenteditable` `<div>`, these actions no longer affect the visible prompt area.
## **3. Steps to Troubleshoot and Fix**
### **A. Update the Extension's Selector**
**Current Selector:**
```javascript
var textarea = jQuery("#prompt-textarea")[0];
```
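**Issue:** The extension treats whatever this returns as a `<textarea>`. Whether the ID now resolves to the hidden `<textarea>` or to the visible `contenteditable` `<div>`, writing to `.value` no longer changes the visible prompt.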
**Solution:** Modify the selector to target the `div` with `id="prompt-textarea"` and `class="ProseMirror"`.
**Updated Selector:**
```javascript
// Qualify by tag and class so a hidden <textarea> with the same id cannot match
var promptDiv = jQuery("div#prompt-textarea.ProseMirror")[0];
```
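Because the page markup has changed before and may change again, it can help to try several selectors in order and log which one matched. The following is a minimal sketch under that assumption; `PROMPT_SELECTORS` and `findPromptElement` are hypothetical names, and the selector list is only a guess based on the HTML quoted in this thread.

```javascript
// Hypothetical selector list, ordered from most to least specific.
var PROMPT_SELECTORS = [
  'div#prompt-textarea.ProseMirror',
  '#prompt-textarea[contenteditable="true"]',
  'textarea#prompt-textarea'
];

// Returns { element, selector } for the first selector that matches, or null.
// `root` is anything with a querySelector method, normally `document`.
function findPromptElement(root, selectors) {
  var list = selectors || PROMPT_SELECTORS;
  for (var i = 0; i < list.length; i++) {
    var el = root.querySelector(list[i]);
    if (el) return { element: el, selector: list[i] };
  }
  return null;
}

// In a content script:
//   var hit = findPromptElement(document);
//   if (hit) console.log('Prompt matched by', hit.selector);
```

Knowing which selector matched makes it easier to tell a markup change apart from a bug in the insertion code.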
### **B. Modify the Text Insertion Method**
**Current Method:**
```javascript
// `textarea` and `setNativeValue` are assumed to be defined elsewhere in the extension
function CN_SetTextareaValue(text) {
  setNativeValue(textarea, text);
  textarea.dispatchEvent(new Event('input', { bubbles: true }));
}
```
**Issue:** This function sets the `value` of a `<textarea>`, which doesn't affect the `contenteditable` `<div>`.
**Solution:** Update the function to set the text of the `contenteditable` `<div>`. You can use `innerText`, `textContent`, or ProseMirror's APIs to insert text.
**Updated Function:**
```javascript
function CN_SetPromptDivValue(text) {
  var promptDiv = jQuery("#prompt-textarea")[0];
  if (!promptDiv) {
    console.error('Prompt div not found');
    return;
  }
  // Clear existing content
  promptDiv.innerHTML = '';
  // Insert new text
  promptDiv.innerText = text;
  // Dispatch input event to notify ProseMirror of the change
  promptDiv.dispatchEvent(new Event('input', { bubbles: true }));
}
```
**Explanation:**
1. **Clear Existing Content:**
- `promptDiv.innerHTML = '';` ensures that previous content is removed before inserting new text.
2. **Insert New Text:**
- `promptDiv.innerText = text;` sets the text content of the `div`. Alternatively, you can use `promptDiv.textContent = text;` depending on your requirements.
3. **Dispatch Input Event:**
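- This is meant to tell the page that the content has changed, so that any associated event listeners or bindings are triggered. ProseMirror keeps its own document state, so confirm in the browser that the editor actually registers the inserted text.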
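The HTML quoted earlier shows the editor holding its content in `<p>` children, so another option is to build one paragraph per line and assign that markup instead of raw text. This is a sketch only: `escapeHtml` and `textToParagraphHtml` are hypothetical helpers, and whether the ChatGPT editor accepts markup assigned this way is an assumption that needs testing.

```javascript
// Escape the characters that are significant in HTML text content,
// so a transcript containing "<" or "&" is never parsed as markup.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Convert plain text into one <p> per line; empty lines become <p><br></p>.
function textToParagraphHtml(text) {
  return String(text)
    .split(/\r?\n/)
    .map(function (line) {
      return line === '' ? '<p><br></p>' : '<p>' + escapeHtml(line) + '</p>';
    })
    .join('');
}

// Possible use inside the insertion function:
//   promptDiv.innerHTML = textToParagraphHtml(text);
```

Escaping first keeps the convenience of `innerHTML` without letting recognized speech inject markup.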
### **C. Update All References to the Textarea**
Ensure that **all** parts of the extension that reference the `<textarea>` are updated to interact with the `contenteditable` `<div>`. This includes:
1. **Sending Messages:**
- If the extension programmatically sends messages by simulating button clicks or triggering events, ensure it interacts with the `div` correctly.
2. **Event Listeners:**
- If there are event listeners attached to the `<textarea>`, update them to listen to the `div` instead.
### **D. Example: Updating the Send Functionality**
**Current Send Function:**
```javascript
function CN_SendMessage(text) {
  CN_SetTextareaValue(text);
  // Click the send button
  jQuery("#send-button").click();
}
```
**Issue:** It sets the value of the hidden `<textarea>` and attempts to click a button that may have a different selector.
**Solution:** Update the send functionality to interact with the `contenteditable` `<div>` and ensure the send button is correctly selected.
**Updated Send Function:**
```javascript
function CN_SendMessage(text) {
  CN_SetPromptDivValue(text);
  // Ensure the send button selector is correct
  var sendButton = jQuery("button[aria-label='Send prompt']")[0];
  if (sendButton) {
    sendButton.click();
  } else {
    console.error('Send button not found');
  }
}
```
**Explanation:**
1. **Set Text in Div:**
- Uses the updated `CN_SetPromptDivValue` to insert text into the `contenteditable` div.
2. **Select Send Button:**
- Uses a more specific selector: `button[aria-label='Send prompt']` to find the send button.
3. **Click Send Button:**
- Programmatically clicks the send button to submit the message.
### **E. Verify the Send Button Selector**
In your provided HTML, the send button is defined as:
```html
<button disabled="" aria-label="Send prompt" data-testid="send-button" class="...">
  <!-- SVG Icon -->
</button>
```
Ensure that the selector accurately targets this button. For example:
```javascript
var sendButton = jQuery("button[aria-label='Send prompt']")[0];
```
If there are multiple buttons with similar attributes, make the selector more specific.
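Note that the button in the quoted HTML carries `disabled=""`, and clicking a disabled button does nothing. Presumably the page enables it once the editor has content, but that is an assumption. The sketch below (hypothetical helpers `findSendButton` and `trySend`) reports the button's state instead of clicking blindly, so the caller can retry after a short delay.

```javascript
// Hypothetical helper: look the button up by data-testid first, then aria-label.
// Both attribute values come from the HTML quoted in this thread.
function findSendButton(root) {
  return root.querySelector('button[data-testid="send-button"]') ||
         root.querySelector('button[aria-label="Send prompt"]');
}

// Returns 'missing', 'disabled', or 'clicked' so the caller can decide to retry.
// `root` is anything with a querySelector method, normally `document`.
function trySend(root) {
  var button = findSendButton(root);
  if (!button) return 'missing';
  if (button.disabled) return 'disabled';
  button.click();
  return 'clicked';
}

// In a content script, retry briefly if the page has not enabled the button yet:
//   if (trySend(document) === 'disabled') {
//     setTimeout(function () { trySend(document); }, 300);
//   }
```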
### **F. Test the Updated Extension**
1. **Reload the Extension:**
- After making changes to the extension's code, reload it in the browser:
- **Chrome:** Go to `chrome://extensions/`, enable **Developer mode**, and click **Reload** on your extension.
- **Edge:** Go to `edge://extensions/`, enable **Developer mode**, and click **Reload**.
2. **Clear Cache (Optional but Recommended):**
- Sometimes, cached scripts can interfere. Clear the browser cache to ensure the latest code is executed.
3. **Test the Functionality:**
- Activate the extension and perform a speech-to-text operation.
- Verify that the prompt area is now populated with the converted text.
- Check if the send button is triggered correctly.
4. **Monitor Console for Errors:**
- Open **Developer Tools** (`Ctrl+Shift+I` or `Cmd+Option+I`).
- Navigate to the **Console** tab.
- Look for any error messages related to the extension. Address them as needed.
## **4. Additional Recommendations**
### **A. Utilize ProseMirror's APIs (Advanced)**
Since the prompt area is powered by **ProseMirror**, leveraging its APIs can offer more reliable interactions, especially for complex text manipulations.
**Example:**
```javascript
function CN_SetPromptDivValue(text) {
  var promptDiv = jQuery("#prompt-textarea")[0];
  if (!promptDiv) {
    console.error('Prompt div not found');
    return;
  }
  // `pmView` is a hypothetical property: this branch only runs if the page
  // exposes the ProseMirror editor view on the element under that name
  if (promptDiv.pmView && promptDiv.pmView.dispatch) {
    // With no position arguments, insertText replaces the current selection
    var transaction = promptDiv.pmView.state.tr.insertText(text);
    promptDiv.pmView.dispatch(transaction);
  } else {
    // Fallback to setting innerText
    promptDiv.innerHTML = '';
    promptDiv.innerText = text;
    promptDiv.dispatchEvent(new Event('input', { bubbles: true }));
  }
}
```
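**Note:** `pmView` is a placeholder property name, not something ProseMirror is documented to attach to the editor element. This approach only works if the page exposes its ProseMirror editor view somewhere reachable from the extension; if it does not, the fallback branch runs, and you might need to explore the page's JavaScript to find another way to interact with ProseMirror.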
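If the editor view is not reachable, a middle ground is to go through the browser's own editing commands so that the editor sees the change much as it would see typing. The sketch below uses `document.execCommand`, which is deprecated but still widely implemented; `insertTextLikeUser` is a hypothetical name, and whether ChatGPT's editor handles the command as expected is an assumption to verify.

```javascript
// Replace the editor's content through the browser's editing commands.
// `doc` is normally `document`; `el` is the contenteditable element.
// Returns true if the browser reports that the insert command was applied.
function insertTextLikeUser(doc, el, text) {
  if (!doc || !el || typeof doc.execCommand !== 'function') return false;
  el.focus();                                         // commands act on the focused editor
  doc.execCommand('selectAll', false, null);          // select the existing content
  return doc.execCommand('insertText', false, text);  // typed text replaces the selection
}

// In a content script:
//   insertTextLikeUser(document, document.querySelector('#prompt-textarea'), 'hello');
```

Because the insert replaces the selection made by `selectAll`, no separate clearing step is needed.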
### **B. Ensure Proper Permissions**
Make sure that your extension has the necessary permissions to interact with the page's DOM:
- **Content Scripts:**
- If your extension uses content scripts, ensure they are injected into the correct pages.
- **Permissions in `manifest.json`:**
- Verify that the `permissions` section includes access to the necessary URLs or domains.
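**Example `manifest.json`** (the `matches` URL is a placeholder; also check each `permissions` entry against Chrome's documented permission list, since a name it does not recognize, such as `microphone`, is reported as unknown when the extension loads):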
```json
{
  "name": "Talk-to-ChatGPT",
  "version": "2.9.0",
  "manifest_version": 3,
  "permissions": [
    "activeTab",
    "scripting",
    "storage",
    "microphone"
  ],
  "content_scripts": [
    {
      "matches": ["https://your-custom-gpt-page.com/*"],
      "js": ["contentScript.js"]
    }
  ],
  ...
}
```
### **C. Debugging Steps**
1. **Console Logs:**
- Add `console.log` statements at critical points in your extension's code to trace execution.
2. **Error Handling:**
- Ensure that all possible errors are caught and logged, aiding in identifying issues.
3. **Isolation Testing:**
- Temporarily disable other extensions to rule out conflicts.
4. **Browser Compatibility:**
- Test the extension across different browsers (if supported) to see if the issue is browser-specific.
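The first two debugging steps can be combined in a small wrapper that logs each call and catches exceptions, so a failure in one handler shows up in the console instead of silently stopping the flow. This is a sketch; `withLogging` is a hypothetical helper, and the optional `logger` argument exists only so that something other than `console` can be used.

```javascript
// Wrap a function so each call is logged with a tag and any exception
// is logged rather than propagating. Returns undefined when the call fails.
function withLogging(tag, fn, logger) {
  var log = logger || console;
  return function () {
    log.log('[' + tag + '] called');
    try {
      return fn.apply(this, arguments);
    } catch (err) {
      log.error('[' + tag + '] failed: ' + (err && err.message ? err.message : err));
      return undefined;
    }
  };
}

// Example: wrap the extension's send function once, then call it as usual.
//   CN_SendMessage = withLogging('SEND', CN_SendMessage);
```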
### **D. Update Extension Documentation**
If you've made changes to the extension's functionality or requirements, update any accompanying documentation to reflect these changes. This ensures future maintenance is smoother and other users (if any) can understand the modifications.
## **5. Example: Comprehensive Function Update**
Here's an example of how you might comprehensively update a function responsible for inserting text and sending the message:
```javascript
// Updated function to set text in the ProseMirror div and send the message
function CN_SendMessage(text) {
  var promptDiv = jQuery("#prompt-textarea")[0];
  if (!promptDiv) {
    console.error('Prompt div not found');
    return;
  }
  // Insert text into the ProseMirror editor
  promptDiv.innerHTML = ''; // Clear existing content
  promptDiv.innerText = text; // Set new text
  promptDiv.dispatchEvent(new Event('input', { bubbles: true })); // Notify ProseMirror
  // Find and click the send button
  var sendButton = jQuery("button[aria-label='Send prompt']")[0];
  if (sendButton) {
    sendButton.click();
  } else {
    console.error('Send button not found');
  }
}

// Example usage after speech-to-text conversion
function CN_HandleSpeechResult(finalTranscript) {
  console.log("[SPEECH-REC] Final Transcript: " + finalTranscript);
  // Populate the prompt area
  CN_SendMessage(finalTranscript);
}
```
**Explanation:**
1. **Clear Existing Content:**
- Ensures that previous messages don't interfere with the new input.
2. **Insert New Text:**
- Uses `innerText` to safely insert plain text. If you need to preserve formatting, consider `innerHTML`, but ensure the text is sanitized to prevent XSS vulnerabilities.
3. **Dispatch Input Event:**
- Notifies ProseMirror of the content change, allowing it to update its internal state.
4. **Click Send Button:**
- Programmatically triggers the send action to submit the message.
## **6. Verifying the Fix**
After implementing the updates:
1. **Reload the Extension:**
- Navigate to `chrome://extensions/` (or the equivalent in your browser).
- Enable **Developer mode**.
- Click **Reload** on the Talk-to-ChatGPT extension.
2. **Test Speech-to-Text:**
- Activate the speech recognition.
- Speak a message and observe if the prompt area is populated correctly.
- Ensure that the send button is triggered, and the message is submitted to ChatGPT.
3. **Monitor Console for Logs and Errors:**
- Open **Developer Tools**.
- Check the **Console** for any error messages or logs that indicate the flow of execution.
4. **Iterate as Needed:**
- If issues persist, use the console logs to pinpoint where the process breaks and adjust accordingly.
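To make the "observe if the prompt area is populated correctly" step less manual, the extension could compare the editor's visible text with the transcript right after inserting it. The helpers below are a sketch with hypothetical names (`normalizeText`, `promptContains`); they collapse whitespace because a rich text editor may add line breaks of its own.

```javascript
// Collapse runs of whitespace so editor-added line breaks don't cause false mismatches.
function normalizeText(s) {
  return String(s == null ? '' : s).replace(/\s+/g, ' ').trim();
}

// True when the element's visible text matches what we tried to insert.
function promptContains(el, expected) {
  if (!el) return false;
  var actual = el.innerText != null ? el.innerText : el.textContent;
  return normalizeText(actual) === normalizeText(expected);
}

// After inserting the transcript:
//   var el = document.querySelector('#prompt-textarea');
//   if (!promptContains(el, finalTranscript)) {
//     console.error('Prompt area does not contain the transcript');
//   }
```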
## **7. Final Thoughts**
Transitioning from a `<textarea>` to a `contenteditable` `<div>` introduces complexities, especially when interacting programmatically. By updating your selectors and methods to align with the new HTML structure, you can restore the extension's functionality. Additionally, leveraging ProseMirror's APIs (if accessible) can offer more robust interactions.
If you encounter further issues or need more detailed assistance with specific parts of the extension's code, feel free to share those sections, and I'll be happy to help!
@hoshizorista We would love to have you as a guest on our podcast. Let me know if that's something that you would be interested in. We sincerely appreciate your efforts with maintaining this extension and would love to talk to you about it.
Hi @hoshizorista, I also really appreciate you working on this. I've been using the extension as a way to experiment with ChatGPT in performance, so getting it back online would be amazing. There isn't really a comparable tool. Also, @StudioDweller, I would love to know more about where to find your podcast; it sounds interesting.
@nowallslive I do a weekly podcast called Up Against Reality on all things AI with my co-host Chris, and we leverage this extension for realtime interaction with our AI co-host/custom GPT we call RAINA. The podcast is available on most of the major podcast platforms. This episode is a good showcase of realtime interactions using this extension. Thanks for your interest! https://upagainstreality.com/2024/03/12/rainas-20-questions/
@StudioDweller It would be an honor :) @nowallslive My pleasure! I just released the update on my fork and added some new functions. I'm praying it's not too buggy, haha. Please check it out and let me know if it works for you guys: https://github.com/hoshizorista/talkgpt/tree/main Just download the extension from the latest release, decompress it, make sure to uninstall all previous versions, then install and enjoy! Let me know if you have any issues. You can refer to my fork; it looks like C-Nedelcu has already moved on from this, so I'll keep working on my fork to keep his work alive for all of us!
@hoshizorista 👍 Hello friend, first of all, thank you so much for your incredible work on the extension! I've really been enjoying its features, and I'm excited about the new updates you mentioned, like autostart and the base GPT voices. I wanted to recommend a service that might interest you: Fish.audio. It's quite similar to Eleven Labs, but much more affordable and accessible. They offer 50 free uses per day, and their API pricing for cloned voices is significantly lower compared to Eleven Labs. This service allows you to use a high-quality voice through its API at a very low cost for much longer periods, which could be a more budget-friendly option compared to Eleven Labs. Eleven Labs can get expensive quickly and is hard to use for ongoing conversations with the chatbot because it consumes too much money, making it impractical for constant use; it is more suited to specific, occasional moments. That's why I'd like to recommend Fish Audio, as I use it for my projects, and it works really well for me. Additionally, they offer an API that could be adapted to your extension. I think it would be awesome to have the option to integrate Fish Audio into your ChatGPT extension for Text-to-Speech (TTS). The quality is excellent, the latency is very low, and the pricing is very competitive. Here's the link to their site: https://fish.audio/, and here is their documentation: https://docs.fish.audio/. I really believe this would be a great addition to the extension, and I'm sure many users would appreciate it. If you're interested in exploring the integration, I'd love to support the project, and I'd be happy to make a donation to your PayPal as a token of appreciation for adding this feature. I'm confident many users would be excited about the idea as well! Thanks for your time, and keep up the great work. Cheers!
@enrix507 Hey! Sounds like a good idea! I haven't heard of it, but if it supports streaming it's very likely it can be added. I'm going to look into it!
It's not putting the text in the message box to send. I received an email from OpenAI yesterday saying that the default model is changing to "gpt-4o-2024-08-06". I wouldn't think that would break it, but I guess they changed something on the page.