Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Mismatch Structs on STT REST #467

Merged
merged 1 commit into from
Sep 26, 2024

Conversation

davidvonthenen
Copy link
Contributor

@davidvonthenen davidvonthenen commented Sep 26, 2024

Proposed changes

Addresses: #466

Verified working modifying an example:
examples/speech-to-text/rest/url

Options:

        options: PrerecordedOptions = PrerecordedOptions(
            model="nova-2",
            smart_format=True,
            paragraphs=True,
        )

Result:

response: {
    "metadata": {
        "transaction_key": "deprecated",
        "request_id": "3c6973b5-d755-4687-97db-938530c82bd6",
        "sha256": "5324da68ede209a16ac69a38e8cd29cee4d754434a041166cda3a1f5e0b24566",
        "created": "2024-09-26T01:37:48.494Z",
        "duration": 17.566313,
        "channels": 1,
        "models": [
            "30089e05-99d1-4376-b32e-c263170674af"
        ],
        "model_info": {
            "30089e05-99d1-4376-b32e-c263170674af": {
                "name": "2-general-nova",
                "version": "2024-01-09.29447",
                "arch": "nova-2"
            }
        }
    },
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "Yep. I said it before, and I'll say it again. Life moves pretty fast. You don't stop and look around once in a while, you could miss it.",
                        "confidence": 0.99853814,
                        "words": [
                            {
                                "word": "yep",
                                "start": 5.52,
                                "end": 6.02,
                                "confidence": 0.995842,
                                "punctuated_word": "Yep."
                            },
                            {
                                "word": "i",
                                "start": 7.04,
                                "end": 7.2799997,
                                "confidence": 0.51270777,
                                "punctuated_word": "I"
                            },
                            {
                                "word": "said",
                                "start": 7.2799997,
                                "end": 7.44,
                                "confidence": 0.96725076,
                                "punctuated_word": "said"
                            },
                            {
                                "word": "it",
                                "start": 7.44,
                                "end": 7.6,
                                "confidence": 0.9997285,
                                "punctuated_word": "it"
                            },
                            {
                                "word": "before",
                                "start": 7.6,
                                "end": 7.9199996,
                                "confidence": 0.7845937,
                                "punctuated_word": "before,"
                            },
                            {
                                "word": "and",
                                "start": 7.9199996,
                                "end": 8.16,
                                "confidence": 0.9998627,
                                "punctuated_word": "and"
                            },
                            {
                                "word": "i'll",
                                "start": 8.16,
                                "end": 8.32,
                                "confidence": 0.9998943,
                                "punctuated_word": "I'll"
                            },
                            {
                                "word": "say",
                                "start": 8.32,
                                "end": 8.48,
                                "confidence": 0.9996518,
                                "punctuated_word": "say"
                            },
                            {
                                "word": "it",
                                "start": 8.48,
                                "end": 8.639999,
                                "confidence": 0.99982834,
                                "punctuated_word": "it"
                            },
                            {
                                "word": "again",
                                "start": 8.639999,
                                "end": 9.139999,
                                "confidence": 0.97370684,
                                "punctuated_word": "again."
                            },
                            {
                                "word": "life",
                                "start": 9.991312,
                                "end": 10.391313,
                                "confidence": 0.99573696,
                                "punctuated_word": "Life"
                            },
                            {
                                "word": "moves",
                                "start": 10.391313,
                                "end": 10.711312,
                                "confidence": 0.9988587,
                                "punctuated_word": "moves"
                            },
                            {
                                "word": "pretty",
                                "start": 10.711312,
                                "end": 11.031313,
                                "confidence": 0.9996013,
                                "punctuated_word": "pretty"
                            },
                            {
                                "word": "fast",
                                "start": 11.031313,
                                "end": 11.531313,
                                "confidence": 0.9995537,
                                "punctuated_word": "fast."
                            },
                            {
                                "word": "you",
                                "start": 12.071312,
                                "end": 12.231313,
                                "confidence": 0.9514584,
                                "punctuated_word": "You"
                            },
                            {
                                "word": "don't",
                                "start": 12.231313,
                                "end": 12.4713125,
                                "confidence": 0.99988735,
                                "punctuated_word": "don't"
                            },
                            {
                                "word": "stop",
                                "start": 12.4713125,
                                "end": 12.711312,
                                "confidence": 0.99979633,
                                "punctuated_word": "stop"
                            },
                            {
                                "word": "and",
                                "start": 12.711312,
                                "end": 12.871312,
                                "confidence": 0.9987136,
                                "punctuated_word": "and"
                            },
                            {
                                "word": "look",
                                "start": 12.871312,
                                "end": 13.031313,
                                "confidence": 0.9996673,
                                "punctuated_word": "look"
                            },
                            {
                                "word": "around",
                                "start": 13.031313,
                                "end": 13.351313,
                                "confidence": 0.9995766,
                                "punctuated_word": "around"
                            },
                            {
                                "word": "once",
                                "start": 13.351313,
                                "end": 13.591312,
                                "confidence": 0.9981976,
                                "punctuated_word": "once"
                            },
                            {
                                "word": "in",
                                "start": 13.591312,
                                "end": 13.751312,
                                "confidence": 0.99853814,
                                "punctuated_word": "in"
                            },
                            {
                                "word": "a",
                                "start": 13.751312,
                                "end": 13.831312,
                                "confidence": 0.9861081,
                                "punctuated_word": "a"
                            },
                            {
                                "word": "while",
                                "start": 13.831312,
                                "end": 14.331312,
                                "confidence": 0.92629486,
                                "punctuated_word": "while,"
                            },
                            {
                                "word": "you",
                                "start": 14.631312,
                                "end": 14.791312,
                                "confidence": 0.99702364,
                                "punctuated_word": "you"
                            },
                            {
                                "word": "could",
                                "start": 14.791312,
                                "end": 14.951313,
                                "confidence": 0.9983536,
                                "punctuated_word": "could"
                            },
                            {
                                "word": "miss",
                                "start": 14.951313,
                                "end": 15.191313,
                                "confidence": 0.9984425,
                                "punctuated_word": "miss"
                            },
                            {
                                "word": "it",
                                "start": 15.191313,
                                "end": 15.691313,
                                "confidence": 0.9950415,
                                "punctuated_word": "it."
                            }
                        ],
                        "paragraphs": {
                            "transcript": "\nYep. I said it before, and I'll say it again. Life moves pretty fast. You don't stop and look around once in a while, you could miss it.",
                            "paragraphs": [
                                {
                                    "sentences": [
                                        {
                                            "text": "Yep.",
                                            "start": 5.52,
                                            "end": 6.02
                                        },
                                        {
                                            "text": "I said it before, and I'll say it again.",
                                            "start": 7.04,
                                            "end": 9.139999
                                        },
                                        {
                                            "text": "Life moves pretty fast.",
                                            "start": 9.991312,
                                            "end": 11.531313
                                        },
                                        {
                                            "text": "You don't stop and look around once in a while, you could miss it.",
                                            "start": 12.071312,
                                            "end": 15.691313
                                        }
                                    ],
                                    "start": 5.52,
                                    "end": 15.691313,
                                    "num_words": 28
                                }
                            ]
                        }
                    }
                ]
            }
        ]
    }
}

Types of changes

What types of changes does your code introduce to the community Python SDK?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update or tests (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have lint'ed all of my code using repo standards
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

NA

Summary by CodeRabbit

  • New Features

    • Introduced new entities for handling WebSocket and REST responses, including ListenWSWord, ListenWSAlternative, ListenWSChannel, and their REST counterparts.
    • Enhanced data representation for WebSocket responses, improving the handling of words, alternatives, and channels.
  • Bug Fixes

    • Clarified the handling of _asyncio_thread in the Microphone class for better state management.
  • Refactor

    • Streamlined import statements across various modules to focus on new listening functionalities, removing outdated entities.

Copy link
Contributor

coderabbitai bot commented Sep 26, 2024

Walkthrough

The changes involve a comprehensive restructuring of import statements and class definitions across multiple files in the Deepgram client library. The modifications primarily focus on removing outdated entities such as Alternative, Channel, and Word, and replacing them with new entities related to WebSocket and REST functionalities, including ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord. This shift indicates a reorganization of the module's structure to enhance clarity and functionality.

Changes

File Path Change Summary
deepgram/__init__.py Removed imports for Alternative, Channel, and Word; added imports for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/audio/microphone/microphone.py Updated _asyncio_thread attribute to be Optional[threading.Thread] = None and simplified its handling in start and finish methods.
deepgram/client.py Removed imports for Alternative, Channel, and Word; added imports for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/__init__.py Removed imports for Alternative, Channel, and Word; added imports for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/common/__init__.py Removed exports for Word, Alternative, and Channel.
deepgram/clients/common/v1/__init__.py Removed imports for Word, Alternative, and Channel.
deepgram/clients/common/v1/shared_response.py Removed classes Word, Alternative, and Channel.
deepgram/clients/listen/__init__.py Removed exports for Alternative, Channel, and Word; added exports for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/listen/client.py Removed variables for Alternative, Channel, and Word; added variables for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/listen/v1/__init__.py Removed imports for Alternative, Channel, and Word; added imports for ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/listen/v1/rest/__init__.py Removed exports for Alternative, Channel, and Word; added exports for ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord.
deepgram/clients/listen/v1/rest/response.py Introduced new classes ListenRESTWord, ListenRESTAlternative, and ListenRESTChannel, replacing previous classes and updating attributes accordingly.
deepgram/clients/listen/v1/websocket/__init__.py Removed exports for Alternative, Channel, and Word; added exports for ListenWSAlternative, ListenWSChannel, and ListenWSWord.
deepgram/clients/listen/v1/websocket/response.py Introduced new classes ListenWSWord, ListenWSAlternative, and ListenWSChannel, updating the response structure to align with new naming conventions.

Possibly related PRs

  • Reintroduce TTS WS #434: Modifies the deepgram/__init__.py file by changing import statements related to WebSocket functionalities, aligning with the main PR's focus on updating import statements for WebSocket-related entities.
  • Re-expose Error Classes #462: Involves changes to the deepgram/__init__.py file, specifically enhancing error handling capabilities, which may relate to the overall restructuring of imports and functionalities in the main PR.

Suggested reviewers

  • jpvajda
  • SandraRodgers
  • lukeocodes

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (7)
deepgram/clients/listen/v1/__init__.py (2)

Line range hint 1-130: Excellent organization of imports, consider adding type comments

The restructuring of imports into clear sections (backward compat, shared, common between rest and websocket, etc.) greatly improves the file's readability and maintainability. This organization aligns well with the PR objective and provides a clear separation between REST and WebSocket functionalities.

To further enhance the code's clarity, consider adding type comments to the import statements. This can help developers quickly understand the nature of each imported entity without needing to refer to their definitions.

Example:

from .rest import (
    ListenRESTAlternative,  # type: class
    ListenRESTChannel,  # type: class
    ListenRESTWord,  # type: class
)

Line range hint 1-130: Consider updating documentation to reflect the restructured imports

The changes made to this file are part of a larger refactoring effort to distinguish between REST and WebSocket implementations. To ensure a smooth transition for developers using this SDK, it would be beneficial to update any relevant documentation to reflect these changes.

Specifically, consider:

  1. Updating any API documentation that references the old import structure.
  2. Adding a migration guide if these changes might affect existing code using the SDK.
  3. Updating examples in the README or other guides to use the new import structure.
deepgram/clients/__init__.py (1)

124-126: Expanded public API: Consider updating documentation

The additions to the __init__.py file expand the SDK's public API by providing more granular access to REST and WebSocket specific components. This change offers developers greater flexibility and control over REST and WebSocket interactions without breaking existing code.

Given these API additions:

  1. Please ensure that the new components (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord, ListenWSWord, ListenWSAlternative, and ListenWSChannel) are properly documented, including their purpose, usage, and any differences from their counterparts.
  2. Consider updating the SDK's main documentation to reflect these new capabilities and provide examples of when and how to use these more specific components.
  3. If there are any performance or functionality implications of using these new components over the existing ones, please highlight them in the documentation.

Also applies to: 153-155

deepgram/__init__.py (1)

124-125: LGTM: Commented out shared imports

The decision to comment out Alternative, Channel, and Word imports in the shared section is consistent with the PR objectives of addressing mismatched structs. This change likely prevents confusion between generic and API-specific versions of these components.

However, consider removing these commented-out imports if they are no longer needed to maintain code cleanliness. If you're keeping them for future reference, it might be worth adding a comment explaining why they are retained.

Also applies to: 137-137

deepgram/clients/listen/v1/websocket/response.py (2)

Line range hint 130-149: Correct the handling of channel and metadata in LiveResultResponse's __getitem__ method

The __getitem__ method treats channel and metadata as lists and attempts to iterate over them. However, in LiveResultResponse, channel is defined as a single ListenWSChannel instance, and metadata is a single Metadata instance.

This mismatch could lead to runtime errors. Update the code to handle these attributes as single instances.

Apply this diff to fix the issue:

 def __getitem__(self, key):
     _dict = self.to_dict()
     if "channel" in _dict:
-        _dict["channel"] = [
-            ListenWSChannel.from_dict(channel) for channel in _dict["channel"]
-        ]
+        _dict["channel"] = ListenWSChannel.from_dict(_dict["channel"])
     if "metadata" in _dict:
-        _dict["metadata"] = [
-            Metadata.from_dict(metadata) for metadata in _dict["metadata"]
-        ]
+        _dict["metadata"] = Metadata.from_dict(_dict["metadata"])
     return _dict[key]

31-92: Add unit tests for the new data classes

New classes (ListenWSWord, ListenWSAlternative, ListenWSChannel) have been added to handle WebSocket responses. However, there are no accompanying unit tests to verify their functionality.

To ensure these classes work as intended, please add unit tests that cover their initialization, methods, and interactions. Would you like me to help generate sample unit tests or open a GitHub issue to track this task?

deepgram/clients/listen/v1/rest/response.py (1)

Line range hint 321-344: Consider refactoring ListenRESTAlternative to reduce instance attributes

Similar to ListenRESTWord, the ListenRESTAlternative class has a large number of instance attributes, leading to a disabled pylint warning. Refactoring could enhance code maintainability by reducing complexity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ee4a581 and e20766d.

📒 Files selected for processing (14)
  • deepgram/init.py (4 hunks)
  • deepgram/audio/microphone/microphone.py (3 hunks)
  • deepgram/client.py (2 hunks)
  • deepgram/clients/init.py (2 hunks)
  • deepgram/clients/common/init.py (0 hunks)
  • deepgram/clients/common/v1/init.py (0 hunks)
  • deepgram/clients/common/v1/shared_response.py (0 hunks)
  • deepgram/clients/listen/init.py (2 hunks)
  • deepgram/clients/listen/client.py (4 hunks)
  • deepgram/clients/listen/v1/init.py (2 hunks)
  • deepgram/clients/listen/v1/rest/init.py (1 hunks)
  • deepgram/clients/listen/v1/rest/response.py (9 hunks)
  • deepgram/clients/listen/v1/websocket/init.py (1 hunks)
  • deepgram/clients/listen/v1/websocket/response.py (3 hunks)
💤 Files not reviewed due to no reviewable changes (3)
  • deepgram/clients/common/init.py
  • deepgram/clients/common/v1/init.py
  • deepgram/clients/common/v1/shared_response.py
🧰 Additional context used
📓 Learnings (1)
deepgram/clients/listen/v1/__init__.py (2)
Learnt from: dvonthenen
PR: deepgram/deepgram-python-sdk#426
File: deepgram/clients/listen/v1/__init__.py:36-43
Timestamp: 2024-07-01T19:17:04.194Z
Learning: Unused imports in `deepgram/clients/listen/v1/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
Learnt from: dvonthenen
PR: deepgram/deepgram-python-sdk#426
File: deepgram/clients/listen/v1/websocket/__init__.py:8-8
Timestamp: 2024-07-01T19:21:39.778Z
Learning: Unused imports in `deepgram/clients/listen/v1/websocket/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
🔇 Additional comments not posted (25)
deepgram/clients/listen/v1/rest/__init__.py (1)

56-58: LGTM! Verify usage of new entity names.

The addition of ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord appears to be part of a refactoring effort to make the API more explicit and avoid potential naming conflicts. This change aligns with the PR objective of fixing mismatched structs in the STT REST API.

To ensure consistency, let's verify that the old entity names are not used elsewhere in the codebase:

If the script returns any usage of the old entity names or no usage of the new entity names, please update the relevant files to use the new ListenREST* prefixed entities.

✅ Verification successful

LGTM! All usages of old entity names have been successfully replaced with the new ones.

The refactoring aligns with the PR objective of fixing mismatched structs in the STT REST API. The old entity names are no longer used in executable code.

  • No further actions are required unless you choose to clean up the residual comments.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining usage of old entity names

# Test: Search for usage of old entity names. Expect: No results
echo "Searching for usage of 'Alternative':"
rg --type python -g '!**/rest/__init__.py' '\bAlternative\b'

echo "Searching for usage of 'Channel':"
rg --type python -g '!**/rest/__init__.py' '\bChannel\b'

echo "Searching for usage of 'Word':"
rg --type python -g '!**/rest/__init__.py' '\bWord\b'

# Test: Verify new entity names are used in other files. Expect: Some results
echo "Verifying usage of new entity names:"
rg --type python -g '!**/rest/__init__.py' 'ListenRESTAlternative|ListenRESTChannel|ListenRESTWord'

Length of output: 737


Script:

#!/bin/bash
# Description: Check for any remaining usage of old entity names with corrected file type

# Test: Search for usage of old entity names. Expect: No results
echo "Searching for usage of 'Alternative':"
rg --type py -g '!**/rest/__init__.py' '\bAlternative\b'

echo "Searching for usage of 'Channel':"
rg --type py -g '!**/rest/__init__.py' '\bChannel\b'

echo "Searching for usage of 'Word':"
rg --type py -g '!**/rest/__init__.py' '\bWord\b'

# Test: Verify new entity names are used in other files. Expect: Some results
echo "Verifying usage of new entity names:"
rg --type py -g '!**/rest/__init__.py' 'ListenRESTAlternative|ListenRESTChannel|ListenRESTWord'

Length of output: 4194


Script:

#!/bin/bash
# Description: Use ast-grep to search for actual code usage of old entity names

# Search for class instantiations or variable assignments using old entity names
echo "Searching for code usage of 'Alternative':"
ast-grep --lang python --pattern $'$_ = Alternative($_)'

echo "Searching for code usage of 'Channel':"
ast-grep --lang python --pattern $'$_ = Channel($_)'

echo "Searching for code usage of 'Word':"
ast-grep --lang python --pattern $'$_ = Word($_)'

# Additionally, search for function arguments or returns using old entity names
echo "Searching for old entity names in function signatures and returns:"
ast-grep --lang python --pattern $'def $_(self, alternative: Alternative):'

ast-grep --lang python --pattern $'def $_(self) -> Channel:'

Length of output: 725

deepgram/clients/listen/__init__.py (1)

70-72: LGTM! Improved clarity in entity naming.

The addition of specific REST and WebSocket entities (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord, ListenWSWord, ListenWSAlternative, ListenWSChannel) aligns well with the PR objectives. This change improves clarity by differentiating between REST and WebSocket-related components.

To ensure these changes don't introduce breaking changes elsewhere in the codebase, please run the following script:

Also applies to: 97-99

✅ Verification successful

Verification Successful: No usages of old entities found and new entities are correctly used.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining usage of the old entities (Alternative, Channel, Word) in the codebase.

# Test: Search for usage of old entities. Expect: No results, indicating all usages have been updated.
rg --type python 'from .*import.*\b(Alternative|Channel|Word)\b' || echo "No usages found."

# Test: Verify that the new entities are being used correctly. Expect: Results showing correct usage of new entities.
rg --type python '\b(ListenRESTAlternative|ListenRESTChannel|ListenRESTWord|ListenWSWord|ListenWSAlternative|ListenWSChannel)\b'

Length of output: 302


Script:

#!/bin/bash
# Description: Re-check for any remaining usage of the old entities (Alternative, Channel, Word) in the codebase with corrected file type.

# Test: Search for usage of old entities. Expect: No results, indicating all usages have been updated.
rg --type py 'from .*import.*\b(Alternative|Channel|Word)\b' || echo "No usages found."

# Test: Verify that the new entities are being used correctly. Expect: Results showing correct usage of new entities.
rg --type py '\b(ListenRESTAlternative|ListenRESTChannel|ListenRESTWord|ListenWSWord|ListenWSAlternative|ListenWSChannel)\b'

Length of output: 5278

deepgram/clients/listen/v1/__init__.py (1)

99-101: Approve restructuring of imports for REST and WebSocket specific classes

The changes to import statements improve the structure by clearly differentiating between REST and WebSocket specific classes (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord, ListenWSWord, ListenWSAlternative, ListenWSChannel). This aligns well with the PR objective of fixing mismatch structs on STT REST and provides a more explicit API design.

To ensure backward compatibility is maintained, please run the following script:

If the script returns any results, it may indicate potential breaking changes that need to be addressed.

Also applies to: 128-130

deepgram/clients/listen/client.py (4)

59-61: LGTM: New REST-specific aliases added

The addition of ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord aligns well with the PR objective of fixing mismatches in structs for the STT REST API. These new aliases maintain consistency with the existing naming conventions and contribute to a clearer separation between REST and WebSocket implementations.


83-85: LGTM: WebSocket-specific aliases added for consistency

The addition of ListenWSAlternative, ListenWSChannel, and ListenWSWord complements the earlier REST-specific aliases. This maintains consistency in the separation of REST and WebSocket implementations throughout the SDK. The naming convention aligns well with existing WebSocket-specific entities.


155-157: LGTM: Proper versioning of new aliases

The assignment of the new REST and WebSocket-specific aliases to their respective 'Latest' versions (ListenRESTAlternativeLatest, ListenRESTChannelLatest, ListenRESTWordLatest, ListenWSAlternativeLatest, ListenWSChannelLatest, ListenWSWordLatest) is consistent with the file's approach to version management. This practice facilitates easier updates and maintains backwards compatibility.

Also applies to: 172-174


59-61: Summary: Improved structure for REST and WebSocket entities

The changes introduced in this PR effectively address the objective of fixing mismatches in structs for the STT REST API. By adding separate aliases for REST and WebSocket-specific entities (Alternative, Channel, and Word), the code now has a clearer separation of concerns between these two communication methods.

Key improvements:

  1. Better organization of REST and WebSocket-specific components.
  2. Consistent naming conventions (ListenREST* and ListenWS*).
  3. Proper versioning through the use of *Latest aliases.

These changes should enhance the maintainability of the SDK and make it easier for developers to work with the appropriate structures for each communication method. The modifications are non-breaking and align well with the existing code structure.

Also applies to: 83-85, 155-157, 172-174

deepgram/clients/__init__.py (2)

124-126: LGTM: Addition of ListenREST imports aligns with PR objectives

The addition of ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord imports is consistent with the PR's goal of addressing mismatched structs in the STT REST API. These new imports likely provide more accurate representations of the API response structure, which should help resolve the issue mentioned in #466.


153-155: Approve WebSocket imports, but clarification needed

The addition of ListenWSWord, ListenWSAlternative, and ListenWSChannel imports appears to be a parallel update to the WebSocket API handling. While these changes seem beneficial for maintaining consistency between REST and WebSocket implementations, they weren't explicitly mentioned in the PR objectives.

Could you please clarify how these WebSocket-related changes relate to the "Fix Mismatch Structs on STT REST" objective? Are they part of a broader effort to standardize structures across both REST and WebSocket APIs?

deepgram/__init__.py (3)

92-94: LGTM: WebSocket-specific imports added

The addition of ListenWSAlternative, ListenWSChannel, and ListenWSWord imports aligns with the PR objectives of addressing mismatched structs. This change introduces a more specific naming convention for WebSocket-related components, which should improve code organization and reduce potential naming conflicts.


154-156: LGTM: REST-specific imports added

The addition of ListenRESTAlternative, ListenRESTChannel, and ListenRESTWord imports is consistent with the PR objectives and complements the earlier changes. These REST-specific versions follow the same naming convention as the WebSocket-specific versions, improving code organization and clarity.

This change, along with the earlier modifications, effectively addresses the issue of mismatched structs mentioned in the PR objectives.


92-94: Summary: Improved organization and separation of WebSocket and REST components

The changes in this file effectively address the PR objectives by:

  1. Adding WebSocket-specific imports (ListenWSAlternative, ListenWSChannel, ListenWSWord)
  2. Commenting out generic shared imports (Alternative, Channel, Word)
  3. Adding REST-specific imports (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord)

These modifications improve the organization of the codebase by clearly separating WebSocket and REST-specific components. The consistent naming convention enhances code readability and maintainability. Overall, these changes successfully address the issue of mismatched structs mentioned in the PR objectives.

Also applies to: 124-125, 137-137, 154-156

deepgram/audio/microphone/microphone.py (4)

36-36: Improved type annotation for _asyncio_thread

The change from threading.Thread to Optional[threading.Thread] = None is a good improvement. It accurately reflects that _asyncio_thread can be None in some cases, which aligns with its usage in the class. This change enhances type safety and makes the code more self-documenting.


145-145: Explicit assignment of None to _asyncio_thread

Setting self._asyncio_thread = None when using a regular threaded callback is a good practice. It ensures consistency with the new type annotation and clearly indicates that no asyncio thread is being used in this case. This change improves code clarity and helps prevent potential issues with leftover thread references.


271-271: Removal of type ignore comment

The removal of the type ignore comment from self._asyncio_thread = None is appropriate given the new type annotation. This change simplifies the code by removing unnecessary comments and indicates that the type checker should now correctly handle this assignment without warnings. It's a good cleanup that improves code readability.


Line range hint 36-271: Potential discrepancy between PR objectives and implemented changes

The changes in this file improve the handling of the _asyncio_thread attribute in the Microphone class, enhancing type safety and code clarity. However, these modifications don't seem to directly address the stated PR objective of "Fix Mismatch Structs on STT REST" or the changes mentioned in the PR summary regarding PrerecordedOptions struct.

Could you please clarify how these changes in the Microphone class relate to the PR objectives? If these are part of a larger refactoring effort or if there are additional changes in other files that address the struct mismatches, it would be helpful to mention this in the PR description for better context.

deepgram/client.py (2)

86-88: Improved structure with separate WebSocket and REST entities

The addition of specific WebSocket (WS) and REST entities for Listen functionality (Alternative, Channel, Word) improves the code organization and aligns with the PR objective of fixing mismatch structs on STT REST. This separation will likely make it easier to maintain different implementations for WebSocket and REST.

Also applies to: 157-159


86-88: Verify usage of new WebSocket and REST entities

While the changes to the imports improve the code structure, it's important to ensure that these new entities (ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, ListenRESTWord) are correctly used throughout the codebase.

Please run the following script to check the usage of these new entities:

This script will help identify where the new entities are being used and if there are any remaining instances of the old entities that need to be updated.

Also applies to: 157-159

✅ Verification successful

Please run the following script to verify the usage of the new and deprecated entities:


Import Changes Verified Successfully

The new WebSocket and REST entities (ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, ListenRESTWord) are correctly used throughout the codebase. Deprecated entities are not present in active code and only appear in comments and examples.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check usage of new WebSocket and REST entities

# Check for WebSocket entities
echo "Checking WebSocket entities usage:"
rg --type python "ListenWSAlternative|ListenWSChannel|ListenWSWord" -g '!deepgram/client.py'

# Check for REST entities
echo "Checking REST entities usage:"
rg --type python "ListenRESTAlternative|ListenRESTChannel|ListenRESTWord" -g '!deepgram/client.py'

# Check for old entities (should not be used anymore)
echo "Checking for deprecated entity usage:"
rg --type python "\bAlternative\b|\bChannel\b|\bWord\b" -g '!deepgram/client.py'

Length of output: 604


Script:

# 
#!/bin/bash
# Description: Check usage of new WebSocket and REST entities

# Check for WebSocket entities
echo "Checking WebSocket entities usage:"
rg "ListenWSAlternative|ListenWSChannel|ListenWSWord" -g "*.py" --ignore-file deepgram/client.py

# Check for REST entities
echo "Checking REST entities usage:"
rg "ListenRESTAlternative|ListenRESTChannel|ListenRESTWord" -g "*.py" --ignore-file deepgram/client.py

# Check for deprecated entity usage
echo "Checking for deprecated entity usage:"
rg "\bAlternative\b|\bChannel\b|\bWord\b" -g "*.py" --ignore-file deepgram/client.py

Length of output: 6699

deepgram/clients/listen/v1/rest/response.py (7)

283-283: Updated words attribute to use ListenRESTWord

The words attribute in the Utterance class now correctly references ListenRESTWord, ensuring consistency with the updated data models.


298-300: Ensure proper deserialization of words in Utterance

The __getitem__ method appropriately deserializes the words list into ListenRESTWord instances, which is essential for accurate data handling.


349-351: Correct deserialization of words in ListenRESTAlternative

The __getitem__ method correctly converts items in the words list into ListenRESTWord instances, ensuring data integrity.


379-379: Updated alternatives attribute to use ListenRESTAlternative

The alternatives attribute in ListenRESTChannel now references ListenRESTAlternative, aligning with the restructured response classes.


393-395: Ensure proper deserialization of alternatives in ListenRESTChannel

The __getitem__ method accurately deserializes the alternatives list into ListenRESTAlternative instances, maintaining consistency in data processing.


405-405: Updated channels attribute to use ListenRESTChannel

The channels attribute in the Results class has been updated to use ListenRESTChannel, reflecting the updated data model structure.


428-430: Correct deserialization of channels in Results

The __getitem__ method properly deserializes the channels list into ListenRESTChannel instances, which is crucial for accurate data handling.

@davidvonthenen davidvonthenen merged commit c2b0790 into deepgram:main Sep 26, 2024
5 checks passed
@davidvonthenen davidvonthenen deleted the fix-missing-properties branch September 26, 2024 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants