-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Mismatch Structs on STT REST #467
Fix Mismatch Structs on STT REST #467
Conversation
WalkthroughThe changes involve a comprehensive restructuring of import statements and class definitions across multiple files in the Deepgram client library. The modifications primarily focus on removing outdated entities such as Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (7)
deepgram/clients/listen/v1/__init__.py (2)
Line range hint
1-130
: Excellent organization of imports, consider adding type commentsThe restructuring of imports into clear sections (backward compat, shared, common between rest and websocket, etc.) greatly improves the file's readability and maintainability. This organization aligns well with the PR objective and provides a clear separation between REST and WebSocket functionalities.
To further enhance the code's clarity, consider adding type comments to the import statements. This can help developers quickly understand the nature of each imported entity without needing to refer to their definitions.
Example:
from .rest import ( ListenRESTAlternative, # type: class ListenRESTChannel, # type: class ListenRESTWord, # type: class )
Line range hint
1-130
: Consider updating documentation to reflect the restructured importsThe changes made to this file are part of a larger refactoring effort to distinguish between REST and WebSocket implementations. To ensure a smooth transition for developers using this SDK, it would be beneficial to update any relevant documentation to reflect these changes.
Specifically, consider:
- Updating any API documentation that references the old import structure.
- Adding a migration guide if these changes might affect existing code using the SDK.
- Updating examples in the README or other guides to use the new import structure.
deepgram/clients/__init__.py (1)
124-126
: Expanded public API: Consider updating documentationThe additions to the
__init__.py
file expand the SDK's public API by providing more granular access to REST and WebSocket specific components. This change offers developers greater flexibility and control over REST and WebSocket interactions without breaking existing code.Given these API additions:
- Please ensure that the new components (
ListenRESTAlternative
,ListenRESTChannel
,ListenRESTWord
,ListenWSWord
,ListenWSAlternative
, andListenWSChannel
) are properly documented, including their purpose, usage, and any differences from their counterparts.- Consider updating the SDK's main documentation to reflect these new capabilities and provide examples of when and how to use these more specific components.
- If there are any performance or functionality implications of using these new components over the existing ones, please highlight them in the documentation.
Also applies to: 153-155
deepgram/__init__.py (1)
124-125
: LGTM: Commented out shared importsThe decision to comment out
Alternative
,Channel
, andWord
imports in the shared section is consistent with the PR objectives of addressing mismatched structs. This change likely prevents confusion between generic and API-specific versions of these components.However, consider removing these commented-out imports if they are no longer needed to maintain code cleanliness. If you're keeping them for future reference, it might be worth adding a comment explaining why they are retained.
Also applies to: 137-137
deepgram/clients/listen/v1/websocket/response.py (2)
Line range hint
130-149
: Correct the handling ofchannel
andmetadata
inLiveResultResponse
's__getitem__
methodThe
__getitem__
method treatschannel
andmetadata
as lists and attempts to iterate over them. However, inLiveResultResponse
,channel
is defined as a singleListenWSChannel
instance, andmetadata
is a singleMetadata
instance.This mismatch could lead to runtime errors. Update the code to handle these attributes as single instances.
Apply this diff to fix the issue:
def __getitem__(self, key): _dict = self.to_dict() if "channel" in _dict: - _dict["channel"] = [ - ListenWSChannel.from_dict(channel) for channel in _dict["channel"] - ] + _dict["channel"] = ListenWSChannel.from_dict(_dict["channel"]) if "metadata" in _dict: - _dict["metadata"] = [ - Metadata.from_dict(metadata) for metadata in _dict["metadata"] - ] + _dict["metadata"] = Metadata.from_dict(_dict["metadata"]) return _dict[key]
31-92
: Add unit tests for the new data classesNew classes (
ListenWSWord
,ListenWSAlternative
,ListenWSChannel
) have been added to handle WebSocket responses. However, there are no accompanying unit tests to verify their functionality.To ensure these classes work as intended, please add unit tests that cover their initialization, methods, and interactions. Would you like me to help generate sample unit tests or open a GitHub issue to track this task?
deepgram/clients/listen/v1/rest/response.py (1)
Line range hint
321-344
: Consider refactoringListenRESTAlternative
to reduce instance attributesSimilar to
ListenRESTWord
, theListenRESTAlternative
class has a large number of instance attributes, leading to a disabledpylint
warning. Refactoring could enhance code maintainability by reducing complexity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
- deepgram/init.py (4 hunks)
- deepgram/audio/microphone/microphone.py (3 hunks)
- deepgram/client.py (2 hunks)
- deepgram/clients/init.py (2 hunks)
- deepgram/clients/common/init.py (0 hunks)
- deepgram/clients/common/v1/init.py (0 hunks)
- deepgram/clients/common/v1/shared_response.py (0 hunks)
- deepgram/clients/listen/init.py (2 hunks)
- deepgram/clients/listen/client.py (4 hunks)
- deepgram/clients/listen/v1/init.py (2 hunks)
- deepgram/clients/listen/v1/rest/init.py (1 hunks)
- deepgram/clients/listen/v1/rest/response.py (9 hunks)
- deepgram/clients/listen/v1/websocket/init.py (1 hunks)
- deepgram/clients/listen/v1/websocket/response.py (3 hunks)
💤 Files not reviewed due to no reviewable changes (3)
- deepgram/clients/common/init.py
- deepgram/clients/common/v1/init.py
- deepgram/clients/common/v1/shared_response.py
🧰 Additional context used
📓 Learnings (1)
deepgram/clients/listen/v1/__init__.py (2)
Learnt from: dvonthenen PR: deepgram/deepgram-python-sdk#426 File: deepgram/clients/listen/v1/__init__.py:36-43 Timestamp: 2024-07-01T19:17:04.194Z Learning: Unused imports in `deepgram/clients/listen/v1/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
Learnt from: dvonthenen PR: deepgram/deepgram-python-sdk#426 File: deepgram/clients/listen/v1/websocket/__init__.py:8-8 Timestamp: 2024-07-01T19:21:39.778Z Learning: Unused imports in `deepgram/clients/listen/v1/websocket/__init__.py` are retained to maintain backward compatibility and should not be flagged for removal in reviews.
🔇 Additional comments not posted (25)
deepgram/clients/listen/v1/rest/__init__.py (1)
56-58
: LGTM! Verify usage of new entity names.The addition of
ListenRESTAlternative
,ListenRESTChannel
, andListenRESTWord
appears to be part of a refactoring effort to make the API more explicit and avoid potential naming conflicts. This change aligns with the PR objective of fixing mismatched structs in the STT REST API.To ensure consistency, let's verify that the old entity names are not used elsewhere in the codebase:
If the script returns any usage of the old entity names or no usage of the new entity names, please update the relevant files to use the new
ListenREST*
prefixed entities.✅ Verification successful
LGTM! All usages of old entity names have been successfully replaced with the new ones.
The refactoring aligns with the PR objective of fixing mismatched structs in the STT REST API. The old entity names are no longer used in executable code.
- No further actions are required unless you choose to clean up the residual comments.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining usage of old entity names # Test: Search for usage of old entity names. Expect: No results echo "Searching for usage of 'Alternative':" rg --type python -g '!**/rest/__init__.py' '\bAlternative\b' echo "Searching for usage of 'Channel':" rg --type python -g '!**/rest/__init__.py' '\bChannel\b' echo "Searching for usage of 'Word':" rg --type python -g '!**/rest/__init__.py' '\bWord\b' # Test: Verify new entity names are used in other files. Expect: Some results echo "Verifying usage of new entity names:" rg --type python -g '!**/rest/__init__.py' 'ListenRESTAlternative|ListenRESTChannel|ListenRESTWord'Length of output: 737
Script:
#!/bin/bash # Description: Check for any remaining usage of old entity names with corrected file type # Test: Search for usage of old entity names. Expect: No results echo "Searching for usage of 'Alternative':" rg --type py -g '!**/rest/__init__.py' '\bAlternative\b' echo "Searching for usage of 'Channel':" rg --type py -g '!**/rest/__init__.py' '\bChannel\b' echo "Searching for usage of 'Word':" rg --type py -g '!**/rest/__init__.py' '\bWord\b' # Test: Verify new entity names are used in other files. Expect: Some results echo "Verifying usage of new entity names:" rg --type py -g '!**/rest/__init__.py' 'ListenRESTAlternative|ListenRESTChannel|ListenRESTWord'Length of output: 4194
Script:
#!/bin/bash # Description: Use ast-grep to search for actual code usage of old entity names # Search for class instantiations or variable assignments using old entity names echo "Searching for code usage of 'Alternative':" ast-grep --lang python --pattern $'$_ = Alternative($_)' echo "Searching for code usage of 'Channel':" ast-grep --lang python --pattern $'$_ = Channel($_)' echo "Searching for code usage of 'Word':" ast-grep --lang python --pattern $'$_ = Word($_)' # Additionally, search for function arguments or returns using old entity names echo "Searching for old entity names in function signatures and returns:" ast-grep --lang python --pattern $'def $_(self, alternative: Alternative):' ast-grep --lang python --pattern $'def $_(self) -> Channel:'Length of output: 725
deepgram/clients/listen/__init__.py (1)
70-72
: LGTM! Improved clarity in entity naming.The addition of specific REST and WebSocket entities (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord, ListenWSWord, ListenWSAlternative, ListenWSChannel) aligns well with the PR objectives. This change improves clarity by differentiating between REST and WebSocket-related components.
To ensure these changes don't introduce breaking changes elsewhere in the codebase, please run the following script:
Also applies to: 97-99
✅ Verification successful
Verification Successful: No usages of old entities found and new entities are correctly used.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for any remaining usage of the old entities (Alternative, Channel, Word) in the codebase. # Test: Search for usage of old entities. Expect: No results, indicating all usages have been updated. rg --type python 'from .*import.*\b(Alternative|Channel|Word)\b' || echo "No usages found." # Test: Verify that the new entities are being used correctly. Expect: Results showing correct usage of new entities. rg --type python '\b(ListenRESTAlternative|ListenRESTChannel|ListenRESTWord|ListenWSWord|ListenWSAlternative|ListenWSChannel)\b'Length of output: 302
Script:
#!/bin/bash # Description: Re-check for any remaining usage of the old entities (Alternative, Channel, Word) in the codebase with corrected file type. # Test: Search for usage of old entities. Expect: No results, indicating all usages have been updated. rg --type py 'from .*import.*\b(Alternative|Channel|Word)\b' || echo "No usages found." # Test: Verify that the new entities are being used correctly. Expect: Results showing correct usage of new entities. rg --type py '\b(ListenRESTAlternative|ListenRESTChannel|ListenRESTWord|ListenWSWord|ListenWSAlternative|ListenWSChannel)\b'Length of output: 5278
deepgram/clients/listen/v1/__init__.py (1)
99-101
: Approve restructuring of imports for REST and WebSocket specific classesThe changes to import statements improve the structure by clearly differentiating between REST and WebSocket specific classes (
ListenRESTAlternative
,ListenRESTChannel
,ListenRESTWord
,ListenWSWord
,ListenWSAlternative
,ListenWSChannel
). This aligns well with the PR objective of fixing mismatch structs on STT REST and provides a more explicit API design.To ensure backward compatibility is maintained, please run the following script:
If the script returns any results, it may indicate potential breaking changes that need to be addressed.
Also applies to: 128-130
deepgram/clients/listen/client.py (4)
59-61
: LGTM: New REST-specific aliases addedThe addition of
ListenRESTAlternative
,ListenRESTChannel
, andListenRESTWord
aligns well with the PR objective of fixing mismatches in structs for the STT REST API. These new aliases maintain consistency with the existing naming conventions and contribute to a clearer separation between REST and WebSocket implementations.
83-85
: LGTM: WebSocket-specific aliases added for consistencyThe addition of
ListenWSAlternative
,ListenWSChannel
, andListenWSWord
complements the earlier REST-specific aliases. This maintains consistency in the separation of REST and WebSocket implementations throughout the SDK. The naming convention aligns well with existing WebSocket-specific entities.
155-157
: LGTM: Proper versioning of new aliasesThe assignment of the new REST and WebSocket-specific aliases to their respective 'Latest' versions (
ListenRESTAlternativeLatest
,ListenRESTChannelLatest
,ListenRESTWordLatest
,ListenWSAlternativeLatest
,ListenWSChannelLatest
,ListenWSWordLatest
) is consistent with the file's approach to version management. This practice facilitates easier updates and maintains backwards compatibility.Also applies to: 172-174
59-61
: Summary: Improved structure for REST and WebSocket entitiesThe changes introduced in this PR effectively address the objective of fixing mismatches in structs for the STT REST API. By adding separate aliases for REST and WebSocket-specific entities (
Alternative
,Channel
, andWord
), the code now has a clearer separation of concerns between these two communication methods.Key improvements:
- Better organization of REST and WebSocket-specific components.
- Consistent naming conventions (
ListenREST*
andListenWS*
).- Proper versioning through the use of
*Latest
aliases.These changes should enhance the maintainability of the SDK and make it easier for developers to work with the appropriate structures for each communication method. The modifications are non-breaking and align well with the existing code structure.
Also applies to: 83-85, 155-157, 172-174
deepgram/clients/__init__.py (2)
124-126
: LGTM: Addition of ListenREST imports aligns with PR objectivesThe addition of
ListenRESTAlternative
,ListenRESTChannel
, andListenRESTWord
imports is consistent with the PR's goal of addressing mismatched structs in the STT REST API. These new imports likely provide more accurate representations of the API response structure, which should help resolve the issue mentioned in #466.
153-155
: Approve WebSocket imports, but clarification neededThe addition of
ListenWSWord
,ListenWSAlternative
, andListenWSChannel
imports appears to be a parallel update to the WebSocket API handling. While these changes seem beneficial for maintaining consistency between REST and WebSocket implementations, they weren't explicitly mentioned in the PR objectives.Could you please clarify how these WebSocket-related changes relate to the "Fix Mismatch Structs on STT REST" objective? Are they part of a broader effort to standardize structures across both REST and WebSocket APIs?
deepgram/__init__.py (3)
92-94
: LGTM: WebSocket-specific imports addedThe addition of
ListenWSAlternative
,ListenWSChannel
, andListenWSWord
imports aligns with the PR objectives of addressing mismatched structs. This change introduces a more specific naming convention for WebSocket-related components, which should improve code organization and reduce potential naming conflicts.
154-156
: LGTM: REST-specific imports addedThe addition of
ListenRESTAlternative
,ListenRESTChannel
, andListenRESTWord
imports is consistent with the PR objectives and complements the earlier changes. These REST-specific versions follow the same naming convention as the WebSocket-specific versions, improving code organization and clarity.This change, along with the earlier modifications, effectively addresses the issue of mismatched structs mentioned in the PR objectives.
92-94
: Summary: Improved organization and separation of WebSocket and REST componentsThe changes in this file effectively address the PR objectives by:
- Adding WebSocket-specific imports (ListenWSAlternative, ListenWSChannel, ListenWSWord)
- Commenting out generic shared imports (Alternative, Channel, Word)
- Adding REST-specific imports (ListenRESTAlternative, ListenRESTChannel, ListenRESTWord)
These modifications improve the organization of the codebase by clearly separating WebSocket and REST-specific components. The consistent naming convention enhances code readability and maintainability. Overall, these changes successfully address the issue of mismatched structs mentioned in the PR objectives.
Also applies to: 124-125, 137-137, 154-156
deepgram/audio/microphone/microphone.py (4)
36-36
: Improved type annotation for_asyncio_thread
The change from
threading.Thread
toOptional[threading.Thread] = None
is a good improvement. It accurately reflects that_asyncio_thread
can beNone
in some cases, which aligns with its usage in the class. This change enhances type safety and makes the code more self-documenting.
145-145
: Explicit assignment ofNone
to_asyncio_thread
Setting
self._asyncio_thread = None
when using a regular threaded callback is a good practice. It ensures consistency with the new type annotation and clearly indicates that no asyncio thread is being used in this case. This change improves code clarity and helps prevent potential issues with leftover thread references.
271-271
: Removal of type ignore commentThe removal of the type ignore comment from
self._asyncio_thread = None
is appropriate given the new type annotation. This change simplifies the code by removing unnecessary comments and indicates that the type checker should now correctly handle this assignment without warnings. It's a good cleanup that improves code readability.
Line range hint
36-271
: Potential discrepancy between PR objectives and implemented changesThe changes in this file improve the handling of the
_asyncio_thread
attribute in theMicrophone
class, enhancing type safety and code clarity. However, these modifications don't seem to directly address the stated PR objective of "Fix Mismatch Structs on STT REST" or the changes mentioned in the PR summary regardingPrerecordedOptions
struct.Could you please clarify how these changes in the
Microphone
class relate to the PR objectives? If these are part of a larger refactoring effort or if there are additional changes in other files that address the struct mismatches, it would be helpful to mention this in the PR description for better context.deepgram/client.py (2)
86-88
: Improved structure with separate WebSocket and REST entitiesThe addition of specific WebSocket (WS) and REST entities for Listen functionality (Alternative, Channel, Word) improves the code organization and aligns with the PR objective of fixing mismatch structs on STT REST. This separation will likely make it easier to maintain different implementations for WebSocket and REST.
Also applies to: 157-159
86-88
: Verify usage of new WebSocket and REST entitiesWhile the changes to the imports improve the code structure, it's important to ensure that these new entities (ListenWSAlternative, ListenWSChannel, ListenWSWord, ListenRESTAlternative, ListenRESTChannel, ListenRESTWord) are correctly used throughout the codebase.
Please run the following script to check the usage of these new entities:
This script will help identify where the new entities are being used and if there are any remaining instances of the old entities that need to be updated.
Also applies to: 157-159
✅ Verification successful
Please run the following script to verify the usage of the new and deprecated entities:
Import Changes Verified Successfully
The new WebSocket and REST entities (
ListenWSAlternative
,ListenWSChannel
,ListenWSWord
,ListenRESTAlternative
,ListenRESTChannel
,ListenRESTWord
) are correctly used throughout the codebase. Deprecated entities are not present in active code and only appear in comments and examples.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check usage of new WebSocket and REST entities # Check for WebSocket entities echo "Checking WebSocket entities usage:" rg --type python "ListenWSAlternative|ListenWSChannel|ListenWSWord" -g '!deepgram/client.py' # Check for REST entities echo "Checking REST entities usage:" rg --type python "ListenRESTAlternative|ListenRESTChannel|ListenRESTWord" -g '!deepgram/client.py' # Check for old entities (should not be used anymore) echo "Checking for deprecated entity usage:" rg --type python "\bAlternative\b|\bChannel\b|\bWord\b" -g '!deepgram/client.py'Length of output: 604
Script:
# #!/bin/bash # Description: Check usage of new WebSocket and REST entities # Check for WebSocket entities echo "Checking WebSocket entities usage:" rg "ListenWSAlternative|ListenWSChannel|ListenWSWord" -g "*.py" --ignore-file deepgram/client.py # Check for REST entities echo "Checking REST entities usage:" rg "ListenRESTAlternative|ListenRESTChannel|ListenRESTWord" -g "*.py" --ignore-file deepgram/client.py # Check for deprecated entity usage echo "Checking for deprecated entity usage:" rg "\bAlternative\b|\bChannel\b|\bWord\b" -g "*.py" --ignore-file deepgram/client.pyLength of output: 6699
deepgram/clients/listen/v1/rest/response.py (7)
283-283
: Updatedwords
attribute to useListenRESTWord
The
words
attribute in theUtterance
class now correctly referencesListenRESTWord
, ensuring consistency with the updated data models.
298-300
: Ensure proper deserialization ofwords
inUtterance
The
__getitem__
method appropriately deserializes thewords
list intoListenRESTWord
instances, which is essential for accurate data handling.
349-351
: Correct deserialization ofwords
inListenRESTAlternative
The
__getitem__
method correctly converts items in thewords
list intoListenRESTWord
instances, ensuring data integrity.
379-379
: Updatedalternatives
attribute to useListenRESTAlternative
The
alternatives
attribute inListenRESTChannel
now referencesListenRESTAlternative
, aligning with the restructured response classes.
393-395
: Ensure proper deserialization ofalternatives
inListenRESTChannel
The
__getitem__
method accurately deserializes thealternatives
list intoListenRESTAlternative
instances, maintaining consistency in data processing.
405-405
: Updatedchannels
attribute to useListenRESTChannel
The
channels
attribute in theResults
class has been updated to useListenRESTChannel
, reflecting the updated data model structure.
428-430
: Correct deserialization ofchannels
inResults
The
__getitem__
method properly deserializes thechannels
list intoListenRESTChannel
instances, which is crucial for accurate data handling.
Proposed changes
Addresses: #466
Verified working modifying an example:
examples/speech-to-text/rest/url
Options:
Result:
Types of changes
What types of changes does your code introduce to the community Python SDK?
Put an
x
in the boxes that applyChecklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.Further comments
NA
Summary by CodeRabbit
New Features
ListenWSWord
,ListenWSAlternative
,ListenWSChannel
, and their REST counterparts.Bug Fixes
_asyncio_thread
in theMicrophone
class for better state management.Refactor