Update: fixing bug raised around stanza model not having certain words in vocabulary, along with efforts to improve latency #115

Open: 11 commits to merge into base: master

AchintyaX
Contributor

A bug was found during testing at Oppo: when users speak Hindi but the English ASR is used, certain words are out of vocabulary for the stanza model.
This PR adds error handling for that case.
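
As a rough illustration of the kind of error handling described above, here is a minimal sketch. It does not use stanza: `toy_parse`, `TOY_VOCAB`, and `safe_parse` are hypothetical names, and the exception types caught are an assumption, since the actual exception is only visible in the Sentry report.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)

# Hypothetical stand-in vocabulary; a real stanza pipeline is not used here.
TOY_VOCAB = {"book": "VERB", "a": "DET", "ticket": "NOUN"}


def toy_parse(text: str) -> List[str]:
    """Tag each word; raises KeyError on an out-of-vocabulary word."""
    return [TOY_VOCAB[word] for word in text.lower().split()]


def safe_parse(parse: Callable[[str], List[str]], text: str) -> Optional[List[str]]:
    """Run the parser and return None instead of crashing on unknown words."""
    try:
        return parse(text)
    except (KeyError, IndexError) as exc:  # assumed exception types
        logger.warning("parser failed on %r: %r", text, exc)
        return None
```

With this wrapper, a caller can treat `None` as "no entities found" for that utterance rather than letting the whole request fail.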

@AchintyaX added the bug ("Something isn't working") label on Feb 10, 2022
@AchintyaX
Contributor Author

Sentry link for the bug

@AchintyaX self-assigned this on Feb 10, 2022
@AchintyaX
Contributor Author

Added fuzzy search using the process module from the thefuzz library, instead of manually computing the match score with fuzzy_matching.
Also updated the test file for ListSearchPlugin to make one test faster.
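
For readers unfamiliar with the `process` module, the sketch below shows the two calls mentioned in this PR, `process.extract` and `process.extractOne`. The helper names `top_matches` and `best_match` and the sample patterns are hypothetical, and the `difflib` fallback is only there so the snippet runs without thefuzz installed; its scores are not identical to those of thefuzz.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

try:
    from thefuzz import process  # library named in the comment above
except ImportError:
    process = None  # fall back to the standard library below


def _ratio(a: str, b: str) -> int:
    """Stand-in similarity score on a 0-100 scale (not thefuzz's scorer)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def top_matches(query: str, choices: List[str], limit: int = 3) -> List[Tuple[str, int]]:
    """Return up to `limit` (choice, score) pairs, best first."""
    if process is not None:
        return process.extract(query, choices, limit=limit)
    scored = [(choice, _ratio(query, choice)) for choice in choices]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


def best_match(query: str, choices: List[str]) -> Optional[Tuple[str, int]]:
    """Return only the single best (choice, score) pair, or None."""
    if process is not None:
        return process.extractOne(query, choices)
    ranked = top_matches(query, choices, limit=1)
    return ranked[0] if ranked else None


patterns = ["new delhi", "mumbai", "chennai"]  # hypothetical entity patterns
print(top_matches("delhi", patterns, limit=2))
print(best_match("new delhi", patterns))
```

`extract` ranks every choice and returns the top few, while `extractOne` returns a single result, which matters when only one entity is needed.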

@codecov-commenter

codecov-commenter commented Feb 13, 2022

Codecov Report

Merging #115 (7a580cf) into master (6ae0d8d) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #115   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           47        47           
  Lines         1921      1925    +4     
=========================================
+ Hits          1921      1925    +4     

| Flag | Coverage Δ |
|---|---|
| unittests | 100.00% <100.00%> (ø) |

| Impacted Files | Coverage Δ |
|---|---|
| ...ialogy/plugins/text/list_search_plugin/__init__.py | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

"""
value = value + str(word.text) + " "
if value != "":
    matches = process.extract(value, entity_patterns)
Contributor Author

Add a flag to choose between returning all matching entities or only one entity.
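
One possible shape for such a flag is sketched below. The function `search`, the parameter `return_all`, and the threshold value are hypothetical and not taken from the plugin; scoring uses `difflib` as a stand-in so the snippet needs only the standard library. With thefuzz, the two branches would roughly correspond to `process.extract` and `process.extractOne`.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def score(query: str, pattern: str) -> int:
    """Stand-in similarity score on a 0-100 scale."""
    matcher = SequenceMatcher(None, query.lower(), pattern.lower())
    return round(100 * matcher.ratio())


def search(
    query: str,
    patterns: List[str],
    threshold: int = 80,
    return_all: bool = False,  # hypothetical flag name
) -> List[Tuple[str, int]]:
    """Return every pattern at or above the threshold, or only the best one."""
    scored = [(pattern, score(query, pattern)) for pattern in patterns]
    hits = sorted(
        (hit for hit in scored if hit[1] >= threshold),
        key=lambda hit: hit[1],
        reverse=True,
    )
    return hits if return_all else hits[:1]
```

Returning a list in both modes keeps the caller's code the same regardless of the flag; only the number of results changes.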

@AchintyaX (Contributor, Author) left a comment

Use pyinstrument to profile the changes and measure the speed gains.
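
A minimal sketch of how such profiling might look, assuming pyinstrument's `Profiler` class with its `start`, `stop`, and `output_text` methods. The helper `profile_call` and the `workload` function are hypothetical placeholders for the plugin call being measured, and the wall-clock fallback exists only so the snippet runs when pyinstrument is not installed.

```python
import time
from typing import Any, Callable, Tuple

try:
    from pyinstrument import Profiler
except ImportError:  # pyinstrument not installed
    Profiler = None


def profile_call(fn: Callable[..., Any], *args: Any) -> Tuple[Any, str]:
    """Run fn(*args) and return its result together with a text report."""
    if Profiler is None:
        # Fallback: plain wall-clock timing instead of a call-tree profile.
        start = time.perf_counter()
        result = fn(*args)
        elapsed = time.perf_counter() - start
        return result, f"wall time: {elapsed:.4f}s (pyinstrument unavailable)"
    profiler = Profiler()
    profiler.start()
    try:
        result = fn(*args)
    finally:
        profiler.stop()
    return result, profiler.output_text(unicode=False, color=False)


def workload(n: int) -> int:
    """Hypothetical placeholder for the code path being measured."""
    return sum(i * i for i in range(n))


result, report = profile_call(workload, 200_000)
print(report)
```

Running the same call before and after a change and comparing the two reports gives a rough view of where the time went.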

…and switched to extractOne method in fuzzy searching
…and switched to extractOne method in fuzzy searching, and updating type check

Signed-off-by: Achintya Shankhdhar <[email protected]>
…and switched to extractOne method in fuzzy searching, and updating type check, adding PatternList

Signed-off-by: Achintya Shankhdhar <[email protected]>
…and switched to extractOne method in fuzzy searching, and updating type check, adding PatternList
@AchintyaX changed the title from "Update: fixing bug raised around stanza model not having certain words in vocabulary" to "Update: fixing bug raised around stanza model not having certain words in vocabulary, along with efforts to improve latency" on Feb 18, 2022
Labels
bug Something isn't working
2 participants