Update: fixing bug raised around stanza model not having certain words in vocabulary, along with efforts to improve latency #115

AchintyaX · 2022-02-10T19:24:26Z

A bug was found while testing in Oppo: when users speak Hindi but English ASR is used, certain words are out of vocabulary for the stanza model.
This PR adds error handling for that case.

AchintyaX · 2022-02-10T19:25:39Z

Sentry link for the bug

AchintyaX · 2022-02-13T11:04:00Z

Added fuzzy search using the `process` module from the TheFuzz library, instead of manually computing the match score using fuzzy_matching.
Also updated the test file for ListSearchPlugin to make one test faster.
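
For readers unfamiliar with this style of matching, here is a minimal sketch of ranked fuzzy lookup. TheFuzz exposes `process.extract` (a ranked list of `(choice, score)` pairs) and `process.extractOne` (the single best pair). The block below only imitates that shape with the standard library's `difflib` so that it runs without extra dependencies. The helper names `score`, `rank_matches`, and `best_match`, the scoring function, and the sample patterns are assumptions made for illustration, not the plugin's code.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def score(query: str, choice: str) -> int:
    """Case-insensitive similarity on a 0-100 scale (stand-in scorer)."""
    ratio = SequenceMatcher(None, query.lower(), choice.lower()).ratio()
    return round(100 * ratio)


def rank_matches(
    query: str, choices: List[str], limit: int = 5
) -> List[Tuple[str, int]]:
    """Return up to `limit` (choice, score) pairs, best score first."""
    scored = [(choice, score(query, choice)) for choice in choices]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


def best_match(
    query: str, choices: List[str], score_cutoff: int = 0
) -> Optional[Tuple[str, int]]:
    """Return the top pair, or None if it scores below `score_cutoff`."""
    ranked = rank_matches(query, choices, limit=1)
    if ranked and ranked[0][1] >= score_cutoff:
        return ranked[0]
    return None


# Hypothetical entity patterns; the misspelt query still finds its target.
entity_patterns = ["New Delhi", "Mumbai", "Bengaluru"]
print(rank_matches("new deli", entity_patterns))
print(best_match("new deli", entity_patterns, score_cutoff=80))
```

Returning only the best pair avoids sorting and carrying candidates that the caller would discard anyway, which is presumably part of the latency motivation here.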

codecov-commenter · 2022-02-13T11:11:46Z

Codecov Report

Merging #115 (7a580cf) into master (6ae0d8d) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #115   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           47        47           
  Lines         1921      1925    +4     
=========================================
+ Hits          1921      1925    +4

Flag	Coverage Δ
unittests	`100.00% <100.00%> (ø)`

Impacted Files	Coverage Δ
...ialogy/plugins/text/list_search_plugin/__init__.py	`100.00% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ae0d8d...7a580cf. Read the comment docs.

AchintyaX · 2022-02-15T08:33:24Z

dialogy/plugins/text/list_search_plugin/__init__.py

+                    """
+                    value = value + str(word.text) + " "
+            if value != "":
+                matches = process.extract(value, entity_patterns)


Add a flag to choose between returning all matching entities and returning only one.
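
One possible reading of this request, sketched under assumptions: a boolean option selects between every match above a threshold and only the top-scoring one. The names `search_entities`, `return_all`, and `threshold` are hypothetical, and `difflib` again stands in for the fuzzy scorer used in the PR.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Match = Tuple[str, float]


def search_entities(
    text: str,
    patterns: List[str],
    threshold: float = 0.8,
    return_all: bool = False,
) -> List[Match]:
    """Return matches scoring at least `threshold`, best first.

    With return_all=False (the assumed default) only the best match is kept.
    """
    scored = [
        (pattern, SequenceMatcher(None, text.lower(), pattern.lower()).ratio())
        for pattern in patterns
    ]
    hits = sorted(
        (match for match in scored if match[1] >= threshold),
        key=lambda match: match[1],
        reverse=True,
    )
    return hits if return_all else hits[:1]


# Hypothetical patterns: two of the three are close to the query.
patterns = ["savings account", "saving account", "current account"]
print(search_entities("savings account", patterns, return_all=True))
print(search_entities("savings account", patterns))
```

Both modes return a list, so callers do not need to branch on the flag when consuming the result.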

AchintyaX

Use pyinstrument to profile the changes and measure the speed gains.
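
As a rough illustration of what such a measurement could compare, the sketch below contrasts compiling regex patterns on every call with compiling them once at construction, which is the direction the later commits in this PR describe. pyinstrument is a third-party profiler, so to stay dependency-free the example times both variants with `time.perf_counter` instead. The class names `RuntimeSearch` and `PrecompiledSearch`, the helper `time_calls`, and the sample patterns are invented for this example.

```python
import re
import time
from typing import Dict, List


class RuntimeSearch:
    """Compiles its patterns inside every search call."""

    def __init__(self, patterns: Dict[str, List[str]]) -> None:
        self.patterns = patterns

    def search(self, text: str) -> List[str]:
        found = []
        for entity, regexes in self.patterns.items():
            compiled = [re.compile(r, re.IGNORECASE) for r in regexes]
            if any(c.search(text) for c in compiled):
                found.append(entity)
        return found


class PrecompiledSearch:
    """Compiles its patterns once, when the object is created."""

    def __init__(self, patterns: Dict[str, List[str]]) -> None:
        self.compiled = {
            entity: [re.compile(r, re.IGNORECASE) for r in regexes]
            for entity, regexes in patterns.items()
        }

    def search(self, text: str) -> List[str]:
        return [
            entity
            for entity, regexes in self.compiled.items()
            if any(c.search(text) for c in regexes)
        ]


def time_calls(searcher, text: str, repeats: int = 2000) -> float:
    """Return the wall-clock seconds spent on `repeats` search calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        searcher.search(text)
    return time.perf_counter() - start


patterns = {"city": [r"\bnew delhi\b", r"\bmumbai\b"], "bank": [r"\bsavings?\b"]}
text = "I want to open a savings account in Mumbai"
for searcher in (RuntimeSearch(patterns), PrecompiledSearch(patterns)):
    print(type(searcher).__name__, f"{time_calls(searcher, text):.4f}s")
```

Python's `re` module caches recently compiled patterns, so the gap measured here may be small; timings vary by machine and should be read as indicative only. With pyinstrument installed, wrapping the same loop between `Profiler().start()` and `.stop()` and printing `output_text()` should give a call-tree breakdown rather than a single number.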

Update: fixing bug raised by Amey for model not trained on a certain …

9938eb9

…words

AchintyaX added the bug label Feb 10, 2022

AchintyaX requested a review from ltbringer February 10, 2022 19:24

AchintyaX self-assigned this Feb 10, 2022

AchintyaX requested a review from AmeyHengle February 10, 2022 19:26

Update: optimizing for reducing latency by removing manual search and…

89cb310

… replacing with fuzzy search

AchintyaX added 2 commits February 13, 2022 16:35

Update: added fuzzySearch and updated the unit tests

1def471

Update: added fuzzySearch and updated the unit tests

273e000

AchintyaX commented Feb 15, 2022

AchintyaX added 3 commits February 18, 2022 00:15

Update: preparing the search list in intit function instead of runtim…

a0fbdcf

…e which was happening earlier

Update: added regex pattern compilation during object initialization …

266c270

…and switched to extractOne method in fuzzy searching

Update: added regex pattern compilation during object initialization …

04cc969

…and switched to extractOne method in fuzzy searching, and updating type check Signed-off-by: Achintya Shankhdhar <[email protected]>

AchintyaX force-pushed the list_search_bug_fix branch from 6534fb1 to 85fa241 Compare February 18, 2022 14:53

Update: added regex pattern compilation during object initialization …

7bcd692

…and switched to extractOne method in fuzzy searching, and updating type check, adding PatternList Signed-off-by: Achintya Shankhdhar <[email protected]>

AchintyaX force-pushed the list_search_bug_fix branch from 85fa241 to 7bcd692 Compare February 18, 2022 15:00

Update: added regex pattern compilation during object initialization …

5f836e5

…and switched to extractOne method in fuzzy searching, and updating type check, adding PatternList

AchintyaX force-pushed the list_search_bug_fix branch from ebffdfa to 5f836e5 Compare February 18, 2022 15:04

AchintyaX changed the title ~~Update: fixing bug raised around stanza model not having certain words in vocabulary~~ Update: fixing bug raised around stanza model not having certain words in vocabulary, along with efforts to improve latency Feb 18, 2022

AchintyaX and others added 2 commits February 18, 2022 23:24

Update: improving the documentation of the plugin

77109ef

Merge branch 'master' into list_search_bug_fix

7a580cf
