Some word segmentation results are different than we get in ICU4C #3522
riajain0412
started this conversation in
General
Replies: 2 comments 22 replies
-
Which LSTM constructors are you using? Please verify that you can reproduce these results with the dictionary constructors.
3 replies
-
I'm loading the full data blob in my C++ code. How can I confirm whether the dictionaries are loaded? Also, which keys are needed for the word segmenter? I was trying to create a data blob for a dictionary-based word segmenter for SEA languages only, and I'm including only segmenter/word@1 and segmenter/dictionary/wl_ext@1.
19 replies
-
I was comparing text segmentation results between ICU4C and ICU4X for SEA languages, and I found some disparities between them. Listed here are a few of the strings that give different results in ICU4X and ICU4C.
and many other strings
So I wanted to confirm: are these results expected?
I'm using the full data blob with all keys and locales.
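To make a comparison like this easier to report, the break offsets from the two libraries can be diffed directly. The following is only a minimal sketch: `BreakDiff` and `diffBreaks` are hypothetical helper names, not part of ICU4C or ICU4X, and it assumes the offsets have already been collected from each segmenter as sorted lists in the same unit (for example, UTF-16 code units on both sides).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper type (not an ICU4C or ICU4X API): holds the break
// offsets that appear in one segmenter's output but not in the other's.
struct BreakDiff {
    std::vector<std::size_t> onlyInFirst;   // e.g. breaks reported only by ICU4C
    std::vector<std::size_t> onlyInSecond;  // e.g. breaks reported only by ICU4X
};

// Compares two sorted lists of break offsets.
// Assumption: both lists are in ascending order and use the same offset unit.
BreakDiff diffBreaks(const std::vector<std::size_t>& first,
                     const std::vector<std::size_t>& second) {
    // std::set_difference requires sorted input, so check the precondition.
    assert(std::is_sorted(first.begin(), first.end()));
    assert(std::is_sorted(second.begin(), second.end()));

    BreakDiff diff;
    std::set_difference(first.begin(), first.end(),
                        second.begin(), second.end(),
                        std::back_inserter(diff.onlyInFirst));
    std::set_difference(second.begin(), second.end(),
                        first.begin(), first.end(),
                        std::back_inserter(diff.onlyInSecond));
    return diff;
}
```

If both result lists come back empty, the two segmenters agree on that string. Otherwise, the differing offsets point to the exact positions worth including in a report.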