Add options for handling multilingual input #200
Draft
This is an incomplete PR intended as a first step toward addressing issues loosely related to #184.
The `multilingual_input` option controls whether multiple languages should be expected in the input stream. If False (the backwards-compatible default), only one language is expected: either the one specified by the client or, if the client specified none, the first one heard. If True, the language can change throughout the stream, and for transcription this results in multilingual text. Notifications are sent to the client whenever a language change is detected. If the pauses between utterances in different languages are not long enough, the transcript boundaries may be incorrect, i.e. the first sentence in the new language may be transcribed in the previous language. This currently seems unavoidable due to the way the last work-in-progress segment gets reprocessed.

The `lang_filter` option allows the client to restrict the candidate set of languages to listen for. This may be useful regardless of the `multilingual_input` setting, e.g. at the beginning of the input, where the actual language may initially be detected incorrectly. If not set (the backwards-compatible default), all known languages are listened for.

If there's interest in adding these, I can propagate them to the TensorRT code as well. I'm not sure how to add tests, since that would require using a large multilingual model (we would also need to add some multilingual samples, which might be useful anyway).
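
To make the intended semantics of the two options concrete, here is a minimal sketch of how they could interact when choosing a language for a segment. It is not the code in this PR: `pick_language`, `lang_probs`, and `current_lang` are hypothetical names, and the sketch assumes the model exposes per-language detection probabilities as a plain dict.

```python
from typing import Dict, Optional, Sequence, Tuple


def pick_language(
    lang_probs: Dict[str, float],
    current_lang: Optional[str],
    multilingual_input: bool = False,
    lang_filter: Optional[Sequence[str]] = None,
) -> Tuple[Optional[str], bool]:
    """Return (language, changed) for the next segment.

    Hypothetical helper: `lang_probs` maps language codes to detection
    probabilities; `current_lang` is the client-specified or previously
    detected language, or None if nothing has been fixed yet.
    """
    # Single-language mode: once a language is known, never switch.
    if not multilingual_input and current_lang is not None:
        return current_lang, False

    # Restrict the candidates to the client's filter, if one was given.
    candidates = lang_probs
    if lang_filter is not None:
        candidates = {l: p for l, p in lang_probs.items() if l in lang_filter}

    # Nothing left to choose from: keep whatever we had.
    if not candidates:
        return current_lang, False

    best = max(candidates, key=candidates.get)
    # A change is only reported when a previous language existed,
    # which is when the client would be notified.
    changed = current_lang is not None and best != current_lang
    return best, changed
```

With `multilingual_input=False` the first detected (or client-specified) language sticks for the rest of the stream, while `lang_filter` narrows the candidates in either mode, which matches the case above where the initial detection might otherwise pick the wrong language.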