I would like to improve the MT layer to handle the context of a discussion #517

Open
cyberluke opened this issue Oct 10, 2024 · 0 comments

Hello, the current S2S (speech-to-speech) translation is very inaccurate.

I am interested in Czech and Vietnamese translation.

If I say "Today is cold", it is incorrectly translated as "big winter" instead of "cold". In Vietnamese, the output is "mùa đông lớn", while the correct term is "lạnh".

1) Is it possible to replace your MT layer with something else?

Next, I would like to implement rule-based or phrase-based post-processing that takes the conversation context into account.

2) Translation produces disrespectful output
For example, if I say "mẹ ơi" in a previous prompt to address my mom, I want to save this information in some user context. Then, when I say "me" later in the conversation, the output must not use "tôi"; it must use "con" instead, which means "child".

Using the word "tôi" when speaking to one's parents is highly disrespectful, so this might also fall under the ethical use of AI.
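
To make the request concrete, here is a minimal sketch of the kind of context-aware post-processing described above. It is plain Python and does not touch any seamless_communication API. The names `ConversationContext`, `ADDRESSEE_CUES`, and `postprocess` are hypothetical, and the cue table is only an illustrative assumption, not a complete model of Vietnamese forms of address.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Optional


def _nfc(text: str) -> str:
    # Vietnamese diacritics can be encoded in several ways; normalize so
    # that string comparisons behave consistently.
    return unicodedata.normalize("NFC", text)


# Hypothetical cue table (an assumption for illustration): a vocative that
# reveals who is being addressed, mapped to the pronoun the speaker should
# use for themselves.
ADDRESSEE_CUES = {
    _nfc("mẹ ơi"): "con",  # speaking to one's mother
    _nfc("bố ơi"): "con",  # speaking to one's father
}

# Matches "tôi" only as a standalone word, in any letter case.
_TOI = re.compile(r"(?<!\w)" + _nfc("tôi") + r"(?!\w)", re.IGNORECASE)


@dataclass
class ConversationContext:
    """Remembers which first-person pronoun fits the current addressee."""

    self_pronoun: Optional[str] = None

    def observe(self, vietnamese_text: str) -> None:
        # Naive substring search; a real system would need something sturdier.
        text = _nfc(vietnamese_text).lower()
        for cue, pronoun in ADDRESSEE_CUES.items():
            if cue in text:
                self.self_pronoun = pronoun


def postprocess(translation: str, ctx: ConversationContext) -> str:
    """Replace the neutral "tôi" with the pronoun stored in the context."""
    text = _nfc(translation)
    if ctx.self_pronoun is None:
        return text  # nothing known about the addressee, leave output as-is

    def _swap(match: "re.Match[str]") -> str:
        # Preserve sentence-initial capitalization.
        if match.group(0)[0].isupper():
            return ctx.self_pronoun.capitalize()
        return ctx.self_pronoun

    return _TOI.sub(_swap, text)


ctx = ConversationContext()
ctx.observe("Mẹ ơi")                  # an earlier turn addressed the mother
print(postprocess("Tôi đói", ctx))    # Con đói
```

The context object would have to live outside a single translation call so that it persists across turns, and the rule only fires once a cue has been seen, so output for other addressees is left unchanged.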

Could you recommend an approach to accomplish that?

I think the right place to do it is here:
https://github.com/facebookresearch/seamless_communication/blob/main/src/seamless_communication/inference/translator.py

I'm also thinking that it would be better to translate Czech to English first, then English to Vietnamese, and then synthesize the speech.
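
The pivot idea can be sketched as below, assuming some text-to-text translation function is available. `pivot_translate` and the `translate` callable are hypothetical names, and the three-letter language codes (`ces`, `eng`, `vie`) are assumed defaults; none of this is taken from the actual `translator.py`.

```python
from typing import Callable

# Assumed signature of a text-to-text translation backend:
# (text, src_lang, tgt_lang) -> translated text
TranslateFn = Callable[[str, str, str], str]


def pivot_translate(
    text: str,
    translate: TranslateFn,
    src_lang: str = "ces",
    pivot_lang: str = "eng",
    tgt_lang: str = "vie",
) -> str:
    """Translate src -> pivot -> tgt instead of src -> tgt directly."""
    if pivot_lang in (src_lang, tgt_lang):
        # No pivot step is needed when one side already is the pivot language.
        return translate(text, src_lang, tgt_lang)
    intermediate = translate(text, src_lang, pivot_lang)
    return translate(intermediate, pivot_lang, tgt_lang)
```

Speech synthesis would then run on the returned Vietnamese text. One caveat with this design is that errors made in the first hop carry over into the second, so whether pivoting actually improves quality would need to be measured.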
