I would like to improve MT translation layer to handle context of discussion #517

cyberluke · 2024-10-10T01:43:23Z

Hello, the current S2S translation has very low accuracy.

I am interested in Czech and Vietnamese translation.

If I say "Today is cold", it is incorrectly translated as "big winter" instead of "cold". In Vietnamese, the output is "mùa đông lớn", while the correct term is "lạnh".

1) Is it possible to replace your MT layer with something else?

Next, I would like to implement some rule-based or phrase-based post-processing that takes context into account.

2) Translation produces disrespectful output
For example, if I say "mẹ ơi" in a previous prompt to address my mom, I want to save this information in some user context. Then, when I say "me" later in the conversation, the output must not be "tôi"; it must use "con" instead, which means child.

Using the word "tôi" when speaking to one's parents is highly disrespectful, so this may fall under ethical use of AI as well.
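To make the idea concrete, here is a rough sketch of the kind of context-aware post-processing I have in mind. It is only an illustration under my own assumptions: `ConversationContext`, `ADDRESSEE_PRONOUNS`, and `postprocess` are hypothetical names, not part of seamless_communication, and a single rule like this ignores many real cases (other kinship terms, "tôi" inside quoted speech, and so on).

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Optional


def nfc(text: str) -> str:
    """Normalize to NFC so composed and decomposed Vietnamese diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical rule table: vocative phrase -> first-person pronoun to use afterwards.
ADDRESSEE_PRONOUNS = {
    nfc("mẹ ơi"): "con",  # speaking to one's mother: refer to oneself as "con" (child)
}

# Matches the neutral first-person pronoun "tôi" as a whole word, in any letter case.
FIRST_PERSON = re.compile(r"\b" + nfc("tôi") + r"\b", re.IGNORECASE)


@dataclass
class ConversationContext:
    """Remembers which first-person pronoun the speaker should use, if known."""
    self_pronoun: Optional[str] = None

    def observe(self, utterance: str) -> None:
        """Update the context when an utterance contains a known vocative."""
        text = nfc(utterance).lower()
        for vocative, pronoun in ADDRESSEE_PRONOUNS.items():
            if vocative in text:
                self.self_pronoun = pronoun


def postprocess(translation: str, ctx: ConversationContext) -> str:
    """Replace "tôi" with the context-appropriate pronoun, preserving capitalization."""
    text = nfc(translation)
    if ctx.self_pronoun is None:
        return text  # no addressee known yet: leave the MT output alone

    def swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        return ctx.self_pronoun.capitalize() if word[0].isupper() else ctx.self_pronoun

    return FIRST_PERSON.sub(swap, text)
```

The intended use would be to call `ctx.observe(...)` on each utterance in the conversation and then run `postprocess(...)` on every Vietnamese text output before it goes to speech synthesis.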

Could you recommend an approach to accomplish that?

I think the right place to do it is here:
https://github.com/facebookresearch/seamless_communication/blob/main/src/seamless_communication/inference/translator.py

I'm also thinking that it would be better to translate Czech to English first, then English to Vietnamese, and then synthesize the speech.
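The pivot idea could look roughly like the sketch below. The `translate` argument is a hypothetical stand-in for whatever text-to-text backend ends up being used; it is not an actual seamless_communication function. The three-letter language codes ("ces", "eng", "vie") are my assumption about how the languages would be identified.

```python
from typing import Callable

# Hypothetical backend signature: (text, src_lang, tgt_lang) -> translated text.
TranslateFn = Callable[[str, str, str], str]


def pivot_translate(
    text: str,
    src_lang: str,
    tgt_lang: str,
    translate: TranslateFn,
    pivot_lang: str = "eng",
) -> str:
    """Translate src -> pivot -> tgt, falling back to a direct call when no pivot is needed."""
    if pivot_lang in (src_lang, tgt_lang):
        # The pivot is already one end of the pair, so a second hop adds nothing.
        return translate(text, src_lang, tgt_lang)
    intermediate = translate(text, src_lang, pivot_lang)
    return translate(intermediate, pivot_lang, tgt_lang)
```

Whether two hops actually give better Czech-to-Vietnamese output than one direct hop is something I would still need to measure, since errors from the first hop carry over into the second.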
