@seongtaehong and I were considering a way to make cross-lingual retrieval tasks more challenging by merging retrieval pools from two different languages.
Here’s the idea:
- The task would be to retrieve two gold passages from a retrieval pool composed of content in two different languages.
- The retrieval pool would consist of pairs of passages that have the same meaning but are written in different languages (e.g., StrategyQA and Ko-StrategyQA, with the latter being the Korean translation of StrategyQA).
- Given a query in Korean, the model would need to retrieve the top 2 passages, and those two passages must be in different languages. The same applies to a query in English.
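To make the setup concrete, here is a minimal sketch of how the merged pool and the top-2 check could look. It is only an illustration under assumptions: `merge_pools` and `top2_bilingual_hit` are hypothetical helper names (not part of MTEB), the passages are made-up toy data, and the relevance scores are hard-coded in place of a real retriever.

```python
from typing import Dict, List, Sequence, Tuple

PassageKey = Tuple[str, str]  # (language code, passage id)


def merge_pools(pools: Dict[str, Dict[str, str]]) -> Dict[PassageKey, str]:
    """Merge per-language corpora into one pool keyed by (lang, passage_id).

    Assumes translated passages share the same passage id across languages.
    """
    return {
        (lang, pid): text
        for lang, corpus in pools.items()
        for pid, text in corpus.items()
    }


def top2_bilingual_hit(
    ranked: List[PassageKey],
    gold_id: str,
    langs: Sequence[str] = ("en", "ko"),
) -> bool:
    """Return True only if the top 2 results are the gold passage in both languages."""
    expected = {(lang, gold_id) for lang in langs}
    return set(ranked[:2]) == expected


# Toy data (hypothetical, for illustration only).
pools = {
    "en": {"p1": "Sharks are fish.", "p2": "Seoul is the capital of South Korea."},
    "ko": {"p1": "상어는 어류이다.", "p2": "서울은 대한민국의 수도이다."},
}
merged = merge_pools(pools)

# Hard-coded scores standing in for a retriever's similarity to one query.
scores = {
    ("en", "p1"): 0.2,
    ("ko", "p1"): 0.1,
    ("en", "p2"): 0.7,
    ("ko", "p2"): 0.9,
}
ranked = sorted(merged, key=scores.get, reverse=True)
hit = top2_bilingual_hit(ranked, gold_id="p2")
```

With this check, a model that ranks the same-language gold passage first but a same-language distractor second would count as a miss, which is what makes the merged pool harder than either monolingual pool.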
We believe this approach reflects a more realistic scenario, as many retrieval pools in the real world are derived from web crawling, and such pools naturally include data in multiple languages.
What are your thoughts on this idea? Let me know if you'd like me to adjust anything further!
Hi MTEB maintainers @KennethEnevoldsen, @Muennighoff