MIRACLRetrieval results are missing for most models. #1550

Open
Samoed opened this issue Dec 4, 2024 · 17 comments
Labels
leaderboard issues related to the leaderboard

Comments

@Samoed
Collaborator

Samoed commented Dec 4, 2024

Model results for MIRACLRetrieval are missing from the MTEB(rus) leaderboard.

@KennethEnevoldsen
Contributor

Do they need to be run, or are they just missing from the leaderboard?

@Samoed
Collaborator Author

Samoed commented Dec 4, 2024

They are missing, but they were present in the old version of the leaderboard.

@KennethEnevoldsen
Contributor

Hmm, so we need to check the results repo, and if they are not there, probably rerun them.

@Samoed
Collaborator Author

Samoed commented Dec 4, 2024

It seems strange. In the new leaderboard, only MIRACLRetrieval is listed, but the benchmark includes both MIRACLRetrieval and MIRACLReranking. I think the MIRACLReranking scores are missing because they were previously labeled with a suffix like (MIRACL) (e.g., NDCG@10(MIRACL)). However, I don't understand why MIRACLRetrieval is also missing.

Results for these datasets are available in the results repository, such as for bge-m3:

#1317 (comment)

[screenshot: MIRACLRetrieval and MIRACLReranking result files for bge-m3 in the results repository]

"MIRACLReranking",
"RuBQReranking",
# Retrieval
"MIRACLRetrieval",

@KennethEnevoldsen
Contributor

Sounds like we need to either rerun MIRACL or rewrite the scores.

> However, I don't understand why MIRACLRetrieval is also missing.

@Samoed isn't it because the main score has changed?

@KennethEnevoldsen KennethEnevoldsen added the leaderboard issues related to the leaderboard label Dec 5, 2024
@Samoed
Collaborator Author

Samoed commented Dec 5, 2024

No, MIRACLRetrieval has the same score names as all retrieval tasks. Only MIRACLReranking has different score names.

@x-tabdeveloping
Collaborator

I have a suspicion that this might be because of duplicate results, where the incorrect one gets used in the leaderboard. Can't confirm yet though, I'll have a look.
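One way to test that suspicion is to group result files by (model, revision, task) and flag any key with more than one file. A minimal sketch, with hypothetical flat records standing in for parsed result files (the field names here are made up, not mteb's actual schema):

```python
from collections import defaultdict

# Hypothetical records standing in for parsed result files.
records = [
    {"model": "bge-m3", "revision": "abc", "task": "MIRACLRetrieval", "path": "a.json"},
    {"model": "bge-m3", "revision": "abc", "task": "MIRACLRetrieval", "path": "b.json"},
    {"model": "jina-v3", "revision": "def", "task": "MIRACLRetrieval", "path": "c.json"},
]

def find_duplicates(records):
    """Group result files by (model, revision, task); report keys with >1 file."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["model"], r["revision"], r["task"])].append(r["path"])
    return {k: v for k, v in groups.items() if len(v) > 1}

dupes = find_duplicates(records)
# → {("bge-m3", "abc", "MIRACLRetrieval"): ["a.json", "b.json"]}
```

If the leaderboard then picks an arbitrary file from a duplicated group, that would explain an incorrect or missing score.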

@x-tabdeveloping
Collaborator

Wait a sec, bge-m3 shouldn't even show up in the leaderboard, since it has no metadata, and it doesn't for me. Are you sure it's bge-m3 we're looking at? @Samoed

@Samoed
Collaborator Author

Samoed commented Dec 5, 2024

I've added bge-m3 just to show that some scores for these tasks are present in the results repository.

@x-tabdeveloping
Collaborator

If I load the results, I only get MIRACL scores for one model, even when I don't filter out models that have no metadata.

```python
import mteb

all_results = mteb.load_results()
miracl = all_results.filter_tasks(["MIRACLRetrieval"])
print(len(miracl.model_results))
# 1
print(miracl[0])
# model_name='jinaai/jina-embeddings-v3' model_revision='215a6e121fa0183376388ac6b1ae230326bfeaed' task_results=[TaskResult(task_name=MIRACLRetrieval, scores=...)]
```

@x-tabdeveloping
Collaborator

so something is either wrong with the results or with the result loading script

@x-tabdeveloping
Collaborator

Also, when I load results, I get these warnings for almost all models:

```
Validation failed for MIRACLRetrieval in sentence-transformers/all-MiniLM-L6-v2 8b3219a92973c328a8e22fadcfa821b5dc75636a: Missing subsets {'fi', 'sw', 'te', 'fa', 'ko', 'zh', 'de', 'yo', 'en', 'ar', 'es', 'th', 'fr', 'hi', 'id', 'bn', 'ja'} for split dev
```

@x-tabdeveloping
Collaborator

Maybe the loading function should be a bit more graceful when stuff is missing.

@x-tabdeveloping
Collaborator

Raising warnings instead of errors when loading results with missing languages partly fixes the issue.
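The idea above can be sketched as a validation helper that downgrades the missing-subset check from an exception to a `warnings.warn`, so one incomplete result file no longer drops the whole task from the leaderboard. The function name and signature here are hypothetical, not the actual mteb code:

```python
import warnings

def check_subsets(task_name: str, found: set, expected: set) -> bool:
    """Warn, rather than raise, when a result file lacks language subsets.

    Returns True for a complete result and False for a partial one, so the
    caller can keep partial results instead of discarding the model entirely.
    """
    missing = expected - found
    if missing:
        warnings.warn(
            f"Validation failed for {task_name}: "
            f"missing subsets {sorted(missing)} for split dev"
        )
        return False
    return True

# A complete result passes silently; an incomplete one only warns.
check_subsets("MIRACLRetrieval", {"ru", "fi"}, {"ru", "fi"})
check_subsets("MIRACLRetrieval", {"ru"}, {"ru", "fi", "sw"})
```

As noted below, this only circumvents the symptom; the underlying missing results still need to be rerun.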

@x-tabdeveloping
Collaborator

I will submit a PR

@KennethEnevoldsen
Contributor

KennethEnevoldsen commented Dec 6, 2024

I don't think this is solved, just circumvented

@x-tabdeveloping
Copy link
Collaborator

Right... I mean we are still missing a lot of results on MIRACL, just not the Russian ones.
