
Integrate ModernBERT #1624

Open
Samoed opened this issue Dec 23, 2024 · 3 comments
Labels
new-model Questions related to adding a new model to the benchmark

Comments

@Samoed
Collaborator

Samoed commented Dec 23, 2024

arXiv: https://arxiv.org/abs/2412.13663
Model: https://huggingface.co/answerdotai/ModernBERT-base

ModernBERT was evaluated on BEIR, and I think it could be integrated into MTEB with a specific configuration. I tried adding it using SentenceTransformers with different pooling methods, but my results were much lower than those reported.
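
For reference, the wrapper I used looked roughly like the sketch below (mean pooling shown; swapping `pooling_mode` gives the other variants I tried). This is just a minimal reconstruction of the setup, not an official recipe:

```python
# Sketch of wrapping the raw ModernBERT checkpoint in Sentence Transformers
# with an explicit pooling module; pooling_mode="mean" shown, "cls" is the
# other obvious variant. Requires a transformers version with ModernBERT support.
from sentence_transformers import SentenceTransformer, models

word_embedding = models.Transformer("answerdotai/ModernBERT-base")
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding, pooling])

embeddings = model.encode(["example query", "example passage"])
print(embeddings.shape)  # (2, 768) for ModernBERT-base
```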

@orionw, since you’re one of the co-authors (congrats, by the way!), do you have scripts to reproduce the results?

Samoed added the new-model label on Dec 23, 2024
@orionw
Contributor

orionw commented Dec 23, 2024

Thanks @Samoed! ModernBERT is the base model, so if you use it out of the box it will be pretty bad, just like an un-fine-tuned BERT or RoBERTa.

@NohTow and @bclavie did some fine-tuning on MS MARCO, but I don’t think they’ve uploaded the models anywhere. They did put their fine-tuning scripts here: https://github.com/AnswerDotAI/ModernBERT/blob/main/examples/train_st.py (and similar for ColBERT in the repo).
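
For readers who just want the shape of such a run, a minimal MS MARCO-style contrastive fine-tune with Sentence Transformers might look like the sketch below. The (query, passage) pairs are toy placeholders, not real MS MARCO data; the authors' actual recipe is the train_st.py script linked above:

```python
# Minimal sketch of an MS MARCO-style contrastive fine-tune; the real recipe
# lives in examples/train_st.py in the ModernBERT repo. The pairs below are
# toy placeholders standing in for (query, positive passage) training data.
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

model = SentenceTransformer("answerdotai/ModernBERT-base")  # mean pooling added by default

train_examples = [
    InputExample(texts=["what is dense retrieval",
                        "Dense retrieval encodes queries and passages as vectors."]),
    InputExample(texts=["capital of france",
                        "Paris is the capital and most populous city of France."]),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives

model.fit(train_objectives=[(train_loader, train_loss)], epochs=1, warmup_steps=10)
model.save("modernbert-base-msmarco-sketch")
```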

I expect others will replace BERT with it in their pipelines and we will see more retrieval models with it soon!

@NohTow

NohTow commented Dec 23, 2024

Hello,
Yes, indeed, the models trained for the experiments in the paper are rather "weak" (especially the DPR ones) compared to what people are used to. The goal of the experiments was to compare the performance of all the base models in a given, fair setup (which is just an average MS MARCO training).

We decided not to chase the top of the BEIR leaderboard because fine-tuning to that extent is a whole project in itself and takes a lot of work if you do not have the data available for it. Also, to some extent, the leaderboard is a bit gamed, and even if we had put in the time and energy to grind it, we might have fallen a bit short or ended up with a model that does not perform the way we believe a model should.

Thus, to avoid wasting time, and to avoid people comparing only the BEIR scores, we preferred to compare the models in a simple setup to get a signal on the actual potential of the base models, and to let the people who already have extensive pipelines take the model and do a proper fine-tuning. These actors have seen the model, and we have good reasons to believe that they will indeed do this fine-tuning in the future! Besides, I am also running some experiments of my own, which might end up with a model that is not as strong as the top models, but way better than what we trained for the paper!

Edit: I also have the checkpoints of the models we trained for the experiments, but again, I am not sure reporting these on MTEB is worth it.

@KennethEnevoldsen
Contributor

So, some thoughts: it can be reasonable (as a reference) to benchmark models like BERT, ModernBERT, etc.

These are fairly easy to benchmark (they can be run from the CLI; see the sketch after the quoted blog post below). However, I expect that we will see competitive fine-tunes due to:

More than anything, we’re really looking forward to seeing what creative ways to use these models the community will come up with! To encourage this, we’re opening a call for demos until January 10th, 2025: the 5 best ones will get added to this post in a showcase section and win a $100 (or local currency equivalent) Amazon gift card, as well as a 6-month HuggingFace Pro subscription! If you need a hint to get started, here’s a demo we thought about: code similarity HF space! And remember, this is an encoder model, so all the coolest downstream applications will likely require some sort of fine-tuning (on real or perhaps decoder-model synthetic data?). Thankfully, there's lots of cool frameworks out there to support fine-tuning encoders: 🤗Transformers itself for various tasks, including classification, GliNER for zero-shot Named Entity Recognition, or Sentence-Transformers for retrieval and similarity tasks!

source: https://huggingface.co/blog/modernbert
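
As a concrete example, a reference run with the mteb Python API could look like the sketch below. The task choice (SciFact) is an arbitrary small BEIR retrieval task picked for illustration; recent mteb versions expose roughly the same thing on the CLI via `mteb run -m <model> -t <task>`:

```python
# Hedged sketch of a reference benchmark run with the mteb Python API.
# SciFact is an arbitrary small BEIR retrieval task chosen for illustration.
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("answerdotai/ModernBERT-base")  # raw base model, reference only
tasks = mteb.get_tasks(tasks=["SciFact"])
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results/modernbert-base")
```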
