Leaderboard: Add an Overview leaderboard #1432

KennethEnevoldsen · 2024-11-11T08:38:09Z

Instead of having either English or Multilingual as the default, why not have "Overview" as the default?

Something like:

| Model Name | Multilingual | English | LongEmbed | ... |
|---|---|---|---|---|
| M1 | ... | | | |
| M2 | | | | |

(this does not need to be added for the first version of the benchmark)

isaac-chung · 2024-11-11T09:00:17Z

That's an interesting idea! When we think of a leaderboard, we think of rankings. Would this be a sort of "meta" leaderboard, where each model's rank is based on all the benchmarks, e.g. a Borda count over all benchmarks? Or would the rank be based on just one of them, Multilingual or English?

If it's the former, one likely scenario I'm not sure how to handle is missing values, i.e. when a model has no scores for one or more benchmarks.
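To make the missing-values concern concrete, here is a minimal sketch of a Borda count over benchmarks. It is not how the leaderboard is implemented: the function names, the score table, and the policy of giving a model zero points on any benchmark it has no score for are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: model name -> benchmark name -> score (None if missing).
Scores = Dict[str, Dict[str, Optional[float]]]


def borda_count(scores: Scores) -> Dict[str, int]:
    """Sum, over benchmarks, the number of models each model strictly beats."""
    benchmarks = sorted({b for per_model in scores.values() for b in per_model})
    points = {model: 0 for model in scores}
    for bench in benchmarks:
        # Assumed policy: only models with a score take part in this benchmark,
        # so a model with a missing value earns no points here.
        present = {
            model: per_model[bench]
            for model, per_model in scores.items()
            if per_model.get(bench) is not None
        }
        for model, value in present.items():
            points[model] += sum(1 for other in present.values() if other < value)
    return points


def rank_models(scores: Scores) -> List[Tuple[str, int]]:
    """Order models by Borda points (descending), breaking ties by name."""
    points = borda_count(scores)
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores, purely illustrative.
example: Scores = {
    "M1": {"Multilingual": 60.0, "English": 65.0, "LongEmbed": None},
    "M2": {"Multilingual": 58.0, "English": 70.0, "LongEmbed": 40.0},
    "M3": {"Multilingual": 55.0, "English": 62.0, "LongEmbed": 45.0},
}

print(rank_models(example))
```

Under this policy M1 leads on Multilingual but only ties with M2 overall, because it collects nothing from LongEmbed. Other policies (skipping benchmarks with incomplete coverage, or ranking only models that have every score) would give different orderings, which is exactly the open question.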

Otherwise, we could enable sorting models based on each benchmark available. And users can click into each benchmark's breakdown.
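The per-benchmark sorting alternative avoids aggregating at all. A rough sketch, assuming the same kind of hypothetical model-by-benchmark table and assuming that models without a score for the chosen benchmark are listed last:

```python
from typing import Dict, List, Optional

# Hypothetical layout: model name -> benchmark name -> score (None if missing).
Table = Dict[str, Dict[str, Optional[float]]]


def sort_by_benchmark(table: Table, benchmark: str) -> List[str]:
    """Return model names sorted by one benchmark, best first, missing last."""

    def sort_key(model: str):
        score = table[model].get(benchmark)
        # Missing scores sort after all present ones; ties break by model name.
        return (score is None, -(score if score is not None else 0.0), model)

    return sorted(table, key=sort_key)


# Made-up scores, purely illustrative.
overview: Table = {
    "M1": {"Multilingual": 60.0, "English": 65.0, "LongEmbed": None},
    "M2": {"Multilingual": 58.0, "English": 70.0, "LongEmbed": 40.0},
    "M3": {"Multilingual": 55.0, "English": 62.0, "LongEmbed": 45.0},
}

print(sort_by_benchmark(overview, "English"))
print(sort_by_benchmark(overview, "LongEmbed"))
```

Here no overall rank is implied: the order simply follows whichever column the user picks.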

x-tabdeveloping · 2024-11-12T07:29:06Z

Hmm I'm not sure whether this is something we actually want to have. If people are agnostic and just want to get a reasonable overview, wouldn't they just choose the multilingual or the English benchmarks? Like why would anyone be equally interested in a somewhat arbitrary selection of benchmarks that are sometimes subsets of each other? (European, Scandinavian, French, German, etc.)

KennethEnevoldsen mentioned this issue Nov 11, 2024

Improve leaderboard 2.0 readability #1317


x-tabdeveloping added the "leaderboard" label (issues related to the leaderboard) Nov 12, 2024
