[Feature]: Create a better leaderboard for OpenHands #5744

Open
openhands-agent opened this issue Dec 22, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@openhands-agent
Contributor

openhands-agent commented Dec 22, 2024

What problem or use case are you trying to solve?
Currently, there is no comprehensive leaderboard that tracks the performance of open-source models within OpenHands. A clear leaderboard would make it easier to compare models and to improve them.

Describe the UX of the solution you'd like
An online sheet that continuously updates and tracks performance metrics for various models, possibly with the ability to click on links for deeper insights specific to each model.

Do you have thoughts on the technical implementation?
The leaderboard could draw on existing performance data stored in a central database, populated automatically and refreshed at regular intervals so that results stay close to real time.
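
To make the idea above concrete, here is a minimal sketch of the aggregation step, assuming evaluation runs are available as simple records. Everything in it is hypothetical: the `EvalResult` fields, the `build_leaderboard` and `to_markdown` helpers, and the success-rate metric are illustrative assumptions, not an existing OpenHands schema or API. A scheduled job could call something like this after each refresh of the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """One evaluation run. All field names are hypothetical."""
    model: str
    benchmark: str
    resolved: int      # number of tasks solved
    total: int         # number of tasks attempted
    details_url: str   # link to deeper, model-specific results


def build_leaderboard(results: list[EvalResult]) -> list[dict]:
    """Keep the most recent run per (model, benchmark) and sort the rows."""
    latest: dict[tuple[str, str], EvalResult] = {}
    for run in results:
        # Assumes `results` is ordered oldest to newest, so later runs win.
        latest[(run.model, run.benchmark)] = run

    rows = [
        {
            "model": run.model,
            "benchmark": run.benchmark,
            "success_rate": run.resolved / run.total if run.total else 0.0,
            "details_url": run.details_url,
        }
        for run in latest.values()
    ]
    # Group by benchmark, best success rate first within each benchmark.
    rows.sort(key=lambda row: (row["benchmark"], -row["success_rate"]))
    return rows


def to_markdown(rows: list[dict]) -> str:
    """Render the rows as a Markdown table with a link per model."""
    lines = [
        "| Model | Benchmark | Success rate |",
        "| --- | --- | --- |",
    ]
    for row in rows:
        model_link = f"[{row['model']}]({row['details_url']})"
        lines.append(
            f"| {model_link} | {row['benchmark']} | {row['success_rate']:.1%} |"
        )
    return "\n".join(lines)
```

Keeping only the latest run per model and benchmark is one possible policy; a real leaderboard might instead keep the best run or show the full history behind each link.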

Describe alternatives you've considered
A static document or manual tracking. Both lack the dynamic updates and timeliness that a live leaderboard would provide.

Additional context
This initiative was discussed in a recent Slack thread, where users expressed interest in better mechanisms for tracking performance.

Issue Created By: Graham Neubig on Slack

@mamoodi added the enhancement (New feature or request) label on Dec 22, 2024
2 participants