OpenML Benchmark Suites

Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. Therefore, we advocate the use of curated, comprehensive suites of machine learning datasets to standardize the setup, execution, and reporting of benchmarks. We enable this through platform-independent software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the OpenML platform, and accessible through interfaces in Python, Java, and R.

OpenML benchmarking suites are:

easy to use through standardized data formats, APIs, and client libraries
machine-readable, with extensive meta-information on the included datasets
shareable, so that benchmarks can be reused in future studies.

Documentation

The OpenML documentation describes in detail how to create and use OpenML benchmark suites.
It also includes a list of current benchmark suites, such as the OpenML-CC18.
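As a rough illustration of access through the Python interface, the sketch below fetches a suite by its alias and lists the datasets behind a few of its tasks. It assumes the `openml` Python package is installed and that the server is reachable. The helper names `select_tasks` and `list_suite_datasets` and the default `limit` are hypothetical and chosen only for this example.

```python
def select_tasks(task_ids, limit=None):
    """Return task IDs in ascending order, optionally capped for a quick trial."""
    ordered = sorted(task_ids)
    return ordered if limit is None else ordered[:limit]


def list_suite_datasets(alias="OpenML-CC18", limit=3):
    """Fetch a benchmark suite and return (task_id, dataset_name) pairs.

    Requires network access and the `openml` package, which is imported
    here so that `select_tasks` stays usable without it.
    """
    import openml

    # A suite is a collection of task IDs; it can be fetched by ID or alias.
    suite = openml.study.get_suite(alias)

    pairs = []
    for task_id in select_tasks(suite.tasks, limit=limit):
        task = openml.tasks.get_task(task_id)  # downloads the task definition
        dataset = task.get_dataset()           # downloads the underlying dataset
        pairs.append((task_id, dataset.name))
    return pairs
```

Calling `list_suite_datasets()` downloads data on first use, so a small `limit` is a sensible choice when just exploring a suite.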

Notebooks

We provide a set of notebooks to explore existing benchmark suites, and create your own:

Automated benchmark suite generator: Allows you to specify a list of constraints and additional tests, and retrieve all datasets that adhere to them
CC18 score overview: Overview of shared results on the CC18 benchmark suite
CC18 benchmark analysis: A deeper analysis of existing results in R (note: this was done for an older benchmark set)
Mini-Benchmark of R algorithms on the CC18: http://rpubs.com/giuseppec/OpenML100
Mini-Benchmark of WEKA algorithms on the CC18
Tutorials for OpenML in R and Python
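The idea behind the suite generator, keeping only the datasets that satisfy every constraint in a list, can be sketched in a few lines of plain Python. This is not the notebook's implementation. The `DatasetMeta` record, its field names, the thresholds, and the toy datasets are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMeta:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    n_instances: int
    n_features: int
    n_classes: int
    minority_ratio: float  # minority class size / majority class size


# Each constraint is a (description, predicate) pair.
# The thresholds are illustrative, not the criteria of any real suite.
CONSTRAINTS = [
    ("500 to 100000 instances", lambda d: 500 <= d.n_instances <= 100_000),
    ("fewer than 5000 features", lambda d: d.n_features < 5000),
    ("at least 2 classes", lambda d: d.n_classes >= 2),
    ("minority/majority ratio above 0.05", lambda d: d.minority_ratio > 0.05),
]


def failed_constraints(dataset, constraints=CONSTRAINTS):
    """Return the descriptions of all constraints the dataset violates."""
    return [desc for desc, pred in constraints if not pred(dataset)]


def select_datasets(datasets, constraints=CONSTRAINTS):
    """Keep only the datasets that satisfy every constraint."""
    return [d for d in datasets if not failed_constraints(d, constraints)]


# Made-up candidates, one passing and three each failing a single constraint.
CANDIDATES = [
    DatasetMeta("toy-balanced", 1000, 20, 2, 0.9),
    DatasetMeta("toy-tiny", 150, 4, 3, 1.0),
    DatasetMeta("toy-wide", 2000, 10000, 2, 0.5),
    DatasetMeta("toy-imbalanced", 5000, 30, 2, 0.01),
]

selected = select_datasets(CANDIDATES)
```

Pairing each predicate with a description makes it easy to report why a dataset was excluded, which helps when documenting how a suite was assembled.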
