This document covers topics useful for contributors to Scikit-learn_bench:
```mermaid
stateDiagram-v2
    classDef inputOutput fill:#33b,color:white,stroke-width:2px,stroke:white;

    user_arguments:::inputOutput --> ArgumentParser
    BenchmarksRunner --> raw_results[JSON]:::inputOutput
    raw_results[JSON] --> ReportGenerator
    ReportGenerator --> benchmarks_report[Excel]:::inputOutput

    state BenchmarksRunner {
        ArgumentParser --> ConfigParser: config_arguments
        ArgumentParser --> Benchmarks: other_arguments
        ConfigParser --> Benchmarks: benchmark_cases\n[JSON-formatted string]
        ConfigParser --> Benchmarks: benchmark_filters\n[JSON-formatted string]

        state Benchmarks {
            SklearnLikeEstimator --> raw_results[JSON]
            ... --> raw_results[JSON]
            Functional --> raw_results[JSON]
        }
    }
```
Scikit-learn_bench consists of three main parts:
- Benchmarks runner, which:
  - Consumes user-provided high-level arguments (argument parser).
  - Transforms the arguments into benchmark cases that serve as parameters for individual benchmarks (config parser).
  - Combines the raw outputs of the individual benchmarks.
- Individual benchmarks, which wrap specific entities or workloads (sklearn-like estimators, custom functions, etc.).
- Report generator, which consumes the benchmarks' outputs and generates a high-level report with aggregated statistics.
The runner orchestrates the benchmarking cases, the individual benchmarks perform the actual run of each case, and the report generator produces the human-readable output.
A benchmarking configuration exists in two stages:
- Benchmarking template, where a parameter or a group of parameters may be defined as a range of values
- Benchmarking case, where every parameter has a deduced scalar value
In other words, the template has the `Dict[str, AnyJSONSerializable]` type, while the case has the `Dict[str, Dict[str, ... [str, Scalar] ... ]]` type.
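To make the difference between the two stages concrete, here is a minimal sketch that checks whether a nested dictionary is already a case, meaning that every leaf is a scalar. The helper `is_case` and the parameter names (`algorithm`, `estimator`, `n_clusters`, `data`, `dtype`) are illustrative assumptions, not part of the actual code base.

```python
def is_case(config):
    """Return True when every leaf of a nested dict is a scalar (hypothetical helper)."""
    if isinstance(config, dict):
        return all(isinstance(key, str) and is_case(value) for key, value in config.items())
    # Leaves of a case must be scalars; lists (ranges) are only allowed in templates.
    return config is None or isinstance(config, (bool, int, float, str))


# A template: some leaves are ranges (lists) of values.
template = {
    "algorithm": {"estimator": "KMeans", "n_clusters": [8, 16]},
    "data": {"dtype": ["float32", "float64"]},
}

# One case derived from the template: every leaf is a scalar.
case = {
    "algorithm": {"estimator": "KMeans", "n_clusters": 8},
    "data": {"dtype": "float32"},
}

print(is_case(template), is_case(case))
```

Here the template is rejected because `n_clusters` and `dtype` are lists, while the case passes because all of its leaves are scalars.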
Configs parser steps:
- Find all config files from the user-provided `config` argument or use globally defined `parameters` as a standalone config
- Convert configs to templates
- Expand template-special values and ranges to all possible cases
- Remove duplicated cases and assign case-special values if possible
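The expansion and deduplication steps above can be sketched as follows. This is a simplified illustration, not the project's implementation: it assumes that a range is written as a plain list, it ignores template-special and case-special values, and the function names `flatten`, `expand_template`, and `deduplicate` are hypothetical.

```python
import itertools
import json


def flatten(template, prefix=()):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for key, value in template.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def expand_template(template):
    """Expand list-valued leaves into the Cartesian product of scalar cases."""
    paths, choices = [], []
    for path, value in flatten(template):
        paths.append(path)
        # A scalar leaf is treated as a range with a single value.
        choices.append(value if isinstance(value, list) else [value])
    cases = []
    for combination in itertools.product(*choices):
        case = {}
        for path, value in zip(paths, combination):
            node = case
            for key in path[:-1]:
                node = node.setdefault(key, {})
            node[path[-1]] = value
        cases.append(case)
    return cases


def deduplicate(cases):
    """Drop repeated cases, keeping the first occurrence of each."""
    seen, unique = set(), []
    for case in cases:
        fingerprint = json.dumps(case, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(case)
    return unique


# Illustrative template; the repeated 8 produces duplicated cases.
template = {
    "algorithm": {"estimator": "KMeans", "n_clusters": [8, 16, 8]},
    "data": {"dtype": ["float32", "float64"]},
}
cases = deduplicate(expand_template(template))
```

Serializing each case with sorted keys gives a stable fingerprint, so two cases that differ only in key order are still recognized as duplicates.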
Special values may be assigned at three stages:
- While the runner reads the template
- While the runner generates benchmarking cases
- While an individual benchmark runs
Benchmark parameters have the following overwriting priority, from highest to lowest:
- CLI parameters
- Config template parameters
- Parameters set
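One way to picture this priority order is a layered lookup in which the first source that defines a parameter wins. The sketch below uses `collections.ChainMap` on flat dictionaries; the function `resolve_parameters`, the parameter names, and `fallback_params` (standing in for the lowest-priority source in the list above) are assumptions made for illustration only.

```python
from collections import ChainMap


def resolve_parameters(cli_params, template_params, fallback_params):
    """Merge parameter sources so that earlier arguments override later ones."""
    # ChainMap searches its mappings left to right and returns the first match.
    return dict(ChainMap(cli_params, template_params, fallback_params))


# Hypothetical parameter names and values.
cli_params = {"n_jobs": 4}
template_params = {"n_jobs": 1, "dtype": "float32"}
fallback_params = {"dtype": "float64", "seed": 0}

resolved = resolve_parameters(cli_params, template_params, fallback_params)
print(resolved)
```

Nested configurations would need a recursive merge instead of a flat lookup; the sketch only shows the ordering rule.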