This is the repository for the paper that introduces Ruya, the memory-aware cluster configuration optimizer. The paper will be presented at the 2022 IEEE International Conference on Big Data.
The repository contains:
- The source code and the evaluation of Ruya, including resulting plots
- The main evaluation dataset in CSV format from the "Arrow" paper by Hsu et al.
- The local runtime dataset in CSV format, gathered by Crispy during the profiling runs
The full title of the paper is "Ruya: Memory-Aware Iterative Optimization of Cluster Configurations for Big Data Processing".
The authors of the paper are Jonathan Will, Lauritz Thamsen, Jonathan Bader, Dominik Scheinert, and Odej Kao.