Build a run evaluation engine in Python #1

PGijsbers · 2024-09-02T09:13:27Z

The evaluation engine is a server component that handles multiple tasks. It is currently implemented in Java. We want to rebuild it in Python, split into separate components per function, so that it is easier to maintain and more accessible to new contributors. One of its tasks is evaluating run results.

So we want an engine that can take in any run result and produce a number of metrics for it. It should be easy to extend to new task types and cover many (all?) of the currently available metrics, or at least ensure that any metric sharing a name with an existing one produces identical results. Ideally there would be a base implementation that task-type-specific evaluation engines can inherit from.
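To make the "base implementation plus task-specific engines" idea concrete, here is a rough sketch of one possible shape. Everything in it is an assumption for discussion: `RunResult`, `EvaluationEngine`, `ClassificationEngine`, `RegressionEngine`, the field names, and the metric names are hypothetical and do not reflect the existing Java engine or the Python API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# Hypothetical metric signature: (ground truth, predictions) -> score.
Metric = Callable[[Sequence, Sequence], float]


@dataclass
class RunResult:
    """Hypothetical container; the real run result format is not decided here."""
    task_type: str
    y_true: Sequence
    y_pred: Sequence


class EvaluationEngine(ABC):
    """Base engine: subclasses declare a task type and the metrics they support."""
    task_type: str

    @abstractmethod
    def metrics(self) -> Dict[str, Metric]:
        """Return a mapping of metric name to metric function."""

    def evaluate(self, run: RunResult) -> Dict[str, float]:
        # Refuse runs that belong to a different task type.
        if run.task_type != self.task_type:
            raise ValueError(
                f"{type(self).__name__} cannot evaluate task type {run.task_type!r}"
            )
        return {name: fn(run.y_true, run.y_pred) for name, fn in self.metrics().items()}


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    # Fraction of predictions that match the ground truth exactly.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence, y_pred: Sequence) -> float:
    # Average absolute difference between ground truth and prediction.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


class ClassificationEngine(EvaluationEngine):
    task_type = "classification"

    def metrics(self) -> Dict[str, Metric]:
        return {"predictive_accuracy": accuracy}


class RegressionEngine(EvaluationEngine):
    task_type = "regression"

    def metrics(self) -> Dict[str, Metric]:
        return {"mean_absolute_error": mean_absolute_error}
```

With this layout, supporting a new task type would mean adding one subclass, e.g. `ClassificationEngine().evaluate(RunResult("classification", [1, 0, 1, 1], [1, 0, 0, 1]))` returns `{"predictive_accuracy": 0.75}`. Checking that same-named metrics match the Java engine's output would still need separate regression tests against known results.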

joaquinvanschoren · 2024-09-06T12:09:02Z

This is a nice standalone project, assuming we can build this on top of the Python API. What would make a lot of sense is to sit together for an hour during the hackathon to design the overall architecture and concrete next steps.

PGijsbers added this to Roadmap (high level) Oct 20, 2022

PGijsbers converted this from a draft issue Sep 2, 2024
