Draft: Update core assignment algorithm in benchexec/resources.py #892

CGall42 · 2023-01-19T21:34:34Z

Referring to issue #748, the core assignment can now handle additional hierarchy layers (such as a shared L3 cache).
Further layers can be added without knowing the exact topology of a machine: the algorithm itself determines the hierarchy of the layers (CPUs, NUMA nodes, L3 caches, hyperthreading, etc.).

Fixes #748
Fixes #850

Kernel documentation:

PhilippWendler

This is a preliminary review with some hints, mostly on code style and documentation. Please fix these issues and provide documentation so that a full review becomes possible and makes sense.

As part of this (as the first step, actually), please format the source code with the formatter black and make sure to use it for every future commit. Consistent code formatting is a big help for readability.

And please also have a look at all the other CI failures. These checks run automatically after each commit, so always check whether CI is green for your most recent commit. The check check-format is fixed automatically if you use the code formatter, and reuse only complains about the missing copyright header. But flake8 and pytype provide hints about potential errors in your code. And of course the unit-tests checks just execute our tests.

benchexec/resources.py

- Function name starting with "read" to indicate it reads from kernel.
- Parameters in better order.
- Identifier naming according to Python standard.
- Actually use generic identifiers in a generic function and not names that are specific to one use case.
- Also replace all trivial callers with a single function.

- Crucial constants should be present only once, documented, and defined in a central place.
- Reading from the system and logic should be separate such that the latter is testable.
- For reading from the system we can use an existing helper method.
- Add tests.

Using plain dicts instead of defaultdicts may catch errors in callers earlier. Furthermore, some of the functions returned a defaultdict in some cases and a plain dict in other cases. The return type should be consistent. With dict.setdefault() the use of a plain dict is almost as convenient as a defaultdict.
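A small illustration of this trade-off, using made-up core and package ids that are not taken from the actual code: a defaultdict silently creates an entry on a mistaken lookup, whereas a plain dict raises KeyError, and dict.setdefault() keeps the grouping code nearly as short.

```python
from collections import defaultdict

# Hypothetical mapping of core id -> package id, for illustration only.
core_to_package = {0: 0, 1: 0, 2: 1, 3: 1}

# Grouping with a plain dict and setdefault().
cores_of_package = {}
for core, package in core_to_package.items():
    cores_of_package.setdefault(package, []).append(core)

# The same grouping with a defaultdict.
cores_of_package_dd = defaultdict(list)
for core, package in core_to_package.items():
    cores_of_package_dd[package].append(core)

# A lookup with a wrong key fails loudly on the plain dict ...
try:
    cores_of_package[7]
    plain_dict_raised = False
except KeyError:
    plain_dict_raised = True

# ... but silently inserts an empty list into the defaultdict.
silently_created = cores_of_package_dd[7]
```

After this runs, the defaultdict contains a spurious key 7, which is the kind of error in a caller that a plain dict would have reported immediately.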

We always want the user to allow us to use entire physical cores. This check was broken because forbidden sibling cores were already removed from the data structure before the check. Furthermore, cores forbidden via cgroups and via the --allowedCores parameter were treated somewhat differently, but the effect should be exactly the same.
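A minimal sketch of such a check, assuming the sibling information is still complete when the check runs. The function name and the shape of the data structure are hypothetical and chosen only for illustration; the source of the restriction (cgroups or --allowedCores) deliberately plays no role.

```python
def find_partial_physical_cores(siblings_of_core, allowed_cores):
    """Return the allowed cores whose hyperthreading siblings are not all allowed.

    siblings_of_core: dict mapping each core id to the list of all virtual
        cores on the same physical core (including the core itself).
        It must NOT have been filtered by the allowed cores beforehand.
    allowed_cores: iterable of core ids that may be used, no matter where
        the restriction comes from.
    """
    allowed = set(allowed_cores)
    return sorted(
        core for core in allowed if not set(siblings_of_core[core]) <= allowed
    )


# Hypothetical machine: 2 physical cores with 2 threads each.
siblings = {0: [0, 2], 2: [0, 2], 1: [1, 3], 3: [1, 3]}
partial = find_partial_physical_cores(siblings, [0, 1, 2])
```

Here core 1 is reported because its sibling 3 is forbidden. If the forbidden core 3 had been removed from the sibling lists first, the check would find nothing, which matches the failure described above.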

So far we read the information about the hyperthreading hierarchy level differently from the other levels. This made the code more difficult to understand, and the way the ids in the hierarchy_levels[0] dict were chosen differed from the other levels. But we can also read this information in the same way as for the other levels, so let's do this. We still need the previous way of reading all siblings from a given list of cores, but we can simplify that as well, and the separation of concerns still provides an understandability benefit.

The allocation algorithm already supports an arbitrary number of levels, so we can future-proof the allocation and read all information about cache levels that the kernel provides. We can also use the assumption that caches are named the same across all cores, and read the cache names only once instead of separately for every core.
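As a rough sketch of reading the cache names once: on Linux, sysfs lists the caches of a core as directories named index0, index1, ... below /sys/devices/system/cpu/cpuN/cache. The helper name and the choice of cpu0 as the reference core are assumptions for illustration, not the code of this PR.

```python
import os
import re


def read_cache_names(cpu_dir):
    """Return the cache directory names (index0, index1, ...) of one core.

    cpu_dir: directory of a single core, e.g. /sys/devices/system/cpu/cpu0.
    Entries that are not named index<N> are ignored.
    """
    cache_dir = os.path.join(cpu_dir, "cache")
    names = [
        entry
        for entry in os.listdir(cache_dir)
        if re.fullmatch(r"index\d+", entry)
    ]
    # Sort numerically so that index10 comes after index2.
    return sorted(names, key=lambda name: int(name[len("index"):]))
```

Under the stated assumption that all cores use the same cache names, calling read_cache_names("/sys/devices/system/cpu/cpu0") once would be enough, and the result could be reused when reading the per-core files for every other core.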

To allow easier generation of new tests (where we ideally can automatically generate tests for a large number of CPU configurations, including weird ones), it is desirable to be able to specify arbitrary layers more or less directly, without having to create new test classes. The final goal is to be able to generate machine configurations from a single argument (or two arguments: the layer configuration as a list and the total core count), so we can use pytest to write easily maintainable test cases.
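One possible shape of such a generator, as a sketch under simplifying assumptions: the machine is perfectly regular, cores of a unit are numbered consecutively (real machines often number hyperthreading siblings differently), and the function name and return format are hypothetical.

```python
def generate_layers(units_per_layer, total_cores):
    """Build hierarchy layers for a synthetic, perfectly regular machine.

    units_per_layer: number of units in each layer, from the innermost to
        the outermost layer, e.g. [4, 2, 1] for 4 physical cores,
        2 NUMA nodes, and 1 package.
    total_cores: total number of (virtual) cores of the machine.
    Returns one dict per layer, mapping unit id -> list of core ids.
    """
    layers = []
    for unit_count in units_per_layer:
        assert total_cores % unit_count == 0, "cores must divide evenly into units"
        cores_per_unit = total_cores // unit_count
        layers.append(
            {
                unit: list(range(unit * cores_per_unit, (unit + 1) * cores_per_unit))
                for unit in range(unit_count)
            }
        )
    return layers


layers = generate_layers([4, 2, 1], 8)
```

For this input the innermost layer is {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}. Pairs of (units_per_layer, total_cores) could then be passed to pytest.mark.parametrize, so that a new machine configuration is one more tuple in a list rather than a new test class.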

Charlie added 2 commits January 13, 2023 13:27

Assignment algorithm redesign

3bee91d

Added spreading of runs to assignment algorithm

327d882

CGall42 added the "resource allocation" label (related to allocation of resources like CPU cores and memory) Jan 19, 2023

CGall42 requested a review from PhilippWendler January 19, 2023 21:34

CGall42 self-assigned this Jan 19, 2023

CGall42 marked this pull request as draft January 19, 2023 21:39

PhilippWendler requested changes Jan 26, 2023

Charlie and others added 23 commits February 2, 2023 11:20

Merge branch 'main' into resources-update-core-assignment

30fe0db

Reformatting

6ee6ef3

Fixed copyright header

bb281b0

Comments updated

f540424

Added root hierarchy level

59c6e7d

This method adds a "root" hierarchy level if the system topology doesn't have one. This is necessary for iterating through the whole topology.
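A minimal sketch of the idea, assuming that hierarchy levels are stored as a list of dicts (unit id -> list of core ids) ordered from the innermost to the outermost level. The function name and this data layout are illustrative assumptions, not necessarily what the commit implements.

```python
def add_root_level(hierarchy_levels, all_cores):
    """Append a level with a single unit holding all cores, if missing.

    hierarchy_levels: list of dicts (unit id -> list of core ids), ordered
        from the innermost to the outermost level.
    all_cores: iterable of all core ids of the machine.
    """
    # If the outermost level already consists of exactly one unit,
    # it serves as the root and nothing needs to be added.
    if not hierarchy_levels or len(hierarchy_levels[-1]) > 1:
        hierarchy_levels.append({0: sorted(all_cores)})
    return hierarchy_levels


# Hypothetical machine with two packages and no common level above them.
levels = add_root_level([{0: [0, 1], 1: [2, 3]}], [0, 1, 2, 3])
```

With a single top-level unit in place, a traversal can always start from one node and reach every core, even on machines with several packages.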

Fix distribution algorithm

52fec71

The algorithm now chooses the right starting core for the next thread.

Various improvements

e2c2a2f

Unittest rewrite for new core assignment

fcf7e34

fix imports

ce795b6

Added missing return statement, accidentally deleted method

7f06e78

Refactoring

869554e

Refactoring

212e25d

Merge branch 'main' into resources-update-core-assignment

fab1fca

Comments cleanup

8be832e

Siblings System Call now includes new path additionally

c69d56d

System Calls Added for L3caches, Groups, Dies, Clusters, Drawers, Books

45e5446

fixed variable names

7257975

added filter for slow cores

7b06ee8

comments edited

8d07ad8

formatting and comments edited

9a0a940

System Calls Error Handling added & Minor fixes

db390c2

resources: modified distribution algorithm for tighter packing & minor fixes

e2c1e9c

Refactoring

8cce256

PhilippWendler and others added 30 commits February 14, 2024 09:39

Remove a test that is no longer relevant

ef13d2a

The new get_cpu_distribution method no longer has any information about partial physical cores; this is checked outside of it.

Temporarily disable some tests that need to be investigated or fixed

deae4ac

It is better to have CI green to be able to notice further regressions.

Fix crash for machines with a single NUMA node in get_closest_nodes

fd58f75

Also add unit tests for this function.

remove irrelevant code

f657e9a

Simplify get_closest_nodes

8b55c58

Also do not use "raise Exception", use assert to encode coding assumptions.

Refactoring: list comprehension is easier than temporary list and for

d0a558b

Remove redundant str() calls in format() arguments

64aa7e5

Refactoring: use max() instead of sorting just to get largest element

bd1f11a

Simplifications

46a8d42

Remove debug logging statements without any understandable message

490d20a

Avoid duplicate log message about L3 cache in failure case

5e299c3

Use standard string format for logging, not f-strings

3a46391

Only the standard format is lazy.
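The difference can be demonstrated with a small experiment. The class and logger name below are made up for the demonstration; the behavior shown is that of the standard logging module, which formats %-style arguments only if the record is actually emitted.

```python
import logging


class Expensive:
    """Counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive"


logger = logging.getLogger("lazy_logging_demo")
logger.setLevel(logging.WARNING)  # debug messages are discarded

lazy = Expensive()
logger.debug("Value: %s", lazy)  # argument is never converted to a string

eager = Expensive()
logger.debug(f"Value: {eager}")  # f-string is evaluated before the call
```

Afterwards lazy.str_calls is 0 and eager.str_calls is 1: with the f-string the conversion cost is paid even though the message is thrown away.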

remove unused code

316a64e

Refactoring: Remove allCpus parameter from check_distribution_feasibility

81f2b35

It is not really necessary.

remove unused datastructure

289b780

Refactor get_sub_unit_dict

a949fda

This method actually has nothing to do with "sub" units (children), it just takes a set of cores and a level and groups the cores as appropriate for the level. So the names should reflect that.
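The described behavior (take a set of cores and a level, and group the cores as appropriate for the level) could look roughly like the following. The function name and the representation of a level as a dict of unit id -> core list are assumptions for illustration; the renamed method in the commit may differ.

```python
def group_cores_by_unit(cores, level):
    """Group the given cores by the unit of `level` they belong to.

    cores: iterable of core ids to group.
    level: dict mapping unit id -> list of all core ids in that unit.
    Returns a dict unit id -> list of the given cores in that unit;
    units that contain none of the given cores are omitted.
    """
    core_set = set(cores)
    groups = {}
    for unit, unit_cores in level.items():
        selected = [core for core in unit_cores if core in core_set]
        if selected:
            groups[unit] = selected
    return groups


# Hypothetical level with two NUMA nodes of four cores each.
numa_level = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
groups = group_cores_by_unit([1, 2, 5], numa_level)
```

Nothing in this function refers to parent or child units, which is why a name based on grouping describes it better than one based on "sub" units.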

small fixes for CI

12fcb40

Merge branch 'main' into resources-update-core-assignment

2404d62

Re-add logging config in tests to make tests working again

2bc4d20

copied new tests, unchanged, to have as many tests as possible

9636ee0

remove redundant sorting and removal of duplicate layers

d0d01cd

As the new method already keeps the layers in the correct order and doesn't create duplicate layers, we can remove those parts from the code

replace layer generation code

cab9c90

After showing that the new layer generation code is equivalent to the existing one, we can remove the old code with the hardcoded layers and use the new code.

fix unnecessary import

4073a2b

CGall42 commented Jan 19, 2023 • edited by PhilippWendler