Distrib #370
Open
NicolasDenoyelle wants to merge 135 commits into open-mpi:master from NicolasDenoyelle:distrib
NicolasDenoyelle force-pushed the distrib branch from 2e5e198 to 4d1d25a on October 22, 2019 12:06
No code change, just add comments about things being official instead of assumptions. The Gen5 spec is pretty much finalized, things won't change anymore. Signed-off-by: Brice Goglin <[email protected]>
Check that Linux can add NUMA nodes to x86 CPU information. And check that Linux can annotate x86 AMD topoext NUMA nodes. Signed-off-by: Brice Goglin <[email protected]>
If the workspace clone ever ran on another branch (e.g. in my zbgoglin jobs), git branch returns multiple lines, which causes the 2nd branch name to be run as a command line after the only expected line "job-0-tarball.sh <firstbranch>". Use git rev-parse --abbrev-ref HEAD instead. Signed-off-by: Brice Goglin <[email protected]>
This tells the code not to ever merge that group with structurally-identical parent or children. This is useful for Groups implementing new "types" that cannot be backported to stable releases. New types won't be merged by default, but Groups would. Requested by Intel for Die objects. This doesn't break the ABI because the attribute structure has always been calloc'ed, which means this attribute was "0", which matches the default "merge group" behavior. Signed-off-by: Brice Goglin <[email protected]>
Update CPUID.1f x86 test case not to merge Die groups anymore. Hence there's no need to ignore Caches anymore. Signed-off-by: Brice Goglin <[email protected]>
…ile/module types Make them groups. Signed-off-by: Brice Goglin <[email protected]>
…rectories Signed-off-by: Brice Goglin <[email protected]>
I managed to convince Intel that adding another foo_siblings between core_siblings and thread_siblings would break userspace, and the situation could be even worse if they ever add another intermediate level in the future. So they are finally renaming to filenames whose semantics don't depend on intermediate levels: core_cpus and package_cpus. Signed-off-by: Brice Goglin <[email protected]>
Linux 5.3 will have new "die_cpus" and "die_id" sysfs files for upcoming architectures with multiple dies per package. When the die cpuset is different from the package, add a "Die" group. Don't add it when there's a single Die per package because most CPUs don't want to show a useless additional Die level. We don't want to set the Die level to keep_structure because it would get automerged in L3 caches on CLX, and lstopo displays everything by default anyway. Set the "dont_merge" group flag if HWLOC_DONT_MERGE_DIE_GROUPS is set in the environment, just like in the x86 backend. Signed-off-by: Brice Goglin <[email protected]>
Old kernels exposed two packages on E5v3 in Cluster-on-Die mode because the package core_siblings was wrong. We detected that case when two packages had the same physical_package_id. This was fixed in Linux 3.18, backported in RHEL7. Other important distros use a more recent kernel now. Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Just like for cores and packages. Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Fixes commit c1c34a6 Signed-off-by: Brice Goglin <[email protected]>
Otherwise the matrix would be wrong. Further fixes commit c1c34a6 Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
…vailable Hence, we don't have to run both on Linux/x86 anymore, and we don't have to manually tarball the CPUID files. Refs open-mpi#186 Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Use realpath so that we can change the current directory without breaking the destination relative directory. Signed-off-by: Brice Goglin <[email protected]>
Reported by Intel from the output of klocwork. Signed-off-by: Brice Goglin <[email protected]>
Reported by Intel from the output of klocwork. Signed-off-by: Brice Goglin <[email protected]>
Reported by Intel from the output of klocwork. Signed-off-by: Brice Goglin <[email protected]>
…objects during build Thanks to Eloi Gaudry for the patch. Signed-off-by: Brice Goglin <[email protected]>
Instead of having all of them in the main solution file. Thanks to Eloi Gaudry for the patch. Signed-off-by: Brice Goglin <[email protected]>
Defined with recent VS. Signed-off-by: Brice Goglin <[email protected]>
Thanks to Eloi Gaudry for the patch. We force retarget to an old vs110 for ci.inria.fr. Signed-off-by: Brice Goglin <[email protected]>
Thanks to Eloi Gaudry for the patch. Signed-off-by: Brice Goglin <[email protected]>
Move idea of hwloc-ps to a github issue. Update some comments, add details for command-line build. Thanks to Eloi Gaudry for the suggestion. Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Will run in the extended nightly tests. Runs only on master on the main repo by default. Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
…-processors Closes open-mpi#368 Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
They are renamed to PREFIX_hwloc_FOO instead of PREFIX_HWLOC_FOO We could fix it but it doesn't matter much (people aren't supposed to use those renamed names anyway) and it could break existing hacks (if anybody actually depends on such renamed name). Thanks to Samuel K. Gutierrez for the report. Signed-off-by: Brice Goglin <[email protected]>
Don't AND(normal, topology_allowed) in the normal (v2) case to avoid hiding internal allowed set bugs. Signed-off-by: Brice Goglin <[email protected]>
In some (old?) corner cases, Linux cpusets may return offline PUs in the allowed sets of cpusets/cgroups. Signed-off-by: Brice Goglin <[email protected]>
…ectory fsroot and cpuid are implemented in tools using environment variables (those debug cases are not in the API since v2). Those backends forced by environment variable override the normal topology thissystem flag that may be set with set_flags() in the API and with --flags or --thissystem in cli tools. One must use the HWLOC_THISSYSTEM envvar to force the thissystem flag. Implement this automatically in the tools (common helpers). Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: Brice Goglin <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
Signed-off-by: ndenoyelle <[email protected]>
NicolasDenoyelle force-pushed the distrib branch from 2024770 to cbb230a on October 22, 2019 18:42
@NicolasDenoyelle can you rebase/squash these commits to ease review?
Yes.
I am busy at the moment.
I'll let you know as soon as I am done!
…On Wed, Dec 4, 2019, 04:39 Brice Goglin ***@***.***> wrote:
@NicolasDenoyelle <https://github.com/NicolasDenoyelle> can you
rebase/squash these commits to ease review?
Note to open pull requests: some things changed in the CI yesterday, you'll need to rebase on top of master to avoid total CI failure.
This branch extends the existing hwloc_distrib() function, which distributes cpusets over the topology.
It adds a new way to iterate over the topology objects of a single level according to a hierarchical policy.
The hwloc-distrib utility has been modified to expose the new capabilities.
The purpose of this branch is to bring thread-binding policies to the hwloc toolset when it is used together with the hwloc-thread-bind branch.
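
To illustrate what iterating over one level "with a hierarchical policy" can mean, here is a minimal, self-contained sketch. It is not the code from this branch and does not use the hwloc API: the toy topology (2 packages x 2 cores x 2 PUs) and the names `toy_arity` and `toy_distrib_index` are hypothetical, chosen only for the example. The policy is expressed as the order in which levels are cycled through, from fastest-varying to slowest-varying.

```c
#include <stdio.h>

#define NLEVELS 3 /* hypothetical toy levels: 0 = package, 1 = core, 2 = PU */

/* Assumed regular toy topology: 2 packages x 2 cores x 2 PUs. */
static const unsigned toy_arity[NLEVELS] = {2, 2, 2};

/* Return the logical PU index visited at step i of the iteration.
 * order[] lists the levels from fastest-varying to slowest-varying:
 * {2, 1, 0} fills a core, then a package (compact);
 * {0, 1, 2} alternates packages first (scatter). */
static unsigned toy_distrib_index(unsigned i, const unsigned order[NLEVELS])
{
    unsigned coord[NLEVELS];
    unsigned k, index = 0;

    /* Decompose the step number into one coordinate per level. */
    for (k = 0; k < NLEVELS; k++) {
        unsigned level = order[k];
        coord[level] = i % toy_arity[level];
        i /= toy_arity[level];
    }

    /* Recombine the coordinates into a logical index (package-major). */
    for (k = 0; k < NLEVELS; k++)
        index = index * toy_arity[k] + coord[k];

    return index;
}

/* Print the full visiting order for a given policy. */
static void toy_print_order(const char *name, const unsigned order[NLEVELS])
{
    unsigned i;
    printf("%s:", name);
    for (i = 0; i < 8; i++)
        printf(" %u", toy_distrib_index(i, order));
    printf("\n");
}
```

Calling `toy_print_order("scatter", (const unsigned[]){0, 1, 2})` from a `main()` lists the PUs so that consecutive steps land on different packages, while the order `{2, 1, 0}` visits them in plain logical order. A real implementation would walk the hwloc object tree rather than assume a regular arity per level.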