Linux kernel bugs
The following hwloc error messages are caused by the Linux kernel reporting invalid topology information. Recent errors are listed first.
****************************************************************************
* hwloc 1.11.8 has encountered what looks like an error from the operating system.
*
* L3 (cpuset 0x60000060) intersects with NUMANode (P#0 cpuset 0x3f00003f
nodeset 0x00000001) without inclusion!
Fixed in Linux 4.14 by this commit (and backported to 4.13.16):
commit 2b83809a5e6d619a780876fcaf68cdc42b50d28c
Author: Suravee Suthikulpanit
Date: Mon Jul 31 10:51:59 2017 +0200
x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask
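As an illustration of what "intersects without inclusion" means for the two cpusets in the message above, here is a minimal Python sketch. The `relation` helper is hypothetical (it is not part of hwloc); it only treats each cpuset as an integer bitmask and compares them.

```python
def relation(a: int, b: int) -> str:
    """Classify how two CPU bitmasks relate to each other."""
    common = a & b
    if common == 0:
        return "disjoint"
    if common == a and common == b:
        return "equal"
    if common == a:
        return "first included in second"
    if common == b:
        return "second included in first"
    return "intersects without inclusion"


l3 = 0x60000060    # L3 cpuset from the error message
numa = 0x3f00003f  # NUMANode P#0 cpuset from the error message

print(relation(l3, numa))  # intersects without inclusion
print(hex(l3 & numa))      # 0x20000020: CPUs in both objects
print(hex(l3 & ~numa))     # 0x40000040: CPUs in the L3 but outside the NUMA node
```

A cache that spans CPUs both inside and outside a NUMA node cannot be placed in a tree-shaped topology, which is why hwloc reports it.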
Each dual-NUMA package is reported as two single-NUMA packages.
Fixed in Linux 3.18 by this commit:
commit cebf15eb09a2fd2fa73ee4faa9c4d2f813cf0f09
Author: Dave Hansen
Date: Thu Sep 18 12:33:34 2014 -0700
x86, sched: Add new topology for multi-NUMA-node CPUs
****************************************************************************
* hwloc 1.11.2 has encountered an incorrect PCI locality information.
* PCI bus 0000:80 is supposedly close to 2nd NUMA node of 1st package,
* however hwloc believes this is impossible on this architecture.
* Therefore the PCI bus will be moved to 1st NUMA node of 2nd package.
*
* If you feel this fixup is wrong, disable it by setting in your environment
* HWLOC_PCI_0000_80_LOCALCPUS= (empty value), and report the problem
* to the hwloc's user mailing list together with the XML output of lstopo.
*
* You may silence this message by setting HWLOC_HIDE_ERRORS=1 in your environment.
This problem may look similar to the previous one, but it is very different: it is a BIOS bug, so there is nothing to fix in the kernel. hwloc detects the issue and fixes it automatically.
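If the automatic fixup needs to be disabled, the message gives the variable name for bus 0000:80. The Python sketch below builds that name from a bus identifier. The `pci_fixup_override` helper is hypothetical, and applying the same naming pattern to other buses is an assumption drawn from this single example.

```python
import os


def pci_fixup_override(bus_id: str) -> dict:
    """Return the environment entry that disables the fixup for one PCI bus.

    bus_id is "domain:bus" as printed in the message, e.g. "0000:80".
    Assumption: other buses follow the same HWLOC_PCI_<domain>_<bus>_LOCALCPUS
    pattern as the one shown in the message.
    """
    domain, bus = bus_id.split(":")
    # The message asks for an empty value.
    return {f"HWLOC_PCI_{domain}_{bus}_LOCALCPUS": ""}


env = dict(os.environ)
env.update(pci_fixup_override("0000:80"))
print(sorted(k for k in env if k.startswith("HWLOC_PCI_")))
```

The resulting `env` can be passed to a child process that uses hwloc. To only silence the message instead, set `HWLOC_HIDE_ERRORS` to `1` as the message says.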
****************************************************************************
* Hwloc has encountered what looks like an error from the operating system.
*
* object (L3 cpuset 0x000003f0) intersection without inclusion!
The fix was NEVER pushed to Linux.
Use hwloc >=1.11.2 and set HWLOC_COMPONENTS=x86 in your environment to work around the issue.
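A minimal Python sketch of applying this workaround when launching an hwloc-based program from a script. The `with_x86_backend` and `run_with_x86_backend` helpers are hypothetical; only the `HWLOC_COMPONENTS=x86` setting comes from this page.

```python
import os
import subprocess


def with_x86_backend(base_env=None) -> dict:
    """Return a copy of the environment with HWLOC_COMPONENTS=x86 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HWLOC_COMPONENTS"] = "x86"
    return env


def run_with_x86_backend(cmd):
    """Run a command (a list of arguments) with the workaround applied."""
    return subprocess.run(cmd, env=with_x86_backend(), check=False)


# Example, assuming lstopo from hwloc >=1.11.2 is installed:
# run_with_x86_backend(["lstopo"])
```

Setting the variable only for the child process keeps the workaround out of the rest of the session.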
****************************************************************************
* Hwloc has encountered what looks like an error from the operating system.
*
* Socket (P#2 cpuset 0x0000ffff,0x0) intersects with NUMANode (P#3 cpuset
0x0000ff00,0xff000000) without inclusion!
This is likely not a kernel bug but rather a BIOS reporting invalid SRAT information.
Upgrading the BIOS is the only way to get a proper fix. Otherwise, use hwloc >=1.11.2 and set HWLOC_COMPONENTS=x86 in your environment to work around the issue.
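To see which CPUs conflict in the Socket/NUMANode message above, the comma-separated cpusets can be decoded as in this Python sketch. The helper names are hypothetical, and the decoding assumes each comma-separated field is a 32-bit word with the most significant word first.

```python
def parse_cpuset(text: str) -> int:
    """Parse a comma-separated cpuset string into one integer bitmask.

    Assumption: each field is a 32-bit hexadecimal word, most significant first.
    """
    mask = 0
    for word in text.split(","):
        mask = (mask << 32) | int(word, 16)
    return mask


def cpus(mask: int) -> list:
    """List the indexes of the bits set in a mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


socket = parse_cpuset("0x0000ffff,0x0")         # Socket P#2 from the message
numa = parse_cpuset("0x0000ff00,0xff000000")    # NUMANode P#3 from the message

print("in both:     ", cpus(socket & numa))     # CPUs 40-47
print("socket only: ", cpus(socket & ~numa))    # CPUs 32-39
print("NUMA only:   ", cpus(numa & ~socket))    # CPUs 24-31
```

Each object contains CPUs that the other lacks, so neither can be nested inside the other.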