Red Hat Bugzilla – Bug 507033
multi-socket Intel 5500/Nehalem systems /sys shows all cores on first node, none on 2nd (this is not correct)
Last modified: 2010-06-28 09:10:23 EDT
This issue is somewhat similar to BZ 506805. This is with respect to
The issue is that on multi-socket 5500 series / Nehalem systems such as
the SGI XE270, all of the cores show up on the first node and zero cores
show up on the 2nd node.
SGI XE270, Supermicro X8DTN v 1.1 mainboard
Memory: 8 GB total, made up of 2GB DDR3 1066 MHz DIMMs, part number
2 4-core sockets, hyperthreading turned off.
cpu info: Intel(R) Xeon(R) CPU X5570 @ 2.93GHz
(cpuinfo will be attached).
For example, for the first node:
[root@cct201 ~]# ls /sys/devices/system/node/node0
cpu0 cpu2 cpu4 cpu6 cpulist distance numastat
cpu1 cpu3 cpu5 cpu7 cpumap meminfo scan_unevictable_pages
[root@cct201 ~]# cat /sys/devices/system/node/node0/cpulist
For the 2nd node:
[root@cct201 ~]# ls /sys/devices/system/node/node2
cpulist cpumap distance meminfo numastat scan_unevictable_pages
[root@cct201 ~]# cat /sys/devices/system/node/node2/cpulist
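The same check can be scripted rather than done by hand with ls and cat. The following is a minimal sketch, assuming the standard sysfs layout shown above (nodeN directories containing a cpulist file). The helper names parse_cpulist and cpus_per_node are made up for illustration and are not part of any tool mentioned in this report.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a sysfs cpulist string such as "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_per_node(root="/sys/devices/system/node"):
    """Map each nodeN directory under root to the CPUs named in its cpulist."""
    result = {}
    for node_dir in sorted(Path(root).glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            result[node_dir.name] = parse_cpulist(cpulist.read_text())
    return result


if __name__ == "__main__":
    for node, cpus in cpus_per_node().items():
        note = "" if cpus else "  <-- no CPUs on this node"
        print(f"{node}: {cpus}{note}")
```

On a system showing this bug, one would expect every CPU to be listed under node0 and the second node to be flagged as having none, matching the ls output above.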
/proc/cpuinfo confirms there are indeed two sockets and that hyperthreading is
off. cpu cores is 4, siblings is 4, and the total core count is 8.
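For reference, the socket count and hyperthreading state can be derived from /proc/cpuinfo roughly as follows. This is a sketch, not the method used by the reporter: it assumes the usual x86 field names ("physical id", "siblings", "cpu cores"), and summarize_cpuinfo is a hypothetical helper name.

```python
def summarize_cpuinfo(text):
    """Return (logical_cpus, sockets, hyperthreading_on) from /proc/cpuinfo text."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates one processor entry from the next.
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)

    # Each distinct "physical id" is one socket.
    sockets = {e["physical id"] for e in entries if "physical id" in e}
    # Hyperthreading shows up as more siblings (threads) than cores per socket.
    ht_on = any(
        int(e["siblings"]) > int(e["cpu cores"])
        for e in entries
        if "siblings" in e and "cpu cores" in e
    )
    return len(entries), len(sockets), ht_on


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        cpus, sockets, ht_on = summarize_cpuinfo(f.read())
    print(f"logical CPUs: {cpus}, sockets: {sockets}, hyperthreading: {ht_on}")
```

With siblings equal to cpu cores (4 and 4 here) hyperthreading is off, and 8 logical CPUs across 4-core packages implies two sockets.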
kernel version is: 2.6.29.4-167.fc11.x86_64
It is reported by Kannan Somangili, who also observed this problem on a
different multi-socket Nehalem system, that RHEL 5.3 and SLES 10 SP2 do
not suffer from this problem.
dmesg is being attached. Here are some interesting dmesg segments:
[root@cct201 ~]# dmesg|grep SRAT
ACPI: SRAT BF79A4C0, 0150 (r1 052809 OEMSRAT 1 INTL 1)
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 2 -> Node 0
SRAT: PXM 0 -> APIC 4 -> Node 0
SRAT: PXM 0 -> APIC 6 -> Node 0
SRAT: PXM 1 -> APIC 16 -> Node 1
SRAT: PXM 1 -> APIC 18 -> Node 1
SRAT: PXM 1 -> APIC 20 -> Node 1
SRAT: PXM 1 -> APIC 22 -> Node 1
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 100000-c0000000
SRAT: Node 0 PXM 0 100000000-140000000
SRAT: Node 2 PXM 257 140000000-240000000
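The mismatch in the SRAT lines above can be made explicit by comparing which nodes receive CPUs with which nodes receive memory. The sketch below parses only the two line formats shown in this dmesg excerpt; the name srat_summary is hypothetical. The observation that 257 is 0x101, so its low byte is 1, is simple arithmetic. Whether the upper bits are spurious is an assumption that this report does not confirm.

```python
import re

# The SRAT lines quoted in this report, copied verbatim.
REPORTED_SRAT = """\
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 2 -> Node 0
SRAT: PXM 0 -> APIC 4 -> Node 0
SRAT: PXM 0 -> APIC 6 -> Node 0
SRAT: PXM 1 -> APIC 16 -> Node 1
SRAT: PXM 1 -> APIC 18 -> Node 1
SRAT: PXM 1 -> APIC 20 -> Node 1
SRAT: PXM 1 -> APIC 22 -> Node 1
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 100000-c0000000
SRAT: Node 0 PXM 0 100000000-140000000
SRAT: Node 2 PXM 257 140000000-240000000
"""

CPU_RE = re.compile(r"SRAT: PXM (\d+) -> APIC (\d+) -> Node (\d+)")
MEM_RE = re.compile(r"SRAT: Node (\d+) PXM (\d+) ([0-9a-f]+)-([0-9a-f]+)")


def srat_summary(dmesg_text):
    """Return ({cpu_node: pxm}, {mem_node: pxm}) from SRAT dmesg lines."""
    cpu_nodes, mem_nodes = {}, {}
    for line in dmesg_text.splitlines():
        m = CPU_RE.search(line)
        if m:
            pxm, _apic, node = map(int, m.groups())
            cpu_nodes.setdefault(node, pxm)
            continue
        m = MEM_RE.search(line)
        if m:
            mem_nodes.setdefault(int(m.group(1)), int(m.group(2)))
    return cpu_nodes, mem_nodes


if __name__ == "__main__":
    cpu_nodes, mem_nodes = srat_summary(REPORTED_SRAT)
    print("nodes with CPUs but no memory:", sorted(set(cpu_nodes) - set(mem_nodes)))
    print("nodes with memory but no CPUs:", sorted(set(mem_nodes) - set(cpu_nodes)))
    for node, pxm in sorted(mem_nodes.items()):
        # Assumption: only the low byte of the proximity domain is meaningful.
        print(f"memory node {node}: PXM {pxm} = {pxm:#x}, low byte {pxm & 0xFF}")
```

For the lines in this report, the CPUs of the second socket are assigned to node 1 while the second socket's memory is assigned to node 2, so neither node ends up with both CPUs and memory.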
Here is a segment with tracebacks:
NUMA: Allocated memnodemap from 18000 - 1c880
NUMA: Using 20 for the hash shift.
Bootmem setup node 0 0000000000000000-0000000140000000
NODE_DATA [000000000001c880 - 000000000003187f]
bootmap [0000000000032000 - 0000000000059fff] pages 28
(8 early reservations) ==> bootmem [0000000000 - 0140000000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0000200000 - 0000ac700c] TEXT DATA BSS ==> [0000200000 - 0000ac700c]
#3 [0037c97000 - 0037fefdc0] RAMDISK ==> [0037c97000 - 0037fefdc0]
#4 [000009a800 - 0000100000] BIOS reserved ==> [000009a800 - 0000100000]
#5 [0000010000 - 0000013000] PGTABLE ==> [0000010000 - 0000013000]
#6 [0000013000 - 0000018000] PGTABLE ==> [0000013000 - 0000018000]
#7 [0000018000 - 000001c880] MEMNODEMAP ==> [0000018000 - 000001c880]
Bootmem setup node 2 0000000140000000-0000000240000000
NODE_DATA [0000000140000000 - 0000000140014fff]
bootmap [0000000140015000 - 0000000140034fff] pages 20
(8 early reservations) ==> bootmem [0140000000 - 0240000000]
#0 [0000000000 - 0000001000] BIOS data page
#1 [0000006000 - 0000008000] TRAMPOLINE
#2 [0000200000 - 0000ac700c] TEXT DATA BSS
#3 [0037c97000 - 0037fefdc0] RAMDISK
#4 [000009a800 - 0000100000] BIOS reserved
#5 [0000010000 - 0000013000] PGTABLE
#6 [0000013000 - 0000018000] PGTABLE
#7 [0000018000 - 000001c880] MEMNODEMAP
found SMP MP-table at [ffff8800000ff780] 000ff780
[ffffe20000000000-ffffe200045fffff] PMD -> [ffff880028200000-ffff88002b9fffff] on node 0
[ffffe20004600000-ffffe20007dfffff] PMD -> [ffff880140200000-ffff8801439fffff] on node 2
Zone PFN ranges:
Created attachment 348725
Created attachment 348726
Feel silly for saying they were tracebacks. I'm just trying to go home for the day :)
confirmed present in 126.96.36.199 community kernel.
Also still a problem in 2.6.30-git14
I applied the community fix (see the LKML links in 506805) and confirmed
this issue is fixed with that patch set. I applied the patch set against
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '11'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 11's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 11 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.