Bug 519431 - Single socket Nehalem-EP causes issues in /proc/cpuinfo
Summary: Single socket Nehalem-EP causes issues in /proc/cpuinfo
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.8
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Luming Yu
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 499416
Reported: 2009-08-26 15:28 UTC by Jon Thomas
Modified: 2018-11-14 19:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-14 20:22:39 UTC



Description Jon Thomas 2009-08-26 15:28:19 UTC
Description of problem:
When RHEL4.8 x86_64 is installed on a system with a single-socket
Nehalem-EP processor, anaconda chooses the smp kernel because the
Nehalem-EP processor has 8 logical processors, fewer than 16.

In that case, the OS reports some processor information incorrectly, as follows.
---------------------
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
stepping : 4
cpu MHz : 1596.000
cache size : 8192 KB
physical id : 2
siblings : 1
core id : 255
cpu cores : 1
--------------------

This is taken from /proc/cpuinfo (see also the attachments).
The physical id, siblings, core id and cpu cores values are invalid.
In the same environment, these values should be as follows.
--------------------
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
stepping : 4
cpu MHz : 1596.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
---------------------

This issue is caused by the following code.

arch/x86_64/kernel/setup.c L910-L923
------------------------------------------------------------------------

	cpuid(1, &eax, &ebx, &ecx, &edx);
	smp_num_siblings = (ebx & 0xff0000) >> 16;

	if (smp_num_siblings == 1) {
		printk(KERN_INFO  "CPU: Hyper-Threading is disabled\n");
	} else if (smp_num_siblings > 1) {

		index_msb = 31;

		if (smp_num_siblings > NR_CPUS) {
			printk(KERN_WARNING "CPU: Unsupported number of siblings %d", smp_num_siblings);
			smp_num_siblings = 1;
			return;
		}
------------------------------------------------------------------------

"smp_num_siblings" holds the maximum number of logical processors
as reported by the x86 CPUID instruction.  In the Nehalem-EP case, smp_num_siblings is 16, not 8.
On the other hand, "NR_CPUS" is 8 in the RHEL4.8 x86_64 smp kernel.
Therefore, " if (smp_num_siblings > NR_CPUS) " is true, the function returns early,
and the subsequent initialization is skipped.
As a result, cpu_core_id, phys_proc_id and the related variables remain uninitialized.

This issue does not reproduce on the i386 architecture or with the x86_64 largesmp kernel
because "NR_CPUS" is not 8 in those cases.

Version-Release number of selected component (if applicable):
RHEL4.8 x86_64 kernel-2.6.9-89.ELsmp


Hardware info:
The server is equipped with an Intel Xeon 5500 series processor (known as Nehalem-EP).
One such server is the Express5800/R120a-1.

Comment 3 Jon Thomas 2009-09-15 14:04:43 UTC
Apparently there is a NUMA issue between the two kernels, outlined in IT 339779.

NUMA in the regular smp kernel is unbalanced, which causes poor performance for parallel workloads.


alancha@caliph6:~> sudo dmihardware
Cisco Systems Inc N20-B6620-1
alancha@caliph6:~> cat /etc/motd
Cisco Linux 5.03-4 Kickstarted on: Sat Sep 5 05:03:44 PDT 2009.
alancha@caliph6:~> uname -a
Linux caliph6 2.6.9-89.0.10.ELsmp #1 SMP Fri Aug 21 17:14:28 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
alancha@caliph6:~> numastat
                  node1      node0
numa_hit          89486    2498428
numa_miss             0          0
numa_foreign          0          0
interleave_hit    89486      86840
local_node            0    2498428
other_node        89486          0
##
- showing more than 1 node, that means numa is turned on...
alancha@caliph6:~> ls /sys/devices/system/node/node[01]
/sys/devices/system/node/node0:
cpu0 cpu1 cpu2 cpu3 cpu4 cpu5 cpu6 cpu7 cpumap meminfo numastat
/sys/devices/system/node/node1:
cpumap meminfo numastat
##
- but... it's totally unbalanced, node0 has all the cpus...
Now I'm rebooting with the rc2's kernel...
##
alancha@caliph6:~> uname -a
Linux caliph6 2.6.9-89.0.9.ELlargesmp #1 SMP Wed Aug 19 08:12:11 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
alancha@caliph6:~> numastat
                  node1      node0
numa_hit        1036997    1472009
numa_miss             0          0
numa_foreign          0          0
interleave_hit    88865      86562
local_node       981327    1439606
other_node        55670      32403
##
- numa is on...
alancha@caliph6:~> ls /sys/devices/system/node/node[01]
/sys/devices/system/node/node0:
cpu0 cpu2 cpu4 cpu6 cpumap meminfo numastat
/sys/devices/system/node/node1:
cpu1 cpu3 cpu5 cpu7 cpumap meminfo numastat
##
- and it's perfect balance!

Comment 19 John Villalovos 2010-12-15 00:47:20 UTC
So in reality this is an Anaconda bug, since the wrong kernel is getting installed.

I think it should be installing the SMP kernel and not the largeSMP kernel.

There is a bug in the kernel too, but if the correct kernel is used you wouldn't see the issue, I think.

Comment 20 Luming Yu 2010-12-15 02:08:02 UTC
Please make the necessary private comments visible to me (luyu@redhat.com) so that I can follow up on this bug.

Thanks,
Luming

Comment 21 Luming Yu 2010-12-15 08:11:56 UTC
> The "smp_num_siblings" holds the maximum number of logical processors
> taken by x86 CPUID instruction.  In the Nehalem-EP case, smp_num_siblings is
> 16, not 8.
> On the other hand, the "NR_CPUS" equals to 8 in case of RHEL4.8 x86_64 smp
> kernel.
> Therefore, " if (smp_num_siblings > NR_CPUS) " holds true and the subsequent
> initializations are skipped.
> As a result, cpu_core_id, phys_proc_id and relevant variables remain
> uninitialized.

There are several options:
1. Bump NR_CPUS for the smp kernel up to 16, but this will still fail for future processors that come with more and more cores.
2. Use the largesmp kernel, as John said, to use the full 16 logical processors.
3. Turn off HT in the BIOS to get 8 cores.

If you don't like the three options listed above, and really need /proc/cpuinfo to be displayed correctly while leaving half of the logical processor resources unused, then please let me know. I will come up with a proper fix for this case.

Comment 22 John Villalovos 2010-12-15 15:22:24 UTC
(In reply to comment #19)
> So in reality this is an Anaconda bug, since the wrong kernel is getting
> installed.
> 
> I think it should be installing the SMP kernel and not the largeSMP kernel.
> 
> There is a bug in the kernel too, but if the correct kernel is used you
> wouldn't see the issue, I think.

I got that backwards.  I think Anaconda should be installing the "largesmp" kernel, but it installs the "smp" kernel instead.

