Bug 707292 - cpu socket detection fails on some 5.7 i386 boxes
Summary: cpu socket detection fails on some 5.7 i386 boxes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: subscription-manager
Version: 5.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Adrian Likins
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 675214
 
Reported: 2011-05-24 15:57 UTC by Adrian Likins
Modified: 2021-03-02 23:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-21 08:45:26 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2011:1078 0 normal SHIPPED_LIVE new package: subscription-manager 2011-07-21 08:45:07 UTC

Description Adrian Likins 2011-05-24 15:57:27 UTC
Description of problem:
The cpu socket detection code seems to fail on some 5.7 i386 boxes
(and potentially other arches).

The code looks for /sys/devices/system/cpu/cpuN/topology/physical_package_id
which just does not seem to be there on some i386 boxes.


How reproducible:
Not sure, different boxes seem to do different things. kvm i386 boxes have
/sys/devices/system/cpu/cpu0/topology/physical_package_id
but not
/sys/devices/system/cpu/cpu1/topology/physical_package_id

A bare metal Intel(R) Pentium(R) 4 CPU 3.00GHz has neither
/sys/devices/system/cpu/cpu0/topology/physical_package_id
nor
/sys/devices/system/cpu/cpu1/topology/physical_package_id

kvm x86_64 guests seem to have it.
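
To see which cpuN directories on a given box lack the file, a small check along these lines could help. This is only a diagnostic sketch, not part of subscription-manager; the function name cpus_missing_package_id and its cpu_root parameter are made up for illustration.

```python
import os
import re

# Matches cpu0, cpu1, ... but not sibling entries such as cpufreq or cpuidle.
CPU_DIR_RE = re.compile(r"^cpu\d+$")


def cpus_missing_package_id(cpu_root="/sys/devices/system/cpu"):
    """Return the cpuN names (in numeric order) that have no
    topology/physical_package_id file under cpu_root.

    Hypothetical helper for illustration; cpu_root is a parameter only so
    the check can be pointed at a copy of the sysfs tree.
    """
    try:
        entries = os.listdir(cpu_root)
    except OSError:
        # No cpu directory at all: nothing to report.
        return []

    cpu_names = [name for name in entries if CPU_DIR_RE.match(name)]
    missing = []
    for name in sorted(cpu_names, key=lambda n: int(n[3:])):
        path = os.path.join(cpu_root, name, "topology", "physical_package_id")
        if not os.path.isfile(path):
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(cpus_missing_package_id())
```

Going by the descriptions above, this would list only cpu1 on the kvm i386 guest and both cpu0 and cpu1 on the Pentium 4 box.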


Steps to Reproduce:

1. Find an i386 RHEL5.7 box
2. run "subscription-manager facts --list | grep cpu_socket"
3. see what it shows
4. check /var/log/rhsm/rhsm.log for:
Hardware detection failed: [Errno 2] No such file or directory: '/sys/devices/system/cpu//cpu1/topology/physical_package_id'

5. Take machine apart
6. count cpu sockets
7. see if it matches anything above

Comment 1 Adrian Likins 2011-05-24 21:10:27 UTC
The box I've seen with no topology/ info appears to be a P4 Xen dom0, so this might be a Xen kernel thing.

Comment 2 Chris Duryee 2011-05-25 12:51:34 UTC
A patch is out for review.

Comment 3 Adrian Likins 2011-05-25 17:45:56 UTC
commit 2a89d48fe747411f8c79038988292a3690691e36
Author: Adrian Likins <alikins>
Date:   Tue May 24 16:53:19 2011 -0400

    707292: Better counting of cpu sockets
    
    Look for physical_package_id if it exists. Handle it not being
    there for all cpu cores (i386 guest on x86_64 kvm guest).
    Handle it not existing at all (xen dom0).
    
    If all else fails, claim we have one cpu socket.
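
The behaviour described in this commit message can be sketched roughly as follows. This is not the code from commit 2a89d48; it is an illustrative reconstruction from the message only, and the name count_cpu_sockets and the cpu_root parameter are assumptions.

```python
import os
import re

# Matches cpu0, cpu1, ... but not sibling entries such as cpufreq or cpuidle.
CPU_DIR_RE = re.compile(r"^cpu\d+$")


def count_cpu_sockets(cpu_root="/sys/devices/system/cpu"):
    """Count distinct physical_package_id values, tolerating missing files.

    Hypothetical sketch: a cpu without the file is skipped, and if no cpu
    provides one (or cpu_root is unreadable) the result falls back to 1.
    """
    try:
        entries = os.listdir(cpu_root)
    except OSError:
        entries = []

    package_ids = set()
    for name in entries:
        if not CPU_DIR_RE.match(name):
            continue
        path = os.path.join(cpu_root, name, "topology", "physical_package_id")
        try:
            with open(path) as f:
                package_ids.add(f.read().strip())
        except (IOError, OSError):
            # File absent for this cpu (for example cpu1 on the kvm i386
            # guest): ignore it rather than failing the whole detection.
            continue

    # If all else fails, claim one socket.
    return len(package_ids) or 1
```

The point of the sketch is that a missing file for one cpu no longer aborts detection, which is what produced the "Hardware detection failed" line in rhsm.log.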

Comment 4 Chris Duryee 2011-05-25 17:51:53 UTC
Fixed in 2a89d48 on the rhel5.7 branch, version 0.95.5.19.

Comment 5 John Sefler 2011-05-25 22:41:32 UTC
Partial Verification on s390x hardware where detection of the number of cpu sockets has been failing (see bug 696791)....

[root@ibm-z10-12 tmp]# uname -a
Linux ibm-z10-12.rhts.eng.bos.redhat.com 2.6.18-262.el5 #1 SMP Mon May 16 17:52:56 EDT 2011 s390x s390x s390x GNU/Linux
[root@ibm-z10-12 tmp]# rpm -q subscription-manager
subscription-manager-0.95.5.19-1.git.2.2a89d48.el5


[root@ibm-z10-12 tmp]# subscription-manager facts --list
cpu.core(s)_per_socket: 2
cpu.cpu(s): 2
cpu.cpu_socket(s): 1
distribution.id: Tikanga
distribution.name: Red Hat Enterprise Linux Server
distribution.version: 5.7
memory.memtotal: 508952
memory.swaptotal: 1048568
net.interface.eth0.broadcast: 10.16.71.255
net.interface.eth0.hwaddr: 02:de:ad:be:ef:0c
net.interface.eth0.ipaddr: 10.16.66.203
net.interface.eth0.netmask: 255.255.248.0
net.interface.lo.broadcast: 0.0.0.0
net.interface.lo.hwaddr: 00:00:00:00:00:00
net.interface.lo.ipaddr: 127.0.0.1
net.interface.lo.netmask: 255.0.0.0
net.interface.sit0.broadcast: unknown
net.interface.sit0.hwaddr: 00:00:00:00:00:00
net.interface.sit0.ipaddr: unknown
net.interface.sit0.netmask: unknown
network.hostname: ibm-z10-12.rhts.eng.bos.redhat.com
network.ipaddr: 10.16.66.203
system.entitlements_valid: False
uname.machine: s390x
uname.nodename: ibm-z10-12.rhts.eng.bos.redhat.com
uname.release: 2.6.18-262.el5
uname.sysname: Linux
uname.version: #1 SMP Mon May 16 17:52:56 EDT 2011
virt.host_type: ibm_systemz
ibm_systemz-zvm
virt.is_guest: True


The fix for this bugzilla is a partial continuation of the story from https://bugzilla.redhat.com/show_bug.cgi?id=696791#c4
Notice in the facts above that on s390x the number of sockets is now set to 1 (presumably because detection of this hardware property fails and the fallback from comment #3 applies).  With respect to subscription availability based on sockets, this is a gracious default value.

Comment 6 John Sefler 2011-05-27 16:15:46 UTC
Recreating and verifying this bug is somewhat tricky.  Here we go...

Using beaker (jobs/90124) to provision an i386 machine with the kernel-xen...

[root@dell-pe1650-02 ~]# uname -a
Linux dell-pe1650-02.rhts.eng.bos.redhat.com 2.6.18-264.el5xen #1 SMP Tue May 24 15:28:04 EDT 2011 i686 i686 i386 GNU/Linux

Now let's recreate the problem...
[root@dell-pe1650-02 ~]# ls /sys/devices/system/cpu/
cpu0  cpu1
[root@dell-pe1650-02 ~]# ls /sys/devices/system/cpu/cpu0/
crash_notes  topology
[root@dell-pe1650-02 ~]# ls /sys/devices/system/cpu/cpu0/topology/
[root@dell-pe1650-02 ~]# ls /sys/devices/system/cpu/cpu1
crash_notes  online  topology
[root@dell-pe1650-02 ~]# ls /sys/devices/system/cpu/cpu1/topology/
[root@dell-pe1650-02 ~]# 
[root@dell-pe1650-02 ~]# for cpu in `ls -1 /sys/devices/system/cpu/ | egrep cpu[[:digit:]]`; do echo \"cpu `cat /sys/devices/system/cpu/$cpu/topology/physical_package_id`\"; done | grep cpu | uniq | wc -l
cat: /sys/devices/system/cpu/cpu0/topology/physical_package_id: No such file or directory
cat: /sys/devices/system/cpu/cpu1/topology/physical_package_id: No such file or directory
1

^^^ Note there is no physical_package_id file on this hardware, therefore subscription-manager does not know how to count the sockets...

Let's verify that subscription-manager-0.95.5.18-1.el5 (built before the fix for this bug) will NOT report a system fact for cpu_socket  

[root@dell-pe1650-02 ~]# rpm -q subscription-manager
subscription-manager-0.95.5.18-1.el5
[root@dell-pe1650-02 ~]# subscription-manager facts --list | grep cpu
[root@dell-pe1650-02 ~]# 

^^^ Recreated the problem: no cpu facts are known to subscription-manager

Now, let's install subscription-manager with this bug fix...

[root@dell-pe1650-02 ~]# rpm -q subscription-manager
subscription-manager-0.95.5.19-1.git.2.2a89d48.el5
[root@dell-pe1650-02 ~]# subscription-manager facts --list | grep cpu
cpu.core(s)_per_socket: 2
cpu.cpu(s): 2
cpu.cpu_socket(s): 1
[root@dell-pe1650-02 ~]# 

^^^ Verified that subscription-manager provides a sockets value of 1 in the case where topology/physical_package_id does not exist at all (xen dom0)

Comment 7 errata-xmlrpc 2011-07-21 08:45:26 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-1078.html

Comment 8 errata-xmlrpc 2011-07-21 12:30:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-1078.html

