Bug 967871

Summary: net-snmp does not display correct lm_sensors sensor data / missing CPU cores
Product: Red Hat Enterprise Linux 6
Reporter: tunderhay
Component: net-snmp
Assignee: Jan Safranek <jsafrane>
Status: CLOSED ERRATA
QA Contact: Dalibor Pospíšil <dapospis>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.4
CC: dapospis, mxxcon
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: snmpd expected the sensor names reported by the lm_sensors library to be unique. However, on multi-socket systems with Xeon family CPUs, the thermal sensors on every CPU have the same names.
Consequence: snmpd reported the temperature of only one CPU.
Fix: snmpd adds a prefix to every sensor whose name is already used by another sensor.
Result: snmpd reports the temperature of all CPUs.

For example, on a 2-socket machine with two-core CPUs, the old snmpd reported just two thermal sensors (from the first CPU) in LM-SENSORS-MIB::lmTempSensorsTable:

lmTempSensorsDevice.2 = STRING: Core 0
lmTempSensorsDevice.3 = STRING: Core 1

With this update, all four thermal sensors are reported. Notice the prefix on the second set of sensor names:

lmTempSensorsDevice.2 = STRING: Core 0
lmTempSensorsDevice.3 = STRING: Core 1
lmTempSensorsDevice.6 = STRING: coretemp-isa-0004:Core 0
lmTempSensorsDevice.7 = STRING: coretemp-isa-0004:Core 1

The first set of sensors is kept without a prefix for compatibility with old applications, which may expect a sensor named 'Core 0'.
Story Points: ---
Clone Of:
: 1252053 (view as bug list)
Environment:
Last Closed: 2015-07-22 07:22:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
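
The naming scheme described in the Doc Text above can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual net-snmp patch (snmpd is written in C). The function name `disambiguate` and the (chip, label) input format are assumptions made for this example.

```python
def disambiguate(sensors):
    """Return reported names for (chip, label) pairs given in enumeration order.

    The first sensor to use a label keeps the bare label; any later sensor
    with the same label is reported as "<chip>:<label>". (Hypothetical helper,
    modelled on the behaviour described in the Doc Text.)
    """
    seen = set()
    names = []
    for chip, label in sensors:
        name = label if label not in seen else f"{chip}:{label}"
        seen.add(name)
        names.append(name)
    return names


if __name__ == "__main__":
    # Two sockets, two cores each, as in the Doc Text example.
    example = [
        ("coretemp-isa-0000", "Core 0"),
        ("coretemp-isa-0000", "Core 1"),
        ("coretemp-isa-0004", "Core 0"),
        ("coretemp-isa-0004", "Core 1"),
    ]
    for name in disambiguate(example):
        print(name)
```

Keeping the first occurrence unprefixed mirrors the compatibility note in the Doc Text: tools that already look for 'Core 0' keep working.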

Description tunderhay 2013-05-28 12:58:11 UTC
Description of problem:  Using Dell R620 hardware with dual Intel Xeon E5-2670 eight-core CPUs, net-snmp fails to show data for all CPU cores detected by lm_sensors.  Only a subset is displayed.


Version-Release number of selected component (if applicable): net-snmp-5.5-44.el6_4.1.x86_64


How reproducible:  Every Dell R620 running EL6 (6.4) with net-snmp exhibits this behavior.


Steps to Reproduce:

Hardware: Dell R620 with dual Intel Xeon E5-2670 eight-core CPUs, 2.6GHz

OS: CentOS 6.4 64-bit

Packages:
# rpm -qa | grep -i snmp
net-snmp-5.5-44.el6_4.1.x86_64
net-snmp-utils-5.5-44.el6_4.1.x86_64
net-snmp-libs-5.5-44.el6_4.1.x86_64
# rpm -qa | grep -i lm_sensors
lm_sensors-libs-3.1.1-17.el6.x86_64
lm_sensors-3.1.1-17.el6.x86_64

1. Run 'sensors' and see correct output for all 16 cores:

# sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +70.0°C  (high = +90.0°C, crit = +100.0°C)
Core 0:        +69.0°C  (high = +90.0°C, crit = +100.0°C)
Core 1:        +65.0°C  (high = +90.0°C, crit = +100.0°C)
Core 2:        +67.0°C  (high = +90.0°C, crit = +100.0°C)
Core 3:        +65.0°C  (high = +90.0°C, crit = +100.0°C)
Core 4:        +63.0°C  (high = +90.0°C, crit = +100.0°C)
Core 5:        +63.0°C  (high = +90.0°C, crit = +100.0°C)
Core 6:        +65.0°C  (high = +90.0°C, crit = +100.0°C)
Core 7:        +64.0°C  (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Physical id 1: +59.0°C  (high = +90.0°C, crit = +100.0°C)
Core 0:        +59.0°C  (high = +90.0°C, crit = +100.0°C)
Core 1:        +60.0°C  (high = +90.0°C, crit = +100.0°C)
Core 2:        +58.0°C  (high = +90.0°C, crit = +100.0°C)
Core 3:        +58.0°C  (high = +90.0°C, crit = +100.0°C)
Core 4:        +58.0°C  (high = +90.0°C, crit = +100.0°C)
Core 5:        +57.0°C  (high = +90.0°C, crit = +100.0°C)
Core 6:        +57.0°C  (high = +90.0°C, crit = +100.0°C)
Core 7:        +57.0°C  (high = +90.0°C, crit = +100.0°C)


2. Run snmpwalk on lmSensors and notice that only 8 of the 16 cores are displayed:

# snmpwalk -v2c -c mystring 127.0.0.1 lmSensors


Actual results:
# snmpwalk -v2c -c mystring 127.0.0.1 lmSensors
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsIndex.2 = INTEGER: 2
LM-SENSORS-MIB::lmTempSensorsIndex.3 = INTEGER: 3
LM-SENSORS-MIB::lmTempSensorsIndex.4 = INTEGER: 4
LM-SENSORS-MIB::lmTempSensorsIndex.5 = INTEGER: 5
LM-SENSORS-MIB::lmTempSensorsIndex.6 = INTEGER: 6
LM-SENSORS-MIB::lmTempSensorsIndex.7 = INTEGER: 7
LM-SENSORS-MIB::lmTempSensorsIndex.8 = INTEGER: 8
LM-SENSORS-MIB::lmTempSensorsIndex.9 = INTEGER: 9
LM-SENSORS-MIB::lmTempSensorsIndex.10 = INTEGER: 10
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: Physical id 0
LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
LM-SENSORS-MIB::lmTempSensorsDevice.3 = STRING: Core 1
LM-SENSORS-MIB::lmTempSensorsDevice.4 = STRING: Core 2
LM-SENSORS-MIB::lmTempSensorsDevice.5 = STRING: Core 3
LM-SENSORS-MIB::lmTempSensorsDevice.6 = STRING: Core 4
LM-SENSORS-MIB::lmTempSensorsDevice.7 = STRING: Core 5
LM-SENSORS-MIB::lmTempSensorsDevice.8 = STRING: Core 6
LM-SENSORS-MIB::lmTempSensorsDevice.9 = STRING: Core 7
LM-SENSORS-MIB::lmTempSensorsDevice.10 = STRING: Physical id 1
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 70000
LM-SENSORS-MIB::lmTempSensorsValue.2 = Gauge32: 64000
LM-SENSORS-MIB::lmTempSensorsValue.3 = Gauge32: 65000
LM-SENSORS-MIB::lmTempSensorsValue.4 = Gauge32: 62000
LM-SENSORS-MIB::lmTempSensorsValue.5 = Gauge32: 63000
LM-SENSORS-MIB::lmTempSensorsValue.6 = Gauge32: 61000
LM-SENSORS-MIB::lmTempSensorsValue.7 = Gauge32: 68000
LM-SENSORS-MIB::lmTempSensorsValue.8 = Gauge32: 63000
LM-SENSORS-MIB::lmTempSensorsValue.9 = Gauge32: 61000
LM-SENSORS-MIB::lmTempSensorsValue.10 = Gauge32: 69000


Expected results:  snmpwalk output should match the 'sensors' output and list all 16 CPU cores. Instead, it displays only 8 of them.
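
One way to check the core count mechanically is to parse saved snmpwalk output. The following is a minimal Python sketch, assuming the output format shown under Actual results; the helper names `device_names` and `core_count` are hypothetical.

```python
import re

# Matches lines such as:
#   LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
DEVICE_RE = re.compile(r"lmTempSensorsDevice\.(\d+) = STRING: (.*)$")


def device_names(snmpwalk_text):
    """Map table index -> sensor name for every lmTempSensorsDevice row."""
    names = {}
    for line in snmpwalk_text.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            names[int(match.group(1))] = match.group(2).strip()
    return names


def core_count(names):
    """Count rows whose name ends in 'Core <n>', with or without a chip prefix."""
    return sum(1 for name in names.values() if re.search(r"Core \d+$", name))


if __name__ == "__main__":
    sample = """\
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: Physical id 0
LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
LM-SENSORS-MIB::lmTempSensorsDevice.3 = STRING: Core 1
LM-SENSORS-MIB::lmTempSensorsDevice.4 = STRING: Physical id 1
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 70000
"""
    print(core_count(device_names(sample)))
```

Run against the full output above, a count lower than the number of cores listed by 'sensors' indicates the problem described here.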


Additional info:  Because 'sensors' displays the correct output, I believe the problem lies in net-snmp rather than in lm_sensors itself.

Here's raw 'sensors' output:

# sensors -u
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:
 temp1_input: 69.00
 temp1_max: 90.00
 temp1_crit: 100.00
 temp1_crit_alarm: 0.00
Core 0:
 temp2_input: 69.00
 temp2_max: 90.00
 temp2_crit: 100.00
 temp2_crit_alarm: 0.00
Core 1:
 temp3_input: 61.00
 temp3_max: 90.00
 temp3_crit: 100.00
 temp3_crit_alarm: 0.00
Core 2:
 temp4_input: 67.00
 temp4_max: 90.00
 temp4_crit: 100.00
 temp4_crit_alarm: 0.00
Core 3:
 temp5_input: 67.00
 temp5_max: 90.00
 temp5_crit: 100.00
 temp5_crit_alarm: 0.00
Core 4:
 temp6_input: 62.00
 temp6_max: 90.00
 temp6_crit: 100.00
 temp6_crit_alarm: 0.00
Core 5:
 temp7_input: 63.00
 temp7_max: 90.00
 temp7_crit: 100.00
 temp7_crit_alarm: 0.00
Core 6:
 temp8_input: 64.00
 temp8_max: 90.00
 temp8_crit: 100.00
 temp8_crit_alarm: 0.00
Core 7:
 temp9_input: 64.00
 temp9_max: 90.00
 temp9_crit: 100.00
 temp9_crit_alarm: 0.00

coretemp-isa-0001
Adapter: ISA adapter
Physical id 1:
 temp1_input: 63.00
 temp1_max: 90.00
 temp1_crit: 100.00
 temp1_crit_alarm: 0.00
Core 0:
 temp2_input: 62.00
 temp2_max: 90.00
 temp2_crit: 100.00
 temp2_crit_alarm: 0.00
Core 1:
 temp3_input: 62.00
 temp3_max: 90.00
 temp3_crit: 100.00
 temp3_crit_alarm: 0.00
Core 2:
 temp4_input: 62.00
 temp4_max: 90.00
 temp4_crit: 100.00
 temp4_crit_alarm: 0.00
Core 3:
 temp5_input: 63.00
 temp5_max: 90.00
 temp5_crit: 100.00
 temp5_crit_alarm: 0.00
Core 4:
 temp6_input: 60.00
 temp6_max: 90.00
 temp6_crit: 100.00
 temp6_crit_alarm: 0.00
Core 5:
 temp7_input: 61.00
 temp7_max: 90.00
 temp7_crit: 100.00
 temp7_crit_alarm: 0.00
Core 6:
 temp8_input: 61.00
 temp8_max: 90.00
 temp8_crit: 100.00
 temp8_crit_alarm: 0.00
Core 7:
 temp9_input: 60.00
 temp9_max: 90.00
 temp9_crit: 100.00
 temp9_crit_alarm: 0.00
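
The raw output above shows the same labels ('Core 0' to 'Core 7') under both chips. A short Python sketch for finding such repeated labels in `sensors -u` style text follows. It assumes the layout shown above (chip name, an 'Adapter:' line, unindented label lines ending in a colon, indented value lines); the function names are hypothetical.

```python
from collections import defaultdict


def parse_sensors_u(text):
    """Return (chip, label) pairs from `sensors -u` style text."""
    pairs = []
    chip = None
    for line in text.splitlines():
        if not line.strip():
            chip = None                      # a blank line ends a chip block
            continue
        if line[0].isspace():
            continue                         # indented value line, e.g. " temp2_input: 69.00"
        if chip is None:
            chip = line.strip()              # first line of a block is the chip name
        elif line.startswith("Adapter:"):
            continue                         # adapter description, not a sensor
        elif line.rstrip().endswith(":"):
            pairs.append((chip, line.rstrip()[:-1]))
    return pairs


def colliding_labels(pairs):
    """Map each label used by more than one sensor to the chips that use it."""
    chips_by_label = defaultdict(list)
    for chip, label in pairs:
        chips_by_label[label].append(chip)
    return {label: chips for label, chips in chips_by_label.items() if len(chips) > 1}


if __name__ == "__main__":
    sample = """\
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:
 temp1_input: 69.00
Core 0:
 temp2_input: 69.00

coretemp-isa-0001
Adapter: ISA adapter
Physical id 1:
 temp1_input: 63.00
Core 0:
 temp2_input: 62.00
"""
    print(colliding_labels(parse_sensors_u(sample)))
```

Labels such as 'Physical id 0' and 'Physical id 1' differ per socket and so do not collide, which is consistent with both of them appearing in the snmpwalk output.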

Comment 2 Mxx 2013-07-14 20:21:32 UTC
I see a similar problem on a Dell R720 server with dual Xeon E5-2667 CPUs (6 cores + HT) running the latest stable Oracle Enterprise Linux 6.4 (based on RHEL).

[root@host log]$ rpm -qa|grep snmp
net-snmp-utils-5.5-44.0.1.el6_4.2.x86_64
net-snmp-libs-5.5-44.0.1.el6_4.2.x86_64
net-snmp-5.5-44.0.1.el6_4.2.x86_64

[root@host log]$ rpm -qa|grep sensors
lm_sensors-libs-3.1.1-17.el6.x86_64
lm_sensors-3.1.1-17.el6.x86_64

[root@host log]$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +56.0°C  (high = +96.0°C, crit = +102.0°C)
Core 0:        +55.0°C  (high = +96.0°C, crit = +102.0°C)
Core 1:        +50.0°C  (high = +96.0°C, crit = +102.0°C)
Core 2:        +52.0°C  (high = +96.0°C, crit = +102.0°C)
Core 3:        +55.0°C  (high = +96.0°C, crit = +102.0°C)
Core 4:        +52.0°C  (high = +96.0°C, crit = +102.0°C)
Core 5:        +56.0°C  (high = +96.0°C, crit = +102.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Physical id 1: +43.0°C  (high = +96.0°C, crit = +102.0°C)
Core 0:        +43.0°C  (high = +96.0°C, crit = +102.0°C)
Core 1:        +41.0°C  (high = +96.0°C, crit = +102.0°C)
Core 2:        +42.0°C  (high = +96.0°C, crit = +102.0°C)
Core 3:        +41.0°C  (high = +96.0°C, crit = +102.0°C)
Core 4:        +40.0°C  (high = +96.0°C, crit = +102.0°C)
Core 5:        +41.0°C  (high = +96.0°C, crit = +102.0°C)

My /etc/snmp/snmpd.conf has the following line to allow full access:
view all    included  .1                               80

[root@host log]# snmpwalk -c public -v 2c localhost sensor
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsIndex.2 = INTEGER: 2
LM-SENSORS-MIB::lmTempSensorsIndex.3 = INTEGER: 3
LM-SENSORS-MIB::lmTempSensorsIndex.4 = INTEGER: 4
LM-SENSORS-MIB::lmTempSensorsIndex.5 = INTEGER: 5
LM-SENSORS-MIB::lmTempSensorsIndex.6 = INTEGER: 6
LM-SENSORS-MIB::lmTempSensorsIndex.7 = INTEGER: 7
LM-SENSORS-MIB::lmTempSensorsIndex.8 = INTEGER: 8
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: Physical id 0
LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
LM-SENSORS-MIB::lmTempSensorsDevice.3 = STRING: Core 1
LM-SENSORS-MIB::lmTempSensorsDevice.4 = STRING: Core 2
LM-SENSORS-MIB::lmTempSensorsDevice.5 = STRING: Core 3
LM-SENSORS-MIB::lmTempSensorsDevice.6 = STRING: Core 4
LM-SENSORS-MIB::lmTempSensorsDevice.7 = STRING: Core 5
LM-SENSORS-MIB::lmTempSensorsDevice.8 = STRING: Physical id 1
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 60000
LM-SENSORS-MIB::lmTempSensorsValue.2 = Gauge32: 44000
LM-SENSORS-MIB::lmTempSensorsValue.3 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.4 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.5 = Gauge32: 42000
LM-SENSORS-MIB::lmTempSensorsValue.6 = Gauge32: 41000
LM-SENSORS-MIB::lmTempSensorsValue.7 = Gauge32: 41000
LM-SENSORS-MIB::lmTempSensorsValue.8 = Gauge32: 44000

Comment 3 RHEL Program Management 2013-10-14 03:29:37 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 4 Mxx 2013-10-14 03:36:42 UTC
(In reply to RHEL Product and Program Management from comment #3)
> This request was not resolved in time for the current release.
> Red Hat invites you to ask your support representative to
> propose this request, if still desired, for consideration in
> the next release of Red Hat Enterprise Linux.

I don't have support representatives.
Yes, I still desire to have this bug fixed.
This bug was submitted 5 months ago... That should've been plenty of time to make it into the current release cycle. :(

Comment 5 caseytb 2014-07-26 14:47:42 UTC
I'm having the same issue with snmp not displaying all the lm_sensors data. I have dual Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz CPUs on RHEL 6.4.

The "sensors" command displays:

$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +56.0°C  (high = +80.0°C, crit = +90.0°C)
Core 0:        +54.0°C  (high = +80.0°C, crit = +90.0°C)
Core 1:        +56.0°C  (high = +80.0°C, crit = +90.0°C)
Core 2:        +50.0°C  (high = +80.0°C, crit = +90.0°C)
Core 3:        +50.0°C  (high = +80.0°C, crit = +90.0°C)
Core 4:        +51.0°C  (high = +80.0°C, crit = +90.0°C)
Core 5:        +55.0°C  (high = +80.0°C, crit = +90.0°C)
Core 6:        +52.0°C  (high = +80.0°C, crit = +90.0°C)
Core 7:        +56.0°C  (high = +80.0°C, crit = +90.0°C)

coretemp-isa-0008
Adapter: ISA adapter
Physical id 1: +62.0°C  (high = +80.0°C, crit = +90.0°C)
Core 0:        +58.0°C  (high = +80.0°C, crit = +90.0°C)
Core 1:        +62.0°C  (high = +80.0°C, crit = +90.0°C)
Core 2:        +60.0°C  (high = +80.0°C, crit = +90.0°C)
Core 3:        +58.0°C  (high = +80.0°C, crit = +90.0°C)
Core 4:        +59.0°C  (high = +80.0°C, crit = +90.0°C)
Core 5:        +59.0°C  (high = +80.0°C, crit = +90.0°C)
Core 6:        +60.0°C  (high = +80.0°C, crit = +90.0°C)
Core 7:        +59.0°C  (high = +80.0°C, crit = +90.0°C)

However when using snmpwalk the output is truncated:

$ snmpwalk -v2c -c the_community localhost lmSensors
LM-SENSORS-MIB::lmTempSensorsIndex.1 = INTEGER: 1
LM-SENSORS-MIB::lmTempSensorsIndex.2 = INTEGER: 2
LM-SENSORS-MIB::lmTempSensorsIndex.3 = INTEGER: 3
LM-SENSORS-MIB::lmTempSensorsIndex.4 = INTEGER: 4
LM-SENSORS-MIB::lmTempSensorsIndex.5 = INTEGER: 5
LM-SENSORS-MIB::lmTempSensorsIndex.6 = INTEGER: 6
LM-SENSORS-MIB::lmTempSensorsIndex.7 = INTEGER: 7
LM-SENSORS-MIB::lmTempSensorsIndex.8 = INTEGER: 8
LM-SENSORS-MIB::lmTempSensorsIndex.9 = INTEGER: 9
LM-SENSORS-MIB::lmTempSensorsIndex.10 = INTEGER: 10
LM-SENSORS-MIB::lmTempSensorsDevice.1 = STRING: Physical id 0
LM-SENSORS-MIB::lmTempSensorsDevice.2 = STRING: Core 0
LM-SENSORS-MIB::lmTempSensorsDevice.3 = STRING: Core 1
LM-SENSORS-MIB::lmTempSensorsDevice.4 = STRING: Core 2
LM-SENSORS-MIB::lmTempSensorsDevice.5 = STRING: Core 3
LM-SENSORS-MIB::lmTempSensorsDevice.6 = STRING: Core 4
LM-SENSORS-MIB::lmTempSensorsDevice.7 = STRING: Core 5
LM-SENSORS-MIB::lmTempSensorsDevice.8 = STRING: Core 6
LM-SENSORS-MIB::lmTempSensorsDevice.9 = STRING: Core 7
LM-SENSORS-MIB::lmTempSensorsDevice.10 = STRING: Physical id 1
LM-SENSORS-MIB::lmTempSensorsValue.1 = Gauge32: 56000
LM-SENSORS-MIB::lmTempSensorsValue.2 = Gauge32: 57000
LM-SENSORS-MIB::lmTempSensorsValue.3 = Gauge32: 62000
LM-SENSORS-MIB::lmTempSensorsValue.4 = Gauge32: 60000
LM-SENSORS-MIB::lmTempSensorsValue.5 = Gauge32: 57000
LM-SENSORS-MIB::lmTempSensorsValue.6 = Gauge32: 58000
LM-SENSORS-MIB::lmTempSensorsValue.7 = Gauge32: 59000
LM-SENSORS-MIB::lmTempSensorsValue.8 = Gauge32: 60000
LM-SENSORS-MIB::lmTempSensorsValue.9 = Gauge32: 58000
LM-SENSORS-MIB::lmTempSensorsValue.10 = Gauge32: 62000

It seems the bug has been reported on the net-snmp sourceforge page: http://sourceforge.net/p/net-snmp/bugs/2561/

If this is resolved upstream, it would be good to pull that fix into the next release.

Comment 10 Jan Safranek 2015-02-25 09:11:16 UTC
No response from upstream, pushed to their git: https://sourceforge.net/p/net-snmp/code/ci/e886f5eb9701851ad6948583156bfd59fcb6110f/

Comment 13 errata-xmlrpc 2015-07-22 07:22:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1385.html