Bug 1126333 - Very slow response from "ipmitool sdr"
Summary: Very slow response from "ipmitool sdr"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: ipmitool
Version: 6.5
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: rc
: ---
Assignee: Ales Ledvinka
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks: 1075802
TreeView+ depends on / blocked
 
Reported: 2014-08-04 08:18 UTC by Seiji Nishikawa
Modified: 2019-07-11 08:06 UTC (History)
4 users (show)

Fixed In Version: ipmitool-1.8.11-28.el6
Doc Type: Bug Fix
Doc Text:
Previously, ipmitool could not the recognize Sensor Data Repository (SDR) type Software ID. As a consequence, the default 5-second timeout for each affected line reported combined into a very slow response. This update fixes the IPMB request setup, and slow SDR access times are no longer experienced in the described situation.
Clone Of:
Environment:
Last Closed: 2015-07-22 06:59:19 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:1351 normal SHIPPED_LIVE ipmitool bug fix update 2015-07-20 17:59:41 UTC

Description Seiji Nishikawa 2014-08-04 08:18:14 UTC
Description of problem:

Very slow response from "ipmitool sdr" observed in RHEL6.5.

[Attached: ipmitoolsdr_slow.out.10917]
----------------------------------------------------------------
09:24:37.164603 execve("/usr/bin/ipmitool", ["ipmitool", "sdr"], ["HOSTNAME=sas-cfg02-c08.ocio.monash.edu.au", "SHELL=/bin/bash", "TERM=xterm", "HISTSIZE=1000", "QTDIR=/usr/lib64/qt-3.3", "QTINC=/usr/lib64/qt-3.3/include", "USER=root", "LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:", "MAIL=/var/spool/mail/root", "PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin::/srv/home/strack/bin", "PWD=/srv/home/strack", "LANG=en_US.UTF-8", "HISTCONTROL=ignoredups", "SHLVL=1", "HOME=/srv/home/strack", "LOGNAME=root", "QTLIB=/usr/lib64/qt-3.3/lib", "CVS_RSH=ssh", "LESSOPEN=|/usr/bin/lesspipe.sh %s", "G_BROKEN_FILENAMES=1", "_=/usr/bin/strace"]) = 0 <0.000323>
^^^^^^^^^^^^^^^ start time

09:24:37.427176 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.646032>
09:24:42.073289 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000023>
09:24:42.073369 write(1, "ECC Corr Err     | Not Readable      | ns\n", 42) = 42 <0.000026>
^^^^^^^^^^^^^^ the first jump in wall time (5 secs)

09:24:42.077387 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.995815>
09:24:47.073350 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000154>
09:24:47.087068 write(1, "ECC Uncorr Err   | Not Readable      | ns\n", 42) = 42 <0.000164>
^^^^^^^^^^^^^^^ the second jump in wall time (5 secs)

09:24:47.091677 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.981526>
09:24:52.073352 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000111>
09:24:52.087125 write(1, "I/O Channel Chk  | Not Readable      | ns\n", 42) = 42 <0.000088>
^^^^^^^^^^^^^^^ the third jump in wall time (5 secs)

09:24:52.091084 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.982051>
09:24:57.073322 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000116>
09:24:57.087149 write(1, "PCI Parity Err   | Not Readable      | ns\n", 42) = 42 <0.000116>
^^^^^^^^^^^^^^^ the fourth jump in wall time (5 secs)

09:24:57.090960 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.982252>
09:25:02.073349 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000165>
09:25:02.087362 write(1, "PCI System Err   | Not Readable      | ns\n", 42) = 42 <0.000089>
^^^^^^^^^^^^^^^ the fifth jump in wall time (5 secs)

09:25:02.091292 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.981856>
09:25:07.073327 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000131>
09:25:07.087428 write(1, "SBE Log Disabled | Not Readable      | ns\n", 42) = 42 <0.000125>
^^^^^^^^^^^^^^^ the sixth jump in wall time (5 secs)

09:27:07.093377 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.980744>
09:27:12.074328 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000122>
09:27:12.095183 write(1, "Mem Fatal NB CRC | Not Readable      | ns\n", 42) = 42 <0.000112>
^^^^^^^^^^^^^^^ the seventh jump in wall time (5 secs)

09:27:12.099474 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.974729>
09:27:17.074407 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000117>
09:27:17.095260 write(1, "NFatal PCIE Err  | Not Readable      | ns\n", 42) = 42 <0.000128>
^^^^^^^^^^^^^^^ the eighth jump in wall time (5 secs)

09:27:17.099026 select(4, [3], NULL, NULL, NULL) = 1 (in [3]) <4.975145>
09:27:22.074390 ioctl(3, I2OEVTGET or IPMICTL_RECEIVE_MSG_TRUNC, 0x7ffff9677320) = 0 <0.000124>
09:27:22.093547 write(1, "Fatal I/O Error  | Not Readable      | ns\n", 42) = 42 <0.000126>
^^^^^^^^^^^^^^^ the ninth jump in wall time (5 secs)
----------------------------------------------------------------

This continues with 5 seconds of wall time accumulating for each of these syscalls. 10 selects that are 5 seconds, so 9 * 5 = 50 seconds of selects.

The IPMItool utility tried to retrieve these SDR records from the IPMI bus instead of the BMC bus. Due to timeout setting for the SDR read attempt, serious performance issues occurred and no sensor data was shown. 

The poor performance comes from IPMITool trying to bridge the GET_SENSOR_READING command over IPMB for SDR entries that have “System Software” (instead of “IPMB Address”) defined as their “Sensor Owner ID”. Because IPMITool does not distinguish sensor owner with system software ID from sensor owner with IPMB address and always bridges the command over IPMB, the target with system software ID defined does not (and should not) return a sensor reading and therefore, based on OpenIPMI driver’s default behavior, there will be a 5 seconds delay for such sensor. Kernel driver handling ipmi request tries to query the information 5 times (first try and 4 reties). 

The timeout of each query is 1 second. So, it takes 5 seconds (1 second x 5 times) until replying timeout from kernel.

The similar issue has been tracked by the bug 671059 (https://bugzilla.redhat.com/show_bug.cgi?id=671059).


Version-Release number of selected component (if applicable):
2.6.32-431.20.3.el6.x86_64
ipmitool-1.8.11-16.el6.x86_64
Dell PowerEdge M710HD

How reproducible:
Always

Steps to Reproduce:
1. Run "ipmi sdr"
2.
3.

Actual results:
The IPMItool utility tries to retrieve these SDR records from the IPMI bus instead of the BMC bus. Due to timeout setting for the SDR read attempt, serious performance issues occurred and no sensor data is shown.

Expected results:
IPMItool should correctly reads these SDR records from the BMC and shows the correct sensor data.

Additional info:
----------------------------------------------------------------
struct ipmi_rs *
ipmi_sdr_get_sensor_reading_ipmb(struct ipmi_intf *intf, uint8_t sensor, 
                                 uint8_t target, uint8_t lun)
{
        struct ipmi_rq req;
        struct ipmi_rs *rsp;
        uint8_t save_addr;

-//      if ((strncmp(intf->name, "ipmb", 4)) != 0) 
-//              return ipmi_sdr_get_sensor_reading(intf, sensor);
+       if ((strncmp(intf->name, "ipmb", 4)) != 0)
+               return ipmi_sdr_get_sensor_reading(intf, sensor);

        save_addr = intf->target_addr;
        intf->target_addr = target; 

----------------------------------------------------------------

Comment 6 errata-xmlrpc 2015-07-22 06:59:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1351.html


Note You need to log in before you can comment on or make changes to this bug.