Red Hat Bugzilla – Bug 153958
OpenIPMI kernel drivers in 2.6.9 RHEL4 crash after loaded overnight
Last modified: 2015-01-04 17:18:32 EST
Hey Dave, has anyone reported this problem in RHEL4? I'm about to test the
OpenIPMI drivers in a 2.6.12-rc1/rc2 kernel to see if this is
reproducible. I am also testing this in EM64T mode to see if this occurs as well.
Description of problem:
IPMI kernel drivers in RHEL4 crash, cannot unload if left loaded overnight.
Version-Release number of selected component (if applicable): final 2.6.9
kernel in RHEL4: 2.6.9-5.0.3.ELsmp on x86 mode.
Load all the OpenIPMI kernel modules on a Dell PE1850, run some IPMItool
commands, and leave the drivers loaded overnight; rmmod then cannot unload them and hangs.
Steps to Reproduce:
1. modprobe all the OpenIPMI kernel modules
2. Install the IPMItool RPM from ipmitool.sf.net
3. Run some ipmitool commands with -I open to display sensor and other info
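For anyone retrying the steps above, here is a minimal sketch that only builds the command lines for steps 1 and 3 and prints them; it runs nothing itself. The module names are an assumption inferred from the driver banners in the log (message handler, system interface, device interface, poweroff, watchdog), and the three ipmitool subcommands are examples only, since the report just says "some ipmitool commands". Step 2 (installing the RPM) is left out.

```python
import shlex

# Assumed module names, inferred from the driver banners in the kernel log.
IPMI_MODULES = [
    "ipmi_msghandler",
    "ipmi_si",
    "ipmi_devintf",
    "ipmi_poweroff",
    "ipmi_watchdog",
]

# Example queries only; the report does not say which commands were run.
IPMITOOL_QUERIES = [["sensor"], ["sdr"], ["chassis", "status"]]


def reproduction_commands():
    """Return the commands for steps 1 and 3 as argument lists."""
    commands = [["modprobe", module] for module in IPMI_MODULES]
    commands += [["ipmitool", "-I", "open", *query] for query in IPMITOOL_QUERIES]
    return commands


def render(commands):
    """Format argument lists as shell lines that can be pasted into a root shell."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


if __name__ == "__main__":
    # Print only; nothing is executed here.
    print("\n".join(render(reproduction_commands())))
```

The printed lines are meant to be run by hand as root, after which the box is left alone overnight before trying rmmod.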
Actual results:
IPMI runs as expected, but the kernel drivers are unstable after long periods of being loaded.
Expected results:
IPMI runs as expected, and the kernel modules remain stable after long periods of being loaded.
This was done on a Dell PE1850 in x86 mode, not EM64T mode.
ipmi message handler version v33
IPMI System Interface driver version v33, KCS version v33, SMIC version v33, BT
ipmi_si: Found SMBIOS-specified state machine at I/O address 0xca8
IPMI kcs interface initialized
ipmi device interface version v33
Copyright (C) 2004 MontaVista Software - IPMI Powerdown via sys_reboot version
IPMI poweroff: Found a chassis style poweroff function
IPMI Watchdog: driver version v33
I have confirmed with Dell Engineering that the BMC firmware is fine.
00:00.0 Host bridge: Intel Corp. E7520 Memory Controller Hub (rev 09)
00:02.0 PCI bridge: Intel Corp. E7525/E7520/E7320 PCI Express Port A (rev 09)
00:04.0 PCI bridge: Intel Corp. E7525/E7520 PCI Express Port B (rev 09)
00:05.0 PCI bridge: Intel Corp. E7520 PCI Express Port B1 (rev 09)
00:06.0 PCI bridge: Intel Corp. E7520 PCI Express Port C (rev 09)
00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge
00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) IDE Controller (rev
01:00.0 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge A (rev 09)
01:00.2 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge B (rev 09)
02:0b.0 Network controller: MYRICOM Inc. Myrinet 2000 Scalable Cluster Interconnect (rev 04)
03:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
03:0c.0 PCI bridge: Mellanox Technologies MT23108 PCI Bridge (rev a1)
04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)
06:00.0 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge A (rev 09)
06:00.2 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge B (rev 09)
07:07.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
08:08.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
0a:05.0 Class ff00: Dell Remote Access Card 4 Daughter Card
0a:05.1 Class ff00: Dell Remote Access Card 4 Daughter Card Virtual UART
0a:05.2 Class ff00: Dell Remote Access Card 4 Daughter Card SMIC interface
0a:06.0 IDE interface: Silicon Image, Inc. (formerly CMD Technology Inc) PCI0680 Ultra ATA-133 Host Controller (rev 02)
0a:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon
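The listing above shows the DRAC4 as three Class ff00 functions at 0a:05.x. When comparing machines, a quick way to check for the card is to scan saved lspci output for that device name. This is a rough sketch; the function name and the sample text are illustrative, and the match string is taken from the listing above.

```python
import re

# Device name as it appears in the lspci listing in this report.
DRAC4_PATTERN = re.compile(r"Dell Remote Access Card 4", re.IGNORECASE)


def find_drac4_functions(lspci_text):
    """Return (slot, description) pairs for every line that names a DRAC4 device."""
    matches = []
    for line in lspci_text.splitlines():
        if DRAC4_PATTERN.search(line):
            # The slot (bus:device.function) is the first whitespace-delimited field.
            slot, _, description = line.partition(" ")
            matches.append((slot, description.strip()))
    return matches


SAMPLE = """\
08:08.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller
0a:05.0 Class ff00: Dell Remote Access Card 4 Daughter Card
0a:05.1 Class ff00: Dell Remote Access Card 4 Daughter Card Virtual UART
0a:05.2 Class ff00: Dell Remote Access Card 4 Daughter Card SMIC interface
"""

if __name__ == "__main__":
    for slot, description in find_drac4_functions(SAMPLE):
        print(slot, description)
```

An empty result only means the name was not found in the text supplied; it says nothing about the BMC itself.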
The kernel will hang if the ipmi_si kernel driver is loaded.
The hang on insmod of ipmi_si happens when a Dell DRAC4 card is present in the
system. Dell Engineering is investigating this. It appears to be coming from
the wait_event() call in ipmi_si.c:ipmi_register_smi().
/* Wait for the channel info to be read. */
which never gets the completion.
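To illustrate why a wait that never gets its completion shows up as a hung insmod, here is a userland analogy in Python. It is not the driver code, and the function and event names are hypothetical. A waiter with no timeout blocks for as long as the other side stays silent, whereas a bounded wait returns and lets the caller report the failure. Current kernels have comparable timeout variants such as wait_event_timeout; whether that would be an appropriate change for this driver is not something this sketch tries to answer.

```python
import threading


def wait_for_channel_info(done, timeout_seconds):
    """Wait until `done` is set.

    Returns True if the event was set in time and False on timeout,
    instead of blocking forever the way an unbounded wait would.
    """
    return done.wait(timeout=timeout_seconds)


if __name__ == "__main__":
    # Stand-in for a controller that answers shortly after the request.
    responsive = threading.Event()
    threading.Timer(0.05, responsive.set).start()
    print("responsive:", wait_for_channel_info(responsive, 1.0))

    # Stand-in for a controller that never answers; nothing sets this event.
    silent = threading.Event()
    print("silent:", wait_for_channel_info(silent, 0.2))
```

With `done.wait()` and no timeout, the second call would never return, which is the userland equivalent of the behaviour described above.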
The 2.4.x kernel openipmi driver v35, and 2.6.x kernel driver v33 fail
similarly. I haven't yet been able to try with the 2.6.12-rc4-mm2 + patches
as posted to LKML, but suspect similar behavior would occur.
I should clarify this in the bug. The hang occurs in two ways:
1) Sometimes, after rebooting the machine, trying to load the ipmi_si driver
will just hang in insmod.
2) Sometimes, if the driver is successfully loaded, it will work for a period of
time with the DRAC4 card visible, but after a while rmmod will hang
and openipmi will hang trying to communicate with the BMC.
It is not always going to hang when loading the driver with insmod.
s/openipmi/ipmitool userland tools.
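Because both failure modes show up as a command that never returns, a test script can detect them by starting the command and giving it a deadline. The sketch below is an assumption about how one might do that, not something used in this report; the helper name and the 30-second figure in the usage comment are illustrative.

```python
import subprocess


def finishes_within(command, timeout_seconds):
    """Start `command` and report whether it exits within the timeout.

    After a timeout the child is deliberately neither killed nor waited on:
    a process stuck inside the kernel may not respond to signals, so waiting
    for it could block this script as well.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return False
    return True


# Example use as root (not executed here):
#   finishes_within(["insmod", "ipmi_si.ko"], 30)
#   finishes_within(["ipmitool", "-I", "open", "sensor"], 30)
#   finishes_within(["rmmod", "ipmi_si"], 30)
```

A False result only says the command was still running at the deadline; the kernel log is still needed to see where it is stuck.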
In my lab, tests on a PE2800 with the RHEL3 U5 kernel and the RHEL4 U1 beta kernel, with
and without a DRAC4/i (small add-in daughtercard), insmod the driver with no problems.
I'll test with PE1850 next.
I believe this to be a bug in the BMC firmware, which has been corrected in an
internal build, and will be released to users in August. Individual customers
needing the fixed firmware before general release must call Dell tech support,
and ask the technician to "escalate to Engineering" to obtain the BMC firmware
which addresses the stuck attention bit problem. Customers will be required
to sign a Dell beta-code NDA and must have support from Dell.
As this is not a Red Hat kernel bug, I am going to close this issue.