Bug 153958
Summary: | OpenIPMI kernel drivers in 2.6.9 RHEL4 crash after being loaded overnight | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Shawn Starr <sstarr> |
Component: | kernel | Assignee: | Dave Jones <davej> |
Status: | CLOSED UPSTREAM | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 4.0 | CC: | jbaron, minyard, pfrields, riel, wwlinuxengineering |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | ia32e | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-06-03 22:01:49 UTC | Type: | --- |
Description
Shawn Starr
2005-04-06 02:35:58 UTC
lspci info:

00:00.0 Host bridge: Intel Corp. E7520 Memory Controller Hub (rev 09)
00:02.0 PCI bridge: Intel Corp. E7525/E7520/E7320 PCI Express Port A (rev 09)
00:04.0 PCI bridge: Intel Corp. E7525/E7520 PCI Express Port B (rev 09)
00:05.0 PCI bridge: Intel Corp. E7520 PCI Express Port B1 (rev 09)
00:06.0 PCI bridge: Intel Corp. E7520 PCI Express Port C (rev 09)
00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
01:00.0 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge A (rev 09)
01:00.2 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge B (rev 09)
02:0b.0 Network controller: MYRICOM Inc. Myrinet 2000 Scalable Cluster Interconnect (rev 04)
03:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)
03:0c.0 PCI bridge: Mellanox Technologies MT23108 PCI Bridge (rev a1)
04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)
06:00.0 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge A (rev 09)
06:00.2 PCI bridge: Intel Corp. 6700PXH PCI Express-to-PCI Bridge B (rev 09)
07:07.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller (rev 05)
08:08.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller (rev 05)
0a:05.0 Class ff00: Dell Remote Access Card 4 Daughter Card
0a:05.1 Class ff00: Dell Remote Access Card 4 Daughter Card Virtual UART
0a:05.2 Class ff00: Dell Remote Access Card 4 Daughter Card SMIC interface
0a:06.0 IDE interface: Silicon Image, Inc. (formerly CMD Technology Inc) PCI0680 Ultra ATA-133 Host Controller (rev 02)
0a:0d.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]

The kernel will hang if the ipmi_si kernel driver is loaded.

The hang on insmod of ipmi_si happens when a Dell DRAC4 card is present in the system. Dell Engineering is investigating this.

It appears to be coming from the wait_event() call in ipmi_si.c:ipmi_register_smi():

    /* Wait for the channel info to be read. */
    up_read(&interfaces_sem);
    wait_event((*intf)->waitq,
               ((*intf)->curr_channel>=IPMI_MAX_CHANNELS));
    down_read(&interfaces_sem);

which never gets the completion. The 2.4.x kernel openipmi driver v35 and the 2.6.x kernel driver v33 fail similarly. I haven't yet been able to try with 2.6.12-rc4-mm2 + patches as posted to LKML, but suspect similar behavior would occur.

I should clarify this in the bug. The hang occurs in two ways:

1) Sometimes, after rebooting the machine, trying to load the ipmi_si driver just hangs in insmod.
2) Sometimes the driver loads successfully and works for a period of time, with the DRAC4 card visible, but later rmmod hangs and openipmi hangs trying to communicate with the BMC.

It's not always going to hang when loading the driver with insmod.

s/openipmi/ipmitool userland tools.

In my lab, tests on a PE2800 with the RHEL3 U5 kernel and the RHEL4 U1 beta kernel, with and without the DRAC4/i (small add-in daughtercard), insmod without problems. I'll test with a PE1850 next.

I believe this to be a bug in the BMC firmware, which has been corrected in an internal build and will be released to users in August. Individual customers needing the fixed firmware before general release must call Dell tech support and ask the technician to "escalate to Engineering" to obtain the BMC firmware that addresses the stuck attention bit problem. Customers will be required to sign a Dell beta-code NDA and must have support from Dell.

As this is not a Red Hat kernel bug, I am going to close this issue.