Bug 451164
Summary: | Firmware error with MT25204 Infiniband HCAs | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Gurhan Ozen <gozen> | |
Component: | openib | Assignee: | Doug Ledford <dledford> | |
Status: | CLOSED ERRATA | QA Contact: | ||
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.7 | CC: | ddomingo, jburke, mgahagan, peterm, riek, rlerch | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | All | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Hardware testing for the Mellanox MT25204 has revealed that an internal error
occurs under certain high-load conditions. When the ib_mthca driver reports a
catastrophic error on this hardware, it is usually related to an insufficient
completion queue depth relative to the number of outstanding work requests
generated by the user application.
Although the driver will reset the hardware and recover from such an event, all
existing connections at the time of the error will be lost. This generally
results in a segmentation fault in the user application. Further, if opensm is
running at the time the error occurs, then you need to manually restart it in
order to resume proper operation.
|
Story Points: | --- | |
Clone Of: | ||||
: | 488813 509904 (view as bug list) | Environment: | ||
Last Closed: | 2009-05-18 20:35:27 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 251934 | |||
Bug Blocks: | 458752, 488813, 509904 |
Comment 1
Don Domingo
2008-06-13 12:19:18 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release. Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Hardware testing for the Mellanox MT25204 has revealed that an internal error occurs under certain high-load conditions. When the ib_mthca driver reports a catastrophic error on this hardware, it is usually related to an insufficient completion queue depth relative to the number of outstanding work requests generated by the user application. Although the driver will reset the hardware and recover from such an event, all existing connections at the time of the error will be lost. This generally results in a segmentation fault in the user application. Further, if opensm is running at the time the error occurs, then you need to manually restart it in order to resume proper operation. Dev ACK for release note. An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-1022.html |