Description of problem:
The mlx4 driver fails to load on systems with 32 or more cores. When the driver attempts to create 33 event queues (one for each core plus one for the command interface), the FW command fails.

How reproducible:
Try to load the mlx4_core driver on a machine with 32 cores.

Actual results:
The driver fails to load.

Expected results:
The driver should load with 33 interrupt vectors.

Additional info:
Attached is a patch that fixes this issue by allocating more ICM pages for event queues.
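The arithmetic behind the description above can be sketched as follows. This is only an illustration, not code from the attached patch: the helper names are hypothetical, and the 128-byte context entry size and 4096-byte page size are assumptions chosen for the example, since the report does not state them. Under those assumptions a single page holds exactly 32 event queue contexts, so the 33rd queue needs a second page.

```c
#include <stddef.h>

/* Hypothetical helper (not from the mlx4 source): event queues required,
 * one per core plus one for the command interface, as in the report. */
static size_t eqs_for_cores(size_t num_cores)
{
    return num_cores + 1;
}

/* Hypothetical helper: host pages needed to hold one context entry per
 * event queue. entry_size and page_size are supplied by the caller; the
 * values used in the usage note are assumptions, not taken from the report. */
static size_t eq_context_pages(size_t num_eqs, size_t entry_size,
                               size_t page_size)
{
    size_t bytes = num_eqs * entry_size;

    /* Round up to a whole number of pages. */
    return (bytes + page_size - 1) / page_size;
}
```

With the assumed sizes, eq_context_pages(eqs_for_cores(31), 128, 4096) fits in one page, while eq_context_pages(eqs_for_cores(32), 128, 4096) needs two, which is consistent with the failure appearing at 32 cores.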
Created attachment 355363 [details] Map sufficient ICM memory for event queues
in kernel-2.6.18-165.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.
@Yevgeny, we've done our best to reproduce and verify that this issue has been fixed properly. Unfortunately though, we do not have the mlx4 hardware to reproduce the issue properly. We kindly request that you grab the -165 kernel and verify it fixes the issue as originally reported. Thanks!
The issue seems to be fixed in this build. Thanks!
An advisory has been issued which should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html