Bug 514141

Summary: mlx4_core fails to load on systems with 32 cores
Product: Red Hat Enterprise Linux 5 Reporter: Yevgeny Petrilin <yevgenyp>
Component: kernel Assignee: Doug Ledford <dledford>
Status: CLOSED ERRATA QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 5.4 CC: bzeranski, cward, dhoward, dzickus, hjia, jpirko, jtluka, mgahagan, peterm, tao, vanhoof
Target Milestone: rc Keywords: OtherQA, ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-30 06:51:57 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 499522, 520906    
Attachments:
Description Flags
Map sufficient ICM memory for event queues none

Description Yevgeny Petrilin 2009-07-28 06:42:10 UTC
Description of problem:
The mlx4 driver would fail to load on systems with 32 cores or more.
When the driver attempts to create 33 event queues (one for each core plus one for the command interface), the FW command fails.

How reproducible:
Try to load the mlx4_core driver on a machine with 32 cores.

 
Actual results:
Driver fails to load

Expected results:
The driver should load with 33 Interrupt vectors.

Additional info:
Attached a patch that fixes this issue by allocating more ICM pages for event queues.
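
For illustration only, the arithmetic behind the failure can be sketched as below. The report does not state the host page size or the per-queue context size; the 4096-byte page and 128-byte entry used here are assumptions, and the helper names are hypothetical rather than taken from the driver or the attached patch. Under those assumptions one page holds 32 event queue contexts, so the 33 queues needed on a 32-core machine no longer fit in a single page.

```c
#include <assert.h>

/* Assumed values, NOT stated in this report. */
#define ASSUMED_PAGE_SIZE     4096u  /* host page size in bytes */
#define ASSUMED_EQ_ENTRY_SIZE  128u  /* context bytes per event queue */

/* One event queue per core plus one for the command interface. */
static unsigned int eqs_needed(unsigned int num_cores)
{
    return num_cores + 1;
}

/* Pages required to hold num_eqs context entries, rounded up. */
static unsigned int eq_context_pages(unsigned int num_eqs,
                                     unsigned int entry_size,
                                     unsigned int page_size)
{
    unsigned int per_page;

    assert(entry_size > 0 && page_size >= entry_size);
    per_page = page_size / entry_size;  /* entries that fit in one page */
    return (num_eqs + per_page - 1) / per_page;
}
```

With these assumed sizes, eq_context_pages(eqs_needed(31), ASSUMED_EQ_ENTRY_SIZE, ASSUMED_PAGE_SIZE) fits in one page, while the same call for 32 cores needs a second page, which matches the "32 cores or more" threshold reported above.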

Comment 1 Yevgeny Petrilin 2009-07-28 06:46:57 UTC
Created attachment 355363 [details]
Map sufficient ICM memory for event queues

Comment 8 Don Zickus 2009-09-04 18:45:47 UTC
in kernel-2.6.18-165.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 13 Chris Ward 2009-09-22 08:51:06 UTC
@Yevgeny, we've done our best to reproduce and verify that this issue has been fixed properly. Unfortunately though, we do not have the mlx4 hardware to reproduce the issue properly. We kindly request that you grab the -165 kernel and verify it fixes the issue as originally reported. Thanks!

Comment 16 Yevgeny Petrilin 2009-09-27 06:22:55 UTC
The issue seems to be fixed in this build.
Thanks!

Comment 21 errata-xmlrpc 2010-03-30 06:51:57 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html