Bug 620188 - mlx4_en driver misfunctions in rh5.5 Xen flavor
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.5
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 5.6
Assigned To: Xen Maintainance List
QA Contact: Red Hat Kernel QE team
Blocks: 557597 564513 Mellanox5.6FT
Reported: 2010-08-01 09:20 EDT by Yevgeny Petrilin
Modified: 2010-11-24 09:19 EST

Doc Type: Bug Fix
Last Closed: 2010-11-16 08:57:56 EST
Description Yevgeny Petrilin 2010-08-01 09:20:36 EDT
Hello,
A patch that was submitted to the RH5.5 kernel and accepted into the default flavor is missing from the Xen flavor. As a result, the mlx4 driver malfunctions in the Xen flavor of RH5.5.

This patch was added at:
https://bugzilla.redhat.com/show_bug.cgi?id=534158#c10

Can this fix be added to the errata of RH5.5?

Thanks, Yevgeny
Comment 1 Andrew Jones 2010-08-09 10:48:48 EDT
What makes you think that the lack of this patch is causing the problems in Xen?
The reason I ask is that there's only one driver for both variants ("flavors"). Do you have pci_pt_e820_access=on on dom0's kernel command line? Here's an example config in grub.conf:

title Red Hat Enterprise Linux Server (2.6.18-181.el5xen)
        root (hd2,0)
        kernel /boot/xen.gz-2.6.18-181.el5 iommu=1 console=com1 com1=9600,8n1
        module /boot/vmlinuz-2.6.18-181.el5xen ro root=LABEL=/ console=ttyS0,9600n8 pci_pt_e820_access=on
        module /boot/initrd-2.6.18-181.el5xen.img

Note the "pci_pt_e820_access=on" on the first module line (dom0's kernel command line). That's needed for SR-IOV to work with Xen. Note also the iommu=1 on Xen's command line. That's needed if you want to pass a PCI device (or VF) through to an HVM guest.
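
As an illustration of the check described above, here is a minimal Python sketch that tests whether a kernel command line contains a given parameter. The function names has_kernel_param and read_cmdline are hypothetical helpers made up for this example and are not part of any Red Hat or Xen tool. It assumes that /proc/cmdline in dom0 reflects dom0's kernel command line; the hypervisor's own options (such as iommu=1) live on the xen.gz line and would not be visible there.

```python
from typing import Optional


def has_kernel_param(cmdline: str, name: str, value: Optional[str] = None) -> bool:
    """Return True if `name` (or `name=value` when value is given) is a whole token."""
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name and (value is None or val == value):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# The dom0 kernel arguments from the grub.conf example in this comment.
example = "ro root=LABEL=/ console=ttyS0,9600n8 pci_pt_e820_access=on"
found = has_kernel_param(example, "pci_pt_e820_access", "on")
```

On a running dom0, has_kernel_param(read_cmdline(), "pci_pt_e820_access", "on") would perform the same check against the booted kernel.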

Drew
Comment 2 Andrew Jones 2010-08-16 03:14:04 EDT
From Yevgeny:

Hello Drew,

I could not update the bugzilla for some reason, but here is my comment:

The errors happened without enabling SR-IOV or IOMMU; we tested the driver in single-function mode, without pass-through.
I can now see from your kernel that, for Xen, the kernel version used is 2.6.18-181; the version in which my fix was accepted is 2.6.18-191.
This can explain why the fix is not included in the Xen flavor.

Thanks
Yevgeny Petrilin
Comment 3 Andrew Jones 2010-08-16 03:17:26 EDT
Yevgeny,

You can use -191 and all the same release numbers that bare-metal has for xen as well. For every kernel release there is also a xen version. If the patch you need is in -191, then it sounds like just using RHEL 5.5 (-194) should solve your problems.

Please try that, and let me know if it works.

Thanks,
Drew
Comment 4 Yevgeny Petrilin 2010-08-26 11:29:50 EDT
We tried it with the GA RH5.5, so the kernel version should be 2.6.18-194.

Yevgeny
Comment 5 Sandy Garza 2010-09-15 17:57:19 EDT
Will this be included in a 5.5.z or will we have to wait for RHEL 5.6?
Comment 6 Andrew Jones 2010-09-16 09:39:28 EDT
I'm still confused as to why there appears to be a problem here. From comment 2 it looked like the original testing was done with -181, which would explain the absence of the patch, since the patch was integrated into -191. However, if testing was done with -194, then that indicates a new issue, as the patch is already there. Please note that both xen and bare-metal have the same module compiled, so it wasn't missed from the xen flavor.
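
The version reasoning above can be sketched in a few lines of Python. This is only an illustration: the helper names build_number and includes_fix are hypothetical, and the threshold of 191 is taken from this thread rather than from any package metadata. It assumes release strings of the form shown in this bug, such as 2.6.18-181.el5xen.

```python
import re

FIX_BUILD = 191  # build in which, per this thread, the patch was integrated


def build_number(release: str) -> int:
    """Extract the build number from a release string like '2.6.18-194.el5xen'."""
    match = re.match(r"^\d+\.\d+\.\d+-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1))


def includes_fix(release: str, fix_build: int = FIX_BUILD) -> bool:
    """Return True if the release is at or after the build that carries the fix."""
    return build_number(release) >= fix_build


# The two kernels discussed in this bug.
results = {r: includes_fix(r) for r in ("2.6.18-181.el5xen", "2.6.18-194.el5xen")}
```

On a live system, the string returned by platform.release() could be passed to includes_fix in place of the hard-coded examples.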
Comment 7 Sandy Garza 2010-10-07 10:43:02 EDT
Mellanox, please provide an update.
Comment 10 Larry Troan 2010-11-16 08:57:56 EST
Per Yevgeny Petrilin at Mellanox:
> It has passed on RH5.5 GA, I thought I have updated the bugzilla, will
> check again.

Closing bug as CURRENTRELEASE (5.5).
