A patch that was submitted to the RH5.5 kernel and accepted into the default flavor is missing from the xen flavor. As a result, the mlx4 driver malfunctions in the Xen flavor of RH5.5.
This patch was added at:
Can this fix be added to the errata of RH5.5?
What makes you think the lack of this patch is what is causing the problems in Xen?
The reason I ask is that there is only one driver for both variants ("flavors"). Do you have pci_pt_e820_access=on on dom0's kernel command line? Here's an example config in grub.conf:
title Red Hat Enterprise Linux Server (2.6.18-181.el5xen)
kernel /boot/xen.gz-2.6.18-181.el5 iommu=1 console=com1 com1=9600,8n1
module /boot/vmlinuz-2.6.18-181.el5xen ro root=LABEL=/ pci_pt_e820_access=on
Note the "pci_pt_e820_access=on" on the first module line (dom0's kernel command line). That's needed for SR-IOV to work with Xen. Note also the iommu=1 on Xen's command line. That's needed if you want to pass a pci device (or VF) through to an HVM guest.
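To confirm that the option actually reached the running dom0 kernel, something like the following sketch could be used. It assumes that /proc/cmdline in dom0 reflects the dom0 kernel command line (the first module line in grub.conf); the function names are illustrative only and are not part of any Red Hat or Xen tool.

```python
def has_kernel_param(cmdline, name, value=None):
    """Return True if `name` (or `name=value`, when value is given)
    appears as a whole token on the given kernel command line."""
    for token in cmdline.split():
        key, _, val = token.partition("=")
        if key == name and (value is None or val == value):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel.
    Assumption: in dom0 this file shows dom0's kernel arguments."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    cmdline = read_cmdline()
    if has_kernel_param(cmdline, "pci_pt_e820_access", "on"):
        print("pci_pt_e820_access=on is set")
    else:
        print("pci_pt_e820_access=on is NOT set")
```

Matching on whole tokens avoids a false positive when the parameter name is only a substring of another argument.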
I could not update the bugzilla for some reason, but here is my comment:
The errors happened without enabling SR-IOV or IOMMU. We tested the driver in single-function mode, without pass-through.
I can now see from your config that for Xen the kernel version used is 2.6.18-181, whereas the version where my fix was accepted is 2.6.18-191.
This can explain why the fix is not included in the Xen flavor.
You can use -191 and all the same release numbers that bare-metal has for xen as well. For every kernel release there is also a xen version. If the patch you need is in -191, then it sounds like just using RHEL 5.5 (-194) should solve your problems.
Please try that, and let me know if it works.
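A quick way to check which build a host is actually running is to extract the build number from the kernel release string and compare it with -191. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that within the RHEL 5 2.6.18 stream a build number of 191 or higher implies the patch is present.

```python
import re


def rhel_build_number(release):
    """Extract the build number from a release string such as
    '2.6.18-181.el5xen' (returns 181)."""
    match = re.match(r"^\d+\.\d+\.\d+-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1))


def includes_build(release, required_build):
    """True if the running build is at least `required_build`.
    Assumption: a later build in the same stream carries the fix."""
    return rhel_build_number(release) >= required_build


if __name__ == "__main__":
    import platform
    release = platform.release()  # same string that `uname -r` prints
    print(release, includes_build(release, 191))
```

The xen suffix is ignored on purpose, since the xen and bare-metal kernels share the same release numbers.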
We tried it with the GA RH5.5, so the kernel version should be 2.6.18-194.
Will this be included in a 5.5.z or will we have to wait for RHEL 5.6?
I'm still confused as to why there appears to be a problem here. From comment 2 it looked like the original testing was done with -181, which would explain the absence of the patch, since the patch was integrated into -191. However, if testing was done with -194, then that indicates a new issue, as the patch is already there. Please note that both xen and bare-metal have the same module compiled, so it wasn't missed from the xen flavor.
Mellanox, please provide an update.
Per Yevgeny Petrilin at Mellanox:
> It has passed on RH5.5 GA, I thought I have updated the bugzilla, will
> check again.
Closing bug as CURRENTRELEASE (5.5).