Bug 523341 - PCI SR-IOV BAR resources can't be reliably mapped
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: 5.6
Assignee: Don Dutile (Red Hat)
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Duplicates: 573077 577179
Depends On:
Blocks: 557597 570372 577182 Rhel5KvmTier1 581655
Reported: 2009-09-14 23:28 UTC by Chris Wright
Modified: 2018-11-14 15:29 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-01-13 20:53:22 UTC
Target Upstream Version:
Embargoed:


Attachments
PCI SR-IOV: assign pci resources earlier (5.33 KB, patch), 2010-02-26 01:06 UTC, Chris Wright
PCI: clean up resource alignment management (10.44 KB, patch), 2010-02-26 01:07 UTC, Chris Wright
PCI SR-IOV: correct broken resource alignment calculations (5.73 KB, patch), 2010-02-26 01:08 UTC, Chris Wright
sosreport (581.04 KB, application/x-bzip), 2010-04-14 16:21 UTC, IBM Bug Proxy
dmesg log include SR-IOV failure info (56.81 KB, text/plain), 2010-08-27 04:44 UTC, Linqing Lu
patch 1/3 being posted to 5.6 (5.59 KB, text/plain), 2010-09-01 17:49 UTC, Don Dutile (Red Hat)
Patch 2/3 being posted to 5.6 (8.88 KB, patch), 2010-09-01 18:05 UTC, Don Dutile (Red Hat)
Patch 3/3 to be posted to rhel5.6 (4.24 KB, patch), 2010-09-01 18:16 UTC, Don Dutile (Red Hat)


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 62355 0 None None None Never
Red Hat Product Errata RHSA-2011:0017 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update 2011-01-13 10:37:42 UTC

Description Chris Wright 2009-09-14 23:28:43 UTC
Description of problem:

When loading a driver for a PCIe device which supports SR-IOV, the BARs associated with the SR-IOV capability may not be reliably requested, even if there is space.

Version-Release number of selected component (if applicable):

kernel-2.6.18-162.el5

How reproducible:

Platform dependent, but when it happens it happens 100% of the time.

Steps to Reproduce:
1. Enable SR-IOV in system BIOS
2. Boot RHEL-5.4
3. Load relevant driver (for SR-IOV capable device)
  
Actual results:

Driver modprobe fails and produces an error like: "not enough MMIO resources for SR-IOV"

This means the physical device (aka Physical Function or PF) will not work, and, of course, no virtual devices (aka Virtual Function or VF) associated w/ that PF can be allocated.

Expected results:

Driver is successfully loaded and VFs are successfully allocated making both the PF and its VFs functional.

Additional info:

This problem stems from two issues.  The core issue is that the kernel incorrectly uses the size of an SR-IOV BAR as the alignment requirement.  The size of an SR-IOV BAR is the size of a single VF BAR multiplied by the total number of possible VFs, effectively an array of all possible VF BARs for that resource.  However, the SR-IOV BAR need only be aligned to the size of a single VF BAR.

|- - - - -SR-IOV BAR0 - - - - -|
|                              |
|VF0 BAR0|VF1 BAR0|...|VFN BAR0|
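
The size versus alignment distinction can be made concrete with a minimal Python sketch. The function names, the 16 KiB VF BAR, the 64 TotalVFs, and the base address are illustrative assumptions, not values from this report or from the kernel source.

```python
def sriov_bar_size(vf_bar_size: int, total_vfs: int) -> int:
    """Size of the whole SR-IOV BAR: one VF BAR per possible VF, back to back."""
    return vf_bar_size * total_vfs


def buggy_alignment(vf_bar_size: int, total_vfs: int) -> int:
    """The behaviour described as the bug: the full size is used as alignment."""
    return sriov_bar_size(vf_bar_size, total_vfs)


def correct_alignment(vf_bar_size: int, total_vfs: int) -> int:
    """The alignment that is actually needed: the size of a single VF BAR."""
    return vf_bar_size


def vf_bar_address(base: int, vf_index: int, vf_bar_size: int) -> int:
    """Address of one VF's slice inside the SR-IOV BAR."""
    return base + vf_index * vf_bar_size


# Hypothetical device: 16 KiB per VF BAR, 64 possible VFs.
VF_BAR = 16 * 1024
TOTAL_VFS = 64

print(hex(sriov_bar_size(VF_BAR, TOTAL_VFS)))     # 0x100000 (1 MiB region)
print(hex(buggy_alignment(VF_BAR, TOTAL_VFS)))    # 0x100000
print(hex(correct_alignment(VF_BAR, TOTAL_VFS)))  # 0x4000

# A base aligned to 16 KiB but not to 1 MiB still leaves every VF BAR
# naturally aligned to its own size.
BASE = 0xFB004000
assert BASE % 0x100000 != 0
assert all(vf_bar_address(BASE, i, VF_BAR) % VF_BAR == 0 for i in range(TOTAL_VFS))
```

With these numbers the buggy calculation demands 1 MiB alignment where 16 KiB is enough, which is why a window with plenty of free space can still be rejected.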

Unfortunately, fixing the alignment calculation is not sufficient because SR-IOV BARs are not sorted at all.  Even if resource allocation used the smaller alignment requirement, the space may have become too fragmented by allocating an SR-IOV BAR w/ smaller alignment requirements before one w/ large alignment requirements.
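
The ordering problem can be illustrated with a deliberately simplified Python sketch. It uses a bump allocator that never backfills holes, which is an assumption made for illustration and not the kernel's actual allocator; the window bounds and resource sizes are hypothetical as well.

```python
def align_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (align is a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def place(window_start, window_end, resources):
    """Place (name, size, align) tuples in order; return None if one does not fit."""
    cursor = window_start
    placed = {}
    for name, size, align in resources:
        start = align_up(cursor, align)
        if start + size > window_end:
            return None  # padding for alignment pushed this resource past the window
        placed[name] = start
        cursor = start + size
    return placed


def by_alignment(resources):
    """Largest alignment first, so big resources do not land behind small ones."""
    return sorted(resources, key=lambda r: r[2], reverse=True)


# Hypothetical window of 0x24000 bytes: exactly enough for both resources.
WINDOW = (0x100000, 0x124000)
unsorted = [("small", 0x4000, 0x4000), ("large", 0x20000, 0x20000)]

print(place(*WINDOW, unsorted))                # None
print(place(*WINDOW, by_alignment(unsorted)))  # {'large': 1048576, 'small': 1179648}
```

The window holds both resources exactly, yet placing the small one first wastes the space needed to align the large one, so only the sorted order succeeds.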

Comment 3 Chris Wright 2010-02-26 01:01:06 UTC
The following 3 patches change how resources are allocated for SR-IOV VF BARs.  Rather than being allocated at driver load time, they are allocated during PCI bus enumeration along w/ traditional PCI BARs.  Further, the SR-IOV resources' alignment requirements are fixed so that a VF BAR alignment is only the size of a single VF BAR rather than the entire region allocated for all TotalVFs*VF BAR.

These may cause an issue for X on PPC; see upstream commit:

  ad892a6 powerpc: Fix PCI ROM access

Applies cleanly to 190.el5

Comment 4 Chris Wright 2010-02-26 01:06:44 UTC
Created attachment 396439 [details]
PCI SR-IOV: assign pci resources earlier

Comment 5 Chris Wright 2010-02-26 01:07:48 UTC
Created attachment 396440 [details]
PCI: clean up resource alignment management

Comment 6 Chris Wright 2010-02-26 01:08:45 UTC
Created attachment 396441 [details]
PCI SR-IOV: correct broken resource alignment calculations

Comment 7 Chris Wright 2010-02-26 01:13:56 UTC
Scratch brew build 2282252.  Uploaded x86_64 kernel and src.rpm to:

  http://et.redhat.com/~chrisw/rhel5/5.5/bz523341/

Comment 8 Andrew Jones 2010-03-30 12:01:15 UTC
*** Bug 577179 has been marked as a duplicate of this bug. ***

Comment 9 Ram Pai 2010-04-13 00:24:58 UTC
Chris, 

   I have attached 5 patches to bugzilla 
   https://bugzilla.redhat.com/show_bug.cgi?id=534158

   They fix SRIOV issues with intel 1g,10g and mellanox sriov adapter. 

   I know 534158 bugzilla is the wrong place to attach. Let me know if you want me
   to attach the proposed patches in this bugzilla. There seem to be too many 
   bugzillas tracking this issue, namely

   https://bugzilla.redhat.com/show_bug.cgi?id=573077
   https://bugzilla.redhat.com/show_bug.cgi?id=567730
   

thanks,
RP

Comment 10 IBM Bug Proxy 2010-04-14 16:21:40 UTC
------- Comment From linuxram.com 2010-04-09 16:51 EDT-------
---Problem Description---
SRIOV fails with MMIO resource allocation failure
Contact Information = linuxram.com

---Additional Hardware Info---
x3650M2


---uname output---
Linux beaverton.ibm.com 2.6.18-194.el5 #70 SMP Thu Apr 8 17:15:56 PDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Machine Type = x3650M2

---Debugger---
A debugger is not configured

---Steps to Reproduce---
rmmod ixgbe
modprobe ixgbe max_vfs=1

or

rmmod mlx4_en
rmmod mlx4_core
modprobe mlx4_core  sr_iov=1 probe_vf=1

or

rmmod igb
modprobe igb max_vfs=1

In all the above cases, the driver loads but fails to procure enough MMIO resources.

dmesg will show "MMIO allocation failure."

---Kernel Component Data---
Stack trace output:
no

Oops output:
no

*Additional Instructions for linuxram.com:
-Attach sysctl -a output to the bug.

Please refer to the following bugzilla for more information including some proposed patches.

Comment 11 IBM Bug Proxy 2010-04-14 16:21:48 UTC
Created attachment 406562 [details]
sosreport


------- Comment (attachment only) From linuxram.com 2010-04-09 16:57 EDT-------

Comment 12 Larry Troan 2010-05-19 16:11:49 UTC
Does Exar (formerly Neterion) plan to work this and submit a patch for 5.6 consideration?

Comment 13 Ramkrishna Vepa 2010-05-20 00:44:58 UTC
We have been testing Exar/Neterion's X3100 adapters in SRIOV mode w/wo passthrough with the following patched RHEL 5.4 kernel since 09/2009 -
http://et.redhat.com/~chrisw/rhel5/5.4/kernel-2.6.18-165.el5.cdub_sriov.x86_64.rpm

We have also successfully tested  Exar/Neterion's X3100 adapters in SRIOV mode w/wo passthrough with the following patched RHEL 5.5 kernels since 02/2010 (same patches on both kernels) -
http://et.redhat.com/~chrisw/rhel5/5.5/bz523341/

Ram

Comment 14 Larry Troan 2010-06-15 15:42:29 UTC
Per Ram Vepa in a telecon 6/14, patches submitted and accepted upstream. They have been running a patched kernel from Chris Wright since last September.

Comment 15 Larry Troan 2010-07-27 21:22:16 UTC
Do we have the patches? They need to be attached by 8/01 for 5.6 consideration.

Comment 16 IBM Bug Proxy 2010-07-27 21:51:48 UTC
------- Comment From linuxram.com 2010-07-27 17:48 EDT-------
I had submitted the patches to Chris Wright in April.

Chris, do you want me to resubmit the patches through this bugzilla? Please suggest.

Comment 17 Larry Troan 2010-07-28 01:31:18 UTC
Chris, do we have the patches POSTed to rhkernel and approved for 5.6?

Note that the patch submission date is Aug 13 (not Aug 1 above) but they must be reviewed and approved by that date.

Comment 18 Larry Troan 2010-07-29 22:05:26 UTC
Requesting an exception until NEEDINFO resolved.

Comment 19 Don Dutile (Red Hat) 2010-08-13 17:57:07 UTC
Larry,

Can you have the appropriate testers/reporters try the following kernel:
 people.redhat.com:~ddutile/.2414719547/kernel-2.6.18-211.el5bz523341.x86_64.rpm

Make sure you read the readme file in the same directory: 0-EXPORT-README.FIRST

- Don

Comment 20 Don Dutile (Red Hat) 2010-08-13 22:52:56 UTC
sorry, proper web link address is:
     http://people.redhat.com/ddutile/.2414719547/

Comment 21 Masroor 2010-08-19 04:31:18 UTC
(In reply to comment #20)
> sorry, proper web link address is:
>      http://people.redhat.com/ddutile/.2414719547/

Could you please update the xen kernel too with the same fix? The above link has only base kernel.

Thanks,
Masroor

Comment 22 Don Dutile (Red Hat) 2010-08-20 03:14:45 UTC
done.

Comment 23 Linqing Lu 2010-08-26 13:13:59 UTC
Hi Andrew,

Could you please take a glance at Bug 581655 to see if that bug depends on this one?

Comment18 of 581655 ( https://bugzilla.redhat.com/show_bug.cgi?id=581655#c18 ) shows some similar info as this bug.

Thank you.

Comment 24 Andrew Jones 2010-08-26 14:13:36 UTC
I can't say. I haven't really looked too closely at these bugs. Probably ddugger would be best to ask. Have you tried ddutile's test kernel from comment 20 to see if that changes things?

Comment 26 Linqing Lu 2010-08-27 04:44:02 UTC
Created attachment 441399 [details]
dmesg log include SR-IOV failure info

Tried 82599 on another machine with a x8 PCI-Express slot.
Although dmesg info (in the attachment) seems different, SR-IOV still doesn't work:

ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.62-k2
ixgbe: Copyright (c) 1999-2010 Intel Corporation.
ACPI: PCI Interrupt Link [LXB2] enabled at IRQ 41
  alloc irq_desc for 41 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: PCI INT A -> Link[LXB2] -> GSI 41 (level, high) -> IRQ 41
ixgbe 0001:58:00.0: setting latency timer to 64
ixgbe 0001:58:00.0: not enough MMIO resources for SR-IOV
ixgbe: 0001:58:00.0: ixgbe_probe_vf: Failed to enable PCI sriov: -12
  alloc irq_desc for 52 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 52 for MSI/MSI-X
  alloc irq_desc for 53 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 53 for MSI/MSI-X
  alloc irq_desc for 54 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 54 for MSI/MSI-X
  alloc irq_desc for 55 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 55 for MSI/MSI-X
  alloc irq_desc for 56 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 56 for MSI/MSI-X
  alloc irq_desc for 57 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 57 for MSI/MSI-X
  alloc irq_desc for 58 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 58 for MSI/MSI-X
  alloc irq_desc for 59 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 59 for MSI/MSI-X
  alloc irq_desc for 60 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 60 for MSI/MSI-X
  alloc irq_desc for 61 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 61 for MSI/MSI-X
  alloc irq_desc for 62 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 62 for MSI/MSI-X
  alloc irq_desc for 63 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 63 for MSI/MSI-X
  alloc irq_desc for 64 on node 1
  alloc kstat_irqs on node 1
ixgbe 0001:58:00.0: irq 64 for MSI/MSI-X
ixgbe: 0001:58:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 12, Tx Queue count = 12
ixgbe 0001:58:00.0: (PCI Express:2.5Gb/s:Width x8) 00:1b:21:6e:07:4c
ixgbe 0001:58:00.0: MAC: 2, PHY: 0, PBA No: e68785-002
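
For testers checking many machines, a small Python sketch like the following could pick the failure out of a saved dmesg log. The function names and regular expressions are assumptions written against the two message formats quoted in this log, and other drivers or kernel versions may word them differently. The -12 in the second message is the negated errno ENOMEM on Linux.

```python
import errno
import re

# Patterns modelled on the two failure lines quoted in this log.
MMIO_FAIL = re.compile(r"^(\S+) (\S+): not enough MMIO resources for SR-IOV")
ENABLE_FAIL = re.compile(r"Failed to enable PCI sriov: -(\d+)")


def find_sriov_failures(dmesg_text):
    """Return (driver, pci_address) pairs that reported the MMIO shortage."""
    failures = []
    for line in dmesg_text.splitlines():
        match = MMIO_FAIL.match(line.strip())
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


def enable_error_codes(dmesg_text):
    """Return the positive errno values from 'Failed to enable PCI sriov' lines."""
    return [int(m.group(1)) for m in ENABLE_FAIL.finditer(dmesg_text)]


def errno_name(code):
    """Symbolic name for an errno value, e.g. 12 -> 'ENOMEM'."""
    return errno.errorcode.get(code, "unknown")


sample = """\
ixgbe 0001:58:00.0: setting latency timer to 64
ixgbe 0001:58:00.0: not enough MMIO resources for SR-IOV
ixgbe: 0001:58:00.0: ixgbe_probe_vf: Failed to enable PCI sriov: -12
"""

print(find_sriov_failures(sample))
print([errno_name(code) for code in enable_error_codes(sample)])
```

Note that the driver keeps loading after the failure (the MSI-X and queue setup lines follow), so a successful modprobe alone does not show that SR-IOV was enabled.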

Comment 27 IBM Bug Proxy 2010-08-27 06:01:21 UTC
------- Comment From linuxram.com 2010-08-27 01:54 EDT-------
Can I get the link to the src rpm please. I want to see if the patches are included in the kernel or not.

Comment 28 Andrew Jones 2010-08-27 09:10:07 UTC
(In reply to comment #27)
> ------- Comment From linuxram.com 2010-08-27 01:54 EDT-------
> Can I get the link to the src rpm please. I want to see if the patches are
> included in the kernel or not.

Unfortunately it's probably gone, since the scratch build is pretty old. Don, do you still have the branch laying around to package up an srpm for IBM?

Comment 32 Masroor 2010-08-30 08:55:36 UTC
I tested the above xen-kernel using Neterion's x3100 series 10G adapter. The resource failure error has gone away. On loading the vxge driver, all the VFs are detected and configured. I am also able to pass through the VFs to the VM.

Thanks,
Masroor

Comment 33 Don Dutile (Red Hat) 2010-08-30 15:40:57 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > ------- Comment From linuxram.com 2010-08-27 01:54 EDT-------
> > Can I get the link to the src rpm please. I want to see if the patches are
> > included in the kernel or not.
> 
> Unfortunately it's probably gone, since the scratch build is pretty old. Don,
> do you still have the branch laying around to package up an srpm for IBM?

I can provide the patches that went into the kernel listed in c#20 & c#21.

if that's sufficient, let me know & I'll attach them to this bz.

Comment 34 Ram Pai 2010-08-30 17:13:11 UTC
Don,

Yes. please point me to the patches that went into the kernel.

RP

Comment 35 Don Dutile (Red Hat) 2010-09-01 17:49:03 UTC
Created attachment 442455 [details]
patch 1/3 being posted to 5.6

 
Same as one chris listed; syncd to -214 kernel branch of 5.6

Comment 36 Don Dutile (Red Hat) 2010-09-01 18:05:31 UTC
Created attachment 442458 [details]
Patch 2/3 being posted to 5.6

This patch has a line added to pbus_size_io & pbus_size_mem that was missing in the patch Chris posted in bz originally.  Needed for some systems (like my vtd laptop).

Comment 37 Don Dutile (Red Hat) 2010-09-01 18:16:32 UTC
Created attachment 442461 [details]
Patch 3/3 to be posted to rhel5.6

This is same as Chris's original 3rd patch that was attached.
.... and...
If Bugzilla shows which new patch obsoletes which old one, note that I mis-selected: the 2/3 patch is marked as obsoleting the original 3rd patch, and 3/3 as obsoleting the original 2nd patch. Sorry about the confusion.
note: you may have to hit 'show obsoletes' in lower right corner of attachments box to see original patches

Comment 38 Ram Pai 2010-09-01 18:20:53 UTC
Ok. That patch by itself won't be sufficient to enable SRIOV on platforms on which the BIOS/uEFI is unaware of SRIOV BARs. 

I had submitted a couple of patches, see comment #9, to take care of assigning MMIO resources to SRIOV BARs. These patches were backports from upstream patches. 

However, one of the key patches has since been reverted upstream, so at this point there is no clean solution to this problem.  I am working on a patch that assigns resources to unassigned BARs based on a kernel command line parameter. I plan to submit it upstream soon and will have it backported here once it is in a good state.

Comment 43 Ram Pai 2010-09-09 16:34:43 UTC
FYI: I just submitted a patch upstream 
http://marc.info/?l=linux-kernel&m=128392923724817&w=2

Comment 44 Don Dutile (Red Hat) 2010-09-10 15:58:28 UTC
(In reply to comment #43)
> FYI: I just submitted a patch upstream 
> http://marc.info/?l=linux-kernel&m=128392923724817&w=2

Yes, but (a) it hasn't been accepted, and (b) responders have questions/issues over it.

Until the issues are resolved and the patch is accepted upstream,
it would be _very_ difficult to get this into rhel5.

Comment 45 Ram Pai 2010-09-10 17:11:50 UTC
Yes. I won't push this into rhel5 till it's fully baked and accepted upstream. However, I'd appreciate any help towards this.

Comment 46 Jarod Wilson 2010-09-10 21:38:32 UTC
in kernel-2.6.18-219.el5
You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 48 IBM Bug Proxy 2010-11-03 11:12:06 UTC
------- Comment From  2010-11-03 07:03 EDT-------
Redhat,

Is this issue fixed in RHEL5.6?

Thanks
Muni

Comment 49 Andrew Jones 2010-11-03 12:30:27 UTC
(In reply to comment #48)
> Is this issue fixed in RHEL5.6?

The patches for this bug were posted and integrated into -219, so they'll be in 5.6.

Comment 50 Andy Gospodarek 2010-11-03 15:49:16 UTC
*** Bug 573077 has been marked as a duplicate of this bug. ***

Comment 51 Chris Ward 2010-11-09 13:38:16 UTC
~~ Attention Customers and Partners - RHEL 5.6 Public Beta is now available on RHN ~~

A fix for this 'OtherQA' BZ should be present and testable in the release. 

If this Bugzilla is verified as resolved, please update the Verified field above with an appropriate value and include a summary of the testing executed and the results obtained.

If you encounter any issues or have questions while testing, please describe them and set this bug into NEED_INFO. 

If you encounter new defects or have additional patches to request for inclusion, promptly escalate the new issues through your support representative.

Finally, future Beta kernels can be found here:
 http://people.redhat.com/jwilson/el5/

Note: Bugs with the 'OtherQA' keyword require Third-Party testing to confirm the request has been properly addressed. See: https://bugzilla.redhat.com/describekeywords.cgi#OtherQA

Comment 52 Srirama 2010-11-25 12:24:02 UTC
Installed RHEL 5.6 Beta (snapshot1) image and verified that the vxge driver (version 2.0.8.20182-k) loads successfully in SRIOV mode. PF and all the VFs get enumerated properly and are functional. Also verified that the passthrough configuration works by attaching the PCI functions to 2 VMs and running traffic between the VMs and also to the remote machine.

[root@linux-rhel-5 ~]# uname -a
Linux linux-rhel-5.6-181 2.6.18-231.el5xen #1 SMP Mon Nov 8 18:45:02 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@linux-rhel-5 ~]# lspci -d17d5:*
02:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.1 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.2 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.3 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.4 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.5 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.6 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:00.7 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.1 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.2 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.3 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.4 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.5 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.6 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:01.7 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
02:02.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
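
A verification like this one can be scripted by counting the functions that lspci reports. The Python sketch below is a minimal example; the function name and the regular expression are assumptions, and it expects the short bus:device.function form without a PCI domain prefix, as printed in this comment.

```python
import re

# Start of an lspci line in bus:device.function form, e.g. "02:00.0 ".
PCI_FUNCTION = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s", re.IGNORECASE)


def count_functions(lspci_output: str) -> int:
    """Count enumerated PCI functions; wrapped continuation lines are ignored."""
    return sum(1 for line in lspci_output.splitlines() if PCI_FUNCTION.match(line))


# Three lines in the same shape as the listing in this comment, wrapping included.
sample = """\
02:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe
(rev 02)
02:00.1 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe
(rev 02)
02:02.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe
(rev 02)
"""

print(count_functions(sample))  # 3
```

Run against the full listing in this comment, the same count would be 17 (02:00.0 through 02:02.0), which a tester can compare with the number of functions the adapter is configured to expose.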

Comment 53 IBM Bug Proxy 2010-12-27 07:02:01 UTC
------- Comment From  2010-12-27 01:55 EDT-------
As per latest comment, can we close this bug?

Thanks
Muni

Comment 55 errata-xmlrpc 2011-01-13 20:53:22 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

