Bug 873462

Summary: PCIe SRIOV VFs may not configure on PCIe port with no ARI support
Product: Red Hat Enterprise Linux 6 Reporter: Don Dutile (Red Hat) <ddutile>
Component: kernel Assignee: Don Dutile (Red Hat) <ddutile>
Status: CLOSED ERRATA QA Contact: Weibing Zhang <atzhang>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.3 CC: kzhang, mjenner
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-2.6.32-345.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 06:55:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 881827    

Description Don Dutile (Red Hat) 2012-11-05 22:37:42 UTC
Description of problem:
Virtual Functions of a PCIe SRIOV device can be associated with the wrong IOMMU, causing VF DMA and Interrupt-Remapping failure.

Version-Release number of selected component (if applicable):


How reproducible:
Attach a PCI SRIOV device to a system port that does not have PCI ARI support

Steps to Reproduce:
1. Boot an Intel-based system with IOMMU support with 'intel_iommu=on' on the kernel command line.
2. Enable the VFs of a PCIe SRIOV device (e.g., an 82599) on a system port without ARI support: modprobe ixgbe max_vfs=16
  
Actual results:
VFs fail to configure because the lack of ARI causes the VF to use a PCI bus number that is different from that of the PCIe PF device.  The Intel-IOMMU DMAR tables only specify the relationship of PCI(e) devices to IOMMUs for known buses.  VFs on a non-ARI port cause the creation of a 'virtual bus' number that does not match any entry in the Intel-IOMMU DMAR tables in this configuration.
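
To illustrate where the 'virtual bus' number comes from, here is a minimal sketch of the SR-IOV routing-ID arithmetic (VF routing ID = PF routing ID + First VF Offset + VF Stride * VF index). The function name is hypothetical, and the offset/stride values (0x80 with ARI, 0x180 without, stride 2) are assumed for illustration only; the real values must be read from the device's SR-IOV capability. The PF address 8b:00.0 is borrowed from the lspci output in comment 8.

```python
def vf_routing_id(pf_bus, pf_devfn, first_vf_offset, vf_stride, vf_index):
    """Return (bus, device, function) for VF number vf_index (0-based).

    Hypothetical helper: models the SR-IOV routing-ID arithmetic only.
    """
    pf_rid = (pf_bus << 8) | pf_devfn            # 16-bit routing ID of the PF
    vf_rid = (pf_rid + first_vf_offset + vf_stride * vf_index) & 0xFFFF
    bus = vf_rid >> 8                            # a carry out of the low byte bumps the bus
    devfn = vf_rid & 0xFF
    return bus, devfn >> 3, devfn & 0x7          # split devfn the way lspci prints it


if __name__ == "__main__":
    # Offsets below are assumed example values, not read from hardware.
    for label, offset in (("ARI", 0x80), ("non-ARI", 0x180)):
        bus, dev, fn = vf_routing_id(0x8B, 0x00, offset, 2, 0)
        print(f"{label}: VF0 at {bus:02x}:{dev:02x}.{fn}")
```

With the assumed non-ARI offset, the sum carries into the bus byte, so the VF lands on a bus number one higher than the PF's. That is the bus the DMAR lookup cannot find.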

Expected results:
VFs configure successfully after modprobe execution.
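
One way to check the expected result after modprobe is to look at the virtfnN symlinks that sysfs creates under the PF's device directory. The sketch below assumes the usual /sys/bus/pci/devices layout; list_vfs is a hypothetical helper name, and the sysfs_root parameter exists only so the function can be pointed at a test tree.

```python
import os


def list_vfs(pf_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses of the VFs that the PF currently exposes.

    Hypothetical helper: follows the virtfnN symlinks in the PF's sysfs
    directory and returns the addresses ordered by VF index.
    """
    pf_dir = os.path.join(sysfs_root, pf_addr)
    vfs = []
    for name in os.listdir(pf_dir):
        if not name.startswith("virtfn"):
            continue
        index = int(name[len("virtfn"):])        # virtfn7 -> 7
        target = os.path.realpath(os.path.join(pf_dir, name))
        vfs.append((index, os.path.basename(target)))
    return [addr for _, addr in sorted(vfs)]
```

For example, list_vfs("0000:8b:00.0") should return 16 addresses after 'modprobe ixgbe max_vfs=16' if all VFs configured; an empty list means none were created.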


Additional info:
Need to backport Linux upstream commit dda5654:
commit dda565492776b7dff5f8507298d868745e734aab
Author: Yinghai <yinghai.lu>
Date:   Fri Apr 9 01:07:55 2010 +0100

    intel-iommu: use physfn to search drhd for VF
    
    When virtfn is used, we should use physfn to find correct drhd
    
    -v2: add pci_physfn() Suggested by Roland Dreier <rdreier>
         do can remove ifdef in dmar.c
    -v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
         also change dmar_pci_device_match() static
    
    Signed-off-by: Yinghai Lu <yinghai>
    Acked-by: Roland Dreier <rdreier>
    Acked-by: Chris Wright <chrisw>
    Acked-by: Jesse Barnes <jbarnes>
    Signed-off-by: David Woodhouse <David.Woodhouse>
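
The idea behind the commit can be modelled in a few lines. This is a toy sketch, not the kernel code: pci_physfn() is the helper named in the commit message, while PciDev, find_drhd and the bus-keyed dictionary are hypothetical stand-ins for the kernel's DRHD device-scope matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PciDev:
    bus: int
    devfn: int
    physfn: Optional["PciDev"] = None   # set only for VFs; points at the PF


def pci_physfn(dev):
    """Return the PF for a VF, or the device itself otherwise."""
    return dev.physfn if dev.physfn is not None else dev


def find_drhd(drhd_by_bus, dev, use_physfn):
    """Look up the IOMMU unit for dev in a simplified bus -> DRHD table.

    use_physfn=False models the behaviour before the fix,
    use_physfn=True models the behaviour after it.
    """
    if use_physfn:
        dev = pci_physfn(dev)
    return drhd_by_bus.get(dev.bus)


if __name__ == "__main__":
    table = {0x8B: "drhd0"}                    # only the PF's bus is known
    pf = PciDev(bus=0x8B, devfn=0x00)
    vf = PciDev(bus=0x8C, devfn=0x80, physfn=pf)
    print(find_drhd(table, vf, use_physfn=False))  # VF's own bus: no match
    print(find_drhd(table, vf, use_physfn=True))   # resolved through the PF
```

Because the VF always sits behind the same IOMMU as its PF, searching with the PF gives the right unit even when the VF's bus number is absent from the DMAR tables.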

Comment 3 RHEL Program Management 2012-11-05 23:01:02 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 5 Jarod Wilson 2012-12-04 17:15:59 UTC
Patch(es)

Comment 8 Weibing Zhang 2013-01-17 08:00:25 UTC
We do not have a system with a port that lacks PCI ARI support.
[root@ibm-x3650m4-04 ~]# lspci -vvv -s 8b:00.0 | grep ARI
	Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 1
		ARICtl:	MFVC- ACS-, Function Group: 0
		IOVCtl:	Enable- Migration- Interrupt- MSE- ARIHierarchy+
[root@ibm-x3650m4-04 ~]# 

SR-IOV works on this port as expected with kernel-2.6.32-350.el6.

17: eth12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 90:e2:ba:29:c0:ac brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 02:fb:2d:cc:9d:24
    vf 1 MAC f2:51:a3:86:0c:3e
    vf 2 MAC 0a:e3:4c:a5:e8:47
    vf 3 MAC f6:fb:98:60:55:bd
    vf 4 MAC 0e:b6:30:04:83:87
    vf 5 MAC 0e:d2:81:54:d5:37
    vf 6 MAC 1e:49:98:8b:60:cb
18: eth24: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether da:c7:96:7e:ba:85 brd ff:ff:ff:ff:ff:ff
19: eth13: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
    link/ether 90:e2:ba:29:c0:ad brd ff:ff:ff:ff:ff:ff
    vf 0 MAC c6:bc:84:93:c5:1c
    vf 1 MAC 9a:bd:b4:94:df:43
    vf 2 MAC ce:40:95:c9:59:42
    vf 3 MAC 72:ba:18:57:0b:c0
    vf 4 MAC 22:9e:90:50:1d:84
    vf 5 MAC fa:63:c3:5e:93:79
    vf 6 MAC a2:07:04:2e:38:4c

Set SanityOnly.

Comment 10 errata-xmlrpc 2013-02-21 06:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html