Bug 679642
| Summary: | x3100 can't generate vfs on AMD host | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Chao Yang <chayang> |
| Component: | kernel | Assignee: | Alex Williamson <alex.williamson> |
| Status: | CLOSED CANTFIX | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.1 | CC: | ddutile, juzhang, khong, lcapitulino, michen, ndai |
| Target Milestone: | rc | Keywords: | TestBlocker |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-03-01 03:49:51 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 580951 | | |

Additional info:
this x3100 card can generate vf in west-mere platform and 82576 card works fine in this AMD host.

(In reply to comment #0)
> # modprobe -r vxge;modprobe vxge func_mode=2
> func_mode:
> Changes the PCI function mode.
> 0 - SF1_VP17 (1 function, 17 Vpaths)
> 1 - MF8_VP2 (8 functions, 2 Vpaths each)
> 2 - SR17_VP1 (17 VFs with 1 Vpath each)
> 3 - MR17_VP1 (17 Virtual Hierarchies, 1 Vpath/Function/Hierarchy)
> 4 - MR8_VP2 (8 Virtual Hierarchies, 2 Vpath/Function/Hierarchy)
> 5 - MF17_VP1 (17 functions, 1 vpath each (PCIe ARI))
> 6 - SR8_VP2 (1PF, 7VF, 2 Vpaths each)
> 7 - SR4_VP4 (1PF, 3VF, 4 Vpaths each)
> 8 - MF2_VP8 (2 functions, 8 Vpaths each)
> 9 - MF4_VP4 (4 Functions, 4 Vpaths each)
> 10 - MR4_VP4 (4 Virtual Hierarchies, 4 Vpaths/Function/Hierarchy)
>
>
> Actual results:
> fail to generate any vf.
>
> Expected results:
> x3100 can work with the mode I give.

In reality, I find that the "bigger" modes this card supports only work on very few systems because each VF requires non-trivial MMIO, and the BIOS typically does not open the bridge aperture wide enough.

> # dmesg |grep -i sriov
> eth4: SRIOV 17 - 17 VF, 1 vpath per VF Enabled
> # dmesg |grep -i vxge
> vxge 0000:09:00.0: eth50: Link Down
> vxge 0000:09:00.0: PCI INT A disabled
> vxge: Unknown parameter `func_mode'
> vxge: Copyright(c) 2002-2010 Exar Inc.
> vxge: Driver version: 2.0.28.21260-p3.0.1.2
> vxge 0000:09:00.0: PCI INT A -> Link[LN48] -> GSI 48 (level, high) -> IRQ 48
> vxge 0000:09:00.0: setting latency timer to 64
> vxge 0000:09:00.0: not enough MMIO resources for SR-IOV

This is the indicator for that occurring. This may work in other systems that have better sriov support in the bios and it may work in this system by configuring a mode with fewer VFs. Please provide full lspci -vvv for the system so we can see the bridge apertures. Please also test with func_mode=7 to reduce the number of VFs generated. This appears to be a platform BIOS issue.

(In reply to comment #3)
> Additional info:
> this x3100 card can generate vf in west-mere platform and 82576 card works fine
> in this AMD host.

The Westmere platform has a "strong" BIOS for SR-IOV devices, i.e., it can make PCI bridge windows large enough for big-mem SR-IOV VF devices. 82576 VFs use a small amount of memory-mapped space and often 'squeeze' into the leftover space in a PCI bridge window (which is required to map in multiples of 1MB). On the other hand, we've had numerous AMD boxes that don't support SR-IOV well at all, so as Alex stated in the previous comment, this looks like a BIOS issue with your AMD box.

(In reply to comment #5)
> This may work in other systems that
> have better sriov support in the bios and it may work in this system by
> configuring a mode with fewer VFs. Please provide full lspci -vvv for the
> system so we can see the bridge apertures. Please also test with func_mode=7
> to reduce the number of VFs generated. This appears to be a platform BIOS
> issue.

I think it failed with func_mode=7; after a reboot the host still cannot see any VF. I will attach dmesg and lspci -vvv.

# modprobe -r vxge;modprobe vxge func_mode=7
# lspci|grep Eth
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
09:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)

Created attachment 481484 [details]
device message after passing func_mode=7 to vxge
Created attachment 481485 [details]
lspci -vvv info
Created attachment 481486 [details]
device message after reboot
(In reply to comment #8)
> Created attachment 481484 [details]
> device message after passing func_mode=7 to vxge

Most of this log is for mode 2, it doesn't seem to include anything after reboot for mode 7.

(In reply to comment #11)
> (In reply to comment #8)
> > Created attachment 481484 [details]
> > device message after passing func_mode=7 to vxge
>
> Most of this log is for mode 2, it doesn't seem to include anything after
> reboot for mode 7.

Apologies, I see the new dmesg in a later attachment. It seems pretty clear that this BIOS isn't even attempting to open the bridge apertures for sr-iov devices. The parent bridge is this:

00:09.0 PCI bridge: ATI Technologies Inc RD890 PCI to PCI bridge (PCI express gpp port H) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=09, subordinate=09, sec-latency=0
        I/O behind bridge: 0000f000-00000fff
        Memory behind bridge: ef300000-ef3fffff
        Prefetchable memory behind bridge: 00000000e4800000-00000000e57fffff

So we have 1MB of MMIO and 16MB of prefetchable MMIO. The PF uses:

09:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
        Subsystem: Exar Corp. X3120 Dual Port 10GBase-CR
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 48
        Region 0: Memory at e4800000 (64-bit, prefetchable) [size=8M]
        Region 2: Memory at e57fc000 (64-bit, prefetchable) [size=8K]
        Region 4: Memory at e57fe000 (64-bit, prefetchable) [size=8K]
        Expansion ROM at ef380000 [disabled] [size=512K]

8M + 16k of prefetchable MMIO, so the bridge has the minimum aperture for just the PF, and it would be pure luck if the VFs had room here. Each of the VFs requires 3 prefetchable ranges:

        Capabilities: [170] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 16, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 5833
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 0: Memory at 0000000000000000 (64-bit, prefetchable)
                Region 2: Memory at 00000000e5000000 (64-bit, prefetchable)
                Region 4: Memory at 00000000e5020000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0

We can see that BAR0 didn't get mapped, and BARs 2 & 4 aren't even within the bridge aperture. We could use setpci to figure out the size of these:

setpci -s 09:00.0 194.l <report address, as above>
setpci -s 09:00.0 194.l=ffffffff
setpci -s 09:00.0 194.l <report size mask>
setpci -s 09:00.0 194.l=<restore value reported above>

Repeat for offsets 19c and 1a4. I don't think this is really necessary though since it's pretty obvious that this BIOS isn't leaving room for the VFs. Testing needs to be done on an AMD based system with sufficient SR-IOV support in the BIOS.

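For readers who do run the setpci probe described in comment #13, the size mask it reports can be decoded with the standard PCI BAR sizing rule: clear the low four attribute bits, invert, and add one. The sketch below is illustrative only. The function name `bar_size_from_mask` and the sample readback values are hypothetical and were not captured from this system, and the default for the upper dword assumes the BAR is smaller than 4GB.

```python
def bar_size_from_mask(mask_lo: int, mask_hi: int = 0xFFFFFFFF) -> int:
    """Decode the size in bytes of a memory BAR from its all-ones readback.

    mask_lo is the value read back from the low dword after writing ffffffff.
    mask_hi is the readback of the upper dword of a 64-bit BAR; the default
    (all ones) is an assumption that holds for BARs smaller than 4GB.
    """
    # The low 4 bits of a memory BAR are attribute bits (type, prefetchable),
    # not address bits, so they are masked off before decoding.
    mask = ((mask_hi << 32) | mask_lo) & ~0xF
    if mask == 0:
        return 0  # BAR not implemented
    return (~mask & 0xFFFFFFFFFFFFFFFF) + 1


# Hypothetical readbacks for a 64-bit prefetchable BAR (attribute bits = 0xc).
print(bar_size_from_mask(0xFF80000C) // (1024 * 1024), "MB")  # 8 MB
print(bar_size_from_mask(0xFFFFE00C) // 1024, "KB")           # 8 KB
```

For a VF BAR in the SR-IOV capability, the decoded value is the per-VF size, so the space the bridge must provide is that size multiplied by the number of VFs.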
(In reply to comment #13)
> We can see that BAR0 didn't get mapped, and BARs 2 & 4 aren't even within the
> bridge aperture.

Correction, BARs 2 & 4 do fit in the bridge aperture. IIRC, each VF has the same resource requirements as the PF, 8MB + 8k + 8k. The smaller BARs fit; 16 8k BARs in the 128k from e5000000 - e501ffff and 16 8k BARs from e5020000 - e503ffff. The 16 8M BAR0s would require 128MB on their own. With the smaller VF BARs and the PF BARs and PCI alignment, the BIOS would need to open the prefetchable aperture on 00:09.0 to 256MB.

(In reply to comment #14)
> (In reply to comment #13)
> > We can see that BAR0 didn't get mapped, and BARs 2 & 4 aren't even within the
> > bridge aperture.
>
> Correction, BARs 2 & 4 do fit in the bridge aperture. IIRC, each VF has the
> same resource requirements as the PF, 8MB + 8k + 8k. The smaller BARs fit; 16
> 8k BARs in the 128k from e5000000 - e501ffff and 16 8k BARs from e5020000 -
> e503ffff. The 16 8M BAR0s would require 128MB on their own. With the smaller
> VF BARs and the PF BARs and PCI alignment, the BIOS would need to open the
> prefetchable aperture on 00:09.0 to 256MB.

Alex,
We already know the 82576 works fine on this AMD host, so my question is: what is the difference between the 82576 and the x3100? I mean, how can we determine whether a BIOS will give sufficient support to an SR-IOV capable NIC? Your answer will really help us a lot, thank you in advance!

(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > We can see that BAR0 didn't get mapped, and BARs 2 & 4 aren't even within the
> > > bridge aperture.
> >
> > Correction, BARs 2 & 4 do fit in the bridge aperture. IIRC, each VF has the
> > same resource requirements as the PF, 8MB + 8k + 8k. The smaller BARs fit; 16
> > 8k BARs in the 128k from e5000000 - e501ffff and 16 8k BARs from e5020000 -
> > e503ffff. The 16 8M BAR0s would require 128MB on their own. With the smaller
> > VF BARs and the PF BARs and PCI alignment, the BIOS would need to open the
> > prefetchable aperture on 00:09.0 to 256MB.

Alex,
As your comment says above, the BIOS would need to open the prefetchable aperture on 00:09.0 to 256MB. But on another machine the x3100 can generate VFs successfully, and its prefetchable memory is only 00000000e0000000-00000000e08fffff, so I am confused. Could you please explain? I will attach lspci -vvv info.

Created attachment 481515 [details]
pci tree
# lspci -vvv -s 00:01.0
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 13) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: e4400000-ec7fffff
Prefetchable memory behind bridge: 00000000e0000000-00000000e08fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: Hewlett-Packard Company Device 130a
Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
Address: fee00000 Data: 4061
Masking: 00000002 Pending: 00000000
Capabilities: [90] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ RBE+ FLReset-
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 256 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <512ns, L1 <64us
ClockPM- Surprise+ LLActRep+ BwNot+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surpise-
Slot # 0, PowerLimit 0.000000; Interlock- NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd Off, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet+ LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd+
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB
Capabilities: [e0] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [150] Access Control Services
ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
Capabilities: [160] Vendor Specific Information <?>
Kernel driver in use: pcieport
Kernel modules: shpchp
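
The aperture sizes discussed in this thread can be read off the "behind bridge" lines of lspci output. The following is a minimal sketch of that arithmetic, not a tool used in this bug; the helper name `bridge_windows_mb` and the two sample strings are introduced here for illustration, with the ranges copied from the lspci output quoted in this report.

```python
import re

# Matches "Memory behind bridge: START-END" and the prefetchable variant.
WINDOW_RE = re.compile(
    r"(Prefetchable memory|Memory) behind bridge:\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)"
)


def bridge_windows_mb(lspci_text: str) -> dict:
    """Return the size in MB of each memory window found in lspci -vvv text."""
    sizes = {}
    for kind, start, end in WINDOW_RE.findall(lspci_text):
        key = "prefetchable" if kind.startswith("Prefetchable") else "non-prefetchable"
        lo, hi = int(start, 16), int(end, 16)
        # A limit below the base means the window is disabled.
        sizes[key] = (hi - lo + 1) // (1024 * 1024) if hi >= lo else 0
    return sizes


# Ranges copied from the two bridges shown in this report.
INTEL_BRIDGE = """
Memory behind bridge: e4400000-ec7fffff
Prefetchable memory behind bridge: 00000000e0000000-00000000e08fffff
"""
AMD_BRIDGE = """
Memory behind bridge: ef300000-ef3fffff
Prefetchable memory behind bridge: 00000000e4800000-00000000e57fffff
"""

print(bridge_windows_mb(INTEL_BRIDGE))  # {'non-prefetchable': 132, 'prefetchable': 9}
print(bridge_windows_mb(AMD_BRIDGE))    # {'non-prefetchable': 1, 'prefetchable': 16}
```

The Intel root port exposes 141MB in total across both windows, while the AMD bridge exposes 17MB, which is only enough for the PF.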
Created attachment 481516 [details]
pci info
(In reply to comment #15)
>
> Alex,
> We already know 82576 works fine on this AMD host, so my question is what's
> the difference between 82576 and x3100? I mean how to determine whether a BIOS
> will give a sufficient support to SRIOV capability nic card? Your answer will
> really help us a lot, thank you in advance!

It's simply a matter of the resource requirements. An 82576 PF has 2 non-prefetchable BARs, 128k & 16k. For a typical dual-port 82576, factoring PCI alignment, the BIOS needs to program the bridge aperture to at least 512k. Each VF for the 82576 requires 2 16k BARs. For a dual-port 82576 w/ 7 VFs per PF, that's 128k * 2 + 16k * 2 + 16k * 7 + 16k * 7 = 512k. So, all the VFs for a dual port card will fit into the extra space left over by the alignment requirements for the PF, and it should work even if the BIOS has no SR-IOV support. Note that the PCI spec actually requires a minimum granularity of 1M for prefetchable and non-prefetchable apertures, so there's actually more than enough space. The only resource contention I can imagine in setting up the VFs for an 82576 would be if the device shares a bus with other devices, which might infringe on the extra space. Perhaps you could see this on a system where the 82576 is an integrated device.

The x3100, on the other hand, needs 16x the minimum alignment of the PF to support all of the VFs. The BIOS must support SR-IOV to enable this device. I suspect the massive resource requirements play a part in why Exar chose to support MF modes, which are supported by non-SR-IOV aware BIOSes.

(In reply to comment #17)
> 00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root
> Port 1 (rev 13) (prog-if 00 [Normal decode])
> Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
> I/O behind bridge: 0000f000-00000fff
> Memory behind bridge: e4400000-ec7fffff
> Prefetchable memory behind bridge: 00000000e0000000-00000000e08fffff

03:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (rev 02)
        Subsystem: Exar Corp. X3120 Dual Port 10GBase-CR
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 28
        Region 0: Memory at e0000000 (64-bit, prefetchable) [size=8M]
        Region 2: Memory at e0800000 (64-bit, prefetchable) [size=8K]
        Region 4: Memory at e0802000 (64-bit, prefetchable) [size=8K]
        Capabilities: [170] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 16, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 5833
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 0: Memory at 00000000e4800000 (64-bit, prefetchable)
                                            ^^^^^^^^
                Region 2: Memory at 00000000e0804000 (64-bit, prefetchable)
                Region 4: Memory at 00000000e0824000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0

In this case, note where the VF Region 0 memory is allocated. This comes from the non-prefetchable memory range of the bridge (note it's valid to use non-prefetchable bridge ranges for prefetchable device ranges, but not the reverse). So in this configuration, the 8MB VF BARs come out of the range e4800000 - ec800000 and the 8k BARs are all allocated out of the prefetchable range of the bridge. I'm not sure why the BIOS opened an extra 4MB of non-prefetchable aperture.

Also, in coming up with 256MB, I was assuming normal PCI natural alignment for resources. Bridges actually have 1MB granularity, which doesn't need to be naturally aligned (as highlighted by this 132MB range above). So for VFs, we actually need 16 * 8k + 16 * 8k + 16 * 8M and the PF needs 8k + 8k + 8M, which can all fit in 137MB.
The above does it as 9MB of prefetchable + 128MB of non-prefetchable (+ 4MB unallocated under the bridge).

(In reply to comment #20)
> (In reply to comment #17)
> > 00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root
> > Port 1 (rev 13) (prog-if 00 [Normal decode])
> > Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
> > I/O behind bridge: 0000f000-00000fff
> > Memory behind bridge: e4400000-ec7fffff
> > Prefetchable memory behind bridge: 00000000e0000000-00000000e08fffff
>
> 03:00.0 Ethernet controller: Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe
> (rev 02)
> Subsystem: Exar Corp. X3120 Dual Port 10GBase-CR
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping-
> SERR+ FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 28
> Region 0: Memory at e0000000 (64-bit, prefetchable) [size=8M]
> Region 2: Memory at e0800000 (64-bit, prefetchable) [size=8K]
> Region 4: Memory at e0802000 (64-bit, prefetchable) [size=8K]
> Capabilities: [170] Single Root I/O Virtualization (SR-IOV)
> IOVCap: Migration-, Interrupt Message Number: 000
> IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy+
> IOVSta: Migration-
> Initial VFs: 16, Total VFs: 16, Number of VFs: 16, Function Dependency Link:
> 00
> VF offset: 1, stride: 1, Device ID: 5833
> Supported Page Size: 000007ff, System Page Size: 00000001
> Region 0: Memory at 00000000e4800000 (64-bit, prefetchable)
> ^^^^^^^^
> Region 2: Memory at 00000000e0804000 (64-bit, prefetchable)
> Region 4: Memory at 00000000e0824000 (64-bit, prefetchable)
> VF Migration: offset: 00000000, BIR: 0
>
> In this case, note where the VF Region 0 memory is allocated. This comes from
> the non-prefetchable memory range of the bridge (note it's valid to use
> non-prefetchable bridge ranges for prefetchable device ranges, but not the
> reverse). So in this configuration, the 8MB VF BARs come out of the range
> e4800000 - ec800000 and the 8k BARs are all allocated out of the prefetchable
> range of the bridge. I'm not sure why the BIOS opened an extra 4MB of
> non-prefetchable aperture.
>
> Also, in coming up with 256MB, I was assuming normal PCI natural alignment for
> resources. Bridges actually have 1MB granularity, which doesn't need to be
> naturally aligned (as highlighted by this 132MB range above). So for VFs, we
> actually need 16 * 8k + 16 * 8k + 16 * 8M and the PF needs 8k + 8k + 8M, which
> can all fit in 137MB. The above does it as 9MB of prefetchable + 128MB of
> non-prefetchable (+ 4MB unallocated under the bridge).

Alex,
Thanks a lot, this is very useful. BTW, based on your comments, we still have a problem: we cannot be sure whether a host supports SR-IOV before we actually use it, and we may even need to calculate the prefetchable memory ourselves. Would you please give me a more specific suggestion about SR-IOV HW prerequisites? Thanks again.

Best Regards,
Chayang

(In reply to comment #21)
> BTW,based on your comments,we still have a problem,we can not make sure host
> whether or not support sr-iov before we real use it,even more,maybe we need to
> calculate prefetchable momory.would you please give me a more specific
> suggestion about sr-iov HW prerequisites?thanks again.

Don is probably better able to comment on the specific hardware requirements as far as things like ARI/ACS in the chipset. Unfortunately on the BIOS side, there seems to be little we can do other than try it and use analysis like above to verify that if it doesn't work, it's because the BIOS isn't mapping sufficient resources. Ideally we can also let the hardware vendors know about these problems. It might be a good goal to make something like biosbits.org specifically test for these kinds of issues.

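The 137MB and 256MB figures from the earlier comments can be reproduced with a few lines of arithmetic. The sketch below rests on assumptions: the helper names are invented here, the BAR sizes are the 8M + 8k + 8k quoted in this thread for both the PF and each VF, rounding up to a power of two is used as a rough stand-in for "natural alignment", and alignment holes between individual BARs are ignored, so a real allocator may need more.

```python
KB = 1024
MB = 1024 * KB


def align_up(value: int, granularity: int) -> int:
    """Round value up to the next multiple of granularity."""
    return (value + granularity - 1) // granularity * granularity


def next_pow2(value: int) -> int:
    """Smallest power of two that is >= value (value must be positive)."""
    return 1 << (value - 1).bit_length()


def sriov_mmio_bytes(pf_bars, vf_bars, num_vfs: int) -> int:
    """Total MMIO for one PF plus num_vfs VFs, ignoring alignment holes."""
    return sum(pf_bars) + num_vfs * sum(vf_bars)


# BAR sizes quoted in this thread for the x3100 (same for PF and each VF).
X3100_BARS = [8 * MB, 8 * KB, 8 * KB]
total = sriov_mmio_bytes(X3100_BARS, X3100_BARS, num_vfs=16)

print(total / MB)                    # 136.265625
print(align_up(total, MB) // MB)     # 137  (1MB bridge granularity)
print(next_pow2(total) // MB)        # 256  (power-of-two aperture)
```

Comparing the 137MB result against the combined memory windows of the parent bridge gives a quick first check of whether a given BIOS left enough room for the VFs.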
Created attachment 480326 [details]
device message

Description of problem:

Version-Release number of selected component (if applicable):
AMD host & Magny-cours & X3100
# uname -r
2.6.32-118.el6.x86_64
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 1
Core(s) per socket: 12
CPU socket(s): 2
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 16
Model: 9
Stepping: 1
CPU MHz: 1900.321
BogoMIPS: 3800.37
Virtualization: AMD-V
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 5118K
NUMA node0 CPU(s): 0,2,4,6,8,10
NUMA node1 CPU(s): 12,14,16,18,20,22
NUMA node2 CPU(s): 13,15,17,19,21,23
NUMA node3 CPU(s): 1,3,5,7,9,11

How reproducible:
100%

Steps to Reproduce:
1. compile REL_2.0.28.21260_LX_SRC-vxge and install vxge.ko
# modinfo vxge
filename: /lib/modules/2.6.32-118.el6.x86_64/updates/drivers/net/vxge/vxge.ko
description: Neterion's X3100 Series 10GbE PCIe I/OVirtualized Server Adapter
version: 2.0.28.21260-p3.0.1.2
license: Dual BSD/GPL
srcversion: F0FEAD03AAAD881453187F7
alias: pci:v000017D5d00005833sv*sd*bc*sc*i*
alias: pci:v000017D5d00005733sv*sd*bc*sc*i*
depends:
vermagic: 2.6.32-118.el6.x86_64 SMP mod_unload modversions
parm: intr_type:int
parm: vlan_tag_strip:int
parm: promisc_en:int
parm: promisc_all_en:int
parm: rec_all_vid:int
parm: max_config_vpath:int
parm: max_mac_vpath:int
parm: max_config_dev:int
parm: func_mode:int
parm: fw_upgrade:int
parm: factory_default:int
parm: port_mode:int
parm: port_behavior:int
parm: l2_switch:int
parm: catch_basin_mode:int
parm: port_failure:int
parm: bw:array of int
parm: tx_bw:array of int
parm: rx_bw:array of int
parm: priority:array of int
parm: napi:int
parm: lro:int
parm: rx_steering_type:int
parm: tx_steering_type:int
parm: tx_pause_enable:int
parm: rx_pause_enable:int
parm: exec_mode:int
parm: intr_adapt:int
parm: udp_stream:int

2. generate vfs by:
# modprobe -r vxge;modprobe vxge func_mode=2
func_mode:
Changes the PCI function mode.
0 - SF1_VP17 (1 function, 17 Vpaths)
1 - MF8_VP2 (8 functions, 2 Vpaths each)
2 - SR17_VP1 (17 VFs with 1 Vpath each)
3 - MR17_VP1 (17 Virtual Hierarchies, 1 Vpath/Function/Hierarchy)
4 - MR8_VP2 (8 Virtual Hierarchies, 2 Vpath/Function/Hierarchy)
5 - MF17_VP1 (17 functions, 1 vpath each (PCIe ARI))
6 - SR8_VP2 (1PF, 7VF, 2 Vpaths each)
7 - SR4_VP4 (1PF, 3VF, 4 Vpaths each)
8 - MF2_VP8 (2 functions, 8 Vpaths each)
9 - MF4_VP4 (4 Functions, 4 Vpaths each)
10 - MR4_VP4 (4 Virtual Hierarchies, 4 Vpaths/Function/Hierarchy)

Actual results:
fail to generate any vf.

Expected results:
x3100 can work with the mode I give.

Additional info:
# lspci -vvv -t
-+-[0000:20]-+-00.0 ATI Technologies Inc RD890 Northbridge only dual slot (2x8) PCI-e GFX Hydra part
 |           +-00.2 ATI Technologies Inc Device 5a23
 |           +-02.0-[21]--
 |           +-03.0-[22]--
 |           \-0b.0-[23]--
 \-[0000:00]-+-00.0 ATI Technologies Inc RD890 PCI to PCI bridge (external gfx0 port A)
             +-00.2 ATI Technologies Inc Device 5a23
             +-02.0-[01]--+-00.0 Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
             |            \-00.1 Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
             +-03.0-[02]--+-00.0 Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
             |            \-00.1 Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet
             +-04.0-[03-08]----00.0-[04-08]--+-00.0-[05]----00.0 LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
             |                               +-01.0-[06]--
             |                               +-04.0-[07]--
             |                               \-05.0-[08]--
             +-09.0-[09]----00.0 Exar Corp. X3100 Series 10 Gigabit Ethernet PCIe (X3100 is here) <<<<<<<-------------------------------------------------

# lspci -vvv -s 00:09.0|grep -i ari
BridgeCtl: Parity+ SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ ARIFwd+
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- ARIFwd+

# dmesg |grep -i sriov
eth4: SRIOV 17 - 17 VF, 1 vpath per VF Enabled

# dmesg |grep -i vxge
vxge 0000:09:00.0: eth50: Link Down
vxge 0000:09:00.0: PCI INT A disabled
vxge: Unknown parameter `func_mode'
vxge: Copyright(c) 2002-2010 Exar Inc.
vxge: Driver version: 2.0.28.21260-p3.0.1.2
vxge 0000:09:00.0: PCI INT A -> Link[LN48] -> GSI 48 (level, high) -> IRQ 48
vxge 0000:09:00.0: setting latency timer to 64
vxge 0000:09:00.0: not enough MMIO resources for SR-IOV