Bug 1434747
| Summary: | [Q35] code12 error when hotplug XL710 device in win2016 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | jingzhao <jinzhao> |
| Component: | ovmf | Assignee: | Laszlo Ersek <lersek> |
| Status: | CLOSED ERRATA | QA Contact: | FuXiangChun <xfu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.4 | CC: | alex.williamson, chayang, jinzhao, juzhang, knoel, kraxel, lersek, marcel, michen, mrezanin, virt-maint, xfu, yfu |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ovmf-20171011-1.git92d07e48907f.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-04-10 16:28:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1437113, 1469787 | | |
| Bug Blocks: | 1473046 | | |
| Attachments: | | | |

(CC Marcel, Gerd, Alex; see also 1434740, which I just reported too)

Thanks for reporting this; indeed OVMF's resource reservation for hotplug purposes is something that is rudimentary at the moment, and we should define and design the requirements in a lot more detail than what exists now.

* Specific comments:

Based on the error message captured in attachment 1265344 [details], I believe that the x710 NIC might require more than 2MB of MMIO space. OVMF presently reserves only 2MB of MMIO for each bridge, for hotplug purposes, and that doesn't seem to suffice for the X710.

To confirm, please attach the following:
- the OVMF debug log (please never forget about this!)
- "lspci -v -v -v" from the host, for the x710 NIC (before assignment)

Also, does the assigned x710 work if you cold-plug it, before booting the guest? (It should.)

* Generic comments:

Marcel's PCI Express guidelines in the QEMU tree (docs/pcie.txt) dictate *some* requirements for hotplug-oriented resource reservation, but nothing specific for PCI Express hotplug. I raised that as an issue while review was on-going: see point {26} in

https://www.mail-archive.com/qemu-devel@nongnu.org/msg405651.html

On 10/14/16 13:36, Laszlo Ersek wrote:
> {26} Another remark (important to me) in this section: the document
> doesn't state firmware expectations. It's clear the firmware is expected
> to reserve no IO space for PCI Express Downstream Ports and Root Ports,
> but what about MMIO?
>
> We discussed this at length with Alex, but I think we didn't conclude
> anything. It would be nice if firmware received some instructions from
> this document in this regard, even before we implement our own ports and
> bridges in QEMU.
>
> <digression>
>
> If we think such recommendations are out of scope at this point, *and*
> noone disagrees strongly (Gerd?), then I could add some experimental
> fw_cfg knobs to OVMF for this, such as (units in MB):
>
> -fw_cfg opt/org.tianocore.ovmf/X-ReservePciE/PrefMmio32Mb,string=...
> -fw_cfg opt/org.tianocore.ovmf/X-ReservePciE/NonPrefMmio32Mb,string=...
> -fw_cfg opt/org.tianocore.ovmf/X-ReservePciE/PrefMmio64Mb,string=..
> -fw_cfg opt/org.tianocore.ovmf/X-ReservePciE/NonPrefMmio64Mb,string=..
>
> Under this idea, I would reserve no resources at all for Downstream
> Ports and Root Ports in OVMF by default; but users could influence those
> reservations. I think that would be enough to kick things off. It also
> needs no modifications for QEMU.
>
> </digression>

Gerd suggested to postpone the specifics until we develop our own generic port types:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg406096.html

On 10/17/16 14:07, Gerd Hoffmann wrote:
>> {26} Another remark (important to me) in this section: the document
>> doesn't state firmware expectations. It's clear the firmware is expected
>> to reserve no IO space for PCI Express Downstream Ports and Root Ports,
>> but what about MMIO?
>>
>> We discussed this at length with Alex, but I think we didn't conclude
>> anything. It would be nice if firmware received some instructions from
>> this document in this regard, even before we implement our own ports and
>> bridges in QEMU.
>
> Where do we stand in terms of generic pcie ports btw?
>
> I think the plan is still to communicate suggestions to the firmware via
> pci config space, either by using reset defaults of the limit register,
> or of that doesn't work due to initialization order issues using some
> vendor specific pcie capability.
>
> As long as we don't have that there is nothing do document, other than
> maybe briefly mentioning the plans we have and documenting the current
> state (2M mmio in seabios, and I think the same for ovmf).
>
> The patches adding the generic ports can also update the documentation
> of course.
>
>> <digression>
>>
>> If we think such recommendations are out of scope at this point, *and*
>> noone disagrees strongly (Gerd?), then I could add some experimental
>> fw_cfg knobs to OVMF for this, such as (units in MB):
>
> Why? Given that the virtio mmio bar size issue is solved I don't see a
> strong reason to hurry with this. Just wait until the generic ports are
> there.

Thus, the firmware will have to take hints from QEMU about the desired resource reservation, for hotplug, regarding both IO and MMIO space.

Bug 1434740 mentioned above is about IO space only. It is about *not* reserving *any* IO space for PCI Express ports (which will be a good thing).

And in this bug I guess we should figure out the way to communicate MMIO needs (or else figure out a more robust MMIO reservation default). This might need QEMU patches too, similarly to how OVMF bug 1434740 depends on QEMU bug 1344299.

Hi Laszlo

X710 detail info:
[root@dell-per730-29 win]# lspci -vvv -s 04:00.1
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
    Subsystem: Intel Corporation Ethernet Converged Network Adapter XL710-Q2
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 54
    NUMA node: 0
    Region 0: Memory at 90000000 (64-bit, prefetchable) [size=16M]
    Region 3: Memory at 92800000 (64-bit, prefetchable) [size=32K]
    Expansion ROM at 92b80000 [disabled] [size=512K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000 Data: 0000
        Masking: 00000000 Pending: 00000000
    Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
        Vector table: BAR=3 offset=00000000
        PBA: BAR=3 offset=00001000
    Capabilities: [a0] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
            MaxPayload 256 bytes, MaxReadReq 4096 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L0s <2us, L1 <16us
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [e0] Vital Product Data
        Product Name: XL710 40GbE Controller
        Read-only fields:
            [V0] Vendor specific: FFV17.5.3
            [PN] Part number: KF46X
            [MN] Manufacture ID: 31 30 32 38
            [V1] Vendor specific: DSV1028VPDR.VER2.0
            [V3] Vendor specific: DTINIC
            [V4] Vendor specific: DCM10010380C521010380C512020380C523020380C514030380C525030380C516040380C527040380C518050380C529050380C51A060380C52B060380C51C070380C52D070380C51E080380C52F080380C5
            [V5] Vendor specific: NPY2
            [V6] Vendor specific: PMTA
            [V7] Vendor specific: NMVIntel Corp
            [V8] Vendor specific: L1D0
            [RV] Reserved: checksum good, 1 byte(s) reserved
        Read/write fields:
            [Y1] System specific: CCF1
        End
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
    Capabilities: [140 v1] Device Serial Number 88-90-15-ff-ff-fe-fd-3c
    Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
        IOVCap: Migration-, Interrupt Message Number: 000
        IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy-
        IOVSta: Migration-
        Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 01
        VF offset: 79, stride: 1, Device ID: 154c
        Supported Page Size: 00000553, System Page Size: 00000001
        Region 0: Memory at 0000000092000000 (64-bit, prefetchable)
        Region 3: Memory at 0000000092810000 (64-bit, prefetchable)
        VF Migration: offset: 00000000, BIR: 0
    Capabilities: [1a0 v1] Transaction Processing Hints
        Device specific mode supported
        No steering table available
    Capabilities: [1b0 v1] Access Control Services
        ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
    Kernel driver in use: i40e
    Kernel modules: i40e

Hi Laszlo

1. Guest worked well when booting with x710 device
2. OVMF log, please check the attachment.

Thanks
Jing

Created attachment 1265550 [details]
ovmf log
> Region 0: Memory at 90000000 (64-bit, prefetchable) [size=16M]
Yep, larger than the default 2M window size, so this is the root cause.
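
The size comparison above can be illustrated with a short Python sketch. It is not part of the bug report: the helper names are hypothetical, and the regular expression only assumes the `Region N: Memory at ... [size=...]` line shape seen in the lspci output in this report. It lists memory BARs that are individually larger than a given window (2 MB by default, the reservation discussed here).

```python
import re

# Multipliers for the size suffixes that appear in "[size=...]" tags.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Matches lines such as:
#   Region 0: Memory at 90000000 (64-bit, prefetchable) [size=16M]
_REGION = re.compile(
    r"Region \d+: Memory at [0-9a-f]+ \(([^)]*)\) \[size=(\d+)([KMG]?)\]"
)


def memory_bar_sizes(lspci_text):
    """Return (attributes, size_in_bytes) for each memory BAR that has a size tag."""
    return [
        (attrs, int(number) * _UNITS[unit])
        for attrs, number, unit in _REGION.findall(lspci_text)
    ]


def oversized_bars(lspci_text, window_bytes=2 * 1024 ** 2):
    """Return the BARs that cannot fit into a window of window_bytes on their own."""
    return [(a, s) for a, s in memory_bar_sizes(lspci_text) if s > window_bytes]


sample = """\
Region 0: Memory at 90000000 (64-bit, prefetchable) [size=16M]
Region 3: Memory at 92800000 (64-bit, prefetchable) [size=32K]
"""
print(oversized_bars(sample))
```

A BAR that exceeds the window on its own is enough to explain a failed hot-plug; the reverse does not hold, since the combined size and alignment of all BARs also matter, which this sketch does not model.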
So, now that the generic ports are there we can go on to figure out how to handle this best. I still think the best way to communicate window size hints would be to use a vendor specific pci capability (instead of setting the desired size on reset). The information will always be available then and we don't run into initialization order issues.
Right, what Gerd said. Here's the reservation from the OVMF log (comment 6):

PciBus: Discovered PPB @ [00|03|00]
GetResourcePadding: Address=00:03.0 DevicePath=PciRoot(0x0)/Pci(0x3,0x0)
Padding: Type = Mem64; Alignment = 0x1FFFFF; Length = 0x200000
Padding: Type = Io; Alignment = 0x1FF; Length = 0x200

(In reply to Gerd Hoffmann from comment #7)
> So, now that the generic ports are there we can go on figure how to handle
> this best. I still think the best way to communicate window size hints
> would be to use a vendor specific pci capability (instead of setting the
> desired size on reset). The information will always be available then and
> we don't run into initialization order issues.

This seems good to me -- I can't promise 100% without actually trying, but I think I should be able to parse the capability list in config space for this hint, in the GetResourcePadding() callback.

I propose that we try to handle this issue "holistically", together with bug 1434740. We need a method that provides controls for both IO and MMIO:

- For IO, we need a mechanism that can prevent *both* firmware *and* Linux from reserving IO for PCI Express ports. I think Marcel's approach in bug 1344299 is sufficient, i.e., set the IO base/limit registers of the bridge to 0 for disabling IO support. And, if not disabled, just go with the default 4KB IO reservation (for both PCI Express ports and legacy PCI bridges, as the latter is documented in the guidelines).

- For MMIO, the vendor specific capability structure should work something like this:

  - if the capability is missing, reserve 2MB, 32-bit, non-prefetchable,

  - otherwise, the capability structure should consist of 3 fields (reservation sizes):
    - uint32_t non_prefetchable_32,
    - uint32_t prefetchable_32,
    - uint64_t prefetchable_64,

  - of prefetchable_32 and prefetchable_64, at most one may be nonzero (they are mutually exclusive, and they can be both zero),

  - whenever a field is 0, that kind of reservation is not needed.
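
The MMIO rules proposed above can be sketched in Python as follows. This illustrates the proposal in this comment only, not the capability format that QEMU eventually defined. The byte layout (three little-endian fields packed back to back) and the names `HINT_LAYOUT` and `decode_reservation_hint` are assumptions made for the example.

```python
import struct

# Assumed (hypothetical) layout of the three proposed fields:
# uint32 non_prefetchable_32, uint32 prefetchable_32, uint64 prefetchable_64,
# little-endian like the rest of PCI config space, with no padding.
HINT_LAYOUT = struct.Struct("<IIQ")

# Fallback from the proposal when the capability is missing: 2MB, 32-bit,
# non-prefetchable.
DEFAULT_NON_PREFETCHABLE_32 = 2 * 1024 * 1024


def decode_reservation_hint(payload):
    """Decode the proposed hint; pass None when the capability is absent."""
    if payload is None:
        return {
            "non_prefetchable_32": DEFAULT_NON_PREFETCHABLE_32,
            "prefetchable_32": 0,
            "prefetchable_64": 0,
        }
    non_pref_32, pref_32, pref_64 = HINT_LAYOUT.unpack(payload)
    # At most one of the two prefetchable reservations may be nonzero.
    if pref_32 and pref_64:
        raise ValueError("prefetchable_32 and prefetchable_64 are mutually exclusive")
    # A value of 0 means that kind of reservation is not needed.
    return {
        "non_prefetchable_32": non_pref_32,
        "prefetchable_32": pref_32,
        "prefetchable_64": pref_64,
    }


# Example: request only a 16 MB 64-bit prefetchable reservation.
hint = decode_reservation_hint(HINT_LAYOUT.pack(0, 0, 16 * 1024 * 1024))
print(hint)
```

Rejecting a payload with both prefetchable fields set is one possible reading of "mutually exclusive"; firmware could equally choose to ignore one of the two.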
(In reply to Laszlo Ersek from comment #10)
> Marcel, should we file a separate RHBZ for this QEMU feature (comment 7,
> comment 8)? Thanks!

Hi Laszlo,

Definitely a separate BZ! It's strange we don't have one yet, I am sure I had one...
Anyway, to be on the safe side: https://bugzilla.redhat.com/show_bug.cgi?id=1437113

Thanks,
Marcel

posted upstream series:
[edk2] [PATCH 0/7] OvmfPkg/PciHotPlugInitDxe: obey PCI resource reservation
hints from QEMU
https://lists.01.org/pipermail/edk2-devel/2017-September/015296.html
Message-Id: <20170925195824.10866-1-lersek>
(In reply to Laszlo Ersek from comment #14)
> posted upstream series:
>
> [edk2] [PATCH 0/7] OvmfPkg/PciHotPlugInitDxe: obey PCI resource reservation
> hints from QEMU
> https://lists.01.org/pipermail/edk2-devel/2017-September/015296.html
> Message-Id: <20170925195824.10866-1-lersek>

1 8844f15d33c7 MdePkg/IndustryStandard/Pci23: add vendor-specific capability header
2 bdf73b57f283 OvmfPkg/IndustryStandard: define PCI Capabilities for QEMU's PCI Bridges
3 91231fc2ff2b OvmfPkg/PciHotPlugInitDxe: clean up protocol usage comment
4 c18ac9fbcc71 OvmfPkg/PciHotPlugInitDxe: clean up addr. range for non-prefetchable MMIO
5 a980324709b1 OvmfPkg/PciHotPlugInitDxe: generalize RESOURCE_PADDING composition
6 4776d5cb3abf OvmfPkg/PciHotPlugInitDxe: add helper functions for setting up paddings
7 fe4049471bdf OvmfPkg/PciHotPlugInitDxe: translate QEMU's resource reservation hints

Note to virt-QE:

The upstream OVMF patches don't automatically solve the issue reported in this BZ. Instead, the QEMU and OVMF patches together make it possible to (a) disable IO space reservation and (b) correctly size the MMIO-64 reservation for PCI Express Root Ports. This ensures both that a large number of PCI Express devices, without IO BARs, can be cold-plugged (IO space will not be exhausted) into a set of such root ports, and that PCI Express devices with no IO BARs, and large MMIO-64 BARs, can be hot-plugged into a set of such root ports.

The relevant QEMU command line option / properties are:

  -device pcie-root-port,id=root-port-N,pref64-reserve=size,io-reserve=0

and then cold- or hot-plug the device in question into "root-port-N".

I'm unsure if libvirt exposes the new pcie-root-port properties in the domain XML.
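
For test scripts that assemble the QEMU command line, the option above could be generated by a small helper like the one below. This is only a sketch: the function name `root_port_args` and its defaults are hypothetical, while the `pcie-root-port` properties `pref64-reserve` and `io-reserve` are the ones named in the note above.

```python
def root_port_args(port_id, pref64_reserve, io_reserve="0", bus="pcie.0"):
    """Build the argv fragment for one pcie-root-port with reservation hints.

    pref64_reserve is passed through as a size string such as "16M".
    """
    properties = [
        "pcie-root-port",
        f"bus={bus}",
        f"id={port_id}",
        f"io-reserve={io_reserve}",
        f"pref64-reserve={pref64_reserve}",
    ]
    return ["-device", ",".join(properties)]


# A root port sized for a device with a 16M 64-bit prefetchable BAR.
print(" ".join(root_port_args("root3", "16M")))
```

Returning a list rather than a single string keeps the fragment easy to append to an argv list passed to `subprocess.run` without shell quoting.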
Reproduced the issue with OVMF-20171011-1.git92d07e48907f.el7.noarch.rpm according to comment 0

Verified the issue with OVMF-20171011-4.git92d07e48907f.el7.noarch.rpm & qemu-kvm-rhev-2.10.0-12.el7.x86_64 & kernel-3.10.0-820.el7.x86_64 according to comment 21

qemu command line:
/usr/libexec/qemu-kvm \
-machine q35,smm=on,accel=kvm \
-cpu Haswell-noTSX \
-nodefaults -rtc base=utc \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-enable-kvm \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-k en-us \
-nodefaults \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/home/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-serial unix:/tmp/serial0,server,nowait \
-debugcon file:/home/ovmf.log \
-global isa-debugcon.iobase=0x402 \
-boot menu=on \
-qmp tcp:0:6667,server,nowait \
-vga qxl \
-global driver=cfi.pflash01,property=secure,value=on \
-drive file=/home/win2016-ovmf-bk.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
-device pcie-root-port,bus=pcie.0,id=root3,io-reserve=0,pref64-reserve=16M \
-device ahci,id=ahci0 \
-drive file=/usr/share/virtio-win/virtio-win-1.9.3.iso,if=none,id=drive-virtio-disk1,format=raw \
-device ide-cd,drive=drive-virtio-disk1,id=virtio-disk1,bus=ahci0.0 \
-monitor stdio \
-vnc :1 \

Did not hit the code 12 error, please check the attachment

Thanks
Jing

Created attachment 1367100 [details]
screenshot of verified
Created attachment 1367101 [details]
ovmf log of verified
I checked the attachments in comment 24 and comment 25 -- everything looks good; thanks!

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0902
Created attachment 1265344 [details]
screenshot of hotplug nic

Description of problem:
code12 error when hot-plugging x710 device in win2016

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64
kernel-3.10.0-613.el7.x86_64
OVMF-20170228-1.gitc325e41585e3.el7.noarch

How reproducible:
3/3

Steps to Reproduce:
1. Boot guest with qemu command line [1]
2. Hot-plug x710 nic through qmp
{"execute":"device_add","arguments":{"driver":"vfio-pci","host":"04:00.1","id":"pf2","bus":"root3"}}
3. Check the device in win2016 guest

Actual results:
Code 12 error, please check the attachment
[root@dell-per730-29 ~]# ethtool -i p6p2
driver: i40e
version: 1.6.27-k
firmware-version: 5.02 0x80002400 17.5.9
expansion-rom-version:
bus-info: 0000:04:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

Expected results:
Guest works well after hot-plugging the x710 device

Additional info:
[1] qemu line
/usr/libexec/qemu-kvm \
-machine q35,smm=on,accel=kvm \
-cpu Haswell-noTSX \
-nodefaults -rtc base=utc \
-m 4G \
-smp 4,sockets=4,cores=1,threads=1 \
-enable-kvm \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-k en-us \
-nodefaults \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/home/win/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-serial unix:/tmp/serial0,server,nowait \
-debugcon file:/home/win/ovmf.log \
-global isa-debugcon.iobase=0x402 \
-boot menu=on \
-qmp tcp:0:6667,server,nowait \
-vga qxl \
-global driver=cfi.pflash01,property=secure,value=on \
-drive file=/home/win/ovmf-win2016.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
-device ioh3420,bus=pcie.0,id=root3 \
-device ahci,id=ahci0 \
-drive file=/usr/share/virtio-win/virtio-win-1.9.0.iso,if=none,id=drive-virtio-disk1,format=raw \
-device ide-cd,drive=drive-virtio-disk1,id=virtio-disk1,bus=ahci0.0 \
-monitor stdio \
-vnc :1 \

BTW: didn't reproduce the issue on q35 + seabios
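
The hot-plug step in the reproducer can be scripted. The Python sketch below only builds the QMP `device_add` message used in step 2; the function name `device_add_message` is hypothetical, and actually sending the message (for example over the `-qmp tcp:0:6667` socket from the command line above) is left out. A QMP session has to send `{"execute": "qmp_capabilities"}` before any other command is accepted.

```python
import json


def device_add_message(host, device_id, bus, driver="vfio-pci"):
    """Return the QMP device_add command for assigning a host PCI device."""
    command = {
        "execute": "device_add",
        "arguments": {
            "driver": driver,
            "host": host,      # host PCI address of the device to assign
            "id": device_id,   # identifier of the new device inside QEMU
            "bus": bus,        # id of the root port to plug into
        },
    }
    return json.dumps(command)


# Same values as in step 2 of the reproducer.
print(device_add_message("04:00.1", "pf2", "root3"))
```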