Bug 1942362
Field | Value
---|---
Summary | Live migration with iommu from rhel8.3.1 to rhel8.4 fails: qemu-kvm: get_pci_config_device: Bad config data
Product | Red Hat Enterprise Linux Advanced Virtualization
Reporter | Pei Zhang <pezhang>
Component | qemu-kvm
Assignee | jason wang <jasowang>
qemu-kvm sub component | Live Migration
QA Contact | Pei Zhang <pezhang>
Status | CLOSED ERRATA
Severity | high
Priority | high
CC | chayang, dgilbert, ehadley, jasowang, jinzhao, juzhang, mrezanin, smitterl, virt-maint, yanghliu
Version | 8.4
Keywords | Triaged
Target Milestone | rc
Target Release | 8.4
Hardware | x86_64
OS | Linux
Fixed In Version | qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207
Doc Type | If docs needed, set a value
Last Closed | 2021-05-25 06:48:26 UTC
Bug Blocks | 1948358
Description
Pei Zhang
2021-03-24 09:21:34 UTC
Can you attach a:
  sudo lspci -vvv

from booting the same VM on 8.3.1 and 8.4.0 please.

Created attachment 1767584
lspci -vvv on rhel8.4

Created attachment 1767585
lspci -vvv on rhel8.3.1
(In reply to Dr. David Alan Gilbert from comment #1)
> Can you attach a:
>   sudo lspci -vvv
>
> from booting the same VM on 8.3.1 and 8.4.0 please.

Hello David,

Please see Comment 2 and Comment 3; they are the "lspci -vvv" output from the same rhel8.4 VM booted on the rhel8.4 host and on the rhel8.3.1 host.

Best regards,
Pei

Interesting; I don't think that's showing me quite what I need, but it is showing me that the index range is within the 'address translation service', which is the bit to do with IOMMU. The only difference between those two lspci outputs is some link status:

8.3.1:
00:04.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
        LnkSta: Speed 16GT/s (ok), Width x32 (ok)
>>              TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
        SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
                Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
        SltCtl: Enable: AttnBtn+ PwrFlt- MRL- PresDet- CmdCplt+ HPIrq+ LinkChg-
>>              Control: AttnInd On, PwrInd Off, Power+ Interlock-
        SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
                Changed: MRL- PresDet- LinkState-

01:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
                TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-

vs 8.4:
        LnkSta: Speed 16GT/s (ok), Width x32 (ok)
>>              TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
                Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
        SltCtl: Enable: AttnBtn+ PwrFlt- MRL- PresDet- CmdCplt+ HPIrq+ LinkChg-
>>              Control: AttnInd Off, PwrInd Off, Power+ Interlock-
        SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
                Changed: MRL- PresDet- LinkState-

01:00.0 SCSI storage controller: Red Hat, Inc. Virtio block device (rev 01)
        LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>>              TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-

It doesn't seem to be complaining about that though:

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0

That i=0x104 is a much bigger index than the Express endpoint, which is 0x40-0x7b in that output; both show:

        Capabilities: [100 v1] Address Translation Service (ATS)
                ATSCap: Invalidate Queue Depth: 00
                ATSCtl: Enable-, Smallest Translation Unit: 00

So since this starts at 0x100, and it's IOMMU related, that's probably where the 0x104 difference is; unfortunately lspci isn't decoding/displaying it.

Now that we know it's in ATS, looking in the ATS spec for:

(qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0

offset 04 is 'ATS Control Register | ATS Capability Register', and bit 5 of the ATS Capability register (2^5=0x20) is 'Page aligned request'.

This looks like it's qemu commit:

commit 4c70875372b821b045e84f462466a5c04b091ef5
Author: Jason Wang <jasowang>
Date:   Wed Sep 9 16:17:31 2020 +0800

    pci: advertise a page aligned ATS

    After Linux kernel commit 61363c1474b1 ("iommu/vt-d: Enable ATS only
    if the device uses page aligned address."), ATS will be only enabled
    if device advertises a page aligned request.

    Unfortunately, vhost-net is the only user and we don't advertise the
    aligned request capability in the past since both vhost IOTLB and
    address_space_get_iotlb_entry() can support non page aligned request.

    Though it's not clear that if the above kernel commit makes
    sense. Let's advertise a page aligned ATS here to make vhost device
    IOTLB work with Intel IOMMU again.

    Note that in the future we may extend pcie_ats_init() to accept
    parameters like queue depth and page alignment.

    Cc: qemu-stable
    Signed-off-by: Jason Wang <jasowang>
    Message-Id: <20200909081731.24688-1-jasowang>
    Reviewed-by: Peter Xu <peterx>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

I've posted a patch upstream to fix this. Please review.

Thanks

Assigning to Jason directly since he's working on the issue. A determination will need to be made as to "how" or "if" the backport will be necessary for RHEL-AV 8.4.0, since we're rather late in the release cycle.

Jason's qemu commit fix landed upstream:

commit d83f46d189a26fa32434139954d264326f199a45
Author: Jason Wang <jasowang>
Date:   Tue Apr 6 12:03:30 2021 +0800

    virtio-pci: compat page aligned ATS

    Commit 4c70875372b8 ("pci: advertise a page aligned ATS") advertises
    the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU
    drivers since 5.2. But it forgot the compat the capability which
    breaks the migration from old machine type:

    (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read:
    0 device: 20 cmask: ff wmask: 0 w1cmask:0

    This patch introduces a new parameter "x-ats-page-aligned" for
    virtio-pci device and turns it on for machine type which is newer than
    5.1.

Verification:

Versions:
8.3.1 host:
4.18.0-240.22.1.el8_3.x86_64
qemu-kvm-5.1.0-21.module+el8.3.1+10464+8ad18d1a.x86_64

8.4 host:
4.18.0-304.el8.x86_64
qemu-img-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64

VM: rhel8.4

Following the steps in the Description, 5 ping-pong migrations all work well, so this bug has been fixed. Will move to Verified directly once on_qa.

Moving to Verified per Comment 22.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098
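
As an illustration of the analysis in the comments above, the "Bad config data" line can be decoded mechanically. The following Python sketch is not part of qemu or of this bug's fix. The helper names (`parse_bad_config`, `is_incompatible`, `differing_bits`) are made up for this example, and the mismatch test is a simplified model that assumes a config byte is rejected when source and destination differ in bits that are compared (cmask) and not guest-writable (wmask, w1cmask). The ATS base offset 0x100 is taken from the lspci output quoted above.

```python
import re

# From the lspci output: "Capabilities: [100 v1] Address Translation Service (ATS)"
ATS_CAP_BASE = 0x100
# Per the analysis above: bit 5 of the ATS Capability register is
# 'Page aligned request'.
ATS_PAGE_ALIGNED_BIT = 5

ERROR_LINE = ("qemu-kvm: get_pci_config_device: Bad config data: "
              "i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0")


def parse_bad_config(line):
    """Extract the hex fields from a 'Bad config data' message (hypothetical helper)."""
    pattern = (r"i=0x([0-9a-f]+) read: ([0-9a-f]+) device: ([0-9a-f]+) "
               r"cmask: ([0-9a-f]+) wmask: ([0-9a-f]+) w1cmask:\s*([0-9a-f]+)")
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a 'Bad config data' line")
    names = ("index", "read", "device", "cmask", "wmask", "w1cmask")
    return dict(zip(names, (int(group, 16) for group in match.groups())))


def is_incompatible(fields):
    """Simplified model (an assumption): mismatch in compared, non-writable bits."""
    diff = fields["read"] ^ fields["device"]
    return bool(diff & fields["cmask"] & ~fields["wmask"] & ~fields["w1cmask"] & 0xFF)


def differing_bits(fields):
    """List the bit positions where the incoming byte and the device byte differ."""
    diff = (fields["read"] ^ fields["device"]) & 0xFF
    return [bit for bit in range(8) if diff & (1 << bit)]


fields = parse_bad_config(ERROR_LINE)
offset_in_cap = fields["index"] - ATS_CAP_BASE

print("offset within ATS capability:", hex(offset_in_cap))
print("incompatible:", is_incompatible(fields))
print("differing bits:", differing_bits(fields))
```

For the error line in this bug, the byte sits at offset 0x4 inside the ATS capability, and the only differing bit is bit 5. That matches the conclusion above: the 8.3.1 source sends 0 where the 8.4 destination device has the 'Page aligned request' bit set.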