Bug 1457662
Summary: | Windows guest cannot boot with interrupt remapping (VT-d) | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Peter Xu <peterx> | ||||
Component: | qemu-kvm-rhev | Assignee: | Amnon Ilan <ailan> | ||||
Status: | CLOSED ERRATA | QA Contact: | jingzhao <jinzhao> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 7.4 | CC: | ailan, chayang, hhuang, jinzhao, juzhang, knoel, michen, mrezanin, peterx, pezhang, virt-maint, xfu, yfu | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | qemu-kvm-rhev-2.10.0-1.el7 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2018-04-11 00:23:04 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Peter Xu
2017-06-01 06:36:44 UTC
From the debug tracing, it's strange that the Windows guest firstly set IQ_TAIL to 2: (vtd)vtd_handle_iqt_write: set iq tail 2 Then it tries to enable QI: (vtd)vtd_handle_gcmd_qie: error: can't enable Queued Invalidation: tail 2 While this is forbidden by VT-d spec (chap 6.5.2): """ To enable queued invalidations, software must: Ensure all invalidation requests submitted to hardware through the register-based invalidation registers are completed. (i.e. no pending invalidation requests in hardware). - Initialize the Invalidation Queue Tail Register (see Section 10.4.22) to zero. - Setup the IQ address and size through the Invalidation Queue Address Register (see Section 10.4.23). - Enable the queued invalidation interface through the Global Command Register (see Section 10.4.4). When enabled, hardware sets the QIES field in the Global Status Register (see Section 10.4.5). """ I am not familiar with how should we move on with this, since from what I captured it seems a Windows bug. Peter Amnon, Since it might be a Windows OS bug, do you have suggestion on how we should proceed with it? Thanks in adnvance. I was able to reproduce this and can only confirm Peter's analysis in comment 2. It seems to work fine on Windows 10 so reporting this to Microsoft is probably pointless. Windows 10 trace (excerpt): (vtd)vtd_handle_iqt_write: set iq tail 0 ... (vtd)vtd_handle_gcmd_qie: Queued Invalidation Enable on (vtd)vtd_handle_gcmd_qie: DMAR_IQA_REG 0x10d000 (vtd)vtd_handle_gcmd_qie: Invalidation Queue addr 0x10d000 size 256 (vtd)vtd_handle_gcmd_sirtp: set Interrupt Remap Table Pointer (vtd)vtd_interrupt_remap_table_setup: int remap table addr 0x10e000 size 256 ... (vtd)vtd_handle_iqt_write: set iq tail 1 Peter, would you be fine with making QEMU tolerant to the incorrect order of register writes performed by older Windows? As far as I can tell, we currently silently fail the attempt to enable queued invalidations so we're not going to lose anything by being more permissive. I can successfully boot Windows Server 2012 R2 with this patch: --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1477,13 +1477,15 @@ static inline bool vtd_queued_inv_disable_check(IntelIOMMUState *s) (s->iq_last_desc_type == VTD_INV_DESC_WAIT); } +static void vtd_fetch_inv_desc(IntelIOMMUState *s); + static void vtd_handle_gcmd_qie(IntelIOMMUState *s, bool en) { uint64_t iqa_val = vtd_get_quad_raw(s, DMAR_IQA_REG); VTD_DPRINTF(INV, "Queued Invalidation Enable %s", (en ? "on" : "off")); if (en) { - if (vtd_queued_inv_enable_check(s)) { + if (true) { s->iq = iqa_val & VTD_IQA_IQA_MASK; /* 2^(x+8) entries */ s->iq_size = 1UL << ((iqa_val & VTD_IQA_QS) + 8); @@ -1493,9 +1495,12 @@ static void vtd_handle_gcmd_qie(IntelIOMMUState *s, bool en) s->iq, s->iq_size); /* Ok - report back to driver */ vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_QIES); - } else { - VTD_DPRINTF(GENERAL, "error: can't enable Queued Invalidation: " - "tail %"PRIu16, s->iq_tail); + } + + if (s->iq_tail != 0 && + !(vtd_get_long_raw(s, DMAR_FSTS_REG) & VTD_FSTS_IQE)) { + VTD_DPRINTF(INV, "iq tail is already set, processing queue"); + vtd_fetch_inv_desc(s); } } else { if (vtd_queued_inv_disable_check(s)) { Created attachment 1288050 [details]
Proof-of-concept fix
Attaching the patch.
(In reply to Ladi Prosek from comment #11) > I was able to reproduce this and can only confirm Peter's analysis in > comment 2. It seems to work fine on Windows 10 so reporting this to > Microsoft is probably pointless. Windows 10 trace (excerpt): > > (vtd)vtd_handle_iqt_write: set iq tail 0 > ... > (vtd)vtd_handle_gcmd_qie: Queued Invalidation Enable on > (vtd)vtd_handle_gcmd_qie: DMAR_IQA_REG 0x10d000 > (vtd)vtd_handle_gcmd_qie: Invalidation Queue addr 0x10d000 size 256 > (vtd)vtd_handle_gcmd_sirtp: set Interrupt Remap Table Pointer > (vtd)vtd_interrupt_remap_table_setup: int remap table addr 0x10e000 size 256 > ... > (vtd)vtd_handle_iqt_write: set iq tail 1 > > > Peter, would you be fine with making QEMU tolerant to the incorrect order of > register writes performed by older Windows? As far as I can tell, we > currently silently fail the attempt to enable queued invalidations so we're > not going to lose anything by being more permissive. > > I can successfully boot Windows Server 2012 R2 with this patch: > > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -1477,13 +1477,15 @@ static inline bool > vtd_queued_inv_disable_check(IntelIOMMUState *s) > (s->iq_last_desc_type == VTD_INV_DESC_WAIT); > } > > +static void vtd_fetch_inv_desc(IntelIOMMUState *s); > + > static void vtd_handle_gcmd_qie(IntelIOMMUState *s, bool en) > { > uint64_t iqa_val = vtd_get_quad_raw(s, DMAR_IQA_REG); > > VTD_DPRINTF(INV, "Queued Invalidation Enable %s", (en ? "on" : "off")); > if (en) { > - if (vtd_queued_inv_enable_check(s)) { > + if (true) { > s->iq = iqa_val & VTD_IQA_IQA_MASK; > /* 2^(x+8) entries */ > s->iq_size = 1UL << ((iqa_val & VTD_IQA_QS) + 8); > @@ -1493,9 +1495,12 @@ static void vtd_handle_gcmd_qie(IntelIOMMUState *s, > bool en) > s->iq, s->iq_size); > /* Ok - report back to driver */ > vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_QIES); > - } else { > - VTD_DPRINTF(GENERAL, "error: can't enable Queued Invalidation: " > - "tail %"PRIu16, s->iq_tail); > + } > + > + if (s->iq_tail != 0 && > + !(vtd_get_long_raw(s, DMAR_FSTS_REG) & VTD_FSTS_IQE)) { > + VTD_DPRINTF(INV, "iq tail is already set, processing > queue"); > + vtd_fetch_inv_desc(s); > } > } else { > if (vtd_queued_inv_disable_check(s)) { Hello, Ladi, Good to know that this patch works! I verified it on my buggy windows. It boots well. However I would suggest we use a property for it, rather than allowing it by default. Like, can we introduce a x-buggy-qtail bit for this? Then old windows users can turn this on if they want to use it. Also, new Windows users won't be affected as well even without this bit. That's good news. How do you think? (If you like, you can post a patch directly for it upstream) Thanks! Peter Hi Peter, (In reply to Peter Xu from comment #15) > Hello, Ladi, > > Good to know that this patch works! > > I verified it on my buggy windows. It boots well. > > However I would suggest we use a property for it, rather than allowing it by > default. Like, can we introduce a x-buggy-qtail bit for this? Then old > windows users can turn this on if they want to use it. The problem with adding a new property is that we would also need to add support for it to libvirt and RHV/OpenStack and they would need to set it if they know that the guest is running affected versions of Windows. That's unnecessarily complex. Unless there is a real concern that changing the behavior can break someone, I would advise against it. I think that in general QEMU often chooses to emulate real hardware's quirks, even if they are in conflict with the spec. I'm assuming that real chips do allow the formally incorrect order of operations, otherwise Windows would be very broken -- would it be easy for us to verify that? Thanks! > Also, new Windows users won't be affected as well even without this bit. > That's good news. > > How do you think? > > (If you like, you can post a patch directly for it upstream) > > Thanks! > > Peter (In reply to Ladi Prosek from comment #16) > Hi Peter, > > (In reply to Peter Xu from comment #15) > > Hello, Ladi, > > > > Good to know that this patch works! > > > > I verified it on my buggy windows. It boots well. > > > > However I would suggest we use a property for it, rather than allowing it by > > default. Like, can we introduce a x-buggy-qtail bit for this? Then old > > windows users can turn this on if they want to use it. > > The problem with adding a new property is that we would also need to add > support for it to libvirt and RHV/OpenStack and they would need to set it if > they know that the guest is running affected versions of Windows. > > That's unnecessarily complex. Unless there is a real concern that changing > the behavior can break someone, I would advise against it. > > I think that in general QEMU often chooses to emulate real hardware's > quirks, even if they are in conflict with the spec. I'm assuming that real > chips do allow the formally incorrect order of operations, otherwise Windows > would be very broken -- would it be easy for us to verify that? Thanks! Makes sense to me. Then let's do what you suggested for this one. Thanks, Peter Posted a polished version of the patch upstream: https://lists.nongnu.org/archive/html/qemu-devel/2017-06/msg03836.html Committed upstream as 8991c460be5a0811194fd4d2b49ba7146a23526b Reproduce the issue on qemu-kvm-rhev-2.9.0-16.el7_4.9 and verified it on qemu-kvm-rhev-2.10.0-1.el7.x86_64 According to above test result, changed the bz to verified Thanks Jing Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1104 |