Bug 1446147
Summary: | Windows 10 cannot migrate with qxl device | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Guo, Zhiyi <zhguo> |
Component: | qemu-kvm-rhev | Assignee: | Gerd Hoffmann <kraxel> |
Status: | CLOSED WONTFIX | QA Contact: | xianwang <xianwang> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.4 | CC: | ailan, chayang, dblechte, dgilbert, fgieseler, fziglio, juzhang, knoel, kraxel, michen, qzhang, virt-maint, ybendito |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Windows | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-07-12 08:11:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Guo, Zhiyi
2017-04-27 10:31:07 UTC
This was noticed upstream a while ago, it was causing qemu to abort at the time; see: https://bugs.launchpad.net/qemu/+bug/1635339 From qemu commit 86dbcdd9c75 by Gerd: /* * Windows 8 drivers place qxl commands in the vram * (instead of the ram) bar. We can't live migrate such a * guest, so add a migration blocker in case we detect * this, to avoid triggering the assert in pre_save(). * * https://cgit.freedesktop.org/spice/win32/qxl-wddm-dod/commit/?id=f6e099db39e7d0787f294d5fd0dce328b5210faa */ Fixed in upstream https://gitlab.com/spice/qxl-wddm-dod/commit/0214d5ceda3f0da94de3813fc902150d497c6b26 Was it a bug in QEMU or in the QXL-WDDM-DOD driver? IMHO is in Qemu but it's too late to package a new version, so we need to write a workaround in Windows driver. Basically Qemu is testing if the release pointers are in Bar0. WDDM DOD is not supported yet, the plan is to add it to the next RHVM release. moving to 7.5 Answering Amnon's question: from my point of view, the bug is between qxl-wddm-dod and qemu. It is currently fixed in upstream of qxl-wddm-dod and will be in next release of the driver, i.e. 0.17 as soon as we build it. Fixed by 0.17 version of the driver. If virtualization team want to allow migration to work even if resources are in Bar1 they should change Qemu code to allow that. Note that wddm dod is not officially supported by RHEL. (In reply to Frediano Ziglio from comment #8) > Fixed by 0.17 version of the driver. > If virtualization team want to allow migration to work even if resources are > in Bar1 they should change Qemu code to allow that. > Note that wddm dod is not officially supported by RHEL. Gerd? (In reply to Ademar Reis from comment #9) > (In reply to Frediano Ziglio from comment #8) > > Fixed by 0.17 version of the driver. > > If virtualization team want to allow migration to work even if resources are > > in Bar1 they should change Qemu code to allow that. > > Note that wddm dod is not officially supported by RHEL. > > Gerd? Hmm, we can support that in qemu but requires a update to the live migration format, so it would come with quite some overhead for backward compatibility, such as requiring a new machine type. Which implies backporting limitations (no z-stream). Also the guest driver can't just expect this to work as there will be old qemu versions not supporting this deployed for quite a while. Is there a compelling reason to place ressources in bar 1? There are no reason like there are no reasons Qemu doing this limitation beside implementation details. We could decide that resources to be released must be allocated only in Bar0 and put some documentation and enforcement (at least on debug versions or Qemu/spice-server) to detect and prevent such issues in the future. (In reply to Frediano Ziglio from comment #11) > There are no reason like there are no reasons Qemu doing this limitation > beside implementation details. The reason why this limitation in qemu exists basically is "historical reasons", dating back to the days where bar1 didn't exist. Yes, from a design point of view it doesn't make sense. Likewise having the bar0 / bar1 split in the first place doesn't make sense. When re-designing qxl today with the lessons learned in the last years there a quite a few things I would do in a different way ... But you have the memory in bar0, it makes sense to use it, so why not simply continue to place the commands there? You have to do that anyway to support older qemu versions. Either unconditionally (which would be the simplest and IMHO best way), or by detecting support and running different code paths in the driver depending on the qemu version you are running on. Which just increases your test matrix for no good reason. (In reply to Gerd Hoffmann from comment #12) > (In reply to Frediano Ziglio from comment #11) > > There are no reason like there are no reasons Qemu doing this limitation > > beside implementation details. > > The reason why this limitation in qemu exists basically is "historical > reasons", dating back to the days where bar1 didn't exist. Yes, from a > design point of view it doesn't make sense. Likewise having the bar0 / bar1 > split in the first place doesn't make sense. When re-designing qxl today > with the lessons learned in the last years there a quite a few things I > would do in a different way ... > > But you have the memory in bar0, it makes sense to use it, so why not simply > continue to place the commands there? You have to do that anyway to support > older qemu versions. Either unconditionally (which would be the simplest > and IMHO best way), or by detecting support and running different code paths > in the driver depending on the qemu version you are running on. Which just > increases your test matrix for no good reason. As far as I understand you are just rephrasing what we already said. The only good thing of Bar1 compared to Bar0 is that can be quite large as it supports 64 bit and save address space for devices that needs 32 bit space. As already said the switch from Bar0 to Bar1 for the DOD driver did not came with much explanation and reverting this caused not issues. I consider enforcing/documenting the Bar0 allocation as a good debug practice to avoid the issue for future versions but is more of an enhancement than a fix for this issue and should be deserved for a different bug. Personally as this was just an issue with a upstream version not supposed to be supported by RHEL I would close the bug. > The only good thing of Bar1 compared to Bar0 is that can be quite large as > it supports 64 bit and save address space for devices that needs 32 bit > space. Yes. You can store all the bulky data in bar1. Only stuff referenced from the rings directly must be in bar0. Stuff referenced indirectly such as image data chunks attached to commands can be in bar1. And surfaces can be in bar1 anyway, that is what bar1 was originally created for. > I consider enforcing/documenting the Bar0 allocation as a good debug > practice to avoid the issue for future versions but is more of an > enhancement than a fix for this issue and should be deserved for a different > bug. That is easy. qemu already detects that (see commit referenced in comment #2) an registers a migration blocker then. We could also raise an error IRQ and enter guest bug mode. > Personally as this was just an issue with a upstream version not supposed to
> be supported by RHEL I would close the bug.
Closing now. It's fixed in the windows guest driver.
It working fine. I downloaded the latest code from: https://www.spice-space.org/download/windows/qxl-wddm-dod/qxl-wddm-dod-0.18/ and updated the Windows Driver. After that, the SAVE command is working fine for Windows 10. Thank you so much for your support. Kind regards! |