Bug 1446147 - Windows 10 cannot migrate with qxl device
Summary: Windows 10 cannot migrate with qxl device
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: x86_64
OS: Windows
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Gerd Hoffmann
QA Contact: xianwang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-04-27 10:31 UTC by Guo, Zhiyi
Modified: 2018-06-24 22:57 UTC
CC List: 13 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-12 08:11:08 UTC
Target Upstream Version:
Embargoed:


Description Guo, Zhiyi 2017-04-27 10:31:07 UTC
Description of problem:
Windows 10 cannot migrate with qxl device

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-1.el7.x86_64
3.10.0-655.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot Windows 10 with the qemu command line:
/usr/libexec/qemu-kvm -name qxl -m 2G -machine pc,accel=kvm \
        -S \
        -cpu Skylake-Client \
        -smp 1 \
        -monitor stdio \
        -device qxl-vga \
        -device ich9-intel-hda \
        -device hda-duplex \
        -drive file=win10x64.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
        -device ide-drive,drive=drive-scsi-disk0 \
        -netdev tap,id=idinWyYp,vhost=on \
        -device e1000,mac=42:ce:a9:d2:4d:d7,id=idlbq7eA,netdev=idinWyYp \
        -qmp tcp:0:4444,server,nowait \
        -serial unix:/tmp/console1,server,nowait \
        -spice port=5900,disable-ticketing \
        -device virtio-serial-pci,id=virtio-serial1 \
        -chardev spicevmc,id=charchannel0,name=vdagent \
        -device virtserialport,bus=virtio-serial1.0,nr=3,chardev=charchannel0,id=channel0,name=com.redhat.spice.0

2. Install the qxl Windows driver (https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=539376) and migrate the guest to another host.

Actual results:
Migration fails to start, with the error:
(qemu) migrate -d tcp:10.66.8.189:9999
qxl: guest bug: command not in ram bar


Expected results:
Migration succeeds.

Additional info:
After uninstalling the qxl driver, migration starts and finishes without error.
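
The reproducer above opens a QMP socket with "-qmp tcp:0:4444,server,nowait", so the migration attempt could also be driven from a script instead of the HMP prompt. The following is only a sketch under stated assumptions: the helper names (qmp_message, migration_requests, read_reply, run) are made up for illustration, the destination URI is the one from "Actual results", and nothing here was run against the QEMU build from this report. It relies on the standard QMP commands "qmp_capabilities", "migrate" (with a "uri" argument) and "query-migrate".

```python
import json
import socket


def qmp_message(command, **arguments):
    """Encode one QMP request as a JSON line (hypothetical helper)."""
    msg = {"execute": command}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode()


def migration_requests(uri):
    """Requests needed to negotiate QMP, start a migration and poll it once."""
    return [
        qmp_message("qmp_capabilities"),
        qmp_message("migrate", uri=uri),
        qmp_message("query-migrate"),
    ]


def read_reply(stream):
    """Skip asynchronous events and return the next command reply."""
    while True:
        msg = json.loads(stream.readline())
        if "return" in msg or "error" in msg:
            return msg


def run(host, port, uri):
    """Talk to a live QMP socket; needs a running QEMU, so not called here."""
    replies = []
    with socket.create_connection((host, port), timeout=10) as sock:
        stream = sock.makefile("rwb")
        json.loads(stream.readline())  # discard the QMP greeting banner
        for request in migration_requests(uri):
            stream.write(request)
            stream.flush()
            replies.append(read_reply(stream))
    return replies


requests = migration_requests("tcp:10.66.8.189:9999")
```

With a source guest running, run("localhost", 4444, "tcp:10.66.8.189:9999") would return the three replies. On an affected setup the reply to "migrate" would presumably carry an "error" member rather than "return", but that is an expectation, not something verified here.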

Comment 2 Dr. David Alan Gilbert 2017-04-28 10:24:14 UTC
This was noticed upstream a while ago; it was causing qemu to abort at the time. See: https://bugs.launchpad.net/qemu/+bug/1635339

From qemu commit 86dbcdd9c75 by Gerd:
            /*
             * Windows 8 drivers place qxl commands in the vram
             * (instead of the ram) bar.  We can't live migrate such a
             * guest, so add a migration blocker in case we detect
             * this, to avoid triggering the assert in pre_save().
             *
             * https://cgit.freedesktop.org/spice/win32/qxl-wddm-dod/commit/?id=f6e099db39e7d0787f294d5fd0dce328b5210faa
             */
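
To make the behaviour described in that code comment concrete, here is a small model of the check in Python. It is not the QEMU implementation: the class names, the BAR base addresses and sizes, and the idea of storing blockers in a list are all assumptions made for illustration. Only the error string is taken from this report.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Bar:
    """A PCI BAR as a plain address range (hypothetical layout)."""
    name: str
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


@dataclass
class QxlModel:
    """Toy device model: tracks whether any command was seen outside the ram bar."""
    ram_bar: Bar
    vram_bar: Bar
    migration_blockers: list = field(default_factory=list)

    def note_command(self, addr: int) -> None:
        # Commands in the ram bar are fine; anything else blocks migration.
        if self.ram_bar.contains(addr):
            return
        reason = "qxl: guest bug: command not in ram bar"
        if reason not in self.migration_blockers:
            self.migration_blockers.append(reason)

    def can_migrate(self) -> bool:
        return not self.migration_blockers


# Made-up addresses: ram bar at 0x10000000, vram bar at 0x20000000, 64 MiB each.
dev = QxlModel(
    ram_bar=Bar("ram", 0x1000_0000, 0x0400_0000),
    vram_bar=Bar("vram", 0x2000_0000, 0x0400_0000),
)
dev.note_command(0x1000_2000)   # command placed in the ram bar
ok_before = dev.can_migrate()
dev.note_command(0x2000_0040)   # command placed in the vram bar
ok_after = dev.can_migrate()
```

In this model, one command observed in the vram bar is enough to refuse migration from then on. That matches the symptom in the description, where migration fails only once the Windows driver is installed.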

Comment 4 Amnon Ilan 2017-05-05 11:24:42 UTC
Was it a bug in QEMU or in the QXL-WDDM-DOD driver?

Comment 5 Frediano Ziglio 2017-05-05 11:34:39 UTC
IMHO it is in Qemu, but it's too late to package a new version, so we need to write a workaround in the Windows driver.
Basically, Qemu is testing whether the release pointers are in Bar0.

Comment 6 David Blechter 2017-05-05 12:21:32 UTC
WDDM DOD is not supported yet, the plan is to add it to the next RHVM release.

moving to 7.5

Comment 7 ybendito 2017-05-05 12:58:25 UTC
Answering Amnon's question: from my point of view, the bug is between qxl-wddm-dod and qemu. It is currently fixed upstream in qxl-wddm-dod and will be in the next release of the driver, i.e. 0.17, as soon as we build it.

Comment 8 Frediano Ziglio 2017-05-19 11:42:56 UTC
Fixed by 0.17 version of the driver.
If virtualization team want to allow migration to work even if resources are in Bar1 they should change Qemu code to allow that.
Note that wddm dod is not officially supported by RHEL.

Comment 9 Ademar Reis 2017-05-23 13:29:38 UTC
(In reply to Frediano Ziglio from comment #8)
> Fixed by 0.17 version of the driver.
> If virtualization team want to allow migration to work even if resources are
> in Bar1 they should change Qemu code to allow that.
> Note that wddm dod is not officially supported by RHEL.

Gerd?

Comment 10 Gerd Hoffmann 2017-05-24 10:20:36 UTC
(In reply to Ademar Reis from comment #9)
> (In reply to Frediano Ziglio from comment #8)
> > Fixed by 0.17 version of the driver.
> > If virtualization team want to allow migration to work even if resources are
> > in Bar1 they should change Qemu code to allow that.
> > Note that wddm dod is not officially supported by RHEL.
> 
> Gerd?

Hmm, we can support that in qemu, but it requires an update to the live migration format, so it would come with quite some overhead for backward compatibility, such as requiring a new machine type.  Which implies backporting limitations (no z-stream).  Also, the guest driver can't just expect this to work, as old qemu versions that don't support this will remain deployed for quite a while.

Is there a compelling reason to place resources in bar 1?

Comment 11 Frediano Ziglio 2017-05-24 10:40:27 UTC
There are no reason like there are no reasons Qemu doing this limitation beside implementation details.
We could decide that resources to be released must be allocated only in Bar0, and add some documentation and enforcement (at least in debug versions of Qemu/spice-server) to detect and prevent such issues in the future.

Comment 12 Gerd Hoffmann 2017-05-24 11:11:17 UTC
(In reply to Frediano Ziglio from comment #11)
> There are no reason like there are no reasons Qemu doing this limitation
> beside implementation details.

The reason why this limitation in qemu exists basically is "historical reasons", dating back to the days where bar1 didn't exist.  Yes, from a design point of view it doesn't make sense.  Likewise having the bar0 / bar1 split in the first place doesn't make sense.  When re-designing qxl today with the lessons learned in the last years there are quite a few things I would do in a different way ...

But you have the memory in bar0, it makes sense to use it, so why not simply continue to place the commands there?  You have to do that anyway to support older qemu versions.  Either unconditionally (which would be the simplest and IMHO best way), or by detecting support and running different code paths in the driver depending on the qemu version you are running on.  Which just increases your test matrix for no good reason.

Comment 13 Frediano Ziglio 2017-05-24 11:27:46 UTC
(In reply to Gerd Hoffmann from comment #12)
> (In reply to Frediano Ziglio from comment #11)
> > There are no reason like there are no reasons Qemu doing this limitation
> > beside implementation details.
> 
> The reason why this limitation in qemu exists basically is "historical
> reasons", dating back to the days where bar1 didn't exist.  Yes, from a
> design point of view it doesn't make sense.  Likewise having the bar0 / bar1
> split in the first place doesn't make sense.  When re-designing qxl today
> with the lessons learned in the last years there are quite a few things I
> would do in a different way ...
> 
> But you have the memory in bar0, it makes sense to use it, so why not simply
> continue to place the commands there?  You have to do that anyway to support
> older qemu versions.  Either unconditionally (which would be the simplest
> and IMHO best way), or by detecting support and running different code paths
> in the driver depending on the qemu version you are running on.  Which just
> increases your test matrix for no good reason.

As far as I understand you are just rephrasing what we already said.

The only good thing of Bar1 compared to Bar0 is that can be quite large as it supports 64 bit and save address space for devices that needs 32 bit space.
As already said, the switch from Bar0 to Bar1 for the DOD driver did not come with much explanation, and reverting it caused no issues.
I consider enforcing/documenting the Bar0 allocation as a good debug practice to avoid the issue for future versions but is more of an enhancement than a fix for this issue and should be deserved for a different bug.

Personally as this was just an issue with a upstream version not supposed to be supported by RHEL I would close the bug.

Comment 14 Gerd Hoffmann 2017-05-24 11:59:13 UTC
> The only good thing of Bar1 compared to Bar0 is that can be quite large as
> it supports 64 bit and save address space for devices that needs 32 bit
> space.

Yes.  You can store all the bulky data in bar1.  Only stuff referenced from the rings directly must be in bar0.  Stuff referenced indirectly such as image data chunks attached to commands can be in bar1.  And surfaces can be in bar1 anyway, that is what bar1 was originally created for.

> I consider enforcing/documenting the Bar0 allocation as a good debug
> practice to avoid the issue for future versions but is more of an
> enhancement than a fix for this issue and should be deserved for a different
> bug.

That is easy.  qemu already detects that (see commit referenced in comment #2) and registers a migration blocker then.  We could also raise an error IRQ and enter guest bug mode.
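
The placement rule stated in this comment (objects referenced directly from the rings must be in bar0, indirectly referenced data and surfaces may be in bar1) can be written down as a small checker, for instance for a debug build of a driver. This is a sketch only: the enum names and the violations() helper are hypothetical, and allowing bar0 for every kind of object is an assumption rather than something stated in this thread.

```python
from enum import Enum


class Bar(Enum):
    BAR0 = 0
    BAR1 = 1


class Kind(Enum):
    RING_COMMAND = "referenced directly from a ring"
    IMAGE_CHUNK = "referenced indirectly, e.g. image data attached to a command"
    SURFACE = "surface memory"


# Allowed placements per object kind. Permitting BAR0 everywhere is an assumption.
ALLOWED = {
    Kind.RING_COMMAND: {Bar.BAR0},
    Kind.IMAGE_CHUNK: {Bar.BAR0, Bar.BAR1},
    Kind.SURFACE: {Bar.BAR0, Bar.BAR1},
}


def violations(allocations):
    """Return the (kind, bar) pairs that break the placement rule."""
    return [(kind, bar) for kind, bar in allocations if bar not in ALLOWED[kind]]


bad = violations([
    (Kind.RING_COMMAND, Bar.BAR0),
    (Kind.RING_COMMAND, Bar.BAR1),   # what the affected driver version did
    (Kind.IMAGE_CHUNK, Bar.BAR1),
    (Kind.SURFACE, Bar.BAR1),
])
```

Here only the ring command placed in bar1 is reported. A debug build could assert that violations() returns an empty list before submitting work to the device.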

Comment 16 Gerd Hoffmann 2017-07-12 08:11:08 UTC
> Personally as this was just an issue with a upstream version not supposed to
> be supported by RHEL I would close the bug.

Closing now.  It's fixed in the windows guest driver.

Comment 17 fgieseler 2018-06-24 22:57:00 UTC
It is working fine.

I downloaded the latest code from: https://www.spice-space.org/download/windows/qxl-wddm-dod/qxl-wddm-dod-0.18/ and updated the Windows Driver.

After that, the SAVE command is working fine for Windows 10.

Thank you so much for your support.

Kind regards!

