Red Hat Bugzilla – Bug 994211
Spice: fallback from async io mode to sync mode might result in errors and hanging vm
Last modified: 2016-06-28 10:17:20 EDT
Description of problem:
If an async io query doesn't get a reply after 60 seconds, the driver falls back from async_io mode to sync_io mode.
However, not all io operations have sync support, so we can get errors like:
qxl/guest-0: 222140443011: qxldd: set custom display FFFFF900C1E00020
qxl/guest-0: 222147232862: qxldd: DrvAssertMode: 0xc1e00020 revision 4 enable 0
qxl/guest-0: 222147251233: qxldd: ERROR: trying calling sync io on NULL port 6
for ASYNCABLE_FLUSH_SURFACES (and failing to flush the surfaces can be followed by other errors).
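The fallback problem described above can be illustrated with a small model. This is only a sketch, not the qxl driver code: the Driver class, the SYNC_PORTS table, the port number, and the ASYNCABLE_UPDATE_AREA entry are hypothetical. Only ASYNCABLE_FLUSH_SURFACES and the 60 second timeout come from this report.

```python
from typing import Dict, List, Optional

# Hypothetical table: each asyncable operation maps to its sync io port,
# or to None when the operation has no sync variant.
SYNC_PORTS: Dict[str, Optional[int]] = {
    "ASYNCABLE_UPDATE_AREA": 0x10,     # hypothetical entry and port number
    "ASYNCABLE_FLUSH_SURFACES": None,  # no sync support, as in the report
}


class Driver:
    """Toy model of a driver that falls back from async io to sync io."""

    def __init__(self, timeout_s: float = 60.0) -> None:
        self.timeout_s = timeout_s
        self.use_async = True
        self.errors: List[str] = []

    def on_async_wait(self, waited_s: float) -> None:
        # After the timeout expires, switch to sync io for all later calls.
        if waited_s >= self.timeout_s:
            self.use_async = False

    def issue(self, op: str) -> str:
        if self.use_async:
            return "async"
        port = SYNC_PORTS[op]
        if port is None:
            # The failure mode: sync io requested on a NULL port.
            self.errors.append(f"sync io on NULL port for {op}")
            return "error"
        return "sync"


driver = Driver()
driver.on_async_wait(61.0)  # the async query timed out
print(driver.issue("ASYNCABLE_UPDATE_AREA"))
print(driver.issue("ASYNCABLE_FLUSH_SURFACES"))
```

In this model the fallback is global, so any operation without a sync port fails as soon as the driver leaves async mode.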
In addition, probably because the async_io query was eventually handled by the spice server, and the IO_CMD interrupt *was* sent for the expired query, we get a "guest bug" in qemu for 2 simultaneous async ios (after the driver is reloaded and is back in async io mode):
"qxl-0: guest bug: 2 async started before last (16) complete"
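As a rough illustration of how such a message can arise, the sketch below models a device that allows only one outstanding async io. It is an assumption about the general mechanism, not qemu's qxl.c code: the Device class and its method names are hypothetical, and the second command number is arbitrary.

```python
from typing import Optional


class Device:
    """Toy model of a device that tracks a single in-flight async io."""

    def __init__(self) -> None:
        self.current_async: Optional[int] = None
        self.guest_bug: Optional[str] = None

    def start_async(self, io_cmd: int) -> bool:
        if self.current_async is not None:
            # A second async io arrived before the first one completed.
            self.guest_bug = (
                f"2 async started before last ({self.current_async}) complete"
            )
            return False
        self.current_async = io_cmd
        return True

    def complete_async(self) -> Optional[int]:
        # Called when the server finishes the request.
        done = self.current_async
        self.current_async = None
        return done


dev = Device()
dev.start_async(16)  # query that the guest later gives up on
dev.start_async(3)   # reloaded driver starts a new async io too early
print(dev.guest_bug)
```

The point of the model is that the guest-side timeout does not cancel the request on the device side, so the two sides can disagree about whether an async io is still pending.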
Notice that after qemu identifies a guest_bug, there is another bug (on the qemu side):
interface_get_command in qxl.c returns FALSE, while
interface_req_cmd_notification returns FALSE as well (it reports that the ring is not empty).
This leads to an endless loop in spice-server's red_worker.c (see https://bugzilla.redhat.com/show_bug.cgi?id=964136#c29).
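The loop can be sketched as follows. This is a simplified model and not the red_worker.c code: worker_poll and its iteration cap are hypothetical, and the two callbacks only stand in for interface_get_command and interface_req_cmd_notification.

```python
from typing import Callable, Tuple


def worker_poll(
    get_command: Callable[[], bool],
    req_cmd_notification: Callable[[], bool],
    max_iterations: int = 1000,
) -> Tuple[str, int]:
    """Poll the command ring; the cap exists only so the demo terminates."""
    for i in range(1, max_iterations + 1):
        if get_command():
            return ("command", i)   # a command was read from the ring
        if req_cmd_notification():
            return ("sleep", i)     # ring is empty, worker can wait
        # Neither call succeeded: "ring not empty" but nothing readable,
        # so the worker retries immediately.
    return ("spinning", max_iterations)


# After a guest bug, both calls return FALSE on every attempt.
print(worker_poll(lambda: False, lambda: False, max_iterations=50))
```

Without the cap, the combination of the two FALSE results gives the worker no exit condition.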
I'm not sure about the severity of this bug because we shouldn't reach a query timeout in the first place.
We reached it in bug 964136, probably because we mistakenly had too long a timeout in spice-server when waiting for a response from a client and, for a reason we don't know yet, the client was not responsive. I'll send a patch to fix the timeout on the server side.
Related Bug: 964136
Has this patch been submitted? I have encountered this bug on qemu-kvm-0.12 and spice-server-0.12.
(In reply to bin.liu from comment #1)
> has this patch been submitted? I have encountered this bug on qemu-kvm-0.12
> and spice-server-0.12
A patch for bug 995041 has been submitted. It is part of spice-server-0.12.4-3.el6.
This is an automated message. oVirt 3.6.0 RC3 has been released, and GA is targeted for next week, Nov 4th 2015.
Please review this bug and, if it is not a blocker, postpone it to a later release.
All bugs not postponed by the GA release will be automatically re-targeted to:
- 3.6.1 if severity >= high
- 4.0 if severity < high
This bug is not a blocker
oVirt 4.0 Alpha has been released, moving to oVirt 4.0 Beta target.