Bug 1674324 - With <graphics type='spice'><gl enable='on'/>, qemu either refuses to start completely or spice-server crashes afterwards
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Marc-Andre Lureau
QA Contact: Guo, Zhiyi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-02-11 00:04 UTC by David Jaša
Modified: 2023-05-17 12:12 UTC
CC List: 14 users

Fixed In Version: qemu-kvm-2.12.0-95.module+el8.2.0+5354+b7ebf7be
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 15:32:11 UTC
Type: Bug
Target Upstream Version:
Embargoed:
zhguo: needinfo-



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHELPLAN-26547  0        None      None    None     2023-05-17 12:12:30 UTC
Red Hat Product Errata  RHEA-2020:1587  0        None      None    None     2020-04-28 15:33:33 UTC

Description David Jaša 2019-02-11 00:04:59 UTC
Description of problem:
qemu either doesn't start right away or crashes with:
Spice-CRITICAL **: 10:52:50.884: red-qxl.c:822:spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed

Version-Release number of selected component (if applicable):
qemu-kvm-2.12.0-60.module+el8+2725+0ab65287.x86_64

How reproducible:
either way; I couldn't pinpoint what made qemu fail in one mode or the other

Steps to Reproduce:
1. start VM with <graphics type='spice'><gl enable='yes'> (and <video><model type='virtio'/>)
2. launch any window once DE comes up
3.

Actual results:
either:
1.: VM refuses to start
2.: VM crashes with assert above

Expected results:
VM runs and has accelerated graphics

Additional info:
F29 versions don't suffer any of these
el7 with RHV qemu behaves the same

Comment 3 Christophe Fergeau 2019-02-12 12:38:54 UTC
(In reply to David Jaša from comment #0)

> Additional info:
> F29 versions don't suffer any of these
> el7 with RHV qemu behaves the same

"behaves the same"? as f29 or as el8? In my testing, el7 is working, so I assume you meant "behave the same as f29 (working)"?

Comment 4 Christophe Fergeau 2019-02-13 12:19:55 UTC
virgl is disabled in qemu-kvm-2.12.0-60.module+el8+2725+0ab65287.x86_64. Without virgl (i.e. no <acceleration accel3d='yes'/> support), virtio + spice + gl='on' is not really interesting, so in my opinion this is not an 8.0.0 blocker.

Comment 5 Christophe Fergeau 2019-02-13 12:39:08 UTC
This seems to be a bug in qemu. Backtrace of the crash is (don't trust the line numbers too much, this was tested with a local build)

(gdb) bt
#0  0x00007fea1e59a53f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fea1e584895 in __GI_abort () at abort.c:79
#2  0x00007fea1eb56d62 in spice_logv
    (log_domain=0x7fea1ec031d2 "Spice", log_level=G_LOG_LEVEL_CRITICAL, strloc=0x7fea1ebf4884 "../server/red-qxl.c:790", function=0x7fea1ebf4b00 <__func__.48991> "spice_qxl_gl_scanout", format=0x7fea1ebf4822 "condition `%s' failed", args=0x7fff10882628) at ../subprojects/spice-common/common/log.c:187
#3  0x00007fea1eb56e20 in spice_log
    (log_level=G_LOG_LEVEL_CRITICAL, strloc=0x7fea1ebf4884 "../server/red-qxl.c:790", function=0x7fea1ebf4b00 <__func__.48991> "spice_qxl_gl_scanout", format=0x7fea1ebf4822 "condition `%s' failed")
    at ../subprojects/spice-common/common/log.c:200
#4  0x00007fea1eb0ca4f in spice_qxl_gl_scanout (qxl=0x561a439981d8, fd=137, width=1024, height=768, stride=4096, format=875708993, y_0_top=0) at ../server/red-qxl.c:790
#5  0x0000561a3fe19bee in spice_gl_switch (dcl=0x561a43998198, new_surface=<optimized out>) at /home/teuf/redhat/qemu/include/ui/console.h:332
#6  0x0000561a3fe120aa in dpy_gfx_replace_surface (con=0x561a42d2c800, surface=0x561a43dfa2e0) at ui/console.c:1585
#7  0x0000561a3fb80e5f in virtio_gpu_set_scanout (cmd=0x561a4308c260, g=0x561a43c8d690) at /home/teuf/redhat/qemu/hw/display/virtio-gpu.c:677
#8  0x0000561a3fb80e5f in virtio_gpu_simple_process_cmd (cmd=0x561a4308c260, g=0x561a43c8d690) at /home/teuf/redhat/qemu/hw/display/virtio-gpu.c:855
#9  0x0000561a3fb80e5f in virtio_gpu_process_cmdq (g=<optimized out>) at /home/teuf/redhat/qemu/hw/display/virtio-gpu.c:893
#10 0x0000561a3ff498ce in aio_bh_call (bh=0x561a43df90a0) at util/async.c:118
#11 0x0000561a3ff498ce in aio_bh_poll (ctx=ctx@entry=0x561a427b6960) at util/async.c:118
#12 0x0000561a3ff4ce80 in aio_dispatch (ctx=0x561a427b6960) at util/aio-posix.c:460
#13 0x0000561a3ff497ae in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#14 0x00007fea2130406d in g_main_dispatch (context=0x561a427b6d20) at gmain.c:3182
#15 0x00007fea2130406d in g_main_context_dispatch (context=context@entry=0x561a427b6d20) at gmain.c:3847
#16 0x0000561a3ff4c098 in glib_pollfds_poll () at util/main-loop.c:215
#17 0x0000561a3ff4c098 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
#18 0x0000561a3ff4c098 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:514
#19 0x0000561a3fc55e29 in main_loop () at vl.c:1923
#20 0x0000561a3fad65c1 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4584

I've reproduced this bug with git master and
./configure '--extra-ldflags=-Wl,--build-id -Wl,-z,relro -Wl,-z,now' '--extra-cflags=-O2 -g -pipe  -fexceptions -fstack-protector-strong -Wno-error'   --enable-spice  --enable-opengl  --target-list=x86_64-softmmu   --enable-kvm   --disable-virglrenderer  --disable-xen --enable-debug --disable-strip
Then I boot a Fedora 29 live CD with the config djasa described in the initial comment, connect to it with virt-viewer -a, click on the 'Activities' button in the upper left corner, and click on the firefox icon. More often than not, this triggers the assertion failure and crashes QEMU.

This happens because we receive a spice_qxl_gl_scanout call in the middle of a spice_qxl_gl_draw_async call (before red_qxl_gl_draw_async_complete/QXLInterface::async_complete gets called back), which is not valid.
qemu_spice_gl_block/qemu_spice_gl_unblock seem to be meant to avoid this kind of situation; however, the virtio-gpu implementation is:

const GraphicHwOps virtio_gpu_ops = {
#ifdef CONFIG_VIRGL
 .gl_block = virtio_gpu_gl_block,
#endif
and virgl is disabled in this build, so the blocking while the draw command is in flight is not functional.
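
To make the invalid interleaving concrete, here is a minimal sketch in C of the invariant behind the Spice-CRITICAL message. It is a toy model and not spice-server code: ToyQxlState, the toy_* functions and the value chosen for TOY_COOKIE_INVALID are assumptions made for illustration; only the condition being checked is taken from the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: not spice-server code. The sentinel value is an assumption. */
#define TOY_COOKIE_INVALID UINT64_MAX

typedef struct {
    uint64_t gl_draw_cookie; /* cookie of the draw in flight, or INVALID */
} ToyQxlState;

void toy_init(ToyQxlState *s)
{
    s->gl_draw_cookie = TOY_COOKIE_INVALID;
}

/* Start an asynchronous draw; it stays in flight until toy_draw_complete(). */
void toy_gl_draw_async(ToyQxlState *s, uint64_t cookie)
{
    assert(cookie != TOY_COOKIE_INVALID);
    s->gl_draw_cookie = cookie;
}

/* Completion callback: the draw is no longer in flight. */
void toy_draw_complete(ToyQxlState *s)
{
    s->gl_draw_cookie = TOY_COOKIE_INVALID;
}

/*
 * Returns true if a scanout change is legal right now.
 * The real server aborts instead of returning false.
 */
bool toy_gl_scanout_allowed(const ToyQxlState *s)
{
    return s->gl_draw_cookie == TOY_COOKIE_INVALID;
}

/* Replays the bad interleaving; returns true if it was detected. */
bool toy_replay_bad_interleaving(void)
{
    ToyQxlState s;
    toy_init(&s);
    toy_gl_draw_async(&s, 42);                    /* draw starts */
    bool ok_during = toy_gl_scanout_allowed(&s);  /* scanout arrives too early */
    toy_draw_complete(&s);                        /* completion callback runs */
    bool ok_after = toy_gl_scanout_allowed(&s);   /* now it would be fine */
    return !ok_during && ok_after;
}
```

In this model toy_replay_bad_interleaving() returns true: the scanout attempted while the draw is in flight is rejected (the point where the real server aborts), and the same request is accepted once the completion callback has run.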

Comment 6 Marc-Andre Lureau 2019-02-13 13:22:16 UTC
Thanks, Christophe, for the analysis. Are you working on a patch?

I think I have done a related fix in QEMU in the past, I would have to dig in the archives though.

Comment 7 Christophe Fergeau 2019-02-13 13:43:33 UTC
Marc-André, I was not planning to, but then I dug into the history and found https://git.qemu.org/?p=qemu.git;a=commit;h=c19f4fbce1c2293b7a9bddadddd7a1b69953f534 which seems related. Now reverting that patch and doing something similar to https://git.qemu.org/?p=qemu.git;a=blob;f=hw/display/virtio-gpu-3d.c#l406 in virtio-gpu.c is awfully tempting... In short, I'll experiment with that and see if it helps.

Comment 8 Marc-Andre Lureau 2019-02-13 14:04:14 UTC
I found this pending patch: https://github.com/elmarco/qemu/commit/22c94823d741dca97d912f5d737561da12538f75

Comment 9 Christophe Fergeau 2019-02-13 14:47:35 UTC
Looks very similar to what I came up with, cmd->waiting can be removed after this change. I'll test a bit more but this was fixing the crash for me.

Comment 10 Christophe Fergeau 2019-02-13 15:34:04 UTC
Ah, and you also need

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 74f203c727..c1b46e0686 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1054,9 +1054,7 @@ const GraphicHwOps virtio_gpu_ops = {
     .gfx_update = virtio_gpu_update_display,
     .text_update = virtio_gpu_text_update,
     .ui_info = virtio_gpu_ui_info,
-#ifdef CONFIG_VIRGL
     .gl_block = virtio_gpu_gl_block,
-#endif
 };
 
 static const VMStateDescription vmstate_virtio_gpu_scanout = {
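
Why the #ifdef matters can be sketched with a toy model in C. It assumes, as the analysis in comment 5 suggests, that the UI tells the device about an in-flight draw through an optional gl_block hook and that the device is expected to hold back queued commands until it is unblocked. All names (ToyGpu, ToyHwOps, toy_*) and the counter-based design are hypothetical; this is neither the QEMU code nor the patch from comment 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not QEMU API. */
typedef struct ToyGpu {
    int renderer_blocked;        /* >0 while the UI reported a draw in flight */
    int queued;                  /* commands waiting in the queue */
    int processed;               /* commands handed to the UI so far */
    int processed_while_drawing; /* commands that raced with a draw */
    bool draw_in_flight;         /* ground truth kept by the UI side */
} ToyGpu;

typedef struct ToyHwOps {
    void (*gl_block)(ToyGpu *g, bool block); /* optional hook, may be NULL */
} ToyHwOps;

/* Drain the queue, but stop as soon as the device is blocked. */
void toy_process_queue(ToyGpu *g)
{
    while (g->queued > 0 && g->renderer_blocked == 0) {
        g->queued--;
        g->processed++;
        if (g->draw_in_flight) {
            g->processed_while_drawing++; /* this is the crash scenario */
        }
    }
}

/* Device-side hook: count block requests, resume the queue on unblock. */
void toy_gpu_gl_block(ToyGpu *g, bool block)
{
    if (block) {
        g->renderer_blocked++;
    } else {
        assert(g->renderer_blocked > 0);
        g->renderer_blocked--;
        if (g->renderer_blocked == 0) {
            toy_process_queue(g);
        }
    }
}

/* UI side: notify the device only if it registered a hook. */
void toy_ui_set_drawing(const ToyHwOps *ops, ToyGpu *g, bool drawing)
{
    g->draw_in_flight = drawing;
    if (ops->gl_block) {
        ops->gl_block(g, drawing);
    }
}

/* One draw with a command arriving mid-draw; returns how many commands raced. */
int toy_run(const ToyHwOps *ops)
{
    ToyGpu g = {0};
    toy_ui_set_drawing(ops, &g, true);   /* draw starts */
    g.queued++;                          /* guest submits a scanout change */
    toy_process_queue(&g);
    toy_ui_set_drawing(ops, &g, false);  /* draw completes */
    toy_process_queue(&g);
    assert(g.processed == 1);            /* the command is never lost */
    return g.processed_while_drawing;
}
```

With the hook registered (gl_block pointing at toy_gpu_gl_block) toy_run() returns 0, because the command is deferred until the draw completes. With gl_block left NULL, which is what the #ifdef amounts to in a build without virgl, it returns 1: the command is processed while the draw is still in flight.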

Comment 11 Christophe Fergeau 2019-02-13 16:52:37 UTC
djasa, this scratch build should have Marc-André's patch + the change from comment #10 if you want to give it a try.

Comment 12 Christophe Fergeau 2019-02-15 16:12:09 UTC
Not sure where the link went
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=20194251

Comment 13 Marc-Andre Lureau 2019-02-15 16:13:57 UTC
(In reply to Christophe Fergeau from comment #9)
> Looks very similar to what I came up with, cmd->waiting can be removed after
> this change. I'll test a bit more but this was fixing the crash for me.

yep, the following patch in the series: "virtio-gpu: remove useless 'waiting' field"

I'll resend soon. I need to push Gerd to review my changes :)

Comment 18 David Jaša 2019-02-26 15:11:21 UTC
I tried the scratch build on two el8 machines with these results:
- on one machine, qemu with -spice gl=on fails to start with no error message given (even with G_MESSAGES_DEBUG=all)
- on the other machine, qemu starts but guests don't utilize virgl renderer (although they are run from the very same images that do render using VirGL on F29 host)

I didn't encounter the crash any more.

Comment 19 Christophe Fergeau 2019-03-04 13:58:11 UTC
(In reply to David Jaša from comment #18)
> I tried the scratch build on two el8 machines with these results:
> - on one machine, qemu with -spice gl=on fails to start with no error
> message given (even with G_MESSAGES_DEBUG=all)
> - on the other machine, qemu starts but guests don't utilize virgl renderer
> (although they are run from the very same images that do render using VirGL
> on F29 host)
> 
> I didn't encounter the crash any more.

RHEL qemu builds are compiled without virgl support, so I would not expect the guests to be able to use it.

Comment 20 Marc-Andre Lureau 2019-05-20 15:50:19 UTC
(In reply to Christophe Fergeau from comment #19)
> (In reply to David Jaša from comment #18)
> > I tried the scratch build on two el8 machines with these results:
> > - on one machine, qemu with -spice gl=on fails to start with no error
> > message given (even with G_MESSAGES_DEBUG=all)
> > - on the other machine, qemu starts but guests don't utilize virgl renderer
> > (although they are run from the very same images that do render using VirGL
> > on F29 host)
> > 
> > I didn't encounter the crash any more.
> 
> RHEL qemu builds are compiled without virgl support, so I would not expect
> the guests to be able to use it.

<graphics type='spice'><gl enable='yes'> isn't enabling virgl, however it enables qemu gl rendering.

David, what made the first machine different from the second? The GPU? Could you give access to the first machine?

thanks (sorry for the delay)

Comment 21 David Jaša 2019-07-17 10:23:36 UTC
(In reply to Marc-Andre Lureau from comment #20)
...
> <graphics type='spice'><gl enable='yes'> isn't enabling virgl, however it
> enables qemu gl rendering.
> 
> David, what made the first machine different from the second? The GPU? Could
> you give access to the first machine?
> 
> thanks (sorry for the delay)

I'm delayed even more, sorry about that as well. I managed to reproduce both behaviours on the same machine; which one you get depends on the libvirt session:
- in system session, the VM doesn't start
- in user session, the VM starts
As you say, the acceleration isn't available, so:
- when you specify <video><model type="virtio"><acceleration accel3d='yes'/>, the VM is not started
- without it in the XML, or when it is disabled, the VM starts and acceleration is reported as unavailable within the VM. The qemu crashes are also pretty frequent:
(process:21091): Spice-CRITICAL **: 16:01:30.123: red-qxl.c:708:spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed
2019-07-12 14:01:30.679+0000: shutting down, reason=crashed

Comment 23 Marc-Andre Lureau 2019-07-17 12:17:34 UTC
Christophe, do you still have the GL_DRAW_COOKIE_INVALID backport handy? Can you submit it to rhvirt?

Comment 25 Danilo de Paula 2019-12-10 13:51:36 UTC
This BZ lost its ITR flag.

I'm setting it back, but the patch needs acks.
Also requesting exception+

QE: can you please grant QA_ACK?

Comment 30 Ademar Reis 2020-02-05 22:54:11 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 31 Guo, Zhiyi 2020-02-12 12:27:38 UTC
Hmm, I'm not able to reproduce this bug against qemu-kvm-2.12.0-63.module+el8+2833+c7d6d092.x86_64 and qemu-kvm-2.12.0-88.module+el8.1.0+4233+bc44be3f.x86_64.
My qemu command line used:
/usr/libexec/qemu-kvm -device virtio-vga -spice addr=/tmp/spice.sock,unix,disable-ticketing,gl=on -monitor stdio -cdrom Fedora-Workstation-Live-x86_64-31-1.9.iso -m 4G -machine q35

With this command, I always get a blank screen.

After removing gl=on, I can always see the Fedora welcome page.

Comment 32 Guo, Zhiyi 2020-02-12 12:28:35 UTC
On the latest qemu-kvm-2.12.0-98.module+el8.2.0+5698+10a84757.x86_64, I also hit the same behavior.

Comment 33 Guo, Zhiyi 2020-02-12 12:41:05 UTC
Hi Marc-Andre,

Could you help check comments 31 and 32? Thanks!

BR/
Zhiyi

Comment 34 Marc-Andre Lureau 2020-02-12 13:13:58 UTC
Do you get only a black screen when connecting the spice client, and no qemu error such as:

(process:21091): Spice-CRITICAL **: 16:01:30.123: red-qxl.c:708:spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed ?

Black screen may be due to incompatible GPU, what's the host gpu?

I don't know if local GL/spice is supported in RHEL8 tbh, we would have to ask the Spice team.

Comment 35 Guo, Zhiyi 2020-02-12 13:21:24 UTC
(In reply to Marc-Andre Lureau from comment #34)
> Do you get only a black screen when connecting the spice client, and no qemu
> error such as:
> 
> (process:21091): Spice-CRITICAL **: 16:01:30.123:
> red-qxl.c:708:spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie ==
> GL_DRAW_COOKIE_INVALID' failed ?

Nothing like this is printed.
> 
> Black screen may be due to incompatible GPU, what's the host gpu?
> 
> I don't know if local GL/spice is supported in RHEL8 tbh, we would have to
> ask the Spice team.

I have checked on different GPUs, but the results are the same; only the open source drivers were used.
GPUs I used:
04:00.0 VGA compatible controller: NVIDIA Corporation GP106GL [Quadro P2000]
00:02.0 VGA compatible controller: Intel Corporation Iris Plus Graphics 650
21:00.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4]

Comment 36 Guo, Zhiyi 2020-02-12 13:24:13 UTC
Needinfo David and Christophe for help

Comment 37 Guo, Zhiyi 2020-02-14 03:29:11 UTC
Reproduced this issue against qemu-kvm-2.12.0-94.module+el8.2.0+5297+222a20af.x86_64

Steps:
# gdb /usr/libexec/qemu-kvm
(gdb) run -device virtio-vga -spice addr=/tmp/spice.sock,unix,disable-ticketing,gl=on  -monitor stdio -cdrom Fedora-Workstation-Live-x86_64-31-1.9.iso -m 4G -machine q35

Then interact with the guest UI.

Result:
qemu dumps core with:
(process:561): Spice-CRITICAL **: 11:22:11.464: red-qxl.c:708:spice_qxl_gl_scanout: condition `qxl_state->gl_draw_cookie == GL_DRAW_COOKIE_INVALID' failed
[Detaching after fork from child process 779]

Thread 1 "qemu-kvm" received signal SIGABRT, Aborted.
0x00007ffff2c1270f in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff2c1270f in raise () at /lib64/libc.so.6
#1  0x00007ffff2bfcb25 in abort () at /lib64/libc.so.6
#2  0x00007ffff439f948 in  () at /lib64/libspice-server.so.1
#3  0x00007ffff436ff31 in spice_qxl_gl_scanout () at /lib64/libspice-server.so.1
#4  0x0000555555a92b60 in spice_gl_switch (dcl=0x5555573cd7f8, new_surface=<optimized out>)
    at /usr/src/debug/qemu-kvm-2.12.0-94.module+el8.2.0+5297+222a20af.x86_64/include/ui/console.h:342
#5  0x0000555555a8bc9a in dpy_gfx_replace_surface (con=0x555556445200, surface=0x55555785a090) at ui/console.c:1597
#6  0x00005555558b9ee3 in virtio_gpu_set_scanout (cmd=0x5555575355f0, g=0x5555575f5820)
    at /usr/src/debug/qemu-kvm-2.12.0-94.module+el8.2.0+5297+222a20af.x86_64/hw/display/virtio-gpu.c:676
#7  0x00005555558b9ee3 in virtio_gpu_simple_process_cmd (cmd=0x5555575355f0, g=0x5555575f5820)
    at /usr/src/debug/qemu-kvm-2.12.0-94.module+el8.2.0+5297+222a20af.x86_64/hw/display/virtio-gpu.c:854
#8  0x00005555558b9ee3 in virtio_gpu_process_cmdq (g=<optimized out>)
    at /usr/src/debug/qemu-kvm-2.12.0-94.module+el8.2.0+5297+222a20af.x86_64/hw/display/virtio-gpu.c:892
#9  0x0000555555b6dcb6 in aio_bh_call (bh=0x555557731850) at util/async.c:118
#10 0x0000555555b6dcb6 in aio_bh_poll (ctx=ctx@entry=0x5555564ad1c0) at util/async.c:118
#11 0x0000555555b70e34 in aio_dispatch (ctx=0x5555564ad1c0) at util/aio-posix.c:440
#12 0x0000555555b6db92 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
    at util/async.c:261
#13 0x00007ffff76ac67d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#14 0x0000555555b700b0 in glib_pollfds_poll () at util/main-loop.c:215
#15 0x0000555555b700b0 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
#16 0x0000555555b700b0 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:497
#17 0x0000555555837a27 in main_loop () at vl.c:1981
#18 0x0000555555837a27 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4820

Verified this issue against qemu-kvm-2.12.0-98.module+el8.2.0+5698+10a84757.x86_64: after some interaction with the VM desktop, no crash happens.

Comment 38 Guo, Zhiyi 2020-02-14 03:29:43 UTC
verified per comment 37

Comment 40 errata-xmlrpc 2020-04-28 15:32:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1587

