Description of problem:
QEMU crashes when launching a VM with 3D acceleration enabled on some GPUs.

Version-Release number of selected component (if applicable):
qemu-3.1.0-5.fc30.x86_64

How reproducible:
Always (on affected hardware)

Steps to Reproduce:
1. Create a new F29 Live VM in F30 GNOME Boxes, or in virt-manager with virtio 3D/virgl
2. Wait and see the VM crash

Actual results:
QEMU fails to run the VM with virgl on some GPUs, with the following in the journal:

Mar 25 11:10:36 fanys-laptop systemd-coredump[17145]: Process 17104 (qemu-system-x86) of user 1000 dumped core.

Expected results:
QEMU shouldn't fail to run the VM with virgl enabled.

Additional info:
The same can be reproduced in virt-manager by switching Video to virtio 3D and enabling OpenGL support on the Display tab. GNOME Boxes doesn't show the tip to disable 3D acceleration if the VM crashes this way right away.

This seems to be hardware specific. It works just fine on:
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)

but is broken on:
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)
Proposed as a Blocker for 30-final by Fedora user frantisekz using the blocker tracking app because: This breaks the "The release must be able host virtual guest instances of the same release." criterion, as GNOME Boxes enables virgl by default since F30. As it seems to be HW specific, I am proposing this as a Final Blocker instead of a Beta Blocker (which it would have been if it weren't HW specific).
This will need more info beyond "it crashed", especially since it's hardware specific. To start with, please acquire a stack trace of all threads in QEMU, with all relevant -debuginfo RPMs present.
Created attachment 1547721 [details] qemu bt
If I switch the virgl and gl QEMU options to off, it doesn't crash. With them on, it crashes with:

Thread 3 "qemu-system-x86" received signal SIGSYS, Bad system call.
[Switching to Thread 0x7fffeec09700 (LWP 7321)]
0x00007ffff6e141a3 in __pthread_setaffinity_new (th=<optimized out>, cpusetsize=128, cpuset=0x7fffeec086c0) at ../sysdeps/unix/sysv/linux/pthread_setaffinity.c:34
34 res = INTERNAL_SYSCALL (sched_setaffinity, err, 3, pd->tid, cpusetsize
That log file doesn't contain a full stack trace, just one single stack frame from one thread. Please capture a full stack trace, e.g. with "thread apply all backtrace" at the GDB prompt.
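For anyone else needing to collect this, here is a minimal sketch of running that GDB command non-interactively against a saved core file. It assumes gdb and the relevant -debuginfo RPMs are installed and that the core has already been written to disk (for example with `coredumpctl dump`). The helper name `gdb_backtrace_argv` and the script name are hypothetical; `-batch` and `-ex` are standard gdb options.

```python
import subprocess
import sys


def gdb_backtrace_argv(binary, core):
    """Build a gdb command line that prints backtraces for all threads in a core file."""
    return [
        "gdb",
        "-batch",                             # run the commands below, then exit
        "-ex", "thread apply all backtrace",  # the command requested in the comment above
        binary,                               # path to the QEMU binary that crashed
        core,                                 # path to the saved core file
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 qemu_bt.py <path-to-qemu-binary> <path-to-core>
    result = subprocess.run(gdb_backtrace_argv(sys.argv[1], sys.argv[2]),
                            capture_output=True, text=True)
    print(result.stdout)
```

The output can then be attached to the bug as a plain text file.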
Created attachment 1547724 [details] qemu bt - all threads
This looks like it is caused by Mesa trying to set CPU affinity which is blocked by QEMU's seccomp filters, probably fixed upstream by https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg06006.html
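To illustrate the failure mode, here is a minimal sketch (not Mesa's or QEMU's actual code) of a caller that treats CPU pinning as optional. It assumes Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`; the helper name `try_pin_to_current_cpus` is hypothetical. If a seccomp filter makes the syscall fail with an errno, a caller like this can carry on without pinning. If the filter's action is to kill with SIGSYS, as in the trace above, no error handling in the caller can help.

```python
import os


def try_pin_to_current_cpus():
    """Re-apply the current affinity mask; return True if the call was permitted."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform without the affinity API
    try:
        cpus = os.sched_getaffinity(0)  # 0 means the calling process
        os.sched_setaffinity(0, cpus)   # same mask, so nothing changes on success
        return True
    except OSError:
        # A filter that returns an errno (e.g. EPERM) ends up here.
        return False


if __name__ == "__main__":
    print("affinity call permitted:", try_pin_to_current_cpus())
```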
So, I tried to do a test build with the patch, but it doesn't compile when applied against either f30 or f31, so I assume it should be used against master. I can try that later. Thanks for pointing to the patch, Daniel!

/usr/bin/ld: ../qemu-seccomp.o: in function `seccomp_start':
/builddir/build/BUILD/qemu-4.0.0-rc0/qemu-seccomp.c:181: undefined reference to `qemu_seccomp_get_kill_action'
collect2: error: ld returned 1 exit status
Sigh, the maintainer broke my patch. Try the original one: https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg04413.html
Okay, with that patch applied, the issue is fixed (tried against f30 branch in dist-git). Thanks!
I've fired off a scratch build with the fix, if anyone else is interested: https://koji.fedoraproject.org/koji/taskinfo?taskID=33766636

Daniel, should I create a PR for the qemu package, or would you rather handle it yourself?
Discussed during the 2019-03-25 blocker review meeting: [1]

The decision to classify this bug as an "AcceptedBlocker" (Beta) was made as it violates the following criterion: "The release must be able host virtual guest instances of the same release", given that it affects the default config of the default virt app on the default desktop.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2019-03-25/f30-blocker-review.2019-03-25-16.01.txt
As this was accepted as a blocker, and none of the qemu maintainers seemed to be around, I'm doing the fix for this. The build is currently running: https://koji.fedoraproject.org/koji/taskinfo?taskID=33767403 and I will edit the existing pending qemu update (which includes a bunch of CVE fixes, so probably good things to pull in) to include it when it's done, and we will fire a new compose.
qemu-3.1.0-6.fc30 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-0664c7724d
Can those testing this confirm it's working OK when they have libseccomp-2.3.3-5.fc30 installed (what's stable now), and not libseccomp-2.4.0-0.fc30 (updates-testing)? Our f29 builders were unable to launch f30/f31 guests with the older libseccomp.
Tried on the 1.7 beta live ISO with Boxes on a Ryzen 5 1600X + NVIDIA 1060 and can't reproduce it in normal usage of gnome-boxes.
(In reply to Kevin Fenzi from comment #16)
> Can those testing this confirm it's working ok when they have
> libseccomp-2.3.3-5.fc30 installed (whats stable now), and not
> libseccomp-2.4.0-0.fc30 (updates-testing).
>
> Our f29 builders were unable to launch f30/f31 guests with the older
> libseccomp.

It works with both libseccomp-2.3.3-5.fc30.x86_64 and libseccomp-2.4.0-0.fc30.
qemu-3.1.0-6.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-0664c7724d
Frantisek: can you confirm that this works OK with the qemu update in Beta-1.8 (just to make sure I didn't screw up the patch or anything)? Thanks!
Yeah, it's working just fine!
qemu-3.1.0-6.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.