|Summary:||Include conditional_wakeup fix in glib2|
|Product:||Red Hat Enterprise Linux 7||Reporter:||Richard W.M. Jones <rjones>|
|Component:||glib2||Assignee:||Colin Walters <walters>|
|Status:||ASSIGNED ---||QA Contact:||Desktop QE <desktop-qa-list>|
|Version:||7.5||CC:||amit, berrange, cfergeau, dwmw2, extras-qa, fziglio, itamar, jkoten, kchamart, klember, mboisver, mclasen, mtessun, pbonzini, ptoscano, rjones, toneata, tpelka, virt-maint, walters, xchen, yoguo|
|Fixed In Version:||Doc Type:||If docs needed, set a value|
|Doc Text:||Story Points:||---|
|:||1474405 (view as bug list)||Environment:|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Cloudforms Team:||---||Target Upstream Version:|
|Bug Depends On:||1438539|
Description Richard W.M. Jones 2017-07-21 09:36:27 UTC
Description of problem:

Rebasing glib2 in RHEL 7.4 (bug 1386874) introduced a bug which affects the qemu main loop, originally reported in Fedora in bug 1438539 (glib2) and bug 1435432 (qemu). The bug causes the emulated ISA serial port in qemu to hang because wakeup events get lost.

It's difficult to reproduce this bug reliably. The best we have is to run the following command:

  while libguestfs-test-tool -t 180 >& /tmp/log; do echo -n .; done

which should run forever (printing lots of dots), but where the bug is present it will sometimes fail, although often not until after many iterations.

This bug can be fixed by backporting the following upstream commits:

  main: Create a helper function for "owner wakeup" optimization
  20870240492dcca19e0b4243d99f8e0f397cdac7

  gmain: Signal wakeups if context has never been acquired as well
  0c0469b56d7e6b2533760d5d821076c88b05dfb0

  gmain: only signal GWakeup right before or during a blocking poll
  9ba95e25b74adf8d62effeaf6567074ac932811c

These commits cherry pick cleanly on top of glib2 2.50.3. I have built a scratch build containing the fix here:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13697039
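Because the reproducer in the description can run for a long time before failing, it may help to cap the number of runs and report which iteration failed. The following is only a sketch: `run_until_failure` and the `LOG` variable are hypothetical names introduced here, not part of libguestfs or of the original report.

```shell
#!/bin/bash
# Sketch: rerun a command until it fails or a cap is reached.
# run_until_failure and LOG are hypothetical names used for illustration only.
run_until_failure() {
    local max=$1
    shift
    local log=${LOG:-/tmp/log}
    local i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Overwrite the log each run so it holds the output of the failing run.
        if ! "$@" > "$log" 2>&1; then
            echo "failed on iteration $i"
            return 1
        fi
    done
    echo "no failure in $max iterations"
    return 0
}

# Possible usage with the command from the report (up to 1000 runs):
#   run_until_failure 1000 libguestfs-test-tool -t 180
```

A run that reaches the cap without failing does not prove the bug is absent, since the failure is intermittent.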
Comment 2 Richard W.M. Jones 2017-07-21 09:41:14 UTC
BTW we really need this fixed in RHEL 7.4-z.
Comment 4 Colin Walters 2017-07-21 13:04:16 UTC
Weren't the qemu patches backported to RHEL? If not, why not? I asked upstream about this and it sounded like that was going to happen quickly. Is the scenario here new base RHEL with old qemu?
Comment 5 Richard W.M. Jones 2017-07-21 16:56:55 UTC
No idea, but this affects qemu-kvm-1.5.3-141.el7_4.1.x86_64 right now.
Comment 6 Daniel Berrangé 2017-07-24 09:11:07 UTC
The upstream commit is this one:

  commit ecbddbb106114f90008024b4e6c3ba1c38d7ca0e
  Author: Richard W.M. Jones <email@example.com>
  Date: Fri Mar 31 21:51:33 2017 +0100

      main-loop: Acquire main_context lock around os_host_main_loop_wait.

This made it into the QEMU 2.9.0 release, and thus *is* present in qemu-kvm-rhev (that's used by RHEV and OpenStack). It is *not*, however, present in the qemu-kvm 1.5.3 that is shipped in base RHEL, since that version doesn't rebase and no one backported the patch to it.
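Since the fix first appeared in the upstream QEMU 2.9.0 release, an upstream version string can be compared against that release as a rough check. This is a sketch only: `version_ge` is a hypothetical helper, it assumes GNU sort with `-V`, and it says nothing about downstream packages such as qemu-kvm 1.5.3, which would only contain the fix through a backport.

```shell
#!/bin/bash
# Sketch: succeed if version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# First upstream QEMU release containing the main-loop fix, per the comment above.
FIXED_IN=2.9.0

# Example upstream version strings; downstream backports are not detected this way.
for v in 1.5.3 2.8.1 2.9.0 2.10.0; do
    if version_ge "$v" "$FIXED_IN"; then
        echo "$v: at or after $FIXED_IN"
    else
        echo "$v: before $FIXED_IN"
    fi
done
```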
Comment 7 Colin Walters 2017-07-24 13:55:05 UTC
OK, well can we at least get a backport of that patch scheduled for 7.5 for qemu?
Comment 8 Colin Walters 2017-07-24 13:59:59 UTC
I'll move this bug to 7.5, and clone it for 7.4z if that decision is made.
Comment 9 Paolo Bonzini 2017-07-24 14:15:12 UTC
It's okay to backport the fix to RHEL. All the previous discussions were centered around upstream and Fedora.
Comment 10 Paolo Bonzini 2017-07-24 14:16:03 UTC
Doesn't affect layered products.
Comment 11 Daniel Berrangé 2017-07-24 14:25:50 UTC
I opened this bug to request that the QEMU fix be backported to qemu-kvm in 7.4.z: https://bugzilla.redhat.com/show_bug.cgi?id=1474405
Comment 17 Colin Walters 2018-05-16 21:16:57 UTC
The upstream bug is still a bit stalled:

https://bugzilla.gnome.org/show_bug.cgi?id=761102

We could sync to what's in glib master today, but...since the qemu-kvm fix went out in https://bugzilla.redhat.com/show_bug.cgi?id=1473536 is it really worth the risk? Let's just tell people to upgrade qemu?