Bug 1894045
| Summary: | Avoid crash due to race in glib event loop code | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux Advanced Virtualization | Reporter: | Martin Kletzander <mkletzan> | |
| Component: | libvirt | Assignee: | Martin Kletzander <mkletzan> | |
| Status: | CLOSED ERRATA | QA Contact: | Han Han <hhan> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 8.3 | CC: | amashah, andrew, berrange, elima, fjin, guillaume.pavese, gveitmic, hhan, jdenemar, kanderso, mkalinin, sbonazzo, virt-maint, yafu, yalzhang | |
| Target Milestone: | rc | Keywords: | Regression, ZStream | |
| Target Release: | 8.3 | Flags: | pm-rhel: mirror+ | |
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | libvirt-6.6.0-8.el8 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1915601 (view as bug list) | Environment: | ||
| Last Closed: | 2021-02-22 15:39:41 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1915601 | |||
Description
Martin Kletzander
2020-11-03 12:38:23 UTC
Questions from QE:
1. Since the chance to reproduce it is rare, is there any other way to reproduce it besides eventtest?
2. This fix is a workaround for glib versions before glib-2.63.6 (see https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1353). Will it work on glib >= 2.63.6? Will it be reverted once the minimum required version is >= glib-2.63.6?

(In reply to Han Han from comment #5)
ad 1. I don't know of any. The glib issue itself mentions it being reported only a couple of times, so I don't expect it to be very reproducible.
ad 2. It looks like it is scheduled for 2.64.0 and will be backported to 2.62.x as well, but since the fix is not a workaround, just a safer way of handling things, I do not see a reason for it to be reverted.

I am fine with this going in without much testing (basically, unless anything else breaks it's fine), but I do not know if there is a process for it. Some way to tag this BZ maybe?

Can we get this out as async as soon as it is verified?

Reproduced by running the unit test:
1. Install the src rpm of libvirt-6.6.0-7
# rpm -i LIBVIRT_6_6_SRCRPM_URL
2. Compile libvirt-6.6
# rpmbuild -bc -v rpmbuild/SPECS/libvirt.spec
(you may need to disable rbd storage in libvirt.spec and recompile if an error occurs on rbd)
3. Run eventtest
# cd /root/rpmbuild/BUILD/libvirt-6.6.0/x86_64-redhat-linux-gnu
Run eventtest until it hits an error:
# while true;do ./tests/eventtest; if [ $? -ne 0 ];then break;fi;done
(process:2738769): GLib-CRITICAL **: 20:57:16.306: source_remove_from_context: assertion 'source_list != NULL' failed
# coredumpctl -1
TIME PID UID GID SIG COREFILE EXE
Tue 2021-01-12 20:57:16 EST 2738769 0 0 11 present /root/rpmbuild/BUILD/libvirt-6.6.0/x86_64-redhat-
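
The loop in step 3 never terminates if the race does not trigger. A minimal sketch of a bounded variant, assuming a POSIX shell; the `run_until_failure` helper and the cap used in the usage comment are hypothetical and are not part of libvirt's test suite:

```shell
# run_until_failure MAX_RUNS COMMAND [ARGS...]
# Runs COMMAND repeatedly, stopping at the first non-zero exit status
# or after MAX_RUNS successful runs.
# Returns 1 if a failure was observed, 0 if every run passed.
run_until_failure() {
    max_runs=$1
    shift
    i=1
    while [ "$i" -le "$max_runs" ]; do
        # Capture the exit status without tripping "set -e" in callers.
        if "$@"; then
            status=0
        else
            status=$?
        fi
        if [ "$status" -ne 0 ]; then
            echo "run $i failed with exit status $status"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max_runs runs"
    return 0
}

# Example (hypothetical cap), run from the build directory used in step 3:
#   run_until_failure 100000 ./tests/eventtest
```

A process killed by signal 11, as in the coredumpctl output above, is typically reported by the shell as exit status 139 (128 + 11), so the reported status helps tell a crash apart from an ordinary test failure.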
Backtrace:
(gdb) bt fu
#0 0x00007f9ca01348b8 in g_source_unref_internal (source=0x55881d6a5090, context=0x55881d6a7af0, have_lock=1)
at gmain.c:2127
old_cb_data = 0x55881d6a4910
old_cb_funcs = 0x55881d664010
__func__ = "g_source_unref_internal"
#1 0x00007f9ca0134a0e in g_source_iter_next
(iter=iter@entry=0x7f9c9932a930, source=source@entry=0x7f9c9932a928) at gmain.c:980
next_source = <optimized out>
#2 0x00007f9ca01373df in g_main_context_check
(context=context@entry=0x55881d6a7af0, max_priority=200, fds=fds@entry=0x7f9c940024a0, n_fds=n_fds@entry=26)
at gmain.c:3715
source = 0x55881d6a5090
iter =
{context = 0x55881d6a7af0, may_modify = 1, current_list = 0x55881d6a4660 = {0x55881d6a4640}, source = 0x55881d6a5090}
pollrec = <optimized out>
n_ready = 0
i = <optimized out>
#3 0x00007f9ca0137a60 in g_main_context_iterate
(context=context@entry=0x55881d6a7af0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
at gmain.c:3899
max_priority = 200
timeout = 0
some_ready = <optimized out>
nfds = 26
allocated_nfds = 32
fds = 0x7f9c940024a0
#4 0x00007f9ca0137be0 in g_main_context_iteration (context=0x55881d6a7af0,
context@entry=0x0, may_block=may_block@entry=1) at gmain.c:3963
retval = <optimized out>
#5 0x00007f9ca370e3a4 in virEventGLibRunOnce () at ../../src/util/vireventglib.c:496
#6 0x000055881c416b65 in eventThreadLoop (data=<optimized out>) at ../../tests/eventtest.c:176
#7 0x00007f9c9fcce14a in start_thread (arg=<optimized out>) at pthread_create.c:479
ret = <optimized out>
pd = <optimized out>
unwind_buf =
{cancel_jmp_buf = {{jmp_buf = {140310561863424, 8770522698823025210, 140727661230654, 140727661230655, 0, 140310561860352, -8751033350172917190, -8751030135613111750}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#8 0x00007f9c9f5e4763 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
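
Every frame below virEventGLibRunOnce in the backtrace is in glib's gmain.c, and the discussion in this bug ties the underlying race to glib releases before 2.63.6. A minimal sketch for checking on which side of that boundary a version string falls, assuming a `sort` that supports `-V` (GNU coreutils does); the `version_lt` helper is a hypothetical name:

```shell
# version_lt A B: succeed if version A sorts strictly before version B.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example, assuming the glib development files and pkg-config are installed:
#   if version_lt "$(pkg-config --modversion glib-2.0)" 2.63.6; then
#       echo "glib is older than 2.63.6"
#   fi
```

The version number alone is only a hint: as noted in the discussion, the glib fix was expected to be backported to 2.62.x, and distribution packages may carry backported patches without changing the upstream version.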
Tested as in comment 19 for thousands of loops on libvirt-6.6.0-8.module+el8.3.1+8648+130818f2. Not reproduced.

Covered by unit test.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0639