Bug 1894045 - Avoid crash due to race in glib event loop code
Summary: Avoid crash due to race in glib event loop code
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 8.3
Assignee: Martin Kletzander
QA Contact: Han Han
URL:
Whiteboard:
Depends On:
Blocks: 1915601
Reported: 2020-11-03 12:38 UTC by Martin Kletzander
Modified: 2024-06-13 23:19 UTC (History)

Fixed In Version: libvirt-6.6.0-8.el8
Doc Text:
Clone Of:
: 1915601
Environment:
Last Closed: 2021-02-22 15:39:41 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5694061 0 None None None 2021-01-10 23:32:39 UTC

Description Martin Kletzander 2020-11-03 12:38:23 UTC
Description of problem:
There was a possible leak/memory corruption in glib. To guard against this kind of issue, libvirt added commit 0db4743645b7a0611a3c0687f834205c9956f7fc, which should work around it.

How reproducible:
Very very rarely (mostly in eventtest)

Steps to Reproduce:
1. Run tests
2. See eventtest segfault
3. Have mixed feelings

Actual results:
segfault

Expected results:
test pass

Additional info:
This was never seen outside of eventtest, but there are two reasons for backporting commit 0db4743645b7a0611a3c0687f834205c9956f7fc:

 1) sometimes the build fails, so we need to run a build multiple times (which is particularly annoying when the error happens almost at the end)

 2) it is safer to have this patch than not to have it in the long run, even if this does not happen in libvirtd

Comment 5 Han Han 2020-12-15 06:23:53 UTC
Questions from QE:
1. Since it reproduces only rarely, is there any other way to reproduce it besides eventtest?
2. This fix is a workaround for glib before glib-2.63.6 (commit https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1353). Will it work on glib>=2.63.6? Will it be reverted when the minimum required version is >= glib-2.63.6?

Comment 6 Martin Kletzander 2020-12-15 22:09:49 UTC
(In reply to Han Han from comment #5)
ad 1. I don't know of any.  The glib issue itself mentions it being reported only a couple of times, so I don't expect it to be very reproducible.

ad 2. It looks like it is scheduled for 2.64.0 and will be backported to 2.62.x as well, but since the fix is not a workaround, just a safer way of handling things, I do not see a reason for it to be reverted.

I am fine with this going in without much testing (basically unless anything else breaks it's fine), but I do not know if there is a process for it.  Some way to tag this BZ maybe?

Comment 7 Sandro Bonazzola 2021-01-08 07:49:06 UTC
Can we get this out as async as soon as it is verified?

Comment 19 Han Han 2021-01-13 02:38:06 UTC
A reproducer, obtained by running the unit test:
1. Install the src rpm of libvirt-6.6.0-7
# rpm -i LIBVIRT_6_6_SRCRPM_URL

2. Compile libvirt-6.6
# rpmbuild -bc -v rpmbuild/SPECS/libvirt.spec
(you may need to disable rbd storage in libvirt.spec and recompile if an error happens on rbd)

3. Run eventtest
# cd /root/rpmbuild/BUILD/libvirt-6.6.0/x86_64-redhat-linux-gnu

Run eventtest until it hits an error:
# while true;do ./tests/eventtest; if [ $? -ne 0 ];then break;fi;done


(process:2738769): GLib-CRITICAL **: 20:57:16.306: source_remove_from_context: assertion 'source_list != NULL' failed

# coredumpctl -1                                                     
TIME                            PID   UID   GID SIG COREFILE  EXE
Tue 2021-01-12 20:57:16 EST  2738769     0     0  11 present   /root/rpmbuild/BUILD/libvirt-6.6.0/x86_64-redhat-

Backtrace:
(gdb) bt fu
#0  0x00007f9ca01348b8 in g_source_unref_internal (source=0x55881d6a5090, context=0x55881d6a7af0, have_lock=1)
    at gmain.c:2127
        old_cb_data = 0x55881d6a4910
        old_cb_funcs = 0x55881d664010
        __func__ = "g_source_unref_internal"
#1  0x00007f9ca0134a0e in g_source_iter_next
    (iter=iter@entry=0x7f9c9932a930, source=source@entry=0x7f9c9932a928) at gmain.c:980
        next_source = <optimized out>
#2  0x00007f9ca01373df in g_main_context_check
    (context=context@entry=0x55881d6a7af0, max_priority=200, fds=fds@entry=0x7f9c940024a0, n_fds=n_fds@entry=26)
    at gmain.c:3715
        source = 0x55881d6a5090
        iter = 
          {context = 0x55881d6a7af0, may_modify = 1, current_list = 0x55881d6a4660 = {0x55881d6a4640}, source = 0x55881d6a5090}
        pollrec = <optimized out>
        n_ready = 0
        i = <optimized out>
#3  0x00007f9ca0137a60 in g_main_context_iterate
    (context=context@entry=0x55881d6a7af0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
    at gmain.c:3899
        max_priority = 200
        timeout = 0
        some_ready = <optimized out>
        nfds = 26
        allocated_nfds = 32
        fds = 0x7f9c940024a0
#4  0x00007f9ca0137be0 in g_main_context_iteration (context=0x55881d6a7af0, 
    context@entry=0x0, may_block=may_block@entry=1) at gmain.c:3963
        retval = <optimized out>
#5  0x00007f9ca370e3a4 in virEventGLibRunOnce () at ../../src/util/vireventglib.c:496
#6  0x000055881c416b65 in eventThreadLoop (data=<optimized out>) at ../../tests/eventtest.c:176
#7  0x00007f9c9fcce14a in start_thread (arg=<optimized out>) at pthread_create.c:479
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = 
              {cancel_jmp_buf = {{jmp_buf = {140310561863424, 8770522698823025210, 140727661230654, 140727661230655, 0, 140310561860352, -8751033350172917190, -8751030135613111750}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#8  0x00007f9c9f5e4763 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
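
Since the crash is rare, the loop from step 3 can run for a very long time, and on a fixed build it never ends. A minimal sketch of a bounded variant follows; the helper name run_until_failure and the RUNS variable are hypothetical and not part of libvirt or its test suite. It stops at the first non-zero exit status (a segfault is reported by the shell as 139) or after a given number of runs.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it fails or a cap is hit.
# Usage: run_until_failure MAX_RUNS COMMAND [ARGS...]
# Returns 0 if the command failed (problem reproduced), 1 if the cap was reached.
# The number of runs performed is left in the global variable RUNS.
run_until_failure() {
    max_runs=$1
    shift
    RUNS=0
    while [ "$RUNS" -lt "$max_runs" ]; do
        RUNS=$((RUNS + 1))
        status=0
        # Capture the exit status without tripping "set -e" if it is enabled.
        "$@" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "failure on run $RUNS (exit status $status)"
            return 0
        fi
    done
    echo "no failure in $RUNS runs"
    return 1
}
```

From the build directory in step 3 this could be called as `run_until_failure 10000 ./tests/eventtest`; the cap of 10000 is an arbitrary choice. On a build with the fix, the "no failure" result matches the verification described in the next comment.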

Comment 20 Han Han 2021-01-13 04:44:22 UTC
Tested as in comment 19 for thousands of loops on libvirt-6.6.0-8.module+el8.3.1+8648+130818f2. Not reproduced.

Comment 22 Han Han 2021-02-04 08:29:46 UTC
Covered by unit test

Comment 24 errata-xmlrpc 2021-02-22 15:39:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0639

