Bug 1652495 - glibc: Incorrect double-free malloc tcache check disregards tcache size
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Florian Weimer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-11-22 09:08 UTC by Yanko Kaneti
Modified: 2018-11-30 06:56 UTC

Fixed In Version: glibc-2.28.9000-19.fc30
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-30 06:56:34 UTC




Links
System            ID                  Priority  Status  Last Updated             Summary
Fedora Pagure     releng issue 7928   None      None    2018-11-22 11:28:18 UTC  None
Red Hat Bugzilla  1647395             None      CLOSED  2019-06-24 09:15:40 UTC  glibc: the execution continued with double free in the program
Sourceware        23907               None      None    2019-06-24 09:15:40 UTC  None

Internal Links: 1647395

Description Yanko Kaneti 2018-11-22 09:08:46 UTC
Description of problem:
Random crashes in realloc in different programs, starting with glibc-2.28.9000-19.

Some coredumpctl samples I have here:
One nautilus coredump:

                Stack trace of thread 17746:
                #0  0x00007f5881fb2ac9 _int_free (libc.so.6)
                #1  0x00007f5881fb4eaf _int_realloc (libc.so.6)
                #2  0x00007f5881fb622b __GI___libc_realloc (libc.so.6)
                #3  0x00007f5882fc505e g_realloc (libglib-2.0.so.0)
                #4  0x00007f5882fe21f7 g_string_maybe_expand (libglib-2.0.so.0)
                #5  0x00007f5882fe254a g_string_insert_len (libglib-2.0.so.0)
                #6  0x00007f5882faa2ae g_build_path_va (libglib-2.0.so.0)
                #7  0x00007f5882fab739 g_build_filename_va (libglib-2.0.so.0)
                #8  0x00007f588247cb5a get_thumbnail_attributes (libgio-2.0.so.0)
                #9  0x00007f588247eea4 _g_local_file_info_get (libgio-2.0.so.0)
                #10 0x00007f588247945b g_local_file_query_info (libgio-2.0.so.0)
                #11 0x00007f58823e0e08 query_info_async_thread (libgio-2.0.so.0)
                #12 0x00007f5882425a07 g_task_thread_pool_thread (libgio-2.0.so.0)
                #13 0x00007f5882fe8e93 g_thread_pool_thread_proxy (libglib-2.0.so.0)
                #14 0x00007f5882fe848a g_thread_proxy (libglib-2.0.so.0)
                #15 0x00007f58820fd583 start_thread (libpthread.so.0)
                #16 0x00007f588202c083 __clone (libc.so.6)

One evolution:
                Stack trace of thread 16382:
                #0  0x00007f40cc9beac9 _int_free (libc.so.6)
                #1  0x00007f40cc9c0eaf _int_realloc (libc.so.6)
                #2  0x00007f40cc9c222b __GI___libc_realloc (libc.so.6)
                #3  0x00007f40d040305e g_realloc (libglib-2.0.so.0)
                #4  0x00007f40d04201f7 g_string_maybe_expand (libglib-2.0.so.0)
                #5  0x00007f40d042054a g_string_insert_len (libglib-2.0.so.0)
                #6  0x00007f40d03e82ae g_build_path_va (libglib-2.0.so.0)
                #7  0x00007f40d03e9739 g_build_filename_va (libglib-2.0.so.0)
                #8  0x00007f40d0539ee3 data_cache_expire (libcamel-1.2.so.62)
                #9  0x00007f40d053a188 data_cache_path (libcamel-1.2.so.62)
                #10 0x00007f40d053aa1c camel_data_cache_get (libcamel-1.2.so.62)
                #11 0x00007f40c42e17c2 imapx_get_message_cached (libcamelimapx.so)
                #12 0x00007f40d055d0cd camel_folder_get_message_sync (libcamel-1.2.so.62)
                #13 0x00007f40d055d784 folder_get_message_thread (libcamel-1.2.so.62)
                #14 0x00007f40cfa08a07 g_task_thread_pool_thread (libgio-2.0.so.0)
                #15 0x00007f40d0426e93 g_thread_pool_thread_proxy (libglib-2.0.so.0)
                #16 0x00007f40d042648a g_thread_proxy (libglib-2.0.so.0)
                #17 0x00007f40d04d6583 start_thread (libpthread.so.0)
                #18 0x00007f40cca38083 __clone (libc.so.6)


One liferea:
                Stack trace of thread 24158:
                #0  0x00007f4ee348bac9 _int_free (libc.so.6)
                #1  0x00007f4ee348deaf _int_realloc (libc.so.6)
                #2  0x00007f4ee348f22b __GI___libc_realloc (libc.so.6)
                #3  0x00007f4e502a43fb n/a (p11-kit-trust.so)
                #4  0x00007f4e502a446a n/a (p11-kit-trust.so)
                #5  0x00007f4e502a4115 n/a (p11-kit-trust.so)
                #6  0x00007f4e502a54a1 n/a (p11-kit-trust.so)
                #7  0x00007f4e502a8be1 n/a (p11-kit-trust.so)
                #8  0x00007f4e5016622c find_cert_cb (libgnutls.so.30)
                #9  0x00007f4e5016b8e3 _pkcs11_traverse_tokens (libgnutls.so.30)
                #10 0x00007f4e5016d95b gnutls_pkcs11_crt_is_known (libgnutls.so.30)
                #11 0x00007f4e501afde6 _gnutls_pkcs11_verify_crt_status (libgnutls.so.30)
                #12 0x00007f4e501bfda9 gnutls_x509_trust_list_verify_crt2 (libgnutls.so.30)
                #13 0x00007f4e501bffa9 gnutls_x509_trust_list_verify_crt (libgnutls.so.30)
                #14 0x00007f4e502e2f73 g_tls_database_gnutls_verify_chain (libgiognutls.so)
                #15 0x00007f4e502dfc93 verify_peer_certificate (libgiognutls.so)
                #16 0x00007f4e502dffda async_handshake_thread (libgiognutls.so)
                #17 0x00007f4ee3974a07 g_task_thread_pool_thread (libgio-2.0.so.0)
                #18 0x00007f4ee37ace93 g_thread_pool_thread_proxy (libglib-2.0.so.0)
                #19 0x00007f4ee37ac48a g_thread_proxy (libglib-2.0.so.0)
                #20 0x00007f4ee35d6583 start_thread (libpthread.so.0)
                #21 0x00007f4ee3505083 __clone (libc.so.6)
                

I don't have a reproducer.  All of these use threads.

Comment 1 Florian Weimer 2018-11-22 09:36:58 UTC
Ugh, sorry about that.  Do you have a backtrace with debugging information?  Thanks.

Comment 2 Yanko Kaneti 2018-11-22 09:51:04 UTC
Here is an excerpt from the evolution crash. 

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f40cc9beac9 in _int_free (av=av@entry=0x7f4008000020, p=p@entry=0x7f4008001ff0, have_lock=have_lock@entry=1) at malloc.c:4243
4243		  if (tmp == e)
[Current thread is 1 (Thread 0x7f404d3f7700 (LWP 16382))]
Missing separate debuginfos, use: dnf debuginfo-install enchant2-2.2.3-5.fc30.x86_64 libnghttp2-1.34.0-1.fc30.x86_64 libtool-ltdl-2.4.6-27.fc30.x86_64 libxcrypt-4.4.0-1.fc30.x86_64 opensc-0.19.0-3.fc30.x86_64 pcsc-lite-libs-1.8.24-1.fc30.x86_64 sssd-client-2.0.0-5.fc30.x86_64 webkit2gtk3-2.22.3-2.fc30.x86_64 webkit2gtk3-jsc-2.22.3-2.fc30.x86_64 woff2-1.0.2-4.fc29.x86_64 xfconf-4.13.6-2.fc30.x86_64 yajl-2.1.0-11.fc29.x86_64
(gdb) bt full
#0  0x00007f40cc9beac9 in _int_free (av=av@entry=0x7f4008000020, p=p@entry=0x7f4008001ff0, have_lock=have_lock@entry=1) at malloc.c:4243
        tmp = 0x1
        tc_idx = 254
        e = 0x7f4008002000
        size = 4096
        fb = <optimized out>
        nextchunk = <optimized out>
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        __PRETTY_FUNCTION__ = "_int_free"
#1  0x00007f40cc9c0eaf in _int_realloc (av=av@entry=0x7f4008000020, oldp=oldp@entry=0x7f4008001f60, oldsize=oldsize@entry=80, nb=nb@entry=144) at malloc.c:4710
        newp = 0x7f4008001f60
        newsize = 4240
        newmem = <optimized out>
        next = 0x7f4008001fb0
        remainder = 0x7f4008001ff0
        remainder_size = 4096
        copysize = <optimized out>
        ncopies = <optimized out>
        s = <optimized out>
        d = <optimized out>
        __PRETTY_FUNCTION__ = "_int_realloc"
        nextsize = <optimized out>
#2  0x00007f40cc9c222b in __GI___libc_realloc (oldmem=0x7f4008001f70, bytes=bytes@entry=128) at malloc.c:3301
        ar_ptr = 0x7f4008000020
        nb = 144
        newp = <optimized out>
        hook = <optimized out>
        oldp = 0x7f4008001f60
        oldsize = 80
        __PRETTY_FUNCTION__ = "__libc_realloc"
#3  0x00007f40d040305e in g_realloc (mem=0x7f4008001f70, n_bytes=128) at gmem.c:164
        newmem = <optimized out>
#4  0x00007f40d04201f7 in g_string_maybe_expand (string=0x7f40ac601440, len=<optimized out>) at gstring.c:102
#5  0x00007f40d042054a in g_string_insert_len (string=0x7f40ac601440, pos=<optimized out>, val=0x7f400800411b "2549505", len=<optimized out>) at gstring.c:476
        pos = <optimized out>
        string = 0x7f40ac601440
        __func__ = "g_string_insert_len"
        len = <optimized out>
        val = 0x7f400800411b "2549505"
        __func__ = "g_string_insert_len"
#6  0x00007f40d03e82ae in g_build_path_va
    (separator=separator@entry=0x7f40d044bedf "/", first_element=first_element@entry=0x7f404d3f66b0 "/home/yaneti/.cache/evolution/mail/0/folders/INBOX/cur/00", args=args@entry=0x7f404d3f64d0, str_array=str_array@entry=0x0) at gfileutils.c:1766
        element = <optimized out>
        start = <optimized out>
        end = 0x7f4008004122 ""
        result = 0x7f40ac601440
        separator_len = <optimized out>
        is_first = 0
        have_leading = 1
        single_element = 0x0
        next_element = 0x0
        last_trailing = 0x7f4008004122 ""
        i = 0
#7  0x00007f40d03e9739 in g_build_filename_va (str_array=0x0, args=0x7f404d3f64d0, first_argument=<optimized out>) at gfileutils.c:2069
        str = <optimized out>
        str = <optimized out>
        args = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7f404d3f65b0, reg_save_area = 0x7f404d3f64f0}}
.....

Comment 3 Florian Weimer 2018-11-22 09:58:04 UTC
(In reply to Yanko Kaneti from comment #2)
> Here is an excerpt from the evolution crash. 
> 
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f40cc9beac9 in _int_free (av=av@entry=0x7f4008000020,
> p=p@entry=0x7f4008001ff0, have_lock=have_lock@entry=1) at malloc.c:4243
> 4243		  if (tmp == e)

This is in the new double-free checking code:

   4229     /* Check to see if it's already in the tcache.  */
   4230     tcache_entry *e = (tcache_entry *) chunk2mem (p);
   4231 
   4232     /* This test succeeds on double free.  However, we don't 100%
   4233        trust it (it also matches random payload data at a 1 in
   4234        2^<size_t> chance), so verify it's not an unlikely coincidence
   4235        before aborting.  */
   4236     if (__glibc_unlikely (e->key == tcache && tcache))
   4237       {
   4238         tcache_entry *tmp;
   4239         LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx);
   4240         for (tmp = tcache->entries[tc_idx];
   4241              tmp;
   4242              tmp = tmp->next)
   4243           if (tmp == e)
   4244             malloc_printerr ("free(): double free detected in tcache 2");
   4245         /* If we get here, it was a coincidence.  We've wasted a few
   4246            cycles, but don't abort.  */
   4247       }

I will try to create a reproducer, using random cross-thread reallocs, and revert the upstream patch in rawhide later today.

Comment 4 Florian Weimer 2018-11-22 11:28:19 UTC
I also see a GCC crash when rebuilding glibc itself, which could be related.  GCC is not multi-threaded.

I filed an untag request with releng: https://pagure.io/releng/issue/7928

Comment 5 Florian Weimer 2018-11-22 12:19:10 UTC
I managed to obtain the core file.  Backtrace looks very similar, and the process was NOT multi-threaded.

#10 0xf7c4faf0 in _int_free (av=av@entry=0xf7d787a0 <main_arena>, 
    p=p@entry=0xabd7958, have_lock=have_lock@entry=1) at malloc.c:4243
#11 0xf7c51cb0 in _int_realloc (av=av@entry=0xf7d787a0 <main_arena>, 
    oldp=oldp@entry=0xabd77e8, oldsize=oldsize@entry=352, nb=368) at malloc.c:4710
#12 0xf7c52c75 in __GI___libc_realloc (oldmem=0xabd77f0, bytes=352)
    at malloc.c:3292

(gdb) print tmp
$2 = (tcache_entry *) 0xd

The process is not multithreaded, which is why TLS does not work in GDB:

(gdb) print tcache
Cannot find thread-local storage for LWP 17812, shared library /lib/libc.so.6:
Cannot find thread-local variables on this target

But assuming that e->key *is* the address of the tcache, we have:

(gdb) print e->key->entries[tc_idx]
$28 = (tcache_entry *) 0xd
(gdb) print tc_idx
$31 = 109

That appears to be the issue: the index is larger than TCACHE_MAX_BINS, so tcache->entries[tc_idx] reads past the end of the array.  We need to move the check for tc_idx < mp_.tcache_bins before the double-free check.

Comment 6 Florian Weimer 2018-11-22 13:27:51 UTC
I posted what I believe is the fix upstream:

https://sourceware.org/ml/libc-alpha/2018-11/msg00577.html

Comment 7 Florian Weimer 2018-11-30 06:56:34 UTC
The upstream fix was incorporated in glibc-2.28.9000-21.fc30.

