Bug 1674280 - glibc: Invalid LIBC_PROBE in __pthread_timedjoin_ex can cause SIGSEGV
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: armv7l
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Florian Weimer
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1196181
Reported: 2019-02-10 18:20 UTC by Igor Raits
Modified: 2019-04-04 17:31 UTC (History)

Fixed In Version: glibc-2.29-8.fc30
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-04 17:31:17 UTC
Type: Bug
Embargoed:




Links
System      ID     Private  Priority  Status  Summary  Last Updated
Sourceware  24211  0        None      None    None     2019-04-04 17:30:58 UTC

Description Igor Raits 2019-02-10 18:20:12 UTC
(gdb) bt full
#0  0xb6f81af0 in __pthread_timedjoin_ex () from /lib/libpthread.so.0
No symbol table info available.
#1  0x005edeac in std::sys::unix::thread::Thread::join ()
No symbol table info available.
#2  0x004af7b4 in <std::thread::JoinInner<T>>::join (self=0xbefff398) at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/thread/mod.rs:1298
No locals.
#3  <std::thread::JoinHandle<T>>::join (self=...) at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/thread/mod.rs:1431
No locals.
#4  0x00491e40 in build_script_build::codegen::main () at build.rs:40
        handle = <optimized out>
        output = <optimized out>
        input = <optimized out>
        manifest_dir = <optimized out>
#5  0x0040f790 in std::rt::lang_start::{{closure}} () at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/rt.rs:74
        main = <optimized out>
#6  0x005fc270 in std::panicking::try::do_call ()
No symbol table info available.
#7  0x006051d0 in __rust_maybe_catch_panic ()
No symbol table info available.
#8  0x005fb808 in std::panic::catch_unwind ()
No symbol table info available.
#9  0x005ee45c in std::rt::lang_start_internal ()
No symbol table info available.
#10 0x00429edc in main ()
No symbol table info available.
#11 0xb6e1587c in __libc_start_main () from /lib/libc.so.6
No symbol table info available.
#12 0x00405034 in _start ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

rust-1.32.0-2.fc30.armv7hl

Comment 1 Igor Raits 2019-02-10 19:20:19 UTC
(gdb) bt full
#0  __GI___pthread_timedjoin_ex (threadid=3067290560, thread_return=0x0, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:104
        pd = 0xb6d323c0
        self = <optimized out>
        result = <optimized out>
#1  0x0069171c in std::sys::unix::thread::Thread::join ()
No symbol table info available.
#2  0x005640f8 in <std::thread::JoinInner<T>>::join (self=0xbef25428) at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/thread/mod.rs:1298
No locals.
#3  <std::thread::JoinHandle<T>>::join (self=...) at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/thread/mod.rs:1431
No locals.
#4  0x00533cb0 in build_script_build::codegen::main () at build.rs:40
        handle = <optimized out>
        output = <optimized out>
        input = <optimized out>
        manifest_dir = <optimized out>
#5  0x005337b8 in std::rt::lang_start::{{closure}} () at /builddir/build/BUILD/rustc-1.32.0-src/src/libstd/rt.rs:74
        main = <optimized out>
#6  0x0069fadc in std::panicking::try::do_call ()
No symbol table info available.
#7  0x006a89f0 in __rust_maybe_catch_panic ()
No symbol table info available.
#8  0x0069f074 in std::panic::catch_unwind ()
No symbol table info available.
#9  0x00691cd0 in std::rt::lang_start_internal ()
No symbol table info available.
#10 0x004eaa14 in main ()
No symbol table info available.
#11 0xb6d4a87c in __libc_start_main (main=0xbef25644, argc=-1226317824, argv=0xb6d4a87c <__libc_start_main+268>, init=<optimized out>, fini=0x6b1d18 <__libc_csu_fini>, rtld_fini=0xb6f13390 <_dl_fini>, stack_end=0xbef25644) at libc-start.c:308
        self = <optimized out>
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-462516321, -330796193, 7019704, 0, 4874224, 0, 0, 0, 7335336, 0 <repeats 33 times>, -1091414460, -1226317824, -1091414185, -1091414452, -1226317824, 1, -1091414460, -1227577600, 0, -1091414568, -1226444948, 61765110, 1, 0, 4, -1225598304, 1, -1091414460, -1225848560, -1225707192, -1, -1225570024}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0xb6f13298 <_dl_init+124>}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#12 0x004a6034 in _start ()

Comment 2 Josh Stone 2019-02-11 18:00:45 UTC
These backtraces just show the main thread trying to join -- waiting for another thread to complete. It's likely that the actual fault is in one of those other threads -- can you try "thread apply all bt"?

Comment 3 Josh Stone 2019-02-11 18:43:57 UTC
Discussion on IRC clarified that this is cssparser's build script, not rustc itself. And while that does start its own thread, GDB has no knowledge of it after the SIGSEGV.

FWIW, the script is using a pretty large stack for that thread:

        // We have stack overflows on Servo's CI.
        let handle = Builder::new().stack_size(128 * 1024 * 1024).spawn(move || {
            match_byte::expand(&input, &output);
        }).unwrap();

Comment 4 Igor Raits 2019-02-11 19:36:28 UTC
<mock-chroot> sh-4.4# cat > t.c << EOF
> #include <pthread.h>
> 
> void *x(void *data) {}
> 
> int main(void)
> {
>   pthread_t thread;
>   pthread_attr_t thread_attr;
>   pthread_attr_setstacksize (&thread_attr, 128 * 1024 * 1024);
>   pthread_create (&thread, &thread_attr, x, NULL);
>   pthread_join (thread, NULL);
> }
> EOF
<mock-chroot> sh-4.4# gcc t.c -lpthread
<mock-chroot> sh-4.4# ./a.out 
Segmentation fault

Comment 5 Igor Raits 2019-02-11 19:43:58 UTC
<mock-chroot> sh-4.4# cat t.c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

void *x(void *data)
{
  return NULL;
}

int
main (void)
{
  pthread_t thread;
  pthread_attr_t thread_attr;

  if (pthread_attr_init (&thread_attr))
    perror ("pthread_attr_init");
  if (pthread_attr_setstacksize (&thread_attr, 128 * 1024 * 1024))
    perror ("pthread_attr_setstacksize");
  if (pthread_create (&thread, &thread_attr, x, NULL))
    perror ("pthread_create");
  if (pthread_join (thread, NULL))
    perror ("pthread_join");

  return 0;
}
<mock-chroot> sh-4.4# gcc t.c -lpthread -g -Wall && ./a.out
Segmentation fault

Comment 6 Igor Raits 2019-02-11 19:50:06 UTC
(gdb) r
Starting program: /a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
[New Thread 0xb6e54460 (LWP 6060)]
[Thread 0xb6e54460 (LWP 6060) exited]

Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
__GI___pthread_timedjoin_ex (threadid=3068478560, thread_return=0x0, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:104
104       LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result);
(gdb) t a a bt full

Thread 1 (Thread 0xb6ff8ac0 (LWP 6057)):
#0  __GI___pthread_timedjoin_ex (threadid=3068478560, thread_return=0x0, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:104
        pd = 0xb6e54460
        self = <optimized out>
        result = <optimized out>
#1  0x00010640 in main () at t.c:22
        thread = 3068478560
        thread_attr = {__size = '\000' <repeats 13 times>, "\020\000\000\000\000\000\000\000\000\000\b", '\000' <repeats 11 times>, __align = 0}


<mock-chroot> sh-4.4# rpm -q glibc gcc binutils
glibc-2.29-6.fc30.armv7hl
gcc-9.0.1-0.4.fc30.armv7hl
binutils-2.31.1-21.fc30.armv7hl

Comment 7 Igor Raits 2019-02-11 19:52:26 UTC
The same binary (compiled inside the chroot) works fine on an F29 system (glibc-2.28-26.fc29.armv7hl).

Comment 8 Igor Raits 2019-02-11 20:03:31 UTC
Calling pthread_attr_setstacksize (&thread_attr, 41938960), or passing any larger value, makes the program segfault.
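
To narrow down a threshold like this, the reproducer can be parameterized on the stack size. The following is only a sketch: the names noop_thread and join_with_stack_size are made up for illustration and are not part of the original test case. It relies on the fact that pthread functions return an error number directly rather than setting errno.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: does nothing and returns its argument.  */
static void *
noop_thread (void *data)
{
  return data;
}

/* Hypothetical helper: create and join one thread that uses the
   requested stack size.  Returns 0 on success, otherwise the first
   pthread error number encountered.  */
static int
join_with_stack_size (size_t stack_bytes)
{
  pthread_t thread;
  pthread_attr_t attr;

  /* The attribute object must be initialized before it is used.  */
  int err = pthread_attr_init (&attr);
  if (err != 0)
    return err;

  err = pthread_attr_setstacksize (&attr, stack_bytes);
  if (err == 0)
    err = pthread_create (&thread, &attr, noop_thread, NULL);
  if (err == 0)
    err = pthread_join (thread, NULL);

  pthread_attr_destroy (&attr);
  return err;
}
```

Calling join_with_stack_size from a small main with sizes just below and at 41938960 would be one way to bisect the cut-off. On an affected glibc build the larger sizes would be expected to crash inside pthread_join instead of returning.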

Comment 9 Florian Weimer 2019-02-11 20:34:27 UTC
(In reply to Igor Gnatenko from comment #8)
> Calling pthread_attr_setstacksize (&thread_attr, 41938960), or passing any
> larger value, makes the program segfault.

This is the cut-off point where the stack (and the thread descriptor) is freed to stay within the thread stack cache limit:

      /* Free the TCB.  */
      __free_tcb (pd);
    }
  else
    pd->joinid = NULL;

  LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result);

With the fix for bug 1196181, we need to load pd->result into a register on Arm, so the probe now actually dereferences pd, which may already have been freed at this point.  But this probe is buggy on all architectures.  It should use result, not pd->result.
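
The general pattern behind this kind of bug can be modelled outside glibc. The sketch below is not glibc code and is not the patch that was applied. struct fake_pd, fake_probe and fake_join_tail are hypothetical stand-ins for the thread descriptor, LIBC_PROBE and the tail of the join function. It only illustrates the safe ordering: copy whatever the probe needs while the descriptor is still valid, then free it, then call the probe with the saved copy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the thread descriptor.  */
struct fake_pd
{
  void *result;                 /* value returned by the thread function */
};

/* Hypothetical stand-in for LIBC_PROBE: remembers the last value seen.  */
static void *last_probe_value;

static void
fake_probe (unsigned long threadid, int status, void *value)
{
  (void) threadid;
  (void) status;
  last_probe_value = value;
}

/* Hypothetical join tail.  On success (status == 0) the descriptor is
   freed, so nothing after that point may read through pd.  */
static int
fake_join_tail (struct fake_pd *pd, unsigned long threadid,
                void **thread_return, int status)
{
  assert (pd != NULL);

  /* Read the field while pd is still valid.  */
  void *pd_result = pd->result;

  if (status == 0)
    {
      if (thread_return != NULL)
        *thread_return = pd_result;
      free (pd);                /* models __free_tcb (pd) */
    }

  /* Passing pd->result here instead would read freed memory.  */
  fake_probe (threadid, status, pd_result);
  return status;
}
```

In this model the probe only ever sees the saved local, so it does not matter whether the descriptor was cached or released before the probe runs.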

Comment 10 Florian Weimer 2019-02-11 20:53:16 UTC
glibc-2.29-7.fc30 should have a fix for this once the build completes.

Comment 11 Igor Raits 2019-02-11 20:53:53 UTC
(In reply to Florian Weimer from comment #10)
> glibc-2.29-7.fc30 should have a fix for this once the build completes.

Thanks a lot!

Comment 12 Florian Weimer 2019-02-11 21:22:56 UTC
The patch is slightly buggy (the probe value isn't correct), but it will make the crash go away.

Comment 13 Florian Weimer 2019-02-19 07:48:22 UTC
Official upstream fix applied in glibc-2.29-8.fc30.

