Description of problem: esp and ebx gets swapped around during a syscall causing some confusion for introspection tools like valgrind. Any tool that would intercept syscalls and needs the user stack/backtrace will probably have some problems. Version-Release number of selected component (if applicable): glibc-2.19.90-33.fc21.1.i686 How reproducible: Always. Steps to Reproduce: 1. Run valgrind on any executable, see lots of warnings about client switching stacks. Additional info: The problem the TLS_INIT_TP macro called from init_tls () at rtld.c:631 TLS_INIT_TP is defined in sysdeps/i386/nptl/tls.h and contains: /* Install the TLS. */ \ asm volatile (TLS_LOAD_EBX \ "int $0x80\n\t" \ TLS_LOAD_EBX \ : "=a" (_result), "=m" (_segdescr.desc.entry_number) \ : "0" (__NR_set_thread_area), \ TLS_EBX_ARG (&_segdescr.desc), "m" (_segdescr.desc)); \ Which gets turned into: 0x04000a1c <+274>: mov $0xf3,%eax 0x04000a21 <+279>: movl $0xfffff,0x8(%esp) 0x04000a29 <+287>: movl $0x51,0xc(%esp) => 0x04000a31 <+295>: xchg %esp,%ebx 0x04000a33 <+297>: int $0x80 0x04000a35 <+299>: xchg %esp,%ebx 0x04000a37 <+301>: test %eax,%eax 0x04000a39 <+303>: jne 0x4000a52 <init_tls+328> Note that previous versions (the glibc in f20) use ecx instead of esp which isn't a problem. Using and replacing esp here is really inconvenient and breaks some assumptions of tools that might intercept syscalls. Please use any other register.
Created attachment 930003 [details] glibc.spec add glibc-rh1133134-i386-tlsinit.patch Using INTERNAL_SYSCALL instead of hand written asm to call set_thread_area solves it for me. With the attached patch the code comes out as: a1e: c7 44 24 08 ff ff 0f movl $0xfffff,0x8(%esp) a25: 00 a26: c7 44 24 0c 51 00 00 movl $0x51,0xc(%esp) a2d: 00 a2e: 87 cb xchg %ecx,%ebx a30: b8 f3 00 00 00 mov $0xf3,%eax a35: cd 80 int $0x80 a37: 87 cb xchg %ecx,%ebx a39: 85 c0 test %eax,%eax a3b: 75 17 jne a54 <init_tls+0x14a> And valgrind seems to work again as expected.
Created attachment 930372 [details] rtld.i Preprocessed sources, Compile with “gcc -std=gnu99 -S -O2 -o- rtld.i”. Reproduces with gcc-4.9.1-7.fc22.i686: movl $243, %eax movl $1048575, 8(%esp) movl $81, 12(%esp) #APP # 631 "rtld.c" 1 xchgl %esp, %ebx int $0x80 xchgl %esp, %ebx # 0 "" 2 #NO_APP testl %eax, %eax jne .L215 Relevant source code after preprocessing: const char *lossage = ({ void *_thrdescr = (tcbp); tcbhead_t *_head = _thrdescr; union user_desc_init _segdescr; int _result; _head->tcb = _thrdescr; _head->self = _thrdescr; _head->sysinfo = _rtld_local_ro._dl_sysinfo; tls_fill_user_desc (&_segdescr, -1, _thrdescr); asm volatile ("xchgl %3, %%ebx\n\t" "int $0x80\n\t" "xchgl %3, %%ebx\n\t" : "=a" (_result), "=m" (_segdescr.desc.entry_number) : "0" (243), "r" (&_segdescr.desc), "m" (_segdescr.desc)); if (_result == 0) __asm ("movl %0, %%gs" :: "q" (_segdescr.desc.entry_number * 8 + 3)); _result == 0 ? ((void *)0) : "set_thread_area failed when setting up thread-local storage\n"; }); The constraints look reasonable. I'll try to reproduce this with upstream GCC.
Mark's patch looks good to me to work around this in glibc even if it turns out to be a gcc bug. I'll push it in rawhide soon, but we eventually need to get it upstream too. Posting it upstream now would be useless though since it'll probably just be ignored till 2.20 is released.
Okay, the penny dropped. This is a glibc bug, and using INTERNAL_SYSCALL is the correct fix. _segdescr.desc happens to be at the top of the stack, so its address is in %esp. The asm statement says that %3 is an input, so its value will not change, and GCC can use %esp as the input register for the expression &_segdescr.desc. But the constraints do not fully describe the asm statement because the %3 register is actually modified, albeit only temporarily. I think signals are still blocked when this code runs (otherwise there would be a race condition between calling the signal handler and saving and restoring %esp), so this bug is essentially harmless and only visible with valgrind and perhaps when single-stepping through the function with a debugger.
Posted upstream: https://sourceware.org/ml/libc-alpha/2014-08/msg00417.html
Should be fixed in rawhide and f21.