Bug 1133134 - init_tls switches around esp around set_thread_area syscall
Summary: init_tls switches around esp around set_thread_area syscall
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: 21
Hardware: i686
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-08-22 20:48 UTC by Mark Wielaard
Modified: 2014-08-26 13:19 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-26 13:19:55 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
glibc.spec add glibc-rh1133134-i386-tlsinit.patch (2.63 KB, patch)
2014-08-23 20:16 UTC, Mark Wielaard
no flags Details | Diff
rtld.i (393.79 KB, text/plain)
2014-08-25 08:32 UTC, Florian Weimer
no flags Details

Description Mark Wielaard 2014-08-22 20:48:56 UTC
Description of problem:

esp and ebx gets swapped around during a syscall causing some confusion for introspection tools like valgrind. Any tool that would intercept syscalls and needs the user stack/backtrace will probably have some problems.

Version-Release number of selected component (if applicable):

glibc-2.19.90-33.fc21.1.i686

How reproducible:

Always.

Steps to Reproduce:
1. Run valgrind on any executable, see lots of warnings about client switching stacks.

Additional info:

The problem the TLS_INIT_TP macro called from init_tls () at rtld.c:631

TLS_INIT_TP is defined in sysdeps/i386/nptl/tls.h and contains:

     /* Install the TLS.  */                                                  \
     asm volatile (TLS_LOAD_EBX                                               \
                   "int $0x80\n\t"                                            \
                   TLS_LOAD_EBX                                               \
                   : "=a" (_result), "=m" (_segdescr.desc.entry_number)       \
                   : "0" (__NR_set_thread_area),                              \
                     TLS_EBX_ARG (&_segdescr.desc), "m" (_segdescr.desc));    \

Which gets turned into:

   0x04000a1c <+274>: mov    $0xf3,%eax
   0x04000a21 <+279>: movl   $0xfffff,0x8(%esp)
   0x04000a29 <+287>: movl   $0x51,0xc(%esp)
=> 0x04000a31 <+295>: xchg   %esp,%ebx
   0x04000a33 <+297>: int    $0x80
   0x04000a35 <+299>: xchg   %esp,%ebx
   0x04000a37 <+301>: test   %eax,%eax
   0x04000a39 <+303>: jne    0x4000a52 <init_tls+328>

Note that previous versions (the glibc in f20) use ecx instead of esp which isn't a problem.

Using and replacing esp here is really inconvenient and breaks some assumptions of tools that might intercept syscalls. Please use any other register.

Comment 1 Mark Wielaard 2014-08-23 20:16:19 UTC
Created attachment 930003 [details]
glibc.spec add glibc-rh1133134-i386-tlsinit.patch

Using INTERNAL_SYSCALL instead of hand written asm to call set_thread_area solves it for me. With the attached patch the code comes out as:

     a1e:	c7 44 24 08 ff ff 0f 	movl   $0xfffff,0x8(%esp)
     a25:	00 
     a26:	c7 44 24 0c 51 00 00 	movl   $0x51,0xc(%esp)
     a2d:	00 
     a2e:	87 cb                	xchg   %ecx,%ebx
     a30:	b8 f3 00 00 00       	mov    $0xf3,%eax
     a35:	cd 80                	int    $0x80
     a37:	87 cb                	xchg   %ecx,%ebx
     a39:	85 c0                	test   %eax,%eax
     a3b:	75 17                	jne    a54 <init_tls+0x14a>

And valgrind seems to work again as expected.

Comment 2 Florian Weimer 2014-08-25 08:32:10 UTC
Created attachment 930372 [details]
rtld.i

Preprocessed sources,  Compile with “gcc -std=gnu99 -S -O2 -o- rtld.i”.  Reproduces with gcc-4.9.1-7.fc22.i686:

	movl	$243, %eax
	movl	$1048575, 8(%esp)
	movl	$81, 12(%esp)
#APP
# 631 "rtld.c" 1
	xchgl %esp, %ebx
	int $0x80
	xchgl %esp, %ebx
	
# 0 "" 2
#NO_APP
	testl	%eax, %eax
	jne	.L215

Relevant source code after preprocessing:

  const char *lossage = ({
    void *_thrdescr = (tcbp);
    tcbhead_t *_head = _thrdescr;
    union user_desc_init _segdescr;
    int _result;

    _head->tcb = _thrdescr;
    _head->self = _thrdescr;
    _head->sysinfo = _rtld_local_ro._dl_sysinfo;
    tls_fill_user_desc (&_segdescr, -1, _thrdescr);
    asm volatile
     ("xchgl %3, %%ebx\n\t"
      "int $0x80\n\t"
      "xchgl %3, %%ebx\n\t"
      : "=a" (_result), "=m" (_segdescr.desc.entry_number)
      : "0" (243), "r" (&_segdescr.desc), "m" (_segdescr.desc));
    if (_result == 0)
      __asm
       ("movl %0, %%gs"
        :: "q" (_segdescr.desc.entry_number * 8 + 3));
    _result == 0 ? ((void *)0)
      : "set_thread_area failed when setting up thread-local storage\n";
  });

The constraints look reasonable.  I'll try to reproduce this with upstream GCC.

Comment 3 Siddhesh Poyarekar 2014-08-25 09:09:19 UTC
Mark's patch looks good to me to work around this in glibc even if it turns out to be a gcc bug.  I'll push it in rawhide soon, but we eventually need to get it upstream too.  Posting it upstream now would be useless though since it'll probably just be ignored till 2.20 is released.

Comment 4 Florian Weimer 2014-08-25 12:34:27 UTC
Okay, the penny dropped.  This is a glibc bug, and using INTERNAL_SYSCALL is the correct fix.

_segdescr.desc happens to be at the top of the stack, so its address is in %esp.  The asm statement says that %3 is an input, so its value will not change, and GCC can use %esp as the input register for the expression &_segdescr.desc.  But the constraints do not fully describe the asm statement because the %3 register is actually modified, albeit only temporarily.

I think signals are still blocked when this code runs (otherwise there would be a race condition between calling the signal handler and saving and restoring %esp), so this bug is essentially harmless and only visible with valgrind and perhaps when single-stepping through the function with a debugger.

Comment 5 Mark Wielaard 2014-08-26 08:33:22 UTC
Posted upstream: https://sourceware.org/ml/libc-alpha/2014-08/msg00417.html

Comment 6 Siddhesh Poyarekar 2014-08-26 13:19:55 UTC
Should be fixed in rawhide and f21.


Note You need to log in before you can comment on or make changes to this bug.