Bug 654636

Summary:	ltrace hits unexpected breakpoint after execve
Product:	[Fedora] Fedora	Reporter:	Michael Chapman <redhat-bugzilla>
Component:	ltrace	Assignee:	Petr Machata <pmachata>
Status:	CLOSED ERRATA	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	low
Version:	14	CC:	jlieskov, kdudka, mnewsome, ovasik, pmachata, twaugh
Target Milestone:	---
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:	ltrace-0.6.0-1.fc16	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-08-30 20:43:24 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Michael Chapman 2010-11-18 13:46:07 UTC

Description of problem:

After a process being traced by ltrace calls execve, ltrace logs an "unexpected breakpoint" error. Most of the time, the process segfaults at this point.

Version-Release number of selected component (if applicable):

- ltrace-0.5-16.45svn.fc14.i686

How reproducible: Always

Steps to Reproduce:
1. ltrace -i nice true
  
Actual results:

[0x8048d11] __libc_start_main(0x80490b0, 2, 0xbfbcd914, 0x804bf40, 0x804bfa0 <unfinished ...>
[0x80495f9] strrchr("nice", '/')                                     = NULL
[0x80490f5] setlocale(6, "")                                         = "en_AU.UTF-8"
[0x8049107] bindtextdomain("coreutils", "/usr/share/locale")         = "/usr/share/locale"
[0x804910f] textdomain("coreutils")                                  = "coreutils"
[0x804bfe5] __cxa_atexit(0x8049f00, 0, 0, 0x804e980, 0xbfbcd868)     = 0
[0x80492e6] getopt_long(2, 0xbfbcd914, "+n:", 0x804e820, NULL)       = -1
[0x80493a9] __errno_location()                                       = 0xb78a5688
[0x80493c7] getpriority(0, 0, 0x804c471, 0x804e820, 0)               = 0
[0x80493f1] setpriority(0, 0, 10, 0x804e820, 0)                      = 0
[0x8049468] execvp(0xbfbcf2fe, 0xbfbcd918, 10, 0x804e820, 0 <unfinished ...>
[0x950870] unexpected breakpoint at 0x95086f
[0x1] _exit(-1076784386 <unfinished ...>
[0x804b300] strrchr("U\211\345WV1\366S\350\016\222", '\034' <unfinished ...>
[0x8048b6c] --- SIGSEGV (Segmentation fault) ---
[0xffffffff] +++ killed by SIGSEGV +++

Additional info:

An example that doesn't immediately segfault:

$ ( sleep 1 ; exec true ) & ltrace -i -p $!
...
[0x80714e1] execve(0xa030ca0, 0xa032060, 0x9e79df8, 0x8096700, 0 <unfinished ...>
[0x950870] unexpected breakpoint at 0x95086f
[0xffffffff] +++ exited (status 0) +++

On my system it appears that the unexpected breakpoint address is always 0x95086f.

Comment 1 Jan Lieskovsky 2010-11-18 14:44:06 UTC

Hello Michael,

  thank you for your report. Just FYI, the responsible
tool for above crash is not /usr/bin/ltrace, but rather
/bin/true:

$ ltrace -i nice true
[..]
[0x8048a9c] --- SIGSEGV (Segmentation fault) ---
[0xffffffff] +++ killed by SIGSEGV +++

$ gdb -q ltrace core.18109 

warning: core file may not match specified executable file.
Core was generated by `true'.
Program terminated with signal 11, Segmentation fault.
[..]

=> reassigning to proper component.

Thanks && Regards, Jan.
--
Jan iankko Lieskovsky / Red Hat Security Response Team

Comment 2 Ondrej Vasik 2010-11-18 15:41:16 UTC

Core was generated by `true'.
Program terminated with signal 11, Segmentation fault.
#0  0x08048af8 in _start ()
...Dump of assembler code for function _start:
0x08048af0 <_start+0>:	xor    %ebp,%ebp
0x08048af2 <_start+2>:	pop    %esi
0x08048af3 <_start+3>:	mov    %ecx,%esp
0x08048af5 <_start+5>:	and    $0xfffffff0,%esp
0x08048af8 <_start+8>:	push   %eax
0x08048af9 <_start+9>:	push   %esp
0x08048afa <_start+10>:	push   %edx
0x08048afb <_start+11>:	push   $0x804b410
0x08048b00 <_start+16>:	push   $0xcc04b420
0x08048b05 <_start+21>:	push   %ecx
0x08048b06 <_start+22>:	push   %esi
0x08048b07 <_start+23>:	push   $0x8048e00
0x08048b0c <_start+28>:	call   0x80489a0 <__libc_start_main@plt>
0x08048b11 <_start+33>:	hlt    
0x08048b12 <_start+34>:	nop
0x08048b13 <_start+35>:	nop
0x08048b14 <_start+36>:	int3   
0x08048b15 <_start+37>:	nop
0x08048b16 <_start+38>:	nop
...
So true failed to even start ... maybe more glibc issue?

Comment 3 Michael Chapman 2010-11-19 06:31:19 UTC

But note that your disassembly contains three int3 instructions injected by ltrace:

0x08048af3 <_start+3>: mov    %ecx,%esp
and
0x08048b00 <_start+16>: push   $0xcc04b420
and
0x08048b14 <_start+36>: int3   

In the first case, the instruction should actually be "move %esp,%ecx", in the second the address should be 0x8004b420 (0xcc is the int3 opcode). The third is not reachable.

What seems to be happening is that ltrace is not loading the new program's PLT; it's simply reapplying the breakpoints at the locations determined for the old program.

This is also evident from if ltrace is run with debugging turned up ("ltrace -dd nice true"):

...
execvp(0xbfa15889, 0xbfa139e8, 10, 0x804e820, 0 <unfinished ...>
DEBUG: breakpoints.c:35: insert_breakpoint(): insert_breakpoint(symbol=(nil), addr=0x8049468)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8049468)
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048c40)
DEBUG: process_event.c:119: process_event(): event: breakpoint
DEBUG: process_event.c:239: process_breakpoint(): event: breakpoint (0xa14aff)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048c40)
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_execve [11])
DEBUG: breakpoints.c:128: disable_all_breakpoints(): Disabling breakpoints for pid 1759...
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048a80)
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048a90)
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048aa0)
...
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048cd0)
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048ce0)
DEBUG: breakpoint.c:55: disable_breakpoint(): disable_breakpoint(1759,0x8048d11)
DEBUG: process_event.c:115: process_event(): event: sysret (SYS_execve [11])
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_execve [11])
DEBUG: process_event.c:115: process_event(): event: sysret (SYS_execve [11])
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_execve [11])
DEBUG: process_event.c:115: process_event(): event: sysret (SYS_execve [11])
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_execve [11])
DEBUG: process_event.c:115: process_event(): event: sysret (SYS_execve [11])
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_execve [11])
DEBUG: process_event.c:115: process_event(): event: sysret (SYS_execve [11])
DEBUG: process_event.c:119: process_event(): event: breakpoint
DEBUG: process_event.c:239: process_breakpoint(): event: breakpoint (0x95086f)
unexpected breakpoint at 0x95086f
DEBUG: process_event.c:109: process_event(): event: syscall (SYS_brk [45])
DEBUG: breakpoints.c:109: enable_all_breakpoints(): Enabling breakpoints for pid 1759...
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048a80)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048a90)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048aa0)
...
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048cd0)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048ce0)
DEBUG: breakpoint.c:25: enable_breakpoint(): enable_breakpoint(1759,0x8048d11)
...
_exit(-1079609207 <unfinished ...>
...

(My addresses are slightly different from the ones in you disassembly...)

What appears to be happening is that the "unexpected breakpoint" is occurring immediately after the return from SYS_execve, and since it's being treated as an error the *old* breakpoints are being reinstated.

As far as I can see, however, the address of the breakpoint is not touched by ltrace since it gets it from the kernel -- get_instruction_pointer in sysdeps/linux-gnu/i386/regs.c uses ptrace to get it from the process' USER area.

Either ltrace is not calling this, so it's just using whatever random data happens to be left in event.proc->instruction_pointer, or it *is* calling this and ptrace (and, I guess, the kernel) is returning the wrong value.

Comment 4 Petr Machata 2010-11-19 12:33:05 UTC

That true segfaults is hard to believe.  Either there's something major going on in toolchain, or ltrace screws it.  Somehow I'd rather believe the second option.  I should update ltrace to something way more decent shortly, new release is in the making.

I'm taking the bug back to ltrace.

Comment 5 Fedora Update System 2011-08-11 13:29:59 UTC

ltrace-0.6.0-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/ltrace-0.6.0-1.fc16

Comment 6 Fedora Update System 2011-08-12 04:21:23 UTC

Package ltrace-0.6.0-1.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing ltrace-0.6.0-1.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/ltrace-0.6.0-1.fc16
then log in and leave karma (feedback).

Comment 7 Fedora Update System 2011-08-30 20:43:17 UTC

ltrace-0.6.0-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.