Bug 455088

Summary: [ia64] strace -f crashes multithreaded program
Product: Red Hat Enterprise Linux 5
Reporter: Jan Kratochvil <jan.kratochvil>
Component: strace
Assignee: Roland McGrath <roland>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: urgent
Version: 5.2
CC: azelinka, francois.quenum, mmatsuya, mnowak, tao
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: ia64
OS: Linux
URL: http://sourceforge.net/mailarchive/message.php?msg_name=20080630164049.GA19501%40host0.dyn.jankratochvil.net
Doc Type: Bug Fix
Last Closed: 2009-01-20 22:09:46 UTC
Bug Depends On: 455874
Bug Blocks: 457584
Attachments:
Simple threading testcase. (flags: none)
The real `Simple threading testcase.'. (flags: none)

Description Jan Kratochvil 2008-07-11 21:20:22 UTC
Description of problem:
`strace -f' should also trace all the threads of the program (as it does on
non-ia64 arches).  On ia64 the newly created thread instead dies with SIGSEGV
as soon as strace attaches to it.

Version-Release number of selected component (if applicable):
kernel-2.6.18-92.el5.ia64
strace-4.5.16-1.el5.1.ia64

How reproducible:
Always.

Steps to Reproduce:
gcc -o threadit threadit.c -Wall -ggdb2 -pthread; strace -f ./threadit
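
The attached threadit.c is not reproduced in this report.  As a rough
illustration only, the core of a comparable testcase might look like the
sketch below.  It is an assumption inferred from the expected trace (one
thread that sleeps for a second and exits while the main thread waits for
it), not the actual attachment; the names thread_main and spawn_and_join
are hypothetical.

```c
/* Hypothetical stand-in for the core of threadit.c; NOT the attachment
 * from this bug report. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Thread body: sleep for one second, then let the thread exit. */
static void *thread_main(void *arg)
{
    (void)arg;
    sleep(1);
    return NULL;
}

/* Create one thread and wait for it to finish; returns 0 on success. */
int spawn_and_join(void)
{
    pthread_t thread;
    int err;

    err = pthread_create(&thread, NULL, thread_main, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }

    err = pthread_join(thread, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
        return 1;
    }
    return 0;
}
```

Adding `int main(void) { return spawn_and_join(); }` would turn the sketch
into a complete program that can be built and traced with the command above.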

Actual results:
...
clone2(Process 4000 attached
child_stack=0x2000000000320000, stack_size=0x9feb80,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x2000000000d1f2d0, tls=0x2000000000d1f910,
child_tidptr=0x2000000000d1f2d0) = 4000
[pid  3999] futex(0x2000000000d1f2d0, FUTEX_WAIT, 4000, NULL <unfinished ...>
[pid  4000] --- SIGSEGV (Segmentation fault) @ 200000000023af40 (3d0f00) ---
Process 4000 detached
upeek: ptrace(PTRACE_PEEKUSER,3999,2096,0): No such process
upeek: ptrace(PTRACE_PEEKUSER,3999,2240,0): No such process

Expected results:
...
clone2(Process 7859 attached
child_stack=0x2000000000320000, stack_size=0x9feb80,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x2000000000d1f2d0, tls=0x2000000000d1f910,
child_tidptr=0x2000000000d1f2d0) = 7859
[pid  7858] futex(0x2000000000d1f2d0, FUTEX_WAIT, 7859, NULL <unfinished ...>
[pid  7859] get_robust_list(0x2000000000d1f2e0, 0x18, 0) = 0
[pid  7859] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid  7859] rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
[pid  7859] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  7859] nanosleep({1, 0}, {1, 0})   = 0
[pid  7859] exit(0)                     = ?
Process 7859 detached
<... futex resumed> )                   = 0
exit_group(0)                           = ?

Additional info:
See the referenced mail:
http://sourceforge.net/mailarchive/message.php?msg_name=20080630164049.GA19501%40host0.dyn.jankratochvil.net

RHEL-4 does not crash this way; IMO the new stack frame that the kernel creates
for the new thread there just happens to have a lucky layout that avoids the
crash.  The fix would IMO still be appropriate for RHEL-4 as well.

Comment 1 Jan Kratochvil 2008-07-11 21:20:22 UTC
Created attachment 311621 [details]
Simple threading testcase.

Comment 2 RHEL Program Management 2008-07-11 21:39:31 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 6 RHEL Program Management 2008-07-17 05:54:30 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 7 Eric Bachalo 2008-07-18 15:14:42 UTC
This problem will be fixed in the RHEL 5.3 strace rebase to version 4.5.17:

http://bugzilla.redhat.com/show_bug.cgi?id=455874

Comment 8 Larry Troan 2008-07-29 23:51:14 UTC
Per Eric Bachalo...  
The 5 BZs (452501, 455088, 453438, 435444, and 454431) reported against strace
and requested for RHEL 5.3 will all be fixed in strace version 4.5.17.

Closing as a DUP of bug 455874.

*** This bug has been marked as a duplicate of 455874 ***

Comment 11 Jan Kratochvil 2008-08-01 16:08:06 UTC
Created attachment 313210 [details]
The real `Simple threading testcase.'.

I have never seen the patch in Comment 1.  I do not have that file on my hard
drive.
Here is the testcase I intended to attach in the first place.

Comment 18 errata-xmlrpc 2009-01-20 22:09:46 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0233.html