Bug 455078 - [4.7] strace -f fails to follow vfork() processes on ia64 - hangs instead - possible kernel bug?
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: strace
Version: 4.7
Hardware: ia64 Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jeff Law
QA Contact: Brian Brock
URL: http://sourceforge.net/mailarchive/me...
Whiteboard: PM_RHEL4_8
Keywords: ZStream
Depends On: 452501
Blocks: 513180
Reported: 2008-07-11 16:34 EDT by Jan Kratochvil
Modified: 2012-06-14 16:44 EDT

Doc Type: Bug Fix
Last Closed: 2012-06-14 16:44:44 EDT

Attachments
Simple vfork(2) testcase. (304 bytes, text/plain)
2008-07-11 16:34 EDT, Jan Kratochvil
Proposed backported fix from 4.5.17 (5.63 KB, patch)
2009-06-12 06:33 EDT, Denys Vlasenko

Description Jan Kratochvil 2008-07-11 16:34:22 EDT
+++ This bug was initially created as a clone of Bug #452501 +++

`strace -f ./vfork' hangs on ia64.
(`-F' is not needed on GNU/Linux, `-f' is enough to trace vfork(2).)

RHEL4-U7-re20080711.0
kernel-2.6.9-78.EL.ia64
strace-4.5.16-1.el4.2.ia64

-- Additional comment from vmayatsk@redhat.com on 2008-06-26 10:35 EST --
This happens because IA64 uses clone() instead of vfork(). I experimented with the
strace sources and found that the function setbpt() (file util.c) has the CLONE_VFORK
flag set in tcp->inst[0] in the SYS_clone/clone2 case, but not in the SYS_fork/vfork
case. When I manually remove CLONE_VFORK from inst[0] (case SYS_clone), strace -f on
IA64 works just as it does on x86_64. When I manually add 0x4000 (CLONE_VFORK) to arg0
(case SYS_fork), it hangs on x86_64 just as it does on IA64. I'm not an expert in
strace, but this looks like a bug in the strace utility.

-- Additional comment from jan.kratochvil@redhat.com on 2008-07-08 14:15 EST --
[...]
The ia64 threads fix is unrelated but posted here:
http://sourceforge.net/mailarchive/message.php?msg_name=20080630164049.GA19501%40host0.dyn.jankratochvil.net

###############################################################################

RHEL4-U7-re20080711.0 ia64
kernel-2.6.9-78.EL.ia64
strace-4.5.16-1.el4.2.ia64
$ strace -f ./vfork
...
clone(Process 16684 attached (waiting for parent)
[ hang ]

RHEL4-U7-re20080711.0 ia64
kernel-2.6.9-78.EL.ia64
strace-4.5.16-1.el4.2.ia64 + the patch above
[ This trace was copied from RHEL-5.2 Bug 452501 but it looks the same. ]
$ ./strace -f ./vfork
...
clone(Process 20396 attached
child_stack=0, flags=CLONE_VM|CLONE_VFORK|SIGCHLD) = 20396
[pid 20395] getpid()                    = 20395
...
[pid 20395] write(1, "20396 pid=20395\n", 1620396 pid=20395
) = 16
...
[pid 20395] nanosleep({1, 0},  <unfinished ...>
[pid 20396] write(1, "0 pid=20395\n", 120 pid=20395
) = 12
...
[pid 20396] nanosleep({1, 0},  <unfinished ...>
[pid 20395] <... nanosleep resumed> {1, 0}) = 0
[pid 20395] execve("/bin/true", ["/bin/true"...], [/* 47 vars */] <unfinished ...>
[pid 20396] <... nanosleep resumed> {1, 0}) = 0
[pid 20396] execve("/bin/true", ["/bin/true"...], [/* 47 vars */]) = 1
[pid 20395] <... execve resumed> )      = 1
...
[pid 20395] fstat(3,  <unfinished ...>
[pid 20396] exit_group(0)               = ?
Process 20396 detached
<... fstat resumed> {st_mode=S_IFREG|0644, st_size=58727440, ...}) = 0
--- SIGCHLD (Child exited) @ a000000000010621 (4fac) ---
mmap(NULL, 58727440, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2000000000308000
close(3)                                = 0
close(1)                                = 0
exit_group(0)                           = ?
Comment 1 Jan Kratochvil 2008-07-11 16:34:22 EDT
Created attachment 311618
Simple vfork(2) testcase.
Comment 3 Denys Vlasenko 2008-09-11 10:58:12 EDT
I have no RHEL4 ia64 machine to experiment with at the moment, but on RHEL5:

strace-4.5.16-1.el5.1 - exhibits this bug,

strace-4.5.16-1.el5_2.2 - does not.

Just FYI.
Comment 4 Denys Vlasenko 2008-10-07 10:48:00 EDT
A few more data points:

On RHEL4-U7, installed version of strace is strace-4.5.16-1.el4.2 and it exhibits the bug.

I just verified that the latest upstream release, strace-4.5.18, builds successfully on RHEL4-U7 and does not exhibit this bug.
Comment 5 RHEL Product and Program Management 2008-10-31 12:48:09 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 11 Denys Vlasenko 2009-06-12 06:33:49 EDT
Created attachment 347532
Proposed backported fix from 4.5.17

Tested to work with the testcase attached by Jan
on ia64 Red Hat Enterprise Linux AS release 4 (Nahant Update 8)
Comment 12 Roland McGrath 2009-06-16 18:42:26 EDT
That backport looks fine to me.
