From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529

Description of problem:
Running "strace -f nptl/tst-fork3" hangs in idle. The strace output mentions non-existent flag CLONE_IDLETASK. The report for fork() implemented using clone() has entirely the wrong flags.

Version-Release number of selected component (if applicable):
strace-4.4.95-2

How reproducible:
Always

Steps to Reproduce:
1. Build the glibc internal testcase nptl/tst-fork3.
2. Run "strace -f nptl/tst-fork3 2>strace.out &" in background, and notice that it never exits. The non-straced tst-fork3 exits very quickly.
3. Inspect strace.out for the second call to clone(), which is implementing fork(). This instance can be identified by child_stack==0.
4. Run "grep CLONE_IDLETASK /usr/include/*/*.h".

Actual Results:
strace hangs in idle.
-----
clone(child_stack=0x40835950, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [15942], {entry_number:6, base_addr:0x40835d30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 15942
[pid 15942] clone(child_stack=0, flags=CLONE_VM|CLONE_FILES|CLONE_SIGHAND|CLONE_IDLETASK|CLONE_VFORK|CLONE_THREAD|CLONE_NEWNS|CLONE_UNTRACED|0x40000078) = 15943
### flags are totally wrong; actual value is 0x1200011.
[pid 15943] clone(child_stack=0x41036950, flags=CLONE_VM|CLONE_FILES|CLONE_SIGHAND|CLONE_IDLETASK|CLONE_VFORK|CLONE_THREAD|CLONE_NEWNS|CLONE_UNTRACED|0x40000078) = 15944
-----
No output from the grep for CLONE_IDLETASK.

Expected Results:
strace exits normally. Second clone() has only two flags, CLONE_CHILD_CLEARTID | CLONE_CHILD_SETTID, plus SIGCHLD; the value in %ebx is 0x1200011.

Additional info:
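For anyone who wants to check the expected value by hand, here is a minimal sketch that splits a raw clone() flags word into named bits plus the exit signal. The flag values are the ones from the kernel's <linux/sched.h>; the helper name decode_clone_flags is made up for this illustration and is not part of strace.

```python
# Clone flag bits as defined in the Linux kernel's <linux/sched.h>.
CLONE_FLAGS = {
    0x00000100: "CLONE_VM",
    0x00000200: "CLONE_FS",
    0x00000400: "CLONE_FILES",
    0x00000800: "CLONE_SIGHAND",
    0x00004000: "CLONE_VFORK",
    0x00008000: "CLONE_PARENT",
    0x00010000: "CLONE_THREAD",
    0x00020000: "CLONE_NEWNS",
    0x00080000: "CLONE_SETTLS",
    0x00100000: "CLONE_PARENT_SETTID",
    0x00200000: "CLONE_CHILD_CLEARTID",
    0x00400000: "CLONE_DETACHED",
    0x00800000: "CLONE_UNTRACED",
    0x01000000: "CLONE_CHILD_SETTID",
}

# The low byte of the flags word carries the signal sent to the parent on exit.
CSIGNAL = 0x000000FF


def decode_clone_flags(value):
    """Hypothetical helper: return (flag names, exit signal, leftover bits)."""
    signal = value & CSIGNAL
    rest = value & ~CSIGNAL
    names = []
    for bit in sorted(CLONE_FLAGS):
        if rest & bit:
            names.append(CLONE_FLAGS[bit])
            rest &= ~bit  # whatever remains at the end is an unknown bit
    return names, signal, rest


names, signal, unknown = decode_clone_flags(0x1200011)
# Signal 17 is SIGCHLD on i386 Linux.
print("|".join(names), "signal", signal, "unknown", hex(unknown))
# prints: CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID signal 17 unknown 0x0
```

The value 0x1200011 decodes to exactly the two flags and the SIGCHLD exit signal named under Expected Results, with no leftover bits.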
I did not see the hanging failure mode, but the bug corrupts syscall arguments and so a variety of failure modes are possible. I've found and fixed the bug in the upstream strace sources. There will be an errata version as soon as possible.
Perhaps this is why 'strace -fp <pid>' is failing for me:

# ps aux | grep httpd
# strace -fp 9627
trace: ptrace(PTRACE_SYSCALL, ...): Operation not permitted
detach: ptrace(PTRACE_DETACH, ...): Operation not permitted

Red Hat 9, strace-4.4.95-2, default 2.4.20-8 kernel, updated httpd-2.0.40-21.1
That's very possible. Using strace on strace would show you the PIDs it uses in the ptrace calls that fail. If it is using bogus PID values, then that is probably the same bug.
Something I discovered was that if I do strace -fp <pid>, and if I get that error, the process in question is put into 'T' state in a ps listing. Traced or stopped, according to the man page.

Something *really* neat occurred when I tried 'strace -fo/tmp/a strace -fp <pid>' (or somesuch)... that httpd process is not only in 'T' state, but it's unkillable, even with a kill -9.

Did you say there was a new rpm for strace available somewhere?
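A quick way to confirm the 'T' state without going through ps is to read the State: line from /proc/<pid>/status. The sketch below assumes a Linux /proc filesystem; the function names parse_state and process_state are invented for this example.

```python
def parse_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            # The line looks like "State:<tab>T (stopped)"; the letter is field 1.
            return fields[1] if len(fields) > 1 else None
    return None


def process_state(pid):
    """Read the state of a live process (assumes /proc is mounted)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_state(f.read())
```

Calling process_state(9627) for the httpd process above would be expected to return "T" while it is stuck in the traced or stopped state.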
There is a new strace version now available in rawhide. There will be an errata release for RHL9 at some point as well.