Created attachment 316450 [details]
Source for pipetest

Description of problem:

# strace -f /home/orion/src/pipetest/pipetest lsb_release
execve("/home/orion/src/pipetest/pipetest", ["/home/orion/src/pipetest/pipetes"..., "lsb_release"], [/* 38 vars */]) = 0
[ Process PID=14902 runs in 32 bit mode. ]
brk(0) = 0x902d000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=115237, ...}) = 0
mmap2(NULL, 115237, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffffffff7fbc000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@\370%\0004\0\0\0L"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1519244, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7fbb000
mmap2(0x249000, 1521232, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x249000
mmap2(0x3b7000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16e)= 0x3b7000
mmap2(0x3ba000, 9808, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x3ba000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7fba000
set_thread_area(0xffcd7740) = 0
mprotect(0x3b7000, 8192, PROT_READ) = 0
mprotect(0x245000, 4096, PROT_READ) = 0
munmap(0xf7fbc000, 115237) = 0
brk(0) = 0x902d000
brk(0x904e000) = 0x904e000
SYS_331(0xffcd6908, 0x80000, 0x3b8ff4, 0x902d008, 0x1) = 0
clone(Process 14903 attached (waiting for parent)
resume: ptrace(PTRACE_SYSCALL, ...): No such process
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0) = 14903
[pid 14902] close(4) = 0
[pid 14902] fcntl64(3, F_SETFD, 0) = 0
[pid 14902] fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 14902] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7fd8000
[pid 14902] read(3,

and here it hangs.
The process has to be killed with kill -9 to end it.

Version-Release number of selected component (if applicable):
strace-4.5.18-1.fc10.x86_64

How reproducible:
Every time
It seems to be a kernel-side problem. Bug 461552 probably has the same cause.

I tested a few kernels, and both this bug and bug 461552 first appear at the same kernel version:

2.6.27-0.166.rc0.git8 is ok
2.6.27-0.173.rc0.git11 is bad

I also tested a relatively recent upstream kernel + utrace patch, and it still has this bug.

The bug is that a child created with clone(... CLONE_PTRACE ...) no longer stops with SIGSTOP. This affects strace because it patches clone() syscalls with this flag in order to get a trap in the child immediately after the clone(), not some time after it. Now it does not get this trap.
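To make the flag patching concrete, here is a minimal sketch of the bit manipulation only. It is not strace's actual code: patch_clone_flags() and clone_exit_signal() are hypothetical helper names, and the two constants are written out with the values from the Linux <linux/sched.h> header. The real tracer additionally has to write the modified flags back into the tracee's syscall argument, which is not shown.

```c
#include <assert.h>

/* Values as defined in the Linux <linux/sched.h> header.
 * The SKETCH_ prefix marks them as local copies for this example. */
#define SKETCH_CSIGNAL      0x000000ffUL  /* low byte: signal sent on child exit */
#define SKETCH_CLONE_PTRACE 0x00002000UL  /* let tracing continue in the child */

/* Hypothetical helper: return the clone() flags a tracer would substitute
 * so that the new child is traced from its first instruction. */
unsigned long patch_clone_flags(unsigned long flags)
{
    unsigned long patched = flags | SKETCH_CLONE_PTRACE;

    /* Adding CLONE_PTRACE must not disturb the exit signal in the low byte. */
    assert((patched & SKETCH_CSIGNAL) == (flags & SKETCH_CSIGNAL));
    return patched;
}

/* Hypothetical helper: extract the exit signal encoded in the flags. */
int clone_exit_signal(unsigned long flags)
{
    return (int)(flags & SKETCH_CSIGNAL);
}
```

For the clone() in the trace above, the flags CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD are 0x01200011 on x86, so the patched value would be 0x01202011, with the exit signal still SIGCHLD (17).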
I debugged this further using the following testcase from the systemtap testsuite (built with "gcc -Os -D_GNU_SOURCE -o clone-ptrace clone-ptrace.c"):

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/clone-ptrace.c?cvsroot=systemtap

On a recent vanilla kernel + utrace patch, the testcase fails here:

    assert (WIFSTOPPED (status));
    assert (WSTOPSIG (status) == SIGSTOP);

    /* Test we can trace the new child. */
    errno = 0;
    ptrace (PTRACE_SYSCALL, grandchild, (void *) 1, (void *) 0);
    if (errno == ESRCH)
      {
        /* Expected failure - we are not the ptrace parent of the new child
           despite CLONE_PTRACE was used to create it.  Still it got at least
           its SIGSTOP due to CLONE_PTRACE.  Detected on:
           kernel-2.6.27-0.329.rc6.git2.fc10.x86_64  */
        return 1;

IOW: we got the SIGSTOP, but when we try to step one syscall, it doesn't work.

ptrace (PTRACE_SYSCALL...) fails in the kernel here: utrace_attach_task() returns -2 (NB: -2 is -ENOENT), and this makes ptrace_check_attach() return -ESRCH:

    int ptrace_check_attach(struct task_struct *child, int kill)
    {
            struct utrace_attached_engine *engine;
            struct utrace_examiner exam;
            int ret;

            engine = utrace_attach_task(child, UTRACE_ATTACH_MATCH_OPS,
                                        &ptrace_utrace_ops, NULL);
            if (IS_ERR(engine))
                    return -ESRCH;
            ...

This in turn makes sys_ptrace() fail:

    asmlinkage long sys_ptrace(long request, long pid, long addr, long data)
    {
            ...
            ret = ptrace_check_attach(child, request == PTRACE_KILL);
            if (ret < 0)
                    goto out_put_task_struct;

I haven't yet looked deeper into why utrace_attach_task() fails.
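The way -ENOENT from the attach step turns into ESRCH for the caller can be modelled in userspace. This is only an illustrative sketch, not kernel code: err_ptr(), is_err() and ptr_err() are local look-alikes of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and fake_attach() and check_attach() are hypothetical stand-ins for utrace_attach_task() and ptrace_check_attach().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095  /* same bound the kernel's err.h uses */

struct engine { int unused; };  /* hypothetical placeholder type */

/* Encode a negative errno value as a pointer in the top MAX_ERRNO addresses. */
struct engine *err_ptr(long err)
{
    assert(err < 0 && err >= -MAX_ERRNO);
    return (struct engine *)(intptr_t)err;
}

/* A pointer is an encoded error if it lies in the top MAX_ERRNO addresses. */
int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno value from an encoded pointer. */
long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Stand-in for the attach call: no matching engine yields -ENOENT. */
struct engine *fake_attach(int has_engine)
{
    static struct engine e;
    return has_engine ? &e : err_ptr(-ENOENT);
}

/* Stand-in for the check: any attach error is reported as -ESRCH,
 * so the original -ENOENT never reaches the ptrace() caller. */
int check_attach(int has_engine)
{
    struct engine *engine = fake_attach(has_engine);

    if (is_err(engine))
        return -ESRCH;
    return 0;
}
```

With no engine attached, check_attach(0) returns -ESRCH even though the underlying error was -ENOENT, which matches what the testcase sees as errno == ESRCH.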
Tested kernel-2.6.27-0.372.rc8 - seems to be fixed there.