Description of problem:
This bug is present in the same way on upstream (vanilla) kernels.
After a `strace -p PID' session, the next attach goes wrong: the tracer sees an excessive SIGTRAP.

Version-Release number of selected component (if applicable):
kernel-2.6.9-55.0.2.EL.x86_64
kernel-2.6.9-57.EL.x86_64
2.6.22-rc4-git7 (upstream)
[ Fixed in F-7 / utrace kernels! ]

How reproducible:
Always.

Steps to Reproduce:
1. cat >t.c
   #include <sys/stat.h>
   int main(int argc, char *argv[])
   {
     struct stat b1;
     while(1) {
       stat("/", &b1);
     }
   }
2. gcc -o t t.c -Wall -ggdb2
3. ./t& pid=$!
4. strace -p $pid   # break it after several syscalls
5. strace -o x -q gdb -p $pid

Actual results:
upstream gdb-6.6:
GNU gdb 6.6
...
Attaching to process 2895
linux-nat.c:1026: internal-error: linux_nat_attach: Assertion `pid == GET_PID (inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.

RHEL-4.5 GDB gdb-6.3.0.0-1.143.el4 attaches but later leaves the inferior Stopped.

The problem is:
ptrace(PTRACE_ATTACH, 20073, 0, 0) = 0
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], 0, NULL) = 20073

even though the previous strace session did the following (artificially assembled dump):
ptrace(PTRACE_ATTACH, 5592, 0x1, 0) = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], __WALL, NULL) = 5592
ptrace(PTRACE_SYSCALL, 5592, 0x1, SIG_0) = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5592
ptrace(PTRACE_SYSCALL, 5592, 0x1, SIG_0) = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5592
ptrace(PTRACE_DETACH, 5592, 0x1, SIG_0) = 0

Expected results:
The first wait4() after the second PTRACE_ATTACH should return SIGSTOP.

Additional info:
Because of the GDB fix for Bug 233746, GDB no longer just leaves the process stopped; it will now abort its run:
GNU gdb Red Hat Linux (6.3.0.0-1.153.el4rh)
Attaching to process 20883
Redelivering pending Trace/breakpoint trap.
Redelivering pending Trace/breakpoint trap.
Program process 0 exited: Unknown signal 0 (terminated)
/root/jkratoch/redhat/20883: No such file or directory.
(gdb) q
[1]+ Trace/breakpoint trap ./a.out
$ _

Attached testcase:
98 bad in 200 iterations - may occur in 1 of 2 runs: 49.00% * 2 = 98.00% FAIL

Although it is an upstream problem, it may now hit customers more severely through GDB than before.
Is it viable to fix it in the RHEL-4.6 kernel ptrace?
A GDB exception to ignore SIGTRAP during PTRACE_ATTACH is possible but definitely not right.
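For illustration only (this is not the attached testcase): the attach / PTRACE_SYSCALL / detach / re-attach sequence from the dumps above could be driven from C roughly as follows. The function names classify_attach_status, attach_and_wait and reattach_check are hypothetical, and the sketch assumes a Linux system where the caller is allowed to ptrace the target PID.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: classify the status reported by the first wait
   after PTRACE_ATTACH.  Returns 0 for the expected SIGSTOP, 1 for the
   stale SIGTRAP described in this report, -1 for anything else. */
int classify_attach_status(int status)
{
    if (!WIFSTOPPED(status))
        return -1;
    if (WSTOPSIG(status) == SIGSTOP)
        return 0;
    if (WSTOPSIG(status) == SIGTRAP)
        return 1;
    return -1;
}

/* Attach to pid and wait for its first stop.  Returns 0 on success. */
int attach_and_wait(pid_t pid, int *status)
{
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, status, __WALL) != pid)
        return -1;
    return 0;
}

/* Hypothetical reproducer: mimic the strace session (attach, a few
   syscall stops, detach), then attach again the way gdb does and
   classify the first stop that is reported. */
int reattach_check(pid_t pid, int syscall_stops)
{
    int status;
    int result;

    if (attach_and_wait(pid, &status) == -1)
        return -1;

    for (int i = 0; i < syscall_stops; i++) {
        /* Resume until the next syscall entry or exit, delivering no signal. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) == -1)
            return -1;
        if (waitpid(pid, &status, __WALL) != pid || !WIFSTOPPED(status))
            return -1;
    }

    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
        return -1;

    /* Second attach: per "Expected results" this stop should be SIGSTOP. */
    if (attach_and_wait(pid, &status) == -1)
        return -1;
    result = classify_attach_status(status);

    /* Let the target run again; errors are ignored in this sketch. */
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return result;
}
```

Called from a small main() as reattach_check(pid, 3) against the ./t process from the steps above, a return value of 1 would correspond to the failure reported here and 0 to the expected behaviour.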
Created attachment 186401 [details] Kernel testcase.
(Sorry for the ping, but the Bugzilla notification mail for this NEW bug got lost on my side.)
I think I know from the kernel source what is going on here. Can you verify that this never happens on i386 in the upstream kernel? The bug is actually arch-specific, and of the code I've checked, only i386 has the necessary bits in its function.
Created attachment 187211 [details]
rhel4 backport of patch posted upstream

This applies, and I expect it works right for all RHEL4 architectures, but I have not tried it. It's a subset of the upstream patch I've posted.
committed in stream U6 build 59. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0791.html