276861 – ptrace: SIGTRAP on second PTRACE_ATTACH

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.6
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	Roland McGrath
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	276091

Reported:	2007-09-04 18:14 UTC by Jan Kratochvil
Modified:	2007-11-30 22:07 UTC
CC List:	2 users
Fixed In Version:	RHBA-2007-0791
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-11-15 16:32:14 UTC
Target Upstream Version:
Embargoed:

Attachments
Kernel testcase. (3.35 KB, text/plain) 2007-09-04 18:14 UTC, Jan Kratochvil
rhel4 backport of patch posted upstream (400 bytes, patch) 2007-09-05 10:07 UTC, Roland McGrath

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2007:0791	0	normal	SHIPPED_LIVE	Updated kernel packages available for Red Hat Enterprise Linux 4 Update 6	2007-11-14 18:25:55 UTC

Description Jan Kratochvil 2007-09-04 18:14:10 UTC

Description of problem:
This bug is also present in upstream (vanilla) kernels.
After a `strace -p PID' session, the next attach goes wrong: the tracer sees an
excessive SIGTRAP.

Version-Release number of selected component (if applicable):
kernel-2.6.9-55.0.2.EL.x86_64
kernel-2.6.9-57.EL.x86_64
2.6.22-rc4-git7 (upstream)
[ Fixed in F-7 / utrace kernels! ]

How reproducible:
Always.

Steps to Reproduce:
1. cat >t.c
#include <sys/stat.h>
int main(int argc, char *argv[])
{
  struct stat b1;
  while(1) { stat("/", &b1); }
}
2. gcc -o t t.c -Wall -ggdb2
3. ./t& pid=$!
4. strace -p $pid  # break it after several syscalls
5. strace -o x -q gdb -p $pid

Actual results:
upstream gdb-6.6:
GNU gdb 6.6
...
Attaching to process 2895
linux-nat.c:1026: internal-error: linux_nat_attach: Assertion `pid == GET_PID
(inferior_ptid) && WIFSTOPPED (status) && WSTOPSIG (status) == SIGSTOP' failed.

RHEL-4.5 GDB gdb-6.3.0.0-1.143.el4
attaches but later leaves the inferior Stopped.

The problem is that the first stop reported after the attach is SIGTRAP:
ptrace(PTRACE_ATTACH, 20073, 0, 0)      = 0
wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], 0, NULL) = 20073

This happens even though the previous strace session ended with a normal
PTRACE_DETACH (artificially assembled dump):
ptrace(PTRACE_ATTACH, 5592, 0x1, 0)     = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], __WALL, NULL) = 5592
ptrace(PTRACE_SYSCALL, 5592, 0x1, SIG_0) = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5592
ptrace(PTRACE_SYSCALL, 5592, 0x1, SIG_0) = 0
wait4(4294967295, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5592
ptrace(PTRACE_DETACH, 5592, 0x1, SIG_0) = 0

Expected results:
The first wait4() after the second PTRACE_ATTACH should return SIGSTOP.
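
The sequence above can be sketched in C roughly as follows. This is only a minimal illustration, not the attached testcase: the names first_stop, classify_first_stop, trace_syscalls_then_detach and second_attach_first_stop are hypothetical, and error handling is reduced to the bare minimum.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical result codes for this sketch (not from the attached testcase). */
enum first_stop { FIRST_STOP_SIGSTOP, FIRST_STOP_SIGTRAP, FIRST_STOP_OTHER };

/* Classify the wait status reported right after PTRACE_ATTACH. */
enum first_stop classify_first_stop(int status)
{
    if (!WIFSTOPPED(status))
        return FIRST_STOP_OTHER;
    if (WSTOPSIG(status) == SIGSTOP)
        return FIRST_STOP_SIGSTOP;      /* the expected result */
    if (WSTOPSIG(status) == SIGTRAP)
        return FIRST_STOP_SIGTRAP;      /* the symptom reported here */
    return FIRST_STOP_OTHER;
}

/*
 * Mimic a short `strace -p' session: attach, pass nstops syscall stops,
 * then detach.  Returns 0 on success, -1 on any error.
 */
int trace_syscalls_then_detach(pid_t pid, int nstops)
{
    int status;
    int i;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, __WALL) == -1)
        return -1;
    for (i = 0; i < nstops; i++) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) == -1)
            return -1;
        if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
            return -1;
    }
    return ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1 ? -1 : 0;
}

/*
 * Attach again and report what the first stop looked like.
 * The target is detached afterwards; the caller should clean it up.
 */
enum first_stop second_attach_first_stop(pid_t pid)
{
    int status;
    enum first_stop result;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return FIRST_STOP_OTHER;
    if (waitpid(pid, &status, __WALL) == -1)
        return FIRST_STOP_OTHER;
    result = classify_first_stop(status);
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return result;
}
```

A caller would fork a child running the stat() loop from the steps to reproduce, call trace_syscalls_then_detach(child, 4) and then second_attach_first_stop(child). Going by this report, an affected kernel would be expected to return FIRST_STOP_SIGTRAP in some runs, while a fixed kernel should always return FIRST_STOP_SIGSTOP.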

Additional info:
The impact is worse since the GDB fix for Bug 233746: GDB no longer just leaves
the process stopped, it now aborts the inferior's run:

GNU gdb Red Hat Linux (6.3.0.0-1.153.el4rh)
Attaching to process 20883
Redelivering pending Trace/breakpoint trap.
Redelivering pending Trace/breakpoint trap.
Program process 0 exited: Unknown signal 0 (terminated)
/root/jkratoch/redhat/20883: No such file or directory.
(gdb) q
[1]+  Trace/breakpoint trap   ./a.out
$ _

Attached testcase:
98 bad in 200 iterations - may occur in 1 of 2 runs: 49.00% * 2 = 98.00%
FAIL

Although this is an upstream problem, it may now hit customers using GDB more
severely than before.
Is it viable to fix it in the RHEL-4.6 kernel ptrace?
A GDB workaround that ignores SIGTRAP during PTRACE_ATTACH is possible but
definitely not the right fix.

Comment 1 Jan Kratochvil 2007-09-04 18:14:10 UTC

Created attachment 186401
Kernel testcase.

Comment 2 Jan Kratochvil 2007-09-04 20:52:14 UTC

(Sorry for the ping, but the NEW bug notification mail from Bugzilla got lost on my side.)

Comment 3 Roland McGrath 2007-09-04 21:18:55 UTC

I think I know from the kernel source what is going on here.
Can you verify that this never happens on i386 in the upstream kernel?  The bug
is actually arch-specific, and of the code I've checked, only i386 has the
necessary bits in its function.

Comment 4 Roland McGrath 2007-09-05 10:07:09 UTC

Created attachment 187211
rhel4 backport of patch posted upstream

This applies, and I expect it works correctly on all RHEL4 architectures, but I
have not tested it.  It's a subset of the upstream patch I've posted.

Comment 9 Jason Baron 2007-09-12 19:14:18 UTC

committed in stream U6 build 59. A test kernel with this patch is available from
http://people.redhat.com/~jbaron/rhel4/

Comment 12 errata-xmlrpc 2007-11-15 16:32:14 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html
