Bug 130995

Summary: On i386 PTRACE_SINGLESTEP to deliver signal runs handler without single-step
Product: Red Hat Enterprise Linux 3
Reporter: Andrew Cagney <cagney>
Component: kernel
Assignee: Ernie Petrides <petrides>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3.0
CC: mingo, petrides, riel
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-12-20 20:56:01 UTC
Bug Blocks: 116894, 127692
Attachments (description, flags):
C program used in example (flags: none)
kernel patch (flags: none)
patch for 2.4.21-20.EL to fix single-step behavior for signal handlers (flags: none)

Description Andrew Cagney 2004-08-26 15:11:59 UTC
Description of problem:

The ptrace(PT_STEP, SIGALRM) system call instead behaves as if it
were ptrace(PT_CONTINUE, SIGALRM).

Version-Release number of selected component (if applicable):

Roland says this is present in all i386 kernels.

How reproducible:

Always.

Steps to Reproduce:

In the session below, each target_resume(...) corresponds directly to a ptrace call.

cagney@tomago$ gdb ./a.out
[...]
(gdb) b handler
Breakpoint 1 at 0x80483bb: file sigstep.c, line 31.
(gdb) list main
39	  itimer_real = ITIMER_REAL,
40	  itimer_virtual = ITIMER_VIRTUAL
41	} itimer = ITIMER_REAL; /* ITIMER_VIRTUAL; */
42	
43	main ()
44	{
45	
46	  /* Set up the signal handler.  */
47	  memset (&action, 0, sizeof (action));
48	  action.sa_handler = handler;
(gdb) 
49	  sigaction (SIGVTALRM, &action, NULL);
50	  sigaction (SIGALRM, &action, NULL);
51	
52	  /* The values needed for the itimer.  This needs to be at least long
53	     enough for the setitimer() call to return.  */
54	  memset (&itime, 0, sizeof (itime));
55	  itime.it_value.tv_usec = 250 * 1000;
56	
57	  /* Loop for ever, constantly taking an interrupt.  */
58	  while (1)
(gdb) 
59	    {
60	      /* Set up a one-off timer.  A timer, rather than SIGSEGV, is
61		 used as after a timer handler finishes the interrupted code
62		 can safely resume.  */
63	      setitimer (itimer, &itime, NULL);
64	      /* Wait.  */
65	      while (!done);
66	      done = 0;
67	    }
68	}
(gdb) break 65
Breakpoint 2 at 0x8048456: file sigstep.c, line 65.
(gdb) set debug target 1
(gdb) run
Starting program: /home/cagney/tmp/sigstep/a.out 
[...]
Breakpoint 2, main () at sigstep.c:65
65	      while (!done);
(gdb) step
target_terminal_inferior ()
target_xfer_memory (0x8048456, xxx, 2, read, xxx) = 2, bytes = a1 64

Try to step off breakpoint, get back SIGALRM (ok), EIP doesn't change
(ok).

target_resume (27256, step, 0)
target_wait (-1, status) = 27256,   status->kind = stopped, signal = SIGALRM
target_fetch_registers (eip) = 56840408 0x8048456 134513750
target_terminal_inferior ()
target_xfer_memory (0x8048456, xxx, 2, read, xxx) = 2, bytes = a1 64

Try to deliver SIGALRM, get back SIGTRAP (ok), EIP didn't change (not
ok, should have been handler or signal trampoline).

target_resume (27256, step, SIGALRM)
target_wait (-1, status) = 27256,   status->kind = stopped, signal = SIGTRAP
target_fetch_registers (eip) = 56840408 0x8048456 134513750
target_xfer_memory (0x80483bb, xxx, 1, read, xxx) = 1, bytes = c7
target_xfer_memory (0x80483bb, xxx, 1, write, xxx) = 1, bytes = cc
target_insert_breakpoint (0x80483bb, xxx) = 0
target_xfer_memory (0x8048456, xxx, 1, read, xxx) = 1, bytes = a1
target_xfer_memory (0x8048456, xxx, 1, write, xxx) = 1, bytes = cc
target_insert_breakpoint (0x8048456, xxx) = 0
target_xfer_memory (0x4e02b0, xxx, 1, read, xxx) = 1, bytes = 55
target_xfer_memory (0x4e02b0, xxx, 1, write, xxx) = 1, bytes = cc
target_insert_breakpoint (0x4e02b0, xxx) = 0
target_xfer_memory (0x4e3310, xxx, 1, read, xxx) = 1, bytes = 55
target_xfer_memory (0x4e3310, xxx, 1, write, xxx) = 1, bytes = cc
target_insert_breakpoint (0x4e3310, xxx) = 0
target_xfer_memory (0x320820, xxx, 1, read, xxx) = 1, bytes = 55
target_xfer_memory (0x320820, xxx, 1, write, xxx) = 1, bytes = cc
target_insert_breakpoint (0x320820, xxx) = 0
target_terminal_inferior ()
target_xfer_memory (0x8048457, xxx, 1, read, xxx) = 1, bytes = 64

GDB thinks it's still single-stepping in main(), so it tries another
single-step, this time with breakpoints inserted (it assumed the step
above managed to get off the breakpoint at line 65).

Since the PC didn't change, it re-hits the breakpoint at line 65,
causing a SIGTRAP with the PC one byte past the breakpoint. GDB then
decrements the PC so that it matches the breakpoint address.

target_resume (-1, step, 0)
target_wait (-1, status) = 27256,   status->kind = stopped, signal = SIGTRAP
target_fetch_registers (eip) = 57840408 0x8048457 134513751
target_prepare_to_store ()
target_store_registers (eip) = 56840408 0x8048456 134513750

target_xfer_memory (0x80483bb, xxx, 1, write, xxx) = 1, bytes = c7
target_remove_breakpoint (0x80483bb, xxx) = 0
target_xfer_memory (0x8048456, xxx, 1, write, xxx) = 1, bytes = a1
target_remove_breakpoint (0x8048456, xxx) = 0
target_xfer_memory (0x4e02b0, xxx, 1, write, xxx) = 1, bytes = 55
target_remove_breakpoint (0x4e02b0, xxx) = 0
target_xfer_memory (0x4e3310, xxx, 1, write, xxx) = 1, bytes = 55
target_remove_breakpoint (0x4e3310, xxx) = 0
target_xfer_memory (0x320820, xxx, 1, write, xxx) = 1, bytes = 55
target_remove_breakpoint (0x320820, xxx) = 0
target_terminal_ours ()

Breakpoint 2, main () at sigstep.c:65
65	      while (!done);
(gdb) 
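
The PC adjustment at the end of the log (EIP fetched as 0x8048457, then stored back as 0x8048456) can be sketched as follows. This is only an illustration, not GDB's code: adjust_pc_after_break and breakpoint_inserted_at are hypothetical names, and the one-byte breakpoint length assumes the x86 int3 opcode (0xcc) seen in the memory writes.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of the x86 int3 breakpoint instruction (opcode 0xcc). */
#define BREAKPOINT_LEN 1u

/* Hypothetical helper: is a breakpoint currently inserted at addr? */
static int breakpoint_inserted_at(const uint32_t *bps, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (bps[i] == addr)
            return 1;
    return 0;
}

/* After a SIGTRAP, the reported PC is one byte past an int3 that was
   executed.  If a breakpoint sits at that earlier address, report the
   breakpoint address instead; otherwise leave the PC alone. */
uint32_t adjust_pc_after_break(uint32_t stop_pc, const uint32_t *bps, size_t n)
{
    uint32_t candidate = stop_pc - BREAKPOINT_LEN;
    return breakpoint_inserted_at(bps, n, candidate) ? candidate : stop_pc;
}
```

With the breakpoints from the session (0x80483bb and 0x8048456) inserted, a stop at 0x8048457 is reported as 0x8048456, which is why GDB announces "Breakpoint 2" again.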

The ptrace(PT_STEP, SIGNAL) call should set up the signal and then
(arguably) execute no instructions.
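
For reference, the mapping from the target_resume (pid, step, signal) lines in the log to a ptrace request might look roughly like this on Linux. It is a sketch under assumptions, not GDB's source: resume_request and resume_sketch are hypothetical names, and the Linux spellings PTRACE_SINGLESTEP and PTRACE_CONT are used in place of the PT_STEP and PT_CONTINUE names quoted in this report.

```c
#include <sys/types.h>
#include <sys/ptrace.h>

/* Hypothetical helper: pick the ptrace request for a resume.
   step != 0 asks for a single instruction, otherwise run freely. */
int resume_request(int step)
{
    return step ? PTRACE_SINGLESTEP : PTRACE_CONT;
}

/* Hypothetical wrapper: resume the traced process `pid`, delivering
   signal `sig` (0 means no signal).  Returns what ptrace returns,
   i.e. -1 with errno set on failure. */
long resume_sketch(pid_t pid, int step, int sig)
{
    return ptrace(resume_request(step), pid, (void *)0, (void *)(long)sig);
}
```

In these terms, the failing step in the log is resume_sketch (27256, 1, SIGALRM): the request is a single-step, but the observed behavior matches a plain continue through the handler.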

Comment 1 Andrew Cagney 2004-08-26 15:13:31 UTC
Created attachment 103122 [details]
C program used in example

Comment 3 Roland McGrath 2004-08-26 19:54:11 UTC
What's happening here is that signal handler setup (in all extant
Linux kernels) clears the TF (single-step) bit in the processor flags.
When it was set by PTRACE_SINGLESTEP, this means that the handler runs
unchecked (unless it hits another bkpt or other signal) until it
returns, restores TF and takes a single-step trap.  This looks in gdb
like the single-step didn't do anything, but in fact it ran the handler.
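
The sequence described above can be illustrated with a toy model. This is not kernel code and does not reproduce the eventual patch: struct toy_task, step_delivering_signal, the one-byte "instructions", and HANDLER_INSNS are all invented for illustration. Only the TF bit position (bit 8 of EFLAGS) and the two addresses from the log are real. The clear_tf_at_setup == 0 branch models the behavior the reporter asks for (trap before any handler instruction runs), which is an assumption about the desired outcome.

```c
#include <stdint.h>

#define X86_EFLAGS_TF 0x100u    /* trap (single-step) flag, bit 8 of EFLAGS */

#define MAIN_PC       0x8048456u  /* interrupted instruction, from the log */
#define HANDLER_PC    0x80483bbu  /* handler entry, from the log */
#define HANDLER_INSNS 4u          /* invented handler length */

struct toy_task {
    uint32_t eip;
    uint32_t eflags;
    uint32_t frame_eip;     /* context saved in the toy signal frame */
    uint32_t frame_eflags;
};

/* Model one "single-step while delivering a signal".  Returns how many
   handler instructions ran before the tracer saw the single-step trap;
   t->eip is left at the address where the task stopped. */
unsigned step_delivering_signal(struct toy_task *t, int clear_tf_at_setup)
{
    unsigned ran = 0;

    t->eflags |= X86_EFLAGS_TF;         /* the single-step request sets TF */

    /* Signal handler setup: save the context, redirect to the handler. */
    t->frame_eip = t->eip;
    t->frame_eflags = t->eflags;
    t->eip = HANDLER_PC;
    if (clear_tf_at_setup)
        t->eflags &= ~X86_EFLAGS_TF;    /* behavior described in comment 3 */

    /* TF still set: stop at the handler entry, nothing executed. */
    if (t->eflags & X86_EFLAGS_TF)
        return ran;

    /* TF clear: the handler runs unchecked to its end. */
    for (unsigned i = 0; i < HANDLER_INSNS; i++) {
        t->eip += 1;                    /* toy: one byte per instruction */
        ran++;
    }

    /* Handler return restores the saved context, TF included, and the
       single-step trap is finally taken at the original EIP. */
    t->eip = t->frame_eip;
    t->eflags = t->frame_eflags;
    return ran;
}
```

With clear_tf_at_setup set, the model stops at MAIN_PC after running the whole handler, matching the unchanged EIP in the log; with it unset, the model stops at HANDLER_PC having run nothing.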

Comment 4 Roland McGrath 2004-08-31 06:32:02 UTC
Created attachment 103280 [details]
kernel patch 

I've just submitted this patch upstream for 2.6, and have not yet gotten
feedback.  The same fixes apply to 2.4/RHEL3 as well, but we can consider that
only once upstream has agreed to change 2.6 behavior.

Comment 5 Roland McGrath 2004-09-11 20:19:38 UTC
My fix (amended from what's attached here) has gone into 2.6 upstream.
I will produce a RHEL3 backport of the fix.

Comment 6 Roland McGrath 2004-09-11 20:24:41 UTC
Created attachment 103740 [details]
patch for 2.4.21-20.EL to fix single-step behavior for signal handlers

This is the backport of the changes that have already gone into 2.6 upstream.

Comment 8 Ernie Petrides 2004-09-20 06:47:31 UTC
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.8.EL).


Comment 9 Andrew Cagney 2004-09-24 17:46:14 UTC
Does this fix cover all architectures, or do I need to go through each
in turn?  I noticed that a recent ppc64 kernel was exhibiting the same bug.


Comment 10 Roland McGrath 2004-09-24 20:03:18 UTC
This issue is architecture-specific.  You must determine whether the
problem exists on each architecture, and file a bug for each.

Comment 11 Andrew Cagney 2004-09-27 13:18:34 UTC
Did this fix cover amd64?


Comment 13 Roland McGrath 2004-09-27 19:24:43 UTC
The patch here covers i386 native, x86-64 native, and x86-64's i386
emulation. 

Comment 14 Roland McGrath 2004-12-04 23:17:11 UTC
I believe this fix is already in Taroon, though not sure in which updates.
Ernie, please update here with the merge status.

Comment 15 Ernie Petrides 2004-12-06 23:34:27 UTC
The information is in comment #8.  The fix has already been committed to U4.


Comment 16 John Flanagan 2004-12-20 20:56:01 UTC
An erratum has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html


Comment 17 Peter Martuccelli 2005-10-18 21:47:24 UTC
Removing bug 117972 from the list of bugs blocked by this entry, bug 130995,
to keep this entry from appearing on the RHEL3 U7 proposed list.