Bug 126908 - single-stepping system call executes two instructions on powerpc
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: powerpc Linux
Priority: medium, Severity: medium
Assigned To: David Woodhouse
QA Contact: Brian Brock
Reported: 2004-06-28 18:40 EDT by Andrew Cagney
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-12-20 15:55:23 EST


Attachments
Patch using same method as i386/x86_64 (1.73 KB, text/plain)
2004-09-20 16:57 EDT, David Woodhouse
Updated patch (2.12 KB, patch)
2004-09-20 18:41 EDT, David Woodhouse
Correct patch. (2.75 KB, patch)
2004-09-20 20:54 EDT, David Woodhouse

Description Andrew Cagney 2004-06-28 18:40:39 EDT
Description of problem:

See PR 117972.

Version-Release number of selected component (if applicable):


How reproducible:

Always.

Steps to Reproduce:

$ cat > nothing.c
#include <signal.h>

main ()
{
  while (1) 
    {
      kill (getpid (), 0);
    }
}

$ cc -g -o to-rhaps7 nothing.c
$ gdb ./to-rhaps7
(gdb) run
Starting program: /home/cagney/tmp/to-rhaps 

Program received signal SIGINT, Interrupt.
0x0ff49144 in getpid () from /lib/tls/libc.so.6
(gdb) disassemble 
Dump of assembler code for function getpid:
0x0ff4913c <getpid+0>:  li      r0,20
0x0ff49140 <getpid+4>:  sc
0x0ff49144 <getpid+8>:  blr
End of assembler dump.
(gdb) break 0x0ff49140
Function "0x0ff49140" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break *0x0ff49140
Breakpoint 1 at 0xff49140
(gdb) c
Continuing.

Breakpoint 1, 0x0ff49140 in getpid () from /lib/tls/libc.so.6
(gdb) display/i $pc
1: x/i $pc  0xff49140 <getpid+4>:       sc
(gdb) del 1
(gdb) disassemble 
Dump of assembler code for function getpid:
0x0ff4913c <getpid+0>:  li      r0,20
0x0ff49140 <getpid+4>:  sc
0x0ff49144 <getpid+8>:  blr
End of assembler dump.
(gdb) stepi
0x1000046c in main () at nothing.c:7
7             kill (getpid (), 0);
1: x/i $pc  0x1000046c <main+28>:       mr      r0,r3

Notice how the STEPI executed both:
0x0ff49140 <getpid+4>:  sc
0x0ff49144 <getpid+8>:  blr
Comment 9 Roland McGrath 2004-09-20 14:40:58 EDT
I don't know PPC in detail, so we may need to consult on whether this
approach makes sense there.  The issue on x86/x86-64 is that the
hardware single-step flag set in the processor flags when returning
from a system call means to execute one user instruction before stopping,
so the instruction immediately after the system call entry instruction
doesn't get traced by single-step.  The approach to fix that is a
software bit PT_SINGLESTEP that's set by PTRACE_SINGLESTEP and that
system call return notices to mean it should simulate a single-step
trap with the PC of the first user instruction to be run after the
syscall.
If the meaning of the PPC's MSR_SE bit is the same as x86's TF, then
copying that method should be fine.

The patch looks incomplete, because it doesn't change
syscall_enter_leave to actually do the tracing in the PT_SINGLESTEP case.
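
The behaviour described in this comment can be illustrated with a toy
user-space model in C. It is only a sketch: struct insn, report_prog and
step_once are hypothetical names invented for the illustration, and none of
this is kernel code or part of any patch attached here. The instruction
addresses come from the disassembly in the description, and the model
assumes the unfixed kernel lets exactly one extra user instruction run
after the system call returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not kernel code. */
enum insn_kind { INSN_PLAIN, INSN_SYSCALL };

struct insn {
    unsigned long pc;     /* address of the instruction */
    enum insn_kind kind;  /* INSN_SYSCALL for "sc" */
    size_t next;          /* index of the instruction that runs next */
};

/* Addresses taken from the gdb session in the description. */
const struct insn report_prog[] = {
    { 0x0ff4913cUL, INSN_PLAIN,   1 },  /* li r0,20 */
    { 0x0ff49140UL, INSN_SYSCALL, 2 },  /* sc */
    { 0x0ff49144UL, INSN_PLAIN,   3 },  /* blr, returns to main */
    { 0x1000046cUL, INSN_PLAIN,   3 },  /* mr r0,r3 (end of the model) */
};

/*
 * Return the PC at which the tracer sees the stop after single-stepping
 * the instruction at index `start`.
 *
 * software_flag == false models the unfixed behaviour: after a system
 * call, the hardware step flag only traps once one more user instruction
 * has run.  software_flag == true models a PT_SINGLESTEP-style flag that
 * makes system call return report the trap immediately.
 */
unsigned long step_once(const struct insn *prog, size_t start,
                        bool software_flag)
{
    assert(prog != NULL);

    /* The stepped instruction itself always executes. */
    size_t cur = prog[start].next;

    /* Unfixed case: one extra user instruction slips through. */
    if (prog[start].kind == INSN_SYSCALL && !software_flag) {
        cur = prog[cur].next;
    }
    return prog[cur].pc;
}
```

In this model, step_once(report_prog, 1, false) yields 0x1000046c, the
address the description shows after stepi, while passing true yields
0x0ff49144, the blr that the step is expected to stop at.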
Comment 11 David Woodhouse 2004-09-20 16:57:55 EDT
Created attachment 104031 [details]
Patch using same method as i386/x86_64

This attempts to fix the problem in the same way we fix it for i386 and x86_64,
in bug #126699.
Comment 12 Roland McGrath 2004-09-20 17:07:54 EDT
That patch looks to me like it will work, not knowing PPC myself.
For the x86 changes, I felt it appropriate to get the change of
behavior incorporated in 2.6 upstream before we committed to changing
the RHEL3 behavior.
Comment 14 David Woodhouse 2004-09-20 18:41:20 EDT
Created attachment 104042 [details]
Updated patch

This patch has more chance of working -- the previous one had the set/clear of
PT_SINGLESTEP in set_single_step() and clear_single_step() the wrong way round.


But it doesn't actually seem to work. I'm not entirely sure why. More
investigation required.
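
The set/clear pairing mentioned in this comment can be sketched as a small
user-space model in C. This is an assumption-laden illustration, not the
attached patch: struct task_model, the model_* functions and both bit
values are placeholders invented here, and the real set_single_step() and
clear_single_step() operate on kernel task state that is not shown in this
report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit values, chosen for the model only. */
#define MODEL_MSR_SE        0x400UL  /* stands in for the hardware step bit */
#define MODEL_PT_SINGLESTEP 0x10UL   /* stands in for the software flag */

/* Hypothetical stand-in for the per-task state a tracer manipulates. */
struct task_model {
    unsigned long msr;     /* saved machine state register of the task */
    unsigned long ptrace;  /* software ptrace flags of the task */
};

/* Arm single-stepping: both the hardware bit and the software flag go on. */
void model_set_single_step(struct task_model *t)
{
    assert(t != NULL);
    t->msr |= MODEL_MSR_SE;
    t->ptrace |= MODEL_PT_SINGLESTEP;
}

/* Disarm single-stepping: both bits go off again. */
void model_clear_single_step(struct task_model *t)
{
    assert(t != NULL);
    t->msr &= ~MODEL_MSR_SE;
    t->ptrace &= ~MODEL_PT_SINGLESTEP;
}

/* What system call return would consult to decide on a simulated trap. */
bool model_syscall_exit_should_trap(const struct task_model *t)
{
    assert(t != NULL);
    return (t->ptrace & MODEL_PT_SINGLESTEP) != 0;
}
```

If the two flag updates were swapped between the functions, as the comment
says the earlier patch did, model_syscall_exit_should_trap() would report
false exactly when stepping had been requested.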
Comment 15 Ernie Petrides 2004-09-20 19:23:18 EDT
Reassigning to DavidW and reverting to ASSIGNED state.  (David,
I change bugs to MODIFIED state when the associated patches are
actually committed to CVS.)
Comment 16 David Woodhouse 2004-09-20 20:54:01 EDT
Created attachment 104050 [details]
Correct patch.

The previous patch works only in 64-bit mode. It'll work a little better for
32-bit gdb if I put the same changes into ptrace32.c as I have in ptrace.c.

This doesn't look like it should be an issue for x86_64 ptrace32.c, because
that one just calls through to the 64-bit functions.
Comment 17 Andrew Cagney 2004-09-23 16:24:08 EDT
Using p630.lab.boston.redhat.com and the sources in
/tmp/cagney/gdb+dejagnu-20040607/ configured in /tmp/cagney/native using:

$ cd /tmp/cagney/native/
$ CC='gcc -m64' /tmp/cagney/gdb+dejagnu-20040607/configure
$ make
$ file gdb/gdb
<something about 64-bit elf>

(which gives a 64-bit GDB), and tested using:

$ cd /tmp/cagney/native/gdb/testsuite
$ make check RUNTESTFLAGS='--target_board=unix/-m32\ unix/-m64 sigstep.exp'
$ less gdb.log

(assuming no typos) I'm seeing that:

- 32-bit stepping of sigreturn works
For GDB, this is the critical system call that must not double-step. 
This can be seen with the sigstep.exp test where it stepi's the "sc"
instruction.

- 64-bit stepping of sigreturn works when tested by hand
GDB is scrambling its backtrace when single-stepping through an
epilogue, but that is a separate GDB problem.

- 32-bit and 64-bit stepi with a signal pending doesn't work
It appears to free-run.  This is a related and known problem, see
130995.

So I think this bug is fixed.




Comment 18 Ernie Petrides 2004-09-24 05:36:16 EDT
A fix for this problem has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.11.EL).
Comment 19 John Flanagan 2004-12-20 15:55:23 EST
An erratum has been issued which should help with the problem
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html
