Bug 679129 - Change in behaviour between RH5.5 and RH6 with ptrace and syscalls.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Oleg Nesterov
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-21 17:07 UTC by kevin.fletcher
Modified: 2018-11-14 12:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-14 18:41:49 UTC
Target Upstream Version:


Attachments
Demo for issue. (30.00 KB, application/x-tar), 2011-02-21 17:07 UTC, kevin.fletcher

Description kevin.fletcher 2011-02-21 17:07:18 UTC
Created attachment 479960
Demo for issue.

Description of problem:
We have a process, the debugger, using ptrace() to control a target process, the debuggee. If the debuggee process is within a system call, and that system call is interrupted by a signal, then the process stops and the signal is reported to the debugger. When the debugger resumes the debuggee process, the debuggee process sets the PC value back by 2 bytes. This seems to allow the process to call back into the system call that was interrupted. However, on RH4/5/5.5 this does not happen for the poll() call, whereas on RH6 it does.

I’ve attached a demo to this report, with instructions in README.txt.

We are in a position where we sometimes need to reset the debuggee process PC value. It is apparent that if the debuggee process was interrupted by a signal while processing a system call, we MAY need to allow for the system decrementing the PC value. The questions are: how do we know which system calls require which action, and in which kernel version did the behaviour change, so that we can cope with any version of the kernel?

Version-Release number of selected component (if applicable):
Difference between RH5.5 and RH6

How reproducible:
Demo attached including instructions and makefile ...etc.

Steps to Reproduce:
1. Download demo.
2. Follow instructions.
Actual results:

In RH6 you get:
interrupting 'read' resets PC back by 2
interrupting 'poll' resets PC back by 2
In RH5.5 you get:
interrupting 'read' resets PC back by 2
interrupting 'poll' does NOT reset PC back by 2

Additional info:
This may well not be a bug, just a change in behaviour, in which case, how do we deal with it?

Comment 2 RHEL Program Management 2011-04-04 02:48:47 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 6 Oleg Nesterov 2011-07-14 18:41:02 UTC
(In reply to comment #0)
> Description of problem:
> We have a process, the debugger, using ptrace() to control a target process,
> the debuggee. If the debuggee process is within a system call, and that system
> call is interrupted by a signal, then the process stops and the signal is
> reported to the debugger. When the debugger resumes the debuggee process, the
> debuggee process sets the PC value back by 2 bytes. This seems to allow the
> process to call back into the system call that was interrupted.

No, this is not enough. You should also set up the registers correctly.
I would not recommend restarting syscalls this way unless you know
what you are doing ;)

The test-case nearly killed me; it is very much overcomplicated imho
and doesn't match the description above.

So, what it does is:

        - the tracer sends the signal to the tracee

        - the tracee reports the signal

        - the debugger sets $rip = dummy_routine and then does PTRACE_SINGLESTEP
          (hmm, this dummy_routine is in fact dummy_label, but this
           doesn't matter. Why do you need this asm at all?)

        - then

                if ($rip == dummy_routine - 1)
                        i_like_this();
                else
                        complain();

The first question is, why does i_like_this() think that PC should
be _decremented_ after SINGLESTEP? But yes, this is what actually
happens.

Let's discuss the "poll" case only; sys_read() is almost the same.

This is because $rax = -ERESTART_RESTARTBLOCK, and the kernel
does $rip -= 2 before returning to user mode, so it becomes
dummy_routine - 2. Remember, the debugger cancelled the signal,
so the kernel correctly does the restart logic.

Then the process steps over the "nop" insn in a64.s and reports
another trap with $rip incremented by one.
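
As a hedged illustration of the arithmetic above (not code from the
test-case): the x86-64 "syscall" instruction is 2 bytes long and "nop"
is 1 byte, so a pending restart followed by one single-step leaves $rip
at dummy_routine - 1. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction lengths on x86-64: "syscall" is 0f 05, "nop" is 90. */
enum { SYSCALL_INSN_LEN = 2, NOP_INSN_LEN = 1 };

/* Models the kernel's restart adjustment. Normally this makes the
 * 2-byte syscall instruction run again; the kernel applies it to
 * whatever $rip holds at that moment, even if the tracer changed it. */
uint64_t rip_after_restart(uint64_t rip)
{
    assert(rip >= SYSCALL_INSN_LEN);
    return rip - SYSCALL_INSN_LEN;
}

/* Models one PTRACE_SINGLESTEP over a single nop. */
uint64_t rip_after_step_over_nop(uint64_t rip)
{
    return rip + NOP_INSN_LEN;
}

/* What the tracer observes after setting $rip and single-stepping
 * once while a syscall restart is still pending. */
uint64_t observed_rip(uint64_t rip_set_by_tracer)
{
    return rip_after_step_over_nop(rip_after_restart(rip_set_by_tracer));
}
```

With $rip set to dummy_routine, observed_rip() yields dummy_routine - 1,
which is the value the test-case compares against.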

Why does rhel5 differ? Because, unlike in rhel6, sys_poll() simply
returns -EINTR if interrupted, that is all. This old implementation
does not support the restart-if-eintr-is-spurious.

Not a bug, the test-case is wrong.

Comment 7 Oleg Nesterov 2011-07-14 18:52:49 UTC
just in case...

(In reply to comment #6)
>
> So. what it does is:
> 
>         - the tracer send the signal to the tracee

I meant, to the tracee sleeping in read/poll syscall

>         - the tracee reports the signal

and the tracee is going to return to user-mode

Comment 8 kevin.fletcher 2011-07-19 13:49:47 UTC
I think that I haven't explained very well what it is we are trying to do.

So, first off, the demo program is simply there to prove whether the PC gets reset back by a value of 2 when the process is resumed.

Leaving the demo aside for a moment, the actual situation we have is:
1. We have a tracer, tracing a target process.
2. The tracee process is waiting in a system call.
3. The tracee process receives a signal.
4. The tracee process stops.
5. The tracer sees the signal.
6. The tracer sets the PC of the tracee to a known routine in the tracee process (which will cleanup and exit that process).
7. The tracer resumes execution of the tracee.
8. The tracee exits...or...crashes because the PC has been reset back by 2.

We simply want to know if the PC will be moved back by 2 in the tracee when we resume the process. Is checking the $rax register for -ERESTART_RESTARTBLOCK sufficient to know whether the PC will be moved back by 2 or not?

Comment 9 Oleg Nesterov 2011-07-19 15:01:25 UTC
> We simply want to know if the PC will be modified back by a value of 2 in the
> tracee when we resume the process. Is checking the $rax register for
> -ERESTART_RESTARTBLOCK sufficient to know whether the PC will be modified by a
> value of 2 or not ?

-ERESTARTNOHAND, -ERESTARTSYS, -ERESTARTNOINTR, and -ERESTART_RESTARTBLOCK
mean regs->ip will be decremented to restart the syscall, assuming that
you didn't change the "regs->orig_ax >= 0" condition.

This also assumes that the tracer cancels the signal. Otherwise you should
check -ERESTARTNOINTR || (-ERESTARTSYS && (sa_flags & SA_RESTART)).
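
The two rules above can be sketched as a single predicate. This is a
minimal sketch, not kernel code: struct signal_stop and
ip_will_be_rewound() are hypothetical names, and the K_ constants are
copies of the kernel-internal values from include/linux/errno.h, which
the userspace <errno.h> does not provide.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal restart codes (copied from include/linux/errno.h). */
enum {
    K_ERESTARTSYS           = 512,
    K_ERESTARTNOINTR        = 513,
    K_ERESTARTNOHAND        = 514,
    K_ERESTART_RESTARTBLOCK = 516
};

/* Hypothetical summary of what the tracer knows at a signal stop. */
struct signal_stop {
    long long ax;          /* regs->ax, read as a signed value */
    long long orig_ax;     /* regs->orig_ax, read as a signed value */
    bool signal_cancelled; /* tracer resumes without delivering the signal */
    int sa_flags;          /* handler flags; only used if the signal is delivered */
};

/* Returns true if the kernel is expected to do regs->ip -= 2. */
bool ip_will_be_rewound(const struct signal_stop *s)
{
    assert(s != NULL);

    /* No restart at all unless the "orig_ax >= 0" condition holds. */
    if (s->orig_ax < 0)
        return false;

    if (s->signal_cancelled) {
        switch (s->ax) {
        case -K_ERESTARTNOHAND:
        case -K_ERESTARTSYS:
        case -K_ERESTARTNOINTR:
        case -K_ERESTART_RESTARTBLOCK:
            return true;
        default:
            return false; /* e.g. -EINTR from the old sys_poll() */
        }
    }

    /* Signal is delivered to a handler. */
    return s->ax == -K_ERESTARTNOINTR ||
           (s->ax == -K_ERESTARTSYS && (s->sa_flags & SA_RESTART) != 0);
}
```

Because another signal can arrive before the tracee reaches user mode,
such a check would have to be repeated with fresh register values at
every stop.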

Please note that the tracee can receive another signal after
PTRACE_CONT/PTRACE_SINGLESTEP/whatever and before it returns to
user mode. This can happen before _or_ after regs->ip was already
changed, so you should check regs->ax every time.

In case my explanation is not clear, you can simply look at
arch/x86/kernel/signal.c:do_signal(). The code under
"if (syscall_get_nr(current, regs) >= 0)" handles the case when the
syscall was interrupted but there is no signal to deliver (like with
your test-case).

arch/x86/kernel/signal.c:handle_signal() handles the case when the
signal was delivered.
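
For the redirect scenario in comment 8 (steps 6 and 7), one possible
approach follows from the "regs->orig_ax >= 0" condition: if the tracer
makes orig_ax negative when it changes the PC, the restart branch should
not be taken and the new PC should be left alone. This is a sketch under
that assumption, not a tested recipe. struct tracee_regs is a
hypothetical stand-in for the rip, rax and orig_rax fields of struct
user_regs_struct, which a real tracer would read with PTRACE_GETREGS and
write back with PTRACE_SETREGS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the x86-64 register set, kept as signed
 * values so that the comparisons mirror the kernel's checks. */
struct tracee_regs {
    long long rip;
    long long rax;
    long long orig_rax;
};

/* Point the tracee at `target` and mark the stop as "not in a
 * syscall", so that the syscall-restart check no longer applies. */
void redirect_tracee(struct tracee_regs *regs, long long target)
{
    assert(regs != NULL);
    regs->rip = target;
    regs->orig_rax = -1; /* assumption: defeats the orig_ax >= 0 test */
}
```

The alternative is to leave orig_rax untouched and compensate by
setting the PC to the target plus 2 whenever a restart is expected,
which depends on getting the restart prediction right at every stop.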

