Created attachment 479960 [details]
Demo for issue.

Description of problem:
We have a process, the debugger, using ptrace() to control a target process, the debuggee. If the debuggee is inside a system call and that system call is interrupted by a signal, the process stops and the signal is reported to the debugger. When the debugger resumes the debuggee, the debuggee's PC value is moved back by 2 bytes. This seems to allow the process to call back into the system call that was interrupted. However, that is not the case with the poll() call on RH4/5/5.5, although it is on RH6. I’ve attached a demo to this email, with instructions in README.txt.

We are in a position where we sometimes need to reset the debuggee's PC value. It is apparent that if the debuggee was interrupted by a signal while processing a system call, we MAY need to allow for the system decrementing the PC value. The question is: how do we know which system calls require which action, and in which kernel version did the behaviour change, so that we can cope with any version of the kernel?

Version-Release number of selected component (if applicable):
Difference between RH5.5 and RH6

How reproducible:
Demo attached, including instructions, makefile, etc.

Steps to Reproduce:
1. Download demo.
2. Follow instructions.
3.

Actual results:
In RH6 you get:
interrupting 'read' resets PC back by 2
interrupting 'poll' resets PC back by 2

In RH5.5 you get:
interrupting 'read' resets PC back by 2
interrupting 'poll' does NOT reset PC back by 2

Additional info:
This may well not be a bug, just a change in behaviour, in which case, how do we deal with it?
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
(In reply to comment #0)
> Description of problem:
> We have a process, the debugger, using ptrace() to control a target process,
> the debuggee. If the debuggee process is within a system call, and that system
> call is interrupted by a signal, then the process stops and the signal is
> reported to the debugger. When the debugger resumes the debuggee process, the
> debuggee process sets the PC value back by 2 bytes. This seems to allow the
> process to call back into the system call that was interrupted.

No, this is not enough. You should also set up the registers correctly. I'd not recommend restarting syscalls this way unless you know what you are doing ;)

The test-case nearly killed me; it is very much overcomplicated imho and doesn't match the description above.

So, what it does is:

- the tracer sends the signal to the tracee

- the tracee reports the signal

- the debugger sets $rip = dummy_routine and then does PTRACE_SINGLESTEP

  (hmm, this dummy_routine is in fact dummy_label, but this doesn't matter. Why do you need this asm at all?)

- then if ($rip == dummy_routine - 1) i_like_this(); else complain();

The first question is, why does i_like_this() think that the PC should be _decremented_ after SINGLESTEP?

But yes, this is what actually happens. Let's discuss the "poll" case only; sys_read() is almost the same.

This is because $rax = -ERESTART_RESTARTBLOCK, and the kernel does $rip -= 2 before returning to user mode, so it becomes dummy_routine - 2. Remember, the debugger cancelled the signal, so the kernel correctly does the restart logic. Then the process steps over the "nop" insn in a64.s and reports another trap with $rip incremented by one.

Why does rhel5 differ? Because, unlike in rhel6, sys_poll() returns -EINTR if interrupted, that is all. This old implementation does not support restart-if-eintr-is-spurious.

Not a bug, the test-case is wrong.
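To make the arithmetic above concrete, here is a minimal user-space model of it in C. It is only a sketch: `struct toy_regs`, `resume_with_signal_cancelled()` and `single_step_over_nop()` are hypothetical names invented for illustration, the numeric restart codes are the kernel-internal values from include/linux/errno.h (they are not exported to user-space headers), and only the instruction pointer is modelled, not the other registers the kernel adjusts.

```c
#include <errno.h>   /* EINTR */
#include <stdint.h>

/* Kernel-internal restart codes, copied here because user space
 * headers do not define them. */
enum {
    ERESTARTSYS           = 512,
    ERESTARTNOINTR        = 513,
    ERESTARTNOHAND        = 514,
    ERESTART_RESTARTBLOCK = 516
};

/* Hypothetical, simplified register set: just what the model needs. */
struct toy_regs {
    long     ax;       /* syscall return value as seen by the tracer */
    long     orig_ax;  /* syscall number, negative if not in a syscall */
    uint64_t ip;       /* instruction pointer */
};

/* Models the "no signal to deliver" path: the tracer cancelled the
 * signal, so a restartable return code makes the kernel rewind ip by
 * the length of the 2-byte syscall instruction. */
void resume_with_signal_cancelled(struct toy_regs *r)
{
    if (r->orig_ax < 0)
        return;                    /* not treated as a syscall return */

    switch (r->ax) {
    case -ERESTARTNOHAND:
    case -ERESTARTSYS:
    case -ERESTARTNOINTR:
    case -ERESTART_RESTARTBLOCK:
        r->ip -= 2;
        break;
    default:
        break;                     /* e.g. -EINTR: nothing to restart */
    }
}

/* Models PTRACE_SINGLESTEP over a 1-byte "nop". */
void single_step_over_nop(struct toy_regs *r)
{
    r->ip += 1;
}
```

With ax = -ERESTART_RESTARTBLOCK (the rhel6 poll case) and ip set to dummy_routine, the model ends at dummy_routine - 2 after the resume and dummy_routine - 1 after the step, which is the value the test-case checks for. With ax = -EINTR (the rhel5 poll case) the resume leaves ip untouched.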
just in case...

(In reply to comment #6)
>
> So. what it does is:
>
> - the tracer send the signal to the tracee

I meant, to the tracee sleeping in the read/poll syscall

> - the tracee reports the signal

and the tracee is going to return to user-mode
I think that I haven't explained very well what it is we are trying to do.

So, first off, the demo program is simply there to prove whether the PC gets reset back by a value of 2 when it is resumed.

Leaving the demo aside for a moment, the actual situation we have is:

1. We have a tracer, tracing a target process.
2. The tracee process is waiting in a system call.
3. The tracee process receives a signal.
4. The tracee process stops.
5. The tracer sees the signal.
6. The tracer sets the PC of the tracee to a known routine in the tracee process (which will clean up and exit that process).
7. The tracer resumes execution of the tracee.
8. The tracee exits...or...crashes because the PC has been reset back by 2.

We simply want to know if the PC will be modified back by a value of 2 in the tracee when we resume the process. Is checking the $rax register for -ERESTART_RESTARTBLOCK sufficient to know whether the PC will be modified by a value of 2 or not?
> We simply want to know if the PC will be modified back by a value of 2 in the
> tracee when we resume the process. Is checking the $rax register for
> -ERESTART_RESTARTBLOCK sufficient to know whether the PC will be modified by a
> value of 2 or not ?

-ERESTARTNOHAND, -ERESTARTSYS, -ERESTARTNOINTR, and -ERESTART_RESTARTBLOCK mean regs->ip will be decremented to restart the syscall.

This assumes that you didn't change the "regs->orig_ax >= 0" condition, and that the tracer cancels the signal. Otherwise you should check

-ERESTARTNOINTR || (-ERESTARTSYS && (sa_flags & SA_RESTART)).

Please note that the tracee can receive another signal after PTRACE_CONT/PTRACE_SINGLESTEP/whatever and before it returns to user mode. This can happen before _or_ after regs->ip was changed, so you should check regs->ax every time.

In case my explanation is not clear, you can simply look at arch/x86/kernel/signal.c:do_signal(). The code under "if (syscall_get_nr(current, regs) >= 0)" handles the case when a syscall was interrupted but there is no signal to deliver (as in your test-case).

arch/x86/kernel/signal.c:handle_signal() handles the case when the signal was delivered.
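The rules above can be collected into a small predicate that a tracer might evaluate on the registers it reads from the tracee. This is a sketch under stated assumptions, not the kernel's code: `kernel_will_rewind_ip()` and `pc_to_write()` are hypothetical helpers, the restart codes are kernel-internal values copied from include/linux/errno.h, and the tracer is assumed to know on its own whether it cancels the signal and whether the tracee's handler was installed with SA_RESTART.

```c
#include <stdbool.h>

/* Kernel-internal restart codes, copied here because user space
 * headers do not define them. */
enum {
    ERESTARTSYS           = 512,
    ERESTARTNOINTR        = 513,
    ERESTARTNOHAND        = 514,
    ERESTART_RESTARTBLOCK = 516
};

/* Returns true if, on this resume, the kernel is expected to move the
 * tracee's ip back by 2.
 *
 * ax, orig_ax         values read from the stopped tracee
 * signal_delivered    false if the tracer cancels the signal,
 *                     true if a handler will run in the tracee
 * handler_sa_restart  true if that handler has SA_RESTART set
 *                     (ignored when signal_delivered is false)
 */
bool kernel_will_rewind_ip(long ax, long orig_ax,
                           bool signal_delivered, bool handler_sa_restart)
{
    if (orig_ax < 0)
        return false;              /* the "orig_ax >= 0" condition fails */

    switch (ax) {
    case -ERESTARTNOINTR:
        return true;               /* restarted in both cases */
    case -ERESTARTSYS:
        return !signal_delivered || handler_sa_restart;
    case -ERESTARTNOHAND:
    case -ERESTART_RESTARTBLOCK:
        return !signal_delivered;  /* restarted only without a handler */
    default:
        return false;              /* e.g. -EINTR */
    }
}

/* Hypothetical helper: the ip value a tracer could write so that the
 * tracee lands on `target` once the expected rewind has been applied. */
unsigned long long pc_to_write(unsigned long long target, bool will_rewind)
{
    return will_rewind ? target + 2 : target;
}
```

Because another signal can arrive between the resume and the return to user mode, a predicate like this would have to be re-evaluated at every stop rather than cached.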