Bug 232837

Summary: utrace: PTRACE_ATTACH of SIGSTOPped process hangs
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Roland McGrath <roland>
Status: CLOSED RAWHIDE
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: rawhide
CC: cagney, cmoller, mjw
Target Milestone: ---
Keywords: Regression, Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: kernel-2.6.23-0.204.rc8.fc8
Doc Type: Bug Fix
Last Closed: 2007-10-04 20:23:05 UTC
Bug Blocks: 233540, 233852    
Attachments:
Description                                                                  Flags
Testcase (returns OK or abort()s).                                           none
Testcase for kernel-2.6.20-1.2935.rm1.fc6: PTRACE_PEEKUSER + PTRACE_GETREGS  none
Testcase for this bug                                                        none
Testcase for kernel-2.6.20-1.2935.rm2.fc6: second PTRACE_ATTACH              none
Roland's fix.                                                                none

Description Jan Kratochvil 2007-03-18 17:33:10 UTC
Description of problem:
The utrace implementation of ptrace(2) is incompatible with non-utrace kernels:
after PTRACE_ATTACH on a process already stopped by SIGSTOP, the following
waitpid() never returns.
On non-utrace kernels it returns; tested on:
  kernel-2.6.20-1.2300.fc5.x86_64
  linux-2.6.17.7.x86_64 (from kernel.org)
  linux-2.6.16-xen.i686 (from kernel.org)

Version-Release number of selected component (if applicable):
kernel-xen-2.6.19-1.2898.2.3.fc7.i686
kernel-2.6.20-1.2925.fc6.i586

How reproducible:
Always.

Steps to Reproduce:
1. Process A should be stopped: kill -STOP process_A_PID
2. Process B should call: ptrace (PTRACE_ATTACH, process_A_PID, NULL, NULL);
3. Process B should call: waitpid (process_A_PID, &status, 0);

Actual results:
3. Process B hangs.

Expected results:
3. The waitpid() call in process B returns with: WSTOPSIG (status) == SIGSTOP
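
The steps above can be sketched in C roughly as follows. This is not the attached testcase, only a minimal illustration under some assumptions: process A is a forked child of process B (the report does not require that), the function name, the helper wait_for_stop() and the result codes are made up for this sketch, and the 5 second timeout is arbitrary. The timeout is there so that a hang in step 3 is reported instead of blocking the caller.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll waitpid() for a stop of pid for up to
   timeout_sec seconds.  Returns the stop signal, 0 on timeout (a hang),
   or -1 if waitpid() failed or reported something other than a stop.  */
static int wait_for_stop(pid_t pid, int flags, int timeout_sec)
{
    time_t deadline = time(NULL) + timeout_sec;
    int status;

    for (;;) {
        pid_t r = waitpid(pid, &status, flags | WNOHANG);
        if (r == pid)
            return WIFSTOPPED(status) ? WSTOPSIG(status) : -1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (time(NULL) > deadline)
            return 0;
        usleep(10000);
    }
}

/* Returns 0 if waitpid() reports SIGSTOP after the attach (expected),
   1 if it times out (the hang described in this bug),
   2 if it reports anything else, -1 if the setup itself failed.  */
int attach_to_stopped_child(void)
{
    int result = -1;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        for (;;)
            pause();                    /* process A: idle until killed */

    /* Step 1: stop process A and confirm that the stop took effect.  */
    if (kill(child, SIGSTOP) == 0
        && wait_for_stop(child, WUNTRACED, 5) == SIGSTOP
        /* Step 2: attach to the already stopped process.  */
        && ptrace(PTRACE_ATTACH, child, NULL, NULL) == 0) {
        /* Step 3: wait for the stop caused by the attach.  */
        int sig = wait_for_stop(child, 0, 5);
        result = (sig == SIGSTOP) ? 0 : (sig == 0) ? 1 : 2;
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return result;
}
```

A caller would treat a return value of 0 as PASS and 1 as a reproduction of the hang; -1 only means the environment did not allow the experiment (for example, ptrace was not permitted).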

Additional info:
Testcase attached.
It was causing 12 FAILs in the GDB testcase `gdb.base/attachstop.exp'.

Comment 1 Jan Kratochvil 2007-03-18 17:33:10 UTC
Created attachment 150336 [details]
Testcase (returns OK or abort()s).

Comment 2 Roland McGrath 2007-03-19 20:06:20 UTC
Looking into it.  Someone please add this as a regression test in the frysk suite.

Comment 4 Jan Kratochvil 2007-03-20 00:50:28 UTC
Created attachment 150448 [details]
Testcase for kernel-2.6.20-1.2935.rm1.fc6: PTRACE_PEEKUSER + PTRACE_GETREGS

The tested kernel-2.6.20-1.2935.rm1.fc6 passes PTRACE_ATTACH / waitpid() but
fails when reading the registers:
ptrace(PTRACE_ATTACH, 16984, 0, 0)	= 0
wait4(16984, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], 0, NULL) = 16984
ptrace(PTRACE_PEEKUSER, 16984, 8*R15, [0]) = -1 ESRCH (No such process)
ptrace(PTRACE_GETREGS, 16984, 0, 0x7fffde5e2a50) = -1 ESRCH (No such process)

The updated testcase also tests this kernel feature.
On kernel-2.6.20-1.2935.rm1.fc6.x86_64 it now fails with:
attachstop2: attachstop2.c:77: main: Assertion `(*__errno_location ()) == 0'
failed.
Aborted
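
For illustration, the register reads from the strace output above could be checked along these lines. This is a hedged sketch and not the attached attachstop2.c: the function name and result codes are invented here, the tracee is assumed to be a forked child, and the code only covers x86_64 (where R15 from <sys/reg.h> indexes the user area in 8-byte words). It uses plain blocking waitpid() calls, so it would hang on a kernel that still has the original attach problem.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>
#if defined(__x86_64__)
#include <sys/reg.h>
#endif

/* Returns 0 if PTRACE_PEEKUSER and PTRACE_GETREGS both succeed and agree
   on R15, 1 if either read fails or they disagree (the ESRCH symptom
   above would land here), -1 on setup error or on a non-x86_64 build.  */
int read_regs_of_stopped_child(void)
{
#if defined(__x86_64__)
    struct user_regs_struct regs;
    int status, result = -1;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        for (;;)
            pause();                    /* tracee: idle until killed */

    if (kill(child, SIGSTOP) == 0
        && waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status)
        && ptrace(PTRACE_ATTACH, child, NULL, NULL) == 0
        && waitpid(child, &status, 0) == child && WIFSTOPPED(status)) {
        /* PTRACE_PEEKUSER returns the word itself, so -1 can be a valid
           value; errno has to be cleared before and checked after.  */
        errno = 0;
        long r15 = ptrace(PTRACE_PEEKUSER, child, (void *) (long) (8 * R15), NULL);
        int peek_ok = (errno == 0);
        int getregs_ok = (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0);

        result = (peek_ok && getregs_ok
                  && (unsigned long long) (unsigned long) r15 == regs.r15) ? 0 : 1;
    }

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return result;
#else
    return -1;                          /* this sketch covers x86_64 only */
#endif
}
```

The errno handling around PTRACE_PEEKUSER mirrors the kind of check that the failing assertion on errno above is making.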

Comment 5 Chris Moller 2007-03-20 00:54:04 UTC
Created attachment 150449 [details]
Testcase for this bug

Having the testcase simply abort doesn't work in the frysk test suite. This
version has been tweaked to exit(0) on pass and exit(1) on fail, and it's what
I'm going to put into the suite.

Comment 6 Jan Kratochvil 2007-03-20 00:56:39 UTC
No other regressions were found when comparing GDB-6.6-5 testsuite results:
kernel-2.6.20-1.2300.fc5.x86_64 -> kernel-2.6.20-1.2935.rm1.fc6.x86_64


Comment 7 Roland McGrath 2007-03-20 02:25:46 UTC
You also need to convert all the asserts into e.g. error (2, errno, ...) calls.
Do that on attachstop2.c for the suite.
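
As a rough illustration of that conversion (not the actual attachstop2.c change), an assert on a ptrace call can be turned into a glibc error() call, which prints the message with the errno text and exits with the given status. The names attach_or_die() and exit_status_on_failed_attach() are hypothetical; the second function exists only to show the resulting exit status by running the first one on an invalid PID in a subprocess.

```c
#include <errno.h>
#include <error.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Before:  assert (ptrace (PTRACE_ATTACH, pid, NULL, NULL) == 0);
   After:   report the failing call together with errno and exit with
            status 2 instead of aborting.  */
static void attach_or_die(pid_t pid)
{
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) != 0)
        error(2, errno, "ptrace (PTRACE_ATTACH, %ld)", (long) pid);
}

/* Demonstration only: call attach_or_die() with an invalid PID in a
   subprocess and return that subprocess's exit status (-1 on setup
   error).  */
int exit_status_on_failed_attach(void)
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        attach_or_die((pid_t) -1);      /* no such process: attach fails */
        _exit(0);                       /* not reached when the attach fails */
    }
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With this pattern a failing call ends the test with exit status 2 and a diagnostic on stderr, which a test suite can tell apart from a crash by SIGABRT.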

Comment 8 Chris Moller 2007-03-20 03:11:18 UTC
Okay, original frysk testsuite test replaced with one based on attachstop2.c.

Comment 10 Jan Kratochvil 2007-03-20 12:26:47 UTC
Created attachment 150476 [details]
Testcase for kernel-2.6.20-1.2935.rm2.fc6: second PTRACE_ATTACH

Unfortunately, kernel-2.6.20-1.2935.rm2.fc6 still hangs on the sequence:
PTRACE_ATTACH, PTRACE_DETACH, PTRACE_ATTACH.

Chris, based on your variant, this update has also been committed to the frysk
testsuite.
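
The failing sequence can be sketched as below. Again this is only an illustration, not the attached testcase: the tracee is assumed to be a forked child that was stopped with SIGSTOP first, and the function names, result codes and the 5 second timeout are assumptions of this sketch.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll waitpid() for a stop of pid for up to
   timeout_sec seconds.  Returns the stop signal, 0 on timeout (a hang),
   or -1 if waitpid() failed or reported something other than a stop.  */
static int wait_for_stop(pid_t pid, int flags, int timeout_sec)
{
    time_t deadline = time(NULL) + timeout_sec;
    int status;

    for (;;) {
        pid_t r = waitpid(pid, &status, flags | WNOHANG);
        if (r == pid)
            return WIFSTOPPED(status) ? WSTOPSIG(status) : -1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (time(NULL) > deadline)
            return 0;
        usleep(10000);
    }
}

/* PTRACE_ATTACH, PTRACE_DETACH, PTRACE_ATTACH on a SIGSTOPped child.
   Returns 0 if both attaches are followed by a SIGSTOP stop, 1 or 2 for
   the attach round whose waitpid() hung or reported something else,
   -1 on setup error.  */
int attach_detach_attach(void)
{
    int result = -1, round;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        for (;;)
            pause();                    /* tracee: idle until killed */

    if (kill(child, SIGSTOP) == 0
        && wait_for_stop(child, WUNTRACED, 5) == SIGSTOP) {
        result = 0;
        for (round = 1; round <= 2 && result == 0; round++) {
            if (ptrace(PTRACE_ATTACH, child, NULL, NULL) != 0)
                result = -1;
            else if (wait_for_stop(child, 0, 5) != SIGSTOP)
                result = round;
            else if (round == 1
                     && ptrace(PTRACE_DETACH, child, NULL, NULL) != 0)
                result = -1;
        }
    }

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return result;
}
```

A return value of 2 would correspond to the hang described in this comment, where the first attach works and the second one does not.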

Comment 11 Jan Kratochvil 2007-03-22 15:31:35 UTC
No GDB testsuite regressions found for kernel-2.6.20-1.2300.fc5 -> 
kernel-2.6.20-1.2936.rm2.fc6 ( /mnt/brew/scratch/roland/task_684350/ ) when run
on i686 + x86_64.


Comment 12 Roland McGrath 2007-08-02 05:39:01 UTC
This is long fixed and should be closed, right?

Comment 13 Jan Kratochvil 2007-08-02 14:41:35 UTC
Yes, thanks, all of its 3 sub-bugs were verified as fixed on:
kernel-2.6.21-1.3228.fc7.x86_64


Comment 14 Jan Kratochvil 2007-08-30 10:53:25 UTC
There is a regression for the testcase of Comment 10:
kernel-2.6.21-1.3228.fc7.x86_64: PASS (as in Comment 13 above)
but:
kernel-2.6.22.4-65.fc7.x86_64: FAIL
kernel-2.6.23-0.149.rc4.fc8.x86_64: FAIL

The testcase is now provided in Frysk as `frysk4217/attachstop.c'.


Comment 15 Jan Kratochvil 2007-08-30 12:30:39 UTC
Created attachment 180921 [details]
Roland's fix.

Still testing for possible regressions, but basic tests look OK.

Comment 16 Jan Kratochvil 2007-08-30 14:29:22 UTC
I see no regressions with the fix in Comment 15.


Comment 17 Roland McGrath 2007-09-04 20:56:58 UTC
Current fixes are committed for the next rawhide kernel build.

Comment 18 Jan Kratochvil 2007-10-04 20:23:05 UTC
Problem is no longer reproducible on: kernel-2.6.23-0.204.rc8.fc8.x86_64