Bug 660003

Summary: ptrace: watchpoint-fork (DR fork() inheritance)
Product: [Fedora] Fedora Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernelAssignee: Oleg Nesterov <onestero>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: low    
Version: 13CC: dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, roland
Target Milestone: ---Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/watchpoint-fork.c?cvsroot=systemtap
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-12-06 00:06:17 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Kratochvil 2010-12-04 23:04:32 UTC
Description of problem:
Formerly the child after fork() inherited the content of DRs (debug registers).  This no longer happens with F14 utrace-ptrace.

In fact it workarounds a bug in FSF GDB (already fixed in Fedora GDB) but I guess utrace should be behavior-compatible in this respect to legacy ptrace.

Version-Release number of selected component (if applicable):
PASS kernel-smp-2.6.9-89.EL.x86_64
PASS kernel-2.6.18-233.el5.x86_64
PASS kernel-2.6.32-71.7.1.el6.x86_64
FAIL kernel-2.6.35.6-48.fc14.x86_64
(tested everything on iron)

How reproducible:
Always.

Steps to Reproduce:
wget -O watchpoint-fork.c http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/watchpoint-fork.c?cvsroot=systemtap; gcc -o watchpoint-fork watchpoint-fork.c -Wall -g; ./watchpoint-fork; echo $?

Actual results:
1

Expected results:
0

Additional info:

Comment 1 Jan Kratochvil 2010-12-04 23:21:39 UTC
FAIL kernel-2.6.34.7-61.fc13.x86_64 (KVM)

Comment 2 Oleg Nesterov 2010-12-05 14:49:11 UTC
(In reply to comment #0)
>
> Formerly the child after fork() inherited the content of DRs

Yes. thread_struct.debugregX array was copied by dup_task_struct().
I am not sure this was by design though. Perhaps this was oversight,
I do not know.

> This no longer happens with F14 utrace-ptrace.
> 
> In fact it workarounds a bug in FSF GDB (already fixed in Fedora GDB)

To me, the new behaviour is better, but:

> but I
> guess utrace should be behavior-compatible in this respect to legacy ptrace.

Agreed. However, this has nothing to do with utrace/ptrace, see below.

> FAIL kernel-2.6.35.6-48.fc14.x86_64
> FAIL kernel-2.6.34.7-61.fc13.x86_64 (KVM)

This is because the recent kernels use the new kernel-breakpoint
abstraction layer. copy_thread() explicitly clears ->ptrace_bps[].
And it is not trivial to change this behaviour, we need to "deep
copy" these perf_event's.

Really, I do not know what we can do. In any case, if we want to
fix this, this should be changed upstream. (Oh, and we have the
real bugs in this area, but nobody cares).

But, once again, I am not sure we actually want to "fix" this.

Thoughts?

Comment 3 Jan Kratochvil 2010-12-06 00:06:17 UTC
I have built
  kernel-2.6.35.6-48.noutrace.fc14
  https://koji.fedoraproject.org/scratch/jkratoch/task_2644776/
with
#ApplyPatch git-utrace.patch
#ApplyPatch utrace-ptrace-fix-build.patch
#ApplyPatch utrace-remove-use-of-kref_set.patch

and I can confirm watchpoint-fork really still "FAIL"s there.

(In reply to comment #2)
> But, once again, I am not sure we actually want to "fix" this.

OK, FSF GDB amd64_linux_new_thread already copies DRs into new threads so I do not see a regression for GDB (I will yet verify it).

Removing the testcase, thanks.  Still TBH I find this as a too radical ptrace change, if ptrace should be kept compatible.

Comment 4 Oleg Nesterov 2010-12-06 09:09:52 UTC
(In reply to comment #3)
>
> Still TBH I find this as a too radical ptrace change,

Oh, I agree.

Comment 5 Roland McGrath 2010-12-06 23:37:30 UTC
I also concur about the wrongness of the unannounced ABI change.  But the people upstream don't care about the details and they didn't check with us.  I'm not sure it's worth the trouble to try to convince them they were wrong.  All else being equal, the "new" behavior probably makes more sense, though with PTRACE_O_TRACECLONE it is somewhat debatable.