Bug 660204 - ptrace: watchpoint-zeroaddr: EINVAL on PTRACE_POKEUSER of DR7
ptrace: watchpoint-zeroaddr: EINVAL on PTRACE_POKEUSER of DR7
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
19
x86_64 Linux
high Severity low
: ---
: ---
Assigned To: Oleg Nesterov
Fedora Extras Quality Assurance
: Regression
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2010-12-05 22:43 EST by Jan Kratochvil
Modified: 2013-05-22 12:54 EDT (History)
9 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-05-21 15:00:48 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)

  None (edit)
Description Jan Kratochvil 2010-12-05 22:43:18 EST
Description of problem:
When you set first DR7 and then DR0 - the setting of DR7 fails, because DR0 has still address 0 that time.  The process is still stopped so there is no reason for a sanity check that time.
This is a regression.

Version-Release number of selected component (if applicable):
PASS kernel-2.6.32-71.7.1.el6.x86_64
FAIL kernel-2.6.35.6-48.fc14.x86_64
FAIL kernel-2.6.35.6-48.noutrace.fc14.x86_64

How reproducible:
Always.

Steps to Reproduce:
wget -O watchpoint-zeroaddr.c http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/watchpoint-zeroaddr.c?cvsroot=systemtap; gcc -o watchpoint-zeroaddr watchpoint-zeroaddr.c -Wall -g; ./watchpoint-zeroaddr; echo $?

Actual results:
1

Expected results:
0

Additional info:
Comment 1 Oleg Nesterov 2010-12-06 05:37:38 EST
(In reply to comment #0)
>
> When you set first DR7 and then DR0 - the setting of DR7 fails,
> because DR0 has still address 0 that time.

More precisely, it fails because ptrace_write_dr7() never tries
to allocate the perf_event connected to this bpnum, it blindly
calls ptrace_modify_breakpoint(bp => NULL)  which obviously fails
with EINVAL.

And it is not trivial to change this, despite its name
ptrace_write_dr7() doesn't write dr7, we do not have  debugreg7
any longer. We really need bp != NULL to store the necessary
info.

Another problem is that this code has other bugs, and those
who wrote this code do not reply at all.

So. This is upstream problem. Unlike the previous regression
(watchpoints && fork), this one looks like a bug to me.

But. I do not know how long will it take to fix.


Just in case, to clarify once again... "zeroaddr" is fine.
The problem is, with the recent changes there is no "adrress"
at all before the tracer writes to u_debugreg[0]. But if you
try to read u_debugreg[0], get_watchpoint() will happily return
zero.

Isn't this nice...
Comment 2 Josh Boyer 2011-08-31 13:34:34 EDT
Did this ever get fixed?
Comment 3 Oleg Nesterov 2011-08-31 13:48:52 EDT
(In reply to comment #2)
>
> Did this ever get fixed?

argh.... thanks for reminding. Should be fixed upstream iirc.

Damn. I'll _try_ to look further this week. I forgot the details,
but afaics we already fixed other (more serious) problems which
delayed this problem...
Comment 4 Jan Kratochvil 2011-09-16 17:52:08 EDT
Still FAILs:
kernel-3.1.0-0.rc6.git0.0.fc17.x86_64
Comment 5 Fedora End Of Life 2013-04-03 14:09:53 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Comment 6 Justin M. Forbes 2013-04-05 12:58:00 EDT
Is this still a problem with 3.9 based F19 kernels?
Comment 7 Jan Kratochvil 2013-04-05 13:04:18 EDT
Yes; BTW the reproducer in Comment 0 is really simple to run.

kernel-3.8.4-202.fc18.x86_64
Comment 8 Oleg Nesterov 2013-04-13 14:40:44 EDT
(In reply to comment #7)
> Yes; BTW the reproducer in Comment 0 is really simple to run.

All I can say is: sorry for delay.

I constantly forgot about this problem. Fortunataly, it seems that
today the fix should be simple. Still it is complicated by the fact
that this code needs cleanups, in particular we should kill
ptrace_get_breakpoints() and ->ptrace_bp_refcnt, and this needs
per-arch changes.

I am working on the patches.
Comment 9 Oleg Nesterov 2013-04-14 15:22:43 EDT
Cough. Jan, suddenly I started to doubt this should be fixed.

Lets discuss this upstream, you were cc'ed. I already sent the
preparation patches, the actual fix should be trivial.

But. Could you please tell me if gdb really wants this fix?
Comment 10 Jan Kratochvil 2013-04-14 15:39:06 EDT
Recent GDB versions are fully compatible with recent Linux kernels, there was a special care about it:
  http://sourceware.org/git/?p=gdb.git;a=commitdiff;h=2bfeecf636f5790a7e850e11c7936c4d9814d8ef

Old GDBs were also compatible, although just accidentally:
  /* Finally, actually pass the info to the inferior.  */
  i386_dr_low.set_addr (i, addr);
  i386_dr_low.set_control (dr_control_mirror);

Unfortunately I really do not remember which specific use case had a problem with it.  I was heavily patching this GDB code before so some Fedora/RHEL GDB could be dependent on it, or maybe just some my work-in-progress patches that time.

I do not see any Red Hat business reason for it.  I still believe it is a needless ABI backward compatibility breakage but I do not mind if the kernel developers do not agree and WONTFIX it.
Comment 11 Oleg Nesterov 2013-04-14 15:48:02 EDT
(In reply to comment #10)
> I still believe it is a
> needless ABI backward compatibility breakage

Yes, yes, I agree. But given that this was broken a long ago...

> but I do not mind if the kernel
> developers do not agree and WONTFIX it.

OK. Thanks Jan. Lets wait for other upstream comments, then
we will see.
Comment 12 Jan Kratochvil 2013-04-14 15:56:01 EDT
(In reply to comment #11)
> Yes, yes, I agree. But given that this was broken a long ago...

The kernel ABI got broken ~3 years ago. Most of the "recent" users run Ubuntu 10.04 which I believe was still before the breakage.  Other users run even more old GNU/Linux releases.  The users still may not have noticed the change.
Comment 13 Oleg Nesterov 2013-05-17 14:06:22 EDT
(In reply to comment #12)
> (In reply to comment #11)
> > Yes, yes, I agree. But given that this was broken a long ago...
> 
> The kernel ABI got broken ~3 years ago. Most of the "recent" users run
> Ubuntu 10.04 which I believe was still before the breakage.  Other users run
> even more old GNU/Linux releases.  The users still may not have noticed the
> change.

OK.

The patches are already in -mm, and they were acked. Can we close
this bug?
Comment 14 Jan Kratochvil 2013-05-21 15:00:48 EDT
If the testcase really PASSes there then OK, CLOSED-UPSTREAM.
Comment 15 Oleg Nesterov 2013-05-22 12:54:34 EDT
(In reply to Jan Kratochvil from comment #14)
>
> If the testcase really PASSes there then OK, CLOSED-UPSTREAM.

Yes, yes, I even wrote the new test-case (the old one passes too).

Thanks!

Note You need to log in before you can comment on or make changes to this bug.