Bug 810668

Summary: ptrace: GDB inferior calls with floats fail
Product: [Fedora] Fedora Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernelAssignee: Oleg Nesterov <onestero>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 16CC: cebbert.lkml, gansalmon, itamar, jan.kratochvil, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-3.3.5-2.fc16.x86_64 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-05-14 09:00:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Attempt for a reproducer; nothing is reproduced.
none
Reproducer
none
may be fix (incomplete!)
none
Better reproducer none

Description Jan Kratochvil 2012-04-07 22:28:57 UTC
Created attachment 575976 [details]
Attempt for a reproducer; nothing is reproduced.

Description of problem:
There is kernel-3.2.x -> kernel-3.3.x regression for GDB inferior calls with float parameters.

Version-Release number of selected component (if applicable):
PASS: kernel-3.2.5-3.fc16.x86_64
PASS: kernel-3.2.10-3.fc16.x86_64
FAIL: kernel-3.3.0-4.fc16.x86_64
FAIL: kernel-3.3.1-3.fc16.x86_64
FAIL: kernel-3.4.0-0.rc1.git3.1.fc18.x86_64
and with maxcpus:
PASS: kernel-3.3.1-3.fc16.x86_64 maxcpus=0
PASS: kernel-3.3.1-3.fc16.x86_64 maxcpus=2
FAIL: kernel-3.3.1-3.fc16.x86_64 maxcpus=4
It is on i7-920 which has 4 cores with multithreading = 8 cpuinfo entries.
It is also reproducible inside KVM with host kernel-3.2.10-3.fc16.x86_64 and guest kernel-3.3.1-3.fc16.x86_64 with qemu-kvm -smp 4.

How reproducible:
It is racy, sometimes it is 100% reproducible, sometimes it is reproducible on 12th run of runtest etc. but it is easily reproducible in general.

Steps to Reproduce:
git clone git://sourceware.org/git/gdb.git gdb-git; cd gdb-git; ./configure; make; cd gdb/testsuite; make site.exp
i=0;while runtest gdb.base/callfuncs.exp;do i=$[$i+1];echo $i;done

Actual results:
FAIL: gdb.base/callfuncs.exp: p t_float_values2(3.14159,float_val2)
FAIL: gdb.base/callfuncs.exp: Call function with many float arguments.
sometimes even 6 FAILs.

Expected results:
# of expected passes		150
# of known failures		7
(no FAIL, there are 7 KFAILs, that is OK)

Additional info:
For: FAIL: gdb.base/callfuncs.exp: p t_float_values2(3.14159,float_val2)
gets called: t_float_values2 (0.000000, 3.141590)
instead of:  t_float_values2 (3.141590, -2.376500)
Some weird shift of fpregs contents in kernel?

GDB is using in this case PTRACE_GETFPREGS + PTRACE_SETFPREGS.
I did not test how behaves PTRACE_GETREGSET + PTRACE_SETREGSET (as I do not have handy CPU with 'xsave' flag around).

I have tried a reproducer for ptrace-testsuite but I do not have the problem reproducible outside of GDB.  Could you advice?

It affects more GDB testcases than just gdb.base/callfuncs.exp, at least also gdb.base/dfp-test.exp.

Comment 1 Oleg Nesterov 2012-04-08 22:19:12 UTC
(In reply to comment #0)
> 
> Description of problem:
> There is kernel-3.2.x -> kernel-3.3.x regression for GDB inferior calls with
> float parameters.

Obviously, I'd suspect this was cause by i387 fixes from Linus.
The only problem is, right now I do not understand how this is
possible ;) Perhaps we can try the revert-them-all patch to ensure.

> Steps to Reproduce:
> git clone git://sourceware.org/git/gdb.git gdb-git; cd gdb-git; ./configure;
> make; cd gdb/testsuite; make site.exp
> i=0;while runtest gdb.base/callfuncs.exp;do i=$[$i+1];echo $i;done
> 
> Actual results:
> FAIL: gdb.base/callfuncs.exp: p t_float_values2(3.14159,float_val2)
> FAIL: gdb.base/callfuncs.exp: Call function with many float arguments.
> sometimes even 6 FAILs.

Ooh. You know, I didn't even try to look into these .exp test-cases.
I did before, and I simply can't understand this black magic.

Any chance you can provide the simplified PSEUDO-CODE in C (without
error-checking/etc) which fails? So that, fingers crossed, I could
more or less understand what the failing test-case does and what is
actually wrong.

Comment 2 Jan Kratochvil 2012-04-09 07:01:01 UTC
(In reply to comment #1)
> Obviously, I'd suspect this was cause by i387 fixes from Linus.

The commit hash?  I would have to fully bisect it otherwise which I did not want to spend time on.

> Ooh. You know, I didn't even try to look into these .exp test-cases.
> I did before, and I simply can't understand this black magic.

I do not think it makes sense to analyse more the GDB code.  GDB inferior calls code is big.

> Any chance you can provide the simplified PSEUDO-CODE in C (without
> error-checking/etc) which fails?

I have tried to produce a simplified reproducer as you can see in the attachment but it does not reproduce the problem.


My idea is to reduce the kernel changes to a single commit and then you can review that change to possibly find the bug. Hints where the problem is that:
 * it is somehow related to SMP-CPU context switches (probably)

Comment 3 Oleg Nesterov 2012-04-10 21:15:39 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Obviously, I'd suspect this was cause by i387 fixes from Linus.
> 
> The commit hash?  I would have to fully bisect it otherwise which I did not
> want to spend time on.


1361b83a13d4d92e53fbb6c877528713e118b821 i387: Split up <asm/i387.h> into exported and internal interfaces
8546c008924d5fd1724fa698eaa92b414bafd50d i387: Uninline the generic FP helpers that we expose to kernel modules
27e74da9800289e69ba907777df1e2085231eff7 i387: export 'fpu_owner_task' per-cpu variable
7e16838d94b566a17b65231073d179bc04d590c8 i387: support lazy restore of FPU state
80ab6f1e8c981b1b6604b2f22e36c917526235cd i387: use 'restore_fpu_checking()' directly in task switching code
cea20ca3f3181fc36788a15bc65d1062b96a0a6c i387: fix up some fpu_counter confusion
34ddc81a230b15c0e345b6b253049db731499f7e i387: re-introduce FPU state preloading at context switch time
f94edacf998516ac9d849f7bc6949a703977a7f3 i387: move TS_USEDFPU flag from thread_info to task_struct
4903062b5485f0e2c286a23b44c9b59d9b017d53 i387: move AMD K7/K8 fpu fxsave/fxrstor workaround from save to restore
b3b0870ef3ffed72b92415423da864f440f57ad6 i387: do not preload FPU state at task switch time
6d59d7a9f5b723a7ac1925c136e93ec83c0c3043 i387: don't ever touch TS_USEDFPU directly, use helper functions
b6c66418dcad0fcf83cd1d0a39482db37bf4fc41 i387: move TS_USEDFPU clearing out of __save_init_fpu and into callers
15d8791cae75dca27bfda8ecfe87dca9379d6bb0 i387: fix x86-64 preemption-unsafe user stack save/restore
c38e23456278e967f094b08247ffc3711b1029b2 i387: fix sense of sanity check
5b1cbac37798805c1fee18c8cebe5c0a13975b17 i387: make irq_fpu_usable() tests more robust
be98c2cdb15ba26148cd2bd58a857d4f7759ed38 i387: math_state_restore() isn't called from asm

(in reverse order, git log | grep i387)

It would be really great if you can test the kernel before
be98c2cdb15ba26148cd2bd58a857d4f7759ed38 at least.

Comment 4 Jan Kratochvil 2012-04-11 20:52:31 UTC
7e16838d94b566a17b65231073d179bc04d590c8 is the first bad commit
commit 7e16838d94b566a17b65231073d179bc04d590c8
Author: Linus Torvalds <torvalds>
Date:   Sun Feb 19 13:27:00 2012 -0800

    i387: support lazy restore of FPU state
    
    This makes us recognize when we try to restore FPU state that matches
    what we already have in the FPU on this CPU, and avoids the restore
    entirely if so.
    
    To do this, we add two new data fields:
    
     - a percpu 'fpu_owner_task' variable that gets written any time we
       update the "has_fpu" field, and thus acts as a kind of back-pointer
       to the task that owns the CPU.  The exception is when we save the FPU
       state as part of a context switch - if the save can keep the FPU
       state around, we leave the 'fpu_owner_task' variable pointing at the
       task whose FP state still remains on the CPU.
    
     - a per-thread 'last_cpu' field, that indicates which CPU that thread
       used its FPU on last.  We update this on every context switch
       (writing an invalid CPU number if the last context switch didn't
       leave the FPU in a lazily usable state), so we know that *that*
       thread has done nothing else with the FPU since.
    
    These two fields together can be used when next switching back to the
    task to see if the CPU still matches: if 'fpu_owner_task' matches the
    task we are switching to, we know that no other task (or kernel FPU
    usage) touched the FPU on this CPU in the meantime, and if the current
    CPU number matches the 'last_cpu' field, we know that this thread did no
    other FP work on any other CPU, so the FPU state on the CPU must match
    what was saved on last context switch.
    
    In that case, we can avoid the 'f[x]rstor' entirely, and just clear the
    CR0.TS bit.
    
    Signed-off-by: Linus Torvalds <torvalds>

Comment 5 Oleg Nesterov 2012-04-13 00:49:03 UTC
(In reply to comment #4)
> 7e16838d94b566a17b65231073d179bc04d590c8 is the first bad commit

Oh, great, thanks a lot Jan!

Please give me some time. As usual, I'll pretend I am busy and can't
react right now, but I'll try to investigate further asap. If nothing
else, we can always re-assign this bug to Linus ;)

Comment 6 Chuck Ebbert 2012-04-13 08:34:09 UTC
Created attachment 577270 [details]
Reproducer

Comment 7 Jan Kratochvil 2012-04-13 16:19:51 UTC
The reproducer works but only in ~50% of cases, isn't there a reliable way with sched_setaffinity?  I will see from Oleg's fix what is really happening there for making the testcases results stable, if possible.

Comment 8 Oleg Nesterov 2012-04-14 23:23:07 UTC
Created attachment 577491 [details]
may be fix (incomplete!)

I seem to understand the problem...

could you try this patch? Just in case, it is not complete,
only for this test-case.

And, btw, the reproducer doesn't work for me unless I spawn
a lot of cpu-intensive tasks. But now this is more or less
clear why.

Comment 9 Oleg Nesterov 2012-04-15 22:50:35 UTC
(In reply to comment #8)
>
> could you try this patch?

Better yet, please try the patch from Linus. I am sure it is
correct, but it would be nice if you can test and confirm.

Comment 10 Chuck Ebbert 2012-04-18 10:09:49 UTC
Created attachment 578300 [details]
Better reproducer

This test program still leaves orphan children stuck in signal handling when the tracer dies, but it's a lot more reliable.

Linus's patch works for me with this test.

Comment 11 Chuck Ebbert 2012-04-27 03:05:13 UTC
commit 089f9fba5
i387: ptrace breaks the lazy-fpu-restore logic

Will be in 3.4-rc5 and 3.3.5

Comment 12 Jan Kratochvil 2012-05-01 02:15:33 UTC
Included the comment 10 testcase as:
  http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/fpregs-smp.c?cvsroot=systemtap

/* Value 8 has false PASS while value 10 already makes it reproducible.  */
#define LOOPS 12

/* If it does not FAIL in the first cycle it will never FAIL.  Reproducibility
   is about 50%.  With running background processes the reproducibility is 0%.
   */

I do not understand the behavior but I left it as is.

Comment 13 Josh Boyer 2012-05-07 19:11:13 UTC
As Chuck noted, this should be fixed with 3.3.5.  That has been committed to the various branches now.

Comment 14 Jan Kratochvil 2012-05-14 07:04:38 UTC
Thanks:
PASS: kernel-3.2.10-3.fc16.x86_64 - before changes
FAIL: kernel-3.3.2-6.fc16.x86_64 - bugreport
PASS: kernel-3.3.4-3.fc16.x86_64 - curiously problem is not reproducible
PASS: kernel-3.3.5-2.fc16.x86_64 - verified