Bug 144805 - New kernel causes unexpected SIGTRAPs when making inferior calls in gdb
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 5
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Dave Jones
Duplicates: 149432
Reported: 2005-01-11 17:23 UTC by Diego Novillo
Modified: 2015-01-04 22:15 UTC (History)

Doc Type: Bug Fix
Last Closed: 2006-10-17 23:04:16 UTC
Type: ---

Description Diego Novillo 2005-01-11 17:23:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:

After upgrading to the latest 2.6.10 kernel, gdb has started behaving
erratically when making inferior calls:

$ gdb --args gcc/cc1 -O2 -ftree-vrp a.c
GNU gdb Red Hat Linux (6.1post-1.20040607.43rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under
certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-redhat-linux-gnu"...
Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) b tree-vrp.c:114
Breakpoint 1 at 0x8497d84: file /home/dnovillo/tcb/src/gcc/tree-vrp.c,
line 114.
(gdb) run
Starting program: /home/dnovillo/notnfs/BLD-tcb-native.topo/gcc/cc1
-O2 -ftree-vrp a.c

Breakpoint 1, get_range_from_assert (vr_p=0xbfffe970, expr=0xb7ee8510)
    at /home/dnovillo/tcb/src/gcc/tree-vrp.c:114
114           vr_p->type = VR_HALF_RANGE;
(gdb) n
115           vr_p->min = fold (build (PLUS_EXPR, type, limit, one));
(gdb) n
116           vr_p->max = NULL_TREE;
(gdb) n
120     }
(gdb) p *vr_p
$1 = {
  type = VR_HALF_RANGE,
  min = 0xb7ee15d0,
  max = 0x0
}
(gdb) call print_generic_expr(stderr,$1.min,0)
0
(gdb) call print_generic_expr(stderr,$1.min,0)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x081024d4 in print_generic_expr (file=0xb7ee15d0, t=0x0, flags=158135312)
    at /home/dnovillo/tcb/src/gcc/tree-pretty-print.c:145
145     {
The program being debugged was signaled while in a function called
from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function
(print_generic_expr) will be abandoned.
(gdb)

Version-Release number of selected component (if applicable):
kernel-smp-2.6.10-1.737_FC3

How reproducible:
Always

Steps to Reproduce:
1. Compile this with -g:
#include <stdio.h>

foo (int *p)
{
  fprintf (stderr, "%d\n", *p);
}

main()
{
  int i;
  int *p = &i;
  i = 3;
  foo (p);
}

$ gcc -g b.c -o b
$ gdb --args ./b


Actual Results:
GNU gdb Red Hat Linux (6.1post-1.20040607.43rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are welcome to change it and/or distribute copies of it under
certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-redhat-linux-gnu"...
Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) b main
Breakpoint 1 at 0x80483cf: file b.c, line 11.
(gdb) run
Starting program: /home/dnovillo/tests/00wrk/vrp/b

Breakpoint 1, main () at b.c:11
11        int *p = &i;
(gdb) n
12        i = 3;
(gdb) n
13        foo (p);
(gdb) call foo (p)
3
$1 = 2
(gdb) call foo (p)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048391 in foo (p=0x0) at b.c:4
4       {
The program being debugged was signaled while in a function called
from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (foo) will be
abandoned.

Expected Results:  
The second call to foo() should not have trapped.

Additional info:


I think that the previous kernel I had gotten from updates-testing did
not exhibit this problem (kernel-smp-2.6.10-1.727_FC3).  I will try it
again.

I cannot try with the latest 2.6.9 (kernel-smp-2.6.9-1.724_FC3)
because that one doesn't even work with gdb
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=144411)

Comment 1 Roland McGrath 2005-01-11 23:26:23 UTC
Seems to be fine with the rawhide kernel, and my own upstream build.


Comment 2 Dave Jones 2005-01-12 00:12:26 UTC
Which means it's something that was fixed post-2.6.10. Hmm, any idea which
cset(s) may be involved?  ISTR that 2.6.10-ac8, on which the FC3 update is based,
had some backports of the signal work done post-2.6.10. Is it possible
that some bits of that are missing?

Comment 3 Ian Lance Taylor 2005-01-13 01:50:52 UTC
I am also seeing this problem.

Comment 4 Ian Lance Taylor 2005-01-14 18:09:33 UTC
This seems to be working better in 2.6.10-1.741_FC3.


Comment 5 Diego Novillo 2005-01-14 22:53:18 UTC
It takes a bit longer, but I can still reproduce this problem with
2.6.10-1.741_FC3smp after several inferior calls in one session.

Comment 6 Diego Novillo 2005-02-14 21:36:45 UTC
Still running into this problem after making several inferior calls in
one debugging session.

2.6.10-1.760_FC3smp #1 SMP Wed Feb 2 00:29:03 EST 2005 i686 i686 i386
GNU/Linux

Comment 7 Roland McGrath 2005-02-14 22:28:32 UTC
Can you try the rawhide kernel and see if the bug still shows up there?
I suspect that we only have a problem with a botched backport, but we should be
positive that the upstream code is good before trying to sort that out.

Comment 8 Diego Novillo 2005-02-14 22:36:59 UTC
(In reply to comment #7)
> Can you try the rawhide kernel and see if the bug still shows up there?

Sure.  URL?


Thanks.  Diego.

Comment 10 Jeff Johnston 2005-02-23 00:01:59 UTC
*** Bug 149432 has been marked as a duplicate of this bug. ***

Comment 11 Andrew Cagney 2005-02-24 15:46:56 UTC
From bug 149432; still occurs in:

Version-Release number of selected component (if applicable):
kernel-2.6.10-1.760_FC3smp


Comment 12 Roland McGrath 2005-02-25 23:40:37 UTC
That kernel is already out of date.  Have you tried the current fc3-updates
kernel?  We also haven't gotten an answer from Diego about whether the rawhide
kernel has any problems, which will help us understand if we have a real bug or
just a patch merging botch.

Comment 13 Diego Novillo 2005-02-26 06:40:47 UTC
This is not a machine that I can reboot at will.  Several weeks may go by before
I get a chance to kill everything I'm doing and start from scratch.  I'll try
to test the rawhide kernel in the next few days.

Comment 14 Ian Lance Taylor 2005-02-27 00:10:14 UTC
For the record, the problem does still happen with kernel-2.6.10-1.766_FC3.

Comment 15 Diego Novillo 2005-02-27 00:18:24 UTC
I tried installing the rawhide kernel.  It gives me a slew of warnings.  Are
these ignorable?

$ sudo rpm -ivh kernel-smp-2.6.10-1.1153_FC4.i686.rpm
Preparing...                ########################################### [100%]
   1:kernel-smp             ########################################### [100%]
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pcmcia/pdaudiocf/snd-pdaudiocf.ko
needs unknown symbol print_tainted
WARNING: /lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/drivers/vx/snd-vx-lib.ko
needs unknown symbol print_tainted
WARNING: /lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/isa/snd-azt2320.ko needs
unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/isa/cs423x/snd-cs4231-lib.ko
needs unknown symbol print_tainted
WARNING: /lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pci/cs46xx/snd-cs46xx.ko
needs unknown symbol print_tainted
WARNING: /lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pci/snd-azt3328.ko needs
unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pci/ice1712/snd-ice1724.ko needs
unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pci/ice1712/snd-ice1712.ko needs
unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/pci/korg1212/snd-korg1212.ko
needs unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/usb/usx2y/snd-usb-usx2y.ko needs
unknown symbol print_tainted
WARNING: /lib/modules/2.6.10-1.1153_FC4smp/kernel/sound/usb/snd-usb-lib.ko needs
unknown symbol print_tainted
WARNING:
/lib/modules/2.6.10-1.1153_FC4smp/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-smi.ko
needs unknown symbol print_tainted
[  ..... ]

Comment 16 Dave Jones 2005-02-27 00:27:32 UTC
That particular problem should be fixed in 1.1154_FC4.

Comment 17 Benjamin Kosnik 2005-04-13 20:24:29 UTC
Still an issue with kernel-2.6.11-1.14_FC3.

Comment 18 Benjamin Kosnik 2005-04-19 07:15:32 UTC
Diego, in between sniffs into kleenex, says 

kernel-2.6.9-1.681_FC3

works. Accept no substitute!

I'm bumping up priority on this sucker.

-benjamin

Comment 20 Dave Jones 2005-07-15 17:44:22 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 21 Ian Lance Taylor 2005-07-15 17:47:41 UTC
I upgraded to Fedora Core 4 a couple of weeks ago, and I have not seen the
problem since.  I'll report back if it happens again.

Comment 22 Diego Novillo 2005-07-18 13:21:57 UTC
(In reply to comment #21)
> I upgraded to Fedora Core 4 a couple of weeks ago, and I have not seen the
> problem since.  I'll report back if it happens again.

Likewise.  I do have other machines running FC3.  I'll see if I can reproduce
there with the new kernel.

Comment 23 John Reiser 2005-08-17 03:12:23 UTC
I can reproduce this on x86_64 running kernel-2.6.12-1.1398_FC4.  The following
8-instruction program just execs itself over and over:
-----execve.S
#include <asm/unistd.h>

/*
gcc -o execve -nostartfiles -nostdlib execve.S
gdb ./execve
run
p/x $ps
   # 0x202
c
p/x $ps
   # 0x302  TF (0x100) set, but should not be
*/

_start: .globl _start
        nop; int3
        popq %rcx  # argc
        movq (%rsp),%rdi  # same filename from argv[0]
        movq %rsp,%rsi    # same argv
        lea 8(%rsp,%rcx,8),%rdx  # same envp
        movl $__NR_execve,%eax   # here we go 'round the mulberry bush, ...
        syscall
-----end of execve.S
When run under gdb, the TF trace flag gets set on the 2nd verse.  Where did that
come from?

Also, if the 'int3' is replaced with 'nop', so that there is no reason at all to
trap, then there is a trap under gdb anyway.  Of course when run from bash
without gdb, then the program just spins merrily.  Also "strace -f gdb ./execve"
then "run\r" spins while spewing one line per execve.  So using strace has
altered functional behavior in an unexpected way: it "fixed" the bug.

$ gdb ./execve  # after replacing 'int3'==>'nop', then re-compiling
GNU gdb Red Hat Linux (6.3.0.0-1.21rh)
 ...
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) run
Starting program: /home/jreiser/execve

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000004000b2 in _start ()
(gdb) p/x $ps
$1 = 0x302  # TF bit (0x100) set
(gdb) x/4i _start
0x4000b0 <_start>:      nop
0x4000b1 <_start+1>:    nop  # there is no 'int3' here!
0x4000b2 <_start+2>:    pop    %rcx
0x4000b3 <_start+3>:    mov    (%rsp),%rdi
(gdb)
-----

-----/proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 47
model name      : AMD Athlon(tm) 64 Processor 3200+
stepping        : 0
-----
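
For anyone comparing the $ps values quoted above: 0x202 and 0x302 differ only
in bit 8 (0x100), which is the x86 trap flag (TF).  A minimal C sketch of that
decoding follows.  The bit values are the architectural EFLAGS definitions; the
macro and function names (EFLAGS_TF, tf_is_set, clear_tf, describe_ps) are made
up for illustration and do not come from gdb or the kernel.

```c
#include <stdio.h>

/* x86 EFLAGS bits (architectural values). */
#define EFLAGS_FIXED 0x000002UL  /* bit 1: reserved, always reads as 1 */
#define EFLAGS_TF    0x000100UL  /* bit 8: trap flag (single-step) */
#define EFLAGS_IF    0x000200UL  /* bit 9: interrupt enable */
#define EFLAGS_ID    0x200000UL  /* bit 21: ID flag (CPUID available) */

/* Hypothetical helper: nonzero if the trap flag is set in a $ps value. */
int tf_is_set(unsigned long ps)
{
    return (ps & EFLAGS_TF) != 0;
}

/* Hypothetical helper: the same value with only the trap flag cleared. */
unsigned long clear_tf(unsigned long ps)
{
    return ps & ~EFLAGS_TF;
}

/* Hypothetical helper: print the state of TF and IF for a $ps value. */
void describe_ps(unsigned long ps)
{
    printf("$ps = 0x%lx: TF %s, IF %s\n", ps,
           (ps & EFLAGS_TF) ? "set" : "clear",
           (ps & EFLAGS_IF) ? "set" : "clear");
}
```

With these helpers, tf_is_set(0x202) is 0 and tf_is_set(0x302) is 1, and
clear_tf(0x302) gives back 0x202, the value seen before the stray trap.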



Comment 24 John Reiser 2005-08-17 03:29:59 UTC
Same problems on i686, kernel-2.6.12-1.1398_FC4.

-----execve.S
#include <asm/unistd.h>

/*
gcc -o execve -nostartfiles -nostdlib execve.S
*/

_start: .globl _start
        nop; nop
        popl %ebp  # argc
        movl (%esp),%ebx  # same filename from argv[0]
        movl %esp,%ecx    # same argv
        lea 4(%esp,%ebp,4),%edx  # same envp
        movl $__NR_execve,%eax   # here we go 'round the mulberry bush, ...
        int $0x80
-----

-----/proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 1.60GHz
stepping        : 4
-----


Comment 25 Dave Jones 2005-09-29 07:15:03 UTC
Hmm, this has been around for a while, and seems to still be present
(though rawhide x86-64 now immediately gets a SIGSEGV).

This is probably going to get more traction if you bring it up upstream on
the linux-kernel mailing list.

I just double-checked a vanilla kernel, and it's present in plain 2.6.13.2 too.

Comment 26 John Reiser 2005-09-29 16:03:56 UTC
Transcribed to Linux kernel mailing list:

Message-ID: <433C0F21.8070104>
Date:   Thu, 29 Sep 2005 08:58:25 -0700
From:   John Reiser <jreiser>
Subject: ptrace unexpected SIGTRAP (trace bit) on x86, x86_64  kernel 2.6.13.2


Comment 27 Dave Jones 2006-01-16 22:13:41 UTC
This is a mass-update to all currently open Fedora Core 3 kernel bugs.

Fedora Core 3 support has transitioned to the Fedora Legacy project.
Due to the limited resources of this project, typically only
updates for new security issues are released.

As this bug isn't security related, it has been migrated to a
Fedora Core 4 bug.  Please upgrade to this newer release, and
test if this bug is still present there.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

Thank you.


Comment 28 John Reiser 2006-01-20 21:07:41 UTC
There was essentially no response on the LKML (Comment #26.)

The bad behavior still exists on i686 Fedora Core 5 test 2,
kernel-2.6.15-1.1863_FC5.  [x86_64 not yet tested.]
So: somebody with enough authority could change the Version of this bugzilla
report to "fc5test2".

(gdb) run
Starting program: execve-spin
Program received signal SIGTRAP, Trace/breakpoint trap.
0x08048076 in _start ()
(gdb) p/x $ps
$1 = 0x200302   ## Trace bit (0x100) set
(gdb) x/5i $pc
0x8048076 <_start+2>:   pop    %ebp
0x8048077 <_start+3>:   mov    (%esp),%ebx
0x804807a <_start+6>:   mov    %esp,%ecx
0x804807c <_start+8>:   lea    0x4(%esp,%ebp,4),%edx
0x8048080 <_start+12>:  mov    $0xb,%eax
(gdb) x/5i _start   ## shows no 'int3'; SIGTRAP was entirely the kernel's idea.
0x8048074 <_start>:     nop
0x8048075 <_start+1>:   nop
0x8048076 <_start+2>:   pop    %ebp


Comment 29 Dave Jones 2006-02-03 05:14:24 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 30 John Reiser 2006-02-03 20:42:46 UTC
The extraneous Trace bit (bit value 0x100) is still seen under
kernel-2.6.15-1.1826.2.10_FC5 on i686; program in Comment #24.
The extraneous Trace bit (bit value 0x100) is still seen under
kernel-2.6.15-1.1884_FC5 on amd64 (x86_64); program in Comment #23.

So, someone with enough privileges: please change the Version to fc5test2.




Comment 31 Dave Jones 2006-10-16 17:38:27 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 32 John Reiser 2006-10-17 20:28:35 UTC
The testcase of Comment #24 works for me on i686 under kernel-2.6.18-1.2200.fc5;
namely, when run under gdb the testcase spins (execve-ing itself over and over)
with no SIGTRAP reported by gdb.  [gdb address space grows by 8MB/s, but that's
a different problem.]  I suspect that general work in utrace, and/or the fix for
bug #205659 "SIGTRAP cannot be caught" may be related.

Comment 33 Dave Jones 2006-10-17 23:04:16 UTC
great. thanks for retesting.


