Bug 205659 - SIGTRAP cannot be caught
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Roland McGrath
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 227693
 
Reported: 2006-09-07 21:26 UTC by John Reiser
Modified: 2007-11-30 22:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-11-12 05:47:46 UTC
Type: ---
Embargoed:


Attachments
signal.c C-language main program which traces itself (995 bytes, text/plain)
2006-09-07 21:27 UTC, John Reiser
go_asm.S assembly-language subroutine which sets Trace bit (164 bytes, text/plain)
2006-09-07 21:28 UTC, John Reiser
BISECT_LOG from git bisect of kernel.org 2.6.16 through 2.6.18-rc6 (2.51 KB, text/plain)
2006-09-08 19:04 UTC, John Reiser
test case source, 32 or 64 (1.29 KB, text/x-csrc)
2006-10-12 04:39 UTC, Roland McGrath
fix patch (2.71 KB, patch)
2006-10-12 04:45 UTC, Roland McGrath

Description John Reiser 2006-09-07 21:26:11 UTC
Description of problem: Attempts to catch SIGTRAP fail.  Even with a sigaction
for SIGTRAP installed, the SIGTRAP causes process termination.  In particular, a
process cannot trace itself.


Version-Release number of selected component (if applicable):
kernel-2.6.17-1.2630.fc6

How reproducible:
Always.

Steps to Reproduce:
1. compile and run the attached program which traces itself for 10 instructions.
  
Actual results:
SIGTRAP - process killed

Expected results:
A trace of ten instructions, then normal exit.  On Fedora Core 5 under
kernel-2.6.17-1.2145_FC5, the correct output is:
5  0x40079b  316  48 83 c0 01 48 83 e9 02
5  0x40079f  302  48 83 e9 02 c3 90 90 90
5  0x4007a3  306  c3 90 90 90 90 90 90 90
5  0x40077c  306  bf ba 08 40 00 e8 d2 fd
5  0x400781  306  e8 d2 fd ff ff b8 00 00
5  0x400558  306  ff 25 a2 06 10 00 68 05
5  0x40055e  306  68 05 00 00 00 e9 90 ff
5  0x400563  306  e9 90 ff ff ff ff 25 9a
5  0x4004f8  306  ff 35 ca 06 10 00 ff 25
5  0x4004fe  306  ff 25 cc 06 10 00 90 90


Additional info:
Runs fine under latest FC5, kernel-2.6.17-1.2145_FC5.
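
For readers without the attachments, the core expectation in this report (a
handler installed with sigaction for SIGTRAP should run instead of the process
being killed) can be sketched in C as follows.  This is a minimal sketch and
not the attached signal.c: it delivers the signal with raise() rather than by
setting the Trace bit, and the names on_sigtrap, traps_seen, and
catch_sigtrap_once are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counter written by the handler; sig_atomic_t is safe to update there. */
static volatile sig_atomic_t traps_seen = 0;

/* Hypothetical handler: only counts SIGTRAP deliveries. */
static void on_sigtrap(int signo, siginfo_t *info, void *uctx)
{
    (void)info;
    (void)uctx;
    if (signo == SIGTRAP)
        traps_seen++;
}

/* Install the handler, send one SIGTRAP to ourselves, and return how many
   were caught (or -1 if sigaction/raise failed). */
int catch_sigtrap_once(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigtrap;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTRAP, &sa, NULL) != 0)
        return -1;

    traps_seen = 0;
    if (raise(SIGTRAP) != 0)
        return -1;
    return (int)traps_seen;
}
```

Called from main in a single-threaded program, catch_sigtrap_once() should
return 1 when the handler runs.  Because raise() need not take the same kernel
path as a single-step trap, a passing result here does not rule out the bug
described above; the attached test case remains the real reproducer.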

Comment 1 John Reiser 2006-09-07 21:27:44 UTC
Created attachment 135815 [details]
signal.c  C-language main program which traces itself

Comment 2 John Reiser 2006-09-07 21:28:39 UTC
Created attachment 135816 [details]
go_asm.S  assembly-language subroutine which sets Trace bit

Comment 3 John Reiser 2006-09-07 21:48:31 UTC
The analogous program for i386 cannot catch SIGTRAP under
kernel-2.6.17-1.2611.fc6 running on i686.
-----go_asm.S  i386 version
go_asm:
        pushf
        orb $1,1(%esp)
        popf
        nop
        addl $1,%eax
        subl $2,%ecx
        ret
-----
and use REG_EIP instead of REG_RIP in signal.c.
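
The "orb $1,1(%esp)" step above is what sets the Trace bit: pushf stores the
flags word on the stack in little-endian order, so OR-ing 1 into the byte at
offset 1 sets bit 8 (0x100), which is the x86 Trap Flag.  A small sketch of
that arithmetic in C, using hypothetical helper names and a locally defined
constant (nothing here is taken from the attachments):

```c
#include <stdint.h>

/* Local definition for illustration: the Trap Flag is bit 8 of EFLAGS. */
#define TRAP_FLAG_MASK 0x100u

/* Mirror `orb $1,1(%esp)`: OR 1 into the second-lowest byte of the
   little-endian flags image that pushf left on the stack. */
uint32_t set_trace_flag_bytewise(uint32_t eflags)
{
    uint8_t bytes[4];
    uint32_t out = 0;

    for (int i = 0; i < 4; i++)
        bytes[i] = (uint8_t)(eflags >> (8 * i));  /* x86 byte order */
    bytes[1] |= 1;                                /* the orb instruction */
    for (int i = 0; i < 4; i++)
        out |= (uint32_t)bytes[i] << (8 * i);
    return out;
}

/* The same operation expressed on the whole word. */
uint32_t set_trace_flag(uint32_t eflags)
{
    return eflags | TRAP_FLAG_MASK;
}
```

For example, a flags value of 0x202 becomes 0x302 under either helper.  Once
popf reloads the modified word, the CPU raises a debug trap after each
following instruction, which the kernel is expected to turn into SIGTRAP.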


Comment 4 John Reiser 2006-09-07 21:56:26 UTC
On Fedora Core 5 for i686, kernel-2.6.17-1.2145_FC5, the correct output is:
5  0x804849f  382  83 c0 01 83 e9 02 c3 90
5  0x80484a2  302  83 e9 02 c3 90 90 55 89
5  0x80484a5  393  c3 90 90 55 89 e5 83 ec
5  0x80485e6  393  c7 04 24 c6 86 04 08 e8
5  0x80485ed  393  e8 92 fd ff ff b8 00 00
5  0x8048384  393  ff 25 cc 97 04 08 68 10
5  0x804838a  393  68 10 00 00 00 e9 c0 ff
5  0x804838f  393  e9 c0 ff ff ff ff 25 d0
5  0x8048354  393  ff 35 bc 97 04 08 ff 25
5  0x804835a  393  ff 25 c0 97 04 08 00 00
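
The column layout of these trace lines is not spelled out in this report.  One
plausible reading, offered here as an assumption, is: signal number (5 is
SIGTRAP on Linux), instruction pointer, the flags register in hexadecimal, and
the first eight bytes at that address.  Under that assumption, the third column
can be decoded with a short C helper such as the sketch below; decode_eflags
and the table are hypothetical names, while the bit positions are the standard
x86 status and control flags.

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
    unsigned mask;
    const char *name;
};

/* Standard x86 EFLAGS bits, lowest first. */
static const struct flag_name x86_flags[] = {
    {0x001, "CF"}, {0x004, "PF"}, {0x010, "AF"}, {0x040, "ZF"},
    {0x080, "SF"}, {0x100, "TF"}, {0x200, "IF"}, {0x400, "DF"},
    {0x800, "OF"},
};

/* Write the space-separated names of the set flags into buf; returns buf. */
char *decode_eflags(unsigned eflags, char *buf, size_t buflen)
{
    if (buflen == 0)
        return buf;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof x86_flags / sizeof x86_flags[0]; i++) {
        if (eflags & x86_flags[i].mask) {
            size_t used = strlen(buf);
            snprintf(buf + used, buflen - used, "%s%s",
                     used ? " " : "", x86_flags[i].name);
        }
    }
    return buf;
}
```

With this helper, 0x393 decodes to "CF AF SF TF IF" and 0x302 to "TF IF".  TF
and IF are set in every value listed above, which is consistent with the trap
flag staying on for the whole traced run, but that reading is an inference and
not something the report states.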

Comment 5 John Reiser 2006-09-08 19:01:13 UTC
This seems to be a Fedora-only problem; could it be related to xen?  The
attached BISECT_LOG gives evidence that no 2.6.17-* kernel from kernel.org has
the problem.  Yesterday's (2006-09-07) latest, changeset
10387e5eb45c6e48d67102b88229f5bc6037461c , is good.  But for testing purposes, I
told git bisect that it was bad, and that the last known good version was
2.6.16.  The sixteen bisections in between were all good.

Comment 6 John Reiser 2006-09-08 19:04:12 UTC
Created attachment 135868 [details]
BISECT_LOG from git bisect of kernel.org 2.6.16 through 2.6.18-rc6

The first kernel 10387e5eb45c6e48d67102b88229f5bc6037461c is in fact good, but
was labeled as bad to force a probe of many changesets in 2.6.17-*

Comment 7 Dave Jones 2006-09-14 05:31:32 UTC
Roland, any ideas ?

Comment 8 John Reiser 2006-09-26 15:06:08 UTC
This problem persists in  kernel-2.6.18-1.2693.fc6  for i386  [x86_64 not yet
tested.]


Comment 9 John Reiser 2006-09-27 15:16:50 UTC
Also still fails on x86_64 with kernel-2.6.18-1.2693.fc6.

Comment 10 John Reiser 2006-09-28 02:16:01 UTC
This 20-month-old article (originally from LKML) looks to be related.  It
suggests that kprobes interfere: 
http://www.gatago.com/linux/kernel/15462875.html (2005-01-18) "x86-64: int3 no
longer causes SIGTRAP in 2.6.10"

Comment 11 Dave Jones 2006-09-28 22:13:05 UTC
So this definitely isn't reproducible with a vanilla 2.6.18?


Comment 12 Roland McGrath 2006-09-28 22:39:43 UTC
sorry, cannot help much w/broken wrist

if reproduces, pls try disable CONFIG_KPROBES, CONFIG_UTRACE in otherwise
rawhide kernel src rpm, see which of 4 permutations differ

Comment 13 John Reiser 2006-09-29 01:24:54 UTC
On i686 [AMD Duron], linux-2.6.18.tar.bz2, built after "make oldconfig" starting
from configs/kernel-2.6.18-i686.config of 2693, but with no patches, produces a
kernel under which the testcase runs correctly: self-traces 10 instructions,
then quits;  (does not kill the process with SIGTRAP.)  The resulting .config
has CONFIG_UTRACE=y and CONFIG_KPROBES=y; but again, no patches from SOURCES
were applied to the kernel.org source.  So I believe that this shows a vanilla
2.6.18 does not have the bug.
 

Comment 14 John Reiser 2006-09-29 02:31:17 UTC
Setting CONFIG_KPROBES=y and CONFIG_UTRACE=n in a build of
kernel-2.6.18-1.2693.fc6 gets:
  CC [M]  drivers/net/pcmcia/ibmtr_cs.o
In file included from drivers/net/pcmcia/ibmtr_cs.c:50:
include/linux/ptrace.h: In function ‘ptrace_do_wait’:
include/linux/ptrace.h:227: error: ‘ECHILD’ undeclared (first use in this function)

[Build sequence was:
  rpmbuild -bp --target i686 kernel-2.6.spec
  cd BUILD/kernel-2.6.18/linux-2.6.18.i686
  cp configs/kernel-2.6.18-i686.config .config
  <<edit .config>>
  make oldconfig; make; make modules_install; mkinitrd
]

Comment 15 John Reiser 2006-09-29 02:37:50 UTC
Setting CONFIG_KPROBES=n and CONFIG_UTRACE=y in a build of
kernel-2.6.18-1.2699.fc6 [note 2699; otherwise same build sequence and machine
as 2693 in Comment #14] gives a kernel that reproduces the bug.  The testcase is
killed with SIGTRAP.

So the combination of Comment #13, Comment #14, and this comment says to me that
some aspect of UTRACE is the culprit.


Comment 16 John Reiser 2006-09-29 05:13:33 UTC
After fixing the build of Comment #14 by setting CONFIG_PCMCIA=n and
CONFIG_PCCARD=n, then the resulting kernel reproduces the bug: the testcase gets
killed with SIGTRAP. 

So CONFIG_KPROBES=y and CONFIG_UTRACE=n also tickles the bug.  Combined with
Comment #15, this means that either one of CONFIG_KPROBES or CONFIG_UTRACE is
enough to tickle the bug, as long as the patches from SOURCES have been applied
to linux-2.6.18 by the %prep step under the influence of "%define includexen 1".
 Does xen support a process tracing itself via catching SIGTRAP from Trace bit?
 (Who has tried it recently?)

Comment 17 Roland McGrath 2006-10-12 04:37:45 UTC
The i386 failure is a bug in the utrace patch, a simple inverted test.
I reproduced the problem on my vanilla 2.6.18+utrace kernel.
The x86_64 code from the utrace patch does not have the same problem, and the
bug does not reproduce on the x86_64 vanilla 2.6.18+utrace kernel, for either
i386 or x86_64 binaries.  In the otherwise identical-looking code on i386 and
x86_64, the flag being tested has the inverted sense, which is how the i386 bug
came about by sloppy copying on my part.

However, linux-2.6-x86_64-tif-restore-sigmask.patch changes the code to invert
the sense of that same flag, so it now matches the i386 code's sense but not
that expected by the x86_64 utrace patch code.

I am updating the utrace patch's changes in this code so that both the i386 bug
will be fixed and the patch will mesh happily with tif-restore-sigmask.patch.


Comment 18 Roland McGrath 2006-10-12 04:39:33 UTC
Created attachment 138301 [details]
test case source, 32 or 64

This is the self-contained single source file I have actually tested with, for
either -m32 or -m64 compilation.

Comment 19 Roland McGrath 2006-10-12 04:45:49 UTC
Created attachment 138302 [details]
fix patch

Comment 20 Dave Jones 2006-10-12 18:10:55 UTC
added to the end of linux-2.6-utrace.patch, should be in tomorrow's rawhide.


Comment 21 Dave Jones 2006-11-12 05:47:46 UTC
should be fixed in 2.6.18-1.2849.fc6 now in updates

