Description of problem:
Attempts to catch SIGTRAP fail. Even with a sigaction for SIGTRAP installed, the SIGTRAP causes process termination. In particular, a process cannot trace itself.

Version-Release number of selected component (if applicable):
kernel-2.6.17-1.2630.fc6

How reproducible:
Always.

Steps to Reproduce:
1. Compile and run the attached program, which traces itself for 10 instructions.

Actual results:
SIGTRAP - process killed

Expected results:
A trace of ten instructions, then normal exit. On Fedora Core 5 under kernel-2.6.17-1.2145_FC5, the correct output is:
5 0x40079b 316 48 83 c0 01 48 83 e9 02
5 0x40079f 302 48 83 e9 02 c3 90 90 90
5 0x4007a3 306 c3 90 90 90 90 90 90 90
5 0x40077c 306 bf ba 08 40 00 e8 d2 fd
5 0x400781 306 e8 d2 fd ff ff b8 00 00
5 0x400558 306 ff 25 a2 06 10 00 68 05
5 0x40055e 306 68 05 00 00 00 e9 90 ff
5 0x400563 306 e9 90 ff ff ff ff 25 9a
5 0x4004f8 306 ff 35 ca 06 10 00 ff 25
5 0x4004fe 306 ff 25 cc 06 10 00 90 90

Additional info:
Runs fine under latest FC5, kernel-2.6.17-1.2145_FC5.
Created attachment 135815 [details] signal.c C-language main program which traces itself
Created attachment 135816 [details] go_asm.S assembly-language subroutine which sets Trace bit
The analogous program for i386 cannot catch SIGTRAP under kernel-2.6.17-1.2611.fc6 running on i686.
-----go_asm.S i386 version
go_asm:
	pushf
	orb $1,1(%esp)
	popf
	nop
	addl $1,%eax
	subl $2,%ecx
	ret
-----
and use REG_EIP instead of REG_RIP in signal.c.
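For readers without the attachments, here is a minimal sketch of the kind of SIGTRAP handler such a testcase relies on. It is not the attached signal.c. The names on_trap, install_trap_handler and count_raised_traps are hypothetical, and the traps are delivered with raise(SIGTRAP) rather than by the Trace bit, so the sketch also builds and runs on non-x86 targets. On x86 the handler reads the saved program counter through REG_RIP or REG_EIP and clears the Trace bit in the saved flags once the limit is reached, which is how a self-tracer would stop single-stepping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes REG_RIP / REG_EIP / REG_EFL only with this */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

/* Pick the saved-PC register index; other targets skip the register access. */
#if defined(__x86_64__) && defined(REG_RIP)
#define PC_REG REG_RIP
#elif defined(__i386__) && defined(REG_EIP)
#define PC_REG REG_EIP
#endif

#define TRACE_FLAG 0x100     /* TF bit in EFLAGS */

static volatile sig_atomic_t trap_count;   /* traps seen so far */
static volatile sig_atomic_t trap_limit;   /* stop tracing after this many */
static volatile unsigned long last_pc;     /* saved PC of the latest trap */

/* Hypothetical handler: count the trap, record the PC, stop at the limit. */
static void on_trap(int signo, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = ctx;
    (void)signo;
    (void)info;
    trap_count++;
#ifdef PC_REG
    last_pc = (unsigned long)uc->uc_mcontext.gregs[PC_REG];
    if (trap_count >= trap_limit)
        /* Clearing TF in the saved flags ends single-stepping on return. */
        uc->uc_mcontext.gregs[REG_EFL] &= ~(greg_t)TRACE_FLAG;
#else
    (void)uc;
#endif
}

/* Install the handler with SA_SIGINFO so it receives the ucontext. */
static int install_trap_handler(int limit)
{
    struct sigaction sa;
    assert(limit > 0);
    trap_limit = limit;
    trap_count = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_trap;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTRAP, &sa, NULL);
}

/* Stand-in for the Trace bit: deliver n SIGTRAPs with raise(). */
static int count_raised_traps(int n)
{
    if (install_trap_handler(n) != 0)
        return -1;
    for (int i = 0; i < n; i++)
        raise(SIGTRAP);
    return (int)trap_count;
}
```

Calling count_raised_traps(10) from main should return 10 when the handler is honored. With the bug described here the process would be killed by SIGTRAP instead, although raise() may not exercise the same kernel path as a Trace-bit trap.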
On Fedora Core 5 for i686, kernel-2.6.17-1.2145_FC5, the correct output is:
5 0x804849f 382 83 c0 01 83 e9 02 c3 90
5 0x80484a2 302 83 e9 02 c3 90 90 55 89
5 0x80484a5 393 c3 90 90 55 89 e5 83 ec
5 0x80485e6 393 c7 04 24 c6 86 04 08 e8
5 0x80485ed 393 e8 92 fd ff ff b8 00 00
5 0x8048384 393 ff 25 cc 97 04 08 68 10
5 0x804838a 393 68 10 00 00 00 e9 c0 ff
5 0x804838f 393 e9 c0 ff ff ff ff 25 d0
5 0x8048354 393 ff 35 bc 97 04 08 ff 25
5 0x804835a 393 ff 25 c0 97 04 08 00 00
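Each line of the expected output appears to be the signal number (5 is SIGTRAP), the program counter, the flags register in hex, and the eight bytes at the program counter. That reading is an assumption based on the values, not on the attached source. Under it, the flags column always has bit 0x100 (the Trace bit) set. The sketch below reproduces that layout; format_trace_line and trace_flag_set are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EFLAGS_TF 0x100UL    /* Trace bit in EFLAGS */

/* Nonzero when the Trace bit is set in a saved flags value. */
static int trace_flag_set(unsigned long eflags)
{
    return (eflags & EFLAGS_TF) != 0;
}

/* Format one trace line: "<signo> <pc> <eflags hex> <8 code bytes>".
 * Returns the number of characters the full line needs. */
static int format_trace_line(char *buf, size_t size, int signo,
                             unsigned long pc, unsigned long eflags,
                             const unsigned char bytes[8])
{
    int n;
    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%d %#lx %lx", signo, pc, eflags);
    for (int i = 0; i < 8 && n >= 0 && (size_t)n < size; i++)
        n += snprintf(buf + n, size - (size_t)n, " %02x", bytes[i]);
    return n;
}
```

For example, signal 5 at 0x40079b with flags 0x316 and the bytes 48 83 c0 01 48 83 e9 02 formats to the first line of the x86_64 output in the description.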
This seems to be a Fedora-only problem; could it be related to xen? The attached BISECT_LOG gives evidence that no 2.6.17-* kernel from kernel.org has the problem. Yesterday's (2006-09-07) latest, changeset 10387e5eb45c6e48d67102b88229f5bc6037461c, is good. But for testing purposes, I told git bisect that it was bad, and that the last known good version was 2.6.16. The sixteen bisections in between were all good.
Created attachment 135868 [details] BISECT_LOG from git bisect of kernel.org 2.6.16 through 2.6.18-rc6 The first kernel 10387e5eb45c6e48d67102b88229f5bc6037461c is in fact good, but was labeled as bad to force a probe of many changesets in 2.6.17-*
Roland, any ideas ?
This problem persists in kernel-2.6.18-1.2693.fc6 for i386 [x86_64 not yet tested.]
Also still fails on x86_64 with kernel-2.6.18-1.2693.fc6.
This 20-month-old article (originally from LKML) looks to be related. It suggests that kprobes interfere: http://www.gatago.com/linux/kernel/15462875.html (2005-01-18) "x86-64: int3 no longer causes SIGTRAP in 2.6.10"
So this definitely isn't reproducible with a vanilla 2.6.18?
sorry, cannot help much w/ broken wrist. if it reproduces, pls try disabling CONFIG_KPROBES, CONFIG_UTRACE in an otherwise rawhide kernel src rpm, and see which of the 4 permutations differ
On i686 [AMD Duron], linux-2.6.18.tar.bz2, built after "make oldconfig" starting from configs/kernel-2.6.18-i686.config of 2693, but with no patches, produces a kernel under which the testcase runs correctly: self-traces 10 instructions, then quits; (does not kill the process with SIGTRAP.) The resulting .config has CONFIG_UTRACE=y and CONFIG_KPROBES=y; but again, no patches from SOURCES were applied to the kernel.org source. So I believe that this shows a vanilla 2.6.18 does not have the bug.
Setting CONFIG_KPROBES=y and CONFIG_UTRACE=n in a build of kernel-2.6.18-1.2693.fc6 gets:
  CC [M]  drivers/net/pcmcia/ibmtr_cs.o
In file included from drivers/net/pcmcia/ibmtr_cs.c:50:
include/linux/ptrace.h: In function ‘ptrace_do_wait’:
include/linux/ptrace.h:227: error: ‘ECHILD’ undeclared (first use in this function)

[Build sequence was:
  rpmbuild -bp --target i686 kernel-2.6.spec
  cd BUILD/kernel-2.6.18/linux-2.6.18.i686
  cp configs/kernel-2.6.18-i686.config .config
  <<edit .config>>
  make oldconfig; make; make modules_install; mkinitrd
]
Setting CONFIG_KPROBES=n and CONFIG_UTRACE=y in a build of kernel-2.6.18-1.2699.fc6 [note 2699; otherwise same build sequence and machine as 2693 in Comment #14] gives a kernel that reproduces the bug. The testcase is killed with SIGTRAP. So the combination of Comment #13, Comment #14, and this comment says to me that some aspect of UTRACE is the culprit.
After fixing the build of Comment #14 by setting CONFIG_PCMCIA=n and CONFIG_PCCARD=n, the resulting kernel reproduces the bug: the testcase gets killed with SIGTRAP. So CONFIG_KPROBES=y and CONFIG_UTRACE=n also tickles the bug. Combined with Comment #15, this means that either one of CONFIG_KPROBES or CONFIG_UTRACE is enough to tickle the bug, as long as the patches from SOURCES have been applied to linux-2.6.18 by the %prep step under the influence of "%define includexen 1". Does xen support a process tracing itself via catching SIGTRAP from the Trace bit? (Who has tried it recently?)
The i386 failure is a bug in the utrace patch, a simple inverted test. I reproduced the problem on my vanilla 2.6.18+utrace kernel. The x86_64 code from the utrace patch does not have the same problem, and the bug does not reproduce on the x86_64 vanilla 2.6.18+utrace kernel, for either i386 or x86_64 binaries. In the otherwise identical-looking code on i386 and x86_64, the flag being tested has the inverted sense, which is how the i386 bug came about by sloppy copying on my part. However, linux-2.6-x86_64-tif-restore-sigmask.patch changes the code to invert the sense of that same flag, so it now matches the i386 code's sense but not that expected by the x86_64 utrace patch code. I am updating the utrace patch's changes in this code so that both the i386 bug will be fixed and the patch will mesh happily with tif-restore-sigmask.patch.
Created attachment 138301 [details] test case source, 32 or 64 This is the self-contained single source file I have actually tested with, for either -m32 or -m64 compilation.
Created attachment 138302 [details] fix patch
added to the end of linux-2.6-utrace.patch, should be in tomorrow's rawhide.
should be fixed in 2.6.18-1.2849.fc6 now in updates