Note: this is hard to diagnose; bug 126095 needs to be fixed before the code can be examined. However, as the example illustrates, there's a problem:

cagney@tonic$ gcc -o tonic nothing.c -g
cagney@tonic$ gdb tonic
GNU gdb Red Hat Linux (6.0post-0.20040223.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) run
Starting program: /home/cagney/tmp/tonic

Program received signal SIGINT, Interrupt.
0x200000000019b282 in getpid () from /lib/tls/libc.so.6.1
(gdb) display/i $pc
1: x/i $pc  0x200000000019b282 <getpid+2>: nop.i 0x0
(gdb) disassemble
Dump of assembler code for function getpid:
0x200000000019b280 <getpid+0>:  [MII] mov r15=1041
0x200000000019b281 <getpid+1>:        break.i 0x100000;;
0x200000000019b282 <getpid+2>:        nop.i 0x0
0x200000000019b290 <getpid+16>: [MFB] nop.m 0x0
0x200000000019b291 <getpid+17>:       nop.f 0x0
0x200000000019b292 <getpid+18>:       br.ret.sptk.few b0;;
End of assembler dump.
(gdb) The program is running. Exit anyway? (y or n) y

cagney@tonic$ gcc -g -o sigbpt sigbpt.c
cagney@tonic$ gdb sigbpt
GNU gdb Red Hat Linux (6.0post-0.20040223.21rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) break keeper
Breakpoint 1 at 0x40000000000006a2: file sigbpt.c, line 27.
(gdb) handle SIGSEGV print pass nostop
Signal        Stop  Print  Pass to program  Description
SIGSEGV       No    Yes    Yes              Segmentation fault
(gdb) run
Starting program: /home/cagney/tmp/sigbpt

Program received signal SIGSEGV, Segmentation fault.

Breakpoint 1, keeper (sig=11) at sigbpt.c:27
27      }
(gdb) disassemble
Dump of assembler code for function keeper:
0x40000000000006a0 <keeper+0>:  [MMI] mov r2=r12;;
0x40000000000006a1 <keeper+1>:        st4 [r2]=r32
0x40000000000006a2 <keeper+2>:        mov r12=r2
0x40000000000006b0 <keeper+16>: [MFB] nop.m 0x0
0x40000000000006b1 <keeper+17>:       nop.f 0x0
0x40000000000006b2 <keeper+18>:       br.ret.sptk.many b0;;
End of assembler dump.
(gdb) stepi
0x40000000000006b0      27      }
(gdb)
0x40000000000006b1      27      }
(gdb)
27      }
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb) stepi
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb)
<signal handler called>
(gdb) stepi
<signal handler called>
(gdb)

Program received signal SIGSEGV, Segmentation fault.
<signal handler called>
(gdb)

The code should eventually have stepi'ed back to the faulting instruction. Instead, it appears to have also executed that faulting instruction. What's extra weird is that it didn't end up back in keeper.
Created attachment 101494 [details] Program that throws/catches a sigsegv
Jason, this looks related to bug #126699, but since this one is being reported on a different architecture, I won't close it as a dup (in case the fixes need to be kept separate).
Larry Woodman has asked HP for help with this bug. Therefore opening this bugzilla to the HP-Confidential group as well as RH-Development.

-----Original Message-----
From: Helgaas, Bjorn
Sent: Tuesday, September 14, 2004 9:17 AM
To: Pherigo, Suzanne S
Subject: Fwd: HP RHEL3 update features

I don't know anything about the defects Larry mentions, and I don't have permissions to read the bugzilla entries. Do you have any information?

---------- Forwarded Message ----------
Subject: HP RHEL3 update features
Date: Tuesday 14 September 2004 9:01 am
From: Larry Woodman <lwoodman>
To: Bjorn Helgaas <bjorn.helgaas>
Cc: Peter Martuccelli <peterm>, Tim Burke <tburke>

Hi Bjorn, we are looking at a couple of IA64 features from HP for RHEL3 that, quite frankly, won't get done in the RHEL3-U4 timeframe unless you guys can help us out. Can you take a look at these two bugs and 1.) determine if they are important enough to be implemented by the end of September and 2.) implement them if they really are important.

1.) Bug 126095: GDB can't examine / write to ia64 signal trampoline (aka gate page)
2.) Bug 126913: single-stepping sigreturn appears to execute two insns on ia64

Thanks, Larry
Bjorn Helgaas of HP replies: Did HP put these on some must-fix list or something? I haven't found anyone here who knows about them, so I don't have any information that says they're urgent. Seems like the easiest way to proceed is to try to reproduce it on a current upstream kernel. If you can, more people will be interested in helping to fix it. If you can't, then you've narrowed down the places to look.
An (untested) fix for this problem has just been committed to the RHEL3 U4 patch pool this evening (in kernel version 2.4.21-21.EL).
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-550.html
According to test results shown to me by Jeff (now that it is possible to examine memory), this bug is occurring in RHEL3. Jeff, can you attach the test results and confirm this?
Are there any additional test results, as mentioned in comment #10?
This bug is filed against RHEL 3, which is in maintenance phase. During the maintenance phase, only security errata and select mission critical bug fixes will be released for enterprise products. Since this bug does not meet those criteria, it is now being closed. For more information on the RHEL errata support policy, please visit: http://www.redhat.com/security/updates/errata/ If you feel this bug is indeed mission critical, please contact your support representative. You may be asked to provide detailed information on how this bug is affecting you.