From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
The test case aborts due to an exception not being caught in main(). This is the bt from core:

macaroni:/nfs/devco/edin/eh > gdb ./cond_test core
GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
Core was generated by `./cond_test'.
Program terminated with signal 6, Aborted.
Reading symbols from ./libad.so...done.
Loaded symbols for ./libad.so
Reading symbols from /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3...done.
Loaded symbols for /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3
Reading symbols from /lib/i686/libm.so.6...done.
Loaded symbols for /lib/i686/libm.so.6
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0x400c3801 in __kill () from /lib/i686/libc.so.6
(gdb) bt
#0 0x400c3801 in __kill () from /lib/i686/libc.so.6
#1 0x400c35da in raise (sig=6) at ../sysdeps/posix/raise.c:27
#2 0x400c4d82 in abort () at ../sysdeps/generic/abort.c:88
#3 0x4003ef2b in __default_terminate () at ./libgcc2.c:3034
#4 0x4004218e in terminate () at ./cp/exception.cc:47
#5 0x40019490 in so::testCancellation (this=0x8049ef8) at so.h:40
#6 0x4003f841 in find_exception_handler (pc=0x4001943a, table=0x4001aafc, eh_info=0x8049ef8, rethrow=1, cleanup=0xbffff4dc) at ./libgcc2.c:3168
#7 0x4003fb12 in throw_helper (eh=0x4005cbc0, pc=0x4001947a, my_udata=0xbffff6c0, offset_p=0xbffff6bc) at ./libgcc2.c:3168
#8 0x4003ff4f in __rethrow (index=0x4001aaa8) at ./libgcc2.c:3168
#9 0x4001947b in so::testCancellation (this=0xbffff7c0) at so.h:40
#10 0x4001914c in cond::wait (this=0xbffff7c0) at cond.cpp:12
#11 0x0804893f in main ()
#12 0x400b2177 in __libc_start_main (main=0x8048900 <main>, argc=1, ubp_av=0xbffff85c, init=0x80486b4 <_init>, fini=0x8048bf0 <_fini>, rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffff84c) at ../sysdeps/generic/libc-start.c:129

libad.so is part of the test case, which is not that big at all, but had to be split up into several files in order for the problem to persist. If the same code is in one file, the problem goes away.

This test case is one of several scenarios in which the problem occurs. In other cases the library is static, or the code structure is different. However, in all the cases the problem is the same: an exception that is thrown from the library code is not caught in main().

The same code works fine with gcc 2.95.2 and 3.0.1 on the same machine.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. gunzip eh_bug.tar.gz
2. tar -xvf eh_bug.tar
3. run make
4. run ./cond_test

Actual Results:
Abort (core dumped)

Expected Results:
The correct output: Cancellation caught

Additional info:

Platform:

$> uname -a
Linux macaroni 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown

$> g++ -v
Reading specs from /package/1/compilers/gcc-2.96-81/bin/../lib/gcc-lib/i686-pc-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)

$> rpm -qa | grep glibc
glibc-2.2.2-10
glibc-devel-2.2.2-10
glibc-profile-2.2.2-10
glibc-common-2.2.2-10
compat-glibc-6.2-2.1.3.2

$> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 10
model name : Pentium III (Cascades)
stepping : 1
cpu MHz : 699.331
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1395.91

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 10
model name : Pentium III (Cascades)
stepping : 1
cpu MHz : 699.331
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1395.91
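The attached test case is not reproduced in this report, so the following is only a guess at the control flow the backtrace implies, written as a single file. The names so, testCancellation, cond and wait come from the backtrace; the Cancellation type, the class bodies, the inheritance relationship and the run_wait() helper are all assumptions made for illustration. As noted above, a single-file version like this one is not expected to trigger the failure.

```cpp
#include <string>

// Hypothetical exception type; the real test case's type is not shown.
struct Cancellation {};

// Class and method names are taken from the backtrace; bodies are assumed.
struct so {
    bool cancelled;
    explicit so(bool c) : cancelled(c) {}

    void testCancellation() {
        try {
            if (cancelled) throw Cancellation();
        } catch (...) {
            // Rethrow to the caller, as the __rethrow frame suggests.
            throw;
        }
    }
};

// Assumed to derive from so, since both frames show the same `this`.
struct cond : so {
    explicit cond(bool c) : so(c) {}
    void wait() { testCancellation(); }
};

// Stand-in for main(): returns the text the real program would print.
std::string run_wait(bool cancelled) {
    cond c(cancelled);
    try {
        c.wait();
    } catch (const Cancellation&) {
        return "Cancellation caught";
    }
    return "not cancelled";
}
```

With a working exception runtime, run_wait(true) yields the expected "Cancellation caught" string; the failure in this report corresponds to the rethrow never reaching the handler in the caller.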
Created attachment 34242 [details] tar ball of the test case
Cannot reproduce (tried gcc-c++ 2.96-99, 2.96-81 and 2.96-79); the results are the same as for gcc3 3.0.1-3: it prints "Cancellation caught" and exits with status 0. I even tried -O2 instead of the flags in the makefile, with no effect.
That's perplexing. I need to find out why it fails for me and not for you. I'd appreciate it if you could email me your test executable, cond_test, and the library libad.so, so I can try them. I'd also appreciate it if you could compile the test case with -v and send me the output. Thanks!
Created attachment 34441 [details] cond_test executable, libad.so, core
I tried the test case on a box with a "fresh", out-of-the-box installation of Red Hat Linux 7.1. It still fails. Could you please try it (untar with "tar -xvfz eh_exec.tar")?
Here's another (slightly simplified) testcase. The program segfaults in a call to find_exception_handler(). Changes to the testcase such as removing the empty X::~X(), removing A::~A(), inlining B::~B(), or removing the dummy local from A::foobar() all allow the program to behave as expected. Optimizing either of the translation units linked to the shared library (but not the executable), or simply coalescing them into one also eliminates the core dump. Removing the dummy local variable from A::~A() reproduces the abort described by the original reporter above.

Regards
Martin

$ cat a.cpp
extern "C" int printf (const char*, ...);

struct X {
    ~X () { /* matters */ }
};

struct A {
    void (*f)();
    ~A () { X x; /* matters */ }
    void foobar ();
    void bar ();
};

inline void A::foobar ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
    X x;   // matters
    f ();
}

inline void A::bar()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
    foobar ();   // matters
}

struct B: A {
    ~B ();         // outlining matters
    void foo ();   // outlining matters
};

#ifdef BAR

// unused dummy function, call to A::bar() matters
void bar (A *a) { a->bar (); }

#elif defined (MAIN)

void foo ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
    throw 0;
}

int main ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
    try {
        B b;
        b.f = foo;
        b.foo ();
    }
    catch (int) {
        printf ("exception caught\n");
    }
    catch (...) {
        printf ("unexpected exception\n");
    }
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
}

#else

B::~B () { }

void B::foo ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
    this->bar ();
}

#endif

$ g++ -v && g++ -DBAR -g -fPIC -c a.cpp && g++ -g -fPIC -shared -o libfoo.so a.cpp a.o && g++ -DMAIN -L. -g -o a.out a.cpp -lfoo && ./a.out || gdb -q a.out core
Reading specs from /package/1/compilers/gcc-2.96-81/bin/../lib/gcc-lib/i686-pc-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
a.cpp:56: int main ()
a.cpp:79: void B::foo ()
a.cpp:27: void A::bar ()
a.cpp:18: void A::foobar ()
a.cpp:49: void foo ()
Segmentation fault (core dumped)
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
Error while mapping shared library sections:
libfoo.so: Success.
Reading symbols from libfoo.so...done.
Loaded symbols for libfoo.so
Reading symbols from /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3...done.
Loaded symbols for /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3
Reading symbols from /lib/i686/libm.so.6...done.
Loaded symbols for /lib/i686/libm.so.6
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0 0x00010004 in ?? ()
(gdb) bt
#0 0x00010004 in ?? ()
#1 0x4003f841 in find_exception_handler (pc=0x40018c58, table=0x40019ec4, eh_info=0x8049c40, rethrow=1, cleanup=0xbffff58c) at ./libgcc2.c:3168
#2 0x4003fb12 in throw_helper (eh=0x4005cbc0, pc=0x40018c9a, my_udata=0xbffff770, offset_p=0xbffff76c) at ./libgcc2.c:3168
#3 0x4003ff4f in __rethrow (index=0x40019eb0) at ./libgcc2.c:3168
#4 0x40018c9b in ?? ()
#5 0x40018bcb in ?? ()
#6 0x40018aab in ?? ()
#7 0x08048872 in main () at a.cpp:61
#8 0x400b2177 in __libc_start_main (main=0x8048840 <main>, argc=1, ubp_av=0xbffff90c, init=0x80485c8 <_init>, fini=0x80489a0 <_fini>, rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffff8fc) at ../sysdeps/generic/libc-start.c:129
It turns out that binutils 2.11 makes all the difference. We installed it along with gcc 2.96-99 and found that it fixes the problem. Then I tried it with gcc 2.96-81 and the test case did not fail; it printed "Cancellation caught". I conclude that in your testing you did not use a "plain vanilla" installation of Red Hat 7.1, which I believe contains binutils-2.10.91.0.2-3. That would explain why you could not reproduce the problem. Could you please confirm that:
1. binutils 2.11 is supported with gcc 2.96-81
2. you used binutils 2.11 in testing
3. using binutils-2.10.91.0.2-3 causes the test case to abort
Thank you!
I have to amend my last entry: the binutils version that fixes the problem is binutils-2.11.92.0.5. This is important to note, since the version that ships with Red Hat 7.2, binutils-2.11.90.0.8-9, does not fix the problem. In light of this new detail, could you please confirm that:
1. binutils-2.11.92.0.5 can be safely used with gcc 2.96-81
2. using binutils-2.10.91.0.2-3 causes the test case to abort
3. using binutils-2.11.90.0.8-9 causes the test case to abort
What was the version of binutils that you used to test this problem? Thank you!
We are examining this issue.
After some debugging this seems to be the famous section relative relocs against excluded .gnu.linkonce.* sections problem, see e.g. http://sources.redhat.com/ml/binutils/2001-06/msg00413.html and following thread for details. binutils-2.11.92.0.5 zeros those relocs while binutils-2.11.90.0.8-9 resolved them relative to the winner .gnu.linkonce section. The right solution is to have .gnu.linkonce.eh.* sections etc., but this requires coordination between gcc and ld and cannot happen for 7.2. Unfortunately, this exact change in binutils-2.11.92.0.5 causes lots of instability in that release, which is hopefully better in current binutils. I'll see if something could be done in gcc 2.96-RH unwinder so that it would cope with this to fix it for 7.x and will code a solution for future gcc/ld.
The following patch should do the trick:

2001-10-31  Jakub Jelinek  <jakub>

    * frame.c (fde_merge): Choose just one from FDEs for the
    same function in erratic array.

--- gcc/frame.c.jj      Thu Jun  8 15:45:46 2000
+++ gcc/frame.c Wed Oct 31 17:38:09 2001
@@ -197,7 +197,7 @@ static void
 fde_merge (fde_vector *v1, const fde_vector *v2)
 {
   size_t i1, i2;
-  fde * fde2;
+  fde * fde2 = NULL;
 
   i2 = v2->count;
   if (i2 > 0)
@@ -205,6 +205,17 @@ fde_merge (fde_vector *v1, const fde_vec
       i1 = v1->count;
       do {
         i2--;
+        if (fde2 != NULL && fde_compare (v2->array[i2], fde2) == 0)
+          {
+            /* Some linkers (e.g. 2.10.91.0.2 or 2.11.92.0.8) resolve
+               section relative relocations against removed linkonce
+               section to corresponding location in the output linkonce
+               section.  Always use the earliest fde in that case.  */
+            fde2 = v2->array[i2];
+            v1->array[i1+i2+1] = fde2;
+            v1->array[i1+i2] = fde2;
+            continue;
+          }
         fde2 = v2->array[i2];
         while (i1 > 0 && fde_compare (v1->array[i1-1], fde2) > 0)
           {

I think it is better to handle this in gcc, which means only libstdc++ needs to be upgraded, as opposed to requiring programs and libraries to be relinked with a fixed binutils. This patch will make it into gcc-2.96-100.
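The idea behind the patch can be illustrated outside of libgcc with a small C++ sketch. This is not the code in frame.c: the Fde struct, its fields and merge_fdes() are hypothetical names, and where the patch appears to keep the array size unchanged by storing the chosen FDE in both slots, the sketch simply drops the later duplicates. It assumes both inputs are already sorted by start address.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an FDE: only the function start address matters.
struct Fde {
    unsigned long pc_begin;  // start address of the function described
    int id;                  // illustrative tag to tell duplicates apart
};

static bool starts_before(const Fde& a, const Fde& b) {
    return a.pc_begin < b.pc_begin;
}

static bool same_function(const Fde& a, const Fde& b) {
    return a.pc_begin == b.pc_begin;
}

// Merge a sorted "linear" vector with a sorted "erratic" vector, keeping
// only the earliest entry for each function that appears more than once
// in the erratic vector.
std::vector<Fde> merge_fdes(const std::vector<Fde>& linear,
                            std::vector<Fde> erratic) {
    assert(std::is_sorted(linear.begin(), linear.end(), starts_before));
    assert(std::is_sorted(erratic.begin(), erratic.end(), starts_before));

    // std::unique keeps the first element of every run of equal entries.
    erratic.erase(std::unique(erratic.begin(), erratic.end(), same_function),
                  erratic.end());

    std::vector<Fde> merged;
    merged.reserve(linear.size() + erratic.size());
    std::merge(linear.begin(), linear.end(),
               erratic.begin(), erratic.end(),
               std::back_inserter(merged), starts_before);
    return merged;
}
```

The point of the sketch is only that a lookup by program counter sees a single, consistent entry per function after the merge, instead of several entries that claim the same address range.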
I'm having a similar problem, and unfortunately, this patch didn't fix it. I posted a report to the gcc list (gcc.gnu.org), under ticket C++/4678.

Summary: a large dll, loaded by java, does not catch thrown exceptions, leading to a core dump instead. The application is large, so I'll refrain from sending the code unless asked. Small test cases did *not* reproduce the problem. Compiling the same application using mingw under windows or an older egcs redhat (egcs-2.91.66) works. My system is redhat 7.1 with all up2date patches as of today (11/04/2001) (although I have now rebuilt gcc using the src rpm to include the above patch).

Here is the original report, with stacktrace:

This bug is only reproducible when building a large (~60k lines of code) dll for a Java/JNI application. The application (and exception handling) works fine under windows using mingw.

An exception of the following form:

    Exception e("Instantiated outside of try block");
    try {
        throw e;
    } catch(Exception &e) {
        cerr<< "Caught Exception: " << e.getMessage() << "\n";
    } catch(...) {
        cerr<< "Caught Unknown Exception\n" ;
    }

works. The following snippet:

    try {
        throw Exception("Instantiated within try block");
    } catch(Exception &e) {
        cerr<< "Caught Exception: " << e.getMessage() << "\n";
    } catch(...) {
        cerr<< "Caught Unknown Exception\n" ;
    }

causes a core dump. Here is the stack trace. Note that __rethrow is being called: the exception was never caught, even though there was a catch(...) block.
#0 0x40374ae1 in __kill () from /lib/i686/libc.so.6
#1 0x4003276b in raise (sig=6) at signals.c:65
#2 0x40376062 in abort () at ../sysdeps/generic/abort.c:88
#3 0x404eae55 in __default_terminate () from /usr/lib/libstdc++-libc6.1-1.so.2
#4 0x404eae72 in __terminate () from /usr/lib/libstdc++-libc6.1-1.so.2
#5 0x4a9c7f01 in __rethrow (index=0x4a4c15dc) at ../../gcc/libgcc2.c:3168
#6 0x4a4b8f94 in Java_com_leastsquares_decision_gambit_GambitWrapper_setDimsNative (env=0x804e094, obj=0xbfffd46c, hashCode=7474923, nPlayerOneDims=15, nPlayerTwoDims=10) at GambitWrapperInterface.cc:55
#7 0x08062055 in ?? () at eval.c:41
#8 0x0805f685 in ?? () at eval.c:41
#9 0x0805f685 in ?? () at eval.c:41
#10 0x0805f685 in ?? () at eval.c:41
#11 0x40338d7e in StubRoutines::_code1 () at eval.c:41
#12 0x40130604 in JavaCalls::call_helper () at eval.c:41
#13 0x4018e48d in os::os_exception_wrapper () at eval.c:41
#14 0x40130840 in JavaCalls::call () at eval.c:41
#15 0x40135c1a in jni_invoke () at eval.c:41
#16 0x40140bb7 in jni_CallStaticVoidMethod () at eval.c:41
#17 0x08049344 in main () at eval.c:41
#18 0x40362627 in __libc_start_main (main=0x8048c90 <main>, argc=2, ubp_av=0xbffff7b4, init=0x8048974 <_init>, fini=0x804aaec <_fini>, rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff7ac) at ../sysdeps/generic/libc-start.c:129
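For anyone trying to narrow this down, the two cases from the report can be put in a self-contained form along these lines. The Exception class here is a hypothetical stand-in (the reporter's class is not shown, only its getMessage() method), and the function names are made up; the messages are returned rather than written to cerr so that the two paths are easy to compare.

```cpp
#include <string>

// Hypothetical stand-in for the reporter's Exception class.
class Exception {
public:
    explicit Exception(const std::string& msg) : msg_(msg) {}
    const std::string& getMessage() const { return msg_; }
private:
    std::string msg_;
};

// Case 1: the exception object is constructed before the try block.
std::string throw_preconstructed() {
    Exception e("Instantiated outside of try block");
    try {
        throw e;
    } catch (Exception& caught) {
        return "Caught Exception: " + caught.getMessage();
    } catch (...) {
        return "Caught Unknown Exception";
    }
}

// Case 2: the exception is a temporary constructed in the throw expression.
std::string throw_temporary() {
    try {
        throw Exception("Instantiated within try block");
    } catch (Exception& caught) {
        return "Caught Exception: " + caught.getMessage();
    } catch (...) {
        return "Caught Unknown Exception";
    }
}
```

On a correctly working toolchain both functions return the "Caught Exception: ..." string; according to the report, small test cases like this one did not reproduce the failure, which only appeared inside the large JNI library.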
The original bug should be fixed in gcc-2.96-100 and above. As for grendel's report, it is a different issue, and without a testcase there is nothing that can be done about it. My guess is that you're throwing an exception from code compiled by one G++ version and catching it in code compiled by a different (major) G++ version (here I mean egcs and gcc-2.96-RH). If the binary-only java libs have been compiled with, say, egcs (highly likely), then this really will never work.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2002-055.html
still buggy??? The following code ***does*** catch the thrown exception but then seg faults... Am I misunderstanding something??

#include <iostream>
using namespace std;
main(){
    try {
        throw 1;
    } catch(int e){
        cerr << "caught: " << e << endl;
    }
    cerr << "done!!!" << endl;
}

Output:
caught: 1
Segmentation fault

(gdb) where
#0 0x00000000 in ?? ()
#1 0x40269348 in __user_type_info::dyncast (this=0x40277df0, boff=0, target=@0x40277dd8, objptr=0x40277aa0, subtype=@0x40277d68, subptr=0x40277aa0) from /usr/lib/libstdc++-libc6.2-2.so.3
#2 0x4026b4e3 in __dynamic_cast_2 (from=0x4026bac0 <__builtin_type_info type_info function>, to=0x4026b980 <__pointer_type_info type_info function>, boff=0, address=0x40277aa0, sub=0x402ccaac <type_info type_info function>, subptr=0x40277aa0) from /usr/lib/libstdc++-libc6.2-2.so.3
#3 0x4026b2a3 in __is_pointer (p=0x40277aa0) from /usr/lib/libstdc++-libc6.2-2.so.3
#4 0x4026a796 in __cp_pop_exception (p=0x804a070) from /usr/lib/libstdc++-libc6.2-2.so.3
#5 0x0804894c in __eh_alloc ()
#6 0x42017589 in __libc_start_main () from /lib/i686/libc.so.6

I applied the latest patches to 2.96 today.
Sorry... I take back the above "bug" report. The bug only arises when linking with the mysql libsqlplus library.
util-linux-ng-2.16.2-5.fc12 has been submitted as an update for Fedora 12. http://admin.fedoraproject.org/updates/util-linux-ng-2.16.2-5.fc12
util-linux-ng-2.16.2-5.fc12 has been pushed to the Fedora 12 stable repository. If problems still persist, please make note of it in this bug report.