Bug 54724 - gcc 2.96 generates bad exception handling code
Summary: gcc 2.96 generates bad exception handling code
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: gcc
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2001-10-17 00:04 UTC by Need Real Name
Modified: 2009-12-16 01:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2001-11-05 00:49:28 UTC
Embargoed:


Attachments
tar ball of the test case (1.35 KB, application/octet-stream)
2001-10-17 00:06 UTC, Need Real Name
cond_test executable, libad.so, core (39.91 KB, application/octet-stream)
2001-10-19 20:36 UTC, Need Real Name


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2002:055 0 contract SHIPPED_LIVE Updated version of GCC 2.96-RH now available 2002-04-02 05:00:00 UTC
Red Hat Product Errata RHBA-2002:200 0 high SHIPPED_LIVE Updated version of GCC 2.96-RH now available 2002-09-12 04:00:00 UTC

Description Need Real Name 2001-10-17 00:04:44 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
The test case aborts because an exception is not caught in main(). This
is the backtrace from the core file:

macaroni:/nfs/devco/edin/eh > gdb ./cond_test core 
GNU gdb 5.0rh-5 Red Hat Linux 7.1
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
Core was generated by `./cond_test'.
Program terminated with signal 6, Aborted.
Reading symbols from ./libad.so...done.
Loaded symbols for ./libad.so
Reading symbols from /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3...done.
Loaded symbols for /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3
Reading symbols from /lib/i686/libm.so.6...done.
Loaded symbols for /lib/i686/libm.so.6
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x400c3801 in __kill () from /lib/i686/libc.so.6
(gdb) bt
#0  0x400c3801 in __kill () from /lib/i686/libc.so.6
#1  0x400c35da in raise (sig=6) at ../sysdeps/posix/raise.c:27
#2  0x400c4d82 in abort () at ../sysdeps/generic/abort.c:88
#3  0x4003ef2b in __default_terminate () at ./libgcc2.c:3034
#4  0x4004218e in terminate () at ./cp/exception.cc:47
#5  0x40019490 in so::testCancellation (this=0x8049ef8) at so.h:40
#6  0x4003f841 in find_exception_handler (pc=0x4001943a, table=0x4001aafc,
    eh_info=0x8049ef8, rethrow=1, cleanup=0xbffff4dc) at ./libgcc2.c:3168
#7  0x4003fb12 in throw_helper (eh=0x4005cbc0, pc=0x4001947a,
    my_udata=0xbffff6c0, offset_p=0xbffff6bc) at ./libgcc2.c:3168
#8  0x4003ff4f in __rethrow (index=0x4001aaa8) at ./libgcc2.c:3168
#9  0x4001947b in so::testCancellation (this=0xbffff7c0) at so.h:40
#10 0x4001914c in cond::wait (this=0xbffff7c0) at cond.cpp:12
#11 0x0804893f in main ()
#12 0x400b2177 in __libc_start_main (main=0x8048900 <main>, argc=1,
    ubp_av=0xbffff85c, init=0x80486b4 <_init>, fini=0x8048bf0 <_fini>,
    rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffff84c)
    at ../sysdeps/generic/libc-start.c:129

libad.so is part of the test case, which is not big at all, but it had
to be split across several files for the problem to reproduce. If
the same code is in one file, the problem goes away. This test case is one
of several scenarios in which the problem occurs. In other cases the
library is static, or the code structure is different. However, in all
cases the problem is the same: an exception that is thrown from the
library code is not caught in main().
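
The attached tarball is not reproduced in this report, so the following single-file C++ sketch only illustrates the shape suggested by the backtrace: a member function catches and rethrows, and the caller is expected to catch the result. The class and function names so, cond, testCancellation and wait come from the backtrace; the Cancellation type, the function bodies, the inheritance relationship and run_scenario are assumptions. Because everything is in one translation unit, this shows the expected behavior only and would not trigger the failure.

```cpp
#include <string>

// Hypothetical exception type; the real test case's type is not shown here.
struct Cancellation {};

// Named after the so::testCancellation frame (so.h:40); the body is assumed.
struct so {
    void testCancellation() {
        try {
            throw Cancellation();
        } catch (...) {
            throw;  // rethrow, matching the __rethrow frame in the backtrace
        }
    }
};

// Named after the cond::wait frame (cond.cpp:12). Deriving from so is an
// assumption based on both frames showing the same this pointer.
struct cond : so {
    void wait() { testCancellation(); }
};

// Stand-in for main(): returns the text the test case is expected to print.
std::string run_scenario() {
    try {
        cond c;
        c.wait();
    } catch (const Cancellation&) {
        return "Cancellation caught";
    }
    return "no exception";
}
```

On a correctly working toolchain, run_scenario() returns the string that the report lists under Expected Results.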

The same code works fine with gcc 2.95.2 and 3.0.1 on the same machine.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. gunzip eh_bug.tar.gz
2. tar -xvf eh_bug.tar
3. run make
4. run ./cond_test
	

Actual Results:  Abort (core dumped)

Expected Results:  The correct output:
Cancellation caught

Additional info:

Platform:
$> uname -a
Linux macaroni 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown

$> g++ -v
Reading specs from /package/1/compilers/gcc-2.96-81/bin/../lib/gcc-lib/i686-pc-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)

$> rpm -qa | grep glibc
glibc-2.2.2-10
glibc-devel-2.2.2-10
glibc-profile-2.2.2-10
glibc-common-2.2.2-10
compat-glibc-6.2-2.1.3.2

$> cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 10
model name	: Pentium III (Cascades)
stepping	: 1
cpu MHz	: 699.331
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug	        : no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 mmx fxsr sse
bogomips	: 1395.91

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 10
model name	: Pentium III (Cascades)
stepping	: 1
cpu MHz         : 699.331
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug	        : no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 mmx fxsr sse
bogomips	: 1395.91

Comment 1 Need Real Name 2001-10-17 00:06:20 UTC
Created attachment 34242 [details]
tar ball of the test case

Comment 2 Jakub Jelinek 2001-10-17 10:19:10 UTC
Cannot reproduce (tried gcc-c++ 2.96-99, 2.96-81 and 2.96-79), the results
are the same as for gcc3 3.0.1-3 - it prints Cancellation caught and exits
with status 0. Have tried even -O2 instead of what is in makefile, no effect.

Comment 3 Need Real Name 2001-10-17 18:46:28 UTC
That's perplexing. I need to find out why it fails for me and not for you.

I'd appreciate it if you could email me your test executable, cond_test, and the
library libad.so, so I can try them. I'd also appreciate it if you could compile the
test case with -v and send me the output.

Thanks!

Comment 4 Need Real Name 2001-10-19 20:36:24 UTC
Created attachment 34441 [details]
cond_test executable, libad.so, core

Comment 5 Need Real Name 2001-10-19 20:46:11 UTC
I tried the test case on a box with a "fresh", out-of-the-box installation of
Red Hat Linux 7.1. It still fails. Could you please try it (untar with
"tar -xzvf eh_exec.tar")?




Comment 6 Martin Sebor 2001-10-20 00:04:18 UTC
Here's another (slightly simplified) testcase. The program segfaults in a call to find_exception_handler().

Changes to the testcase such as removing the empty X::~X(), removing A::~A(), inlining B::~B(), or removing the dummy local from A::foobar() all allow the program to behave as expected. Optimizing either of the translation units linked into the shared library (but not the executable), or simply coalescing them into one, also eliminates the core dump.

Removing the dummy local variable from A::~A() reproduces the abort described by the original reporter above.

Regards
Martin


$ cat a.cpp
extern "C" int printf (const char*, ...);

struct X { ~X () { /* matters */ } };

struct A
{
    void (*f)();

    ~A () { X x; /* matters */ }

    void foobar ();

    void bar ();
};

inline void A::foobar ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);

    X x;   // matters

    f ();
}

inline void A::bar()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);

    foobar ();   // matters
}

struct B: A
{
    ~B ();   // outlining matters

    void foo ();   // outlining matters
};


#ifdef BAR

// unused dummy function, call to A::bar() matters
void bar (A *a) { a->bar (); }

#elif defined (MAIN)

void foo ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);

    throw 0;
}

int main ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);

    try {
        B b;
        b.f = foo;
        b.foo ();
    }
    catch (int) {
        printf ("exception caught\n");
    }
    catch (...) {
        printf ("unexpected exception\n");
    }

    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);
}

#else

B::~B () { }

void B::foo ()
{
    printf ("%s:%d: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__);

    this->bar ();
}

#endif

$ g++ -v && g++ -DBAR -g -fPIC -c a.cpp && g++ -g -fPIC -shared -o libfoo.so a.cpp a.o && g++ -DMAIN -L. -g -o a.out a.cpp -lfoo && ./a.out || gdb -q a.out core
Reading specs from /package/1/compilers/gcc-2.96-81/bin/../lib/gcc-lib/i686-pc-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
a.cpp:56: int main ()
a.cpp:79: void B::foo ()
a.cpp:27: void A::bar ()
a.cpp:18: void A::foobar ()
a.cpp:49: void foo ()
Segmentation fault (core dumped)
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
Error while mapping shared library sections:
libfoo.so: Success.
Reading symbols from libfoo.so...done.
Loaded symbols for libfoo.so
Reading symbols from /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3...done.
Loaded symbols for /package/1/compilers/gcc-2.96-81/lib/libstdc++-libc6.2-2.so.3
Reading symbols from /lib/i686/libm.so.6...done.
Loaded symbols for /lib/i686/libm.so.6
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x00010004 in ?? ()
(gdb) bt
#0  0x00010004 in ?? ()
#1  0x4003f841 in find_exception_handler (pc=0x40018c58, table=0x40019ec4, 
    eh_info=0x8049c40, rethrow=1, cleanup=0xbffff58c) at ./libgcc2.c:3168
#2  0x4003fb12 in throw_helper (eh=0x4005cbc0, pc=0x40018c9a, 
    my_udata=0xbffff770, offset_p=0xbffff76c) at ./libgcc2.c:3168
#3  0x4003ff4f in __rethrow (index=0x40019eb0) at ./libgcc2.c:3168
#4  0x40018c9b in ?? ()
#5  0x40018bcb in ?? ()
#6  0x40018aab in ?? ()
#7  0x08048872 in main () at a.cpp:61
#8  0x400b2177 in __libc_start_main (main=0x8048840 <main>, argc=1, 
    ubp_av=0xbffff90c, init=0x80485c8 <_init>, fini=0x80489a0 <_fini>, 
    rtld_fini=0x4000e184 <_dl_fini>, stack_end=0xbffff8fc)
    at ../sysdeps/generic/libc-start.c:129


Comment 7 Need Real Name 2001-10-23 00:34:42 UTC
It turns out that binutils 2.11 makes all the difference. We installed it along
with gcc 2.96-99 and found that it fixes the problem. I then tried it with
gcc 2.96-81, and the test case did not fail: it printed "Cancellation caught". I
conclude that your testing did not use a "plain vanilla" installation
of Red Hat 7.1, which I believe contains binutils-2.10.91.0.2-3. That would
explain why you could not reproduce the problem. Could you please confirm that:

1. binutils 2.11 is supported with gcc 2.96-81
2. You used binutils 2.11 in testing
3. using binutils-2.10.91.0.2-3 causes the test case to abort

Thank you!



Comment 8 Need Real Name 2001-10-26 02:06:25 UTC
I have to amend my last entry:

The binutils version that fixes the problem is binutils-2.11.92.0.5. This is
important to note, since the version that ships with Red Hat 7.2,
binutils-2.11.90.0.8-9, does not fix the problem. In light of this new detail,
could you please confirm that:

1. binutils-2.11.92.0.5 can be safely used with gcc 2.96-81
2. using binutils-2.10.91.0.2-3 causes the test case to abort
3. using binutils-2.11.90.0.8-9 causes the test case to abort

What was the version of binutils that you used to test this problem?

Thank you!


Comment 9 Preston Brown 2001-10-30 20:37:20 UTC
We are examining this issue.

Comment 10 Jakub Jelinek 2001-10-31 10:11:32 UTC
After some debugging this seems to be the famous section relative relocs
against excluded .gnu.linkonce.* sections problem, see e.g.
http://sources.redhat.com/ml/binutils/2001-06/msg00413.html
and following thread for details. binutils-2.11.92.0.5 zeros those relocs
while binutils-2.11.90.0.8-9 resolved them relative to the winner .gnu.linkonce
section. The right solution is to have .gnu.linkonce.eh.* sections etc.,
but this requires coordination between gcc and ld and cannot happen for 7.2.
Unfortunately, this exact change in binutils-2.11.92.0.5 causes lots of
instability in that release, which is hopefully better in current binutils.

I'll see if something could be done in gcc 2.96-RH unwinder so that it would
cope with this to fix it for 7.x and will code a solution for future gcc/ld.

Comment 11 Jakub Jelinek 2001-10-31 16:28:23 UTC
The following patch should do the trick:
2001-10-31  Jakub Jelinek  <jakub>

        * frame.c (fde_merge): Choose just one from FDEs for the
        same function in erratic array.

--- gcc/frame.c.jj      Thu Jun  8 15:45:46 2000
+++ gcc/frame.c Wed Oct 31 17:38:09 2001
@@ -197,7 +197,7 @@ static void
 fde_merge (fde_vector *v1, const fde_vector *v2)
 {
   size_t i1, i2;
-  fde * fde2;
+  fde * fde2 = NULL;

   i2 = v2->count;
   if (i2 > 0)
@@ -205,6 +205,17 @@ fde_merge (fde_vector *v1, const fde_vec
       i1 = v1->count;
       do {
         i2--;
+        if (fde2 != NULL && fde_compare (v2->array[i2], fde2) == 0)
+         {
+           /* Some linkers (e.g. 2.10.91.0.2 or 2.11.92.0.8) resolve
+              section relative relocations against removed linkonce
+              section to corresponding location in the output linkonce
+              section. Always use the earliest fde in that case.  */
+           fde2 = v2->array[i2];
+           v1->array[i1+i2+1] = fde2;
+           v1->array[i1+i2] = fde2;
+           continue;
+         }
         fde2 = v2->array[i2];
         while (i1 > 0 && fde_compare (v1->array[i1-1], fde2) > 0)
           {

I think it is better to handle this in gcc, which means only libstdc++
needs to be upgraded, as opposed to requiring programs and libraries to be
relinked with a fixed binutils.
This patch will make it into gcc-2.96-100.
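
The logic of the patch can be sketched outside of gcc. The following is a minimal, self-contained C++ illustration and not the libgcc code: Fde, its origin field and merge_fdes are hypothetical stand-ins, and the real fde type and fde_vector differ. Only the backward in-place merge and the duplicate handling are meant to mirror the diff above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an FDE: just enough fields to show the idea.
struct Fde {
    unsigned long pc_begin;  // start address of the function the FDE covers
    int origin;              // which input contributed it (illustrative only)
};

// Three-way comparison on the function start address.
static int fde_compare(const Fde* a, const Fde* b) {
    if (a->pc_begin < b->pc_begin) return -1;
    if (a->pc_begin > b->pc_begin) return 1;
    return 0;
}

// Merge sorted v2 into sorted v1, filling the result from the back.
// When two neighbouring entries of v2 describe the same function, both
// result slots are made to point at the earlier of the two.
std::vector<const Fde*> merge_fdes(std::vector<const Fde*> v1,
                                   const std::vector<const Fde*>& v2) {
    std::size_t i1 = v1.size();
    std::size_t i2 = v2.size();
    v1.resize(i1 + i2);          // room for the merged result
    const Fde* fde2 = nullptr;   // last entry taken from v2, if any

    while (i2 > 0) {
        --i2;
        if (fde2 != nullptr && fde_compare(v2[i2], fde2) == 0) {
            // Same function as the entry placed in the previous step:
            // overwrite that slot and the one before it with the earlier FDE.
            fde2 = v2[i2];
            v1[i1 + i2 + 1] = fde2;
            v1[i1 + i2] = fde2;
            continue;
        }
        fde2 = v2[i2];
        // Shift larger v1 entries towards the end to make room for fde2.
        while (i1 > 0 && fde_compare(v1[i1 - 1], fde2) > 0) {
            v1[i1 + i2] = v1[i1 - 1];
            --i1;
        }
        v1[i1 + i2] = fde2;
    }
    return v1;
}
```

For example, with v1 holding FDEs for 0x100 and 0x300, and v2 holding two FDEs for 0x200 followed by one for 0x400, both 0x200 slots in the result point at the first of the two, so a lookup reaches the same FDE whichever slot it lands on.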

Comment 12 Need Real Name 2001-11-05 00:49:24 UTC
I'm having a similar problem, and unfortunately this patch didn't fix it. I
posted a report to the gcc list (gcc.gnu.org) under ticket C++/4678. Summary:
a large DLL, loaded by Java, does not catch thrown exceptions, leading to a core
dump instead. The application is large, so I'll refrain from sending the code
unless asked. Small test cases did *not* reproduce the problem. Compiling the
same application using mingw under Windows, or with an older Red Hat egcs
(egcs-2.91.66), works. My system is Red Hat 7.1 with all up2date patches as of
today (11/04/2001), although I have now rebuilt gcc from the src rpm to include
the above patch. Here is the original report, with stack trace:


This bug is only reproducible when building a large (~60k lines of code) DLL for
a Java/JNI application. The application (and exception handling) works fine under
Windows using mingw.

                   An exception of the following form:

                   Exception e("Instantiated outside of try block");

                   try
                   {
                   throw e;
                   }
                   catch(Exception &e)
                   {
                   cerr<< "Caught Exception: " << e.getMessage() << "\n";
                   }
                   catch(...)
                   {
                   cerr<< "Caught Unknown Exception\n" ;
                   }


                   works. The following snippet:


                   try
                   {
                   throw Exception("Instantiated within try block");
                   }
                   catch(Exception &e)
                   {
                   cerr<< "Caught Exception: " << e.getMessage() << "\n";
                   }
                   catch(...)
                   {
                   cerr<< "Caught Unknown Exception\n" ;
                   }


                   causes a core dump.

                   Here is the stack trace. Note that __rethrow is being called:
                   the exception was never caught, even though there was a
                   catch(...) block.


                   #0 0x40374ae1 in __kill () from /lib/i686/libc.so.6
                   #1 0x4003276b in raise (sig=6) at signals.c:65
                   #2 0x40376062 in abort () at ../sysdeps/generic/abort.c:88
                   #3 0x404eae55 in __default_terminate () from
/usr/lib/libstdc++-libc6.1-1.so.2
                   #4 0x404eae72 in __terminate () from
/usr/lib/libstdc++-libc6.1-1.so.2
                   #5 0x4a9c7f01 in __rethrow (index=0x4a4c15dc) at
../../gcc/libgcc2.c:3168
                   #6 0x4a4b8f94 in
Java_com_leastsquares_decision_gambit_GambitWrapper_setDimsNative
(env=0x804e094,
                   obj=0xbfffd46c, hashCode=7474923, nPlayerOneDims=15, 
                   nPlayerTwoDims=10) at GambitWrapperInterface.cc:55
                   #7 0x08062055 in ?? () at eval.c:41
                   #8 0x0805f685 in ?? () at eval.c:41
                   #9 0x0805f685 in ?? () at eval.c:41
                   #10 0x0805f685 in ?? () at eval.c:41
                   #11 0x40338d7e in StubRoutines::_code1 () at eval.c:41
                   #12 0x40130604 in JavaCalls::call_helper () at eval.c:41
                   #13 0x4018e48d in os::os_exception_wrapper () at eval.c:41
                   #14 0x40130840 in JavaCalls::call () at eval.c:41
                   #15 0x40135c1a in jni_invoke () at eval.c:41
                   #16 0x40140bb7 in jni_CallStaticVoidMethod () at eval.c:41
                   #17 0x08049344 in main () at eval.c:41
                   #18 0x40362627 in __libc_start_main (main=0x8048c90 <main>,
argc=2, 
                   ubp_av=0xbffff7b4, init=0x8048974 <_init>, fini=0x804aaec
<_fini>, 
                   rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffff7ac)
                   at ../sysdeps/generic/libc-start.c:129
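
The two fragments quoted in this comment can be written as a self-contained C++ sketch for comparison. The Exception class here is a hypothetical stand-in (the reporter's class is not shown), and the functions return the message instead of writing to cerr so that the result can be checked. On a conforming toolchain both patterns are expected to reach the catch(Exception &) handler.

```cpp
#include <string>

// Hypothetical stand-in for the reporter's Exception class.
class Exception {
public:
    explicit Exception(const std::string& msg) : msg_(msg) {}
    const std::string& getMessage() const { return msg_; }
private:
    std::string msg_;
};

// Pattern 1: the exception object is constructed before the try block.
std::string throw_existing_object() {
    Exception e("Instantiated outside of try block");
    try {
        throw e;
    } catch (Exception& caught) {
        return "Caught Exception: " + caught.getMessage();
    } catch (...) {
        return "Caught Unknown Exception";
    }
}

// Pattern 2: a temporary is constructed in the throw expression itself.
std::string throw_temporary() {
    try {
        throw Exception("Instantiated within try block");
    } catch (Exception& caught) {
        return "Caught Exception: " + caught.getMessage();
    } catch (...) {
        return "Caught Unknown Exception";
    }
}
```

Building both functions into the affected shared library and calling them from the JNI entry point would be one way to narrow down which pattern fails, assuming the failure depends on the library build and not on the snippet itself.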


Comment 13 Jakub Jelinek 2001-11-28 13:02:22 UTC
The original bug should be fixed in gcc-2.96-100 and above.
As for grendel's report, it is a different issue, and without a testcase
nothing can be done about it. My guess is that you're throwing an exception
from code compiled by one G++ version and catching it in code compiled by a
different (major) G++ version (here I mean egcs and gcc-2.96-RH). If the
binary-only Java libs were compiled with, say, egcs (highly likely), then this
will never work.

Comment 14 Bill Nottingham 2002-07-26 21:47:31 UTC
An erratum has been issued that should address the problem described in this bug report.
This report is therefore being closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, please follow the link below. You may reopen
this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2002-055.html


Comment 15 karsten 2002-09-26 18:28:24 UTC
still buggy???

The following code ***does*** catch the thrown exception but then seg faults...
Am I misunderstanding something??

#include <iostream>
using namespace std;

int main(){
  try {
    throw 1;
  } catch(int e){
    cerr << "caught: " << e << endl;
  }
  cerr << "done!!!" << endl;
}
Output:

caught: 1
Segmentation fault
(gdb) where
#0  0x00000000 in ?? ()
#1  0x40269348 in __user_type_info::dyncast (this=0x40277df0, boff=0,
target=@0x40277dd8, objptr=0x40277aa0, subtype=@0x40277d68, subptr=0x40277aa0)
   from /usr/lib/libstdc++-libc6.2-2.so.3
#2  0x4026b4e3 in __dynamic_cast_2 (from=0x4026bac0 <__builtin_type_info
type_info function>, to=0x4026b980 <__pointer_type_info type_info function>, boff=0,
    address=0x40277aa0, sub=0x402ccaac <type_info type_info function>,
subptr=0x40277aa0) from /usr/lib/libstdc++-libc6.2-2.so.3
#3  0x4026b2a3 in __is_pointer (p=0x40277aa0) from /usr/lib/libstdc++-libc6.2-2.so.3
#4  0x4026a796 in __cp_pop_exception (p=0x804a070) from
/usr/lib/libstdc++-libc6.2-2.so.3
#5  0x0804894c in __eh_alloc ()
#6  0x42017589 in __libc_start_main () from /lib/i686/libc.so.6

I applied the latest patches to 2.96 today.

Comment 16 karsten 2002-09-26 19:20:13 UTC
Sorry... I take back the above "bug" report.  The bug only arises when linking
with the mysql libsqlplus library.

Comment 17 Fedora Update System 2009-12-15 13:53:14 UTC
util-linux-ng-2.16.2-5.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/util-linux-ng-2.16.2-5.fc12

Comment 18 Fedora Update System 2009-12-16 01:10:59 UTC
util-linux-ng-2.16.2-5.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

