Description of problem:
openbabel fails to build with gcc-4.3 on ppc64.

Version-Release number of selected component (if applicable):
gcc-4.3.0-0.4

How reproducible:
Always

Steps to Reproduce:
1. koji build --scratch dist-f9-gcc43 cvs://cvs.fedoraproject.org/cvs/pkgs?rpms/openbabel/devel#HEAD

Actual results:
[...]
openbabel_perl.cpp:113918: warning: unused variable 'items'
{standard input}: Assembler messages:
{standard input}:1216558: Error: operand out of range (0x0000000000008000 is not between 0xffffffffffff8000 and 0x0000000000007ffc)
(many repeated similar lines)
Created attachment 290922: ppc64 build log from koji (gzipped)
For the record: builds fine on i386, x86_64 and ppc.
This means a .toc1 section overflow: on ppc64, the .toc1 section for one CU can be at most 64KB in size. The generated source is really huge; even with g++ 4.1 the .toc1 size is over 55KB, so close to the limit. On top of that comes an issue in the inlining heuristic's function size estimation (filed as http://gcc.gnu.org/PR34708), combined with the newly added -finline-small-functions, which is enabled by default at -O2. The inlining estimation causes the size of SWIG_Perl_ErrorType to be estimated as much smaller than it really is, so it is inlined thousands of times, and each inlined copy needs a jump table, which eats one .toc1 entry. In the meantime, the best chance to make this build with the current 4.3 is IMHO to add __attribute__((noinline)) to SWIG_Perl_ErrorType to prevent it from being inlined.
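For illustration, the noinline suggestion might look roughly like the sketch below. This is not the SWIG-generated code: error_type_name, the NOINLINE macro, the numeric codes and the two callers are hypothetical stand-ins, and only __attribute__((noinline)) itself is the GCC feature named in the comment above.

```c
#include <assert.h>
#include <string.h>

/* GCC-specific attribute; expands to nothing on other compilers. */
#if defined(__GNUC__)
#  define NOINLINE __attribute__((noinline))
#else
#  define NOINLINE
#endif

/* Hypothetical stand-in for a generated helper such as
   SWIG_Perl_ErrorType. Codes and names are illustrative assumptions. */
static NOINLINE const char *error_type_name(int code)
{
    switch (code) {
    case -2:  return "IOError";
    case -4:  return "IndexError";
    case -5:  return "TypeError";
    case -9:  return "ValueError";
    case -12: return "MemoryError";
    default:  return "RuntimeError";
    }
}

/* Two call sites: with noinline they both call the single out-of-line
   copy of the switch instead of each getting a private copy. */
int is_type_error(int code)
{
    return strcmp(error_type_name(code), "TypeError") == 0;
}

int is_memory_error(int code)
{
    return strcmp(error_type_name(code), "MemoryError") == 0;
}

/* Small sanity check of the sketch. */
void check_error_type_name(void)
{
    assert(is_type_error(-5));
    assert(is_memory_error(-12));
    assert(!is_type_error(-12));
}
```

The attribute only changes where the switch is emitted, not what it returns, so callers behave the same either way.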
Thanks for the quick response. I did as you suggested, but it still fails, just a little bit further. http://koji.fedoraproject.org/koji/getfile?taskID=332336&name=build.log I'd rather wait until this is fixed upstream. I don't like working around compiler bugs.
In this case it is not really a compiler bug, perhaps not very good inlining decision. With -O3 you will overflow .toc1 even with 4.1. And when reaching target limitations, packages that want to build just need to take some steps to help the build, which can be splitting the huge source into several smaller ones, or aggregating some string literal addresses into arrays, etc. BTW, another alternative, which would make inlining SWIG_Perl_ErrorType actually a win, would be to rewrite SWIG_Perl_ErrorType to reference a static array, indexed by code+13, with all the error codes, and only handle separately the cases where code+13 is outside that static array's bounds. Then no jump table is needed (even 4.1 uses a switch here, it just doesn't inline the function containing it). An optimization which would do this automatically was submitted for GCC some time ago, but it has some nits still to be worked on and as such probably won't make it into 4.3.
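A minimal sketch of that static-array rewrite, assuming the error codes run from -13 to -1 so that code+13 gives an index from 0 to 12. The function name, the table contents and the "RuntimeError" fallback are illustrative assumptions, not copied from SWIG.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical table of error names, indexed by code + 13.
   The code-to-name mapping is an assumption for illustration. */
static const char *const error_names[] = {
    "NullReferenceError", /* code -13 */
    "MemoryError",        /* code -12 */
    "AttributeError",     /* code -11 */
    "SystemError",        /* code -10 */
    "ValueError",         /* code  -9 */
    "SyntaxError",        /* code  -8 */
    "OverflowError",      /* code  -7 */
    "ZeroDivisionError",  /* code  -6 */
    "TypeError",          /* code  -5 */
    "IndexError",         /* code  -4 */
    "RuntimeError",       /* code  -3 */
    "IOError",            /* code  -2 */
    "RuntimeError"        /* code  -1 (unknown error) */
};

/* Table lookup in place of a switch: no jump table is generated. */
static const char *error_type_name(int code)
{
    /* Unsigned arithmetic lets a single comparison reject codes both
       below -13 and above -1 (they wrap to large values). */
    unsigned idx = (unsigned)code + 13u;

    if (idx < sizeof error_names / sizeof error_names[0])
        return error_names[idx];
    return "RuntimeError"; /* out-of-table codes get the default name */
}

/* Small sanity check of the sketch. */
void check_error_table(void)
{
    assert(strcmp(error_type_name(-13), "NullReferenceError") == 0);
    assert(strcmp(error_type_name(-5), "TypeError") == 0);
    assert(strcmp(error_type_name(0), "RuntimeError") == 0);
    assert(strcmp(error_type_name(-14), "RuntimeError") == 0);
}
```

Every inlined copy of such a function refers to the same array, so inlining it no longer multiplies jump tables.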
(In reply to comment #5)
> In this case it is not really a compiler bug, perhaps not very good inlining
> decision.

That's a bug in my book.

> With -O3 you will overflow .toc1 even with 4.1.

Maybe, but we don't use -O3 in Fedora.

Anyway, thanks for the suggestions. I've submitted this to openbabel developers.
I'm looking into this because it's blocking updating kdeedu to the 4.1 snapshots. I tried -fno-inline-small-functions and also -fno-inline-functions, neither seems to help. :-(
BTW it now fails in the Python bindings, not the Perl ones. Jakub, if you think this is a different issue, we can open a different bug.
Build log with -fno-inline-functions -fno-inline-small-functions: http://koji.fedoraproject.org/koji/getfile?taskID=600751&name=build.log
I tried adding -fno-inline too, but that just makes the section overflow even more: http://koji.fedoraproject.org/koji/getfile?taskID=600792&name=build.log
I finally got this to build for F10 (not tried F9 yet) after lots of trial and error, using these 2 hacks: http://cvs.fedoraproject.org/viewcvs/rpms/openbabel/devel/openbabel.spec?r1=1.18&r2=1.32 (note the huge number of revisions between the original and the attempt which finally worked). The SWIG switch "-fastdispatch" makes the code faster and smaller (and with fewer TOC1 entries) at the expense of error message quality when you pass a bad parameter to an overloaded function from Python (not a real issue IMHO and better than not having the binding available at all!). The GCC switch "-mno-sum-in-toc" saves TOC1 entries at the expense of speed. It would probably be better to use the GCC switch only for that one file, but setting it globally at least gets this thing to build and doesn't need makefile hackery. Unfortunately, this isn't a permanent solution: I've seen they've added even more stuff to their Python binding in their SVN repository after beta 4, and I only brought this barely below the TOC1 limit, so I fully expect this to blow up again in the near future. :-(
As predicted, the problem is back with beta 5, which has an even larger Python binding. :-( 353 toc1 entries too many, despite the above tricks. I have no idea how to make those fit.
Looks like this was finally fixed upstream: http://openbabel.svn.sourceforge.net/viewvc/openbabel?view=rev&revision=2535 I'm backporting that fix to the openbabel package. (I guess I should really become an official comaintainer of that package.)
I never did understand why we're limited to a single TOC in each relocatable object file. Each function has its own function descriptor with its own TOC pointer, after all -- I don't see why they all have to point to the same place. Why can't each have their _own_ TOC, if necessary? In fact, if you use -mminimal-toc you get something _similar_ to that, but GCC uses an extra register as a pointer to its '.toc1' and similar sections, and loads that pointer from the 'real' TOC instead of just putting it in the function descriptor. Which seems a bit strange and wastes a register. When I hit TOC size problems with the ppc64 ocaml back end and nobody could answer the above, I just stopped using the 'real' TOC altogether, and pointed each function's descriptor at its own local TOC instead. It seems to work fine, although strictly speaking it breaks the ABI because the TOC pointer in the function descriptor is supposed to point to the TOC section. If we can't do that, can we not at least have the compiler keep track of how big it's making the TOC, and start enabling -mno-fp-in-toc, -mno-sum-in-toc, or -mminimal-toc automatically?
-mminimal-toc is already used, the problem is that this just means that instead of global .toc entries, we have per-compilation-unit .toc1 entries, and it's that .toc1 section which overflowed because the compilation unit was too big. (OpenBabel upstream "fixed" it by splitting the bindings so there are separate compilation units. This was nontrivial because, due to the way SWIG works, this means the parts also have to be separate Python extensions, in a separate Python namespace - you can't split a single generated binding into multiple compilation units. What they did is use the Python "import" statement to bring these back into one namespace.)