Bug 223181 - major (7x) linking performance regression caused by gcc update 4.1.1-51.fc6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: binutils
Version: 6
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Jakub Jelinek
Reported: 2007-01-18 07:01 UTC by David Baron
Modified: 2007-11-30 22:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-09-22 16:37:56 UTC
Type: ---
Embargoed:


Description David Baron 2007-01-18 07:01:06 UTC
Description of problem:  The gcc update pushed to FC6 updates on January 10
caused a major performance regression in linking the files compiled by this gcc.
This makes the write-compile-test cycle for Mozilla development quite painful.

Version-Release number of selected component (if applicable):
Jan 10 14:27:20 Updated: libgcc.i386 4.1.1-51.fc6
Jan 10 14:28:21 Updated: libstdc++.i386 4.1.1-51.fc6
Jan 10 14:28:28 Updated: libgcj.i386 4.1.1-51.fc6
Jan 10 14:28:29 Updated: libgomp.i386 4.1.1-51.fc6
Jan 10 14:28:30 Updated: libgfortran.i386 4.1.1-51.fc6
Jan 10 14:28:51 Updated: libgcj-devel.i386 4.1.1-51.fc6
Jan 10 14:29:01 Updated: libstdc++-devel.i386 4.1.1-51.fc6
Jan 10 14:29:02 Updated: libmudflap.i386 4.1.1-51.fc6
Jan 10 14:29:04 Updated: cpp.i386 4.1.1-51.fc6
Jan 10 14:29:08 Updated: gcc.i386 4.1.1-51.fc6
Jan 10 14:29:10 Updated: libobjc.i386 4.1.1-51.fc6
Jan 10 14:29:33 Updated: gcc-gfortran.i386 4.1.1-51.fc6
Jan 10 14:31:03 Updated: gcc-debuginfo.i386 4.1.1-51.fc6
Jan 10 14:31:05 Updated: gcc-c++.i386 4.1.1-51.fc6
Jan 10 14:31:15 Updated: gcc-java.i386 4.1.1-51.fc6
Jan 10 14:32:24 Updated: libgnat.i386 4.1.1-51.fc6

How reproducible:  Always

Steps to Reproduce:  I reproduced this by building Mozilla trunk with
--enable-optimize="-O2 -fno-omit-frame-pointer" and --enable-debug.  I'm
looking at the time it takes to link libgklayout.so, which is built in
mozilla/layout/build/ and consists of the code compiled in the mozilla/dom/,
mozilla/content/, and mozilla/layout/ subdirectories of the Mozilla tree.

Expected results:

Before the gcc update above, i.e., with all of the above packages at version
4.1.1-30, compiling dom, content, and layout produced .o files and .a files that
led to libgklayout linking on my laptop in:
real    0m22.756s
user    0m10.965s
sys     0m1.543s

Actual results:  After upgrading the packages to 4.1.1-51.fc6, linking
libgklayout.so takes:
real    2m49.734s
user    2m25.087s
sys     0m4.782s

(Both of these timings are the second link in a row, to get "warm" times.)

Additional info:  I did test that the performance regression is related to the
compiler used to compile the .o/.a files, not the compiler used for the final
link command.
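
As a rough check of the "7x" figure in the summary, the two "real" timings quoted above can be converted to seconds and divided. This is only a minimal sketch; the helper name parse_time is made up for illustration and is not part of any tool mentioned in this report.

```python
import re


def parse_time(value: str) -> float:
    """Convert a bash `time` value such as '2m49.734s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times copied from the report above.
before = parse_time("0m22.756s")  # objects compiled with gcc 4.1.1-30
after = parse_time("2m49.734s")   # objects compiled with gcc 4.1.1-51.fc6

slowdown = after / before
print(f"real-time slowdown: {slowdown:.1f}x")
```

The same helper can be applied to the "user" lines, where the gap is larger because the extra time is spent in ld itself rather than in I/O.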

Comment 1 David Baron 2007-01-18 07:08:00 UTC
The most noticeable difference between the objdump -Cd output of the before and
after .o files is where inline functions end up: before (fast) they were output
into sections like ".gnu.linkonce.t._Z12VERIFY_COORDi", and now (slow) they are
output into sections like ".text._Z12VERIFY_COORDi".
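
To see how many sections of each kind an object file carries, the section names can be bucketed by prefix. The sketch below is an illustration only: classify_section and summarize are hypothetical helpers, and the name list is typed in by hand rather than read from a real object file. Note that a ".text." prefix alone does not prove COMDAT group membership, since -ffunction-sections produces the same naming.

```python
from collections import Counter
from typing import Iterable


def classify_section(name: str) -> str:
    """Bucket an ELF section name by the naming scheme it follows."""
    if name.startswith(".gnu.linkonce."):
        return "linkonce"
    if name.startswith(".text."):
        return "per-function .text"
    return "other"


def summarize(names: Iterable[str]) -> Counter:
    """Count how many section names fall into each bucket."""
    return Counter(classify_section(name) for name in names)


# Hand-written sample; the two mangled names come from the comment above.
sample = [
    ".text",
    ".data",
    ".gnu.linkonce.t._Z12VERIFY_COORDi",
    ".text._Z12VERIFY_COORDi",
]
print(summarize(sample))
```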

Comment 2 Jakub Jelinek 2007-01-18 08:48:35 UTC
The change on the GCC side was an intentional bugfix.  GCC configury wasn't
able to parse the enhanced FC6+ binutils version numbers (containing -%{release}
at the end) and therefore assumed the linker doesn't support COMDAT groups.
See e.g. #215317 for an example of a bug that was fixed by this.

Now, unlike in FC5, ld shouldn't be horribly slow with this; see
binutils-2.17.50.0.6-kept-section.patch.
Some slowdown is certainly to be expected.  I guess if you prepare a tarball
with all the objects you are linking together and the exact ld command line,
some oprofiling of ld could reveal one or two spots that can still be sped up.
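
The version-parsing failure described at the top of this comment can be illustrated with a small sketch. This is not the actual GCC configure logic (which is shell, not Python); parse_strict and parse_tolerant are hypothetical names, and the suffixed string is only assumed to resemble what an FC6 linker reports.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, ...]


def parse_strict(version: str) -> Optional[Version]:
    """Accept only dotted numbers; anything else is treated as unknown."""
    if re.fullmatch(r"\d+(?:\.\d+)*", version) is None:
        return None
    return tuple(int(part) for part in version.split("."))


def parse_tolerant(version: str) -> Optional[Version]:
    """Parse the leading dotted number and ignore a trailing -release suffix."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


plain = "2.17.50.0.6"
suffixed = "2.17.50.0.6-3.fc6"  # assumed form of a version with -%{release}

print(parse_strict(plain))       # parsed normally
print(parse_strict(suffixed))    # None: a strict check gives up here
print(parse_tolerant(suffixed))  # same tuple as the plain version
```

A feature test that falls back to "unsupported" whenever the version is unknown would behave as described above: the linker supports COMDAT groups, but the check never gets far enough to find out.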

Comment 3 David Baron 2007-01-19 05:47:23 UTC
I have a 67MB tar.bz2 file with an ld command that links (i686 arch).  I've
posted it at http://dbaron.org/tmp/rh-bug-223181.tar.bz2 .  Please let me know
once you've downloaded it so I can remove it from my Web space.

(It's not quite pure, since I've downgraded gcc, and I forgot about one of the
directories involved, mozilla/view/, so I didn't save it and thus a single one
of the .a files with only three .o files is actually built using the old
compiler.  However, it nevertheless shows the performance problems.)

Comment 4 Jakub Jelinek 2007-01-19 12:51:33 UTC
Please try rawhide binutils.  It seems
binutils-2.17.50.0.6-kept-section.patch
doesn't measurably help on this testcase (but the time is spent mostly in
_bfd_elf_check_kept_section and functions it calls).
rawhide binutils contains a different fix for this:
http://sources.redhat.com/ml/binutils/2006-11/msg00190.html
but even backporting that patch alone didn't make any visible difference.
With 2.17.50.0.6 binutils ld takes around 120 sec on my box, while 2.17.50.0.9
binutils takes just 7 sec.

Comment 5 H.J. Lu 2007-01-19 17:56:04 UTC
It has been fixed in binutils in CVS:

[hjl@gnu rh-bug-223181]$ time make LD=/usr/bin/ld
/usr/bin/ld --eh-frame-hdr ...
...

real    2m20.714s
user    2m14.343s
sys     0m5.264s

[hjl@gnu rh-bug-223181]$ time make
./ld --eh-frame-hdr ...
...

real    0m16.837s
user    0m13.281s
sys     0m1.749s


Comment 6 H.J. Lu 2007-01-19 18:32:24 UTC
In fact, this bug is fixed in Linux binutils 2.17.50.0.6 from kernel.org:

[hjl@gnu rh-bug-223181]$ LD_LIBRARY_PATH=~/usr/lib time make LD=~/usr/bin/ld
/export/home/hjl/usr/bin/ld --eh-frame-hdr ...
...
13.75user 1.74system 0:17.06elapsed 90%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+77781minor)pagefaults 0swaps

But you have to run patches/README to apply patches to correct this before
building binutils.

Comment 7 David Baron 2007-01-19 18:42:46 UTC
Yeah, it's also fixed for me in binutils from development.  It would be good to
get the necessary patches into updates; I've heard complaints on IRC from some
other people as well.

Comment 8 H.J. Lu 2007-01-19 18:47:49 UTC
There are 2 choices going forward:

1. I run patches/README before creating the Linux binutils tarball.
2. Red Hat adds "patches/README" at the end of %setup in binutils.spec, which
will fix this bug with a rebuild.

Comment 9 Jakub Jelinek 2007-01-19 20:03:42 UTC
binutils-2.17.50.0.6-3.fc6 in FC6 testing updates should fix this, please check
it out.  It contains my version of the fix rather than hjl's, which I wasn't
aware of until I had written the patch.

Comment 10 Jakub Jelinek 2007-01-22 22:28:26 UTC
It took longer than I hoped, but binutils-2.17.50.0.6-3.fc6 has finally been
pushed today:
https://www.redhat.com/archives/fedora-test-list/2007-January/msg00259.html

Comment 11 David Baron 2007-01-24 07:56:49 UTC
The binutils in updates-testing are a big improvement over the released
binutils, but they're not as good as the ones in devel.  I'm seeing:

real    0m46.389s
user    0m27.146s
sys     0m4.655s

with binutils-2.17.50.0.6-3.fc6 (updates-testing)

whereas with binutils-2.17.50.0.9-1 (development) I see:

real    0m31.537s
user    0m14.255s
sys     0m2.377s

(In both cases, those are the times from the second run, so that the files are in memory.)

Comment 12 Jakub Jelinek 2007-09-22 16:37:56 UTC
The code in F7 binutils is more invasive, certainly not appropriate for FC6.

