Bug 183178 - huge slowdown in ld speed over fc4
Product: Fedora
Classification: Fedora
Component: binutils
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Jakub Jelinek
Reported: 2006-02-27 00:21 EST by Vladimir Vukicevic
Modified: 2007-11-30 17:11 EST (History)
Last Closed: 2007-07-03 03:15:12 EDT
Description Vladimir Vukicevic 2006-02-27 00:21:02 EST
binutils as present in FC5t3 takes over 3 minutes to link mozilla's
gklayout.so (roughly 80MB with debug info) -- the ld in fc4's binutils took
just around 10-15 seconds.  Pulling binutils CVS from today (20060226) and
building a new rpm with the same patches as included in the rpm
brings link time back down to 10-15 seconds.  (The same problem is seen by
someone else under SuSE, and is also fixed by binutils from CVS.)
Comment 1 Alexandre Oliva 2006-02-28 23:41:42 EST
How did you compare the link timings?  Are you sure this difference is not just
the result of caching?  These are the link times I got, after one very long link
pass that presumably brought all of the working set into main memory:

GNU ld version 20041220
7.77user 1.44system 0:09.70elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+61258minor)pagefaults 0swaps

GNU ld version 20060212
6.50user 1.31system 0:08.08elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+61305minor)pagefaults 0swaps

It could be that the newer linker uses more memory, although this is not
reflected in the virtual memory use that top shows for these processes, which
both top out at just under 230MB.  This presumably does not take the sizes of the input
object files and libraries into account, though, so perhaps there was some
change in the new linker that would get it into disk thrashing or something. 
That's quite unlikely, looking at the upstream changes since the snapshot for
our release was taken.  Can you please confirm that what you're seeing is not
just a side effect of not having the files in memory, and that the numbers are
repeatable after an initial link to bring everything into memory?

FWIW, the numbers above are for an AMD64 box; I took whatever I had handiest
since the bug report didn't specify an architecture.
Comment 2 Vladimir Vukicevic 2006-03-01 00:36:37 EST
The timing was done with a script that re-ran the link step 10 times in a row,
printing the timing result for each run; it was on an x86(32) box (Pentium M
2.13 GHz, CPU speed locked at 2.13 GHz).  I'll repeat tomorrow and post the
actual raw numbers.
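
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a repeat-and-time harness along the lines described above. It is not the script used in this report; the function names `time_runs` and `warm_mean` are hypothetical, and the command to time is whatever link command line your build emits. Dropping the first pass is one simple way to keep cold-cache effects out of the comparison.

```python
import subprocess
import sys
import time


def time_runs(cmd, passes=10):
    """Run cmd `passes` times and return the wall-clock seconds of each run."""
    timings = []
    for i in range(passes):
        start = time.perf_counter()
        # check=True aborts the loop if the link step itself fails
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        print(f"Pass {i}... real {elapsed:.3f}s")
        timings.append(elapsed)
    return timings


def warm_mean(timings, discard=1):
    """Mean of the timings after dropping the first `discard` (cold-cache) runs."""
    warm = timings[discard:]
    return sum(warm) / len(warm)


if __name__ == "__main__":
    # Placeholder: pass the real link command as arguments; with no
    # arguments a trivial no-op command is timed instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    results = time_runs(cmd, passes=5)
    print(f"warm mean: {warm_mean(results):.3f}s")
```

This measures wall-clock time only; the user/sys split reported by the shell's `time` would need something like `os.times()` or `resource.getrusage` in addition.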
Comment 3 Vladimir Vukicevic 2006-03-01 01:08:44 EST
Well, or now, instead of tomorrow... changed the script to just 5 passes.  It
does one link (with just the direct link command), and originally started
oprofile after the first pass and stopped it after the second.  oprofile was
disabled for these times (and this script was run after the newer linker was
run, so the pass 0 times already have everything loaded into memory):

GNU ld version 20060212
Pass 0...
real    2m23.917s
user    2m17.093s
sys     0m6.116s
Starting profiling
Pass 1...
real    2m24.045s
user    2m17.625s
sys     0m6.416s
Pass 2...
real    2m23.823s
user    2m17.486s
sys     0m6.560s
Pass 3...
real    2m24.080s
user    2m16.957s
sys     0m6.372s
Pass 4...
real    2m24.285s
user    2m17.053s
sys     0m6.412s
Pass 5...
real    2m24.706s
user    2m17.249s
sys     0m6.396s

oprofile says:

samples  %        image name               app name                 symbol name
12332    33.0847  libc-2.3.90.so           libc-2.3.90.so           mempcpy
7777     20.8644  libc-2.3.90.so           libc-2.3.90.so           msort_with_tmp
4580     12.2874  libc-2.3.90.so           libc-2.3.90.so           memcpy
4045     10.8521  libbfd-    libbfd-   
2435      6.5327  no-vmlinux               no-vmlinux               (no symbols)
1321      3.5440  libbfd-    libbfd-   
1005      2.6962  libbfd-    libbfd-    bfd_getl32
545       1.4621  libbfd-    libbfd-   
348       0.9336  ld                       ld                      

With CVS version (as I reinstalled the RPM I noticed that I had rebuilt binutils
with --target i686; not sure if that would make a difference in this case) --
again, oprofile start/stops were commented out, and in this case Pass 0 is a
cold start:

GNU ld version 2.16.91 20060227
Pass 0...
real    0m34.330s
user    0m9.701s
sys     0m1.456s
Starting profiling
Pass 1...
real    0m14.063s
user    0m10.241s
sys     0m1.612s
Pass 2...
real    0m9.987s
user    0m8.873s
sys     0m1.084s
Pass 3...
real    0m10.037s
user    0m8.773s
sys     0m1.232s
Pass 4...
real    0m10.011s
user    0m8.697s
sys     0m1.284s
Pass 5...
real    0m10.008s
user    0m8.861s
sys     0m1.124s

Memory usage of both topped out at around 270MB; the machine has 2GB ram, so
swap was never being touched.
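
As a rough check on the size of the slowdown, the "real" times above can be averaged and compared. The sketch below copies the numbers from the two runs; leaving passes 0 and 1 of the CVS run out of the average is an assumption made here (pass 0 was a cold start and pass 1 was still noticeably slower than the rest), not something stated in the report.

```python
def to_seconds(stamp):
    """Convert a bash `time` stamp such as '2m23.917s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def mean_seconds(stamps):
    """Average a list of `time` stamps, in seconds."""
    values = [to_seconds(s) for s in stamps]
    return sum(values) / len(values)


# "real" times for GNU ld version 20060212, passes 0-5
OLD_LD = ["2m23.917s", "2m24.045s", "2m23.823s",
          "2m24.080s", "2m24.285s", "2m24.706s"]

# "real" times for GNU ld version 2.16.91 20060227, passes 2-5 only
# (assumption: passes 0 and 1 are excluded as not yet warm)
NEW_LD = ["0m9.987s", "0m10.037s", "0m10.011s", "0m10.008s"]

ratio = mean_seconds(OLD_LD) / mean_seconds(NEW_LD)
print(f"20060212 ld takes about {ratio:.1f}x as long as the CVS ld")
```

On these numbers the packaged linker takes somewhere between 14 and 15 times as long as the CVS build for the same link step.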
Comment 4 Alexandre Oliva 2006-03-01 04:08:45 EST
I'm unable to duplicate this on my old PIII notebook either.  Here are the
timings I get:

GNU ld version 20041220
17.07user 2.80system 0:24.39elapsed 81%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+30942minor)pagefaults 0swaps

GNU ld version 20060212
15.07user 2.90system 0:21.25elapsed 84%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+31108minor)pagefaults 0swaps

It's running yesterday's rawhide.  Could it possibly be any other component of
the distro that's causing the slow down you're observing, that would be fixed by
updating to today's rawhide?  Or is it all running on the same box, which makes
it less likely (although not impossible) to be the fault of another component?

--target=i686 *might* make a difference, although I wouldn't expect such a huge one.
Comment 5 Vladimir Vukicevic 2006-03-02 14:23:17 EST
I rebuilt the rpm with i386 as the target; the numbers are basically the same as
the i686 build.  It's the same machine, and all I'm doing is swapping binutils;
latest rawhide (updated yesterday, then re-tested), except I'm still running
2.6.15-1.1977 instead of 1996 since I haven't wanted to do the nvidia dance
again yet.

Just to make sure.. your test is doing the same mozilla layout/build/ link step?
Comment 6 Alexandre Oliva 2006-03-03 14:04:32 EST
What I'm doing is running different copies of ld, using the command line gcc
uses, to (re)create gklayout.so within the same build tree.  Is this what
you are doing too?
Comment 7 Vladimir Vukicevic 2006-03-03 16:09:08 EST
Yep, same thing.. I've got someone else doing a fresh FC5t3 install here (mine
was an upgrade over FC4); I'll see if he has the same problem, since I'm not
sure what else it could be.
Comment 8 Vladimir Vukicevic 2006-03-04 04:40:05 EST
Just did a fresh FC5t3 install on an Athlon 64 X2 4400 machine with 2GB ram,
then did a yum update to all the latest bits.  gklayout link took almost 2 minutes..

This is a link from mozilla CVS, with --enable-default-toolkit=cairo-gtk2 . 
That may be relevant, because I know we do some things that we shouldn't (that
will be fixed shortly); in particular, a number of the cairo symbols are
multiply defined in both our own .a files that are part of the link and in the
system libcairo.so.
Comment 9 Jakub Jelinek 2006-03-13 08:29:15 EST
Can you post the exact ld command line used and pack up all the libraries the
linker brings in?  (You can use e.g. ld -M to see which libraries and objects
have been used, create a file list from it and then tar -h them all.)
The result, even bzip2ed, will probably be larger than what bugzilla allows
for attachments, so you'd need to put it somewhere on the web or ftp
and we would grab it from there.
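
One way to script the packing step described above is sketched below. It assumes the map produced by ld -M has been saved to a file and that each input file appears on a line of the form "LOAD <path>"; that is how GNU ld link maps usually list inputs, but verify it against your own map. The helper names `inputs_from_map` and `pack_inputs` are hypothetical. Passing `dereference=True` to `tarfile.open` stores the files that symlinks point to, which is what tar -h does.

```python
import os
import sys
import tarfile


def inputs_from_map(map_text):
    """Collect the paths named on 'LOAD <path>' lines of a linker map."""
    paths = []
    for line in map_text.splitlines():
        if line.startswith("LOAD "):
            path = line[len("LOAD "):].strip()
            if path not in paths:  # keep first occurrence, preserve order
                paths.append(path)
    return paths


def pack_inputs(paths, archive_name):
    """Write the files that exist to a bzip2 tarball, following symlinks."""
    with tarfile.open(archive_name, "w:bz2", dereference=True) as tar:
        for path in paths:
            if os.path.exists(path):
                tar.add(path)
            else:
                print(f"skipping missing input: {path}", file=sys.stderr)


if __name__ == "__main__":
    # Usage (hypothetical file names): python pack_inputs.py link.map inputs.tar.bz2
    with open(sys.argv[1]) as f:
        pack_inputs(inputs_from_map(f.read()), sys.argv[2])
```

Relative paths in the map are resolved against the current directory, so the script should be run from the directory where the link was performed.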
Comment 10 Fabrice Bellet 2006-04-18 09:21:36 EDT
I have an example where ld from binutils- on FC5 is _much_
slower than the CVS version (approx. 45 minutes vs. 8 seconds):

The ld command is in the runme script. I hope I provided all the needed libs. I
rebuilt a binutils package (based on the CVS snapshot from Apr 8) that works fine for
me on FC5, and I get a decent execution time with this one:

Comment 11 David Baron 2006-04-27 18:01:35 EDT
FWIW, I'm not sure what version of binutils has the performance-improving
patches in http://sourceware.org/ml/binutils/2005-02/msg00375.html ; these may
be relevant.
Comment 12 Christian Nolte 2006-09-06 11:55:15 EDT
I have the same problem with binutils- on FC5. After I posted a request
on the binutils mailing list
(http://sources.redhat.com/ml/binutils/2006-05/msg00504.html), I got a
response today pointing me to this bug report:


The problem seems to be related to the dwarf-2 debugging symbols and some
inefficient handling by ld. A patch has been proposed here:

Comment 13 David Baron 2006-10-28 19:18:09 EDT
This is better for me on FC6.
Comment 14 Matthew Miller 2007-04-06 13:39:43 EDT
Fedora Core 5 and Fedora Core 6 are, as we're sure you've noticed, no longer
test releases. We're cleaning up the bug database and making sure important bug
reports filed against these test releases don't get lost. It would be helpful if
you could test this issue with a released version of Fedora or with the latest
development / test release. Thanks for your help and for your patience.

[This is a bulk message for all open FC5/FC6 test release bugs. I'm adding
myself to the CC list for each bug, so I'll see any comments you make after this
and do my best to make sure every issue gets proper attention.]
Comment 15 Jakub Jelinek 2007-07-03 03:15:12 EDT
FC5 is no longer supported and this problem is fixed in FC6 updates and F7.