Bug 505347

Summary: eu-readelf super slow compared to readelf for large debuginfo dump
Product: [Fedora] Fedora Reporter: Frank Ch. Eigler <fche>
Component: elfutilsAssignee: Roland McGrath <roland>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: rawhideCC: mjw, pmuller, roland
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 0.142-1.fc10 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-08-20 20:54:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 516995    

Description Frank Ch. Eigler 2009-06-11 15:07:04 UTC
Take a vmlinux image (with debuginfo in it, make it a few hundred
megabytes), run "eu-readelf -w" vs. "readelf -w" on it.  In my
tests, the former runs about a hundred times slower (!).

Version-Release number of selected component (if applicable):
0.140

Comment 1 Roland McGrath 2009-06-18 00:12:26 UTC
Profiling welcome.

Comment 2 Mark Wielaard 2009-06-28 14:19:31 UTC
This is not just with vmlinuz. I have seen it also with user space executables/libraries. For example eu-readelf is almost 30 times slower against libjvm.so from java-1.6.0-openjdk-debuginfo-1.6.0.0-22.b16.fc11.i586

Against binutils-2.19.51.0.2-17.fc11.i586:
$ readelf -w /usr/lib/debug/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client/libjvm.so.debug > /dev/null

real	1m11.262s
user	1m5.913s
sys	0m0.832s

Against elfutils-0.141-1.fc11.i586:
$ eu-readelf -w /usr/lib/debug/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre/lib/i386/client/libjvm.so.debug > /dev/null
real	29m54.918s
user	29m35.359s
sys	0m3.720s

Top samples from oprofile report for that last run:

CPU: Core Solo / Duo, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        image name               app name                 symbol name
259359   14.0077  libdw-0.141.so           eu-readelf               dwfl_module_getsym
190665   10.2976  vmlinux                  vmlinux                  read_hpet
151954    8.2069  libelf-0.141.so          eu-readelf               gelf_getsymshndx
151840    8.2007  libdw-0.141.so           eu-readelf               dwfl_module_addrsym
116833    6.3100  libelf-0.141.so          eu-readelf               gelf_getshdr
106727    5.7642  vmlinux                  vmlinux                  acpi_os_read_port
96369     5.2048  oprofiled                oprofiled                /usr/bin/oprofiled
79784     4.3091  vmlinux                  vmlinux                  acpi_idle_do_entry
77526     4.1871  vmlinux                  vmlinux                  acpi_idle_enter_bm
59341     3.2050  vmlinux                  vmlinux                  hpet_next_event
55339     2.9888  libelf-0.141.so          eu-readelf               elf_getscn
37600     2.0307  vmlinux                  vmlinux                  __ticket_spin_lock
33805     1.8258  vmlinux                  vmlinux                  __ticket_spin_unlock
31339     1.6926  vmlinux                  vmlinux                  rb_get_reader_page
28327     1.5299  oprofile                 oprofile                 /oprofile
27269     1.4728  libdw-0.141.so           eu-readelf               .plt
26971     1.4567  libelf-0.141.so          eu-readelf               __i686.get_pc_thunk.bx
19175     1.0356  vmlinux                  vmlinux                  acpi_os_write_port

Comment 3 Mark Wielaard 2009-06-28 14:53:26 UTC
oprofile callgraph output for top 5 eu-readelf symbols in above report:
$ opreport --callgraph --symbols --sort sample --include-symbols dwfl_module_getsym,gelf_getsymshndx,dwfl_module_addrsym,gelf_getshdr,elf_getscn

CPU: Core Solo / Duo, speed 1000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Unhalted clock cycles) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        image name         app name     symbol name
-------------------------------------------------------------
  9620      1.6423  eu-readelf         eu-readelf   format_dwarf_addr
  576150   98.3577  libdw-0.141.so     eu-readelf   dwfl_module_addrsym
259359   35.2713  libdw-0.141.so     eu-readelf   dwfl_module_getsym
  259359   44.1981  libdw-0.141.so     eu-readelf   dwfl_module_getsym [self]
  145994   24.8793  libelf-0.141.so    eu-readelf   gelf_getsymshndx
  106954   18.2263  libelf-0.141.so    eu-readelf   gelf_getshdr
  47584     8.1089  libelf-0.141.so    eu-readelf   elf_getscn
  26919     4.5873  libelf-0.141.so    eu-readelf   __i686.get_pc_thunk.bx
-------------------------------------------------------------
  5707      3.7620  libdw-0.141.so     eu-readelf   dwfl_module_addrsym
  145994   96.2380  libdw-0.141.so     eu-readelf   dwfl_module_getsym
151954   20.6649  libelf-0.141.so    eu-readelf   gelf_getsymshndx
  151954   100.000  libelf-0.141.so    eu-readelf   gelf_getsymshndx [self]
-------------------------------------------------------------
  3        3.8e-04  eu-readelf         eu-readelf   print_ops
  3        3.8e-04  eu-readelf         eu-readelf   print_debug_loc_section
  15        0.0019  eu-readelf         eu-readelf   attr_callback
  32        0.0041  eu-readelf         eu-readelf   print_debug_line_section
  786601   99.9933  eu-readelf         eu-readelf   format_dwarf_addr
151840   20.6494  libdw-0.141.so     eu-readelf   dwfl_module_addrsym
  576150   73.1213  libdw-0.141.so     eu-readelf   dwfl_module_getsym
  151840   19.2706  libdw-0.141.so     eu-readelf   dwfl_module_addrsym [self]
  27193     3.4512  libdw-0.141.so     eu-readelf   .plt
  9663      1.2264  libelf-0.141.so    eu-readelf   gelf_getshdr
  9604      1.2189  libdw-0.141.so     eu-readelf   __i686.get_pc_thunk.bx
  7663      0.9725  libelf-0.141.so    eu-readelf   elf_getscn
  5707      0.7243  libelf-0.141.so    eu-readelf   gelf_getsymshndx
  98        0.0124  libelf-0.141.so    eu-readelf   elf_nextscn
  10        0.0013  libelf-0.141.so    eu-readelf   __i686.get_pc_thunk.bx
  7        8.9e-04  libdw-0.141.so     eu-readelf   dwfl_module_getsymtab
  2        2.5e-04  libdw-0.141.so     eu-readelf   find_symtab
-------------------------------------------------------------
  3         0.0026  eu-readelf         eu-readelf   format_dwarf_addr
  9663      8.2859  libdw-0.141.so     eu-readelf   dwfl_module_addrsym
  106954   91.7115  libdw-0.141.so     eu-readelf   dwfl_module_getsym
116833   15.8886  libelf-0.141.so    eu-readelf   gelf_getshdr
  116833   100.000  libelf-0.141.so    eu-readelf   gelf_getshdr [self]
-------------------------------------------------------------
  7663     13.8704  libdw-0.141.so     eu-readelf   dwfl_module_addrsym
  47584    86.1296  libdw-0.141.so     eu-readelf   dwfl_module_getsym
55339     7.5258  libelf-0.141.so    eu-readelf   elf_getscn
  55339    100.000  libelf-0.141.so    eu-readelf   elf_getscn [self]
-------------------------------------------------------------

Comment 4 Roland McGrath 2009-06-28 21:53:48 UTC
Try git elfutils with readelf -N.  You were paying for an extra feature that binutils readelf does not have, printing addresses from the DWARF in symbolic form.  With the new -N option, elfutils readelf is faster than binutils readelf in my test (--debug-dump=info on a random big ol' vmlinux binary).

At some point we can look into optimizing dwfl_module_addrsym better.  It basically does an exhaustive search every time as it is.

Comment 5 Mark Wielaard 2009-06-29 05:59:22 UTC
(In reply to comment #4)
> Try git elfutils with readelf -N.  You were paying for an extra feature that
> binutils readelf does not have, printing addresses from the DWARF in symbolic
> form.  With the new -N option, elfutils readelf is faster than binutils readelf
> in my test (--debug-dump=info on a random big ol' vmlinux binary).

In the test above eu-readelf -N -w is also faster than binutils readelf -w on the libjvm.so now:

real	0m55.888s
user	0m46.044s
sys	0m0.651s

> At some point we can look into optimizing dwfl_module_addrsym better.  It
> basically does an exhaustive search every time as it is.  

Having a symbol info cache for this case (and I assume addr2line) would be nice.

Comment 6 Roland McGrath 2009-07-23 19:44:15 UTC
I am going to call this bug covered by the addition of -N.

Comment 7 Fedora Update System 2009-07-29 03:01:13 UTC
elfutils-0.142-1.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/elfutils-0.142-1.fc11

Comment 8 Fedora Update System 2009-07-29 03:01:27 UTC
elfutils-0.142-1.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/elfutils-0.142-1.fc10

Comment 9 Fedora Update System 2009-07-29 21:28:50 UTC
elfutils-0.142-1.fc10 has been pushed to the Fedora 10 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update elfutils'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F10/FEDORA-2009-8078

Comment 10 Fedora Update System 2009-07-29 21:29:41 UTC
elfutils-0.142-1.fc11 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update elfutils'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-8083

Comment 11 Fedora Update System 2009-08-20 20:53:49 UTC
elfutils-0.142-1.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2009-08-20 20:59:06 UTC
elfutils-0.142-1.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.