Bug 2120752

Summary: Backport patch to f36 binutils bfd to address perf report performance issue
Product: [Fedora] Fedora
Reporter: William Cohen <wcohen>
Component: binutils
Assignee: Nick Clifton <nickc>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 36
CC: aoliva, dvlasenk, fweimer, jakub, nickc, sipoyare, yahmad
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: binutils-2.37-36.fc36 and binutils-2.38-24.fc37
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-12 17:15:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description William Cohen 2022-08-23 16:37:36 UTC
Description of problem:

While examining the performance of Linux perf generating a report, I found that an excessive share of time (88% of total samples) was being spent in the bfd function lookup_func_by_offset on Fedora 36. Nick Clifton suggested trying rawhide, as its version of binutils carries some efficiency patches. Nick identified "Reduce-O-n2-performance-overhead-when-parsing-DWARF" as the patch that improves performance.


Version-Release number of selected component (if applicable):
binutils-2.37-27.fc36.x86_64

How reproducible:

Every time.


Steps to Reproduce:

dnf download --source kernel-tools
dnf download --source systemtap
sudo dnf builddep ./kernel-tools*.src.rpm -y
sudo dnf builddep ./systemtap*.src.rpm -y
rpm -Uvh ./kernel-tools*.src.rpm
rpm -Uvh ./systemtap*.src.rpm
cd ~/rpmbuild/SPECS
rpmbuild -bp kernel-tools.spec
rpmbuild -bc systemtap.spec
cd ~/rpmbuild/BUILD/kernel*/linux-*/tools/perf
make
mkdir ~/bin
make install
cd ~/rpmbuild/BUILD/systemtap*
make
sudo make install
cd

# warm things up to avoid measuring some systemtap setup done on the first run:
stap --example -p2 sleeptime.stp

#actual data collection
~/bin/perf record -e cpu-clock --output=perf_record.data --call-graph=dwarf -- ~/bin/perf record -e cpu-clock --call-graph=dwarf --output=perf_stap.data stap --example -p2 sleeptime.stp
~/bin/perf record -e cpu-clock --output=perf_report.data --call-graph=dwarf -- ~/bin/perf report --input=perf_stap.data --stdio > junk

# See where perf spends its time while processing the report

~/bin/perf report --input=perf_report.data --dso=perf --stdio > where_perf_report_spends_time.log

# measure how long it takes to generate the report
 /usr/bin/time ~/bin/perf report --input=perf_report.data --dso=perf --stdio  > /dev/null

Actual results:

Fedora 36 is about 4 times slower than the rawhide version (4.29 s vs. 1.25 s elapsed, 3.02 s vs. 0.66 s user time).

f36:
$ /usr/bin/time ~/bin/perf report --input=perf_report.data --dso=perf --stdio  > /dev/null
3.02user 1.23system 0:04.29elapsed 99%CPU (0avgtext+0avgdata 532680maxresident)k
0inputs+0outputs (0major+373420minor)pagefaults 0swaps

rawhide:
$ /usr/bin/time ~/bin/perf report --input=perf_report.data --dso=perf --stdio  > /dev/null
0.66user 0.55system 0:01.25elapsed 97%CPU (0avgtext+0avgdata 351120maxresident)k
0inputs+0outputs (0major+124043minor)pagefaults 0swaps
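
For reference, the slowdown can be worked out from the two /usr/bin/time outputs above. This is only a small sketch: the variable names are made up for illustration, and the numbers are copied by hand from the f36 and rawhide runs.

```shell
# Elapsed and user times (seconds) copied from the /usr/bin/time output above.
# Variable names are illustrative only.
f36_elapsed=4.29;  rawhide_elapsed=1.25
f36_user=3.02;     rawhide_user=0.66

# awk does the floating-point division; the shell itself only handles integers.
elapsed_ratio=$(awk -v a="$f36_elapsed" -v b="$rawhide_elapsed" 'BEGIN { printf "%.1f", a / b }')
user_ratio=$(awk -v a="$f36_user" -v b="$rawhide_user" 'BEGIN { printf "%.1f", a / b }')

echo "f36 vs rawhide: ${elapsed_ratio}x elapsed, ${user_ratio}x user"
```

The elapsed-time ratio comes out lower than the user-time ratio because system time and other fixed costs are less affected by the bfd lookup overhead.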



Expected results:

On Fedora 36, "perf report" should not spend 85%+ of its time in a single function, lookup_func_by_offset. Its run time should be comparable to rawhide's.

Additional info:

May need to rebuild kernel-tools as lookup_func_by_offset is being inlined.

Comment 4 Nick Clifton 2022-09-29 10:29:15 UTC
Now fixed in binutils-2.38-24.fc37 and binutils-2.37-36.fc36
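
To check whether a Fedora 36 machine already has the fixed build, something along these lines may help. It is a rough sketch: version_at_least is a hypothetical helper, and it relies on GNU sort -V, which approximates RPM's version ordering but is not identical to it. For Fedora 37, substitute 2.38-24.fc37 as the fixed version.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# GNU sort -V is used as an approximation of RPM version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# First fixed build for Fedora 36, taken from the comment above.
fixed="2.37-36.fc36"

# Query the installed binutils only when rpm is available and the package is installed.
if command -v rpm >/dev/null 2>&1 &&
   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' binutils 2>/dev/null); then
    if version_at_least "$installed" "$fixed"; then
        echo "binutils $installed includes the fix"
    else
        echo "binutils $installed predates the fix; update to $fixed or later"
    fi
fi
```

As noted under "Additional info", perf may need to be rebuilt against the updated binutils before the improvement shows up.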