Bugzilla (bugzilla.redhat.com) will be under maintenance for infrastructure upgrades and will not be unavailable on July 31st between 12:30 AM - 05:30 AM UTC. We appreciate your understanding and patience. You can follow status.redhat.com for details.
Bug 1886599 - "perf report" dies in 1GB RAM
Summary: "perf report" dies in 1GB RAM
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-tools
Version: 33
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Justin M. Forbes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-10-08 21:05 UTC by John Reiser
Modified: 2021-05-20 00:04 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments (Terms of Use)
oom-killer info from syslog (15.13 KB, text/plain)
2020-10-08 21:15 UTC, John Reiser
no flags Details

Description John Reiser 2020-10-08 21:05:09 UTC
Description of problem: "perf report" suffers oom-killer while running under Fedora-Minimal-33_Beta-1.3.aarch64.raw.xz on RaspberryPi 3 model B+ in 64-bit mode (aarch64) with 1GB RAM. /usr/bin/top reports 935MiB Mem, 467MiB Swap.
The input perf.data is 160MB from "perf record -g" lasting 160 seconds.

Inspection using
   strace --e trace=brk,open,openat,close,mmap,munmap,mprotect perf report
shows several related problems:

1) No feature to limit memory allocation, and adapt processing accordingly.  There should be a command-line option such as "--max-memory=N" where N is a number in bytes with suffix K, M, G indicating (N<<10), (N<<20), or (N<<30).  The default value should be (0.75 * /proc/meminfo["MemTotal"]).  Most of the brk() allocation seems to be decoded tracebacks that are awaiting a final sort(); but external sorting using temporary files has been well-researched for decades, see vol.3 of Knuth.

2. File descriptor leak: The various modules (main programs and shared libraries) that are reported by "perf buildid-list" are openat() and mmap() in order, up to a file descriptor bound of 33 or 34, but "perf report" closes only 1 or 2 of them despite not needing the fd after mmap().

3. Broken mmap() cache.  All slots in the cache are occupied by only one mmap(), ever, except for the last slot.  The last slot is completely cycled on every access, from empty to full then back to empty, and not even peep-hole optimized (retain the fd+mmap for consecutive accesses to the same module; apply munmap() and close() only upon a request for a different module).  This is wasteful by a factor of 2.  For example, from the strace output:
=====
openat(AT_FDCWD, "/usr/lib64/libmodulemd.so.2.9.4", O_RDONLY) = 33
mmap(NULL, 530400, PROT_READ, MAP_PRIVATE, 33, 0) = 0xffffa40c5000
munmap(0xffffa40c5000, 530400)          = 0
close(33)                               = 0
openat(AT_FDCWD, "/usr/lib64/libmodulemd.so.2.9.4", O_RDONLY) = 33
mmap(NULL, 530400, PROT_READ, MAP_PRIVATE, 33, 0) = 0xffffa40c5000
munmap(0xffffa40c5000, 530400)          = 0
close(33)                               = 0
openat(AT_FDCWD, "/usr/lib64/libyaml-0.so.2.0.9", O_RDONLY) = 33
mmap(NULL, 135808, PROT_READ, MAP_PRIVATE, 33, 0) = 0xffffa4125000
munmap(0xffffa4125000, 135808)          = 0
close(33)                               = 0
openat(AT_FDCWD, "/usr/lib64/libyaml-0.so.2.0.9", O_RDONLY) = 33
mmap(NULL, 135808, PROT_READ, MAP_PRIVATE, 33, 0) = 0xffffa4125000
munmap(0xffffa4125000, 135808)          = 0
close(33)                               = 0
=====

4. In many cases the data cannot be collected and transported for analysis to a system with more RAM, because "perf archive" ignores the second column of "perf buildid-list".  If the that is file named with a given .buildid is not found in the .debug directories, then "perf archive" complains and omits that file entirely.  Instead "perf archive" should check that the path given in the second column does have the buildid that is listed im the first column, then use the path from the second column as the source for that module in the archive.  That path might not have as many symbols as a .debug file, but usually there are some symbols; and the motivated user can disassemble to gain some knowledge even from just a numeric address.


Version-Release number of selected component (if applicable):
perf-5.8.11-300.fc33.aarch64


How reproducible: every time


Steps to Reproduce:
1. Run a system-wide "perf record -g" until 160MB of data has been collected in file perf.data
2. Attempt to generated output using "perf report" on a system with only 1GB RAM.
3.

Actual results: Death by oom-killer, an no useful output.


Expected results: Useful and complete output from "perf report".


Additional info:
$ perf buildid-list | wc
     81     162    5972

$ perf buildid-list | sed 10q
5ab8bd57a0dfe5043b533fa0f14bb2858f8f7104 [kernel.kallsyms]
63653a0ac1d1cd057523b982bfc379a0b761fa98 /usr/lib/systemd/systemd
addd1d0fa6a711e0dde947e3eb48a9713dc336c7 /usr/lib64/libnl-3.so.200.26.0
0582ebd57538a155cfca5c1dd78412461000b886 /usr/lib64/libnl-route-3.so.200.26.0
01b3afe3607d7578f0843b97f7a75393bb797d52 /usr/lib64/libudev.so.1.6.18
d1db9e285cb7c68f98d2becbcf4a74cf4dc3505a /usr/lib64/libffi.so.6.0.2
8891095747b5ef5946c6bbc0d48d3e69b4d5701f /usr/lib64/libz.so.1.2.11
a2fe285121b1202b34262e7e16c9c164fd7f9637 /usr/lib64/libpthread-2.32.so
45cfdce4c600e7547868615334f49443c5d77fa8 /usr/lib64/libzstd.so.1.4.5
311449c5479e707d4346c1a03b70d67d00e0c9ed /usr/lib64/libcrypto.so.1.1.1g

$ perf report --header-only
# ========
# captured on    : Sat Oct  3 16:30:46 2020
# header version : 1
# data offset    : 440
# data size      : 165914112
# feat offset    : 165914552
# hostname : localhost.localdomain
# os release : 5.8.12-300.fc33.aarch64
# perf version : 5.8.11
# arch : aarch64
# nrcpus online : 4
# nrcpus avail : 4
# cpuid : 0x00000000410fd030
# total memory : 958016 kB
# cmdline : /usr/bin/perf record -g 
# event : name = cycles, , id = { 4, 5, 6, 7 }, size = 120, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CALLCHAIN|ID|CPU|PERIOD, read_format = ID, disa>
# event : name = dummy:HG, , id = { 8, 9, 10, 11 }, type = 1, size = 120, config = 0x9, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CALLCHAIN|ID|CPU|PE>
# CPU_TOPOLOGY info available, use -I to display
# NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: software = 1, uprobe = 7, breakpoint = 5, tracepoint = 2, kprobe = 6, armv8_cortex_a53 = 8
# CACHE info available, use -I to display
# time of first sample : 739.524839
# time of last sample : 899.941060
# sample duration : 160416.221 ms
# MEM_TOPOLOGY info available, use -I to display
# bpf_prog_info 61: bpf_prog_a42d275341448247 addr 0xffff800000008018 size 472
# bpf_prog_info 62: bpf_prog_6deef7357e7b4530 addr 0xffff80000000ae7c size 104
# bpf_prog_info 63: bpf_prog_6deef7357e7b4530 addr 0xffff80000002ca74 size 104
# bpf_prog_info 64: bpf_prog_438c5618576e5b0c addr 0xffff80000002e32c size 524
# bpf_prog_info 65: bpf_prog_6deef7357e7b4530 addr 0xffff800000030c88 size 104
# bpf_prog_info 66: bpf_prog_6deef7357e7b4530 addr 0xffff800000054c44 size 104
# bpf_prog_info 67: bpf_prog_6ceaf4d6095cca93 addr 0xffff80000000c6f0 size 652
# bpf_prog_info 68: bpf_prog_6deef7357e7b4530 addr 0xffff8000000622b0 size 104
# bpf_prog_info 69: bpf_prog_6deef7357e7b4530 addr 0xffff8000000649e8 size 104
# bpf_prog_info 70: bpf_prog_2705a24f44b96941 addr 0xffff800000066824 size 444
# bpf_prog_info 73: bpf_prog_ee0e253c78993a24 addr 0xffff8000000bc4e8 size 408
# bpf_prog_info 74: bpf_prog_1ab2530658b0b2d1 addr 0xffff8000000beda8 size 468
# bpf_prog_info 75: bpf_prog_6deef7357e7b4530 addr 0xffff8000000c0334 size 104
# bpf_prog_info 76: bpf_prog_6deef7357e7b4530 addr 0xffff8000000fad50 size 104
# bpf_prog_info 80: bpf_prog_ee0e253c78993a24 addr 0xffff8000000b0290 size 408
# bpf_prog_info 81: bpf_prog_6deef7357e7b4530 addr 0xffff80000006cdd8 size 104
# bpf_prog_info 82: bpf_prog_6deef7357e7b4530 addr 0xffff8000000fc234 size 104
# missing features: TRACING_DATA CPUDESC BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CPU_PMU_CAPS 
# ========

Comment 1 John Reiser 2020-10-08 21:15:09 UTC
Created attachment 1720068 [details]
oom-killer info from syslog

Comment 2 Jarod Wilson 2020-10-09 13:41:54 UTC
kernel-utils isn't a package that has existed since circa RHEL 4.

Comment 3 Fedora Admin user for bugzilla script actions 2021-05-19 00:03:50 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 4 Fedora Admin user for bugzilla script actions 2021-05-20 00:04:11 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.


Note You need to log in before you can comment on or make changes to this bug.