Bug 2086870 - `perf record true` crashes with libbpf >= 0.5.0
Summary: `perf record true` crashes with libbpf >= 0.5.0
Keywords:
Status: CLOSED DUPLICATE of bug 2076978
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-tools
Version: 36
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Justin M. Forbes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-16 17:37 UTC by Ondrej Mosnacek
Modified: 2022-05-25 20:42 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-25 20:42:08 UTC
Type: Bug
Embargoed:



Description Ondrej Mosnacek 2022-05-16 17:37:39 UTC
Description of problem:
Running any `perf record ...` command on F36 leads to a segfault. Downgrading to the F35 libbpf (libbpf-2:0.4.0-2.fc35.x86_64) makes the problem go away, while with the rawhide version (libbpf-2:0.7.0-3.fc37.x86_64) the problem persists. Thus, this seems to be a regression introduced in version 0.5.0.

Version-Release number of selected component (if applicable):
libbpf-2:0.5.0-2.fc36.x86_64
libbpf-2:0.7.0-3.fc37.x86_64
perf-5.17.6-300.fc36.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Run 'perf record true'.

Actual results:
'Segmentation fault (core dumped)'

Expected results:
No segfault.

Additional info:
This seems to be caused by a stack overflow - see the backtrace from gdb:

#0  btf__get_from_id (id=20, btf=btf@entry=0x7fffff7ff030) at btf.c:1411
#1  0x0000555555964004 in btf__load_from_kernel_by_id (id=<optimized out>) at util/bpf-event.c:30
#2  0x00007ffff7debd8d in btf__get_from_id (id=<optimized out>, btf=btf@entry=0x7fffff7ff080) at btf.c:1411
#3  0x0000555555964004 in btf__load_from_kernel_by_id (id=<optimized out>) at util/bpf-event.c:30
#4  0x00007ffff7debd8d in btf__get_from_id (id=<optimized out>, btf=btf@entry=0x7fffff7ff0d0) at btf.c:1411
[...repeats many times...]
#209081 0x0000555555964004 in btf__load_from_kernel_by_id (id=<optimized out>) at util/bpf-event.c:30
#209082 0x00007ffff7debd8d in btf__get_from_id (id=<optimized out>, btf=btf@entry=0x7fffffff8d40) at btf.c:1411
#209083 0x0000555555964004 in btf__load_from_kernel_by_id (id=<optimized out>) at util/bpf-event.c:30
#209084 0x00007ffff7debd8d in btf__get_from_id (id=<optimized out>, btf=btf@entry=0x7fffffff8d90) at btf.c:1411
#209085 0x0000555555964004 in btf__load_from_kernel_by_id (id=<optimized out>) at util/bpf-event.c:30
#209086 0x00005555559645d7 in perf_event__synthesize_one_bpf_prog (opts=0x555555f77298 <record+312>, event=0x555556093740, fd=22, machine=0x555556013c48, process=0x5555557cec20 <process_synthesized_event>, session=0x555556013a40) at util/bpf-event.c:267
#209087 perf_event__synthesize_bpf_events (session=session@entry=0x555556013a40, process=process@entry=0x5555557cec20 <process_synthesized_event>, machine=machine@entry=0x555556013c48, opts=opts@entry=0x555555f77298 <record+312>) at util/bpf-event.c:451
#209088 0x00005555557cd92b in record__synthesize (tail=tail@entry=false, rec=0x555555f77160 <record>) at builtin-record.c:1471
#209089 0x00005555557d0d0a in __cmd_record (rec=0x555555f77160 <record>, argv=<optimized out>, argc=1) at builtin-record.c:1786
#209090 cmd_record (argc=1, argv=<optimized out>) at builtin-record.c:2962
#209091 0x0000555555856021 in run_builtin (p=0x555555f81718 <commands+216>, argc=2, argv=0x7fffffffe380) at perf.c:313
#209092 0x00005555557b73c4 in handle_internal_command (argv=0x7fffffffe380, argc=2) at perf.c:365
#209093 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at perf.c:409
#209094 main (argc=2, argv=0x7fffffffe380) at perf.c:539
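
The alternating frames above suggest that the two functions call each other without ever reaching a base case. The following is only a minimal sketch of that pattern, not the actual perf or libbpf code: the `demo_*` names and the `DEMO_MAX_DEPTH` cap are hypothetical and exist purely so the example terminates instead of overflowing the stack.

```c
#include <assert.h>

/* Hypothetical cap so the demo stops instead of overflowing the stack. */
#define DEMO_MAX_DEPTH 1000

/* Counts how many times the "library" side has been entered. */
static int demo_depth;

static int demo_get_from_id(unsigned int id);

/* Stand-in for the frame reported at util/bpf-event.c:30:
 * it simply forwards to the other function. */
static int demo_load_from_kernel_by_id(unsigned int id)
{
    return demo_get_from_id(id);
}

/* Stand-in for the frame reported at btf.c:1411: it forwards back,
 * so without the cap the two functions would recurse forever. */
static int demo_get_from_id(unsigned int id)
{
    if (++demo_depth >= DEMO_MAX_DEPTH)
        return -1; /* artificial exit; the trace shows no such exit */
    return demo_load_from_kernel_by_id(id);
}

/* Runs the cycle once and returns how deep it went before the cap hit. */
int demo_run(unsigned int id)
{
    demo_depth = 0;
    int rc = demo_load_from_kernel_by_id(id);
    assert(rc == -1); /* the only way out is the artificial cap */
    return demo_depth;
}
```

Calling `demo_run(20)` returns `DEMO_MAX_DEPTH`, i.e. the cycle only ends because of the artificial cap. In the trace itself, the `btf` argument address advances by 0x50 (80 bytes) per round trip, from 0x7fffff7ff030 at frame #0 to 0x7fffffff8d90 at frame #209084, which is roughly 8 MB of stack and is consistent with hitting a typical 8 MiB stack limit (the limit value is an assumption, not something shown in the report).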

Comment 1 Ondrej Mosnacek 2022-05-19 06:26:55 UTC
Interestingly, with perf from rawhide (perf-5.18.0-0.rc7.git0.1.fc37.x86_64) installed on F36 (`dnf update perf --releasever rawhide -y`) the issue also doesn't reproduce.

Comment 2 Ondrej Mosnacek 2022-05-25 17:53:29 UTC
It looks like this upstream commit fixed it:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ae065a5d265bc5ada13e350015458e0c5e5c351

The fix has also been backported to the 5.17 stable branch, so updating kernel-tools to 5.17.10 or later should fix it:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.17.y&id=e8c7bfd8ff6c6385184bee694c0d741b40b81f37

Comment 3 Justin M. Forbes 2022-05-25 20:42:08 UTC

*** This bug has been marked as a duplicate of bug 2076978 ***

