Bug 1656432 - [cgroup bpf devices] BPF program is not properly freed
Summary: [cgroup bpf devices] BPF program is not properly freed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Jiri Olsa
QA Contact: Ziqian SUN (Zamir)
URL:
Whiteboard:
Depends On:
Blocks: 1513930 1689297 1696304 1717394 1717396
 
Reported: 2018-12-05 13:58 UTC by Pavel Hrdina
Modified: 2019-11-05 21:36 UTC
CC List: 11 users

Fixed In Version: kernel-4.18.0-111.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-05 21:35:43 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
reproduce test program (8.95 KB, text/x-csrc), attached 2018-12-05 13:58 UTC by Pavel Hrdina


Links
Red Hat Product Errata RHSA-2019:3517, last updated 2019-11-05 21:36:23 UTC

Description Pavel Hrdina 2018-12-05 13:58:02 UTC
Created attachment 1511682 [details]
reproduce test program

Description of problem:

In libvirt we are using cgroups to limit the resources available to QEMU processes, and one of those resources is access to devices.  With the new cgroup v2, access to devices is controlled by BPF programs.  In libvirt we need to create a new program for every VM and attach it to the appropriate cgroup.
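
For illustration only, the same per-VM lifecycle can be approximated from the command line with bpftool; 'device_filter.o' and the pin/cgroup paths below are hypothetical stand-ins (libvirt itself creates and attaches the program programmatically via the bpf() syscall):

    # hypothetical sketch of the per-VM lifecycle, using bpftool
    mkdir /sys/fs/cgroup/machine-vm1                 # per-VM cgroup (cgroup v2)
    bpftool prog load device_filter.o /sys/fs/bpf/vm1-dev
    bpftool cgroup attach /sys/fs/cgroup/machine-vm1 device pinned /sys/fs/bpf/vm1-dev
    # ... VM runs; its device access is mediated by the program ...
    rm /sys/fs/bpf/vm1-dev                           # drop the pin reference
    rmdir /sys/fs/cgroup/machine-vm1                 # removing the cgroup should
                                                     # release the last reference
                                                     # and free the program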


Version-Release number of selected component (if applicable):
kernel-4.18.0-47.el8.src.rpm


How reproducible:
Not always, which suggests it is most likely a race condition.


Steps to Reproduce (a condensed script follows the list):
1. boot the OS with 'systemd.unified_cgroup_hierarchy=1' on the kernel command line
2. install the qemu-kvm package (this requires enabling the AppStream repository)
3. download the attached test program
4. compile it with gcc/clang
5. set 'ulimit -l unlimited' so that the test program can run successfully
6. run the attached program multiple times
7. use 'bpftool prog list' to list all BPF programs; you will see "cgroup_device" programs that are no longer attached to any existing cgroup
8. in the root cgroup you can check
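
A condensed version of the above, assuming the attachment is saved as test-bpf.c (the file and binary names are assumptions; comment 6 below runs the binary as ./test-bpf):

    gcc -o test-bpf test-bpf.c               # compile the attached reproducer
    ulimit -l unlimited                      # BPF needs locked memory
    for i in {1..100}; do ./test-bpf; done   # exercise the create/attach/remove path
    bpftool prog list | grep cgroup_device   # leaked programs, if any, show up here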

Actual results:
there are BPF programs left in the system that are not freed/removed

Expected results:
all programs should be freed

Comment 3 Jiri Olsa 2019-04-01 16:00:28 UTC
(In reply to Pavel Hrdina from comment #0)
> [...]

correct, looks like rhel8 does not release the program once the
cgroup is removed.. upstream seems to work, checking on the fix

jirka

Comment 4 Jiri Olsa 2019-04-01 16:03:49 UTC
(In reply to Jiri Olsa from comment #3)
> (In reply to Pavel Hrdina from comment #0)
> > [...]
> 
> correct, looks like rhel8 does not release the program once the
> cgroup is removed.. upstream seems to work, checking on the fix
> 

looks like we're missing this one:
  d7bf2c10af05 bpf: allocate cgroup storage entries on attaching bpf programs

will provide build for testing

jirka

Comment 5 Jiri Olsa 2019-04-01 22:01:25 UTC
build with the fix:
  https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=20826762

works for my test, could you please try?

thanks,
jirka

Comment 6 Pavel Hrdina 2019-04-02 11:55:52 UTC
I've installed that kernel and after running this command:

    for i in {1..100}; do ./test-bpf ; done

where test-bpf is the compiled attachment

# date && bpftool prog list
Tue Apr  2 10:01:27 CEST 2019
4: cgroup_device  tag aeb9784193c239a2  gpl
	loaded_at 2019-04-02T09:45:17+0200  uid 0
	xlated 608B  jited 375B  memlock 4096B  map_ids 4
12: cgroup_device  tag aeb9784193c239a2  gpl
	loaded_at 2019-04-02T09:46:55+0200  uid 0
	xlated 608B  jited 375B  memlock 4096B  map_ids 12


There are still some programs left in the kernel even after some time
has passed since test-bpf was executed.

# uname -a
Linux rhel8 4.18.0-80.6.el8bpf_cgroup.x86_64 #1 SMP Mon Apr 1 20:10:51 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux


After starting and stopping a VM 100 times the result is worse:
all 100 programs are still there.

Unfortunately, that commit did not fix the issue.

I've tried the newer kernel shipped in Fedora 29 (version 5.0.5):
with test-bpf all BPF programs are correctly freed, but if I start
a VM 100 times there is always only one BPF program left, which is
weird.  In addition, some of the BPF programs are not freed
immediately but only after a delay of several seconds.

Comment 7 Jiri Olsa 2019-04-03 09:03:19 UTC
right.. wrong direction, I can now reproduce in upstream as well, checking on the fix

jirka

Comment 8 Jiri Olsa 2019-06-20 21:06:32 UTC
fixed by upstream:
4bfc0bb2c60e bpf: decouple the lifetime of cgroup_bpf from cgroup itself

(judging by the commit subject, the programs' lifetime was previously tied
to the cgroup object itself, which outstanding references can keep alive
long after rmdir; decoupling the two lets the programs be released as soon
as the cgroup is destroyed)

will post backport shortly
jirka

Comment 9 Jiri Olsa 2019-06-21 11:14:37 UTC
I have the backported build in here, could you please test?
  https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=22302147

thanks,
jirka

Comment 15 Pavel Hrdina 2019-06-24 14:46:27 UTC
(In reply to Jiri Olsa from comment #9)
> I have the backported build in here, could you please test?
>   https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=22302147
> 
> thanks,
> jirka

Hi, thanks for the backport.  Tested with libvirt and everything looks good;
no programs were leaked after starting and destroying 100 VMs.

Comment 17 Herton R. Krzesinski 2019-07-04 14:17:30 UTC
Patch(es) available on kernel-4.18.0-111.el8

Comment 22 errata-xmlrpc 2019-11-05 21:35:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3517

