Bug 1656432
| Summary: | [cgroup bpf devices] BPF program is not properly freed | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Pavel Hrdina <phrdina> |
| Component: | kernel | Assignee: | Jiri Olsa <jolsa> |
| kernel sub component: | BPF | QA Contact: | Ziqian SUN (Zamir) <zsun> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | high | CC: | bhu, ctrautma, jbenc, jbrouer, jhsiao, jolsa, knoel, kzhang, rvr, skozina, zsun |
| Version: | 8.0 | | |
| Target Milestone: | rc | | |
| Target Release: | 8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-4.18.0-111.el8 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-11-05 21:35:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1513930, 1689297, 1696304, 1717394, 1717396 | | |
| Attachments: | | | |
(In reply to Pavel Hrdina from comment #0)
> Actual results:
> there are BPF programs left in the system that are not freed/removed
>
> Expected results:
> all programs should be freed

correct, looks like rhel8 does not release the program once the cgroup is removed. upstream seems to work, checking on the fix

jirka

(In reply to Jiri Olsa from comment #3)
> correct, looks like rhel8 does not release the program once the
> cgroup is removed.. upstream seems to work, checking on the fix

looks like we're missing this one:

  d7bf2c10af05 bpf: allocate cgroup storage entries on attaching bpf programs

will provide build for testing

jirka

build with the fix:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=20826762

works for my test, could you please try?

thanks,
jirka

I've installed that kernel and ran this command, where test-bpf is the compiled attachment:

  for i in {1..100}; do ./test-bpf ; done

# date && bpftool prog list
Tue Apr 2 10:01:27 CEST 2019
4: cgroup_device  tag aeb9784193c239a2  gpl
        loaded_at 2019-04-02T09:45:17+0200  uid 0
        xlated 608B  jited 375B  memlock 4096B  map_ids 4
12: cgroup_device  tag aeb9784193c239a2  gpl
        loaded_at 2019-04-02T09:46:55+0200  uid 0
        xlated 608B  jited 375B  memlock 4096B  map_ids 12

Some programs are still left in the kernel even after some time has passed since test-bpf was executed.

# uname -a
Linux rhel8 4.18.0-80.6.el8bpf_cgroup.x86_64 #1 SMP Mon Apr 1 20:10:51 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

After starting and stopping a VM 100 times the result is worse: all of the 100 programs are still there. Unfortunately that commit did not fix the issue.

I've tried a newer kernel that is shipped in Fedora 29, version 5.0.5. With test-bpf all BPF programs are correctly freed, but if I start a VM 100 times there is always exactly one BPF program left, which is weird. In addition, some of the BPF programs are not freed immediately; it takes N seconds for them to be freed.

right.. wrong direction, I can now reproduce in upstream as well, checking on the fix

jirka

fixed by upstream:

  4bfc0bb2c60e bpf: decouple the lifetime of cgroup_bpf from cgroup itself

will post backport shortly

jirka

I have the backported build in here, could you please test?
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=22302147

thanks,
jirka

(In reply to Jiri Olsa from comment #9)
> I have the backported build in here, could you please test?
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=22302147

Hi, thanks for the backport. Tested with libvirt and everything looks good, there were no programs leaked after starting and destroying 100 VMs.

Patch(es) available on kernel-4.18.0-111.el8

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3517
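The verification described in the comments above (snapshot the loaded programs, run the reproducer in a loop, then look at `bpftool prog list` again) could be scripted roughly as follows. This is only a sketch: the function name `new_cgroup_device_ids` and the snapshot file names are hypothetical, and it assumes the plain-text `bpftool prog list` layout shown in this report, where each program starts with a header line of the form `<id>: <type> ...`.

```shell
#!/bin/sh
# Hypothetical helper, not part of bpftool or of the attached reproducer.
# Prints the IDs of cgroup_device programs that appear in the "after"
# snapshot ($2) but not in the "before" snapshot ($1). Both files are
# expected to hold saved `bpftool prog list` plain-text output.
new_cgroup_device_ids() {
    awk '
        # Header lines look like "<id>: <type> ..."; only cgroup_device matters.
        $1 ~ /^[0-9]+:$/ && $2 == "cgroup_device" {
            id = $1
            sub(/:$/, "", id)
            if (FILENAME == ARGV[1])
                seen[id] = 1          # program already existed before the run
            else if (!(id in seen))
                print id              # program appeared during the run
        }
    ' "$1" "$2"
}

# Possible usage (needs root and bash for the brace expansion):
#   bpftool prog list > before.txt
#   for i in {1..100}; do ./test-bpf; done
#   sleep 30    # some programs are only freed after a delay
#   bpftool prog list > after.txt
#   new_cgroup_device_ids before.txt after.txt
```

An empty result would suggest that no cgroup_device program outlived the run. The delay before the second snapshot is a guess, since the report only says freeing can take "N seconds".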
Created attachment 1511682
reproduce test program

Description of problem:

In libvirt we use cgroups to limit the resources available to QEMU processes, and one of those resources is access to devices. With the new cgroupv2, access to devices is controlled by BPF programs. In libvirt we need to create a new program for every VM and attach it to the appropriate cgroup.

Version-Release number of selected component (if applicable):
kernel-4.18.0-47.el8.src.rpm

How reproducible:
Not always, which indicates that it's most likely a race condition.

Steps to Reproduce:
1. boot the OS with 'systemd.unified_cgroup_hierarchy=1' on the kernel command line
2. install the qemu-kvm package (for that you need to enable the AppStream repository)
3. download the attached test program
4. compile it using gcc/clang
5. set 'ulimit -l unlimited' in order to run the test program successfully
6. run the attached program multiple times
7. use 'bpftool prog list' to list all BPF programs; you will see existing "cgroup_device" programs that are no longer assigned to any existing cgroup
8. in the root cgroup you can check

Actual results:
there are BPF programs left in the system that are not freed/removed

Expected results:
all programs should be freed
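The `bpftool prog list` check from the steps above can be reduced to a single number. A minimal sketch, assuming the plain-text `bpftool prog list` layout in which every program begins with a header line `<id>: <type> ...`; the function name `count_cgroup_device_progs` is made up for illustration and is not part of bpftool.

```shell
#!/bin/sh
# Hypothetical helper: count "cgroup_device" programs in `bpftool prog list`
# plain-text output read from stdin. Only header lines ("<id>: <type> ...")
# are counted; the indented detail lines are ignored.
count_cgroup_device_progs() {
    awk '$1 ~ /^[0-9]+:$/ && $2 == "cgroup_device" { n++ } END { print n + 0 }'
}

# Possible usage (bpftool needs root):
#   bpftool prog list | count_cgroup_device_progs
```

If the count keeps growing across repeated runs of the test program while no matching cgroups exist any more, that would be consistent with the leak described here. Other software on the system may also load cgroup_device programs, so a non-zero count alone does not prove a leak.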