Description of problem:
The memory usage of the virt-launcher process in the virt-launcher pod is growing and may lead to OOMKilled pods:

> # VM started:
> qemu 10 0.1 0.2 2002312 60872 ? Sl 22:15 0:04 /usr/bin/virt-launcher --qemu-timeout 253s --name vm-fedora --uid 6b181371-7c41-4783-8f8f-a
> # 12hrs:
> qemu 10 0.1 0.2 2002312 67952 ? Sl May16 0:58 /usr/bin/virt-launcher --qemu-timeout 253s --name vm-fedora --uid 6b181371-7c41-4783-8f8f-a
> # 48hrs:
> qemu 10 0.1 0.2 2002312 70528 ? Sl May16 3:06 /usr/bin/virt-launcher --qemu-timeout 253s --name vm-fedora --uid 6b181371-7c41-4783-8f8f-a
> # And after 5 days it has exceeded the 75MB limit:
> qemu 10 0.1 0.3 2002312 77364 ? Sl May16 9:28 /usr/bin/virt-launcher --qemu-timeout 253s --name vm-fedora --uid 6b181371-7c41-4783-8f8f-a

Version-Release number of selected component (if applicable):
4.11

How reproducible:

Steps to Reproduce:
1. Run a VM with stress for a long time
2. Check memory usage
3.

Actual results:
Memory usage of virt-launcher grows and may exceed the limit

Expected results:
Memory usage stays under the limit

Additional info:
We've left a cluster running for a week, and have not seen evidence of a memory leak. We've used a profiler to double check memory consumption of virt-launcher as well. Because we don't have a clear reproducer, we'd like to enlist QE's help. Would it be possible to set up a long-running test to provide evidence of the problem? Deferring to the next release. There's no clear evidence that this is an urgent problem, and diagnosing whether or not there is a problem will take time.
I've reproduced it again and can share access to the cluster. The VM has been running for 6 days and virt-launcher currently uses ~81 MB:

$ oc get vm
NAME        AGE     STATUS    READY
vm-fedora   6d20h   Running   True

# Initial
USER  PID %CPU %MEM     VSZ   RSS TTY STAT START TIME COMMAND
qemu   10  0.1  0.2 2002316 65300 ?   Sl   18:56 0:01 /usr/bin/virt-launcher --qemu-timeout 313s --name vm-fedora --uid 7535d491-5ac3-46fa-9173-0
# 72hrs
qemu   10  0.1  0.3 2002316 75840 ?   Sl   May27 5:56 /usr/bin/virt-launcher --qemu-timeout 313s --name vm-fedora --uid 7535d491-5ac3-46fa-9173-0
# 6 days
qemu   10  0.1  0.3 2002316 81084 ?   Sl   May27 11:04 /usr/bin/virt-launcher --qemu-timeout 313s --name vm-fedora --uid 7535d491-5ac3-46fa-9173-0

The guest is a Fedora VM with the stress-ng tool running:

nohup sudo stress-ng --vm 1 --vm-bytes 100% --vm-method all --verify -t 0 &
Could you share the VM definition that was used? We'd like to clarify what limit was set.
During analysis of this BZ, we did not find direct evidence of a memory leak. This was validated by profiling the virt-launcher container over the course of several days. However, empirical data collected during profiling showed that the overhead calculation for virt-launcher was too low. Prior to the fix, the overhead was set to 75M, while testing showed that virt-launcher's overhead could reach as high as 90M. The overhead calculation has subsequently been adjusted. The fix has been merged in the CNV 4.11 stable branch, but builds are still pending.
I have 7 VMs that have been running for multiple days (5-10) and don't see memory usage above 100 MB (usually it is between 70 and 90 MB). Moving this BZ to Verified. I will continue testing for several more days just in case.
This bug re-appeared in CI, so I'm reopening it for further investigation.
Because the symptoms are different, I've filed a new bug (bug #2111794) and am moving this one back to Verified.
I've decided to reopen this bug. According to the longevity test (VMs running for 20-30 days), we are already over the limit on some pods and pretty close on others:

> virt-launcher-rhel8-nonroot-meadowlark-fw7qr   1/1   Running   0   21d
> POD: virt-launcher-rhel8-nonroot-meadowlark-fw7qr, RSS: 98640
> virt-launcher-rhel8-nonroot-quelea-gkcz5       1/1   Running   0   21d
> POD: virt-launcher-rhel8-nonroot-quelea-gkcz5, RSS: 97928
> virt-launcher-rhel8-root-barracuda-8mqd4       1/1   Running   0   28d
> POD: virt-launcher-rhel8-root-barracuda-8mqd4, RSS: 105512
> virt-launcher-vm-fedora-mem-d7pnt              2/2   Running   0   21d
> POD: virt-launcher-vm-fedora-mem-d7pnt, RSS: 69576
> virt-launcher-win10-nonroot-sailfish-fvpl2     2/2   Running   0   21d
> POD: virt-launcher-win10-nonroot-sailfish-fvpl2, RSS: 94884
> virt-launcher-win10-nonroot-stoat-pnmx7        2/2   Running   0   21d
> POD: virt-launcher-win10-nonroot-stoat-pnmx7, RSS: 89344
> virt-launcher-win10-root-rodent-jk7l4          2/2   Running   0   31d
> POD: virt-launcher-win10-root-rodent-jk7l4, RSS: 109876

I'm collecting memory usage every 6 hours, and I can see that it goes up and down periodically, but the general trend is that it grows from day to day on most of the pods. You can see an example in the attached file rhel8-root-baracuda.mem.log.
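A log like the one attached can be summarized with a short shell function to make the growth trend obvious. This is only a sketch: the two-column "TIMESTAMP RSS-in-KiB" format is an assumption about the attached log's layout.

```shell
# Sketch: summarize a "TIMESTAMP RSS_KiB" memory log (format assumed from the
# attached rhel8-root-baracuda.mem.log) by printing the first, last, and peak
# RSS samples, so a steady upward trend over the collection period is easy to spot.
mem_trend() {
    awk '
        NR == 1   { first = $2 }   # first sample in the log
        $2 > peak { peak  = $2 }   # track the maximum seen so far
                  { last  = $2 }   # latest sample
        END { printf "first=%sKi last=%sKi peak=%sKi\n", first, last, peak }
    ' "$1"
}
```

For example, `mem_trend rhel8-root-baracuda.mem.log` would print one line with the first, latest, and peak RSS values.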
Federico, can you please share a command showing how Denys can get output similar to what the test provides? (example)

tests/vmi_configuration_test.go:2950
the virtlogd process is taking too much RAM! (17468Ki > 17Mi). All processes: &map[libvirtd:{{28876800 0} {<nil>} BinarySI} qemu-kvm:{{454934528 0} {<nil>} BinarySI} virt-launcher:{{61800448 0} {<nil>} BinarySI} virt-launcher-monitor:{{19357696 0} {<nil>} BinarySI} virtlogd:{{17887232 0} {<nil>} 17468Ki BinarySI}]

IOW, can we not only look at the container level, but also try to identify which process is growing, to get a better lead?
@fdeutsch The test checks the RSS of the involved processes by running the `ps` command with different arguments. The full command is `ps --no-header axo rss,command`, but I don't think the arguments are important here. We check the following processes:

"virt-launcher-monitor"
"virt-launcher"
"virtlogd"
"libvirtd"

btw, even though they may seem to refer to the entire pod, the RSS results Denys showed refer to the `virt-launcher` process only.
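A per-process breakdown like the test's can be reproduced by hand inside the virt-launcher pod. The following is a sketch, not the test's exact code; it uses `ps axo rss=,command=` (equivalent to the test's `ps --no-header axo rss,command`) and matches on the command's basename so that `virt-launcher` does not also swallow `virt-launcher-monitor`.

```shell
# Sketch: report the RSS, in KiB, of each process the functional test watches.
# RSS of multiple instances of the same binary is summed; an absent process
# reports 0Ki.
proc_rss_report() {
    for name in virt-launcher-monitor virt-launcher virtlogd libvirtd; do
        rss=$(ps axo rss=,command= \
              | awk -v p="$name" '{ n = split($2, a, "/"); if (a[n] == p) s += $1 }
                                  END { print s + 0 }')
        echo "$name: ${rss}Ki"
    done
}
proc_rss_report
```

Running this periodically (e.g. from the same 6-hour collection loop) would show which of the four processes is the one actually growing.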
The libvirt-go fix has been merged in the release branch and should shortly land downstream too.
The following is a procedure to verify that the leak is actually gone.

1. Create a VMI (any kind of VMI, it's not important)
2. SSH as root into the node running the VMI
3. Install bcc-tools to get the memleak tool:
   # dnf install bcc-tools kernel-devel-$(uname -r)
4. Look for the PID of the virt-launcher running the VMI previously started
5. Run the memleak tool against that PID:
   # /usr/share/bcc/tools/memleak -p $PID -o 8000 --wa-missing-free
   The memleak tool will display all outstanding allocations (meaning they haven't been freed yet). What you want to verify is that you don't see any memory allocation related to the virt-handler prometheus endpoint scraping.
6. Exec into the virt-handler running the VMI:
   $ kubectl -n openshift-cnv exec virt-handler-* -- bash
7. Scrape the prometheus endpoint multiple times and check that you don't see any related allocation reported by memleak:
   $ for i in {1..10000}; do curl -L -k https://0.0.0.0:8443/metrics; done

If the leak is still there it will be pretty evident: right after scraping the endpoint 10K times, you will see the tool reporting ~500K of leaked memory from the same call site.

Hope this is clear enough.
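Steps 4-5 above can be scripted on the node. This sketch only echoes the memleak invocation instead of executing it, since bcc-tools must be installed first (step 3) and the tool has to run as root; the `pgrep` pattern is an assumption based on the `/usr/bin/virt-launcher` command line shown earlier in this bug.

```shell
# Sketch of steps 4-5: find the oldest virt-launcher PID on the node and build
# the memleak command line. The command is echoed, not run, because bcc-tools
# and root privileges are prerequisites (see step 3).
PID=$(pgrep -o -f '/usr/bin/virt-launcher' || true)
echo /usr/share/bcc/tools/memleak -p "${PID:-<virt-launcher-pid>}" -o 8000 --wa-missing-free
```

If several VMIs run on the node, match on the VMI name instead (e.g. `pgrep -f 'virt-launcher.*--name vm-fedora'`) to pick the right process.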
Here are the results after running

for i in {1..10000}; do curl -L -k https://0.0.0.0:8443/metrics; done

multiple times on the old and new CNV.

Old cluster:
[01:48:35] Top 10 stacks with outstanding allocations:
    22448595 bytes in 307515 allocations from stack
        g_malloc+0x16 [libglib-2.0.so.0.5600.4]

New cluster (CNV-v4.11.0-587):
[01:48:59] Top 10 stacks with outstanding allocations:
    7024 bytes in 9 allocations from stack
        [unknown] [libglib-2.0.so.0.5600.4]

It looks like the issue is fixed! Thank you!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:6526