Description of problem:
After running libvirt with around 62 domains for a couple of days, memory usage climbs to about 2.5G (2.4G in this case, but other servers reached 2.9G) - it looks like libvirtd has a memory leak. No actions were performed on the domains apart from possible migrations initiated by RHEV-M.

top - 11:28:17 up 4 days, 19:43, 1 user, load average: 0.20, 0.37, 0.36
Tasks: 1879 total, 4 running, 1875 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.1%us, 1.9%sy, 0.0%ni, 97.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 66092068k total, 33201216k used, 32890852k free, 166648k buffers
Swap: 68108280k total, 0k used, 68108280k free, 9813308k cached

 PID USER PR NI VIRT  RES  SHR  S %CPU %MEM TIME+     COMMAND
9195 root 20  0 2929m 2.4g 4720 S 19.3  3.8 847:20.03 libvirtd

pgrep qemu | wc -l
62

ps -o etime `pgrep libvirt`
ELAPSED
4-19:53:37

After restarting libvirtd, memory drops to 23m:

40916 root 20 0 692m 23m 5364 S 35.8 0.0 6:40.15 libvirtd

Version-Release number of selected component (if applicable):
libvirt-0.8.7-18.el6.x86_64

How reproducible:
Happened on many servers.

Steps to Reproduce:
1. Run 90 VMs on a host.
2. Leave it running for 4-5 days.
3.

Actual results:

Expected results:

Additional info:
Logs are extensive for such a long period; in any case, the hosts are available for debugging.
Moran, I need to get a picture of libvirt as a whole over those 5 days, especially which APIs were called. Could you please set the following in /etc/libvirt/libvirtd.conf:

log_level = 3
log_filters="1:libvirt"
log_outputs="1:file:/var/log/libvirtd_debug.log"

then run libvirt for the desired time and attach /var/log/libvirtd_debug.log? Thanks.
Michal proposed a fix upstream today; it received some feedback requiring a v2, but no major objections. https://www.redhat.com/archives/libvir-list/2011-July/msg00752.html
Sent v2: https://www.redhat.com/archives/libvir-list/2011-July/msg00812.html
Pushed upstream:

commit 85aa40e26d00a64453653c32dc08d25b65e851d5
Author: Michal Privoznik <mprivozn>
Date:   Thu Jul 14 12:53:45 2011 +0200

    storage: Avoid memory leak on metadata fetching

    Getting metadata on storage allocates memory (path) which needs to be
    freed after use, otherwise it gets leaked. This means that after using
    virStorageFileGetMetadataFromFD or virStorageFileGetMetadata one must
    call virStorageFileFreeMetadata to free it. This function frees the
    structure internals and the structure itself.

v0.9.3-138-g85aa40e
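For illustration, here is a minimal, self-contained C sketch of the ownership rule the commit describes: the "get metadata" helper allocates a structure containing a path string, so the caller must release both through a dedicated free function. The names and signatures below are hypothetical stand-ins, not the actual libvirt internal API.

/* Minimal sketch (not libvirt code) of the fixed ownership pattern. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *backingStore;   /* allocated path - the field that was being leaked */
} FileMetadata;

static FileMetadata *get_file_metadata(const char *path)
{
    FileMetadata *meta = calloc(1, sizeof(*meta));
    if (!meta)
        return NULL;
    meta->backingStore = strdup(path);   /* pretend we parsed a backing file */
    return meta;
}

/* Frees structure internals and the structure itself,
 * analogous to what virStorageFileFreeMetadata() is described to do. */
static void free_file_metadata(FileMetadata *meta)
{
    if (!meta)
        return;
    free(meta->backingStore);
    free(meta);
}

int main(void)
{
    FileMetadata *meta = get_file_metadata("/var/lib/libvirt/images/test.qcow2");
    if (meta)
        printf("backing store: %s\n", meta->backingStore);
    free_file_metadata(meta);   /* without this call, every query leaks */
    return 0;
}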
Consumption:

1. With libvirt-client-0.9.3-5.el6.x86_64.rpm, kept 62 domains running for 4 days and didn't encounter this bug:

  PID USER PR NI VIRT RES SHR  S %CPU %MEM TIME+     COMMAND
17500 root 20  0 829m 35m 4092 S  0.0  0.0 739:12.03 libvirtd

2. With libvirt-0.8.7-18.el6.x86_64.rpm, kept 62 domains running for 3 days and didn't reproduce this bug either:

 PID USER PR NI VIRT RES SHR  S %CPU %MEM TIME+   COMMAND
3897 root 20  0 672m 12m 4092 S  0.0  0.0 0:04.22 libvirtd

Questions:
1. Is 5 days the threshold to reproduce this bug?
2. Is 62 domains the standard for this bug? Can we run more domains to accelerate reproduction?
3. As we have not reproduced this bug with the new build during 4 days, may we set this bug to VERIFIED?
4. Can you help try with the vdsm test environment?
Vivian, just running domains does not reproduce this bug. To do so, you need to create a volume with a backing store and query vol-info on it. I think vdsm would be helpful here as you wouldn't need to write any reproducer. To answer your questions:

1. No. This bug shows up from the very first start of vdsm and consumes a lot of memory. The more domains are running, the more vol-info commands are issued by vdsm, and the more memory is leaked.
2. You don't need to run so many domains. I'd suggest running ~10 domains for an hour (with vdsm); you should see the bug immediately: run 'top -p $(pgrep libvirtd)' and watch the memory usage change. A rough reproducer sketch is shown below.

Since I think a little help from vdsm is needed here (at least to set up vdsm) I am not clearing the needinfo flag.
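For reference, a minimal sketch of what such a query loop might look like against the public libvirt C API (virStorageVolLookupByPath / virStorageVolGetInfo, the calls behind 'virsh vol-info <path>'), roughly mimicking vdsm's periodic polling. The connection URI and image path are example assumptions, the volume must belong to a defined storage pool, and you would compile with -lvirt; this only illustrates the query pattern, it is not a substitute for the vdsm setup.

/* Hypothetical stress loop: repeatedly query volume info for a qcow2
 * image with a backing file, then watch libvirtd RSS in top. */
#include <stdio.h>
#include <unistd.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn) {
        fprintf(stderr, "failed to connect to libvirtd\n");
        return 1;
    }

    for (int i = 0; i < 1000; i++) {
        virStorageVolPtr vol =
            virStorageVolLookupByPath(conn, "/var/lib/libvirt/images/test.qcow2");
        if (vol) {
            virStorageVolInfo info;
            if (virStorageVolGetInfo(vol, &info) == 0)
                printf("capacity=%llu allocation=%llu\n",
                       info.capacity, info.allocation);
            virStorageVolFree(vol);
        }
        sleep(1);   /* one query per second, like a slow polling loop */
    }

    virConnectClose(conn);
    return 0;
}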
Hi, mprivozn

I checked the mem leak with the following steps; there is no leak with libvirt-0.9.3-5/6/8.el6 and there is a mem leak with libvirt-0.9.3-2.el6. Is that the expected result for your fix in this bug, or should more leak checking be conducted?

# qemu-img create -f qcow2 test.qcow2 -o backing_file=foo.qcow2 +10M
# qemu-img info /var/lib/libvirt/images/test.qcow2
image: /var/lib/libvirt/images/test.qcow2
file format: qcow2
virtual size: 10M (10485760 bytes)
disk size: 140K
cluster_size: 65536
backing file: foo.qcow2 (actual path: /var/lib/libvirt/images/foo.qcow2)

# valgrind -v --leak-check=full virsh vol-info /var/lib/libvirt/images/test.qcow2

Thanks
dyuan
The mem leak should be in the daemon, not in virsh. Moran, can you please help dyuan set up vdsm so he can verify this bug?
Just tested with libvirt-0.9.3-8.el6.x86_64: there seems to be an average leak of 144KB every 30 seconds while running 40 VMs - about 4KB per second, i.e. roughly 102B/vm/second. See the attached ods.
Created attachment 515513: leak ods
(In reply to comment #18) > just tested it with libvirt-0.9.3-8.el6.x86_64 seems to be a 144KB avg leak > every 30 sec running 40 vms - 4KB a second 102B/vm/second- see attached ods Didn't this fix pass testing previously? Are we looking at a new leak?
Michal and I took a look; the patch seemed to work, however on this build we are seeing a leak whose rate is comparable to the original one.
Moving to POST:

commit 09d7eba99d95b887bd284b3418ea21438f6af277
Author: Michal Privoznik <mprivozn>
Date:   Thu Jul 28 15:42:57 2011 +0200

    qemu: Fix memory leak on metadata fetching

    As written in the virStorageFileGetMetadataFromFD description, the
    caller must free the metadata after use. The qemu driver missed this
    and therefore leaked the metadata, which can grow into a huge memory
    leak if somebody queries blockInfo a lot.

v0.9.4-rc1-36-g09d7eba
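To illustrate the code path this second fix covers, here is a minimal sketch of a client that repeatedly queries block info through the public API call the commit refers to (virDomainGetBlockInfo), similar to what management software does periodically. The domain name and disk path are assumptions for the example; compile with -lvirt.

/* Hypothetical blockInfo polling loop.  Before this fix, each call
 * leaked the backing-file metadata inside the qemu driver.
 * "guest01" and the disk path are example assumptions. */
#include <stdio.h>
#include <unistd.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn)
        return 1;

    virDomainPtr dom = virDomainLookupByName(conn, "guest01");
    if (!dom) {
        virConnectClose(conn);
        return 1;
    }

    for (int i = 0; i < 600; i++) {
        virDomainBlockInfo info;
        if (virDomainGetBlockInfo(dom, "/var/lib/libvirt/images/test.qcow2",
                                  &info, 0) == 0)
            printf("capacity=%llu allocation=%llu physical=%llu\n",
                   info.capacity, info.allocation, info.physical);
        sleep(1);
    }

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}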
Tested this issue with:
libvirt-0.9.4-0rc2.el6.x86_64
vdsm-4.9-86.el6.x86_64
qemu-kvm-0.12.1.2-2.172.el6.x86_64

Running 75 domains for nearly 3 days:

# top -p $(pgrep libvirt)
top - 18:47:24 up 2 days, 21:23, 1 user, load average: 10.78, 10.71, 10.71
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.6%us, 7.6%sy, 0.0%ni, 78.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1041251268k total, 46389468k used, 994861800k free, 220356k buffers
Swap: 1048568k total, 0k used, 1048568k free, 1418964k cached

  PID USER PR NI VIRT RES SHR  S %CPU %MEM TIME+     COMMAND
86707 root 20  0 593m 15m 6676 S 15.5  0.0 918:17.33 libvirtd

The memory stays at 15m, so changing status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2011-1513.html