Description of problem:
Libvirtd was killed after several cycles of creating/deleting external disk snapshots with a glusterfs backend.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest XML whose disk source file is on a gluster server
# cat rh7-g.xml

2. Prepare four snapshot files
# cat s1.xml
<domainsnapshot>
  <name>s1</name>
  <disks>
    <disk name='vda' type='network'>
      <driver type='qcow2'/>
      <source protocol='gluster' name='gluster-vol1/rhel7-qcow2.s1'>
        <host name='10.66.x.xxx'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>

# for i in s2 s3 s4; do sed -e s/s1/$i/ s1.xml > $i.xml; done

Create the backing chain:
rhel7-qcow2.s4->rhel7-qcow2.s3->rhel7-qcow2.s2->rhel7-qcow2.s1->rhel7-qcow2.img

3. Start libvirtd in one terminal
# free -m
             total       used       free     shared    buffers     cached
Mem:          7461        283       7177          4          0         80
-/+ buffers/cache:        203       7258
Swap:         1023        194        829

# valgrind --leak-check=full libvirtd 2>&1 &>memoryleak
Killed

4. Run the "define guest -> start guest -> sleep 30 -> create four external disk snapshots -> undefine guest -> destroy guest" cycle. Libvirtd was killed on the fourth or fifth cycle.
# red='\e[0;31m';NC='\e[0m' ;for num in $(seq 1 100);do echo -e "${red}Try the $num time:${NC}";virsh define rh7-g.xml ;virsh start rh7-g;sleep 30;for i in s1 s2 s3 s4;do virsh snapshot-create rh7-g $i.xml --reuse-external --disk-only;done;virsh undefine rh7-g;virsh destroy rh7-g;done
Try the 1 time:
Domain rh7-g defined from rh7-g.xml
Domain rh7-g started
Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain rh7-g has been undefined
Domain rh7-g destroyed
Try the 2 time:
Domain rh7-g defined from rh7-g.xml
...
Try the 4 time:
Domain rh7-g defined from rh7-g.xml
Domain rh7-g started
Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
2014-09-18 03:33:59.419+0000: 3838: info : libvirt version: 1.2.8, package: 2.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2014-09-10-09:58:42, x86-019.build.eng.bos.redhat.com)
2014-09-18 03:33:59.419+0000: 3838: warning : virKeepAliveTimerInternal:143 : No response from client 0x7ff428a4bc60 after 6 keepalive messages in 36 seconds
2014-09-18 03:34:01.185+0000: 3837: warning : virKeepAliveTimerInternal:143 : No response from client 0x7ff428a4bc60 after 6 keepalive messages in 37 seconds
error: internal error: received hangup / error event on socket
error: Failed to reconnect to the hypervisor

I will attach memoryleak.

Actual results:

Expected results:

Additional info:
A brief valgrind trace:
==28646== Memcheck, a memory error detector
==28646== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==28646== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==28646== Command: libvirtd
==28646==
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==28711==
==28711== HEAP SUMMARY:
==28711==     in use at exit: 92,591,877 bytes in 21,517 blocks
==28711==   total heap usage: 131,390 allocs, 109,873 frees, 251,304,470 bytes allocated
==28711==
==28711== LEAK SUMMARY:
==28711==    definitely lost: 9,169 bytes in 76 blocks
==28711==    indirectly lost: 8,924 bytes in 40 blocks
==28711==      possibly lost: 44,796,701 bytes in 420 blocks
==28711==    still reachable: 47,777,083 bytes in 20,981 blocks
==28711==         suppressed: 0 bytes in 0 blocks
==28711== Rerun with --leak-check=full to see details of leaked memory
==28711==
==28711== For counts of detected and suppressed errors, rerun with: -v
==28711== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==28899==
==28899== HEAP SUMMARY:
==28899==     in use at exit: 1,391,222,243 bytes in 31,333 blocks
==28899==   total heap usage: 173,904 allocs, 142,571 frees, 1,560,240,340 bytes allocated
==28899==
==28899== LEAK SUMMARY:
==28899==    definitely lost: 179,496 bytes in 1,013 blocks
==28899==    indirectly lost: 92,414 bytes in 369 blocks
==28899==      possibly lost: 940,368,330 bytes in 6,365 blocks
==28899==    still reachable: 450,582,003 bytes in 23,586 blocks
==28899==         suppressed: 0 bytes in 0 blocks
==28899== Rerun with --leak-check=full to see details of leaked memory
==28899==
==28899== For counts of detected and suppressed errors, rerun with: -v
==28899== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==28900== could not unlink /tmp/vgdb-pipe-from-vgdb-to-28900-by-root-on-shyu_test_pc
==28900== could not unlink /tmp/vgdb-pipe-to-vgdb-from-28900-by-root-on-shyu_test_pc
==28900== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-28900-by-root-on-shyu_test_pc
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==29140==
==29140== HEAP SUMMARY:
==29140==     in use at exit: 2,734,695,264 bytes in 41,767 blocks
==29140==   total heap usage: 222,782 allocs, 181,015 frees, 2,922,844,432 bytes allocated
==29140==
==29140== LEAK SUMMARY:
==29140==    definitely lost: 458,454 bytes in 2,125 blocks
==29140==    indirectly lost: 358,294,985 bytes in 3,078 blocks
==29140==      possibly lost: 1,701,563,207 bytes in 11,492 blocks
==29140==    still reachable: 674,378,618 bytes in 25,072 blocks
==29140==         suppressed: 0 bytes in 0 blocks
==29140== Rerun with --leak-check=full to see details of leaked memory
==29140==
==29140== For counts of detected and suppressed errors, rerun with: -v
==29140== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==29332==
==29332== HEAP SUMMARY:
==29332==     in use at exit: 4,033,802,861 bytes in 58,023 blocks
==29332==   total heap usage: 818,469 allocs, 760,446 frees, 4,414,285,420 bytes allocated
==29332==
==29332== LEAK SUMMARY:
==29332==    definitely lost: 608,119 bytes in 3,234 blocks
==29332==    indirectly lost: 761,091,420 bytes in 6,014 blocks
==29332==      possibly lost: 1,970,413,033 bytes in 13,256 blocks
==29332==    still reachable: 1,301,690,289 bytes in 35,519 blocks
==29332==         suppressed: 0 bytes in 0 blocks
==29332== Rerun with --leak-check=full to see details of leaked memory
==29332==
==29332== For counts of detected and suppressed errors, rerun with: -v
==29332== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
Created attachment 938739 [details] memoryleak
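The heap growth recorded in the valgrind log can be pulled out with a short filter. This is only a minimal sketch: it assumes the log contains the standard memcheck "in use at exit" lines quoted in the trace, and the function name heap_in_use is hypothetical (not part of valgrind or libvirt).

```shell
# Hypothetical helper: print "<pid> <bytes in use at exit>" for every
# HEAP SUMMARY found in a valgrind memcheck log.
# Relies on grep -o (available in GNU and BSD grep).
heap_in_use() {
    grep -o '==[0-9]*== *in use at exit: [0-9,]* bytes' "$1" |
        sed -e 's/^==\([0-9]*\)== *in use at exit: \([0-9,]*\) bytes$/\1 \2/' \
            -e 's/,//g'
}

# Usage (file name taken from the reproduction steps):
#   heap_in_use memoryleak
```

For the four summaries quoted in the report, heap in use at exit grows from 92,591,877 bytes to 4,033,802,861 bytes.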
The issue is that glfs_fini() leaks the memory allocated by glfs_new(). This is a known issue in gluster: https://bugzilla.redhat.com/show_bug.cgi?id=1093594
It looks like the underlying issue (https://bugzilla.redhat.com/show_bug.cgi?id=1093594) will be fixed in libgfapi soon. Moving to ON_QA. Once the libgfapi package is fixed, the memory leak should disappear.
Verified on libvirt-1.2.17-6.el7.x86_64 and glusterfs-server-3.7.3-1.el7.x86_64.

Steps:
1. Prepare a running guest with the following XML
# vim virt-tests-vm1.xml
<disk type='network' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source protocol='gluster' name='gluster-vol1/rhel7.qcow2'>
    <host name='10.66.xx.xx'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>

2. Prepare four snapshot files
# cat s1.xml
<domainsnapshot>
  <name>s1</name>
  <disks>
    <disk name='vda' type='network'>
      <driver type='qcow2'/>
      <source protocol='gluster' name='gluster-vol1/rhel7-qcow2.s1'>
        <host name='10.66.x.xxx'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>

# for i in s2 s3 s4; do sed -e s/s1/$i/ s1.xml > $i.xml ; done
# for i in s1 s2 s3 s4; do virsh snapshot-create virt-tests-vm1 $i --disk-only --no-metadata; done

Create the backing chain:
rhel7-qcow2.s4->rhel7-qcow2.s3->rhel7-qcow2.s2->rhel7-qcow2.s1->rhel7-qcow2.img

3. Create/delete the snapshots 30 times
# red='\e[0;31m';NC='\e[0m' ;for num in $(seq 1 100);do echo -e "${red}Try the $num time:${NC}";virsh define virt-tests-vm1.xml ;virsh start virt-tests-vm1;sleep 30;for i in s1 s2 s3 s4;do virsh snapshot-create virt-tests-vm1 $i.xml --reuse-external --disk-only --no-metadata;done;virsh undefine virt-tests-vm1;virsh destroy virt-tests-vm1;done
Try the 1 time:
Domain virt-tests-vm1 defined from virt-tests-vm1.xml
Domain virt-tests-vm1 started
Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain virt-tests-vm1 has been undefined
Domain virt-tests-vm1 destroyed
.............
Try the 30 time:
Domain virt-tests-vm1 defined from virt-tests-vm1.xml
Domain virt-tests-vm1 started
Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain virt-tests-vm1 has been undefined
Domain virt-tests-vm1 destroyed

Libvirtd was not killed.
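Surviving the loop shows that the daemon is no longer killed, but not by itself that memory use stays flat. One way to watch for residual growth is to print the resident set size of libvirtd after each cycle. The following is only a sketch under stated assumptions: it is Linux-specific (it reads /proc), the helper name rss_kb is hypothetical, and it applies when libvirtd runs normally rather than under valgrind.

```shell
# Hypothetical helper: print the resident set size (in kB) of the process
# with the given PID, taken from the VmRSS field of /proc/<pid>/status.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Usage inside the verification loop (assumes a single libvirtd process):
#   pid=$(pidof libvirtd)
#   echo "cycle $num: libvirtd RSS $(rss_kb "$pid") kB"
```

A value that keeps climbing from cycle to cycle would suggest the leak is still present; a value that levels off is consistent with the fix.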
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2202.html