Bug 1143800 - Libvirtd was killed after several cycles creating/deleting external disk snapshot with glusterfs backend
Summary: Libvirtd was killed after several cycles creating/deleting external disk snapshot with glusterfs backend
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1093594
Blocks:
 
Reported: 2014-09-18 03:41 UTC by Shanzhi Yu
Modified: 2016-01-21 09:32 UTC (History)
7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-19 05:52:32 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
memoryleak (2.44 MB, text/plain)
2014-09-18 03:44 UTC, Shanzhi Yu
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2202 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2015-11-19 08:17:58 UTC

Description Shanzhi Yu 2014-09-18 03:41:33 UTC
Description of problem:

Libvirtd was killed after several cycles creating/deleting external disk snapshot with glusterfs backend


Version-Release number of selected component (if applicable):

libvirt-1.2.8-2.el7.x86_64

How reproducible:

100%

Steps to Reproduce:

1. Prepare a guest XML whose disk source file is on the gluster server
# cat rh7-g.xml


2. Prepare four snapshot files

# cat s1.xml
<domainsnapshot>
<name>s1</name>
<disks>
<disk name='vda' type='network'>
<driver type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/rhel7-qcow2.s1'>
<host name='10.66.x.xxx'/>
</source>
</disk>
</disks>
</domainsnapshot>

# for i in s2 s3 s4; do sed -e s/s1/$i/ s1.xml > $i.xml; done

This creates the backing chain:
rhel7-qcow2.s4->rhel7-qcow2.s3->rhel7-qcow2.s2->rhel7-qcow2.s1->rhel7-qcow2.img
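For reference, step 2 can be reproduced end to end as a self-contained sketch (the gluster host stays the placeholder used in the report):

```shell
# Write the first snapshot descriptor; the host IP is the same
# placeholder as in the report, not a real address.
cat > s1.xml <<'EOF'
<domainsnapshot>
  <name>s1</name>
  <disks>
    <disk name='vda' type='network'>
      <driver type='qcow2'/>
      <source protocol='gluster' name='gluster-vol1/rhel7-qcow2.s1'>
        <host name='10.66.x.xxx'/>
      </source>
    </disk>
  </disks>
</domainsnapshot>
EOF

# Derive s2..s4 by substituting the snapshot name.
for i in s2 s3 s4; do
    sed -e "s/s1/$i/" s1.xml > "$i.xml"
done

ls s?.xml
```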

3. Start libvirtd under valgrind in one terminal

# free -m 
             total       used       free     shared    buffers     cached
Mem:          7461        283       7177          4          0         80
-/+ buffers/cache:        203       7258
Swap:         1023        194        829


# valgrind --leak-check=full libvirtd   2>&1 &>memoryleak

 
Killed
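Between cycles, the daemon's growth can be watched by sampling its resident set size from /proc. A sketch; it samples the current shell's own PID only so it runs anywhere, so substitute libvirtd's PID when reproducing:

```shell
# Sample the resident set size (VmRSS) of a process a few times.
# Replace pid with libvirtd's PID (e.g. pid=$(pidof libvirtd))
# when reproducing; $$ is used here only for illustration.
pid=$$
for n in 1 2 3; do
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
    echo "sample $n: VmRSS=${rss_kb} kB"
    sleep 1
done
```

A steadily rising VmRSS across cycles is what eventually gets the process killed.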

4. Run the cycle "define guest -> start guest -> sleep 30 -> create four external disk snapshots -> undefine guest -> destroy guest"

Libvirtd was killed on the fourth or fifth iteration:

# red='\e[0;31m';NC='\e[0m' ;for num in $(seq 1 100);do echo -e "${red}Try the $num time:${NC}";virsh define rh7-g.xml ;virsh start  rh7-g;sleep 30;for i in s1 s2 s3 s4;do virsh snapshot-create rh7-g  $i.xml --reuse-external --disk-only;done;virsh undefine rh7-g;virsh destroy rh7-g;done
Try the 1 time:
Domain rh7-g defined from rh7-g.xml

Domain rh7-g started

Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain rh7-g has been undefined

Domain rh7-g destroyed

Try the 2 time:
Domain rh7-g defined from rh7-g.xml
...


Try the 4 time:
Domain rh7-g defined from rh7-g.xml

Domain rh7-g started

Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
2014-09-18 03:33:59.419+0000: 3838: info : libvirt version: 1.2.8, package: 2.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2014-09-10-09:58:42, x86-019.build.eng.bos.redhat.com)
2014-09-18 03:33:59.419+0000: 3838: warning : virKeepAliveTimerInternal:143 : No response from client 0x7ff428a4bc60 after 6 keepalive messages in 36 seconds
2014-09-18 03:34:01.185+0000: 3837: warning : virKeepAliveTimerInternal:143 : No response from client 0x7ff428a4bc60 after 6 keepalive messages in 37 seconds
error: internal error: received hangup / error event on socket
error: Failed to reconnect to the hypervisor

I will attach the full valgrind log (memoryleak).

Actual results:

Libvirtd is killed (out of memory) after several snapshot create/delete cycles.

Expected results:

Libvirtd does not leak memory and keeps running.
Additional info:

A brief valgrind trace:

==28646== Memcheck, a memory error detector
==28646== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==28646== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==28646== Command: libvirtd
==28646== 
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==28711== 
==28711== HEAP SUMMARY:
==28711==     in use at exit: 92,591,877 bytes in 21,517 blocks
==28711==   total heap usage: 131,390 allocs, 109,873 frees, 251,304,470 bytes allocated
==28711== 
==28711== LEAK SUMMARY:
==28711==    definitely lost: 9,169 bytes in 76 blocks
==28711==    indirectly lost: 8,924 bytes in 40 blocks
==28711==      possibly lost: 44,796,701 bytes in 420 blocks
==28711==    still reachable: 47,777,083 bytes in 20,981 blocks
==28711==         suppressed: 0 bytes in 0 blocks
==28711== Rerun with --leak-check=full to see details of leaked memory
==28711== 
==28711== For counts of detected and suppressed errors, rerun with: -v
==28711== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==28899== 
==28899== HEAP SUMMARY:
==28899==     in use at exit: 1,391,222,243 bytes in 31,333 blocks
==28899==   total heap usage: 173,904 allocs, 142,571 frees, 1,560,240,340 bytes allocated
==28899== 
==28899== LEAK SUMMARY:
==28899==    definitely lost: 179,496 bytes in 1,013 blocks
==28899==    indirectly lost: 92,414 bytes in 369 blocks
==28899==      possibly lost: 940,368,330 bytes in 6,365 blocks
==28899==    still reachable: 450,582,003 bytes in 23,586 blocks
==28899==         suppressed: 0 bytes in 0 blocks
==28899== Rerun with --leak-check=full to see details of leaked memory
==28899== 
==28899== For counts of detected and suppressed errors, rerun with: -v
==28899== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==28900== could not unlink /tmp/vgdb-pipe-from-vgdb-to-28900-by-root-on-shyu_test_pc
==28900== could not unlink /tmp/vgdb-pipe-to-vgdb-from-28900-by-root-on-shyu_test_pc
==28900== could not unlink /tmp/vgdb-pipe-shared-mem-vgdb-28900-by-root-on-shyu_test_pc
==28646== Warning: noted but unhandled ioctl 0x89a2 with no size/direction hints
==28646==    This could cause spurious value errors to appear.
==28646==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==29140== 
==29140== HEAP SUMMARY:
==29140==     in use at exit: 2,734,695,264 bytes in 41,767 blocks
==29140==   total heap usage: 222,782 allocs, 181,015 frees, 2,922,844,432 bytes allocated
==29140== 
==29140== LEAK SUMMARY:
==29140==    definitely lost: 458,454 bytes in 2,125 blocks
==29140==    indirectly lost: 358,294,985 bytes in 3,078 blocks
==29140==      possibly lost: 1,701,563,207 bytes in 11,492 blocks
==29140==    still reachable: 674,378,618 bytes in 25,072 blocks
==29140==         suppressed: 0 bytes in 0 blocks
==29140== Rerun with --leak-check=full to see details of leaked memory
==29140== 
==29140== For counts of detected and suppressed errors, rerun with: -v
==29140== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)
==29332== 
==29332== HEAP SUMMARY:
==29332==     in use at exit: 4,033,802,861 bytes in 58,023 blocks
==29332==   total heap usage: 818,469 allocs, 760,446 frees, 4,414,285,420 bytes allocated
==29332== 
==29332== LEAK SUMMARY:
==29332==    definitely lost: 608,119 bytes in 3,234 blocks
==29332==    indirectly lost: 761,091,420 bytes in 6,014 blocks
==29332==      possibly lost: 1,970,413,033 bytes in 13,256 blocks
==29332==    still reachable: 1,301,690,289 bytes in 35,519 blocks
==29332==         suppressed: 0 bytes in 0 blocks
==29332== Rerun with --leak-check=full to see details of leaked memory
==29332== 
==29332== For counts of detected and suppressed errors, rerun with: -v
==29332== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 3)

Comment 1 Shanzhi Yu 2014-09-18 03:44:23 UTC
Created attachment 938739 [details]
memoryleak

Comment 2 Peter Krempa 2014-09-26 08:02:15 UTC
The issue is that glfs_fini() leaks the memory allocated by glfs_new(). This is a known issue in gluster: https://bugzilla.redhat.com/show_bug.cgi?id=1093594
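Until that gluster fix lands, valgrind runs like the one above will keep flagging this known libgfapi leak. It can be muted with a suppression file so that new libvirt-side leaks stand out. A sketch only; the frame list below is an assumption, so generate the exact stanza with `valgrind --gen-suppressions=all` (the `match-leak-kinds` line needs valgrind >= 3.9, which matches the 3.9.0 used above):

```
{
   libgfapi-glfs_new-known-leak
   Memcheck:Leak
   match-leak-kinds: all
   fun:*alloc
   ...
   fun:glfs_new
}
```

Pass it with `valgrind --suppressions=glfs.supp --leak-check=full libvirtd`.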

Comment 4 Peter Krempa 2015-03-02 14:03:04 UTC
Looks like the issue (https://bugzilla.redhat.com/show_bug.cgi?id=1093594) will be fixed in libgfapi soon. Moving to ON_QA. Once the libgfapi package is fixed, the memory leak should disappear.

Comment 5 Yang Yang 2015-09-06 09:58:51 UTC
Verified on libvirt-1.2.17-6.el7.x86_64 and glusterfs-server-3.7.3-1.el7.x86_64

Steps
1. Prepare a running guest with the following disk XML
vim virt-tests-vm1.xml
 <disk type='network' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source protocol='gluster' name='gluster-vol1/rhel7.qcow2'>
        <host name='10.66.xx.xx'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>

2. Prepare four snapshot files

# cat s1.xml
<domainsnapshot>
<name>s1</name>
<disks>
<disk name='vda' type='network'>
<driver type='qcow2'/>
<source protocol='gluster' name='gluster-vol1/rhel7-qcow2.s1'>
<host name='10.66.x.xxx'/>
</source>
</disk>
</disks>
</domainsnapshot>

#for i in s2 s3 s4;do sed -e s/s1/$i/ s1.xml > $i.xml ; done
#for i in s1 s2 s3 s4; do virsh snapshot-create virt-tests-vm1 $i --disk-only --no-metadata; done
create backing-chains:
rhel7-qcow2.s4->rhel7-qcow2.s3->rhel7-qcow2.s2->rhel7-qcow2.s1->rhel7-qcow2.img

3. Run the create/delete snapshot cycle 30 times
# red='\e[0;31m';NC='\e[0m' ;for num in $(seq 1 100);do echo -e "${red}Try the $num time:${NC}";virsh define virt-tests-vm1.xml ;virsh start  virt-tests-vm1;sleep 30;for i in s1 s2 s3 s4;do virsh snapshot-create virt-tests-vm1  $i.xml --reuse-external --disk-only --no-metadata;done;virsh undefine virt-tests-vm1;virsh destroy virt-tests-vm1;done

Try the 1 time:
Domain virt-tests-vm1 defined from virt-tests-vm1.xml

Domain virt-tests-vm1 started

Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain virt-tests-vm1 has been undefined

Domain virt-tests-vm1 destroyed
.............
Try the 30 time:
Domain virt-tests-vm1 defined from virt-tests-vm1.xml

Domain virt-tests-vm1 started

Domain snapshot s1 created from 's1.xml'
Domain snapshot s2 created from 's2.xml'
Domain snapshot s3 created from 's3.xml'
Domain snapshot s4 created from 's4.xml'
Domain virt-tests-vm1 has been undefined

Domain virt-tests-vm1 destroyed

Libvirtd was not killed.
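The survival check at the end of a run can be scripted. A sketch; it is demonstrated against the current shell's PID since libvirtd is not guaranteed to be present here:

```shell
# Print whether the given PID is still alive, using kill -0
# (signal 0 probes for existence without sending anything).
check_alive() {
    if kill -0 "$1" 2>/dev/null; then
        echo "process $1 is alive"
    else
        echo "process $1 is gone"
    fi
}

# Illustration against this shell's own PID; use
# check_alive "$(pidof libvirtd)" when verifying the fix.
check_alive "$$"
```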

Comment 7 errata-xmlrpc 2015-11-19 05:52:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

