Bug 1603025

Summary: Start/Destroy/Virsh edit guest with 'iscsi block disk'/ 'bridge interface' cause libvirtd memleak
Product: Red Hat Enterprise Linux 7 Reporter: chhu
Component: libvirtAssignee: John Ferlan <jferlan>
Status: CLOSED ERRATA QA Contact: chhu
Severity: high Docs Contact:
Priority: medium    
Version: 7.6CC: dyuan, fjin, lmen, tburke, xuzhang, yafu, yalzhang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-4.5.0-4.el7 Doc Type: No Doc Update
Doc Text:
undefined
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 09:58:24 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
libvirtd.memleak none

Description chhu 2018-07-19 02:50:32 UTC
Created attachment 1459845 [details]
libvirtd.memleak

Description of problem:
Start/Destroy/Virsh edit guest with 'iscsi block disk'/ 'bridge interface' cause libvirtd memleak

Version-Release number of selected component (if applicable):
libvirt-4.5.0-3.el7.x86_64
qemu-kvm-rhev-2.12.0-7.el7.x86_64
kernel: 3.10.0-920.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest with an iscsi disk with xml below:
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/sdb'/>
      <target dev='vda' bus='virtio'/>
      <serial>9ed3a403-6e0f-49d2-b856-f4fc11e772d5</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    ...
    <interface type='network'>
      <mac address='52:54:00:72:9a:74'/>
      <source network='default'/>
      <model type='rtl8139'/>
    </interface>

# iscsiadm -m discovery -t st -p ipaddress
...
ipaddress:3260,1 iqn.2017-06.com.redhat:remotedisk2

# iscsiadm -m node -T iqn.2017-06.com.redhat:remotedisk2 -l
Check there are iscsi disk /dev/sdb

2. Define the guest

3. Disable namespace in /etc/libvirt/qemu.conf
   namespaces = [  ]

4. Start valgrind on terminal for libvirtd:
# service libvirtd stop
# valgrind --leak-check=full libvirtd

5. Start valgrind on other terminal for virtlogd:
# service virtlogd stop
# valgrind --leak-check=full virtlogd

6. Start,destroy,vish edit, start the guest on another terminal

# virsh edit r7
Domain r7 XML configuration edited

# virsh dumpxml r7|grep interface -A 6
    <interface type='bridge'>
      <mac address='00:1a:4a:16:01:55'/>
      <source bridge='virbr0'/>
      <model type='virtio'/>
      <filterref filter='no-mac-spoofing'/>
      <link state='up'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>

# virsh start r7
Domain r7 started

Check the valgrind output: libvirtd.memleak
...
==22270== 240 bytes in 3 blocks are definitely lost in loss record 1,996 of 2,473
==22270==    at 0x4C2B955: calloc (vg_replace_malloc.c:711)
==22270==    by 0x52C2773: virAlloc (viralloc.c:144)
==22270==    by 0x52E02D3: virLastErrorObject (virerror.c:245)
==22270==    by 0x52E1E28: virResetLastError (virerror.c:497)
==22270==    by 0x54C028C: virConnectOpen (libvirt.c:1117)
==22270==    by 0x159779: remoteDispatchConnectOpen (remote_daemon_dispatch.c:1829)
==22270==    by 0x159779: remoteDispatchConnectOpenHelper (remote_daemon_dispatch_stubs.h:3325)
==22270==    by 0x5405D34: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==22270==    by 0x5405D34: virNetServerProgramDispatch (virnetserverprogram.c:304)
==22270==    by 0x540C50C: virNetServerProcessMsg (virnetserver.c:143)
==22270==    by 0x540C50C: virNetServerHandleJob (virnetserver.c:164)
==22270==    by 0x53441A0: virThreadPoolWorker (virthreadpool.c:167)
==22270==    by 0x5343527: virThreadHelper (virthread.c:206)
==22270==    by 0x8111DD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==22270==    by 0x8423EAC: clone (in /usr/lib64/libc-2.17.so)
==22270==
==22270== 688 bytes in 1 blocks are possibly lost in loss record 2,208 of 2,473
==22270==    at 0x4C2B955: calloc (vg_replace_malloc.c:711)
==22270==    by 0x40126C4: _dl_allocate_tls (in /usr/lib64/ld-2.17.so)
==22270==    by 0x81127AB: pthread_create@@GLIBC_2.2.5 (in /usr/lib64/libpthread-2.17.so)
==22270==    by 0x53438FF: virThreadCreateFull (virthread.c:241)
==22270==    by 0x5343F9E: virThreadPoolExpand (virthreadpool.c:201)
==22270==    by 0x53445AF: virThreadPoolNewFull (virthreadpool.c:253)
==22270==    by 0x540C980: virNetServerNew (virnetserver.c:362)
==22270==    by 0x129E7F: main (remote_daemon.c:1276)
==22270==
==22270== 968 bytes in 1 blocks are definitely lost in loss record 2,234 of 2,473
==22270==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==22270==    by 0x6E0FEE1: xmlGetGlobalState (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x6E0F7B4: __xmlKeepBlanksDefaultValue (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x6D9D1BC: xmlKeepBlanksDefault (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x53983A8: virDomainDefParse (domain_conf.c:20990)
==22270==    by 0x53984B1: virDomainDefCopy (domain_conf.c:28722)
==22270==    by 0x5398523: virDomainObjSetDefTransient (domain_conf.c:3372)
==22270==    by 0x31DE672F: qemuProcessInit (qemu_process.c:5255)
==22270==    by 0x31DE77E9: qemuProcessStart (qemu_process.c:6669)
==22270==    by 0x31E4A775: qemuDomainObjStart.constprop.50 (qemu_driver.c:7304)
==22270==    by 0x31E4AEC5: qemuDomainCreateWithFlags (qemu_driver.c:7357)
==22270==    by 0x54D4D9B: virDomainCreate (libvirt-domain.c:6531)
==22270==
==22270== 968 bytes in 1 blocks are definitely lost in loss record 2,235 of 2,473
==22270==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==22270==    by 0x6E0FEE1: xmlGetGlobalState (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x6E0F7B4: __xmlKeepBlanksDefaultValue (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x6D9D1BC: xmlKeepBlanksDefault (in /usr/lib64/libxml2.so.2.9.1)
==22270==    by 0x53983A8: virDomainDefParse (domain_conf.c:20990)
==22270==    by 0x31E322BE: qemuDomainDefineXMLFlags (qemu_driver.c:7405)
==22270==    by 0x54D42B9: virDomainDefineXMLFlags (libvirt-domain.c:6208)
==22270==    by 0x1361FC: remoteDispatchDomainDefineXMLFlags (remote_daemon_dispatch_stubs.h:4714)
==22270==    by 0x1361FC: remoteDispatchDomainDefineXMLFlagsHelper (remote_daemon_dispatch_stubs.h:4692)
==22270==    by 0x5405D34: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==22270==    by 0x5405D34: virNetServerProgramDispatch (virnetserverprogram.c:304)
==22270==    by 0x540C50C: virNetServerProcessMsg (virnetserver.c:143)
==22270==    by 0x540C50C: virNetServerHandleJob (virnetserver.c:164)
==22270==    by 0x53441A0: virThreadPoolWorker (virthreadpool.c:167)
==22270==    by 0x5343527: virThreadHelper (virthread.c:206)
...

Actal results:
In step6: Get libvirtd memleak

Expected results:
In step6: No libvirtd memleak


Additional info:
- libvirtd.memleak

Comment 2 chhu 2018-07-20 01:45:49 UTC
Do more test on the same machine with the same guest, the test is still running, will update more when the test is finished. 

Update current information here:

- Destroy and start the guest for more than 2100 times

# virsh start r7
Domain r7 started
# while :; do virsh destroy  r7; sleep 5; virsh start r7; sleep 5; done

Got libvirtd memleak:

==6423== 968 bytes in 1 blocks are definitely lost in loss record 2,065 of 2,297
==6423==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==6423==    by 0x6E0FEE1: xmlGetGlobalState (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x6E0F7B4: __xmlKeepBlanksDefaultValue (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x6D9D1BC: xmlKeepBlanksDefault (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x53983A8: virDomainDefParse (domain_conf.c:20990)
==6423==    by 0x31E322BE: qemuDomainDefineXMLFlags (qemu_driver.c:7405)
==6423==    by 0x54D42B9: virDomainDefineXMLFlags (libvirt-domain.c:6208)
==6423==    by 0x1361FC: remoteDispatchDomainDefineXMLFlags (remote_daemon_dispatch_stubs.h:4714)
==6423==    by 0x1361FC: remoteDispatchDomainDefineXMLFlagsHelper (remote_daemon_dispatch_stubs.h:4692)
==6423==    by 0x5405D34: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==6423==    by 0x5405D34: virNetServerProgramDispatch (virnetserverprogram.c:304)
==6423==    by 0x540C50C: virNetServerProcessMsg (virnetserver.c:143)
==6423==    by 0x540C50C: virNetServerHandleJob (virnetserver.c:164)
==6423==    by 0x53441A0: virThreadPoolWorker (virthreadpool.c:167)
==6423==    by 0x5343527: virThreadHelper (virthread.c:206)
==6423== 
==6423== 3,872 bytes in 4 blocks are definitely lost in loss record 2,206 of 2,297
==6423==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==6423==    by 0x6E0FEE1: xmlGetGlobalState (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x6E0F7B4: __xmlKeepBlanksDefaultValue (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x6D9D1BC: xmlKeepBlanksDefault (in /usr/lib64/libxml2.so.2.9.1)
==6423==    by 0x53983A8: virDomainDefParse (domain_conf.c:20990)
==6423==    by 0x53984B1: virDomainDefCopy (domain_conf.c:28722)
==6423==    by 0x5398523: virDomainObjSetDefTransient (domain_conf.c:3372)
==6423==    by 0x31DE672F: qemuProcessInit (qemu_process.c:5255)
==6423==    by 0x31DE77E9: qemuProcessStart (qemu_process.c:6669)
==6423==    by 0x31E4A775: qemuDomainObjStart.constprop.50 (qemu_driver.c:7304)
==6423==    by 0x31E4AEC5: qemuDomainCreateWithFlags (qemu_driver.c:7357)
==6423==    by 0x54D4D9B: virDomainCreate (libvirt-domain.c:6531)
==6423== 
==6423== 17,328 bytes in 1,083 blocks are definitely lost in loss record 2,275 of 2,297
==6423==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==6423==    by 0x83B20C9: strdup (in /usr/lib64/libc-2.17.so)
==6423==    by 0x533C144: virStrdup (virstring.c:977)
==6423==    by 0x54BDD53: virGetNWFilterBinding (datatypes.c:865)
==6423==    by 0x318D633C: nwfilterBindingCreateXML (nwfilter_driver.c:767)
==6423==    by 0x54F3FC5: virNWFilterBindingCreateXML (libvirt-nwfilter.c:701)
==6423==    by 0x539CE29: virDomainConfNWFilterInstantiate (domain_nwfilter.c:116)
==6423==    by 0x31E516C2: qemuInterfaceBridgeConnect (qemu_interface.c:589)
==6423==    by 0x31D98B56: qemuBuildInterfaceCommandLine (qemu_command.c:8418)
==6423==    by 0x31D9F783: qemuBuildNetCommandLine (qemu_command.c:8673)
==6423==    by 0x31D9F783: qemuBuildCommandLine (qemu_command.c:10354)
==6423==    by 0x31DE355F: qemuProcessLaunch (qemu_process.c:6292)
==6423==    by 0x31DE7881: qemuProcessStart (qemu_process.c:6686)
==6423== 
==6423== 17,328 bytes in 1,083 blocks are definitely lost in loss record 2,276 of 2,297
==6423==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==6423==    by 0x83B20C9: strdup (in /usr/lib64/libc-2.17.so)
==6423==    by 0x533C144: virStrdup (virstring.c:977)
==6423==    by 0x54BDD53: virGetNWFilterBinding (datatypes.c:865)
==6423==    by 0x318D641F: nwfilterBindingLookupByPortDev (nwfilter_driver.c:678)
==6423==    by 0x54F3B63: virNWFilterBindingLookupByPortDev (libvirt-nwfilter.c:593)
==6423==    by 0x539CBC5: virDomainConfNWFilterTeardownImpl.isra.0 (domain_nwfilter.c:136)
==6423==    by 0x539CFA5: virDomainConfVMNWFilterTeardown (domain_nwfilter.c:170)
==6423==    by 0x31DE5651: qemuProcessStop (qemu_process.c:6912)
==6423==    by 0x31E37974: qemuDomainDestroyFlags (qemu_driver.c:2229)
==6423==    by 0x54C24BB: virDomainDestroy (libvirt-domain.c:475)
==6423==    by 0x1589A2: remoteDispatchDomainDestroy (remote_daemon_dispatch_stubs.h:4827)
==6423==    by 0x1589A2: remoteDispatchDomainDestroyHelper (remote_daemon_dispatch_stubs.h:4803)

Comment 3 Fangge Jin 2018-07-20 14:14:54 UTC
@chhu please don't record all the memory leak in one single bug, it is not possible to fix all of them at once.

Comment 4 Fangge Jin 2018-07-20 14:17:21 UTC
@lmen please reassign the QA contact to the other feature owner, it doesn't belong to virsh, the leak is in libvirtd

Comment 5 John Ferlan 2018-07-20 15:07:24 UTC
There's one set of leaks appearing to come from virDomainDefParse(domain_conf.c:20990) as a result of     

    int keepBlanksDefault = xmlKeepBlanksDefault(0);

may be a red herring since I see libxml2 and xmlGetGlobalState in the trace indicating a global/library allocation (and no libvirt changes in the Define and Create XML code around there).

The larger leak from virGetNWFilterBinding is because virNWFilterBindingDispose neglected to VIR_FREE(binding->filtername);

Patch is posted:

https://www.redhat.com/archives/libvir-list/2018-July/msg01353.html

Comment 6 John Ferlan 2018-07-21 13:33:32 UTC
Do not believe the iscsi reference had any bearing, this seems to be a filter/binding issue purely. If there's a way to reproduce with iscsi, just generate another bz w/ just iscsi data.

Above patch is now pushed upstream:

$ git show
commit 329f2347d2215edb3925ec21c73bb52b7b5aa310 (HEAD -> master, origin/master, origin/HEAD, bz1603025)
Author: John Ferlan <jferlan>
Date:   Fri Jul 20 10:58:45 2018 -0400

    src: Fix memory leak in virNWFilterBindingDispose
    
...

    Commit b57a9aec neglected to VIR_FREE(binding->filtername) as seen
    in the following valgrind report
...


$ git describe 329f2347d2215edb3925ec21c73bb52b7b5aa310
v4.5.0-229-g329f2347d2
$

Comment 9 chhu 2018-08-01 09:44:28 UTC
Verified on packages:
libvirt-4.5.0-5.el7.x86_64
qemu-kvm-rhev-2.12.0-8.el7.x86_64

Test steps:
1. Prepare a guest with an iscsi disk with xml below:
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/sdb'/>
      <target dev='vda' bus='virtio'/>
      <serial>9ed3a403-6e0f-49d2-b856-f4fc11e772d5</serial>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
    ...
    <interface type='network'>
      <mac address='52:54:00:72:9a:74'/>
      <source network='default'/>
      <model type='rtl8139'/>
    </interface>

# iscsiadm -m discovery -t st -p ipaddress
...
ipaddress:3260,1 iqn.2017-06.com.redhat:remotedisk2

# iscsiadm -m node -T iqn.2017-06.com.redhat:remotedisk2 -l
Check there are iscsi disk /dev/sdb

2. Define the guest

3. Disable namespace in /etc/libvirt/qemu.conf
   namespaces = [  ]

4. Start valgrind on terminal for libvirtd:
# service libvirtd stop
# valgrind --leak-check=full libvirtd

5. Start valgrind on other terminal for virtlogd:
# service virtlogd stop
# valgrind --leak-check=full virtlogd

6. Start,destroy,vish edit, start the guest on another terminal

# virsh edit r7
Domain r7 XML configuration edited

# virsh dumpxml r7|grep interface -A 6
    <interface type='bridge'>
      <mac address='00:1a:4a:16:01:55'/>
      <source bridge='virbr0'/>
      <model type='virtio'/>
      <filterref filter='no-mac-spoofing'/>
      <link state='up'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>

# virsh start r7
Domain r7 started

7. Check the valgrind output, there is no memleak from "virGetNWFilterBinding".
   The memleak from "virDomainDefParse" is still there, according to the comment5, and above test steps, set the bug status to "VERIFIED".

Comment 11 errata-xmlrpc 2018-10-30 09:58:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3113