Bug 1731329

Summary: Memory leak when external snapshot create and shallow blockcommit cycle
Product: Red Hat Enterprise Linux 7 Reporter: Han Han <hhan>
Component: libvirtAssignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA QA Contact: yisun
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.7CC: dyuan, fjin, jdenemar, lmen, pkrempa, xuzhang, yisun
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-4.5.0-24.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1731332 (view as bug list) Environment:
Last Closed: 2020-03-31 19:58:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1731332    
Attachments:
Description Flags
the script, domain xml&cmdline, origin data, leak graph
none
valgrind.log none

Description Han Han 2019-07-19 06:45:37 UTC
Created attachment 1591907 [details]
the script, domain xml&cmdline, origin data, leak graph

Description of problem:
As subject

Version-Release number of selected component (if applicable):
libvirt-4.5.0-23.virtcov.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a running vm
2. Run the python script(after install psutils and matplotlib), to create snapshot and do blockcommit for the vm:

# ./main.py -e "./snapshot-commit.sh PC" -c 2000 -p "`pidof qemu-kvm libvirtd`" -i 0.2
3. Check the libvirtd ram usage graph, which shows a rising sawteeth line:

Han Han <hhan>
	
Attachments10:29 AM (4 hours ago)
	
to s3-bug-review, yisun
Description of problem:
As subject

Version-Release number of selected component (if applicable):
libvirt-4.5.0-23.virtcov.el7.x86_64
qemu-kvm-rhev-2.12.0-33.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a running vm
2. Run the python script(after install psutils and matplotlib), to create snapshot and do blockcommit for the vm:

# ./main.py -e "./snapshot-commit.sh PC" -c 2000 -p "`pidof qemu-kvm libvirtd`" -i 0.2
3. Check the libvirtd ram usage graph, which shows a rising sawteeth line:
See the libvirt.png


Actual results:
As above. About 1540 Byte leak every cycle.

Expected results:
No memory leak

Comment 2 Peter Krempa 2019-07-19 12:49:22 UTC
Upstream fixed the leak of certain parts of metadata when redetecting the backing chain which is most probable cause for the leak:
commit f0430d069af991475de6fa83ed62a45f8669c645
Author: Peter Krempa <pkrempa>
Date:   Thu Jul 18 16:32:44 2019 +0200

    util: storage: Don't leak metadata on repeated calls of virStorageFileGetMetadata
    
    When querying storage metadata after a block job we re-run
    virStorageFileGetMetadata on the top level storage file. This means that
    the workers (virStorageFileGetMetadataInternal) must not overwrite any
    pointers without freeing them.
    
    This was not considered for src->compat and src->features. Fix it and
    add a comment mentioning that.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

commit 5b8e64f0bcbbab826cff5be1b0adb000923abfb4
Author: Peter Krempa <pkrempa>
Date:   Thu Jul 18 16:30:18 2019 +0200

    util: storage: Clean up label use in virStorageFileGetMetadataInternal
    
    The function does not do any cleanup, so replace the 'cleanup' label
    with return of -1 and the 'done' label with return of 0.
    
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>
    Reviewed-by: Michal Privoznik <mprivozn>

Comment 5 yisun 2019-10-11 13:00:22 UTC
Created attachment 1624733 [details]
valgrind.log

Comment 6 yisun 2019-10-12 07:04:03 UTC
Hi Peter, 
When I do shallow blockcommit, the valgrind shows following info, is this a problem? The whole valgrind log is attached as 'valgrind.log'
==4846== 32 bytes in 1 blocks are definitely lost in loss record 994 of 2,442
==4846==    at 0x4C2C089: calloc (vg_replace_malloc.c:762)
==4846==    by 0x820F60F: _dlerror_run (in /usr/lib64/libdl-2.17.so)
==4846==    by 0x820F040: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
==4846==    by 0x53CCB42: virModuleLoadFile (virmodule.c:51)
==4846==    by 0x53CCB42: virModuleLoad (virmodule.c:117)
==4846==    by 0x54136AB: virStorageFileLoadBackendModule (virstoragefilebackend.c:68)
==4846==    by 0x5413716: UnknownInlinedFun (virstoragefilebackend.c:79)
==4846==    by 0x5413716: virStorageFileBackendOnce (virstoragefilebackend.c:89)
==4846==    by 0x841820A: __pthread_once_slow (in /usr/lib64/libpthread-2.17.so)
==4846==    by 0x541EDFF: virOnce (virthread.c:48)
==4846==    by 0x54138E5: UnknownInlinedFun (virstoragefilebackend.c:89)
==4846==    by 0x54138E5: virStorageFileBackendForType (virstoragefilebackend.c:121)
==4846==    by 0x540DDD8: virStorageFileGetBackendForSupportCheck (virstoragefile.c:4374)
==4846==    by 0x5412359: virStorageFileSupportsSecurityDriver (virstoragefile.c:4413)
==4846==    by 0x32280E2E: qemuSecurityChownCallback (qemu_driver.c:290)

Comment 7 yisun 2019-10-12 07:05:37 UTC
the libvirt i used is 'libvirt-4.5.0-27.el7.x86_64'

Comment 8 Peter Krempa 2019-10-23 06:00:14 UTC
That isn't a problem. The code which leaks the pointer is executed in virOnce, thus it's guaranteed to be executed only once per process lifetime. Thus the memory won't leak any more than that.

Additionally the leak comes from virModuleLoad which calls dlopen. We specifically must leak that pointer since it is associated with the loaded dynamic module. Freeing it would mean that the module would be unloaded which is not desired.

Comment 9 yisun 2019-10-23 06:07:10 UTC
(In reply to Peter Krempa from comment #8)
> That isn't a problem. The code which leaks the pointer is executed in
> virOnce, thus it's guaranteed to be executed only once per process lifetime.
> Thus the memory won't leak any more than that.
> 
> Additionally the leak comes from virModuleLoad which calls dlopen. We
> specifically must leak that pointer since it is associated with the loaded
> dynamic module. Freeing it would mean that the module would be unloaded
> which is not desired.

Thx for the confirmation. So set this bz as verified now.

Comment 11 errata-xmlrpc 2020-03-31 19:58:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1094