Bug 2016311

Summary: Few memory leak when fio on disk
Product: Red Hat Enterprise Linux 8
Reporter: qing.wang <qinwang>
Component: qemu-kvm
Assignee: Stefano Garzarella <sgarzare>
qemu-kvm sub component: virtio-blk,scsi
QA Contact: qing.wang <qinwang>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
CC: coli, jinzhao, juzhang, kkiwi, lijin, qzhang, virt-maint, xuwei, yanghliu
Version: 8.4
Keywords: Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: x86_64
OS: Unspecified
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2022-08-01 07:51:21 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Blocks: 2007036

Description qing.wang 2021-10-21 09:25:44 UTC
Description of problem:

There are a few small memory leaks when doing I/O on a disk; the leak size is not fixed.
It is usually hundreds or thousands of bytes, and the leak does not grow over a long run.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux release 8.4 (Ootpa)
4.18.0-305.el8.x86_64
qemu-kvm-5.2.0-16.module+el8.4.0+12596+209e4022.10.x86_64

or 

4.18.0-305.el8.x86_64
qemu-kvm-4.2.0-48.module+el8.4.0+10368+630e803b.x86_64
seabios-bin-1.13.0-2.module+el8.3.0+7353+9de0a3cc.noarch


How reproducible:
80%

Steps to Reproduce:
1. Create the VM command script:
cat mem-local.sh


/usr/libexec/qemu-kvm \
	-name 'avocado-vt-vm1'  \
	-machine pc  \
	-nodefaults  \
	-vga std  \
	-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
	-drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/rhel79-64-virtio-scsi.qcow2 \
	-device scsi-hd,id=image1,drive=drive_image1 \
	\
	-drive id=drive_image2,if=none,snapshot=off,aio=threads,cache=none,format=raw,file=/home/kvm_autotest_root/images/data1.raw \
	-device scsi-hd,id=image2,drive=drive_image2 \
	\
	-device virtio-net-pci,mac=9a:e0:e1:e2:e3:e3,id=idIRlhSc,vectors=4,netdev=ids4KA3w,bus=pci.0,addr=0x5  \
	-netdev tap,id=ids4KA3w,vhost=on \
	-m 4G  \
	\
	-vnc :5  \
	-rtc base=localtime,clock=host,driftfix=slew  \
	-boot menu=off,strict=off,order=cdn,once=c \
	-enable-kvm \
	-monitor stdio

2. Run valgrind to monitor for memory leaks:
valgrind --trace-children=yes --track-origins=yes --leak-check=full --show-leak-kinds=definite --log-file=/tmp/valgrind.log sh mem-local.sh

3. Log in to the guest and run fio:
fio --randrepeat=0 --iodepth=8 --size=5g --direct=0 --ioengine=libaio --filename=/home/x/test.dat --name=iometer --stonewall --bs=1M --rw=randrw --name=iometer_just_write --stonewall --bs=1M --rw=write --name=iometer_just_read --stonewall --bs=1M --rw=read 

4. Quit QEMU after fio has finished.

5. Check the valgrind log for leaks in I/O-related operations.
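
The log check in step 5 can be scripted. The following is a minimal sketch, not part of the original reproducer: it parses "definitely lost" records from valgrind log text and flags those whose stack passes through block-layer code. The helper names (parse_definite_leaks, is_io_related) and the list of source files treated as I/O related are assumptions chosen for illustration; the sample text is abbreviated from the records in this report.

```python
import re

# Loss-record header, e.g.
# "==84619== 304 bytes in 1 blocks are definitely lost in loss record 4,821 of 5,487"
RECORD_RE = re.compile(
    r"^==\d+== ([\d,]+) bytes in ([\d,]+) blocks are definitely lost")
# Stack frame, e.g. "==84619==    by 0x6F6965: qcow2_add_task (qcow2.c:2222)"
FRAME_RE = re.compile(
    r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.+?) \(([^()]*)\)$")

# Assumption: source files whose presence marks a stack as I/O related.
IO_SOURCE_FILES = {"qcow2.c", "io.c", "block-backend.c", "aio_task.c"}


def parse_definite_leaks(text):
    """Return one dict per 'definitely lost' record: bytes, blocks, frames."""
    records, current = [], None
    for raw in text.splitlines():
        line = raw.rstrip()
        header = RECORD_RE.match(line)
        if header:
            current = {
                "bytes": int(header.group(1).replace(",", "")),
                "blocks": int(header.group(2).replace(",", "")),
                "frames": [],
            }
            records.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current["frames"].append((frame.group(1), frame.group(2)))
        elif not frame:
            current = None  # a bare "==PID==" line or other text ends the record
    return records


def is_io_related(record):
    """True if any frame comes from one of the assumed block-layer files."""
    return any(loc.split(":")[0] in IO_SOURCE_FILES
               for _func, loc in record["frames"])


# Abbreviated from the records in this report.
SAMPLE = """\
==84619== 24 bytes in 1 blocks are definitely lost in loss record 2,078 of 5,487
==84619==    at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==84619==    by 0x7D7F64: get_relocated_path (cutils.c:944)
==84619==    by 0x63740C: qemu_init (vl.c:3971)
==84619==
==84619== 304 bytes in 1 blocks are definitely lost in loss record 4,821 of 5,487
==84619==    at 0x4C3721A: calloc (vg_replace_malloc.c:760)
==84619==    by 0x7D9F82: qemu_coroutine_new (coroutine-ucontext.c:198)
==84619==    by 0x6F6965: qcow2_add_task (qcow2.c:2222)
==84619==    by 0x755919: blk_aio_read_entry (block-backend.c:1464)
==84619==
"""

if __name__ == "__main__":
    for rec in parse_definite_leaks(SAMPLE):
        tag = "I/O" if is_io_related(rec) else "other"
        print(tag, rec["bytes"], "bytes via", rec["frames"][1][0])
```

To use it on a real run, read /tmp/valgrind.log and pass its contents to parse_definite_leaks. The file-name filter is deliberately crude and would need adjusting for other block drivers.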

Actual results:
==84619== 24 bytes in 1 blocks are definitely lost in loss record 2,078 of 5,487
==84619==    at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==84619==    by 0x5D732A5: g_malloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D8AEB6: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D8FB56: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D900C9: g_string_new (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x7D7F64: get_relocated_path (cutils.c:944)
==84619==    by 0x63740C: qemu_init (vl.c:3971)
==84619==    by 0x42DBBC: main (main.c:49)
==84619== 
==84619== 24 bytes in 1 blocks are definitely lost in loss record 2,079 of 5,487
==84619==    at 0x4C34F0B: malloc (vg_replace_malloc.c:307)
==84619==    by 0x5D732A5: g_malloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D8AEB6: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D8FB56: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x5D900C9: g_string_new (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x7D7F64: get_relocated_path (cutils.c:944)
==84619==    by 0x63772E: find_datadir (vl.c:2873)
==84619==    by 0x63772E: qemu_init (vl.c:3976)
==84619==    by 0x42DBBC: main (main.c:49)
==84619== 
==84619== 304 bytes in 1 blocks are definitely lost in loss record 4,821 of 5,487
==84619==    at 0x4C3721A: calloc (vg_replace_malloc.c:760)
==84619==    by 0x5D732FD: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.5600.4)
==84619==    by 0x7D9F82: qemu_coroutine_new (coroutine-ucontext.c:198)
==84619==    by 0x7D1D90: qemu_coroutine_create (qemu-coroutine.c:75)
==84619==    by 0x703D31: aio_task_pool_start_task (aio_task.c:94)
==84619==    by 0x6F6965: qcow2_add_task (qcow2.c:2222)
==84619==    by 0x6F8028: qcow2_co_preadv_part (qcow2.c:2320)
==84619==    by 0x727867: bdrv_driver_preadv.constprop.18 (io.c:1054)
==84619==    by 0x72B145: bdrv_aligned_preadv (io.c:1440)
==84619==    by 0x72B6E0: bdrv_co_preadv_part (io.c:1682)
==84619==    by 0x755866: blk_do_preadv (block-backend.c:1211)
==84619==    by 0x755919: blk_aio_read_entry (block-backend.c:1464)
==84619== 
==84619== LEAK SUMMARY:
==84619==    definitely lost: 352 bytes in 3 blocks
==84619==    indirectly lost: 0 bytes in 0 blocks
==84619==      possibly lost: 4,622 bytes in 47 blocks
==84619==    still reachable: 9,830,848 bytes in 25,360 blocks
==84619==                       of which reachable via heuristic:
==84619==                         newarray           : 1,536 bytes in 16 blocks
==84619==         suppressed: 0 bytes in 0 blocks


Expected results:
No leaks on I/O operations.

Additional info:
No leak on 8.5:
Red Hat Enterprise Linux release 8.5 Beta (Ootpa)
4.18.0-339.el8.x86_64
qemu-kvm-6.0.0-30.module+el8.5.0+12586+476da3e1.x86_64
seabios-bin-1.14.0-1.module+el8.4.0+8855+a9e237a9.noarch

Comment 3 Stefano Garzarella 2021-10-25 13:34:31 UTC
The fix for the first 2 blocks (24 bytes) is already merged upstream: https://gitlab.com/qemu-project/qemu/-/commit/b6d003dbee81f1bf419c7cceec0c4c358184a601

It's weird that we don't have these blocks reported on qemu-kvm-6.0.0-30.module+el8.5.0 because I don't see the patch being backported. It's included in 6.1, so qemu-kvm on RHEL 8.6.0 and RHEL 9.0 has the fix.

I'm investigating the third block (304 bytes).