| Summary: | glusterfs memory leak when doing fio test with native gluster backend | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Xiaomei Gao <xigao> |
| Component: | glusterfs | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> |
| Status: | CLOSED WONTFIX | QA Contact: | storage-qa-internal <storage-qa-internal> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 6.5 | CC: | areis, chayang, gpo+redhat, juzhang, michen, mkenneth, mzhan, qzhang, rbalakri, rpacheco, virt-maint, wquan, yama |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-06 11:58:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
(In reply to Asias He from comment #2)
> Xiaomei, I can not reproduce this with gluster 3.4.0.34 on my test machine.
> Could you test against the latest gluster package.

I could still reproduce the issue on the latest version.

- Host version
kernel-2.6.32-431.el6.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.3.x86_64
glusterfs-libs-3.4.0.36rhs-1.el6.x86_64
glusterfs-api-3.4.0.36rhs-1.el6.x86_64
glusterfs-3.4.0.36rhs-1.el6.x86_64

- Guest version
kernel-2.6.32-431.el6.x86_64

- Before running the fio test
[root@dell-op780-06 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7615        424       7190          0          6         44
-/+ buffers/cache:        372       7242
Swap:         2047          0       2047

- After running the fio test
[root@dell-op780-06 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          7615       7514        100          0          0         17
-/+ buffers/cache:       7496        119
Swap:         2047        528       1519

Qemu-kvm sometimes core dumps. Please check the following info.
(gdb) bt full
#0 0x00007fda2b3d278a in _int_free (av=0x7fda2b6e9e80, p=0x7fda41b2a460, have_lock=0) at malloc.c:5005
size = 16777344
fb = <value optimized out>
nextchunk = 0x7fda42b2a4e0
nextsize = 2097168
nextinuse = <value optimized out>
prevsize = <value optimized out>
bck = 0x7fda2fc48000
fwd = 0x7fda296e9ed8
errstr = 0x0
locked = 1
#1 0x00007fda2af22402 in synctask_destroy (task=0x7fda2fc5c010) at syncop.c:148
No locals.
#2 0x00007fda2af227d0 in syncenv_processor (thdata=0x7fda2f9cbb40) at syncop.c:389
env = 0x7fda2f9ca4c0
proc = 0x7fda2f9cbb40
task = <value optimized out>
#3 0x00007fda2ddf19d1 in start_thread (arg=0x7fda0cb3c700) at pthread_create.c:301
__res = <value optimized out>
pd = 0x7fda0cb3c700
now = <value optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140574492706560, 4723731760081897745, 140575051375456, 140574492707264, 0, 3,
-4739466426402109167, -4739393486156363503}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0,
cleanup = 0x0, canceltype = 0}}}
not_first_call = <value optimized out>
pagesize_m1 = <value optimized out>
sp = <value optimized out>
freesize = <value optimized out>
#4 0x00007fda2b442b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
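If this scenario is re-tested, a bit more detail could be pulled from such a core dump. The following is only a rough sketch, assuming core dumps are enabled on the host; the core_pattern, output paths, and the <pid> placeholder are illustrative, not taken from the test machine:

# ulimit -c unlimited
# echo '/var/crash/core.%e.%p' > /proc/sys/kernel/core_pattern
# debuginfo-install -y glusterfs glusterfs-libs glusterfs-api qemu-kvm
# gdb -batch -ex 'thread apply all bt full' /usr/libexec/qemu-kvm /var/crash/core.qemu-kvm.<pid> > qemu-core-bt.txt

With the glusterfs debuginfo packages installed, a full-thread backtrace would make it easier to see what the other syncenv processor threads were doing at the moment _int_free aborted in synctask_destroy.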
This is almost certainly an issue in libglusterfs, rather than qemu itself (both the leak, and comment #5). For the issue described in comment #5, that sounds like bug #1010638.

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available. The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation.

Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/
Description of problem:
While testing glusterfs performance with fuse bypass (native gluster backend), there is a memory leak on the host. The same test works well with a fuse mount. The issue happens with both the virtio-blk and virtio-scsi drivers.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.401.el6.x86_64
kernel-2.6.32-418.el6.x86_64
glusterfs-3.4.0.24rhs-1.el6rhs.x86_64

How reproducible:
1/4

Steps to Reproduce:
1. Testbed:
 - Hardware: 1 client (4 CPU * 8 GB); 2 servers (8 CPU * 16 GB); private network is 1-Gbit
 - Setup: 1 Gluster volume made up of 1 brick (on SSD) from each server; single replication enabled
 - Client KVM image: 2 VCPUs * 4 GB RAM; cache=none; aio=threads
2. Create the image with fuse bypass on the gluster client.
# /usr/bin/qemu-img create -f raw gluster://192.168.0.17:24007/gv1/storage2.raw 40G
3. Boot the guest with the data disk.
# /usr/libexec/qemu-kvm \
 -drive file='/home/RHEL-Server-6.5-64.raw',if=none,id=virtio-scsi0-id0,media=disk,cache=none,snapshot=off,format=raw,aio=threads \
 -device scsi-hd,drive=virtio-scsi0-id0 \
 -drive file='gluster://192.168.0.17:24007/gv1/storage2.raw',if=none,id=virtio-scsi2-id1,media=disk,cache=none,snapshot=off,format=raw,aio=threads \
 -m 4096 \
 -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
 ...
4. In the guest:
# i=`/bin/ls /dev/[vs]db` && mkfs.ext4 $i -F > /dev/null; partprobe; umount /mnt; mount $i /mnt && echo 3 > /proc/sys/vm/drop_caches && sleep 3
# fio --rw=%s --bs=%s --iodepth=%s --runtime=1m --direct=1 --filename=/mnt/%s --name=job1 --ioengine=libaio --thread --group_reporting --numjobs=16 --size=512MB --time_based --ioscheduler=deadline

Actual results:
- Before running the job, host memory shows:
# free -m
             total       used       free     shared    buffers     cached
Mem:          7615        177       7437          0          8         34
-/+ buffers/cache:        134       7480
Swap:         2047          0       2047
- The whole job runs for about one hour; after roughly 30 minutes the free memory on the host drops to ~200 MB and the host eventually hangs.
- Please refer to the log:
http://kvm-perf.englab.nay.redhat.com/results/3510-autotest/dell-op780-06.qe.lab.eng.nay.redhat.com/debug/client.0.log

Expected results:
There is no memory leak.

Additional info:
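When re-running this scenario, a simple way to confirm that the growth is inside the qemu-kvm process (which links libgfapi) rather than in the page cache is to log the process RSS alongside free -m for the duration of the fio job. This is only a minimal sketch; the log path and sampling interval are arbitrary:

while true; do
    {
        date
        # host-wide view, same as the before/after snapshots above
        free -m | grep -E '^(Mem|Swap)'
        # per-process view of the running qemu-kvm instance
        ps -C qemu-kvm -o pid=,rss=,vsz=
    } >> /tmp/qemu-mem.log
    sleep 60
done

If the qemu-kvm RSS climbs in step with the drop in free memory, the leak is inside the process; if the RSS stays flat while free memory falls, the growth is elsewhere on the host.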