Bug 1118256

Summary: qemu-kvm exhausts host memory and does not release it when many snapshots are created in one chain over the glusterfs:native backend
Product: Red Hat Enterprise Linux 6
Reporter: Qian Guo <qiguo>
Component: glusterfs
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WONTFIX
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.6
CC: juzhang, michen, mkenneth, qzhang, rbalakri, rpacheco, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Cloned to: 1118612
Environment:
Last Closed: 2017-12-06 11:33:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 1118612    

Description Qian Guo 2014-07-10 10:01:41 UTC
Description of problem:
When snapshots are created one after another, qemu-kvm consumes a growing amount of memory that is never released.

Version-Release number of selected component (if applicable):
# rpm -q qemu-kvm-rhev
qemu-kvm-rhev-0.12.1.2-2.430.el6.x86_64
# rpm -q qemu-img-rhev
qemu-img-rhev-0.12.1.2-2.430.el6.x86_64
# uname -r
2.6.32-489.el6.x86_64
# rpm -qa |grep gluster
glusterfs-api-3.6.0.24-1.el6.x86_64
glusterfs-rdma-3.6.0.24-1.el6.x86_64
glusterfs-devel-3.6.0.24-1.el6.x86_64
glusterfs-cli-3.6.0.24-1.el6.x86_64
glusterfs-libs-3.6.0.24-1.el6.x86_64
glusterfs-api-devel-3.6.0.24-1.el6.x86_64
glusterfs-debuginfo-3.6.0.24-1.el6.x86_64
glusterfs-fuse-3.6.0.24-1.el6.x86_64
glusterfs-3.6.0.24-1.el6.x86_64



How reproducible:
100%

Steps to Reproduce:
1. Boot the guest:
/usr/libexec/qemu-kvm -cpu Penryn -m 4G -smp 4 -M pc -enable-kvm -name rhel6u4 -nodefaults -nodefconfig -vga std -monitor stdio -drive file=gluster://10.66.9.152/gv0/rhel5.11cp2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native,cache=none -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 -vnc :20 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-netpci0,mac=54:52:1b:36:1a:02 -qmp unix:/tmp/q1,server,nowait
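Before creating snapshots, the image URL can be sanity-checked from the host with qemu-img. This is only a sketch, assuming the qemu-img-rhev build listed above includes the gluster block driver:

# qemu-img info gluster://10.66.9.152/gv0/rhel5.11cp2.qcow2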

2. Over QMP, create live snapshots one by one (a scripted variant of these ten commands is sketched after the list):
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn1", "format": "qcow2" } }

{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn2", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn3", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn4", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn5", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn6", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn7", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn8", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn9", "format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "drive-virtio-disk0", "snapshot-file": "gluster://10.66.9.152/gv0/sn10", "format": "qcow2" } }


Actual results:
While the live snapshots are being created, the memory consumed by qemu-kvm keeps growing; after the 10th snapshot is created (a loop for sampling this growth is sketched after the free output below):

# top -p 13313

top - 05:59:10 up 1 day,  3:35,  7 users,  load average: 0.02, 0.07, 0.15
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.1%sy,  0.0%ni, 99.8%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7932288k total,  7804340k used,   127948k free,     1200k buffers
Swap:  8077308k total,  1176472k used,  6900836k free,    93140k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
13313 root      20   0 22.9g 7.1g 4844 S 16.0 94.2   2:13.89 qemu-kvm         

# free -g
             total       used       free     shared    buffers     cached
Mem:             7          7          0          0          0          0
-/+ buffers/cache:          7          0
Swap:            7          1          6
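For reference, the resident set size of qemu-kvm can be sampled during the test with a loop like the one below. This is only a sketch, reusing the qemu-kvm PID 13313 from the top output above:

while kill -0 13313 2>/dev/null; do
    # print a timestamp plus RSS and VSZ (both in kB) for the qemu-kvm process
    echo "$(date +%T) RSS_kB=$(ps -o rss= -p 13313) VSZ_kB=$(ps -o vsz= -p 13313)"
    sleep 10
done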


Expected results:
qemu-kvm should release the memory after the snapshots are created.

Additional info:
the "info block" after tests:
(qemu) info block
drive-virtio-disk0: removable=0 io-status=ok file=gluster://10.66.9.152/gv0/sn10 backing_file=gluster://10.66.9.152/gv0/sn9 ro=0 drv=qcow2 encrypted=0 bps=0 bps_rd=0 bps_wr=0 iops=0 iops_rd=0 iops_wr=0

Comment 2 Qian Guo 2014-07-11 06:47:51 UTC
Tested the same scenario with a local file and the issue does not occur, so it is specific to the glusterfs:native backend; I updated the summary accordingly.

Additional info:
# rpm -qa |grep gluster
glusterfs-fuse-3.6.0.22-1.el6rhs.x86_64
glusterfs-rdma-3.6.0.22-1.el6rhs.x86_64
glusterfs-libs-3.6.0.22-1.el6rhs.x86_64
glusterfs-3.6.0.22-1.el6rhs.x86_64
glusterfs-cli-3.6.0.22-1.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.22-1.el6rhs.x86_64
glusterfs-api-3.6.0.22-1.el6rhs.x86_64
glusterfs-server-3.6.0.22-1.el6rhs.x86_64


This bug is not a regression: the same issue is hit with qemu-kvm-rhev-0.12.1.2-2.426.el6.x86_64.

Comment 5 Jan Kurik 2017-12-06 11:33:55 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/