Bug 1235496

Summary: [RHEV-RHGS] App VMs go to paused state after a few rebalance operations
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: distribute
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED DUPLICATE
QA Contact: SATHEESARAN <sasundar>
Severity: high
Priority: unspecified
Version: rhgs-3.1
CC: nbalacha
Keywords: ZStream
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Environment: RHEV 3.5.3, RHEL 6.6 host as hypervisor, RHGS 3.1 nightly build (glusterfs-3.7.1-4.el6rhs)
Type: Bug
Last Closed: 2015-08-03 13:53:17 UTC

Description SATHEESARAN 2015-06-25 01:55:02 UTC
Description of the problem
--------------------------

A distributed-replicate gluster volume was used as a virtual machine image store in RHEV. Two app VMs were created with their disk images on the gluster volume. After a few rebalance operations triggered by add-brick and remove-brick, the app VMs went into a paused state.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.7.1-4.el6rhs

How Reproducible:
-----------------
Always

Actual Result:
--------------
After a few rebalance operations, the app VMs went into a paused state

Expected Result:
----------------
After rebalance, the app VMs should remain in the running state, with no errors seen during their normal operation

Comment 2 SATHEESARAN 2015-06-25 02:10:44 UTC
Missed the steps to reproduce in comment 0.

Steps to reproduce:
-------------------
0. Add RHGS nodes to a gluster-enabled cluster in RHEVM
1. Create a 2x2 distributed-replicate volume and start it
2. Optimize the volume for virt-store and set RHEV ownership on the volume
3. Use this volume as a Data Domain of type glusterfs (this makes the gluster volume the VM image store in RHEV)
4. Create a few app VMs in RHEV with their root disk images on the gluster volume
5. Install an OS in the app VMs
6. Add more bricks to the gluster volume and perform a rebalance (see the CLI sketch after this list)
7. After every rebalance operation, check the state of the app VMs
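
For reference, a minimal gluster CLI sketch of steps 1, 2, and 6, assuming four RHGS nodes rhgs1-rhgs4, brick paths under /rhgs, and a volume named vmstore (all hypothetical); the virt group and storage.owner-* settings are the usual virt-store optimization for RHEV (vdsm/kvm use uid/gid 36):

<snip>
# Step 1: create a 2x2 distributed-replicate volume and start it
gluster volume create vmstore replica 2 \
    rhgs1:/rhgs/brick1 rhgs2:/rhgs/brick1 \
    rhgs3:/rhgs/brick1 rhgs4:/rhgs/brick1
gluster volume start vmstore

# Step 2: optimize for virt-store and hand ownership to vdsm/kvm (36:36)
gluster volume set vmstore group virt
gluster volume set vmstore storage.owner-uid 36
gluster volume set vmstore storage.owner-gid 36

# Step 6: add a replica pair and rebalance
gluster volume add-brick vmstore rhgs1:/rhgs/brick2 rhgs2:/rhgs/brick2
gluster volume rebalance vmstore start
gluster volume rebalance vmstore status
</snip>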

Comment 3 SATHEESARAN 2015-06-25 02:14:08 UTC
Observed a few errors in the QEMU logs:

QEMU logs record per-VM errors and are located under '/var/log/libvirt/qemu'. The following is the error message from appvm1, /var/log/libvirt/qemu/appvm1.log:

<snip>
block I/O error in device 'drive-virtio-disk0': Transport endpoint is not connected (107)
block I/O error in device 'drive-virtio-disk0': Transport endpoint is not connected (107)
block I/O error in device 'drive-virtio-disk0': Transport endpoint is not connected (107)
block I/O error in device 'drive-virtio-disk0': Transport endpoint is not connected (107)
</snip>
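
Error 107 here is ENOTCONN, i.e. the qemu process (through the gluster fuse mount) lost its connection to the bricks during rebalance. One way to correlate, assuming the default client log location under /var/log/glusterfs and a vdsm mount path of the form rhev-data-center-mnt-glusterSD-* (path is an assumption, adjust to the setup):

<snip>
# error 107 == ENOTCONN; look for disconnects in the fuse client log
# (the log file name is derived from the mount path)
grep -iE 'disconnect|not connected' \
    /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log | tail
</snip>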

Comment 5 SATHEESARAN 2015-06-25 10:52:50 UTC
I tried to reproduce this issue in a QEMU/KVM + RHGS integration environment and
did not see it.

I will retry with RHEV + RHGS integration.

I am removing the blocker flag, as this issue is not yet reliably reproducible.

Comment 7 SATHEESARAN 2015-06-25 16:45:38 UTC
Removing the REGRESSION keyword, as this issue is not always hit, as mentioned in comment 5.

Comment 9 SATHEESARAN 2015-07-15 05:22:26 UTC
I couldn't reproduce this issue again after a few attempts.
In any case, at the end of regression testing I will CLOSE this bug if it is no longer reproducible.

Thanks

Comment 10 SATHEESARAN 2015-08-03 13:53:17 UTC
This issue is caused by the problem found in https://bugzilla.redhat.com/show_bug.cgi?id=1243542

Closing this bug as a DUPLICATE of https://bugzilla.redhat.com/show_bug.cgi?id=1243542

*** This bug has been marked as a duplicate of bug 1243542 ***