Description of problem:
=======================
When one storage node was powered off in a replicate volume (1x2), one VM became unresponsive and moved to the paused state. An attempt to run the VM again failed.

RHEVM error message:
====================
"VM afr_rep_vm3 has paused due to unknown storage error."

Client (hypervisor) log message when the VM was run:
====================================================
[2012-10-31 07:13:25.894853] W [fuse-bridge.c:1948:fuse_readv_cbk] 0-glusterfs-fuse: 4117398: READ => -1 (Transport endpoint is not connected)

Version-Release number of selected component (if applicable):
=============================================================
[10/31/12 - 13:14:02 root@rhs-client7 0ac09f23-dbbf-4b25-891d-b5e26d636d58]# rpm -qa | grep gluster
glusterfs-fuse-3.3.0rhsvirt1-8.el6rhs.x86_64
glusterfs-debuginfo-3.3.0rhsvirt1-8.el6rhs.x86_64
vdsm-gluster-4.9.6-14.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-3.3.0rhsvirt1-8.el6rhs.x86_64
glusterfs-server-3.3.0rhsvirt1-8.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-8.el6rhs.x86_64
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-geo-replication-3.3.0rhsvirt1-8.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-object-1.4.8-4.el6.noarch

[10/31/12 - 13:51:00 root@rhs-client7 0ac09f23-dbbf-4b25-891d-b5e26d636d58]# gluster --version
glusterfs 3.3.0rhsvirt1 built on Oct 28 2012 23:50:59

How reproducible:
=================

Steps to Reproduce:
===================
Refer to: https://tcms.engineering.redhat.com/case/205187/?from_plan=7048

Actual results:
===============
The VM moved into the paused state at step 15, when operations were performed on the VM.

Expected results:
=================
Performing operations on the VM should have succeeded.
Additional info:
================
[10/31/12 - 13:51:11 root@rhs-client7 0ac09f23-dbbf-4b25-891d-b5e26d636d58]# gluster v info replicate

Volume Name: replicate
Type: Replicate
Volume ID: 19270a9d-a664-4344-8adb-a4ff1909f7f6
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: rhs-client6.lab.eng.blr.redhat.com:/disk0
Brick2: rhs-client7.lab.eng.blr.redhat.com:/disk0
Options Reconfigured:
diagnostics.client-log-level: DEBUG
performance.quick-read: disable
performance.io-cache: disable
performance.stat-prefetch: disable
performance.read-ahead: disable
cluster.eager-lock: enable
storage.linux-aio: enable

Note: "client-log-level" was set only to debug this problem.
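For reference, the options shown under "Options Reconfigured" would typically have been applied with `gluster volume set`; a sketch, using the volume name and option keys from the output above (exact order of application is an assumption):

```shell
# Sketch: commands that would produce the "Options Reconfigured" list above.
# Assumes a started replicate volume named "replicate"; run on any server node.
gluster volume set replicate performance.quick-read disable
gluster volume set replicate performance.io-cache disable
gluster volume set replicate performance.stat-prefetch disable
gluster volume set replicate performance.read-ahead disable
gluster volume set replicate cluster.eager-lock enable
gluster volume set replicate storage.linux-aio enable
# Set only for debugging this problem, per the note above:
gluster volume set replicate diagnostics.client-log-level DEBUG
```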
Yet to start work on this bug.
How reproducible is this bug in your testing? 100%? 50%? If reproducible enough to discern, does it always occur at the same point in the test case? This appears similar to a recent report suspected to be related to selinux, but that one was ultimately not reproducible.

I've attempted to reproduce this a couple of times on a local rhev/rhs setup without success. I've run through the entire test case sequence a couple of times, as well as repeated the final recovery step (step 15, boot up a node into recovery while the VMs are being updated) a couple more times independently. One of the latter tests is still running at the moment. I've reproduced some hung task messages, but no paused VMs thus far.

I didn't see anything obvious in the logs that would explain the failure, though there is a lot of data (including expected failure output), so it's very possible I've missed something. I'll take a second look when I have a chance, but in the meantime I'd suggest we try to get access to an environment in this state if at all possible.
Per 31/01 tiger team bug triage meeting, reducing priority because we can resume from the paused state.
Tried to recreate the problem. The test case passed this time, with no VMs getting paused.
Dropping blocker tag as per program meeting on 03/11.
Targeting for 2.1.z (Big Bend) U1.
Can we run a round of tests? It's been 7 months since we last ran this test case.
per triage 12/13, removing from corbett list
To add to this bug: this issue was filed when server-side quorum and client-side quorum were not yet defaults in the virt profile. From RHSS 2.1 Update 2, we have enabled server-side quorum and client-side quorum in the virt profile, i.e. for virt-store volumes. Client-side quorum has a design constraint: the first brick of the replica group must be up. So in this case, when client-side quorum is enabled and the first brick goes down, the VMs on that volume go to the paused state, defeating fault tolerance; failure of the second brick, however, doesn't affect the VMs on that virt-store. Also tested the behavior without quorum enabled: when one of the bricks/nodes goes down, the other replica pair remains available and the App VMs stay up and running healthy.
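The quorum settings the virt profile enables can also be applied by hand with the standard quorum options; a sketch ("vmstore" is a hypothetical volume name, not one from this report):

```shell
# Sketch: enabling the quorum behavior described above on a virt-store volume.
gluster volume set vmstore cluster.server-quorum-type server   # server-side quorum
gluster volume set vmstore cluster.quorum-type auto            # client-side quorum
# Server-side quorum is governed by a cluster-wide ratio, set on "all":
gluster volume set all cluster.server-quorum-ratio 51%
```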
What is the status of the bug fix for this problem? My customer has just filed a support ticket #01079904 for the same problem. Thanks, Jin
More info: Customer is using:
RHEV 3.3.1-0.48.el6ev
RHS glusterfs 3.4.0.59rhs

He has a replicated volume on a two-node RHSS cluster:
gluster-node-0
gluster-node-1

He is using a "GlusterFS" Storage Domain configured as below:
Path: gluster-node-0.example.com:/TCC-RHEV
VFS Type: glusterfs
Mount options: backup-volfile-servers=gluster-node-1.example.com

He reported that the VM got paused when he manually took down the "gluster-node-0" node, but the VM works fine when he manually takes down node "gluster-node-1".

In addition, I know that when "PosixFS" is used, RHEV uses the Gluster FUSE client to mount the Gluster volume on RHEV-H, but in this case he is using the "GlusterFS" type Storage Domain. Do the "Mount Options" actually do anything in a "GlusterFS" Storage Domain (is it using libgfapi?)? Should the customer use "PosixFS" or "GlusterFS" in RHEV?
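For context, the storage domain settings above correspond roughly to a FUSE mount like the following (a sketch using the customer's node and volume names; whether the GlusterFS domain type uses libgfapi instead is the open question above):

```shell
# Sketch of the FUSE mount equivalent to the storage domain configuration above.
# backup-volfile-servers only lets the client fetch the volume file from
# gluster-node-1 when gluster-node-0 is unreachable at mount time; it does not
# change which brick failures can pause running VMs.
mount -t glusterfs \
  -o backup-volfile-servers=gluster-node-1.example.com \
  gluster-node-0.example.com:/TCC-RHEV /mnt/tcc-rhev
```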
hi Jin Zhou, Did the customer enable client-quorum by any chance? If it is enabled then this behavior is expected. Could you please check gluster volume info output to confirm the same. Pranith
Customer is using the default "virt" profile, so I think by default client quorum is set to "auto" and server quorum is set to "server". The part I don't understand is what caused the difference between a failure of gluster-node-0 and of gluster-node-1. I would expect the client quorum to be enforced regardless of which brick/node goes down. Why is the VM only suspended when gluster-node-0 goes offline, not gluster-node-1? SATHEESARAN's note above seems to indicate this behavior, but is not detailed enough for me. Lastly, since we only officially support replica=2 today, what is the rationale for enabling client quorum as "auto"? It seems useless to me, but I could be wrong. Thanks
Hi Jin Zhou, client-quorum is calculated in the following way: in the general case, quorum is met when n/2 + 1 bricks of the replica set are available; but if the number of bricks is even and exactly n/2 bricks in the replica set are up, then quorum is met only if the first brick in the set is up. The reason client-quorum is enabled by default is that an image going into split-brain is much worse than losing availability when the first brick goes down. Without any quorum, VMs are accessible when 1) both bricks are up, 2) only the first brick in the replica set is down, or 3) only the second brick is down. With client quorum, the VMs are accessible in cases 1) and 3). Pranith
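The rule above can be sketched as a small function (an illustration only, not GlusterFS code; the function and argument names are hypothetical):

```shell
# Sketch of the client-quorum rule described above.
# Arguments: total bricks in the replica set, bricks currently up,
# and 1/0 for whether the first brick is among those up.
quorum_met() {
  local n=$1 up=$2 first_up=$3
  if [ "$up" -gt $((n / 2)) ]; then
    echo yes   # strict majority (n/2 + 1) is up
  elif [ $((n % 2)) -eq 0 ] && [ "$up" -eq $((n / 2)) ] && [ "$first_up" -eq 1 ]; then
    echo yes   # even set, exactly half up, and the first brick is up
  else
    echo no
  fi
}

quorum_met 2 2 1   # both bricks up            -> yes
quorum_met 2 1 1   # only second brick down    -> yes
quorum_met 2 1 0   # first brick down          -> no  (VMs pause)
```

This matches the behavior reported for the customer's replica-2 volume: losing gluster-node-0 (the first brick) drops quorum, losing gluster-node-1 does not.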
Tested with the RHGS 3.1 nightly build (glusterfs-3.7.1-11.el7rhgs) with the following test:
1. Used a replica 2 volume to back the RHEV data domain.
2. Powered off one of the nodes abruptly and observed that the VMs are still accessible and available.

Marking this bug as VERIFIED.
Hi Pranith, The doc text is updated. Please review the same and share your technical review comments. If it looks ok, then sign-off on the same. Regards, Bhavana
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days