Bug 852147

Summary: glusterd operations hang if the other peers are down
Product: Red Hat Gluster Storage Reporter: Vidya Sakar <vinaraya>
Component: glusterd Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED ERRATA QA Contact: Sachidananda Urs <surs>
Severity: unspecified Docs Contact:
Priority: high    
Version: 2.0 CC: amarts, gluster-bugs, nsathyan, pkarampu, racpatel, rfortier, rhs-bugs, spandura, vbellur
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.4.0.2rhs-1.el6rhs Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 847214 Environment:
Last Closed: 2013-09-23 18:38:58 EDT Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 847214    
Bug Blocks: 858476    

Description Vidya Sakar 2012-08-27 13:34:55 EDT
+++ This bug was initially created as a clone of Bug #847214 +++

Description of problem:
Ran a volume set operation while the other peers in the cluster were down. The op-sm (glusterd's operation state machine) hung.
The op-sm is stuck in an endless self-transition:
Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:25]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:28]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:28]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:31]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:31]
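
The transition records above follow a fixed layout, so a stuck state machine can be spotted mechanically. The following is a minimal sketch in Python and is not part of glusterd: the function name `find_self_loops`, the regular expression, and the `min_repeats` threshold are assumptions made for illustration. It only counts records whose old and new state are identical.

```python
import re
from collections import Counter

# Matches one transition record in the layout shown in this report.
RECORD = re.compile(
    r"Old State:\s*\[(?P<old>[^\]]*)\]\s*"
    r"New State:\s*\[(?P<new>[^\]]*)\]\s*"
    r"Event\s*:\s*\[(?P<event>[^\]]*)\]\s*"
    r"timestamp:\s*\[(?P<ts>[^\]]*)\]"
)


def find_self_loops(log_text, min_repeats=3):
    """Return {(state, event): count} for self-transitions seen at least
    min_repeats times. min_repeats is an arbitrary, hypothetical threshold."""
    counts = Counter()
    for match in RECORD.finditer(log_text):
        if match["old"] == match["new"]:
            counts[(match["old"], match["event"])] += 1
    return {key: n for key, n in counts.items() if n >= min_repeats}


# Two of the records from this report, repeated to mimic the stuck loop.
SAMPLE = """
Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:25]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:28]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_START_UNLOCK]
timestamp: [2012-08-10 06:10:28]
"""

if __name__ == "__main__":
    for (state, event), n in find_self_loops(SAMPLE).items():
        print(f"{state} repeated {n} times on {event}")
```

A few self-transitions are not necessarily a bug, so the threshold should be tuned to the log being inspected.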


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
This is the setup in which I hit the problem, but I think it can be triggered even with 2 machines.
1. Have a cluster with 3 machines.
2. Bring two of the glusterds down.
3. Execute any glusterd operation command which uses op-sm; I used volume set.
  
Actual results:
The operation hangs after commit-op.

Expected results:
The volume set operation should have succeeded.
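
When scripting the reproduction steps above, it can help to bound how long the command is allowed to run, so that a hang is reported instead of blocking the script. The following is a minimal sketch in Python, not something taken from this bug: the helper name `run_with_timeout` and the timeout value are assumptions, and the volume name and option in the commented-out call are hypothetical placeholders.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return ("finished", returncode), or ("hung", None)
    if it does not exit within timeout_s seconds."""
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "hung", None
    return "finished", done.returncode


# Hypothetical use on the node whose glusterd is still running
# ("testvol" and the option/value are placeholders, not from this report):
# run_with_timeout(
#     ["gluster", "volume", "set", "testvol", "performance.cache-size", "64MB"],
#     timeout_s=300,
# )

if __name__ == "__main__":
    # Self-contained demonstration using a child Python process that sleeps.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```

This only reports whether the command returned in time; it does not tell whether the hang is in the CLI or in glusterd itself, so the op-sm log still needs to be checked.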

Additional info:
Comment 2 Amar Tumballi 2012-11-29 03:46:46 EST
*** Bug 861919 has been marked as a duplicate of this bug. ***
Comment 3 Amar Tumballi 2012-11-29 03:46:54 EST
*** Bug 860568 has been marked as a duplicate of this bug. ***
Comment 4 Amar Tumballi 2012-12-28 03:57:44 EST
http://review.gluster.org/4297 is a prerequisite for fixing this. Will fix 'all' the commands once this is fixed for volume status.
Comment 5 krishnan parthasarathi 2013-01-09 02:55:48 EST
*** Bug 852295 has been marked as a duplicate of this bug. ***
Comment 6 Vijay Bellur 2013-02-03 14:53:12 EST
CHANGE: http://review.gluster.org/4295 (glusterd: Moved node rsp functions to glusterd-utils.c) merged in master by Anand Avati (avati@redhat.com)
Comment 7 Vijay Bellur 2013-02-03 14:54:30 EST
CHANGE: http://review.gluster.org/4296 (glusterd: Added syncop version of BRICK_OP) merged in master by Anand Avati (avati@redhat.com)
Comment 8 Vijay Bellur 2013-02-03 14:56:31 EST
CHANGE: http://review.gluster.org/4297 (glusterd: Made volume-status use synctask framework) merged in master by Anand Avati (avati@redhat.com)
Comment 9 Vijay Bellur 2013-02-08 15:03:56 EST
CHANGE: http://review.gluster.org/4494 (glusterd: Made volume-statedump use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 10 Vijay Bellur 2013-02-08 16:29:12 EST
CHANGE: http://review.gluster.org/4492 (glusterd: Made volume-delete use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 11 Vijay Bellur 2013-02-08 16:29:30 EST
CHANGE: http://review.gluster.org/4491 (glusterd: Made volume-stop use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 12 Vijay Bellur 2013-02-08 17:08:36 EST
CHANGE: http://review.gluster.org/4490 (glusterd : Made volume clear-locks use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 13 Vijay Bellur 2013-02-08 17:21:28 EST
CHANGE: http://review.gluster.org/4489 (glusterd: Made volume-sync use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 14 Vijay Bellur 2013-02-08 22:04:47 EST
CHANGE: http://review.gluster.org/4488 (glusterd : Made volume-set use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 15 Vijay Bellur 2013-02-08 22:06:04 EST
CHANGE: http://review.gluster.org/4474 (glusterd: Making volume-reset use synctask framework) merged in master by Anand Avati (avati@redhat.com)
Comment 16 Vijay Bellur 2013-02-08 22:09:43 EST
CHANGE: http://review.gluster.org/4473 (glusterd: Made gsync set use synctask framework) merged in master by Anand Avati (avati@redhat.com)
Comment 17 Vijay Bellur 2013-02-13 20:47:21 EST
CHANGE: http://review.gluster.org/4478 (glusterd: Made log-rotate use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 18 Vijay Bellur 2013-02-16 23:41:25 EST
CHANGE: http://review.gluster.org/4495 (glusterd: Made volume-quota use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 19 Vijay Bellur 2013-02-17 01:32:36 EST
CHANGE: http://review.gluster.org/4493 (glusterd: Made volume-heal use synctask framework.) merged in master by Anand Avati (avati@redhat.com)
Comment 20 Vijay Bellur 2013-02-19 21:58:01 EST
CHANGE: http://review.gluster.org/4507 (glusterd: Made gd_synctask_begin less 'monolithic' in terms of LOC.) merged in master by Anand Avati (avati@redhat.com)
Comment 24 Sachidananda Urs 2013-08-02 01:24:24 EDT
Verified on: glusterfs 3.4.0.14rhs built on Jul 30 2013 09:09:36
Comment 25 Scott Haines 2013-09-23 18:38:58 EDT
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html