Bug 1179136 - glusterd: Gluster rebalance status returns failure
Summary: glusterd: Gluster rebalance status returns failure
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1130158 1154635
Blocks: glusterfs-3.6.3
 
Reported: 2015-01-06 09:54 UTC by Atin Mukherjee
Modified: 2016-01-08 09:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1154635
Environment:
Last Closed: 2016-01-08 09:18:34 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Anand Avati 2015-01-06 09:55:31 UTC
REVIEW: http://review.gluster.org/9393 (glusterd : release cluster wide locks in op-sm during failures) posted (#1) for review on release-3.6 by Atin Mukherjee (amukherj)

Comment 2 Anand Avati 2015-01-11 11:57:31 UTC
REVIEW: http://review.gluster.org/9393 (glusterd : release cluster wide locks in op-sm during failures) posted (#2) for review on release-3.6 by Atin Mukherjee (amukherj)

Comment 3 Anand Avati 2015-03-04 07:31:12 UTC
COMMIT: http://review.gluster.org/9393 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit b646678334f4fab78883ecc1b993ec0cb1b49aba
Author: Atin Mukherjee <amukherj>
Date:   Mon Oct 27 12:12:03 2014 +0530

    glusterd : release cluster wide locks in op-sm during failures
    
    The glusterd op-sm infrastructure has loopholes in handling error cases in
    the locking/unlocking phases, which leave stale locks behind that block
    further transactions.
    
    This patch still doesn't handle all possible unlocking error cases, as the
    framework has neither a retry mechanism nor a lock timeout. For example, if
    unlocking fails on one of the peers, the cluster-wide lock is not released
    and no further transaction can be made until the originator node or the
    node where unlocking failed is restarted.
    
    Following test cases were executed (with the help of gdb) after applying this
    patch:
    
    * RPC times out in lock cbk
    * Decoding of RPC response in lock cbk fails
    * RPC response is received from unknown peer in lock cbk
    * Setting peerinfo in dictionary fails while sending lock request for first peer
      in the list
    * Setting peerinfo in dictionary fails while sending lock request for other
      peers
    * Lock RPC could not be sent for peers
    
    For all of the above test cases, the success criterion is that no stale
    locks remain.
    
    Patch link : http://review.gluster.org/9012
    
    Change-Id: Ia1550341c31005c7850ee1b2697161c9ca04b01a
    BUG: 1179136
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/9012
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/9393
    Reviewed-by: Raghavendra Bhat <raghavendra>
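
The behaviour described in the commit message can be illustrated with a small model. This is only a sketch in Python, not glusterd code: the `Peer` class, `run_transaction`, and the `fail_lock`/`fail_unlock` switches are hypothetical names invented for illustration. It assumes the fix amounts to "release every lock already taken when the lock phase fails", and it also shows the limitation the commit message admits to, namely that a failed unlock leaves a stale lock because there is no retry and no timeout.

```python
class Peer:
    """Hypothetical stand-in for a peer that holds one cluster-lock slot."""

    def __init__(self, name, fail_lock=False, fail_unlock=False):
        self.name = name
        self.fail_lock = fail_lock      # simulate a failing lock RPC
        self.fail_unlock = fail_unlock  # simulate a failing unlock RPC
        self.owner = None               # originator currently holding the lock

    def lock(self, originator):
        if self.fail_lock:
            raise RuntimeError(f"lock RPC to {self.name} failed")
        if self.owner is not None:
            raise RuntimeError(f"{self.name} already locked by {self.owner}")
        self.owner = originator

    def unlock(self, originator):
        if self.fail_unlock:
            raise RuntimeError(f"unlock RPC to {self.name} failed")
        if self.owner == originator:
            self.owner = None


def run_transaction(originator, peers, op):
    """Lock all peers, run op, and always try to unlock what was locked.

    Returns (ok, stale), where stale names the peers that still hold a
    lock owned by this originator after the transaction ends.
    """
    locked = []
    ok = True
    try:
        for peer in peers:
            peer.lock(originator)
            locked.append(peer)
        op()
    except RuntimeError:
        ok = False  # lock phase or operation failed; fall through to cleanup
    finally:
        for peer in locked:
            try:
                peer.unlock(originator)
            except RuntimeError:
                # No retry and no lock timeout in this model: the lock
                # on this peer stays stale.
                ok = False
    stale = [p.name for p in peers if p.owner == originator]
    return ok, stale
```

In this model, a lock failure on the second of three peers makes `run_transaction` return a failed status with an empty stale list, which matches the success criterion stated in the commit message. Setting `fail_unlock=True` on a peer instead leaves that peer in the stale list, mirroring the case the patch does not cover.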

