Bug 1055747

Summary: CLI shows another transaction in progress when one node in cluster is abruptly shut down
Product: [Community] GlusterFS
Reporter: Paul Cuzner <pcuzner>
Component: glusterd
Assignee: bugs <bugs>
Status: CLOSED EOL
Severity: low
Priority: low
Version: 3.5.0
CC: amukherj, bugs, george.lian, pcuzner
Target Milestone: ---
Target Release: ---
Keywords: Triaged
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-06-17 15:57:04 UTC
Attachments:
glusterd log file from one of the nodes showing the locking issue.

Description Paul Cuzner 2014-01-20 20:47:45 UTC
Created attachment 852843 [details]
glusterd log file from one of the nodes showing the locking issue.

Description of problem:
Using glusterfs-3.5.0-0.3.beta1.el6 on RHEL 6.5, I have a 4-way cluster (VMs) and a distributed volume with one brick from each node.

Test: simulate abrupt loss of a node by performing a forced shutdown of one node in the cluster.

Result: the CLI is either unresponsive or inaccurate, i.e. if you're in the gluster console at the time you have to break out.

Attempting a vol status returns:

"Another transaction is in progress. Please try again after sometime."

At this point, peer status and pool list still show the node that was powered off as part of the cluster.

After 5 minutes, vol status is still not working and the peer information remains out of date.
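
For reference, the CLI checks described above amount to running something like the following on one of the surviving nodes (a rough sketch of the session; exact output will vary):

  # gluster volume status
  Another transaction is in progress. Please try again after sometime.
  # gluster peer status        <- still lists the powered-off node as connected
  # gluster pool list          <- likewise still shows the dead peer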


Version-Release number of selected component (if applicable):
[root@glfs35-1 ~]# rpm -qa | grep gluster
glusterfs-fuse-3.5.0-0.3.beta1.el6.x86_64
glusterfs-devel-3.5.0-0.3.beta1.el6.x86_64
glusterfs-api-3.5.0-0.3.beta1.el6.x86_64
glusterfs-3.5.0-0.3.beta1.el6.x86_64
glusterfs-cli-3.5.0-0.3.beta1.el6.x86_64
glusterfs-server-3.5.0-0.3.beta1.el6.x86_64
glusterfs-libs-3.5.0-0.3.beta1.el6.x86_64


How reproducible:
The test was performed 4 times; the issue occurred in 3 of them.

Steps to Reproduce:
1. Power off a node (not an orderly shutdown - see the example below)
2. Observe CLI behaviour
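
One way to simulate the abrupt power-off in step 1 (an illustrative sketch; the guest name is hypothetical and assumes the nodes are libvirt/KVM guests):

  # From the hypervisor, hard power-off a guest (no orderly shutdown):
  virsh destroy glfs35-2

  # Or, from inside the node itself (requires SysRq to be enabled):
  echo o > /proc/sysrq-trigger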


Actual results:


Expected results:
I've done the same test on RHS 2.1u1 and this issue does not occur there.

Additional info:
Errors in the attached log indicate lock acquisition problems.
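
For anyone triaging, the relevant messages can be pulled out of the attached glusterd log with something like the following (the log path is the usual 3.x default and is an assumption for this setup):

  grep -iE "lock|another transaction" /var/log/glusterfs/etc-glusterfs-glusterd.vol.log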

Comment 2 Niels de Vos 2016-06-17 15:57:04 UTC
This bug is being closed because version 3.5 is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bug fixes if you are still facing this issue in a more current release.