Bug 1441932

Summary: Gluster operations fail with "another transaction is in progress" because volume delete acquires the lock and does not release it
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Bala Konda Reddy M <bmekala>
Component: glusterd    Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA QA Contact: Bala Konda Reddy M <bmekala>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rhgs-3.3    CC: nchilaka, rhinduja, rhs-bugs, storage-qa-internal, vbellur
Target Milestone: ---   
Target Release: RHGS 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: brick-multiplexing
Fixed In Version: glusterfs-3.8.4-23 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-09-21 04:37:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1441910, 1445408    
Bug Blocks: 1417151    

Description Bala Konda Reddy M 2017-04-13 06:37:03 UTC
Description of problem:
Unable to perform gluster volume status, start, or delete operations on the cluster

Version-Release number of selected component (if applicable):
3.8.4.21

How reproducible:


Steps to Reproduce:
1. Created the cluster and enabled brick multiplexing.
2. Created EC and distributed-replicate volumes.
3. Added bricks to the distributed-replicate volume and then removed bricks from the same volume, twice.
4. Stopped the volume; the stop succeeded.
5. Volume delete failed because a lock on the volume had been acquired and was never released (a minimal reproduction sketch follows below).
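
For reference, a minimal reproduction sketch of the steps above. The host names (server1..server3), brick paths, and volume layouts are hypothetical; the exact counts and paths from the original run are not recorded in this bug.

# Step 1: enable brick multiplexing cluster-wide (hypothetical hosts/paths below).
gluster volume set all cluster.brick-multiplex on

# Step 2: create an EC (disperse) volume and a distributed-replicate volume.
gluster volume create testvol-ec disperse 3 redundancy 1 server{1..3}:/bricks/brick0/testvol-ec_brick0
gluster volume create testvol replica 3 server{1..3}:/bricks/brick1/testvol_brick1
gluster volume start testvol-ec
gluster volume start testvol

# Step 3: add bricks to the dist-rep volume and remove them again
# (repeat this add/remove cycle a second time, as described above).
gluster volume add-brick testvol server{1..3}:/bricks/brick2/testvol_brick2
gluster volume remove-brick testvol server{1..3}:/bricks/brick2/testvol_brick2 start
gluster volume remove-brick testvol server{1..3}:/bricks/brick2/testvol_brick2 commit

# Steps 4-5: stop the volume (succeeds), then try to delete it
# (fails with "Another transaction is in progress for testvol").
gluster --mode=script volume stop testvol
gluster --mode=script volume delete testvol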

Actual results:

[root@dhcp37-135 ~]# gluster vol delete testvol
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: testvol: failed: Another transaction is in progress for testvol. Please try again after sometime.


Expected results:
Gluster operations (volume status, start, and delete) should succeed.


Additional info:
glusterd log

[2017-04-12 04:51:33.521131] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x36d85) [0x7eff3edb6d85] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4735) [0x7eff3ee54735] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7eff4a2c8105] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh --volname=testvol-ec --last=yes
[2017-04-12 04:51:33.529529] E [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x36d85) [0x7eff3edb6d85] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4696) [0x7eff3ee54696] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7eff4a2c8105] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh --volname=testvol-ec --last=yes
[2017-04-12 04:51:33.529677] I [socket.c:3492:socket_submit_request] 0-management: not connected (priv->connected = -1)
[2017-04-12 04:51:33.529734] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0x8 Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529780] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0x9 Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529805] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0xa Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529823] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0xb Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.600864] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-04-12 04:51:33.601001] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-04-12 04:51:33.601382] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 19756
[2017-04-12 04:51:34.601633] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-04-12 04:51:34.601786] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-04-12 04:51:34.601814] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-04-12 04:51:34.601868] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-04-12 04:51:34.601889] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-04-12 04:51:34.609297] W [glusterd-handler.c:5675:__glusterd_brick_rpc_notify] 0-management: got disconnect from stale rpc on /bricks/brick0/testvol_brick0
[2017-04-12 04:59:40.379360] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol
[2017-04-12 04:59:40.379522] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 04:59:40.379562] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 04:59:40.380043] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol-ec
[2017-04-12 04:59:40.381881] E [MSGID: 106301] [glusterd-syncop.c:1315:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume testvol-ec is not started
[2017-04-12 05:12:28.555698] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol
[2017-04-12 05:12:28.555849] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 05:12:28.555877] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 05:12:28.556369] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol-ec
[2017-04-12 05:12:28.558175] E [MSGID: 106301] [glusterd-syncop.c:1315:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume testvol-ec is not started
[2017-04-12 05:12:37.079892] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 05:12:37.079953] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 05:18:31.518899] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2

Comment 3 Atin Mukherjee 2017-04-13 08:02:22 UTC
upstream patch : https://review.gluster.org/#/c/17055/

Comment 5 Atin Mukherjee 2017-04-18 11:26:17 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/103636

Comment 7 Bala Konda Reddy M 2017-05-03 13:49:27 UTC
Build : 3.8.4-24

Followed the steps provided in the bug. After stopping the volume, the volume could be deleted without any issue (a brief verification sketch follows below).

Hence, marking the bug as verified.
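
For reference, a minimal verification sketch on the fixed build, reusing the hypothetical volume name from the reproduction sketch in the description:

# With brick multiplexing enabled and the same add-brick/remove-brick history,
# volume delete is expected to succeed immediately after volume stop.
gluster --mode=script volume stop testvol
gluster --mode=script volume delete testvol
gluster volume info testvol   # expected: volume no longer exists
gluster volume status         # expected: no "Another transaction is in progress" errors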

Comment 9 errata-xmlrpc 2017-09-21 04:37:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774