Bug 1441932

Summary: Gluster operations fail with "another transaction is in progress" because volume delete acquires the lock and does not release it
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Bala Konda Reddy M <bmekala>
Component: glusterd    Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA QA Contact: Bala Konda Reddy M <bmekala>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rhgs-3.3    CC: nchilaka, rhinduja, rhs-bugs, storage-qa-internal, vbellur
Target Milestone: ---   
Target Release: RHGS 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: brick-multiplexing
Fixed In Version: glusterfs-3.8.4-23 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-09-21 04:37:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1441910, 1445408    
Bug Blocks: 1417151    

Description Bala Konda Reddy M 2017-04-13 06:37:03 UTC
Description of problem:
Unable to perform gluster volume status, start, or delete operations on the cluster

Version-Release number of selected component (if applicable):
3.8.4.21

How reproducible:


Steps to Reproduce:
1. Created the cluster and enabled brick multiplexing.
2. Created EC and distributed-replicate volumes.
3. Added bricks to the distributed-replicate volume and then removed bricks from the same volume, twice.
4. Stopped the volume; the stop succeeded.
5. Volume delete failed because a lock on the volume had been acquired and was never released (a minimal reproduction sketch follows below).
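
For reference, a minimal reproduction sketch of the steps above. The host names (server1..server3), brick paths, and volume layouts are hypothetical; the exact counts and paths from the original run are not recorded in this bug.

# Step 1: enable brick multiplexing cluster-wide (hypothetical hosts/paths below).
gluster volume set all cluster.brick-multiplex on

# Step 2: create an EC (disperse) volume and a distributed-replicate volume.
gluster volume create testvol-ec disperse 3 redundancy 1 server{1..3}:/bricks/brick0/testvol-ec_brick0
gluster volume create testvol replica 3 server{1..3}:/bricks/brick1/testvol_brick1
gluster volume start testvol-ec
gluster volume start testvol

# Step 3: add bricks to the dist-rep volume and remove them again
# (repeat this add/remove cycle a second time, as described above).
gluster volume add-brick testvol server{1..3}:/bricks/brick2/testvol_brick2
gluster volume remove-brick testvol server{1..3}:/bricks/brick2/testvol_brick2 start
gluster volume remove-brick testvol server{1..3}:/bricks/brick2/testvol_brick2 commit

# Steps 4-5: stop the volume (succeeds), then try to delete it
# (fails with "Another transaction is in progress for testvol").
gluster --mode=script volume stop testvol
gluster --mode=script volume delete testvol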

Actual results:

[root@dhcp37-135 ~]# gluster vol delete testvol
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: testvol: failed: Another transaction is in progress for testvol. Please try again after sometime.


Expected results:
Gluster operations (volume status, start, and delete) should succeed.


Additional info:
glusterd log

[2017-04-12 04:51:33.521131] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x36d85) [0x7eff3edb6d85] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4735) [0x7eff3ee54735] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7eff4a2c8105] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/stop/pre/S29CTDB-teardown.sh --volname=testvol-ec --last=yes
[2017-04-12 04:51:33.529529] E [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0x36d85) [0x7eff3edb6d85] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd4696) [0x7eff3ee54696] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7eff4a2c8105] ) 0-management: Failed to execute script: /var/lib/glusterd/hooks/1/stop/pre/S30samba-stop.sh --volname=testvol-ec --last=yes
[2017-04-12 04:51:33.529677] I [socket.c:3492:socket_submit_request] 0-management: not connected (priv->connected = -1)
[2017-04-12 04:51:33.529734] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0x8 Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529780] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0x9 Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529805] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0xa Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.529823] W [rpc-clnt.c:1694:rpc_clnt_submit] 0-management: failed to submit rpc-request (XID: 0xb Program: brick operations, ProgVers: 2, Proc: 1) to rpc-transport (management)
[2017-04-12 04:51:33.600864] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-04-12 04:51:33.601001] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-04-12 04:51:33.601382] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 19756
[2017-04-12 04:51:34.601633] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-04-12 04:51:34.601786] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-04-12 04:51:34.601814] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-04-12 04:51:34.601868] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-04-12 04:51:34.601889] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-04-12 04:51:34.609297] W [glusterd-handler.c:5675:__glusterd_brick_rpc_notify] 0-management: got disconnect from stale rpc on /bricks/brick0/testvol_brick0
[2017-04-12 04:59:40.379360] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol
[2017-04-12 04:59:40.379522] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 04:59:40.379562] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 04:59:40.380043] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol-ec
[2017-04-12 04:59:40.381881] E [MSGID: 106301] [glusterd-syncop.c:1315:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume testvol-ec is not started
[2017-04-12 05:12:28.555698] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol
[2017-04-12 05:12:28.555849] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 05:12:28.555877] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 05:12:28.556369] I [MSGID: 106499] [glusterd-handler.c:4369:__glusterd_handle_status_volume] 0-management: Received status volume req for volume testvol-ec
[2017-04-12 05:12:28.558175] E [MSGID: 106301] [glusterd-syncop.c:1315:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume testvol-ec is not started
[2017-04-12 05:12:37.079892] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2
[2017-04-12 05:12:37.079953] E [MSGID: 106119] [glusterd-syncop.c:1864:gd_sync_task_begin] 0-management: Unable to acquire lock for testvol
[2017-04-12 05:18:31.518899] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd39d0) [0x7eff3ee539d0] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd3900) [0x7eff3ee53900] -->/usr/lib64/glusterfs/3.8.4/xlator/mgmt/glusterd.so(+0xd8c0f) [0x7eff3ee58c0f] ) 0-management: Lock for testvol held by 2352d844-ab16-4c77-86e5-6bfcfcc8b3b2

Comment 3 Atin Mukherjee 2017-04-13 08:02:22 UTC
upstream patch : https://review.gluster.org/#/c/17055/

Comment 5 Atin Mukherjee 2017-04-18 11:26:17 UTC
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/103636

Comment 7 Bala Konda Reddy M 2017-05-03 13:49:27 UTC
Build : 3.8.4-24

Followed the steps provided in the bug. After stopping the volume, the volume could be deleted without any issue (a brief verification sketch follows below).

Hence, marking the bug as verified.
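
For reference, a minimal verification sketch on the fixed build, reusing the hypothetical volume name from the reproduction sketch in the description:

# With brick multiplexing enabled and the same add-brick/remove-brick history,
# volume delete is expected to succeed immediately after volume stop.
gluster --mode=script volume stop testvol
gluster --mode=script volume delete testvol
gluster volume info testvol   # expected: volume no longer exists
gluster volume status         # expected: no "Another transaction is in progress" errors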

Comment 9 errata-xmlrpc 2017-09-21 04:37:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774