Bug 1008173 - Running a second gluster command from the same node clears locks held by the first gluster command, even before the first command has completed execution
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Avra Sengupta
QA Contact: SATHEESARAN
Keywords: ZStream
Depends On: 1008172
Blocks:
 
Reported: 2013-09-15 09:09 EDT by Avra Sengupta
Modified: 2013-11-27 10:38 EST

See Also:
Fixed In Version: glusterfs-3.4.0.35rhs-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1008172
Environment:
Last Closed: 2013-11-27 10:38:14 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Avra Sengupta 2013-09-15 09:09:27 EDT
+++ This bug was initially created as a clone of Bug #1008172 +++

Description of problem:

While a gluster command that holds the cluster lock is still executing, any other gluster command run from the same node will fail to acquire the lock. As a result, command#2 follows the cleanup code path, which includes releasing held locks. Because both commands are run from the same node, command#2 ends up releasing the lock held by command#1 before command#1 has completed.
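
To illustrate the flow, here is a minimal toy model written for this report (an illustration only, not the actual glusterd code): the lock records only the owning node's UUID, so the failure/cleanup path of a second command from the same node clears a lock it never acquired.

#include <stdio.h>

/* Toy model only (not glusterd source): the lock remembers just the
 * owning node's UUID, and the cleanup path releases it unconditionally. */
static char lock_owner_uuid[64] = "";              /* "" means unlocked */

static int try_lock(const char *node_uuid)
{
        if (lock_owner_uuid[0] != '\0')
                return -1;                         /* lock already held */
        snprintf(lock_owner_uuid, sizeof(lock_owner_uuid), "%s", node_uuid);
        return 0;
}

static void cleanup_and_unlock(void)
{
        /* Buggy cleanup: clears the lock without checking who owns it. */
        lock_owner_uuid[0] = '\0';
}

int main(void)
{
        const char *node = "e9c5334f-9522-4d50-946c-aeb15a6166db";

        /* command#1 acquires the lock and keeps running. */
        try_lock(node);

        /* command#2 from the same node fails to lock, then "cleans up". */
        if (try_lock(node) != 0)
                cleanup_and_unlock();

        /* command#1 is still executing, yet its lock is gone. */
        printf("lock owner now: '%s'\n", lock_owner_uuid);
        return 0;
}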


Version-Release number of selected component (if applicable):


How reproducible:
Every time


Steps to Reproduce:
1. Make a gluster command take a long time to execute (for example, add a temporary hack that makes it call a script which sleeps for 2-3 minutes; a hypothetical sketch of such a hack is shown after these steps).
2. Meanwhile, run another gluster command from the same node. This command fails to acquire the lock but still ends up releasing the lock already held.
3. Command#1 is still executing, yet the lock has been released, so another gluster command can now run in parallel with the first one (which is still in execution).
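
For step 1, a hypothetical example of such a hack (the function name, insertion point, and script path are all made up for illustration): patch any long-running glusterd op handler to call a script that just sleeps.

#include <stdlib.h>

/* Hypothetical reproduction aid only: delay one op so its cluster lock
 * stays held long enough for a second CLI command to race it.
 * /tmp/slow.sh is a made-up helper containing just "sleep 150". */
int slow_down_op_for_test(void)
{
        return system("/tmp/slow.sh");
}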

Actual results:
A second gluster transaction from the same node releases the lock held by another transaction.


Expected results:
A lock should only be released by the transaction that acquired it.

Additional info:
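As a rough sketch of the expected behaviour (illustrative only, not the shipped fix), the lock could be tagged with a per-transaction identifier so that only the owning transaction can release it:

#include <stdio.h>
#include <string.h>

/* Illustrative only: key the lock on a transaction ID instead of just
 * the node UUID, and refuse to release it for any other transaction. */
static char lock_txn_id[64] = "";                  /* "" means unlocked */

int txn_lock(const char *txn_id)
{
        if (lock_txn_id[0] != '\0')
                return -1;       /* "Another transaction is in progress" */
        snprintf(lock_txn_id, sizeof(lock_txn_id), "%s", txn_id);
        return 0;
}

int txn_unlock(const char *txn_id)
{
        if (strcmp(lock_txn_id, txn_id) != 0)
                return -1;       /* not our lock: leave it held */
        lock_txn_id[0] = '\0';
        return 0;
}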
Comment 3 SATHEESARAN 2013-10-15 22:00:46 EDT
While trying to verify this bug with glusterfs-3.4.0.34rhs-1, glusterd crashed.

After looking into the logs, I could see that this is a manifestation of bug https://bugzilla.redhat.com/show_bug.cgi?id=1018043.

So this bug also depends on https://bugzilla.redhat.com/show_bug.cgi?id=1018043.

This bug cannot be verified until the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1018043 gets in.

Steps done to verify the bug
============================
1. Created a 2x2 distributed-replicate volume
2. Started the volume
3. NFS-mounted the volume on a RHEL 6.4 client
4. Created a large number of files by simply touching them, e.g.
   for i in {1..100000}; do touch file${i}; done
5. While step 4 was in progress, ran the inode status command from an RHS node, i.e.
   gluster volume status <vol-name> inode
6. While the inode status command was still running, issued a 'gluster volume status'
   command from the same node.

Result: glusterd crashed, with the error captured below

[Tue Oct 15 18:42:57 UTC 2013 root@10.70.37.170:~ ] # gluster volume status distrepvol inode
Connection failed. Please check if gluster daemon is operational.

Snip of glusterd crash in glusterd log
======================================

[2013-10-15 18:44:18.687354] E [glusterd-utils.c:149:glusterd_lock] 0-management: Unable to get lock for uuid: e9c5334f-9522-4d50-946c-aeb15a6166db, lock held by: e9c5334f-9522-4d50-946c-aeb15a6166db
[2013-10-15 18:44:18.687402] E [glusterd-syncop.c:1202:gd_sync_task_begin] 0-management: Unable to acquire lock
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-10-15 18:44:18configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.34rhs
/lib64/libc.so.6[0x3c5d232960]
/usr/lib64/glusterfs/3.4.0.34rhs/xlator/mgmt/glusterd.so(gd_unlock_op_phase+0xae)[0x7fa44c87ffce]
/usr/lib64/glusterfs/3.4.0.34rhs/xlator/mgmt/glusterd.so(gd_sync_task_begin+0xdf)[0x7fa44c880bef]
/usr/lib64/glusterfs/3.4.0.34rhs/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7fa44c880f0b]
/usr/lib64/glusterfs/3.4.0.34rhs/xlator/mgmt/glusterd.so(__glusterd_handle_status_volume+0x14a)[0x7fa44c80e4fa]
/usr/lib64/glusterfs/3.4.0.34rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7fa44c80ea7f]
/usr/lib64/libglusterfs.so.0(synctask_wrap+0x12)[0x3a63a49a72]
/lib64/libc.so.6[0x3c5d243bb0]
---------

Expected
=========
1. When a lock is already held, the command should report "Another transaction is in progress"
2. An attempt to acquire a lock that is already held by another op should not release the previously held lock

Conclusion
==========
Since glusterd crashed instead of behaving as expected, marking this bug as FailedQA
Comment 5 SATHEESARAN 2013-10-18 10:01:49 EDT
VERIFIED with glusterfs-3.4.0.35rhs-1

Steps followed
===============
1. Created a 2x2 distributed-replicate volume
2. Started the volume
3. NFS-mounted the volume on a RHEL 6.4 client
4. Created a large number of files by simply touching them, e.g.
   for i in {1..100000}; do touch file${i}; done
5. While step 4 was in progress, ran the inode status command from an RHS node, i.e.
   gluster volume status <vol-name> inode
   This command took some time, and it holds the cluster lock against other ops.

6. Executed the 'gluster volume status' command from the same node, and also from another node, 4 or 5 times.

Each time, I got "Another transaction is in progress". This makes it clear that an attempt to acquire the lock by another op does not release the previously held lock.
Comment 6 SATHEESARAN 2013-10-18 10:02:50 EDT
The Fixed In Version should be glusterfs-3.4.0.35rhs-1, not glusterfs-3.4.0.35rhs.
Comment 8 errata-xmlrpc 2013-11-27 10:38:14 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1769.html
