Bug 981661 - quota + core: Another transaction is in progress. Please try again after sometime.
Summary: quota + core: Another transaction is in progress. Please try again after some...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Krutika Dhananjay
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-07-05 12:30 UTC by Saurabh
Modified: 2016-01-19 06:12 UTC
CC List: 9 users

Fixed In Version: glusterfs-3.4.0.12rhs.beta6-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-23 22:24:55 UTC
Embargoed:



Description Saurabh 2013-07-05 12:30:38 UTC
Description of problem:

After hitting BZ 981653, I am finding that "gluster volume status"
fails on the other nodes of the cluster as well.


Version-Release number of selected component (if applicable):
[root@quota2 ~]# rpm -qa | grep glusterfs
glusterfs-3.4.0.12rhs.beta2-1.el6rhs.x86_64
glusterfs-fuse-3.4.0.12rhs.beta2-1.el6rhs.x86_64
glusterfs-server-3.4.0.12rhs.beta2-1.el6rhs.x86_64

How reproducible:
Found after BZ 981653.

Steps to Reproduce:
1. Reproduce BZ 981653.
2. Execute "gluster volume status" on any of the nodes of the cluster.

Actual results:
[root@quota2 ~]# gluster volume status
Another transaction is in progress. Please try again after sometime.
 
The glusterd logs show:
[2013-07-05 05:13:59.931692] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:14:02.238117] E [socket.c:2158:socket_connect_finish] 0-management: connection to 10.70.37.98:24007 failed (Connection refused)
[2013-07-05 05:22:13.211163] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:22:13.211243] E [glusterd-syncop.c:1128:gd_sync_task_begin] 0-management: Unable to acquire lock
[2013-07-05 05:22:13.211373] E [glusterd-utils.c:375:glusterd_unlock] 0-management: Cluster lock held by 236e161a-fc82-4964-8e6d-bb0d9160990d ,unlock req from cc7bc8ba-fa3a-43d9-a899-114e34d27eb4!
[2013-07-05 05:22:13.211404] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:31:08.545951] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:31:08.546016] E [glusterd-syncop.c:1128:gd_sync_task_begin] 0-management: Unable to acquire lock
[2013-07-05 05:31:08.546121] E [glusterd-utils.c:375:glusterd_unlock] 0-management: Cluster lock held by 236e161a-fc82-4964-8e6d-bb0d9160990d ,unlock req from cc7bc8ba-fa3a-43d9-a899-114e34d27eb4!
[2013-07-05 05:31:08.546142] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:31:13.491554] I [glusterd-handler.c:966:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2013-07-05 05:38:01.968142] I [glusterd-handler.c:966:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2013-07-05 05:38:02.187306] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d
[2013-07-05 05:38:02.187355] E [glusterd-syncop.c:1128:gd_sync_task_begin] 0-management: Unable to acquire lock
[2013-07-05 05:38:02.187453] E [glusterd-utils.c:375:glusterd_unlock] 0-management: Cluster lock held by 236e161a-fc82-4964-8e6d-bb0d9160990d ,unlock req from cc7bc8ba-fa3a-43d9-a899-114e34d27eb4!
[2013-07-05 05:38:02.187490] E [glusterd-utils.c:333:glusterd_lock] 0-management: Unable to get lock for uuid: cc7bc8ba-fa3a-43d9-a899-114e34d27eb4, lock held by: 236e161a-fc82-4964-8e6d-bb0d9160990d


Expected results:
If a node crashes for some reason, some other node should still provide the information without fail. Otherwise the whole cluster becomes unusable without some "workaround".

Additional info:

Comment 4 Krutika Dhananjay 2013-07-17 08:55:58 UTC
https://code.engineering.redhat.com/gerrit/#/c/10364/ <-- Posted for review.

PROBLEM:

When the originator of a volume transaction goes down while it still
owns the cluster lock, volume ops issued from the other nodes also fail
with the message that the lock is still held by the node that went down.

FIX:

Upon receiving a DISCONNECT from the originator of a transaction, the rest
of the nodes perform the following actions (sketched after the note below):

a. release the lock; and
b. reset the state of the node to GD_OP_STATE_DEFAULT.


Note:
This bug is not confined to the 'volume quota' command. This state may be reached for any volume command when the originator goes down while in possession of the lock.
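
To make the intended handling concrete, here is a minimal sketch of the disconnect handling described above. It is illustrative only, not the actual patch from the gerrit change linked above: the helper glusterd_get_lock_owner(), the glusterd_peerinfo_t type and the global opinfo are assumptions inferred from the log messages, while glusterd_unlock() and GD_OP_STATE_DEFAULT come from this comment.

/* Sketch: run on every remaining node when the RPC layer reports that
 * a peer has disconnected. Helper names are assumptions (see above). */
static void
handle_peer_disconnect (glusterd_peerinfo_t *peerinfo)
{
        uuid_t lock_owner = {0,};

        /* Find out which peer currently holds the cluster lock. */
        glusterd_get_lock_owner (&lock_owner);

        /* If the disconnected peer is the lock owner, it can never release
         * the lock itself, so release it on its behalf ... */
        if (!uuid_compare (lock_owner, peerinfo->uuid)) {
                glusterd_unlock (peerinfo->uuid);

                /* ... and reset the local op state machine so that new volume
                 * transactions from the other nodes can proceed. */
                opinfo.state.state = GD_OP_STATE_DEFAULT;
        }
}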

Comment 5 Krutika Dhananjay 2013-07-19 05:15:47 UTC
The change has been merged downstream. Hence, moving the bug to MODIFIED.

Comment 7 Krutika Dhananjay 2013-08-02 09:28:32 UTC
https://code.engineering.redhat.com/gerrit/#/c/10364/ <-- Same as in comment #4

Comment 8 SATHEESARAN 2013-08-08 11:04:33 UTC
Tested this with glusterfs-3.4.0.17rhs-1

Steps
=====
1. Created a trusted storage pool of 3 nodes.
2. Created a replica volume with 2 bricks (one brick on node1 and another on node2).
3. Started the volume.
4. Abruptly powered down node1.
5. Issued "gluster volume heal <vol-name>" from node2.
6. The 'heal' command waits [BZ 866758] for frame-timeout, which is 600 seconds.
7. Issued "gluster volume status" from node3, which fails as follows:

[Thu Aug  8 10:50:50 UTC 2013 root.37.61:~ ] # gluster volume status
Another transaction is in progress. Please try again after sometime.

NOTE: the above command was executed on node3, which does not actually have any bricks on it.

8. Abruptly powered down node2 as well.
9. Checked "gluster volume status" again.

"gluster volume status" succeeded, thus moving the bug to VERIFIED.

Comment 9 SATHEESARAN 2013-08-08 11:32:19 UTC
Correction to comment #8:

Verified with glusterfs-3.4.0.18rhs-1

Comment 10 Scott Haines 2013-09-23 22:24:55 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html

