1465936 – [GSS]glusterd_mgmt_v3_lock issues while executing "gluster volume" commands

Summary: [GSS]glusterd_mgmt_v3_lock issues while executing "gluster volume" commands

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterd
Sub Component:
Version:	rhgs-3.2
Hardware:	All
OS:	All
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Atin Mukherjee
QA Contact:	Bala Konda Reddy M
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-06-28 13:32 UTC by Riyas Abdulrasak
Modified:	2020-08-13 09:29 UTC
CC List:	6 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-07-27 06:12:33 UTC
Embargoed:
Dependent Products:

Description Riyas Abdulrasak 2017-06-28 13:32:50 UTC

Description of problem:


We have a 20-node Gluster setup in which only one node, rhs01, is hitting a locking issue. Whenever a gluster command is run on rhs01, it fails with "request time out", while the other nodes respond properly. We restarted glusterd on rhs01 and on rhs07 (where, according to the logs, staging failed), but with no luck.



The full logs are available at http://collab-shell.usersys.redhat.com/01869975/

Snippets from the logs:


When volume status is run on the rhs01 node, glusterd logs the following errors:

~~~
[2017-06-20 11:31:35.182542] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on rhs07. Please check log file for details.
[2017-06-20 11:31:35.182620] E [MSGID: 106151] [glusterd-syncop.c:1868:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2017-06-20 11:32:35.237627] W [glusterd-locks.c:577:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x30) [0x7fbe9f0530c0] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x980) [0x7fbe9f052ff0] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x5f7) [0x7fbe9f0577e7] ) 0-management: Lock for commvault held by c7547ec0-574e-456b-a274-f55b0927b32f
~~~

Meanwhile, rhs07 records the following errors:

~~~
[2017-06-20 11:31:35.182297] W [glusterd-locks.c:577:glusterd_mgmt_v3_lock] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_op_sm+0x29f) [0x7fac3509acaf] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(+0x65ab5) [0x7fac3508cab5] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x5f7) [0x7fac3512c7e7] ) 0-management: Lock for commvault held by c7547ec0-574e-456b-a274-f55b0927b32f
[2017-06-20 11:31:35.182424] E [MSGID: 106119] [glusterd-op-sm.c:3713:glusterd_op_ac_lock] 0-management: Unable to acquire lock for commvault
[2017-06-20 11:31:35.182473] E [MSGID: 106376] [glusterd-op-sm.c:7596:glusterd_op_sm] 0-management: handler returned: -1
[2017-06-20 11:31:35.183417] E [glusterd-op-sm.c:7577:glusterd_op_sm] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x30) [0x7fac3507e770] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(+0x128bc8) [0x7fac3514fbc8] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_op_sm+0x37f) [0x7fac3509ad8f] ) 0-management: Unable to get transaction opinfo for transaction ID :15611444-8d18-437e-aa5c-f858fa265097
~~~
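
Both excerpts name the same lock holder, so a quick way to triage a larger log is to pull every "Lock for ... held by ..." warning out of it and count the holders. The following is a minimal sketch in Python, not part of glusterd: the function names and the regular expression are assumptions based only on the message format shown above, and the sample lines are shortened from these excerpts (the call-stack portion is omitted).

```python
import re
from collections import Counter

# Assumed format, taken from the glusterd_mgmt_v3_lock warnings quoted above:
#   [<timestamp>] ... Lock for <name> held by <uuid>
LOCK_HELD = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*Lock for (?P<name>\S+) held by (?P<uuid>[0-9a-f-]{36})\s*$"
)


def find_lock_holders(lines):
    """Return (timestamp, lock name, holder UUID) for each 'held by' warning."""
    hits = []
    for line in lines:
        match = LOCK_HELD.match(line)
        if match:
            hits.append((match.group("ts"), match.group("name"), match.group("uuid")))
    return hits


def count_holders(hits):
    """Count how often each (lock name, holder UUID) pair appears."""
    return Counter((name, uuid) for _, name, uuid in hits)


# Sample lines shortened from the excerpts above (stack trace removed).
sample = [
    "[2017-06-20 11:32:35.237627] W [glusterd-locks.c:577:glusterd_mgmt_v3_lock] "
    "0-management: Lock for commvault held by c7547ec0-574e-456b-a274-f55b0927b32f",
    "[2017-06-20 11:31:35.182424] E [MSGID: 106119] "
    "[glusterd-op-sm.c:3713:glusterd_op_ac_lock] "
    "0-management: Unable to acquire lock for commvault",
]

if __name__ == "__main__":
    for (name, uuid), count in count_holders(find_lock_holders(sample)).items():
        print(f"lock {name!r} held by {uuid}: {count} warning(s)")
```

To run it against a real log, pass an open file object to `find_lock_holders` instead of `sample`. The reported UUID can then be compared with the peer UUIDs listed by `gluster pool list` to see which node is holding the lock.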

Version-Release number of selected component (if applicable):

Red Hat Gluster Storage 3.2

How reproducible:

Reproducible in the customer environment.


Actual results:

Gluster volume commands run from one node (rhs01) fail with "request time out".


Expected results:

Gluster volume commands should run without issues. 


Additional info:
