913662 – Need to increase throughput of mgmt operations using synctask framework

Summary: Need to increase throughput of mgmt operations using synctask framework

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	glusterd
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	krishnan parthasarathi
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	895528

Reported:	2013-02-21 18:21 UTC by krishnan parthasarathi
Modified:	2015-11-03 23:05 UTC
CC List:	5 users
Fixed In Version:	glusterfs-3.4.0
Clone Of:
Environment:
Last Closed:	2013-07-24 17:18:18 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description krishnan parthasarathi 2013-02-21 18:21:14 UTC

Description of problem:
Volume operations that use the synctask framework issue their management RPC ops serially. This approach will not scale as the number of peers in the cluster increases.
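
To illustrate the idea behind a counted barrier, here is a minimal sketch in Python rather than the C used by glusterd. It is not the synctask implementation: the names `CountedBarrier` and `fan_out`, and the use of one thread per peer, are assumptions made for illustration only. The sketch sends a request to every peer at once, and the caller then blocks until it has been woken once per peer, instead of waiting for each reply before sending the next request.

```python
import threading


class CountedBarrier:
    """Hypothetical counted barrier: one waiter blocks until it has
    received a given number of wake-ups from other threads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._wakes = 0  # wake-ups delivered but not yet consumed

    def wake(self):
        # Called once by each completed request.
        with self._cond:
            self._wakes += 1
            self._cond.notify_all()

    def wait(self, count):
        # Block until `count` wake-ups have arrived, then consume them.
        with self._cond:
            while self._wakes < count:
                self._cond.wait()
            self._wakes -= count


def fan_out(peers, rpc):
    """Call rpc(peer) for every peer concurrently and collect the outcomes.

    `rpc` stands in for a management request; it is an assumption of
    this sketch, not a glusterd function.
    """
    barrier = CountedBarrier()
    replies = {}
    replies_lock = threading.Lock()

    def call(peer):
        try:
            outcome = rpc(peer)
        except Exception as exc:  # record the failure, still wake the waiter
            outcome = exc
        with replies_lock:
            replies[peer] = outcome
        barrier.wake()

    threads = [threading.Thread(target=call, args=(p,)) for p in peers]
    for t in threads:
        t.start()
    barrier.wait(len(peers))  # one wake-up expected per peer
    for t in threads:
        t.join()
    return replies
```

With this shape the total wait is bounded by the slowest peer rather than by the sum of all round trips, which is the scaling concern described above.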

Comment 1 Vijay Bellur 2013-02-22 06:57:19 UTC

CHANGE: http://review.gluster.org/4558 (synctask: support for (assymetric) counted barriers) merged in master by Anand Avati (avati)

Comment 2 Vijay Bellur 2013-02-26 17:04:49 UTC

CHANGE: http://review.gluster.org/4580 (volgen: Use bind-address option for bricks when option set on glusterd) merged in master by Vijay Bellur (vbellur)

Comment 3 Vijay Bellur 2013-02-26 17:06:27 UTC

CHANGE: http://review.gluster.org/4570 (glusterd: Increasing throughput of synctask based mgmt ops.) merged in master by Vijay Bellur (vbellur)

Comment 4 Ben Turner 2013-02-28 21:06:46 UTC

I am seeing volume commands fail occasionally on my 6-node 3x2 setups. Here is an example:

# gluster volume heal healtest info split-brain
operation failed
# echo $?
255

This happened at just about 20:32:07.  In the etc-glusterfs-glusterd.vol.log I see:

[2013-02-27 20:32:08.081243] E [glusterd-utils.c:278:glusterd_lock] 0-glusterd: Unable to get lock for uuid: c3538fdf-c16e-4eaf-9650-6cf0caf7478b, lock held by: 647b8ff3-8110-4eab-956b-df78235fb192
[2013-02-27 20:32:08.081254] E [glusterd-handler.c:470:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

Is this a symptom of the problem this bug was opened for, or is it a different issue?

Comment 5 krishnan parthasarathi 2013-03-01 04:45:16 UTC

Ben,
This bug was opened to track the code changes to how (internal RPC) requests are sent to the peers in the cluster during a volume operation.
The above observation is not a symptom of this bug. When you open a new bug to track what you observe, please provide the output of "gluster system:: fsm log". Run the command on the machine whose uuid is (still) holding the lock.
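
For readers unfamiliar with the message in comment 4, the log lines are consistent with a lock that records its owner's uuid and refuses other requesters instead of blocking. The following is a minimal sketch of that behaviour in Python, not the glusterd code: the class `OwnerLock` and its methods are hypothetical names, and the shortened uuids are placeholders.

```python
import threading


class OwnerLock:
    """Hypothetical owner-tagged lock: acquisition fails immediately
    if a different uuid already holds it."""

    def __init__(self):
        self._guard = threading.Lock()  # protects `owner`
        self.owner = None               # uuid of the current holder, if any

    def try_lock(self, uuid):
        with self._guard:
            if self.owner is not None:
                # Mirrors the shape of the logged error; the requester
                # does not wait for the holder to release.
                return False
            self.owner = uuid
            return True

    def unlock(self, uuid):
        with self._guard:
            if self.owner != uuid:
                return False  # only the holder may release
            self.owner = None
            return True


lock = OwnerLock()
lock.try_lock("647b")        # first requester takes the lock
print(lock.try_lock("c353"))  # second requester is refused
print(lock.owner)             # uuid to inspect when diagnosing a stuck lock
```

If the holder never calls `unlock`, every later operation fails the same way, which is why the state of the machine that still holds the lock is the useful thing to capture.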

Comment 6 Vijay Bellur 2013-03-07 05:32:54 UTC

CHANGE: http://review.gluster.org/4636 (synctask: support for (assymetric) counted barriers) merged in release-3.4 by Anand Avati (avati)

Comment 7 Vijay Bellur 2013-03-07 06:05:54 UTC

CHANGE: http://review.gluster.org/4637 (volgen: Use bind-address option for bricks when option set on glusterd) merged in release-3.4 by Anand Avati (avati)
