Bug 1161903

Summary: Different clients cannot execute "for((i=0;i<1000;i++));do ls -al;done" in the same directory at the same time
Product: [Community] GlusterFS
Component: disperse
Version: 3.6.0
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: high
Priority: unspecified
Keywords: Triaged
Reporter: jiademing.dd <iesool>
Assignee: Xavi Hernandez <jahernan>
CC: bugs, gluster-bugs, iesool, jahernan, pkarampu
Type: Bug
Doc Type: Bug Fix
Related bug: 1165041
Last Closed: 2015-08-04 04:10:34 UTC

Description jiademing.dd 2014-11-09 05:55:29 UTC
Description of problem:
On a disperse volume, different clients cannot run "ls -al" in the same directory at the same time.

On client-1's mount point, run "for((i=0;i<1000;i++));do ls -al;done". In the same directory on client-2's mount point, a "for((i=0;i<1000;i++));do ls -al;done", "touch newfile" or "mkdir newdirectory" is blocked until client-1's loop of 1000 "ls -al" commands finishes.
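A rough two-terminal sketch of the reproduction (the mount points /mnt/test1 and /mnt/test2 and the directory "somedir" are hypothetical; the volume name "test" matches the volume info below):

# Client-1 (mount point and directory names are assumptions for illustration):
mount -t glusterfs 10.10.101.111:/test /mnt/test1
cd /mnt/test1/somedir
for((i=0;i<1000;i++));do ls -al;done

# Client-2, same directory, while the loop above is still running:
mount -t glusterfs 10.10.101.111:/test /mnt/test2
cd /mnt/test2/somedir
time ls -al              # reported behaviour: blocks until client-1's loop ends
time touch newfile       # likewise blocked
time mkdir newdirectory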

[root@localhost test]# gluster volume info

Volume Name: test
Type: Distributed-Disperse
Volume ID: 433248ee-24f5-44e3-b334-488743850e45
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 10.10.101.111:/sda
Brick2: 10.10.101.111:/sdb
Brick3: 10.10.101.111:/sdc
Brick4: 10.10.101.111:/sdd
Brick5: 10.10.101.111:/sde
Brick6: 10.10.101.111:/sdf
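For reference, a 2 x (2 + 1) distributed-disperse volume with the bricks listed above could be created roughly like this (a sketch, not taken from the report; "force" may be needed when bricks live on the servers' root filesystem):

gluster volume create test disperse 3 redundancy 1 \
    10.10.101.111:/sda 10.10.101.111:/sdb 10.10.101.111:/sdc \
    10.10.101.111:/sdd 10.10.101.111:/sde 10.10.101.111:/sdf force
gluster volume start test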


Version-Release number of selected component (if applicable):
3.6.0

How reproducible:


Steps to Reproduce:
1. Create a distributed-disperse volume and mount it on two clients (client-1 and client-2).
2. On client-1, run "for((i=0;i<1000;i++));do ls -al;done" in a directory of the mount.
3. While the loop is running, run "ls -al", "touch newfile" or "mkdir newdirectory" in the same directory on client-2.

Actual results:
In the same directory on the other client, ls, touch and mkdir are blocked.

Expected results:
In the same directory on the other client, ls, touch and mkdir should succeed, or be blocked only for a short time.

Additional info:

Comment 1 Niels de Vos 2014-11-11 12:52:26 UTC
Have you tried this also on other types of volumes? Is this only affecting a disperse volume?

Comment 2 jiademing.dd 2014-11-12 01:36:07 UTC
Yes, it only affects a disperse volume. I tried turning off the gf_timer_call_after() call that defers ec_unlock in ec_unlock_timer_add() in ec_common.c; with that change, "for((i=0;i<1000;i++));do ls -al;done" can be executed from different clients at the same time.

In my opinion, the gf_timer_call_after() in ec_unlock is an optimization for a single client, but it may be bad when there are many clients.


(In reply to Niels de Vos from comment #1)
> Have you tried this also on other types of volumes? Is this only affecting a
> disperse volume?

Comment 3 Xavi Hernandez 2014-11-12 17:23:25 UTC
Yes, this is a method to minimize lock/unlock calls. I'll try to find a good solution to minimize the multiple client problem.
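In later releases (comment 5 notes the fix landed in 3.7.x), this deferred-unlock behaviour is exposed as a volume option; a hedged sketch, assuming the disperse.eager-lock option and the "volume get" command are available on the installed version:

# Sketch for 3.7.x or newer; not applicable to the 3.6.0 build this bug was filed against.
# Show whether eager locking (deferred unlock) is enabled for the "test" volume.
gluster volume get test disperse.eager-lock

# Disable it as a diagnostic: trades single-client performance for less
# cross-client lock contention.
gluster volume set test disperse.eager-lock off

# Restore the default afterwards.
gluster volume set test disperse.eager-lock on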

Comment 4 jiademing.dd 2014-11-17 01:30:07 UTC
Yes, I will pay close attention to this problem. Thanks.

(In reply to Xavier Hernandez from comment #3)
> Yes, this is a method to minimize lock/unlock calls. I'll try to find a good
> solution to minimize the multiple client problem.

Comment 5 Pranith Kumar K 2015-08-04 04:10:34 UTC
http://review.gluster.org/10852 fixed this bug in 3.7.x releases. This patch can't be backported to 3.6.0. Closing the bug.