Bug 1161903

Summary: Different clients cannot execute "for((i=0;i<1000;i++));do ls -al;done" in the same directory at the same time
Product: [Community] GlusterFS
Component: disperse
Version: 3.6.0
Hardware: x86_64
OS: Linux
Status: CLOSED DEFERRED
Severity: high
Priority: unspecified
Keywords: Triaged
Reporter: jiademing.dd <iesool>
Assignee: Xavi Hernandez <jahernan>
CC: bugs, gluster-bugs, iesool, jahernan, pkarampu
Type: Bug
Doc Type: Bug Fix
Related bug: 1165041
Last Closed: 2015-08-04 04:10:34 UTC

Description jiademing.dd 2014-11-09 05:55:29 UTC
Description of problem:
On a disperse volume, different clients cannot run "ls -al" in the same directory at the same time.

On client-1's mount point, run "for((i=0;i<1000;i++));do ls -al;done". In the same directory on client-2's mount point, a "for((i=0;i<1000;i++));do ls -al;done", "touch newfile" or "mkdir newdirectory" is blocked until client-1's loop of 1000 "ls -al" commands finishes.
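A rough two-terminal sketch of the reproduction (the mount points /mnt/test1 and /mnt/test2 and the directory "somedir" are hypothetical; the volume name "test" matches the volume info below):

# Client-1 (mount point and directory names are assumptions for illustration):
mount -t glusterfs 10.10.101.111:/test /mnt/test1
cd /mnt/test1/somedir
for((i=0;i<1000;i++));do ls -al;done

# Client-2, same directory, while the loop above is still running:
mount -t glusterfs 10.10.101.111:/test /mnt/test2
cd /mnt/test2/somedir
time ls -al              # reported behaviour: blocks until client-1's loop ends
time touch newfile       # likewise blocked
time mkdir newdirectory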

[root@localhost test]# gluster volume info

Volume Name: test
Type: Distributed-Disperse
Volume ID: 433248ee-24f5-44e3-b334-488743850e45
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 10.10.101.111:/sda
Brick2: 10.10.101.111:/sdb
Brick3: 10.10.101.111:/sdc
Brick4: 10.10.101.111:/sdd
Brick5: 10.10.101.111:/sde
Brick6: 10.10.101.111:/sdf
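For reference, a 2 x (2 + 1) distributed-disperse volume with the bricks listed above could be created roughly like this (a sketch, not taken from the report; "force" may be needed when bricks live on the servers' root filesystem):

gluster volume create test disperse 3 redundancy 1 \
    10.10.101.111:/sda 10.10.101.111:/sdb 10.10.101.111:/sdc \
    10.10.101.111:/sdd 10.10.101.111:/sde 10.10.101.111:/sdf force
gluster volume start test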


Version-Release number of selected component (if applicable):
3.6.0

How reproducible:


Steps to Reproduce:
1. Create a distributed-disperse volume and mount it on two clients (client-1 and client-2).
2. On client-1, run "for((i=0;i<1000;i++));do ls -al;done" in a directory of the mount.
3. While the loop is running, run "ls -al", "touch newfile" or "mkdir newdirectory" in the same directory on client-2.

Actual results:
In the same directory on the other client, ls, touch and mkdir are blocked.

Expected results:
In the same directory on the other client, ls, touch and mkdir should succeed, or be blocked only for a short time.

Additional info:

Comment 1 Niels de Vos 2014-11-11 12:52:26 UTC
Have you tried this also on other types of volumes? Is this only affecting a disperse volume?

Comment 2 jiademing.dd 2014-11-12 01:36:07 UTC
Yes, it only affects a disperse volume. I tried turning off the gf_timer_call_after() call that defers ec_unlock in ec_unlock_timer_add() in ec_common.c; with that change, "for((i=0;i<1000;i++));do ls -al;done" can be executed from different clients at the same time.

In my opinion, the gf_timer_call_after() in ec_unlock is an optimization for a single client, but it may be bad when there are many clients.


(In reply to Niels de Vos from comment #1)
> Have you tried this also on other types of volumes? Is this only affecting a
> disperse volume?

Comment 3 Xavi Hernandez 2014-11-12 17:23:25 UTC
Yes, this is a method to minimize lock/unlock calls. I'll try to find a good solution to minimize the multiple client problem.
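In later releases (comment 5 notes the fix landed in 3.7.x), this deferred-unlock behaviour is exposed as a volume option; a hedged sketch, assuming the disperse.eager-lock option and the "volume get" command are available on the installed version:

# Sketch for 3.7.x or newer; not applicable to the 3.6.0 build this bug was filed against.
# Show whether eager locking (deferred unlock) is enabled for the "test" volume.
gluster volume get test disperse.eager-lock

# Disable it as a diagnostic: trades single-client performance for less
# cross-client lock contention.
gluster volume set test disperse.eager-lock off

# Restore the default afterwards.
gluster volume set test disperse.eager-lock on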

Comment 4 jiademing.dd 2014-11-17 01:30:07 UTC
Yes, I will pay close attention to this problem. Thanks.

(In reply to Xavier Hernandez from comment #3)
> Yes, this is a method to minimize lock/unlock calls. I'll try to find a good
> solution to minimize the multiple client problem.

Comment 5 Pranith Kumar K 2015-08-04 04:10:34 UTC
http://review.gluster.org/10852 fixed this bug in 3.7.x releases. This patch can't be backported to 3.6.0. Closing the bug.