Bug 1165041 - Different clients cannot execute "for((i=0;i<1000;i++));do ls -al;done" in the same directory at the same time
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Xavi Hernandez
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1225279 1226149
 
Reported: 2014-11-18 08:16 UTC by Xavi Hernandez
Modified: 2016-06-16 12:40 UTC
CC List: 6 users

Fixed In Version: glusterfs-3.8rc2
Doc Type: Bug Fix
Doc Text:
Clone Of: 1161903
: 1225279 (view as bug list)
Environment:
Last Closed: 2016-06-16 12:40:24 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Xavi Hernandez 2014-11-18 08:16:03 UTC
+++ This bug was initially created as a clone of Bug #1161903 +++

Description of problem:
On a disperse volume, different clients cannot run "ls -al" in the same directory at the same time.

On client-1's mount point, execute "for((i=0;i<1000;i++));do ls -al;done". In the same directory on client-2's mount point, "for((i=0;i<1000;i++));do ls -al;done", "touch newfile" or "mkdir newdirectory" is blocked until client-1's command (1000 "ls -al" iterations) finishes.

[root@localhost test]# gluster volume info

Volume Name: test
Type: Distributed-Disperse
Volume ID: 433248ee-24f5-44e3-b334-488743850e45
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: 10.10.101.111:/sda
Brick2: 10.10.101.111:/sdb
Brick3: 10.10.101.111:/sdc
Brick4: 10.10.101.111:/sdd
Brick5: 10.10.101.111:/sde
Brick6: 10.10.101.111:/sdf


Version-Release number of selected component (if applicable):
3.6.0

How reproducible:


Steps to Reproduce:
1. Create a distributed-disperse volume and mount it on two clients.
2. On client-1, run "for((i=0;i<1000;i++));do ls -al;done" in a directory on the mount point.
3. While the loop is running, on client-2 run "ls -al", "touch newfile" or "mkdir newdirectory" in the same directory.

Actual results:
In the same directory on the other client, ls, touch and mkdir are blocked until the loop on the first client finishes.

Expected results:
In the same directory on the other client, ls, touch and mkdir should succeed, or be blocked only for a short time.

Additional info:

--- Additional comment from Niels de Vos on 2014-11-11 13:52:26 CET ---

Have you tried this also on other types of volumes? Is this only affecting a disperse volume?

--- Additional comment from jiademing on 2014-11-12 02:36:07 CET ---

Yes, it only affects a disperse volume. I tried disabling the gf_timer_call_after() call before ec_unlock in ec_unlock_timer_add() (ec-common.c); after that, "for((i=0;i<1000;i++));do ls -al;done" could be executed from different clients at the same time.

In my opinion, the gf_timer_call_after() in ec_unlock is an optimization for a single client, but it can be bad for many clients.


(In reply to Niels de Vos from comment #1)
> Have you tried this also on other types of volumes? Is this only affecting a
> disperse volume?

--- Additional comment from Xavier Hernandez on 2014-11-12 18:23:25 CET ---

Yes, this is a method to minimize lock/unlock calls. I'll try to find a good solution to minimize the multiple client problem.

--- Additional comment from jiademing on 2014-11-17 02:30:07 CET ---

Yes, I will pay close attention to this problem. Thanks.

(In reply to Xavier Hernandez from comment #3)
> Yes, this is a method to minimize lock/unlock calls. I'll try to find a good
> solution to minimize the multiple client problem.

Comment 1 Anand Avati 2015-05-20 13:32:15 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#1) for review on master by Xavier Hernandez (xhernandez)

Comment 2 Xavi Hernandez 2015-05-20 13:35:48 UTC
The previous patch should be half of the solution. Another patch will be sent to add wider support in the locks xlator.

Comment 3 Anand Avati 2015-05-21 14:45:16 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 4 Anand Avati 2015-05-21 14:45:19 UTC
REVIEW: http://review.gluster.org/10880 (features/locks: Handle virtual getxattrs in more fops) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 5 Anand Avati 2015-05-21 15:50:15 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#3) for review on master by Xavier Hernandez (xhernandez)

Comment 6 Anand Avati 2015-05-22 07:31:51 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#4) for review on master by Xavier Hernandez (xhernandez)

Comment 7 Anand Avati 2015-05-22 07:34:23 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#5) for review on master by Xavier Hernandez (xhernandez)

Comment 8 Anand Avati 2015-05-24 19:41:17 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 9 Anand Avati 2015-05-24 19:41:19 UTC
REVIEW: http://review.gluster.org/10880 (features/locks: Handle virtual getxattrs in more fops) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 10 Anand Avati 2015-05-25 07:57:02 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#7) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 11 Anand Avati 2015-05-25 08:28:33 UTC
REVIEW: http://review.gluster.org/10845 (cluster/ec: Forced unlock when lock contention is detected) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 12 Anand Avati 2015-05-25 10:27:52 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#4) for review on master by Xavier Hernandez (xhernandez)

Comment 13 Anand Avati 2015-05-25 11:20:48 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#5) for review on master by Xavier Hernandez (xhernandez)

Comment 14 Anand Avati 2015-05-25 13:25:54 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#6) for review on master by Xavier Hernandez (xhernandez)

Comment 15 Anand Avati 2015-05-25 16:42:46 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#7) for review on master by Xavier Hernandez (xhernandez)

Comment 16 Anand Avati 2015-05-26 04:07:25 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 17 Anand Avati 2015-05-26 10:18:15 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#9) for review on master by Xavier Hernandez (xhernandez)

Comment 18 Anand Avati 2015-05-26 15:06:04 UTC
REVIEW: http://review.gluster.org/10852 (cluster/ec: Forced unlock when lock contention is detected) posted (#10) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 19 Anand Avati 2015-05-27 06:48:52 UTC
REVIEW: http://review.gluster.org/10930 (cluster/ec: Remove ec tests from bad tests) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 20 Anand Avati 2015-05-27 10:22:46 UTC
COMMIT: http://review.gluster.org/10880 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit d37cb5aee7af8582d0343d2732c153226955945d
Author: Pranith Kumar K <pkarampu>
Date:   Tue May 19 20:53:30 2015 +0530

    features/locks: Handle virtual getxattrs in more fops
    
    With this patch getxattr of inodelk/entrylk counts can be requested in
    readv/writev/create/unlink/opendir.
    
    Change-Id: If7430317ad478a3c753eb33bdf89046cb001a904
    BUG: 1165041
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10880
    Tested-by: NetBSD Build System
    Reviewed-by: Krutika Dhananjay <kdhananj>

Comment 21 Anand Avati 2015-05-27 10:25:51 UTC
COMMIT: http://review.gluster.org/10852 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit 3b666b40efbed157e8c5991f29b345d93b28c659
Author: Xavier Hernandez <xhernandez>
Date:   Wed May 20 15:17:35 2015 +0200

    cluster/ec: Forced unlock when lock contention is detected
    
    EC uses an eager lock mechanism to optimize multiple read/write
    requests on the same entry or inode. This increases performance
    but can have adverse results when other clients try to access the
    same entry/inode.
    
    To solve this, this patch adds a functionality to detect when this
    happens and force an earlier release to not block other clients.
    
    The method consists on requesting GF_GLUSTERFS_INODELK_COUNT and
    GF_GLUSTERFS_ENTRYLK_COUNT for all fops that take a lock. When this
    count is greater than one, the lock is marked to be released. All
    fops already waiting for this lock will be executed normally before
    releasing the lock, but new requests that also require it will be
    blocked and restarted after the lock has been released and reacquired
    again.
    
    Another problem was that some operations did correctly lock the
    parent of an entry when needed, but got the size and version xattrs
    from the entry instead of the parent.
    
    This patch solves this problem by binding all queries of size and
    version to each lock and replacing all entrylk calls by inodelk ones
    to remove concurrent updates on directory metadata.  This also allows
    rename to correctly update source and destination directories.
    
    Change-Id: I2df0b22bc6f407d49f3cbf0733b0720015bacfbd
    BUG: 1165041
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/10852
    Tested-by: NetBSD Build System
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>

Comment 22 Anand Avati 2015-05-27 10:26:19 UTC
REVIEW: http://review.gluster.org/10930 (cluster/ec: Remove ec tests from bad tests) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 23 Anand Avati 2015-05-27 10:55:10 UTC
REVIEW: http://review.gluster.org/10930 (tests: Remove tests from bad tests) posted (#3) for review on master by Ravishankar N (ravishankar)

Comment 24 Anand Avati 2015-05-27 11:12:43 UTC
REVIEW: http://review.gluster.org/10930 (tests: Remove tests from bad tests) posted (#4) for review on master by Ravishankar N (ravishankar)

Comment 25 Anand Avati 2015-05-27 17:22:36 UTC
REVIEW: http://review.gluster.org/10930 (tests: Remove tests from bad tests) posted (#5) for review on master by Vijay Bellur (vbellur)

Comment 26 Nagaprasad Sathyanarayana 2015-10-25 14:43:15 UTC
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ has been fixed in a GlusterFS release and closed. Hence this mainline BZ is being closed as well.

Comment 27 Niels de Vos 2016-06-16 12:40:24 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

