+++ This bug was initially created as a clone of Bug #1177167 +++

Description of problem:
ctdb's ping_pong lock tester fails with an input/output error on a disperse volume mounted with glusterfs. It works when ping_pong is launched on different hosts where the volume is mounted, but as soon as more than one ping_pong instance is launched on the same host, the tool reports input/output errors. The problem does not appear with replica volumes.

Version-Release number of selected component (if applicable):
3.6.1

How reproducible:
Always

Steps to Reproduce:
1. Create a disperse volume (I used 2+1) and mount it as glusterfs. The problem shows up even if all the bricks are on a single host, so one host is enough to reproduce it.
2. cd to the mount point.
3. Launch 2 simultaneous "ping_pong test 1".

Actual results:
$ ping_pong test 1
unlock at 0 failed! - Input/output error
lock at 0 failed! - Input/output error
unlock at 0 failed! - Input/output error
lock at 0 failed! - Input/output error
unlock at 0 failed! - Input/output error

Expected results:
$ ping_pong test 1
nnnnn locks/sec

Additional info:
$ gluster volume info test

Volume Name: test
Type: Disperse
Volume ID: c41b2c0b-a876-487f-9bf0-01e83027f9da
Status: Started
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.114.177:/gluster/cluster/brick.test
Brick2: 192.168.114.13:/gluster/cluster/brick.test
Brick3: 192.168.114.171:/gluster/cluster/brick.test
Options Reconfigured:
nfs.disable: off
REVIEW: http://review.gluster.org/10770 (cluster/ec: Prevent unnecessary self-heals) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10770 (cluster/ec: Prevent unnecessary self-heals) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10770 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 503acdb32ca84102d07cd1142eff464152b06690
Author: Pranith Kumar K <pkarampu>
Date:   Wed May 13 16:57:49 2015 +0530

    cluster/ec: Prevent unnecessary self-heals

    When a blocking lock is requested, the lock request succeeds as soon
    as ec->fragments locks are acquired in the non-blocking locking
    phase. This leads to the fop succeeding only on the bricks where the
    locks were acquired, making self-heals necessary. To prevent these
    unnecessary self-heals, if the remaining locks fail with EAGAIN in
    the non-blocking phase, retry them in the blocking locking phase
    instead.

    Change-Id: I940969e39acc620ccde2a876546cea77f7e130b6
    BUG: 1221145
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/10770
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user