Bug 1159498

Summary:	When replacing one brick on a disperse volume, ls sometimes goes wrong
Product:	[Community] GlusterFS	Reporter:	lidi <lidi>
Component:	disperse	Assignee:	Xavi Hernandez <jahernan>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	3.6.0	CC:	bugs, gluster-bugs, jahernan, lmohanty
Target Milestone:	---	Keywords:	Triaged
Target Release:	---
Hardware:	Unspecified
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	1163760		Environment:
Last Closed:	2015-01-28 14:27:51 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1163760
Bug Blocks:	1163723

Description lidi 2014-11-01 09:06:30 UTC

Steps to Reproduce:
1. gluster vol create test disperse 3 redundancy 1 10.10.21.20:/sdb 10.10.21.21:/sdb 10.10.21.22:/sdb force
2. Start the volume and mount it on /cluster2/test
3. cd /cluster2/test
4. mkdir a b c
5. touch a/1 b/2 c/3
6. gluster vol replace-brick test 10.10.21.22:/sdb 10.10.21.23:/sdb commit force
7. Execute 'ls /cluster2/test/a' multiple times


Actual results:
Sometimes 'ls /cluster2/test/a' does not list the file 1.
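
For convenience, steps 1 to 6 can be collected in a script. This is only a sketch: the commands for step 2 (starting and mounting the volume) are not given in the report and are assumed here, and the run wrapper with its DRY_RUN variable is a hypothetical helper that prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of steps 1-6 from the report. Set DRY_RUN=1 to print the
# commands without running them (hypothetical helper, not part of gluster).
VOL=test
MNT=/cluster2/test
OLD_BRICK=10.10.21.22:/sdb
NEW_BRICK=10.10.21.23:/sdb

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_and_replace() {
    # Step 1: create a 2+1 disperse volume.
    run gluster vol create "$VOL" disperse 3 redundancy 1 \
        10.10.21.20:/sdb 10.10.21.21:/sdb "$OLD_BRICK" force
    # Step 2: assumed commands, the report only says "start and mount".
    run gluster vol start "$VOL"
    run mount -t glusterfs "10.10.21.20:/$VOL" "$MNT"
    # Steps 3-5: create three directories with one file each.
    run mkdir "$MNT/a" "$MNT/b" "$MNT/c"
    run touch "$MNT/a/1" "$MNT/b/2" "$MNT/c/3"
    # Step 6: replace the third brick.
    run gluster vol replace-brick "$VOL" "$OLD_BRICK" "$NEW_BRICK" commit force
}
```

Calling setup_and_replace with DRY_RUN=1 set prints the six commands in order, which makes it easy to review them before running against real bricks.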

Comment 1 Xavi Hernandez 2014-11-04 12:39:51 UTC

Are you using 3.6.0beta3? If so, this problem should already be solved in the latest version (see bug #1149727).

Comment 2 lidi 2014-11-05 02:07:29 UTC

I used the official 3.6.0 release for this test.

Comment 3 Xavi Hernandez 2014-11-10 15:43:11 UTC

I've tried to reproduce this bug by repeating your steps on version 3.6.0 and I'm not able to see the problem. There was a bug in 3.6.0beta3 that caused this behavior, but it should be solved.

Can you reproduce the problem with 3.6.0 on a volume newly created with that version?

Comment 4 lidi 2014-11-11 03:43:19 UTC

I got the source code from git://forge.gluster.org/glusterfs-core/glusterfs.git, branch release-3.6.

I reformatted all the disks, created a new volume and tested again.

Then I found that I had made a mistake in the previous description.

Step 7 should be: run "ls a" multiple times; "cd a"; then run "ls" multiple times. Then you'll see what I described.
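
The corrected step 7 can be automated roughly as follows. This is a minimal sketch, not something taken from the report: count_missing is a hypothetical helper that lists the directory by path a number of times, then changes into it and lists it again, and prints how many listings lacked the expected entry.

```shell
#!/bin/sh
# count_missing DIR NAME RUNS
# Phase 1 lists DIR by path RUNS times; phase 2 enters DIR and lists "."
# RUNS times. Prints the number of listings that did not contain NAME.
# The body runs in a subshell so the caller's working directory is kept.
count_missing() (
    dir=$1
    name=$2
    runs=$3
    missing=0

    # Phase 1: "ls a" from outside the directory.
    i=0
    while [ "$i" -lt "$runs" ]; do
        ls "$dir" | grep -qxF -- "$name" || missing=$((missing + 1))
        i=$((i + 1))
    done

    # Phase 2: "cd a", then plain "ls".
    cd "$dir" || exit 1
    i=0
    while [ "$i" -lt "$runs" ]; do
        ls | grep -qxF -- "$name" || missing=$((missing + 1))
        i=$((i + 1))
    done

    echo "$missing"
)
```

For the volume in this report the call would be count_missing /cluster2/test/a 1 20, run after the replace-brick in step 6. A non-zero result corresponds to the symptom described above.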

Comment 5 Anand Avati 2014-11-13 13:06:26 UTC

REVIEW: http://review.gluster.org/9118 (ec: Avoid self-heal on directories on (f)stat calls) posted (#1) for review on release-3.6 by Xavier Hernandez (xhernandez)

Comment 6 Anand Avati 2014-11-15 18:01:36 UTC

COMMIT: http://review.gluster.org/9118 committed in release-3.6 by Vijay Bellur (vbellur) 
------
commit b01660c5d7cf4a59a85a8edc3c816e4585aa211b
Author: Xavier Hernandez <xhernandez>
Date:   Thu Nov 13 13:55:36 2014 +0100

    ec: Avoid self-heal on directories on (f)stat calls
    
    To avoid inconsistent directory listings, a full self-heal
    cannot happen on a directory until all its contents have
    been healed. This is controlled by a manual command using
    getfattr recursively and in post-order.
    
    While navigating the directories, sometimes an (f)stat fop
    can be sent. This fop caused a full self-heal of the directory.
    
    This patch makes that (f)stat only initiates a partial self-heal.
    
    This is a backport of http://review.gluster.org/9117/
    
    Change-Id: I0a92bda8f4f9e43c1acbceab2d7926944a8a4d9a
    BUG: 1159498
    Signed-off-by: Xavier Hernandez <xhernandez>
    Reviewed-on: http://review.gluster.org/9118
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Dan Lambright <dlambrig>
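
The commit message refers to a manual heal that runs getfattr recursively and in post-order. A sketch of such a traversal is shown below. The xattr name trusted.ec.heal is an assumption (the commit message does not name it), and the GETFATTR and HEAL_XATTR variables are hypothetical overrides added only so the traversal order can be checked without a gluster mount.

```shell
#!/bin/sh
# heal_post_order ROOT
# Walks ROOT depth-first (find -depth), so every entry inside a directory
# is visited before the directory itself, and reads one xattr per entry.
# Assumption: reading trusted.ec.heal on the mount triggers the heal.
HEAL_XATTR=${HEAL_XATTR:-trusted.ec.heal}

heal_post_order() {
    # -h: do not follow symlinks; -n: request a single named attribute.
    find "$1" -depth -exec ${GETFATTR:-getfattr} -h -n "$HEAL_XATTR" {} \;
}
```

On the volume from this report the call would be heal_post_order /cluster2/test. Setting GETFATTR=echo beforehand prints the visiting order instead of querying the attribute.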