This bug has been seen a couple of times in the wild.

Scenario #1:

A pure-distribute setup with 6 servers. One of the server machines goes down and another machine assumes its responsibility. It starts its own GlusterFS server process and starts exporting the same LUNs that the now-dead server was exporting. The client starts seeing LOOKUP / => ESTALE.

Scenario #2:

A 4-server distribute+replicate setup. One of the servers is shut down, its disk taken out and replaced with a blank one. GlusterFS is started again, and self-heal is triggered from the client. The client starts seeing LOOKUP / => ESTALE.
I happened to reproduce this inadvertently myself. The client volume file had a mistake: distribute's two subvolumes were identical (2 subvolumes in total). Mounting and remounting multiple times still resulted in the ESTALE error.
(In reply to comment #0)

GlusterFS gets a different inode number from the new disk. In the lookup callback, the new inode number is checked against the cached one; if they do not match, errno is set to ESTALE. So before replacing a machine or a disk, you should flush the cache.
Sorry, it's not the inode number; it is st_dev that changed, and client_lookup_cbk checks it.
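To make that concrete, here is a minimal sketch (not the actual GlusterFS client code) of the kind of check described above: the st_dev/st_ino identity cached for an inode is compared against the values returned by a fresh lookup, and a mismatch is mapped to ESTALE. The struct and function names (cached_inode, lookup_cbk_check) are hypothetical.

/*
 * Sketch of a lookup-callback identity check, assuming the client
 * caches the st_dev/st_ino pair it saw when the inode was first
 * looked up.
 */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

struct cached_inode {
    uint64_t cached_ino;   /* st_ino recorded at the first lookup */
    uint64_t cached_dev;   /* st_dev recorded at the same time */
};

/* Returns 0 on success, -ESTALE if the backend identity changed. */
static int lookup_cbk_check(const struct cached_inode *cache,
                            const struct stat *fresh)
{
    /* A replaced disk (or a different server exporting "the same"
     * path) reports a new st_dev, so the cached identity no longer
     * matches and the lookup is treated as stale. */
    if (cache->cached_dev != (uint64_t) fresh->st_dev ||
        cache->cached_ino != (uint64_t) fresh->st_ino)
        return -ESTALE;
    return 0;
}

int main(void)
{
    struct cached_inode cache = { .cached_ino = 1, .cached_dev = 2049 };
    struct stat fresh = { .st_ino = 1, .st_dev = 2050 }; /* new disk */

    int ret = lookup_cbk_check(&cache, &fresh);
    printf("lookup result: %d (%s)\n", ret,
           ret == -ESTALE ? "ESTALE" : "ok");
    return 0;
}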
The gfid changes invalidate this bug.