Bug 762672 (GLUSTER-940) - ESTALE on / when one of the servers is restarted
Summary: ESTALE on / when one of the servers is restarted
Keywords:
Status: CLOSED NOTABUG
Alias: GLUSTER-940
Product: GlusterFS
Classification: Community
Component: unclassified
Version: 3.0.4
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Assignee: shishir gowda
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-05-21 22:10 UTC by Vikas Gorur
Modified: 2015-12-01 16:45 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Vikas Gorur 2010-05-21 22:10:44 UTC
This bug has been seen a couple of times in the wild.

Scenario #1:

A pure-distribute setup with 6 servers. One of the server machines goes down and another machine takes over its role: it starts its own GlusterFS server process and begins exporting the same LUNs that the now-dead server was exporting. The client starts seeing LOOKUP / => ESTALE.

Scenario #2:

A 4-server distribute+replicate setup. One of the servers is shut down, and its disk is taken out and replaced with a blank one. GlusterFS is started again and self-heal is triggered from the client. The client starts seeing LOOKUP / => ESTALE.

Comment 1 Vikas Gorur 2010-06-11 18:16:15 UTC
I happened to reproduce this inadvertently myself. The client volume file had a mistake: distribute's two subvolumes were identical (two subvolumes in total). Mounting and remounting multiple times still led to the ESTALE error.
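For illustration only, a client volume file with that kind of mistake would look roughly like the sketch below. The hostnames and volume names are hypothetical; the point is that both protocol/client subvolumes end up pointing at the same export, so distribute is handed the same backend twice.

    volume remote1
      type protocol/client
      option transport-type tcp
      option remote-host server1.example.com
      option remote-subvolume brick
    end-volume

    # Mistake: remote2 points at the same server and the same export as remote1
    volume remote2
      type protocol/client
      option transport-type tcp
      option remote-host server1.example.com
      option remote-subvolume brick
    end-volume

    volume dht
      type cluster/distribute
      subvolumes remote1 remote2
    end-volume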

Comment 2 Niu Zhenguo 2010-08-30 04:31:35 UTC
(In reply to comment #0)
GlusterFS gets a different inode number from the new disk. In the lookup callback the new inode number should be checked against the cached one; if they do not match, errno is set to ESTALE. So if you want to replace a machine or a disk, you should flush the cache before doing so.

Comment 3 Niu Zhenguo 2010-08-31 00:47:13 UTC
Sorry, it's not the inode number: it is st_dev that changed, and client_lookup_cbk checks it.
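A minimal sketch of the kind of identity check being described, assuming the client caches the device/inode pair from the first lookup. This is not the actual client_lookup_cbk from the GlusterFS 3.0.x protocol/client translator; the struct and helper names are illustrative.

    /* Illustrative sketch only -- not the real client_lookup_cbk. */
    #include <errno.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    struct cached_inode {
            dev_t dev;   /* device id remembered from the first lookup */
            ino_t ino;   /* inode number remembered from the first lookup */
    };

    /* Compare the freshly returned stat against the cached identity.
     * A replaced disk (or a different backend export) reports a
     * different st_dev, so the check fails and the lookup is failed
     * with ESTALE, which is what the client sees on '/'. */
    static int
    lookup_identity_check (const struct cached_inode *cached,
                           const struct stat *fresh)
    {
            if (cached->dev != fresh->st_dev || cached->ino != fresh->st_ino) {
                    errno = ESTALE;
                    return -1;
            }
            return 0;
    }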

Comment 4 shishir gowda 2010-10-05 09:22:43 UTC
The gfid changes invalidate this bug.

