Bug 762249 (GLUSTER-517) - Self-heal is not triggered when node comes back up under the NFS mount
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-517
Product: GlusterFS
Classification: Community
Component: booster
Version: mainline
Hardware: All
OS: Linux
Priority: medium
Severity: low
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-01-04 11:53 UTC by Shehjar Tikoo
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Shehjar Tikoo 2010-01-04 08:55:54 UTC
The absence of self-heal is observed only on unfs3-booster exports, not on FUSE-based exports.

The problem was first seen on the storage platform with a fully loaded configuration, but
it is also reproducible with a simple setup of 2 backends replicated over protocol/clients and exported through unfs3-booster.

Comment 1 Shehjar Tikoo 2010-01-04 09:26:56 UTC
Here is a probable explanation for self-heal not happening.

When the touch, i.e. the create operation, returns, a file handle is returned to the
NFS client. At that point, the second node is down. Once the second node comes back up, an ls -lR is done on the NFS mount point.

On this ls -lR, the NFS client already holds the file handle for the newly created file, so it does a GETATTR on that file handle. At unfsd, the file handle is translated into a path because it is already in the fh-cache. This is followed by a stat on the file itself, and not necessarily on the directory in which the file was created. Since a stat can be served even with one node down, the ls -lR succeeds.
In the absence of a stat on the directory, self-heal does not get triggered.
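
The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GlusterFS or unfs3 code: the classes Replica, ToyReplicatedVolume and ToyNfsServer, and the methods stat_file, stat_dir and getattr, are all hypothetical names invented for illustration. The model assumes, as the explanation does, that a stat on a file can be served by a single live backend and that self-heal only runs when the directory is stat-ed.

```python
class Replica:
    """Hypothetical backend: a set of file paths plus an up/down flag."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.files = set()


class ToyReplicatedVolume:
    """Toy two-way mirror (assumption: heals only from stat_dir)."""

    def __init__(self, replicas):
        self.replicas = replicas

    def create(self, path):
        # The create lands only on backends that are currently up.
        for replica in self.replicas:
            if replica.up:
                replica.files.add(path)

    def stat_file(self, path):
        # Any live backend holding the file can answer; nothing is healed.
        return any(r.up and path in r.files for r in self.replicas)

    def stat_dir(self):
        # A stat on the directory compares live backends and copies
        # missing entries, which stands in for self-heal in this model.
        live = [r for r in self.replicas if r.up]
        union = set().union(*(r.files for r in live))
        for replica in live:
            replica.files |= union
        return sorted(union)


class ToyNfsServer:
    """Toy stand-in for unfsd with an fh-cache (handle -> path)."""

    def __init__(self, volume):
        self.volume = volume
        self.fh_cache = {}

    def create(self, path):
        self.volume.create(path)
        fh = len(self.fh_cache) + 1
        self.fh_cache[fh] = path
        return fh

    def getattr(self, fh):
        # fh-cache hit: resolve the handle to a path and stat the file
        # directly, without touching the parent directory.
        return self.volume.stat_file(self.fh_cache[fh])


def demo():
    first, second = Replica("first"), Replica("second")
    server = ToyNfsServer(ToyReplicatedVolume([first, second]))

    second.up = False                    # second node goes down
    fh = server.create("/dir/newfile")   # touch on the NFS mount
    second.up = True                     # second node comes back up

    getattr_ok = server.getattr(fh)      # what ls -lR ends up doing
    healed_after_getattr = "/dir/newfile" in second.files

    server.volume.stat_dir()             # the step that never happens
    healed_after_dir_stat = "/dir/newfile" in second.files
    return getattr_ok, healed_after_getattr, healed_after_dir_stat
```

In this model demo() shows the GETATTR succeeding while the second backend still lacks the file, and the file appearing there only after an explicit stat on the directory.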

Comment 2 Shehjar Tikoo 2010-01-04 11:53:44 UTC
Reported by davide.damico:

Hi, I'm following gluster development for a long time and I think it's a great project.
The gluster storage is amazing and today I was trying it to understand if it fills my needs.
I created a mirrored volume and I mounted the share using nfs protocol on a freebsd machine.
Everything is fine (except an initial NFS stale handle message) but if I simulate a node-down
detaching the network cable, writing a file and then attaching again the second node, I don't
see the file I wrote during its down period.

Am I missing anything?

Thanks in advance,
d.


==================================================

I can confirm that self-heal does not get triggered on the glusterfsd backend
which was down when the file was touched by the user on the NFS mount-point.

Comment 3 Shehjar Tikoo 2010-02-23 07:06:37 UTC
Closing this bug as there is not much reason to continue using booster with unfsd.

