Bug 1258069 - gNFSd: NFS mount fails with "Remote I/O error"
Product: GlusterFS
Classification: Community
Component: nfs
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Assigned To: Niels de Vos
Keywords: Patch, Triaged
Depends On: 1258196
Blocks: glusterfs-3.6.6
Reported: 2015-08-28 16:50 EDT by rwareing
Modified: 2015-12-01 11:45 EST (History)
CC: 4 users

See Also:
Fixed In Version: glusterfs-3.6.6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1258196
Last Closed: 2015-09-30 08:15:13 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
Repro & patch for bug 1258069. (2.58 KB, patch)
2015-08-28 16:57 EDT, rwareing
New patch - refactored + correct error handling (3.15 KB, patch)
2015-08-28 18:37 EDT, rwareing

Description rwareing 2015-08-28 16:50:26 EDT
Description of problem:
gNFSd returns "Remote I/O error" for mounts of directories that have changed out-of-band (OOB) from the target gNFSd's point of view (say, via a FUSE mount). Internally this happens because ESTALE (op_errno == 116) is returned to mnt3_resolve_subdir_cbk, which causes the code path to unwind with an error. Per the AFR2 code comments, the correct behavior is for gNFSd to purge the inode from its inode table and do a fresh lookup on the inode.

Why does mnt3_resolve_subdir_cbk get ESTALE in the first place? Because the LOOKUP request is actually sent to the bricks by gfid rather than by full path; this optimization kicks in whenever the path is found in the gNFSd inode table with a matching GFID. That behavior is not incorrect in itself, but it is the root cause of the ESTALE.

Version-Release number of selected component (if applicable):
v3.6.x (verified); probably 3.7.x as well, but unverified.

How reproducible:
100%, see prove test.

Steps to Reproduce:
See prove test.

Actual results:
Mount returns with "Remote I/O error"

Expected results:
The mount should succeed.

Additional info:
See attached prove test and patch which resolves the bug.
Comment 2 rwareing 2015-08-28 16:57:54 EDT
Created attachment 1068138 [details]
Repro & patch for bug 1258069.

The patch is based on FB GlusterFS v3.6.3, so it might not apply exactly, but patching mnt3_resolve_subdir_cbk in mount3.c per this patch should do the trick.
Comment 3 rwareing 2015-08-28 18:37:58 EDT
Created attachment 1068177 [details]
New patch - refactored + correct error handling
Comment 4 Niels de Vos 2015-08-30 03:55:37 EDT
Thanks for the patch! I've posted for review: http://review.gluster.org/12045
Comment 5 Anand Avati 2015-08-30 14:59:34 EDT
REVIEW: http://review.gluster.org/12045 (nfs: Fixes "Remote I/O error" mount failures) posted (#2) for review on release-3.6 by Niels de Vos (ndevos@redhat.com)
Comment 6 Raghavendra Bhat 2015-09-30 08:15:13 EDT
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.6, please open a new bug report.

glusterfs-3.6.6 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/gluster-devel/2015-September/046821.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
