Bug 1115949 - [USS]: "ls -l" on .snaps directory from nfs mount gives "Remote I/O error"
Summary: [USS]: "ls -l" on .snaps directory from nfs mount gives "Remote I/O error"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: unclassified
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Raghavendra Bhat
QA Contact:
URL:
Whiteboard:
Depends On: 1115899
Blocks:
 
Reported: 2014-07-03 11:43 UTC by Raghavendra Bhat
Modified: 2014-11-11 08:36 UTC
CC List: 4 users

Fixed In Version: glusterfs-3.6.0beta1
Clone Of: 1115899
Environment:
Last Closed: 2014-11-11 08:36:32 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2014-07-03 11:43:28 UTC
+++ This bug was initially created as a clone of Bug #1115899 +++

Description of problem:
==========================
when "ls -l " is executed on the ".snaps" directory on a distribute-replicate volume when "features.uss" is  "enabled", the command execution outputs  " Remote I/O error "

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.6.0.22 built on Jun 23 2014 10:33:04

How reproducible:
====================
Often


Steps to Reproduce:
===================
1. Create a distribute-replicate volume and start it.

2. Create an NFS mount and create a few files/directories.

3. Create a snapshot.

4. Set the "features.uss" option to "enable".

5. From the NFS mount, cd into the ".snaps" directory and run "ls -l" (see the command sketch after these steps).
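
For concreteness, a minimal command sequence along these lines should reproduce the issue. Volume, server, brick, and mount-point names are illustrative; snapshots additionally require bricks on thinly provisioned LVM:

# gluster volume create testvol replica 2 server1:/bricks/b1 server2:/bricks/b1 server1:/bricks/b2 server2:/bricks/b2
# gluster volume start testvol
# mount -t nfs -o vers=3 server1:/testvol /mnt/testvol
# touch /mnt/testvol/file{1..10}; mkdir /mnt/testvol/dir{1..5}
# gluster snapshot create snap0 testvol
# gluster snapshot activate snap0    (if snapshots are not activated on create)
# gluster volume set testvol features.uss enable
# cd /mnt/testvol/.snaps
# ls -l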

Actual results:
===================
root@dj [Jul-02-2014- 4:43:01] >ls -l
ls: snap0: Remote I/O error
ls: snap1: Remote I/O error
total 48
drwxr-xr-x. 53 root root 16384 Jul  2  2014 snap0
drwxr-xr-x. 78 root root 32768 Jul  2  2014 snap1
root@dj [Jul-02-2014- 4:43:02] >

Expected results:
====================
There shouldn't be any error.

Comment 1 Anand Avati 2014-07-03 12:52:33 UTC
REVIEW: http://review.gluster.org/8230 (gfapi: for listxattrs do not do xattr processing) posted (#1) for review on master by Raghavendra Bhat (raghavendra)

Comment 2 Anand Avati 2014-07-04 09:21:20 UTC
REVIEW: http://review.gluster.org/8230 (snapview-server: serve listxattr requests also) posted (#2) for review on master by Raghavendra Bhat (raghavendra)

Comment 3 Anand Avati 2014-07-04 09:41:55 UTC
REVIEW: http://review.gluster.org/8230 (snapview-server: serve listxattr requests also) posted (#3) for review on master by Raghavendra Bhat (raghavendra)

Comment 4 Anand Avati 2014-07-08 06:51:51 UTC
REVIEW: http://review.gluster.org/8230 (make snapview-server more compatible with NFS server) posted (#4) for review on master by Raghavendra Bhat (raghavendra)

Comment 5 Anand Avati 2014-07-09 11:32:48 UTC
REVIEW: http://review.gluster.org/8230 (make snapview-server more compatible with NFS server) posted (#5) for review on master by Raghavendra Bhat (raghavendra)

Comment 6 Anand Avati 2014-07-16 09:28:08 UTC
COMMIT: http://review.gluster.org/8230 committed in master by Vijay Bellur (vbellur) 
------
commit 1dea949cb60c3814c9206df6ba8dddec8d471a94
Author: Raghavendra Bhat <raghavendra>
Date:   Thu Jul 3 17:13:38 2014 +0530

    make snapview-server more compatible with NFS server
    
    * There was no handle-based API for listxattr. With this change,
      glfs_h_getxattrs also handles the listxattr functionality by checking
      whether the name is NULL or not (like posix). All the gfapi functions
      for listxattr (glfs_h_getxattrs, glfs_listxattr and glfs_flistxattr)
      return the names of the xattrs in a buffer provided by the caller,
      whereas snapview-server has to return the list of xattrs in a dict
      (similar to the posix xlator). Since the buffer contains only the names
      of the xattrs, a zero-byte value (i.e. "") is set into the dict for each
      xattr and sent back. Translators which cache xattrs (as of now md-cache,
      which caches selinux- and acl-related xattrs) should not cache xattrs
      whose value is zero-byte data (""), so md-cache was changed to ignore
      zero-byte values.
    
    * The NFS server was not linking inodes into the inode table in readdirp,
      which led to applications getting errors. The below sequence of
      operations would lead to an application error:
      1) "ls -l" in one of the snapshots (snapview-server would generate gfids
         for each entry on the fly and link the inodes associated with those
         entries).
      2) The NFS server, upon getting the readdirp reply, would not link the
         inodes of the entries, but it would generate a filehandle for each
         entry, associate the gfid of that entry with the filehandle, and send
         it as part of the reply to the NFS client.
      3) The NFS client would send the filehandle of one of those entries when
         some activity is done on it.
      4) The NFS server would not be able to find the inode for the gfid
         present in the filehandle (as the inode was not linked) and would go
         for hard resolution by sending a lookup on the gfid with a new inode.
      5) snapview-client would not be able to identify whether the inode is a
         real inode existing in the main volume or a virtual inode existing in
         the snapshots, as there would not be any inode context.
      6) Since the gfid on which the lookup is sent is a virtual gfid not
         present on disk, the lookup would fail and the application would get
         an error.

      To handle this situation, the NFS server now also does inode linking in
      readdirp.
    
    Change-Id: Ibb191408347b6b5f21cff72319ccee619ea77bcd
    BUG: 1115949
    Signed-off-by: Raghavendra Bhat <raghavendra>
    Reviewed-on: http://review.gluster.org/8230
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Niels de Vos <ndevos>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Vijay Bellur <vbellur>
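
The buffer-to-dict conversion described in this commit message can be sketched roughly as below. This is an illustrative C sketch, not the actual snapview-server code: the helper name fill_xattr_dict is hypothetical, while dict_t and dict_set_str are libglusterfs APIs.

/*
 * Illustrative sketch of the conversion described above: gfapi's
 * listxattr variants return xattr names as NUL-separated strings in a
 * flat buffer, while snapview-server must reply with a dict. Each name
 * is inserted with a zero-byte value (""), which md-cache now knows to
 * ignore. Not the actual GlusterFS code.
 */
#include <string.h>
#include <glusterfs/dict.h>   /* dict_t, dict_set_str; header path may vary */

static int
fill_xattr_dict (dict_t *xattrs, const char *buf, size_t size)
{
        size_t off = 0;

        while (off < size) {
                const char *name = buf + off;

                /* value is a zero-byte string: only the name is known */
                if (dict_set_str (xattrs, (char *)name, "") < 0)
                        return -1;

                off += strlen (name) + 1;   /* skip the name and its NUL */
        }

        return 0;
}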

Comment 7 Anand Avati 2014-07-17 09:53:20 UTC
REVIEW: http://review.gluster.org/8324 (snapview-server: get the handle if its absent before doing any fop) posted (#1) for review on master by Raghavendra Bhat (raghavendra)

Comment 8 Anand Avati 2014-07-17 09:55:17 UTC
REVIEW: http://review.gluster.org/8324 (snapview-server: get the handle if its absent before doing any fop) posted (#2) for review on master by Raghavendra Bhat (raghavendra)

Comment 9 Anand Avati 2014-09-09 04:54:29 UTC
REVIEW: http://review.gluster.org/8324 (snapview-server: get the handle if its absent before doing any fop) posted (#3) for review on master by Raghavendra Bhat (raghavendra)

Comment 10 Anand Avati 2014-09-11 07:20:09 UTC
REVIEW: http://review.gluster.org/8324 (snapview-server: get the handle if its absent before doing any fop) posted (#4) for review on master by Raghavendra Bhat (raghavendra)

Comment 11 Anand Avati 2014-09-12 17:55:10 UTC
COMMIT: http://review.gluster.org/8324 committed in master by Vijay Bellur (vbellur) 
------
commit 5d6f55ed9f122d3aeab583bb0ad16cb0c392a339
Author: Raghavendra Bhat <raghavendra>
Date:   Thu Jul 17 12:15:54 2014 +0530

    snapview-server: get the handle if its absent before doing any fop
    
    * Now that the NFS server does inode linking in readdirp, it can resolve
      the gfid (i.e. find the right inode in its inode table) present in the
      filehandle sent by the NFS client on which a fop came. So instead of
      sending a lookup on that entry, it directly sends the fop. But
      snapview-server does not get the handle for the entries in readdirp
      (doing a lookup on each entry via gfapi would be costly, so it waits
      until a lookup is done on that inode to get the handle and the fs
      instance and fill them into the inode context). So when NFS resolves the
      gfid and directly sends the fop, snapview-server cannot perform it, as
      the inode context does not contain the fs instance and the handle. Fops
      should therefore check for the handle before making gfapi calls; if the
      handle and fs instance are not present in the inode context, they should
      obtain them by doing an explicit lookup on the entry.
    
    Change-Id: Idd648fbcc3ff6aadc3b63ff236561ca967b92f5d
    BUG: 1115949
    Signed-off-by: Raghavendra Bhat <raghavendra>
    Reviewed-on: http://review.gluster.org/8324
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
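
The check-before-fop pattern described in this commit message can be sketched as follows. This is a hypothetical illustration: svs_inode_t and svs_get_handle stand in for snapview-server's inode-context type and lookup helper, and the exact signatures in the actual patch may differ.

/*
 * Hypothetical illustration of the check described above: before a fop
 * touches gfapi, make sure the inode context already holds the glfs
 * instance and object handle; if not, fetch them with an explicit
 * lookup.
 */
static int
svs_ensure_handle (xlator_t *this, inode_t *inode, svs_inode_t *ctx)
{
        /* fast path: an earlier lookup already filled the context */
        if (ctx->fs && ctx->object)
                return 0;

        /* slow path: NFS resolved the gfid itself (readdirp linked the
         * inode), so no lookup reached snapview-server yet; do one now
         * to obtain the fs instance and handle. */
        return svs_get_handle (this, inode, ctx);
}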

Comment 12 Niels de Vos 2014-09-22 12:44:24 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release resolves this bug report for you. If the glusterfs-3.6.0beta1 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 13 Niels de Vos 2014-11-11 08:36:32 UTC
This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

