Bug 1244100

Summary: using fop's dict for resolving causes problems
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: protocol Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 3.7.2 CC: bugs, gluster-bugs, rgowdapp
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.7.3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1244613 (view as bug list) Environment:
Last Closed: 2015-07-30 09:50:50 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1244613, 1252348, 1255669    

Description Raghavendra Bhat 2015-07-17 06:34:42 UTC
Description of problem:

protocol/server resolves the inodes involved in a fop (the parent, the entry, or both, depending on the fop) before continuing with the fop. If it cannot find the inode for a gfid in the inode table (soft resolve), it sends a lookup on that gfid (hard resolve) to build the inode into the inode table. For the lookup sent as part of resolve, it uses the same xdata as the fop. This causes problems in the situation below.

Suppose the lru limit has been reached and some inodes have been purged because of it. If all the inodes for the dentries present in a directory are purged, nothing holds a ref on the parent directory's inode any more. It is then moved from the active list to the lru list, and may be purged as well if nothing takes a ref on it (that is, no new entries are created in it and its existing entries are not looked up).
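
The purge sequence described above can be sketched with a toy model. This is only an illustration, not the libglusterfs inode table: the class ToyInodeTable, its methods, and the gfid strings are all hypothetical, and the only behaviour taken from this report is that an entry's inode pins its parent and that unreferenced inodes beyond the lru limit are purged.

```python
from collections import OrderedDict


class ToyInodeTable:
    """Hypothetical model: an inode in the table holds one ref on its parent."""

    def __init__(self, lru_limit):
        self.lru_limit = lru_limit
        self.refs = {}            # gfid -> ref count, for every inode in the table
        self.parent = {}          # gfid -> parent gfid (or None)
        self.lru = OrderedDict()  # gfids with zero refs, oldest first

    def link(self, gfid, parent=None):
        """Add an inode with no refs of its own; it pins its parent."""
        self.refs[gfid] = 0
        self.parent[gfid] = parent
        if parent is not None:
            self.ref(parent)
        self.lru[gfid] = None
        self._prune()

    def ref(self, gfid):
        self.refs[gfid] += 1
        self.lru.pop(gfid, None)  # a referenced inode leaves the lru list

    def unref(self, gfid):
        self.refs[gfid] -= 1
        if self.refs[gfid] == 0:
            self.lru[gfid] = None  # last ref dropped: move to the lru list
            self._prune()

    def _prune(self):
        while len(self.lru) > self.lru_limit:
            victim, _ = self.lru.popitem(last=False)
            del self.refs[victim]
            parent = self.parent.pop(victim)
            if parent is not None:
                self.unref(parent)  # purging an entry releases its parent

    def find(self, gfid):
        return gfid in self.refs


table = ToyInodeTable(lru_limit=1)
table.link("dir")
table.link("file", parent="dir")    # "dir" is now pinned by "file"
table.link("other-1")               # purges "file", so "dir" drops to the lru list
dir_after_first_purge = table.find("dir")
table.link("other-2")               # purges "dir" itself
dir_after_second_purge = table.find("dir")
```

In this model the directory survives the purge of its only entry, but it is then unreferenced and is purged by the next unrelated inode that enters the table.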

Now if a create operation comes within that directory, protocol/server cannot find the inode for the parent directory's gfid and sends a lookup on it. Any xlator that wants extended attributes returned as part of lookup adds its xattr names to the xdata, and the xdata used here is the create fop's own xdata (for example, bit-rot-stub adds the version, sign and bad-object keys to the dict). The lookup succeeds, and the create then proceeds with the extra xattr names in its xdata, so posix sets those xattrs as part of create. Because the bad-object key is also present, the newly created object is treated as a bad object for the rest of its life, and no i/o is allowed on it.
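
A minimal sketch of the failure, assuming plain Python dicts in place of the xdata dict. The function names are hypothetical stand-ins for the xlators involved, and the key names "version", "sign" and "bad-object" are placeholders taken from the wording of this report, not the real xattr names.

```python
def lookup_through_stub(xdata):
    """Stand-in for an xlator such as bit-rot-stub: on lookup it asks for
    its xattrs by adding their names to the dict it was handed."""
    for key in ("version", "sign", "bad-object"):
        xdata.setdefault(key, "")


def posix_create(xdata):
    """Stand-in for posix create: every key in xdata is set as an xattr."""
    return dict(xdata)


def server_create_shared_dict(parent_in_table, fop_xdata):
    """Resolve the parent, then create, using the fop's dict for both."""
    if not parent_in_table:
        # Hard resolve: the lookup is sent with the create fop's own xdata.
        lookup_through_stub(fop_xdata)
    return posix_create(fop_xdata)


xattrs_parent_cached = server_create_shared_dict(True, {})
xattrs_parent_purged = server_create_shared_dict(False, {})
```

With the parent inode still in the table the new object gets no extra xattrs. With the parent purged, the lookup's keys leak into the create and the object is born with the bad-object key set.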

How reproducible:
Easily, if inode-lru-limit is set to 1.

Additional info:

http://review.gluster.org/11661 has been sent to master and accepted.

Comment 1 Anand Avati 2015-07-17 06:37:49 UTC
REVIEW: http://review.gluster.org/11703 (protocol/server: use different dict for resolving) posted (#1) for review on release-3.7 by Raghavendra Bhat (raghavendra)

Comment 2 Anand Avati 2015-07-23 06:46:22 UTC
COMMIT: http://review.gluster.org/11703 committed in release-3.7 by Raghavendra G (rgowdapp) 
------
commit 960b99577bbef18add4087599faffa43f09c1dd6
Author: Raghavendra Bhat <raghavendra>
Date:   Tue Jul 14 16:16:00 2015 +0530

    protocol/server: use different dict for resolving
    
    Backport of http://review.gluster.org/11661
    
    protocol/server has to resolve the inode before continuing with any fop coming
    from the clients. For resolving it, server xlator was using the same dict
    associated with the fop. It causes problems in some situations.
    
    If a directory's inode was forgotten because of lru limit being exceeded, then
    when a create fop comes for an entry within that directory, server tries to
    resolve it. But since the parent directory's inode is not found in the inode
    table, it tries to do a hard resolve by doing a lookup on the parent gfid.
    
    If any xlator below server wants to get some extended attributes whenever
    lookup comes, then they set the new keys in the same dict that came along with
    the create fop. Now, the lookup of the parent succeeds and the create fop
    proceeds with the same dict (with extra keys present). posix xlator creates
    those xattrs that are present in the dict. Thus the xattrs which were not to
    be present by default are also set as part of create. (Ex: bit-rot related
    xattrs such as bad-file, version and sign xattrs)
    
    Change-Id: I62b0b012b0af3c92df6fced61f87dd0b6b015d4c
    BUG: 1244100
    Signed-off-by: Raghavendra Bhat <raghavendra>
    Reviewed-on: http://review.gluster.org/11703
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
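
The idea behind the commit can be sketched as follows, again with plain Python dicts and hypothetical function names rather than the actual protocol/server code. The sketch assumes the resolve lookup starts from an empty dict; this report does not say whether the real patch uses an empty dict or a copy of the fop's dict. The key names are placeholders and "user.example" is an invented client-supplied key.

```python
def lookup_through_stub(xdata):
    """Stand-in for an xlator that adds its xattr names to xdata on lookup."""
    for key in ("version", "sign", "bad-object"):
        xdata.setdefault(key, "")


def posix_create(xdata):
    """Stand-in for posix create: every key in xdata is set as an xattr."""
    return dict(xdata)


def server_create_separate_dict(parent_in_table, fop_xdata):
    """Resolve the parent with a dict of its own, then create with the fop's dict."""
    if not parent_in_table:
        resolve_xdata = {}                  # assumption: a fresh dict for resolve
        lookup_through_stub(resolve_xdata)  # extra keys stay in the resolve dict
    return posix_create(fop_xdata)


fop_xdata = {"user.example": "1"}
created_xattrs = server_create_separate_dict(False, fop_xdata)
```

Because the lookup never sees the fop's dict, the create sets only the keys the client sent, whether or not a hard resolve was needed.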

Comment 3 Kaushal 2015-07-30 09:50:50 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user