Bug 1336320 - [Tiering]: Unable to access file(s) from nfs client; gfid mismatch between cold and hot tier entries
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: tiering
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact: bugs@gluster.org
URL:
Whiteboard:
Depends On: 1334577
Blocks:
Reported: 2016-05-16 07:45 UTC by Nithya Balachandran
Modified: 2018-08-29 03:35 UTC

Fixed In Version: glusterfs-4.1.3 (or later)
Clone Of: 1334577
Environment:
Last Closed: 2018-08-29 03:35:42 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Nithya Balachandran 2016-05-16 07:50:37 UTC
The issue is as follows:
The dht_linkfile_create_cbk() function does a lookup if the create call fails with EEXIST. The lookup uses the same frame as the create, so the op_ret and op_errno returned by the mknod call are overwritten by those of the lookup. Because the linkfile creation then appears to have succeeded, tier proceeds to create the data file, but it uses the gfid-req value, which does not match the gfid on the linkto file that is already present.
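
The overwrite described above can be illustrated with a toy model. This is only a sketch in Python, not GlusterFS code: the Frame class and every function name below are hypothetical stand-ins, and the "new frame" variant only mirrors the idea named in the review title (use a new frame for the linkfile lookup), not the actual patch.

```python
import errno
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a call frame holding one operation's result."""
    op_ret: int = 0
    op_errno: int = 0


def mknod_linkfile(frame, already_exists):
    """Pretend mknod: fails with EEXIST when the linkto file is already present."""
    if already_exists:
        frame.op_ret, frame.op_errno = -1, errno.EEXIST
    else:
        frame.op_ret, frame.op_errno = 0, 0


def lookup(frame):
    """Pretend lookup: succeeds (the file exists) and records that on the frame."""
    frame.op_ret, frame.op_errno = 0, 0


def create_linkfile_shared_frame(already_exists):
    """Buggy pattern: the lookup reuses the frame of the create."""
    frame = Frame()
    mknod_linkfile(frame, already_exists)
    if frame.op_ret == -1 and frame.op_errno == errno.EEXIST:
        lookup(frame)  # overwrites the mknod result stored on the frame
    return frame.op_ret, frame.op_errno


def create_linkfile_new_frame(already_exists):
    """Sketch of the alternative: the lookup gets a frame of its own."""
    frame = Frame()
    mknod_linkfile(frame, already_exists)
    if frame.op_ret == -1 and frame.op_errno == errno.EEXIST:
        lookup_frame = Frame()
        lookup(lookup_frame)  # the mknod result on `frame` is left untouched
    return frame.op_ret, frame.op_errno


if __name__ == "__main__":
    print("shared frame:", create_linkfile_shared_frame(already_exists=True))
    print("new frame:   ", create_linkfile_new_frame(already_exists=True))
```

In this model the shared-frame variant reports success even though mknod failed with EEXIST, which is the condition under which the caller would go on to create the data file with a fresh gfid-req.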

Comment 2 Nithya Balachandran 2016-05-16 07:53:43 UTC
RCA:

I was able to hit this with gdb using the following steps:

1. Mount the tier volume using gluster-nfs
2. Attach gdb to the nfs process and set a breakpoint at tier_create_linkfile_create_cbk
3. On the mount point, create a file using "touch f3"
4. Once the breakpoint is hit, run 'gluster v start <volname> force'. This will restart the NFS server. The create call is thus aborted after the linkto file is created but before the data file is.
5. Once the touch command returns, check the gfids of the linkfile and the data file on the brick. They will be different.


[root@nb-rhs3-srv1 bricks]# getx brick2/hot-*/f3
# file: brick2/hot-3/f3
trusted.gfid=0x4fabbe60d86b402cb064356db35cf798

[root@nb-rhs3-srv1 bricks]# getx brick1/gs1-*/f3
# file: brick1/gs1-3/f3
trusted.gfid=0xbf1f836f07174259b6332759cc58e867
trusted.tier.tier-dht.linkto=0x6773312d686f742d64687400
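
To make the xattr values above easier to read, the hex strings can be decoded. The following is a minimal Python sketch, assuming the gfid values are 16-byte UUIDs and the linkto value is a NUL-terminated ASCII string; the helper names decode_gfid and decode_linkto are made up for illustration and are not part of any Gluster tool.

```python
import uuid


def _strip_0x(hex_value):
    """Drop the leading '0x' that hex-encoded xattr dumps carry."""
    return hex_value[2:] if hex_value.lower().startswith("0x") else hex_value


def decode_gfid(hex_value):
    """Interpret a trusted.gfid hex dump as a UUID (assumed 16 bytes)."""
    return uuid.UUID(hex=_strip_0x(hex_value))


def decode_linkto(hex_value):
    """Interpret a linkto hex dump as ASCII, dropping the trailing NUL."""
    raw = bytes.fromhex(_strip_0x(hex_value))
    return raw.rstrip(b"\x00").decode("ascii")


# Values copied from the brick output above.
hot_gfid = decode_gfid("0x4fabbe60d86b402cb064356db35cf798")
cold_gfid = decode_gfid("0xbf1f836f07174259b6332759cc58e867")
linkto = decode_linkto("0x6773312d686f742d64687400")

if __name__ == "__main__":
    print("hot tier gfid: ", hot_gfid)
    print("cold tier gfid:", cold_gfid)
    print("gfids match:   ", hot_gfid == cold_gfid)
    print("linkto target: ", linkto)
```

Decoded this way, the linkto value reads gs1-hot-dht, and the two gfids differ, which is the mismatch this bug describes.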


It looks like the NFS client sends the create call again without a lookup.

This issue also exists in 3.1.2 (reproducible using the same set of steps). So I am removing the Regression keyword.

--- Additional comment from Nithya Balachandran on 2016-05-12 06:43:20 EDT ---

The same issue exists in dht.

Comment 3 Vijay Bellur 2016-05-16 07:58:02 UTC
REVIEW: http://review.gluster.org/14352 (cluster/dht : Use a new frame for linkfile lookup) posted (#1) for review on master by N Balachandran (nbalacha)

Comment 4 Amar Tumballi 2018-08-29 03:35:42 UTC
This update is done in bulk based on the state of the patch and the time since last activity. If the issue is still seen, please reopen the bug.

