+++ This bug was initially created as a clone of Bug #1312816 +++

Description of problem:
Afr does dict_ref of the xattr_req that comes to it and deletes "gfid-req" key. Dht uses same dict to send lookup to other subvolumes. So in case of directories and more than 1 dht subvolumes, second subvolume till the last subvolume won't get a lookup request with "gfid-req". So gfid reset never happens on the directories in distributed replicate subvolume for 2nd till last subvolumes.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Vijay Bellur on 2016-02-29 05:26:17 EST ---

REVIEW: http://review.gluster.org/13545 (cluster/afr: Don't delete gfid-req from lookup request) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-03-01 03:58:38 EST ---

REVIEW: http://review.gluster.org/13545 (cluster/afr: Don't delete gfid-req from lookup request) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-03-01 21:36:25 EST ---

REVIEW: http://review.gluster.org/13545 (cluster/afr: Don't delete gfid-req from lookup request) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-03-02 03:55:01 EST ---

COMMIT: http://review.gluster.org/13545 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 9b022c3a3f2f774904b5b458ae065425b46cc15d
Author: Pranith Kumar K <pkarampu>
Date:   Sat Feb 27 23:08:06 2016 +0530

    cluster/afr: Don't delete gfid-req from lookup request

    Problem:
    Afr does dict_ref of the xattr_req that comes to it and deletes "gfid-req" key. Dht uses same dict to send lookup to other subvolumes. So in case of directories and more than 1 dht subvolumes, second subvolume till the last subvolume won't get a lookup request with "gfid-req". So gfid reset never happens on the directories in distributed replicate subvolume for 2nd till last subvolumes.

    Fix:
    Make a copy of lookup xattr request.

    Also fixed replies_wipe possibly resetting gfid to NULL gfid

    BUG: 1312816
    Change-Id: Ic16260e5a4664837d069c1dc05b9e96ca05bda88
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/13545
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Krutika Dhananjay <kdhananj>

--- Additional comment from Vijay Bellur on 2016-03-16 12:28:37 EDT ---

REVIEW: http://review.gluster.org/13754 (cluster/afr: Enhance the test to be more robust) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Vijay Bellur on 2016-03-16 12:40:59 EDT ---

REVIEW: http://review.gluster.org/13754 (cluster/afr: Enhance the test to be more robust) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
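
The aliasing problem described in the commit message can be illustrated with a small model. This is only a sketch, not GlusterFS code: a plain Python dict stands in for the refcounted xattr_req dict, and afr_lookup_buggy, afr_lookup_fixed and dht_lookup are hypothetical names made up for this illustration.

```python
import copy

def afr_lookup_buggy(xattr_req):
    """Models taking a reference to the caller's dict and deleting the key in place."""
    local_req = xattr_req              # same object, comparable to dict_ref
    seen = "gfid-req" in local_req     # did this subvolume receive gfid-req?
    local_req.pop("gfid-req", None)    # the deletion is visible to the caller too
    return seen

def afr_lookup_fixed(xattr_req):
    """Models working on a private copy of the lookup xattr request."""
    local_req = copy.copy(xattr_req)   # changes below stay local to this subvolume
    seen = "gfid-req" in local_req
    local_req.pop("gfid-req", None)
    return seen

def dht_lookup(subvol_lookup, n_subvols, xattr_req):
    """Models sending the same request dict to every subvolume in turn."""
    return [subvol_lookup(xattr_req) for _ in range(n_subvols)]

req = {"gfid-req": "hypothetical-gfid"}
print(dht_lookup(afr_lookup_buggy, 3, dict(req)))  # [True, False, False]
print(dht_lookup(afr_lookup_fixed, 3, dict(req)))  # [True, True, True]
```

In the buggy model only the first subvolume sees "gfid-req", which matches the symptom reported here. In the fixed model every subvolume sees it.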
The user-side problem description from support case 01581565 is below. It should help to clarify the impact of the bug.

Issue: After clearing the gfid using a script, we perform a named lookup and expect a new gfid to be created on all the subvolumes. In reality, the gfid gets created on only one subvolume. On the other subvolumes, the gfid is missing.
QATP:
====
1) Create a dist-rep volume and start it.
2) Mount the volume and create a directory on the mount.
3) Check the backend bricks; the dir should be created on all bricks of all subvols.
4) Get the gfid from these backend bricks; the gfid should be the same.
5) From the backend, simultaneously create a new dir on the bricks directly. This means the new dir would not have got any gfid.
6) Do a lookup from the mount.

Expected result: the lookup must have caused a gfid to be assigned on all the bricks in all subvols. Check the backend bricks; all subvols and bricks must have the same gfid (previously only the first subvol got the gfid).

Rerun on x3 and on both fuse and client.
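
The backend check in the expected result could be scripted roughly as follows. This is a sketch under assumptions: read_gfid and check_gfids are hypothetical helper names, the brick paths in the example are invented, and read_gfid assumes a Linux brick host where the trusted.gfid xattr holds the 16-byte GFID (reading trusted.* xattrs normally requires root).

```python
import os
import uuid

def read_gfid(brick_dir_path):
    """Return the directory's GFID as a UUID string, or None if it cannot be read.

    Linux only. Any OSError (xattr not set, path missing, no permission)
    is reported as None.
    """
    try:
        raw = os.getxattr(brick_dir_path, "trusted.gfid")
    except OSError:
        return None
    return str(uuid.UUID(bytes=raw))

def check_gfids(gfid_by_brick):
    """Take a mapping of brick path -> GFID string (or None).

    Returns (consistent, missing): consistent is True only when every brick
    has a GFID and all GFIDs are identical; missing lists bricks without one.
    """
    missing = sorted(b for b, g in gfid_by_brick.items() if g is None)
    distinct = {g for g in gfid_by_brick.values() if g is not None}
    consistent = not missing and len(distinct) == 1
    return consistent, missing

# Hypothetical state before the fix: only the first subvol got the gfid.
before_fix = {
    "/bricks/b1/dir1": "abcd1234-0000-0000-0000-000000000001",
    "/bricks/b2/dir1": "abcd1234-0000-0000-0000-000000000001",
    "/bricks/b3/dir1": None,
    "/bricks/b4/dir1": None,
}
print(check_gfids(before_fix))  # (False, ['/bricks/b3/dir1', '/bricks/b4/dir1'])
```

On a real setup the mapping would be built with something like {p: read_gfid(p) for p in brick_dirs}, run on each brick host after the lookup from the mount.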
We should also check that the softlink with the new gfid is present in .glusterfs/ab/cd/abcd....

Pranith
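
For reference, the handle location can be derived from the gfid: the first two hex characters, then the next two, then the full gfid. A minimal sketch, where gfid_handle_path is a hypothetical helper name and the brick root and gfid in the example are made up:

```python
import posixpath
import uuid

def gfid_handle_path(brick_root, gfid):
    """Build <brick_root>/.glusterfs/<first 2 hex>/<next 2 hex>/<gfid>."""
    g = str(uuid.UUID(gfid))  # normalizes to lowercase, hyphenated form
    return posixpath.join(brick_root, ".glusterfs", g[0:2], g[2:4], g)

# Hypothetical brick root and gfid, for illustration only.
print(gfid_handle_path("/bricks/b1", "abcd1234-0000-0000-0000-000000000001"))
# /bricks/b1/.glusterfs/ab/cd/abcd1234-0000-0000-0000-000000000001
```

On the brick host, os.path.islink() on the returned path would then tell whether the softlink for the directory exists.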
Ran the QATP on x2 and x3 volumes on glusterfs-server-3.7.9-5.el7rhgs.x86_64.
The case passed, and the softlinks are available in .glusterfs.
I also tested with softlinks for the dirs and it worked well.
Hence moving to verified.
Laura,
I don't think users understand gfid-reset. Maybe we should explicitly say that it means 'gfid was cleared from the backend bricks'.
Laura,
Please note the changes between '*':

When a GFID was cleared from *all the backend bricks* of a distributed replicate volume, only the first replica pair received the new GFID. This update ensures all replicas receive new GFIDs.

Pranith
Looks good to me.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240