Description of problem:
Implement directory heal for ec
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#7) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#6) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#9) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#7) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10240 (libglusterfs: Implement cluster-syncop) posted (#10) for review on master by Vijay Bellur (vbellur)
COMMIT: http://review.gluster.org/10240 committed in master by Vijay Bellur (vbellur)
------
commit 557ea3781e984f5f3cf206dd4b8d0a81c8cbdb58
Author: Pranith Kumar K <pkarampu>
Date: Tue Apr 14 13:45:33 2015 +0530

libglusterfs: Implement cluster-syncop

This patch implements a syncop equivalent for a cluster of xlators. The xlators on which the fop needs to be performed are taken as input arguments to the functions, and the responses are gathered and provided as the output. The idea is taken from the afr-v2 self-heal implementation by Avati.

Change-Id: I2b568f4340cf921a65054b8ab0df7edc4478b5ca
BUG: 1213358
Signed-off-by: Pranith Kumar K <pkarampu>
Reviewed-on: http://review.gluster.org/10240
Reviewed-by: Krutika Dhananjay <kdhananj>
Tested-by: NetBSD Build System
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
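The commit message above describes the core cluster-syncop pattern: issue one fop to a chosen set of subvolumes, gather every reply, and report which subvolumes succeeded. Below is a minimal, self-contained sketch of that gather pattern in plain C. All names here (subvol_t, reply_t, fop_on_subvol, cluster_fop) are hypothetical stand-ins, not the real libglusterfs API, and the real implementation winds the fops to all subvolumes in parallel and waits on a syncbarrier rather than looping sequentially.

/*
 * Hypothetical sketch of the cluster-syncop gather pattern:
 * perform a fop on every subvolume whose bit is set in 'on',
 * collect all replies, and mark the successes in 'output'.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef struct { const char *name; int is_up; } subvol_t; /* stand-in for xlator_t */
typedef struct { int op_ret; int op_errno; } reply_t;     /* stand-in for default_args_cbk_t */

/* One fop against one subvolume; returns 0 on success. */
static int fop_on_subvol(subvol_t *s, reply_t *r) {
    if (!s->is_up) {
        r->op_ret = -1;
        r->op_errno = ENOTCONN;
        return -1;
    }
    r->op_ret = 0;
    r->op_errno = 0;
    return 0;
}

/*
 * Run the fop on the selected subvolumes, fill 'replies', set a bit
 * in 'output' for each success, and return the success count --
 * mirroring the "responses are gathered and provided as the output"
 * behaviour the commit message describes.
 */
static int cluster_fop(subvol_t *subvols, const unsigned char *on, int numsubvols,
                       reply_t *replies, unsigned char *output) {
    int success = 0;
    memset(output, 0, numsubvols);
    for (int i = 0; i < numsubvols; i++) {
        if (!on[i])
            continue;
        if (fop_on_subvol(&subvols[i], &replies[i]) == 0) {
            output[i] = 1;
            success++;
        }
    }
    return success;
}

int main(void) {
    subvol_t bricks[3] = { {"A", 1}, {"B", 0}, {"C", 1} };
    unsigned char on[3] = {1, 1, 1}, output[3];
    reply_t replies[3];

    int ok = cluster_fop(bricks, on, 3, replies, output);
    for (int i = 0; i < 3; i++)
        printf("brick %s: %s\n", bricks[i].name, output[i] ? "ok" : "failed");
    printf("%d of 3 succeeded\n", ok);
    return 0;
}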
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata-heal implementation for ec) posted (#8) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata-heal implementation for ec) posted (#9) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata-heal implementation for ec) posted (#10) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata-heal implementation for ec) posted (#11) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#12) for review on master by Pranith Kumar Karampuri (pkarampu)
REVIEW: http://review.gluster.org/10298 (cluster/ec: metadata/name/entry heal implementation for ec) posted (#13) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: http://review.gluster.org/10298 committed in master by Vijay Bellur (vbellur)
------
commit 33fdc310700da74a4142dab48d00c4753100904b
Author: Pranith Kumar K <pkarampu>
Date: Thu Apr 16 09:25:31 2015 +0530

cluster/ec: metadata/name/entry heal implementation for ec

Metadata self-heal:
1) Take an inode lock in domain 'this->name' on the 0-0 range (full file)
2) Perform lookup and get the xattrs on all the bricks
3) Choose the brick with the highest version as the source
4) Setattr uid/gid/permissions
5) Removexattr stale xattrs
6) Setxattr existing/new xattrs
7) Xattrop with -ve values of 'dirty' and, for the version xattr, the difference between the highest version and the brick's own version
8) Unlock the lock acquired in 1)

Entry self-heal:
1) Take a directory lock in domain 'this->name:self-heal' on 'NULL' to prevent more than one self-heal
2) Take a directory lock in domain 'this->name' on 'NULL'
3) Perform lookup on version and dirty and remember the values
4) Unlock the lock acquired in 2)
5) Readdir on all the bricks and trigger name heals
6) Xattrop with -ve values of 'dirty' and, for the version xattr, the difference between the highest version and the brick's own version
7) Unlock the lock acquired in 1)

Name heal:
1) Take a 'name' lock in 'this->name' on 'NULL'
2) Perform lookup on 'name' and get the stat and xattr structures
3) Build a gfid_db where, for each gfid, we know which subvolumes/bricks have a file with 'name'
4) Delete all the stale files, i.e. files that do not exist on more than ec->redundancy bricks
5) On all the subvolumes/bricks with a missing entry, create 'name' with the same type, gfid, permissions, etc.
6) Unlock the lock acquired in 1)

Known limitation: with the present design, the heal conservatively preserves a 'name' when it cannot decide whether to delete it. This can happen in the following scenario:
1) We have a 3=2+1 (bricks: A, B, C) ec volume and one brick is down (let's say A)
2) rename d1/f1 -> d2/f2 is performed, but the rename succeeds on only one of the bricks (let's say B)
3) Name self-heal on d1 and d2 would then re-create the file in both directories, resulting in d1/f1 and d2/f2.

Because we wanted to prevent data loss in the case above, the following scenario is not healable, i.e. it needs manual intervention:
1) We have a 3=2+1 (bricks: A, B, C) ec volume and one brick is down (let's say A)
2) We have two hard links, d1/a and d2/b, and another file d3/c, all created before the brick went down
3) rename d3/c -> d2/b is performed
4) Name self-heal on d2/b does not heal, because d2/b with the older gfid will not be deleted.

One could ask why not delete the link when there is more than one hard link, but that leads to a data-loss issue similar to the one described earlier:
1) We have a 3=2+1 (bricks: A, B, C) ec volume and one brick is down (let's say A)
2) We have two hard links: d1/a and d2/b
3) rename d1/a -> d3/c and d2/b -> d4/d are performed, and both operations succeed on only one of the bricks (let's say B)
4) Name self-heals on the 'names' above can run in parallel, and each can decide to delete the file because it sees two links; after all the self-heals perform their unlinks we are left with data loss.

Change-Id: I3a68218a47bb726bd684604efea63cf11cfd11be
BUG: 1213358
Signed-off-by: Pranith Kumar K <pkarampu>
Reviewed-on: http://review.gluster.org/10298
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Vijay Bellur <vbellur>
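To make steps 3 and 7 of the metadata self-heal sequence above concrete, here is a small sketch: pick the brick with the highest version xattr as the source, then compute the xattrop deltas -- the negative of each brick's 'dirty' count and the difference between the source version and the brick's own version -- so that every brick converges on the source's version with a zero dirty count. The version and dirty values here are fabricated sample numbers; the real ec code reads them from the trusted.ec.version and trusted.ec.dirty xattrs on each brick.

/*
 * Sketch of source selection (step 3) and the xattrop deltas (step 7)
 * from the metadata self-heal description above, with made-up values.
 */
#include <stdint.h>
#include <stdio.h>

#define NBRICKS 3

int main(void) {
    /* Hypothetical per-brick metadata versions and dirty counters. */
    uint64_t version[NBRICKS] = {7, 5, 7};
    uint64_t dirty[NBRICKS]   = {1, 0, 1};

    /* Step 3: the brick with the highest version becomes the source. */
    int source = 0;
    for (int i = 1; i < NBRICKS; i++)
        if (version[i] > version[source])
            source = i;

    /* Step 7: xattrop adds -dirty and (highest - own) to each brick,
     * leaving every brick at the source's version with dirty == 0. */
    for (int i = 0; i < NBRICKS; i++) {
        int64_t version_delta = (int64_t)(version[source] - version[i]);
        int64_t dirty_delta   = -(int64_t)dirty[i];
        printf("brick %d: version += %lld, dirty += %lld\n",
               i, (long long)version_delta, (long long)dirty_delta);
    }
    return 0;
}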
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user