Bug 1211037
| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Summary: | [dist-geo-rep]: Directory not empty and Stale file handle errors in geo-rep logs during deletes from master in history/changelog crawl | | |
| Product: | [Community] GlusterFS | Reporter: | Aravinda VK <avishwan> |
| Component: | geo-replication | Assignee: | Aravinda VK <avishwan> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | mainline | CC: | aavati, avishwan, bugs, csaba, gluster-bugs, nlevinki, rhinduja, rhs-bugs, smanjara, storage-qa-internal, vagarwal |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8rc2 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1201732 | | |
| Clones: | 1218922 (view as bug list) | Environment: | |
| Last Closed: | 2016-06-16 12:49:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1201732 | | |
| Bug Blocks: | 1218922 | | |
Description (Aravinda VK, 2015-04-12 12:30:52 UTC)
REVIEW: http://review.gluster.org/10204 (geo-rep: Minimize rm -rf race in Geo-rep) posted (#1) for review on master by Aravinda VK (avishwan)

REVIEW: http://review.gluster.org/10204 (geo-rep: Minimize rm -rf race in Geo-rep) posted (#2) for review on master by Aravinda VK (avishwan)

COMMIT: http://review.gluster.org/10204 committed in master by Vijay Bellur (vbellur)

------

commit 08107796c89f5f201b24d689ab6757237c743c0d
Author: Aravinda VK <avishwan>
Date: Sun Apr 12 17:46:45 2015 +0530

geo-rep: Minimize rm -rf race in Geo-rep

While doing RMDIR, a worker gets ENOTEMPTY because the same directory still holds files from other bricks that have not been deleted yet (those workers are slower to process), so geo-rep falls back to a recursive delete. The recursive delete was done with shutil.rmtree, which, once started, never re-checks the disk_gfid, so it ends up deleting new files created by other workers in the meantime. Also, if another worker creates files after the first worker has collected its list of files to delete, the first worker gets ENOTEMPTY again.

To fix these races, a retry is added when the worker gets ENOTEMPTY/ESTALE/ENODATA, and a disk_gfid check is added for the original path on which the recursive delete was called. This disk_gfid check is executed before every unlink/rmdir: if the on-disk GFID does not match the GFID from the changelog, another worker has already deleted the directory, and even if a subdir/file is still present it belongs to a different parent, so the worker exits without performing further deletes.

ENOENT during create is not retried, since a CREATE/MKNOD/MKDIR that failed with ENOENT cannot succeed until the parent directory is created again.

Rsync error handling processed the unlinked_gfids list for only one changelog at a time; when changelogs were processed in a batch it failed to detect the unlinked GFIDs, retried, and finally skipped the entire batch of changelogs. This is fixed by moving the self.unlinked_gfids reset logic to before batch start and after batch end.

Most of the geo-rep races with rm -rf are eliminated by this patch, but in some cases stale directories are left on some bricks and the mount point returns ENOTEMPTY (a DHT issue; the error is logged in the slave log).

BUG: 1211037
Change-Id: I8716b88e4c741545f526095bf789f7c1e28008cb
Signed-off-by: Aravinda VK <avishwan>
Reviewed-on: http://review.gluster.org/10204
Reviewed-by: Kotresh HR <khiremat>
Tested-by: Gluster Build System <jenkins.com>
Tested-by: NetBSD Build System
Reviewed-by: Vijay Bellur <vbellur>

*** Bug 1112531 has been marked as a duplicate of this bug. ***

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
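The retry-plus-gfid-check approach described in the commit message above can be pictured with a minimal sketch. This is not the actual gsyncd code: the `disk_gfid` helper, the `trusted.gfid` xattr read, and the `RETRIES` constant are illustrative assumptions, and real geo-rep resolves GFIDs through its own plumbing.

```python
import errno
import os
import time

RETRIES = 5  # illustrative; the real bound is geo-rep's choice
RETRIABLE = (errno.ENOTEMPTY, errno.ESTALE, errno.ENODATA)


def disk_gfid(path):
    """Hypothetical helper: read the GFID stored on disk.
    Sketched here as reading the trusted.gfid xattr; treat any
    failure as 'directory is gone'."""
    try:
        return os.getxattr(path, "trusted.gfid")
    except OSError:
        return None


def recursive_delete(root, changelog_gfid):
    """Delete root recursively, but only while root's on-disk GFID
    still matches the GFID recorded in the changelog. Retries on
    the races named in the commit message."""
    for attempt in range(RETRIES):
        try:
            _delete_tree(root, root, changelog_gfid)
            return
        except OSError as e:
            # ENOTEMPTY: another worker created entries after we
            # listed the directory. ESTALE/ENODATA: the directory
            # changed underneath us. Retry a bounded number of times.
            if e.errno in RETRIABLE and attempt < RETRIES - 1:
                time.sleep(1)
                continue
            raise


def _delete_tree(path, root, changelog_gfid):
    for entry in os.listdir(path):
        # Check the original path's GFID before every unlink/rmdir:
        # if it no longer matches, another worker deleted (and maybe
        # recreated) the directory, so whatever we see now belongs
        # to a different parent. Stop without further deletes.
        if disk_gfid(root) != changelog_gfid:
            return
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            _delete_tree(full, root, changelog_gfid)
        else:
            try:
                os.unlink(full)
            except OSError as e:
                if e.errno != errno.ENOENT:  # already gone is fine
                    raise
    if disk_gfid(root) == changelog_gfid:
        os.rmdir(path)  # may raise ENOTEMPTY -> caught by the retry loop
```

In this sketch, `recursive_delete(path, gfid_from_changelog)` stands in for the plain shutil.rmtree call that the patch replaces: a concurrent rmdir by another worker changes the root's GFID and makes every remaining delete a no-op, while late-arriving files surface as ENOTEMPTY and trigger a bounded retry.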
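The batching fix can be sketched the same way: the set of GFIDs known to have been unlinked must live for the duration of the whole batch of changelogs rather than being reset per changelog, otherwise rsync failures for already-deleted files look like genuine errors and the batch is retried and eventually skipped. The class and method names below (`BatchProcessor`, `process_change`, `handle_rsync_errors`) are illustrative, not the actual gsyncd API.

```python
class BatchProcessor(object):
    """Illustrative sketch of the batch-scoped unlinked_gfids fix.

    Before the fix, unlinked_gfids was effectively reset per
    changelog, so rsync errors caused by files deleted earlier in
    the same batch were not recognised as expected, and the whole
    batch was retried and finally skipped."""

    def __init__(self):
        self.unlinked_gfids = set()

    def process_batch(self, changelogs):
        # Reset once before the batch starts ...
        self.unlinked_gfids = set()
        try:
            for change in changelogs:
                self.process_change(change)  # fills self.unlinked_gfids
            self.handle_rsync_errors()       # consults the full set
        finally:
            # ... and once after the batch ends.
            self.unlinked_gfids = set()

    def process_change(self, change):
        # Placeholder: replay one changelog's entry operations and
        # record the GFIDs of files unlinked along the way.
        pass

    def handle_rsync_errors(self):
        # Placeholder: rsync errors for GFIDs present in
        # self.unlinked_gfids are expected (the file was deleted on
        # the master) and must not trigger a batch retry.
        pass
```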