Bug 1116150
Summary: | [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists" | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | vsomyaju
Component: | distribute | Assignee: | Nagaprasad Sathyanarayana <nsathyan>
Status: | CLOSED EOL | QA Contact: |
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | pre-release | CC: | bugs, gluster-bugs, nsathyan, shmohan, smohan, surs
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 1110694 | |
: | 1129541, 1138385, 1139995 | Environment: |
Last Closed: | 2015-10-22 15:40:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1110694 | |
Bug Blocks: | 1129541, 1138385, 1139995 | |
Comment 1
Anand Avati
2014-07-03 21:43:16 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#5) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#6) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#7) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#8) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#9) for review on master by venkatesh somyajulu (vsomyaju)

COMMIT: http://review.gluster.org/8231 committed in master by Vijay Bellur (vbellur)
------
commit 74d92e322e3c9f4f70ddfbf9b0e2140922009658
Author: Venkatesh Somyajulu <vsomyaju>
Date: Tue Jul 15 18:17:19 2014 +0530

    cluster/dht: Fix races to avoid deletion of linkto file

    Explanation of the race between rebalance processes:
    https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4

    STATE 1: Only one brick (BRICK-1) in the system; it holds the cached file.
    STATE 2: Add brick-2. The bricks are now BRICK-1 and BRICK-2.
    STATE 3: Lookup of the file on brick-2 by this node's rebalance fails
             because the hashed file has not been created yet, so
             dht_lookup_everywhere is about to be called.
    STATE 4: As part of lookup, a link file is created at brick-2.
    STATE 5: getxattr is done to check that the cached file belongs to this node.
    STATE 6: dht_lookup_everywhere_cbk detects the link created by rebalance-1
             and unlinks it.
    STATE 7: getxattr on the link file with the "pathinfo" key is called and
             fails, because the link file was deleted by the rebalance on node-2.

    Fix:
    In STATE 6 we should avoid deleting the link file. Every time
    dht_lookup_everywhere gets called, lookup is performed on all the nodes.
    So to avoid STATE 6, if a linkto file is found, it is not deleted until a
    valid case is found in dht_lookup_everywhere_done.

    Case 1: The linkto file points to the cached node, and the cached file
            exists: unwind with success.
    Case 2: The linkto file does not point to the current cached node, and the
            cached file exists:
            a) Unlink the stale link file
            b) Create a new link file
    Case 3: Only the linkto file exists: delete the linkto file.
    Case 4: Only the cached file exists: create the link file
            (handled even without the patch).
    Case 5: Neither the cached nor the hashed file is present: return ENOENT
            (handled even without the patch).

    Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8231
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>

REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#5) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#6) for review on master by venkatesh somyajulu (vsomyaju)

COMMIT: http://review.gluster.org/8345 committed in master by Vijay Bellur (vbellur)
------
commit 966997992bdbd5fffc632bf705678e287ed50bf7
Author: Venkatesh Somyajulu <vsomyaju>
Date: Fri Jul 25 17:21:04 2014 +0530

    cluster/dht: Modified logic of linkto file deletion on non-hashed

    Currently, whenever dht_lookup_everywhere gets called, if
    dht_lookup_everywhere_cbk finds a linkto file on a non-hashed subvolume,
    the file is unlinked. But there are cases when this file is under
    migration. Under such conditions, we should avoid deleting the file.

    When some other rebalance process changes the layout of the parent such
    that dst_file (w.r.t. migration) falls on a non-hashed node, lookup may
    have found it as a linkto file, but just before the unlink the file is
    under migration or already migrated. In such cases the unlink can be
    avoided.

    Race:
    -------
    Suppose we have two bricks (brick-1 and brick-2) with an initial file "a"
    under BaseDir which is hashed as well as cached on brick-1. Assume that
    hashing "a" gives 44.

    Initial setup:
                    Brick-1      Brick-2
                    BaseDir/a    BaseDir
                    [1-50]       [51-100]

    Now add a new brick, Brick-3.

    1. Rebalance-1 on Node-1 (the Brick-1 node) will reset the BaseDir layout.
    2. After that it will:
       a) Create a linkto file on the new hashed subvolume (brick-2)
       b) Perform file migration.

    1. Rebalance-1 fixes the base layout:
                    Brick-1      Brick-2             Brick-3
                    ---------    ----------          ------------
                    BaseDir/a    BaseDir             BaseDir
                    [1-33]       [34-66]             [67-100]

    2. Only a) is performed (create linkto file):
                    BaseDir/a    BaseDir/a(linkto)   BaseDir

    Now rebalance-2 on node-2 jumps in and performs steps 1 and 2-a.
    After (rebal-2, step-1), the layout of BaseDir has changed:

                    Brick-1      Brick-2             Brick-3
                    BaseDir/a    BaseDir/a(link)     BaseDir
                    [67-100]     [1-33]              [34-66]

    For (rebal-2, step-2), it performs a lookup at Brick-3, because with
    respect to the new layout 44 falls on brick-3. But the lookup fails, so
    dht_lookup_everywhere gets called.

    NOTE: A linkto file was created on brick-2 by rebalance-1. Currently that
    linkto file gets deleted by the rebalance-2 lookup, as it is considered a
    stale linkto file. With this patch, if rebalance is already in progress or
    rebalance is over, the linkto file will not be unlinked. If rebalance is
    in progress, an fd will be open; if rebalance is over, the linkto xattr
    won't be set.

    Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8345
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
    Reviewed-by: Vijay Bellur <vbellur>

REVIEW: http://review.gluster.org/8428 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8428 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8355 (cluster/dht: Added code to capture races in dht/rebalance) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8429 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8429 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht/rebalance) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)

COMMIT: http://review.gluster.org/8429 committed in master by Vijay Bellur (vbellur)
------
commit 718f10e0d68715be2d73e677974629452485c699
Author: Venkatesh Somyajulu <vsomyaju>
Date: Thu Aug 7 16:28:48 2014 +0530

    cluster/dht: Added keys in dht_lookup_everywhere_done

    In the case where both the cached file (on C1) and the hashed file are
    found, but the hashed file does not point to the cached node (C1), don't
    unlink if either an fd is open on the hashed file or the linkto xattr is
    not found.

    Change-Id: I7ef49b88d2c88bf9d25d3aa7893714e6c0766c67
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Change-Id: I86d0a21d4c0501c45d837101ced4f96d6fedc5b9
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8429
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: susant palai <spalai>
    Reviewed-by: Raghavendra G <rgowdapp>
    Reviewed-by: Vijay Bellur <vbellur>

REVIEW: http://review.gluster.org/8513 (cluster/dht: Added code to capture races in dht-lookup path) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#4) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#5) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#3) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#6) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#4) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#7) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#5) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#8) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#6) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#7) for review on master by Raghavendra G (rgowdapp)
REVIEW: http://review.gluster.org/8559 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)

COMMIT: http://review.gluster.org/8559 committed in master by Vijay Bellur (vbellur)
------
commit b23be2e7581c6aa295053dc8866cab841ae374b6
Author: Venkatesh Somyajulu <vsomyaju>
Date: Fri Aug 22 17:07:15 2014 +0530

    storage/posix: Don't unlink .glusterfs-hardlink before linkto check

    BUG: 1116150
    Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8559
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Vijay Bellur <vbellur>

REVIEW: http://review.gluster.org/8561 (cluster/dht: Added code to capture races in dht-lookup path) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8561 (cluster/dht: Added code to capture races in dht-lookup path) posted (#2) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#8) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8585 (storage/posix : Missing space in log message) posted (#1) for review on master by N Balachandran (nbalacha)
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#9) for review on master by venkatesh somyajulu (vsomyaju)
REVIEW: http://review.gluster.org/8587 (storage/posix: Added space in log message.) posted (#1) for review on master by venkatesh somyajulu (vsomyaju)

COMMIT: http://review.gluster.org/8585 committed in master by Vijay Bellur (vbellur)
------
commit e03559c20ab37f1a7db54a367258bb1cd005e50d
Author: Nithya Balachandran <nbalacha>
Date: Wed Sep 3 14:18:00 2014 +0530

    storage/posix : Missing space in log message

    Added a space in a log message

    Change-Id: Iabd50e6b5c9ff4673f59d6b52b785894b3dcdaf9
    BUG: 1116150
    Signed-off-by: Nithya Balachandran <nbalacha>
    Reviewed-on: http://review.gluster.org/8585
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Gluster Build System <jenkins.com>

COMMIT: http://review.gluster.org/8430 committed in master by Vijay Bellur (vbellur)
------
commit bb2d5f49b5684e6484af16a580870cfe104aecd2
Author: Venkatesh Somyajulu <vsomyaju>
Date: Wed Sep 3 14:42:43 2014 +0530

    cluster/dht: Added code to capture races in dht-lookup path

    Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju>
    Reviewed-on: http://review.gluster.org/8430
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>

The pre-release version is ambiguous and about to be removed as a choice.
If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.
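
For readers tracing the fix, the lookup cases listed in commit 74d92e3 (Case 1 to Case 5), together with the guard added in commit 718f10e (do not unlink when an fd is open on the hashed file or the linkto xattr is missing), can be summarized as a small decision function. The following is only an illustrative Python model under stated assumptions, not GlusterFS code: the names `LookupState`, `Action`, and `decide` are hypothetical, a missing linkto xattr is modeled as `linkto_target=None`, and the real logic is implemented in C around dht_lookup_everywhere_done.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    """Hypothetical labels for the outcomes named in the commit messages."""
    UNWIND_SUCCESS = "unwind with success"
    RELINK = "unlink stale link file, then create new link file"
    KEEP_LINKTO = "leave the hashed file alone (possibly under migration)"
    DELETE_LINKTO = "delete linkto file"
    CREATE_LINKTO = "create link file"
    ENOENT = "return ENOENT"


@dataclass
class LookupState:
    """What a lookup on all subvolumes found (illustrative model only)."""
    cached_subvol: Optional[str] = None   # subvolume holding the data file
    hashed_file_found: bool = False       # a file exists on the hashed subvolume
    linkto_target: Optional[str] = None   # None models "linkto xattr not found"
    fd_open_on_hashed: bool = False       # an open fd suggests migration in progress


def decide(s: LookupState) -> Action:
    has_cached = s.cached_subvol is not None
    if has_cached and s.hashed_file_found:
        if s.linkto_target == s.cached_subvol:
            return Action.UNWIND_SUCCESS          # Case 1
        # Case 2, with the guard described in commit 718f10e:
        # skip the unlink if an fd is open or the linkto xattr is gone.
        if s.fd_open_on_hashed or s.linkto_target is None:
            return Action.KEEP_LINKTO
        return Action.RELINK
    if s.hashed_file_found:
        return Action.DELETE_LINKTO               # Case 3
    if has_cached:
        return Action.CREATE_LINKTO               # Case 4
    return Action.ENOENT                          # Case 5


if __name__ == "__main__":
    # Stale linkto on brick-2 while the data file is cached on brick-1.
    print(decide(LookupState("brick-1", True, "brick-2")).value)
    # Same situation, but an fd is open on the hashed file.
    print(decide(LookupState("brick-1", True, "brick-2", fd_open_on_hashed=True)).value)
```

The model deliberately keeps the "do not unlink" branch separate from the relink branch, since the race described above comes from treating a file that is under migration as a stale linkto file.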