Bug 1116150

Summary: [DHT:REBALANCE]: Rebalance failures are seen with error message " remote operation failed: File exists"
Product: [Community] GlusterFS
Component: distribute
Version: pre-release
Status: CLOSED EOL
Severity: high
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: vsomyaju
Assignee: Nagaprasad Sathyanarayana <nsathyan>
CC: bugs, gluster-bugs, nsathyan, shmohan, smohan, surs
Doc Type: Bug Fix
Type: Bug
Clone Of: 1110694
Last Closed: 2015-10-22 15:40:20 UTC
Bug Depends On: 1110694
Bug Blocks: 1129541, 1138385, 1139995

Comment 1 Anand Avati 2014-07-03 21:43:16 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 2 Anand Avati 2014-07-09 07:22:52 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 3 Anand Avati 2014-07-09 09:32:34 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#3) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 4 Anand Avati 2014-07-10 13:03:26 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#4) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 5 Anand Avati 2014-07-10 21:06:33 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#5) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 6 Anand Avati 2014-07-14 11:52:12 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#6) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 7 Anand Avati 2014-07-14 12:05:36 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#7) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 8 Anand Avati 2014-07-15 12:48:37 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#8) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 9 Anand Avati 2014-07-18 09:52:14 UTC
REVIEW: http://review.gluster.org/8231 (cluster/dht: Fix races to avoid deletion of linkto file) posted (#9) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 10 Anand Avati 2014-07-18 10:11:18 UTC
COMMIT: http://review.gluster.org/8231 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit 74d92e322e3c9f4f70ddfbf9b0e2140922009658
Author: Venkatesh Somyajulu <vsomyaju@redhat.com>
Date:   Tue Jul 15 18:17:19 2014 +0530

    cluster/dht: Fix races to avoid deletion of linkto file
    
    Explanation of Race between rebalance processes:
    https://bugzilla.redhat.com/show_bug.cgi?id=1110694#c4
    
    STATE 1:                          BRICK-1
    only one brick                   Cached File
    in the system
    
    STATE 2:
    Add brick-2                       BRICK-1                BRICK-2
    
    STATE 3:                                       Lookup of File on brick-2
                                                   by this node's rebalance
                                                   will fail because hashed
                                                   file is not created yet.
                                                   So dht_lookup_everywhere is
                                                   about to get called.
    
    STATE 4:                         As part of lookup
                                     link file at brick-2
                                     will be created.
    
    STATE 5:                         getxattr to check that
                                     cached file belongs to
                                     this node is done
    
    STATE 6:
    
                                                dht_lookup_everywhere_cbk detects
                                                the link created by rebalance-1.
                                                It will unlink it.
    
    STATE 7:                        getxattr at the link
                                    file with "pathinfo" key
                                    will be called and will fail
                                    as the link file is deleted
                                    by rebalance on node-2
    
    Fix:
    So in STATE 6 we should avoid deleting the link file. Every time
    dht_lookup_everywhere gets called, lookup is performed on all the nodes.
    So, to avoid STATE 6, if a linkto file is found it is not deleted until a
    valid case is found in dht_lookup_everywhere_done.
    
    Case 1: if linkto file points to cached node, and cached file exists,
            unwind with success.
    
    Case 2: if linkto does not point to current cached node, and cached file
            exists:
            a) Unlink stale link file
            b) Create new link file
    
    Case 3: Only linkto file exists:
            Delete linkto file
    
    Case 4: Only cached file
            Create link file (handled even without patch)
    
    Case 5: Neither cached nor hashed file is present
            Return with ENOENT (handled even without patch)
    
    Change-Id: Ibf53671410d8d613b8e2e7e5d0ec30fc7dcc0298
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    Reviewed-on: http://review.gluster.org/8231
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Vijay Bellur <vbellur@redhat.com>
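
Note: the five cases listed in the commit message above can be sketched as a small decision function. This is only an illustration in Python, not the GlusterFS C code; the function name, its parameters and the action strings are hypothetical and chosen for readability. What happens after the unlink in Case 3 is not stated in the commit message, so the sketch records only the unlink.

```python
from typing import List, Optional


def lookup_everywhere_done(cached: Optional[str],
                           linkto_target: Optional[str]) -> List[str]:
    """Map the lookup result to the actions of Cases 1-5 (hypothetical sketch).

    cached        -- subvolume holding the data (cached) file, or None
    linkto_target -- subvolume the linkto file points at, or None when
                     no linkto file was found
    """
    if cached is not None and linkto_target is not None:
        if linkto_target == cached:
            return ["unwind-success"]                    # Case 1
        return ["unlink-stale-linkto", "create-linkto"]  # Case 2
    if linkto_target is not None:
        return ["unlink-linkto"]                         # Case 3
    if cached is not None:
        return ["create-linkto"]                         # Case 4
    return ["unwind-ENOENT"]                             # Case 5
```

For example, a linkto file pointing at brick-3 while the data file sits on brick-1 falls under Case 2: the stale link is removed and a new one is created.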

Comment 11 Anand Avati 2014-07-22 10:13:30 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 12 Anand Avati 2014-07-22 10:25:34 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 13 Anand Avati 2014-07-23 07:03:43 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#3) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 14 Anand Avati 2014-07-23 07:23:51 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#4) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 15 Anand Avati 2014-07-25 11:52:33 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#5) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 16 Anand Avati 2014-07-25 20:20:09 UTC
REVIEW: http://review.gluster.org/8345 (cluster/dht: Modified logic of linkto file deletion on non-hashed) posted (#6) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 17 Anand Avati 2014-08-01 05:46:42 UTC
COMMIT: http://review.gluster.org/8345 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit 966997992bdbd5fffc632bf705678e287ed50bf7
Author: Venkatesh Somyajulu <vsomyaju@redhat.com>
Date:   Fri Jul 25 17:21:04 2014 +0530

    cluster/dht: Modified logic of linkto file deletion on non-hashed
    
    Currently, whenever dht_lookup_everywhere gets called, if
    dht_lookup_everywhere_cbk finds a linkto file on a non-hashed
    subvolume, the file is unlinked. But there are cases when this file
    is under migration. Under such conditions, we should avoid deleting
    the file.
    
    When some other rebalance process changes the layout of the parent
    such that dst_file (w.r.t. migration) falls on a non-hashed node,
    lookup may have found it as a linkto file, but just before the
    unlink the file is under migration or already migrated.
    In such cases the unlink can be avoided.
    
    Race:
    -------
    If we have two bricks (brick-1 and brick-2) with initial file "a"
    under BaseDir which is hashed as well as cached on (brick-1).
    
    Assume "a"  hashing gives 44.
    
                                  Brick-1              Brick-2
    
    Initial Setup:               BaseDir/a             BaseDir
                                 [1-50]                [51-100]
    
    Now add new-brick Brick-3.
    
    1. Rebalance-1 on node Node-1 (Brick-1 node) will reset
    the BaseDir Layout.
    
    2. After that it will perform
    a)  Create linkto file on  new-hashed (brick-2)
    b)  Perform file migration.
    
    1.Rebalance-1 Fixes the base-layout:
                     Brick-1             Brick-2           Brick-3
                     ---------         ----------         ------------
                     BaseDir/a            BaseDir           BaseDir
                      [1-33]              [34-66]           [67-100]
    
    2. Only a) is     BaseDir/a          BaseDir/a(linkto)   BaseDir
       performed                         Create linktofile
    
    Now rebalance-2 on node-2 jumps in and performs
    steps 1 and 2-a.
    
    After (rebal-2, step-1), it changes the layout of the BaseDir.
                        BaseDir/a     BaseDir/a(link)    BaseDir
                        [67-100]           [1-33]        [34-66]
    
    For (rebal-2, step-2), it will perform lookup at Brick-3, since w.r.t. the
    new layout 44 falls on brick-3. But the lookup will fail,
    so dht_lookup_everywhere gets called.
    
    NOTE: On brick-2 by rebalance-1, a linkto file was created.
    
    Currently that linkto file gets deleted by the rebalance-2 lookup, as it
    is considered a stale linkto file. But with the patch, if rebalance is
    already in progress or rebalance is over, the linkto file will not be
    unlinked. If rebalance is in progress the fd will be open, and if rebalance
    is over then the linkto xattr won't be set.
    
    Change-Id: I3fee0d28de3c76197325536a9e30099d2413f079
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    Reviewed-on: http://review.gluster.org/8345
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
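
Note: the layout changes in the race description above can be replayed with a few lines of Python. This is a sketch using the commit message's simplified 1-100 hash ranges and its example hash value 44; the helper name `hashed_brick` and the dictionary representation of a layout are assumptions made for illustration, not GlusterFS data structures.

```python
from typing import Dict, Optional, Tuple

Layout = Dict[str, Tuple[int, int]]  # brick name -> inclusive hash range


def hashed_brick(layout: Layout, hash_value: int) -> Optional[str]:
    """Return the brick whose range contains hash_value, or None."""
    for brick, (low, high) in layout.items():
        if low <= hash_value <= high:
            return brick
    return None


# Layouts of BaseDir as given in the commit message.
initial = {"Brick-1": (1, 50), "Brick-2": (51, 100)}
after_rebal_1 = {"Brick-1": (1, 33), "Brick-2": (34, 66), "Brick-3": (67, 100)}
after_rebal_2 = {"Brick-1": (67, 100), "Brick-2": (1, 33), "Brick-3": (34, 66)}

HASH_OF_A = 44  # hash of file "a" assumed by the commit message

for name, layout in [("initial", initial),
                     ("after rebal-1", after_rebal_1),
                     ("after rebal-2", after_rebal_2)]:
    print(name, "->", hashed_brick(layout, HASH_OF_A))
```

The hashed brick for "a" moves from Brick-1 to Brick-2 and then to Brick-3, which is why the linkto file that rebalance-1 created on Brick-2 looks stale to rebalance-2's lookup.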

Comment 18 Anand Avati 2014-08-07 10:06:04 UTC
REVIEW: http://review.gluster.org/8428 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 19 Anand Avati 2014-08-07 10:39:32 UTC
REVIEW: http://review.gluster.org/8428 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 20 Anand Avati 2014-08-07 11:02:24 UTC
REVIEW: http://review.gluster.org/8355 (cluster/dht: Added code to capture races in dht/rebalance) posted (#4) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 21 Anand Avati 2014-08-07 11:02:30 UTC
REVIEW: http://review.gluster.org/8429 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 22 Anand Avati 2014-08-07 11:11:15 UTC
REVIEW: http://review.gluster.org/8429 (cluster/dht: Added keys in dht_lookup_everywhere_done) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 23 Anand Avati 2014-08-07 11:11:23 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht/rebalance) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 24 Anand Avati 2014-08-10 14:30:41 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 25 Anand Avati 2014-08-14 17:59:15 UTC
COMMIT: http://review.gluster.org/8429 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit 718f10e0d68715be2d73e677974629452485c699
Author: Venkatesh Somyajulu <vsomyaju@redhat.com>
Date:   Thu Aug 7 16:28:48 2014 +0530

    cluster/dht: Added keys in dht_lookup_everywhere_done
    
    Case where both cached (C1) and hashed file are found,
    but hash does not point to above cached node (C1): then
    don't unlink if either fd-is-open on hashed or
    linkto-xattr is not found.
    
    Change-Id: I7ef49b88d2c88bf9d25d3aa7893714e6c0766c67
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    
    Change-Id: I86d0a21d4c0501c45d837101ced4f96d6fedc5b9
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    Reviewed-on: http://review.gluster.org/8429
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: susant palai <spalai@redhat.com>
    Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
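
Note: the condition in the commit message above reduces to a two-input predicate. The Python sketch below is illustrative only; the function and parameter names are hypothetical and do not correspond to the keys added by the patch.

```python
def may_unlink_hashed_file(fd_is_open: bool, linkto_xattr_found: bool) -> bool:
    """Decide whether the file on the hashed subvolume may be unlinked.

    Per the commit message, the unlink is skipped when an fd is open on the
    hashed file (migration in progress) or when the linkto xattr is not
    found (migration already finished).
    """
    if fd_is_open or not linkto_xattr_found:
        return False
    return True
```

Only a file with no open fd that still carries the linkto xattr is treated as a stale link and removed.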

Comment 26 Anand Avati 2014-08-22 07:55:07 UTC
REVIEW: http://review.gluster.org/8513 (cluster/dht: Added code to capture races in dht-lookup path) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 27 Anand Avati 2014-08-22 07:56:41 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 28 Anand Avati 2014-08-22 10:30:09 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#3) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 29 Anand Avati 2014-08-22 11:38:03 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Dont unlink .glusterfs-hardlink before linkto check) posted (#4) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 30 Anand Avati 2014-08-22 12:49:50 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#5) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 31 Anand Avati 2014-08-22 12:49:53 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#3) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 32 Anand Avati 2014-08-25 06:30:42 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#6) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 33 Anand Avati 2014-08-25 06:30:50 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#4) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 34 Anand Avati 2014-08-25 11:32:13 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#7) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 35 Anand Avati 2014-08-25 11:32:16 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#5) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 36 Anand Avati 2014-08-26 06:23:12 UTC
REVIEW: http://review.gluster.org/8449 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#8) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 37 Anand Avati 2014-08-26 06:23:15 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#6) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 38 Anand Avati 2014-08-27 06:33:55 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#7) for review on master by Raghavendra G (rgowdapp@redhat.com)

Comment 39 Anand Avati 2014-08-28 07:22:55 UTC
REVIEW: http://review.gluster.org/8559 (storage/posix: Don't unlink .glusterfs-hardlink before linkto check) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 40 Anand Avati 2014-08-28 10:05:23 UTC
COMMIT: http://review.gluster.org/8559 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit b23be2e7581c6aa295053dc8866cab841ae374b6
Author: Venkatesh Somyajulu <vsomyaju@redhat.com>
Date:   Fri Aug 22 17:07:15 2014 +0530

    storage/posix: Don't unlink .glusterfs-hardlink before linkto check
    
    BUG: 1116150
    Change-Id: I90a10ac54123fbd8c7383ddcbd04e8879ae51232
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    Reviewed-on: http://review.gluster.org/8559
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: N Balachandran <nbalacha@redhat.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>

Comment 41 Anand Avati 2014-08-28 10:19:20 UTC
REVIEW: http://review.gluster.org/8561 (cluster/dht: Added code to capture races in dht-lookup path) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 42 Anand Avati 2014-09-01 09:03:06 UTC
REVIEW: http://review.gluster.org/8561 (cluster/dht: Added code to capture races in dht-lookup path) posted (#2) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 43 Anand Avati 2014-09-01 09:07:55 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#8) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 44 Anand Avati 2014-09-03 08:49:14 UTC
REVIEW: http://review.gluster.org/8585 (storage/posix : Missing space in log message) posted (#1) for review on master by N Balachandran (nbalacha@redhat.com)

Comment 45 Anand Avati 2014-09-03 09:13:56 UTC
REVIEW: http://review.gluster.org/8430 (cluster/dht: Added code to capture races in dht-lookup path) posted (#9) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 46 Anand Avati 2014-09-03 09:38:44 UTC
REVIEW: http://review.gluster.org/8587 (storage/posix: Added space in log message.) posted (#1) for review on master by venkatesh somyajulu (vsomyaju@redhat.com)

Comment 47 Anand Avati 2014-09-03 10:40:23 UTC
COMMIT: http://review.gluster.org/8585 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit e03559c20ab37f1a7db54a367258bb1cd005e50d
Author: Nithya Balachandran <nbalacha@redhat.com>
Date:   Wed Sep 3 14:18:00 2014 +0530

    storage/posix : Missing space in log message
    
    Added a space in a log message
    
    Change-Id: Iabd50e6b5c9ff4673f59d6b52b785894b3dcdaf9
    BUG: 1116150
    Signed-off-by: Nithya Balachandran <nbalacha@redhat.com>
    Reviewed-on: http://review.gluster.org/8585
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>

Comment 48 Anand Avati 2014-09-03 11:22:23 UTC
COMMIT: http://review.gluster.org/8430 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit bb2d5f49b5684e6484af16a580870cfe104aecd2
Author: Venkatesh Somyajulu <vsomyaju@redhat.com>
Date:   Wed Sep 3 14:42:43 2014 +0530

    cluster/dht: Added code to capture races in dht-lookup path
    
    Change-Id: I9270d2d40ebd4b113ff961583dfda7754741f15b
    BUG: 1116150
    Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
    Reviewed-on: http://review.gluster.org/8430
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>

Comment 50 Kaleb KEITHLEY 2015-10-22 15:40:20 UTC
pre-release version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.