Bug 1469971 - cluster/dht: Fix hardlink migration failures
Summary: cluster/dht: Fix hardlink migration failures
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assignee: Susant Kumar Palai
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On: 1469964 1473141
Blocks: 1417151
 
Reported: 2017-07-12 07:47 UTC by Susant Kumar Palai
Modified: 2017-09-21 05:02 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.8.4-34
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1469964
Environment:
Last Closed: 2017-09-21 05:02:13 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1463248 0 medium CLOSED [Remove-brick] Hardlink migration fails with "migrate-data failed for $file [Unknown error 109023]" errors in rebalance ... 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2017:2774 0 normal SHIPPED_LIVE glusterfs bug fix and enhancement update 2017-09-21 08:16:29 UTC

Internal Links: 1463248

Description Susant Kumar Palai 2017-07-12 07:47:09 UTC
+++ This bug was initially created as a clone of Bug #1469964 +++

Description of problem:
There are a few races in the remove-brick hardlink migration code path, detailed below.
    
 A brief overview of how hardlink migration works (a simplified sketch follows
 the list):
    - Different hardlinks (to the same file) may hash to different bricks,
    but their cached subvolume will be the same. Rebalance picks up the first
    hardlink, calculates its hash (call the hashed subvolume TARGET) and sets
    TARGET as an xattr on the data file.
    - All the hardlinks that come after this fetch that xattr and create
    linkto files on TARGET (all the linkto files for the hardlinks are
    hardlinks to each other on TARGET).
    - When the number of hardlinks on the source equals the number of
    hardlinks on TARGET, the data migration happens.
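
    Below is a minimal, self-contained C sketch of the decision flow described
    above. All names here (file_state_t, pick_target_subvol, and so on) are
    hypothetical and greatly simplified; they do not mirror the actual DHT
    rebalance code.

    /* Simplified model of hardlink migration: the first hardlink's hash
     * picks TARGET (stored like an xattr on the data file), every hardlink
     * adds a linkto file on TARGET, and data moves once the counts match. */
    #include <stdio.h>

    #define NUM_SUBVOLS 4

    typedef struct {
        int target_subvol;   /* -1 until the first hardlink chooses TARGET */
        int src_linkcount;   /* hardlinks of the data file on the source   */
        int dst_linkcount;   /* linkto files created so far on TARGET      */
    } file_state_t;

    /* First caller decides TARGET from its hash; later callers just read it. */
    static int pick_target_subvol(file_state_t *f, unsigned int name_hash)
    {
        if (f->target_subvol < 0)
            f->target_subvol = (int)(name_hash % NUM_SUBVOLS);
        return f->target_subvol;
    }

    /* Each hardlink adds one linkto file (itself a hardlink) on TARGET. */
    static void create_linkto(file_state_t *f)
    {
        f->dst_linkcount++;
    }

    /* Data migrates only when every hardlink has a linkto on TARGET. */
    static int ready_to_migrate(const file_state_t *f)
    {
        return f->dst_linkcount == f->src_linkcount;
    }

    int main(void)
    {
        file_state_t f = { .target_subvol = -1,
                           .src_linkcount = 3,
                           .dst_linkcount = 0 };

        for (unsigned int i = 0; i < (unsigned int)f.src_linkcount; i++) {
            pick_target_subvol(&f, 7u + i);  /* pretend hash of the i-th name */
            create_linkto(&f);
            if (ready_to_migrate(&f))
                printf("migrate data to subvol %d\n", f.target_subvol);
        }
        return 0;
    }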
    
    RACE:1
      Since rebalance is multi-threaded, the first lookup (which decides which
      subvolume becomes TARGET) can be executed by two hardlink migrations in
      parallel, and they may end up creating linkto files on two different
      TARGET subvolumes. As a result, the hardlinks won't be migrated (see the
      sketch below).
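
    As an illustration of RACE:1 only (not the actual rebalance code, and not
    necessarily how the upstream patch fixes it), the sketch below serializes
    the "has TARGET been chosen yet?" check-and-set so that only one thread's
    choice is ever installed.

    /* Hypothetical fix sketch for RACE:1: serialize TARGET selection. */
    #include <pthread.h>
    #include <stdio.h>

    static int target_subvol = -1;                 /* shared, like the xattr */
    static pthread_mutex_t target_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Without the lock, two threads could both see -1 here and install two
     * different values, splitting the linkto files across two subvolumes. */
    static int pick_target_locked(int my_hashed_subvol)
    {
        pthread_mutex_lock(&target_lock);
        if (target_subvol == -1)
            target_subvol = my_hashed_subvol;
        pthread_mutex_unlock(&target_lock);
        return target_subvol;
    }

    static void *migrate_one_hardlink(void *arg)
    {
        int hashed = (int)(long)arg;
        printf("hardlink hashed to %d, linkto goes to TARGET %d\n",
               hashed, pick_target_locked(hashed));
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, migrate_one_hardlink, (void *)1L);
        pthread_create(&t2, NULL, migrate_one_hardlink, (void *)2L);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;                   /* both threads report the same TARGET */
    }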
    
   
    RACE:2
      The linkto files on TARGET can also be created by other clients doing
      lookups on the hardlinks. Consider a scenario with 100 hardlinks: while
      rebalance is migrating the 99th hardlink, continuous lookups from another
      client have already made the link count on TARGET equal to the source
      link count, so rebalance migrates the data on the 99th hardlink itself.
      When the 100th hardlink is processed, it has TARGET as its cached
      subvolume. If its hash is also the same, a migration is triggered from
      TARGET to TARGET, leading to data loss (see the sketch below).
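
    A minimal sketch of the kind of guard that would prevent the RACE:2
    outcome is shown below; the function name and the check are hypothetical,
    and the actual fix is in the upstream patch referenced in Comment 2.

    /* Hypothetical guard: never migrate a file onto the subvolume that
     * already holds its data, since a TARGET-to-TARGET migration would end
     * with the only copy being removed. */
    #include <stdio.h>

    static int migrate_file(int cached_subvol, int target_subvol)
    {
        if (cached_subvol == target_subvol) {
            fprintf(stderr,
                    "skipping migration: source and target are both %d\n",
                    cached_subvol);
            return 0;
        }
        printf("migrating data from subvol %d to subvol %d\n",
               cached_subvol, target_subvol);
        return 0;
    }

    int main(void)
    {
        migrate_file(2, 3);    /* normal case: data moves             */
        migrate_file(3, 3);    /* RACE:2 case: migration is skipped   */
        return 0;
    }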
    

 This is reproducible intermittently. Since it is specific to hardlink migration, it occurs only with the remove-brick process.

--- Additional comment from Worker Ant on 2017-07-12 12:44:13 MVT ---

REVIEW: https://review.gluster.org/17755 (cluster/rebalance: Fix hardlink migration failures) posted (#1) for review on master by Susant Palai (spalai)

Comment 2 Atin Mukherjee 2017-07-12 11:58:04 UTC
upstream patch : https://review.gluster.org/#/c/17755/

Comment 11 Prasad Desala 2017-07-31 10:38:13 UTC
On glusterfs version 3.8.4-36.el7rhgs.x86_64, I followed the steps mentioned in Comment 10 ten times and could not hit this issue.

Moving this BZ to Verified.

Comment 13 errata-xmlrpc 2017-09-21 05:02:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774

