Bug 1161311 - DHT + rebalance :- DATA LOSS - while file is in migration, creation of Hard-link and unlink of original file ends in data loss (both files are missing from mount and backend)
Summary: DHT + rebalance :- DATA LOSS - while file is in migration, creation of Hard-...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Shyamsundar
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-11-06 21:29 UTC by Shyamsundar
Modified: 2015-05-14 17:44 UTC
CC List: 11 users

Fixed In Version: glusterfs-3.7.0
Clone Of: 1136714
Environment:
Last Closed: 2015-05-14 17:28:22 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Shyamsundar 2014-11-06 21:29:11 UTC
+++ This bug was initially created as a clone of Bug #1136714 +++

Description of problem:
=======================
Hard links go missing after rebalance if they are created while file migration is in progress

How reproducible:
=================
always

Steps to Reproduce:
==================

1. Created a 1GB file on the mount point.
2. Started rebalance force after adding brick [so that file will be migrated]
3. Created multiple hard links after this.
[root@vm100 mnt]# ll -h
total 954M
-rw-r--r--. 1 root root 954M Sep  1 23:39 file
[root@vm100 mnt]# ln file link
[root@vm100 mnt]# ls
file  link
[root@vm100 mnt]# ln file link2
[root@vm100 mnt]# ls
file  link  link2
[root@vm100 mnt]# ln file link3
[root@vm100 mnt]#


4. Waited for rebalance to complete.
[root@vm100 ~]# gluster v rebalance test1 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                1       953.7MB             1             0             0            completed              44.00
5. Checked file status on mount point.
[root@vm100 mnt]# ll -h
total 954M
-rw-r--r--. 1 root root 954M Sep  1 23:39 file
[root@vm100 mnt]#
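
The steps above can be driven from a single script, shown below. This is only a rough sketch: the new brick path is an assumption and must be adapted to the actual setup, and the file has to be large enough that the hard links are created while migration is still in progress.

# Hypothetical repro sketch; MNT, VOL and NEWBRICK values are assumed.
MNT=/mnt
VOL=test1
NEWBRICK=vm100:/bricks/brick2

# 1. Create a large file so migration takes long enough to race against.
dd if=/dev/zero of=$MNT/file bs=1M count=1024

# 2. Add a brick and force a rebalance so the file gets migrated.
gluster volume add-brick $VOL $NEWBRICK
gluster volume rebalance $VOL start force

# 3. While the file is still migrating, create hard links.
ln $MNT/file $MNT/link
ln $MNT/file $MNT/link2
ln $MNT/file $MNT/link3

# 4. Wait for rebalance to finish, then check whether the links survived.
gluster volume rebalance $VOL status
ls -l $MNT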


Actual results:
===============
The hard links are gone as mentioned earlier. 


Expected results:
=================
The hard links should be present.

Additional info:
================

--- Additional comment from Rachana Patel on 2014-09-04 05:59:50 EDT ---

As mentioned in the description above, if we create a hard link while the file is under migration, the hard link will be deleted after migration.

With the same steps mentioned above, if we unlink the original file after creating the hard links (while the file is under migration), we lose both files; after migration there is no data on the backend or the mount.

Steps to Reproduce:
==================

1. Created a file on the mount point.
2. Started rebalance force after adding brick [so that file will be migrated]
3. While the file was under migration, created multiple hard links.
[root@vm100 mnt]# ll -h
total 954M
-rw-r--r--. 1 root root 954M Sep  1 23:39 file
[root@vm100 mnt]# ln file link
[root@vm100 mnt]# ls
file  link
[root@vm100 mnt]# ln file link2
[root@vm100 mnt]# ls
file  link  link2
[root@vm100 mnt]# ln file link3
[root@vm100 mnt]# ls 
file  link  link2 link3

4. While the file was under migration, unlinked the original file, since there are multiple hard links to it.

[root@vm100 mnt]# unlink file

5. Waited for rebalance to complete, then checked data on the mount and bricks.
[root@vm100 mnt]# ls
[root@vm100 mnt]# 
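
The script sketch after the first set of steps extends directly to this scenario; the only change is to unlink the original name while the rebalance is still running (the same assumed names apply):

# Hypothetical extension of the earlier sketch: drop the original name
# while the file is still being migrated, keeping only the hard links.
ln $MNT/file $MNT/link
ln $MNT/file $MNT/link2
ln $MNT/file $MNT/link3
unlink $MNT/file

# After rebalance completes, the data should still be reachable through
# the remaining links on both the mount and the bricks; with this bug,
# nothing is left in either place.
gluster volume rebalance $VOL status
ls -l $MNT
ls -l /bricks/brick*/        # brick paths are an assumption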


Actual results:
===============
No files are present on the mount or the bricks.


Expected results:
=================
There should not be any data/file loss if the user has deleted the original file after creating hard links to it.

--- Additional comment from Shyamsundar on 2014-11-06 16:07:01 EST ---

Reasons for this happening:
---------------------------

1) dht_link would create a link to the cached file and a linkto file at the hashed location.

2) When the file is under migration and (1) happens, we create a hard link to the cached file (which is under migration) and a linkto file on the subvol that the new name hashes to.

3) If the new name hashes to the same subvol as the old name, the file survives, because the linkto file for the new name on the hashed subvol is a hard link to the linkto file for the old name.

4) If the new name hashes to a different subvol, the file does not survive. The cached file is on the subvol being migrated away from, so when migration is over that file is truncated to 0 bytes and only the P2-state old file remains (as the file has been migrated), carrying the linkto information and the sticky bit. In all, we lose the file.

The resolution is to redirect the link to the new cached subvol for a file under migration. So when we get a dht_link, on the post-op we need to send a link to the real cached subvol that the file is being migrated to; in other words, follow the linkto and link the file there as well.

The above is a first cut RCA and resolution thought process.

Here is some FS data on the same, to help correlate with the comments above:

< State of the brick before rebalance starts on FILE1>
# ls -l /d/backends/patchy*
/d/backends/patchy1:
total 5242880
-rw-r--r--. 2 root root 5368709120 Nov  6 14:49 FILE1

/d/backends/patchy2:
total 0

/d/backends/patchy3:
total 0

< State of the brick as soon as rebalance starts on FILE1 >
[root@marvin ~]# ls -l /d/backends/patchy*
/d/backends/patchy1:
total 5242884
-rw-r-Sr-T. 2 root root 5368709120 Nov  6 14:49 FILE1

/d/backends/patchy2:
total 0

/d/backends/patchy3:
total 52868
---------T. 2 root root 5368709120 Nov  6 14:50 FILE1

<Create hard link FILE2 which hashes to second subvol
 and hard link FILE5 which hashes to third subvol >
[root@marvin ~]# ls -l /d/backends/patchy*
/d/backends/patchy1:
total 20971536
-rw-r-Sr-T. 5 root root 5368709120 Nov  6 14:49 FILE1
-rw-r-Sr-T. 5 root root 5368709120 Nov  6 14:49 FILE2 (this is the hard link as cached for FILE1 is subvol1)
-rw-r-Sr-T. 5 root root 5368709120 Nov  6 14:49 FILE5

/d/backends/patchy2:
total 4
---------T. 2 root root 0 Nov  6 14:50 FILE2 (this is the linkto for FILE2)

/d/backends/patchy3:
total 5164812
---------T. 4 root root 5368709120 Nov  6 14:50 FILE1
---------T. 4 root root 5368709120 Nov  6 14:50 FILE5 (this is the linkto for FILE5 but is a hardlink to FILE1 as GFID is the same (check stat information))

< End of rebalance of FILE1, so on subvol1 we have the P1 file left, which is a linkto file>
[root@marvin ~]# ls -l /d/backends/patchy*
/d/backends/patchy:
total 0

/d/backends/patchy1:
total 12
---------T. 4 root root 0 Nov  6 14:50 FILE2 (bad FILE2: FILE1 was truncated post-migration, made a linkto file and then unlinked, so the hard links survive only as this P2 file, with linkto pointing at subvol3, where FILE2 does not exist; if it did exist it would be a double linkto, which again a lookup-everywhere may clean up)
---------T. 4 root root 0 Nov  6 14:50 FILE5 (good FILE5, but sort of useless as it is a stale linkto; it will get cleaned up later)

/d/backends/patchy2:
total 4
---------T. 2 root root 0 Nov  6 14:50 FILE2 (Hashed subvol of FILE2 pointing to NULL file in subvol1)

/d/backends/patchy3:
total 15728640
-rw-r--r--. 4 root root 5368709120 Nov  6 14:49 FILE1 (FILE1 is now hashed/cached here and good)
-rw-r--r--. 4 root root 5368709120 Nov  6 14:49 FILE5 (FILE5 was created as a hard link to the linkto for FILE1 during rebalance, so now automatically becomes a good file)

So the bottom line is: to resolve the issue, dht_link should follow a file under migration and create the hard link at its _new_ destination as well.

Even if the original file was unlinked during migration, I assume the target would survive, as there are hard links to it; this is a test case that needs to be repeated once we fix the issue in dht_link.
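
To correlate listings like the above with the P1/P2 states, the mode bits and the DHT linkto xattr can be checked directly on the bricks. A rough sketch follows (brick paths taken from the listings above): during phase 1 the source shows the sgid+sticky combination rw-r-Sr-T, while a plain linkto file shows ---------T and carries the trusted.glusterfs.dht.linkto xattr naming the subvolume it points to.

# Inspection sketch; brick paths as in the listings above.
for b in /d/backends/patchy*; do
    echo "== $b =="
    ls -l $b
    # Dump all xattrs of the FILE* names; the dht linkto xattr, when
    # present, names the subvolume the linkto file points to.
    getfattr -h -d -m . -e text $b/FILE* 2>/dev/null | grep -E '^# file:|linkto'
done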

Comment 1 Anand Avati 2014-11-12 15:22:00 UTC
REVIEW: http://review.gluster.org/9105 (cluster/dht: Fix dht_link to follow files under migration) posted (#1) for review on master by Shyamsundar Ranganathan (srangana)

Comment 2 Anand Avati 2014-11-13 14:40:51 UTC
REVIEW: http://review.gluster.org/9105 (cluster/dht: Fix dht_link to follow files under migration) posted (#2) for review on master by Shyamsundar Ranganathan (srangana)

Comment 3 Anand Avati 2015-02-09 20:22:55 UTC
REVIEW: http://review.gluster.org/9105 (cluster/dht: Fix dht_link to follow files under migration) posted (#3) for review on master by Shyamsundar Ranganathan (srangana)

Comment 4 Anand Avati 2015-02-17 16:07:47 UTC
COMMIT: http://review.gluster.org/9105 committed in master by Raghavendra G (rgowdapp) 
------
commit 7c6da2f7ceea2956197641b6cdb1e2f79cdb063e
Author: Shyam <srangana>
Date:   Wed Nov 12 10:12:13 2014 -0500

    cluster/dht: Fix dht_link to follow files under migration
    
    Currently if a file is under migration, a hardlink to that file
    is lost post migration of the file. This is due to the fact that
    the hard link is created against the cached subvol of the source
    and as the source is under migration, it shifts to a linkto file
    post migration. Thus losing the hardlink.
    
    This change follows the stat information that triggers a phase1/2
    detection for a file under migration, to create the link on the new
    subvol that the source file is migrating to. Thereby preserving the
    hard link post migration.
    
    NOTES:
    The test case added creates a ~1GB file, so that we can catch the file
    during migration, smaller files may not capture this state and the
    test may fail.
    Even if migration of the file fails, we would only be left with stale
    linkto files on the subvol that the source was migrating to, which is
    not a problem.
    This change would create a double linkto, i.e. the new target hashed subvol
    would point to the old source cached subvol, which would point to the real
    cached subvol. This double redirection although not handled directly in
    DHT, works as lookup searches everywhere on hitting linkto files. The
    downside is that it never heals the new target hashed subvol linkto
    file, which is another bug to be resolved (does not cause functional
    impact).
    
    Change-Id: I871e6885b15e65e05bfe70a0b0180605493cb534
    BUG: 1161311
    Signed-off-by: Shyam <srangana>
    Reviewed-on: http://review.gluster.org/9105
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: susant palai <spalai>
    Reviewed-by: venkatesh somyajulu <vsomyaju>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>

Comment 5 Niels de Vos 2015-05-14 17:28:22 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

