Bug 1399432 - A hard link is lost during rebalance+lookup
Summary: A hard link is lost during rebalance+lookup
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On: 1392837 1396048
Blocks:
 
Reported: 2016-11-29 04:37 UTC by Mohit Agrawal
Modified: 2017-01-16 12:26 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.8.8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1392837
Environment:
Last Closed: 2017-01-16 12:26:06 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Mohit Agrawal 2016-11-29 04:37:39 UTC
+++ This bug was initially created as a clone of Bug #1392837 +++

Description of problem:
=======================
Hard link fl3275 is lost when the steps below are performed.

Version-Release number of selected component (if applicable):
3.8.4-3.el7rhgs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1) Create a distributed-replicate volume and start it.
2) FUSE-mount the volume on multiple clients.
3) Perform the tasks below simultaneously from multiple clients:
     a) From client-1, create files:       for i in {1..20000}; do touch f$i; done
     b) From client-2, create hard links:  for i in {1..20000}; do ln f$i fl$i; done
     c) From client-3, change permissions: for i in {1..20000}; do chmod 660 f$i; done
     d) From client-4, run a continuous lookup.
4) While the tasks in step 3 are in progress, add a few bricks to the volume and start rebalance.
5) Wait until steps 3 and 4 complete.
6) Check the created files and the hard-link count.

Actual results:
===============
Hard link fl3275 is lost

Expected results:
=================
No data loss should be seen.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-11-08 05:57:39 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Prasad Desala on 2016-11-08 06:12:32 EST ---

Adding a point:
- The hard link 'fl3275' is not present on either the mount point or the subvols.
- The original file 'f3275' exists on both the mount point and the subvols.

sosreports@ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1392837/

Additional info:
===============
volume name: distrep
FUSE-mounted the volume on the clients below:
Client-1: 10.70.42.156    mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> created the files
Client-2: 10.70.41.254    mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> created the hard links
Client-3: 10.70.42.55     mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> changed permissions
Client-4: 10.70.42.21     mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> lookups

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 1e411efc-9f16-41cf-99ad-8b28f1c7d935
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.42.7:/bricks/brick0/b0
Brick2: 10.70.41.211:/bricks/brick0/b0
Brick3: 10.70.43.141:/bricks/brick0/b0
Brick4: 10.70.43.156:/bricks/brick0/b0
Brick5: 10.70.42.7:/bricks/brick1/b1
Brick6: 10.70.41.211:/bricks/brick1/b1
Brick7: 10.70.43.141:/bricks/brick1/b1
Brick8: 10.70.43.156:/bricks/brick1/b1
Brick9: 10.70.42.7:/bricks/brick2/b2
Brick10: 10.70.41.211:/bricks/brick2/b2
Brick11: 10.70.43.141:/bricks/brick2/b2
Brick12: 10.70.43.156:/bricks/brick2/b2
Brick13: 10.70.42.7:/bricks/brick3/b3
Brick14: 10.70.41.211:/bricks/brick3/b3
Brick15: 10.70.43.141:/bricks/brick3/b3
Brick16: 10.70.43.156:/bricks/brick3/b3
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

Brick logs:
===========
[root@dhcp42-7 bricks]# grep -i fl3275 bricks-brick*
bricks-brick0-b0.log:[2016-11-08 07:23:36.833098] I [MSGID: 113030] [posix.c:1956:posix_unlink] 0-distrep-posix: open-fd-key-status: 0 for /bricks/brick0/b0/fl3275
bricks-brick0-b0.log:[2016-11-08 07:23:36.833348] I [MSGID: 113031] [posix.c:1867:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /bricks/brick0/b0/fl3275
bricks-brick2-b2.log:[2016-11-08 07:23:36.839643] I [MSGID: 113030] [posix.c:1956:posix_unlink] 0-distrep-posix: open-fd-key-status: 0 for /bricks/brick2/b2/fl3275
bricks-brick2-b2.log:[2016-11-08 07:23:36.839845] I [MSGID: 113031] [posix.c:1867:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /bricks/brick2/b2/fl3275

FUSE logs:
==========
[root@dhcp42-156 glusterfs]# grep -i fl3275 mnt-fuse.log
[2016-11-08 07:23:53.829436] I [MSGID: 109070] [dht-common.c:1942:dht_lookup_linkfile_cbk] 2-distrep-dht: lookup of /fl3275 on distrep-replicate-0 (following linkfile) reached link,gfid = 00000000-0000-0000-0000-000000000000
[2016-11-08 07:23:53.831770] I [MSGID: 109045] [dht-common.c:1821:dht_lookup_everywhere_cbk] 2-distrep-dht: attempting deletion of stale linkfile /fl3275 on distrep-replicate-0 (hashed subvol is distrep-replicate-4)
[2016-11-08 07:23:53.838362] I [MSGID: 109069] [dht-common.c:1133:dht_lookup_unlink_cbk] 2-distrep-dht: lookup_unlink returned with op_ret -> 0 and op-errno -> 0 for /fl3275
[2016-11-08 07:23:53.843772] I [MSGID: 109069] [dht-common.c:1223:dht_lookup_unlink_stale_linkto_cbk] 2-distrep-dht: Returned with op_ret 0 and op_errno 0 for /fl3275

--- Additional comment from Nithya Balachandran on 2016-11-13 23:47:24 EST ---

RCA:

To be confirmed, but this is most likely the cause:

Rebalance skips files with hard links, except in the case of a remove-brick operation.

In dht_migrate_file(), __is_file_migratable() checks whether a file has hard links. If it does, the file is not migrated.

Later in dht_migrate_file(), __dht_rebalance_open_src_file() sets the trusted.dht.linkto xattr and the S and T bits in the file mode to indicate that the file is being migrated.

dht_link_cbk() checks whether the file on which a hard link was created is being migrated. If it is, it repeats the link operation on the dst subvol as well.

If a hard link is created after __is_file_migratable() runs but before __dht_rebalance_open_src_file() marks the file, the link is never replayed on the dst subvol; it ends up as a hard link to the linkto file on the src subvol, which is deleted after migration because it is considered a stale linkto file. A sketch of this window follows.
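To make the window concrete, below is a minimal, compilable model of the suspected ordering. The function names mirror dht-rebalance.c, but the signatures and state flags are invented for illustration; this is a sketch of the race, not the real GlusterFS code.

    /* race_sketch.c: hypothetical model of the race in dht_migrate_file() */
    #include <stdbool.h>
    #include <stdio.h>

    static bool has_hardlinks;    /* stands in for the st_nlink > 1 check  */
    static bool marked_migrating; /* stands in for linkto xattr + S+T bits */

    /* models __is_file_migratable(): files with hard links are skipped */
    static bool is_file_migratable(void) { return !has_hardlinks; }

    /* models __dht_rebalance_open_src_file(): mark the src as migrating */
    static void open_src_file(void) { marked_migrating = true; }

    /* models a client's ln: dht_link_cbk() replays the link on the dst
     * subvol only if the file is already marked as being migrated */
    static void client_creates_hardlink(void) {
        has_hardlinks = true;
        if (marked_migrating)
            printf("link replayed on dst subvol: link survives\n");
        else
            printf("link lands only on the src linkto file: lost when "
                   "the stale linkto file is unlinked after migration\n");
    }

    int main(void) {
        if (!is_file_migratable())
            return 0;              /* linked files are skipped up front */
        client_creates_hardlink(); /* <-- the window hit in this bug    */
        open_src_file();           /* the marking happens too late      */
        return 0;
    }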

--- Additional comment from Mohit Agrawal on 2016-11-16 00:12:52 EST ---

Hi,

  To debug the issue with gdb, I put a sleep() before the call to
  dht_migrate_file() in rebalance_task().

  I then executed the steps below from a terminal to create a volume,
  touch two files on it, add a new brick, and start the rebalance process.

  >>>>>>>>>>>>>>>

  pkill -f gluster;rm -rf /var/lib/glusterd/*;
  rm -rf /var/log/glusterfs/*;rm -rf  /dist1/brick*;
  systemctl restart glusterd.service;
  gluster volume create dist 10.65.7.252:/dist1/brick1;
  gluster volume start dist;
  mount -t glusterfs 10.65.7.252:/dist /mnt;cd /mnt;touch 7;touch 5;cd - ;


  gluster volume add-brick dist 10.65.7.252:/dist1/brick2

  gluster volume rebalance dist start force

  gluster v rebalance dist status

  >>>>>>>>>>>>>>>>>>>

  From another terminal, find the PID of the rebalance process, attach
  gdb to it, set a breakpoint just after the __is_file_migratable() call,
  and create a hard link while the process is stopped:

  # gdb -p <rebalance-pid>
     break dht-rebalance.c:1297
     cont
     shell ln /mnt/7 /mnt/file7     (creates the hard link)
     cont
     quit


   After running all the above steps, I observed that the hard-link file
   is not visible on the mount point, but it is present on the brick:

ls /mnt/
5  7

ls /dist1/*
/dist1/brick1:
file7

/dist1/brick2:
5  7


Regards
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2016-11-16 06:38:23 EST ---

Hi,

One more thing I want to share: after running the rebalance daemon again, the hard link is migrated to brick2 and becomes visible on the mount point.

Regards
Mohit Agrawal

--- Additional comment from Prasad Desala on 2016-11-16 07:06:14 EST ---

(In reply to Mohit Agrawal from comment #4)
> After running all the above steps, I observed that the hard-link file
> is not visible on the mount point, but it is present on the brick.
When I hit the issue, however, the hard-link file was completely lost: it was present neither on the mount point nor on the bricks. Please see Comment 2.


--- Additional comment from Mohit Agrawal on 2016-11-16 09:08:08 EST ---

Hi Prasad,

Sorry, I missed comment #2. I tried to reproduce the issue in a minimal environment (3 VMs: 2 used as servers and 1 as a client, with all client mounts on the same VM) after making minor improvements to the steps, but did not succeed:

1) for i in {1..20000}; do touch f$i; done
2) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; ln f$i fl$i; done
3) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; chmod 660 f$i; done
4) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; stat f$i; done

Is it possible to share steps to reproduce this in a minimal environment?

Regards
Mohit Agrawal

--- Additional comment from Atin Mukherjee on 2016-11-21 00:02:18 EST ---

Upstream mainline patch http://review.gluster.org/15866 has been posted for review.

--- Additional comment from Atin Mukherjee on 2016-11-25 06:26:32 EST ---

As per today's triage meeting, and based on the data available at https://docs.google.com/spreadsheets/d/1ew4cafcvIVEWuJ4tLDuZ4ao7ZTYpsRz5NwCtQ4JVZaQ/edit#gid=0, this BZ has been accepted by all the stakeholders for rhgs-3.2.0. Providing devel_ack.

Comment 1 Worker Ant 2016-11-29 04:52:58 UTC
REVIEW: http://review.gluster.org/15951 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)

Comment 2 Mohit Agrawal 2016-11-29 04:54:41 UTC
The patch has been posted for review:
http://review.gluster.org/#/c/15951/

Comment 3 Worker Ant 2016-11-29 05:21:55 UTC
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)

Comment 4 Worker Ant 2016-11-29 05:25:59 UTC
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#2) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)

Comment 5 Worker Ant 2016-11-29 09:42:28 UTC
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#3) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)

Comment 6 Worker Ant 2016-12-15 05:32:07 UTC
COMMIT: http://review.gluster.org/15954 committed in release-3.8 by Raghavendra G (rgowdapp) 
------
commit 145185454464c1c45af64c13919e6fe5bf559769
Author: Mohit Agrawal <moagrawa>
Date:   Tue Nov 29 10:50:04 2016 +0530

    cluster/dht: A hard link is lost during rebalance + lookup
    
    Problem: A hard link is lost during rebalance + lookup. Rebalance
             skips a file if the file has hard links. In
             dht_migrate_file(), __is_file_migratable() checks whether
             a file has hard links; if it does, the file is not
             migrated. But if a link is created after this function is
             called, the link is lost.
    
    Solution: Call __check_file_has_hardlink() to check for hard links
              again after the (S+T) bits are set in the migration
              process; if the file has a hard link at that point, skip
              the file in the rebalance process.
    
    > BUG: 1396048
    > Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12
    > Signed-off-by: Mohit Agrawal <moagrawa>
    > Reviewed-on: http://review.gluster.org/15866
    > Reviewed-by: Susant Palai <spalai>
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: N Balachandran <nbalacha>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > (cherry picked from commit 71dd2e914d4a537bf74e1ec3a24512fc83bacb1d)
    
    BUG: 1399432
    Change-Id: I30e21efd5a054d8a3e640ab3ed8aa7955d083926
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: http://review.gluster.org/15954
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Raghavendra G <rgowdapp>
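
In other words, the patched flow re-checks for hard links once the file has been marked, closing the window. Below is a sketch under the same hypothetical model used in the RCA comment; the flags and signatures are invented for illustration and only the re-check via __check_file_has_hardlink() mirrors the actual patch.

    /* fix_sketch.c: hypothetical model of the patched flow */
    #include <stdbool.h>
    #include <stdio.h>

    static bool has_hardlinks;
    static bool marked_migrating;

    static bool is_file_migratable(void)      { return !has_hardlinks; }
    static void open_src_file(void)           { marked_migrating = true; }
    static void client_creates_hardlink(void) { has_hardlinks = true; }

    /* models __check_file_has_hardlink(): re-run the hard-link check */
    static bool check_file_has_hardlink(void) { return has_hardlinks; }

    int main(void) {
        if (!is_file_migratable())
            return 0;              /* linked files are skipped up front */
        client_creates_hardlink(); /* link lands in the old race window */
        open_src_file();           /* linkto xattr and S+T bits now set */
        if (check_file_has_hardlink()) {
            printf("hard link appeared during setup: skip this file\n");
            return 0;              /* file left in place; link survives */
        }
        /* ... migrate data and clean up the src file ... */
        return 0;
    }

Once the file is marked as migrating, any later link is replayed on the dst subvol by dht_link_cbk() (per the RCA above), so re-checking at this single point is enough to close the window.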

Comment 7 Worker Ant 2016-12-21 23:22:07 UTC
REVIEW: http://review.gluster.org/16258 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on release-3.8-fb by Kevin Vigor (kvigor)

Comment 8 Nithya Balachandran 2017-01-02 06:24:48 UTC
Marking this BZ Modified, as the patch has been merged in release-3.8.

Comment 9 Niels de Vos 2017-01-16 12:26:06 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.8, please open a new bug report.

glusterfs-3.8.8 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-January/000064.html
[2] https://www.gluster.org/pipermail/gluster-users/

