+++ This bug was initially created as a clone of Bug #1392837 +++

Description of problem:
=======================
Hard link fl3275 is lost when the steps below are performed.

Version-Release number of selected component (if applicable):
3.8.4-3.el7rhgs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1) Create a distributed-replicate volume and start it.
2) FUSE mount the volume on multiple clients.
3) Perform the tasks below simultaneously from multiple clients:
   a) From client-1, create the files: for i in {1..20000}; do touch f$i; done
   b) From client-2, create hard links for the created files: for i in {1..20000}; do ln f$i fl$i; done
   c) From client-3, change the permissions of the created files: for i in {1..20000}; do chmod 660 f$i; done
   d) From client-4, do a continuous lookup.
4) While the tasks in step 3 are in progress, add a few bricks to the volume and start rebalance.
5) Wait till steps 3 and 4 complete. Check the created file and hard-link counts.

Actual results:
===============
Hard link fl3275 is lost.

Expected results:
=================
No data loss should be seen.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-11-08 05:57:39 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Prasad Desala on 2016-11-08 06:12:32 EST ---

Adding a point:
- The hard link 'fl3275' is not present on either the mount point or the subvols.
- The original file 'f3275' exists on both the mount point and the subvols.
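The count check in step 5 can be sketched as a small shell function. This is an illustration, not part of the original report; the f$i/fl$i naming and the 20000 count are taken from the steps above, and the mount path is a placeholder.

```shell
# check_hardlinks DIR N: report every hard link fl$i that is missing for
# an existing original file f$i, then print a summary count.
check_hardlinks() {
    dir=$1
    n=$2
    missing=0
    for i in $(seq 1 "$n"); do
        if [ -e "$dir/f$i" ] && [ ! -e "$dir/fl$i" ]; then
            echo "missing hard link: fl$i"
            missing=$((missing + 1))
        fi
    done
    echo "missing hard links: $missing"
}

# Example: check_hardlinks /mnt/fuse 20000
```

In the run described in this report, such a check would flag fl3275 as the single missing link.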
sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/Prasad/1392837/

Additional info:
================
Volume name: distrep

FUSE mounted the volume on the clients below:
Client-1: 10.70.42.156  mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> created the files
Client-2: 10.70.41.254  mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> created the hard links
Client-3: 10.70.42.55   mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> changed permissions
Client-4: 10.70.42.21   mount -t glusterfs 10.70.42.7:/distrep /mnt/fuse/  --> lookups

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 1e411efc-9f16-41cf-99ad-8b28f1c7d935
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.42.7:/bricks/brick0/b0
Brick2: 10.70.41.211:/bricks/brick0/b0
Brick3: 10.70.43.141:/bricks/brick0/b0
Brick4: 10.70.43.156:/bricks/brick0/b0
Brick5: 10.70.42.7:/bricks/brick1/b1
Brick6: 10.70.41.211:/bricks/brick1/b1
Brick7: 10.70.43.141:/bricks/brick1/b1
Brick8: 10.70.43.156:/bricks/brick1/b1
Brick9: 10.70.42.7:/bricks/brick2/b2
Brick10: 10.70.41.211:/bricks/brick2/b2
Brick11: 10.70.43.141:/bricks/brick2/b2
Brick12: 10.70.43.156:/bricks/brick2/b2
Brick13: 10.70.42.7:/bricks/brick3/b3
Brick14: 10.70.41.211:/bricks/brick3/b3
Brick15: 10.70.43.141:/bricks/brick3/b3
Brick16: 10.70.43.156:/bricks/brick3/b3
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

Brick logs:
===========
[root@dhcp42-7 bricks]# grep -i fl3275 bricks-brick*
bricks-brick0-b0.log:[2016-11-08 07:23:36.833098] I [MSGID: 113030] [posix.c:1956:posix_unlink] 0-distrep-posix: open-fd-key-status: 0 for /bricks/brick0/b0/fl3275
bricks-brick0-b0.log:[2016-11-08 07:23:36.833348] I [MSGID: 113031] [posix.c:1867:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /bricks/brick0/b0/fl3275
bricks-brick2-b2.log:[2016-11-08 07:23:36.839643] I [MSGID: 113030] [posix.c:1956:posix_unlink] 0-distrep-posix: open-fd-key-status: 0 for /bricks/brick2/b2/fl3275
bricks-brick2-b2.log:[2016-11-08 07:23:36.839845] I [MSGID: 113031] [posix.c:1867:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /bricks/brick2/b2/fl3275

FUSE logs:
==========
[root@dhcp42-156 glusterfs]# grep -i fl3275 mnt-fuse.log
[2016-11-08 07:23:53.829436] I [MSGID: 109070] [dht-common.c:1942:dht_lookup_linkfile_cbk] 2-distrep-dht: lookup of /fl3275 on distrep-replicate-0 (following linkfile) reached link,gfid = 00000000-0000-0000-0000-000000000000
[2016-11-08 07:23:53.831770] I [MSGID: 109045] [dht-common.c:1821:dht_lookup_everywhere_cbk] 2-distrep-dht: attempting deletion of stale linkfile /fl3275 on distrep-replicate-0 (hashed subvol is distrep-replicate-4)
[2016-11-08 07:23:53.838362] I [MSGID: 109069] [dht-common.c:1133:dht_lookup_unlink_cbk] 2-distrep-dht: lookup_unlink returned with op_ret -> 0 and op-errno -> 0 for /fl3275
[2016-11-08 07:23:53.843772] I [MSGID: 109069] [dht-common.c:1223:dht_lookup_unlink_stale_linkto_cbk] 2-distrep-dht: Returned with op_ret 0 and op_errno 0 for /fl3275

--- Additional comment from Nithya Balachandran on 2016-11-13 23:47:24 EST ---

RCA (to be confirmed, but this is most likely the cause):

Rebalance skips files with hard links, except in the case of a remove-brick operation. In dht_migrate_file(), the __is_file_migratable() function checks whether a file has hard links; if it does, the file is not migrated.

In dht_migrate_file(), __dht_rebalance_open_src_file() sets the trusted.dht.linkto xattr and the S and T bits in the file mode to indicate that the file is being migrated. The dht_link_cbk() function checks whether the file on which the hard link was created is being migrated; if so, it repeats the operation on the dst subvol as well.
If a hard link is created after __is_file_migratable() and before __dht_rebalance_open_src_file(), the link file is not created on the dst subvol, and the hard link ends up on the linkto file, which is deleted post-migration because it is considered a stale linkto file.

--- Additional comment from Mohit Agrawal on 2016-11-16 00:12:52 EST ---

Hi,

To debug the issue with gdb, I put a sleep before the call to dht_migrate_file in rebalance_task.

Then I executed the steps below from a terminal to create a volume, touch 2 files on its brick, add a new brick, and start the rebalance process.

>>>>>>>>>>>>>>>
pkill -f gluster; rm -rf /var/lib/glusterd/*;
rm -rf /var/log/glusterfs/*; rm -rf /dist1/brick*;
systemctl restart glusterd.service;
gluster volume create dist 10.65.7.252:/dist1/brick1;
gluster volume start dist;
mount -t glusterfs 10.65.7.252:/dist /mnt; cd /mnt; touch 7; touch 5; cd -;

gluster volume add-brick dist 10.65.7.252:/dist1/brick2
gluster volume rebalance dist start force
gluster v rebalance dist status
>>>>>>>>>>>>>>>>>>>

From another terminal, find the pid of the rebalance process, attach gdb to it, set a breakpoint after the __is_file_migratable() function, and run a command to create a link, like below:

#gdb
break dht-rebalance.c:1297
cont
shell ln /mnt/7 /mnt/file7   (to create a hard link)
cont
#quit

After running all the above steps, I observed that the hard link file is not available on the mount point, but it is available on the brick:

ls /mnt/
5 7

ls /dist1/*
/dist1/brick1:
file7

/dist1/brick2:
5 7

Regards,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2016-11-16 06:38:23 EST ---

Hi,

One other thing I want to share: after running the rebalance daemon again, the hard link is moved to brick2 and becomes available on the mount point.
Regards,
Mohit Agrawal

--- Additional comment from Prasad Desala on 2016-11-16 07:06:14 EST ---

(In reply to Mohit Agrawal from comment #4)
> Hi,
>
> To debug the issue with gdb i have put sleep before call
> dht_migrate_file in rebalance_task.
>
> Thereafter executes below steps from a terminal to create a volume and
> then touch 2 file on that brick and then add a new brick and start
> rebalance process.
>
> >>>>>>>>>>>>>>>
>
> pkill -f gluster;rm -rf /var/lib/glusterd/*;
> rm -rf /var/log/glusterfs/*;rm -rf /dist1/brick*;
> systemctl restart glusterd.service;
> gluster volume create dist 10.65.7.252:/dist1/brick1;
> gluster volume start dist;
> mount -t glusterfs 10.65.7.252:/dist /mnt;cd /mnt;touch 7;touch 5;cd - ;
>
> gluster volume add-brick dist 10.65.7.252:/dist1/brick2
>
> gluster volume rebalance dist start force
>
> gluster v rebalance dist status
>
> >>>>>>>>>>>>>>>>>>>
>
> From another terminal find the pid of rebalance process and attach the
> same with gdb and put breakpoint after the function(__is_file_migratable)
> and run command to create a linke like below
>
> #gdb
> break dht-rebalance.c:1297
> cont;
> shell /mnt/7 /mnt/file7 (to create a hardlink)
> cont;
> #quit
>
> After run all above steps I have observed there is no hardlink file
> available
> on mount point but the same is available on brick location.

At the time I hit the issue, the hard link file was completely lost: it was not present on either the mount point or the bricks. Please see comment 2.
> ls /mnt/
> 5 7
>
> ls /dist1/*
> /dist1/brick1:
> file7
>
> /dist1/brick2:
> 5 7
>
> Regards
> Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2016-11-16 09:08:08 EST ---

Hi Prasad,

Sorry for missing comment #2. I tried to reproduce the issue in a minimal environment (3 VMs: 2 used as servers and 1 as a client, with all clients on the same VM) after making minor improvements to the steps, but did not have any success:

1) for i in {1..20000}; do touch f$i; done
2) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; ln f$i fl$i; done
3) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; chmod 660 f$i; done
4) for i in {1..20000}; do while [ ! -f ./f$i ]; do echo " " > /dev/null; done; stat f$i; done

Is it possible to share the steps for a minimal environment to reproduce the same?

Regards,
Mohit Agrawal

--- Additional comment from Atin Mukherjee on 2016-11-21 00:02:18 EST ---

Upstream mainline patch http://review.gluster.org/15866 posted for review.

--- Additional comment from Atin Mukherjee on 2016-11-25 06:26:32 EST ---

As per today's triage meeting and based on the data available at https://docs.google.com/spreadsheets/d/1ew4cafcvIVEWuJ4tLDuZ4ao7ZTYpsRz5NwCtQ4JVZaQ/edit#gid=0, this BZ has been accepted by all the stakeholders for rhgs-3.2.0. Providing devel_ack.
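Prasad's observation that fl3275 survives nowhere follows from hard-link semantics: a hard link is only another directory entry for the same inode, and nothing on the surviving file records that the link ever existed, so once the stale-linkto cleanup unlinks the entry there is nothing left to restore. A local-filesystem illustration (any POSIX filesystem, not gluster-specific; `stat -c` is the GNU coreutils form):

```shell
tmp=$(mktemp -d)
echo data > "$tmp/f3275"
ln "$tmp/f3275" "$tmp/fl3275"   # second directory entry, same inode
stat -c %h "$tmp/f3275"         # prints 2: the inode now has two names
rm "$tmp/fl3275"                # the stale-linkto cleanup's unlink, in miniature
stat -c %h "$tmp/f3275"         # prints 1: no trace of fl3275 remains
rm -rf "$tmp"
```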
REVIEW: http://review.gluster.org/15951 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on master by MOHIT AGRAWAL (moagrawa)
Patch is posted for review http://review.gluster.org/#/c/15951/
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#2) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)
REVIEW: http://review.gluster.org/15954 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#3) for review on release-3.8 by MOHIT AGRAWAL (moagrawa)
COMMIT: http://review.gluster.org/15954 committed in release-3.8 by Raghavendra G (rgowdapp)
------
commit 145185454464c1c45af64c13919e6fe5bf559769
Author: Mohit Agrawal <moagrawa>
Date:   Tue Nov 29 10:50:04 2016 +0530

    cluster/dht: A hard link is lost during rebalance + lookup

    Problem: A hard link is lost during rebalance + lookup. Rebalance
    skips a file if it has a hard link: in dht_migrate_file(), the
    __is_file_migratable() function checks whether the file has a hard
    link and, if so, does not migrate it. But if a link is created after
    this function is called, the link is lost.

    Solution: Call __check_file_has_hardlink() to check for a hard link
    again after the (S+T) bits are set in the migration process; if the
    file has a hard link, the rebalance process skips the file.

    > BUG: 1396048
    > Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12
    > Signed-off-by: Mohit Agrawal <moagrawa>
    > Reviewed-on: http://review.gluster.org/15866
    > Reviewed-by: Susant Palai <spalai>
    > Smoke: Gluster Build System <jenkins.org>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.org>
    > Reviewed-by: N Balachandran <nbalacha>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > (cherry picked from commit 71dd2e914d4a537bf74e1ec3a24512fc83bacb1d)

    BUG: 1399432
    Change-Id: I30e21efd5a054d8a3e640ab3ed8aa7955d083926
    Signed-off-by: Mohit Agrawal <moagrawa>
    Reviewed-on: http://review.gluster.org/15954
    CentOS-regression: Gluster Build System <jenkins.org>
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: N Balachandran <nbalacha>
    Reviewed-by: Raghavendra G <rgowdapp>
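The (S+T) bits the fix keys off are visible from a plain shell on a brick: DHT marks linkto and in-migration files with the sticky bit (plus a linkto xattr; in current releases the xattr name is trusted.glusterfs.dht.linkto). A hedged sketch for spotting such files on a brick (the brick path is an example, and this helper is illustrative, not a gluster tool):

```shell
# list_sticky BRICK_DIR: list regular files carrying the sticky bit, the
# mode marker DHT uses for linkto/in-migration files. A fuller check
# would also dump the xattrs, e.g.:
#   getfattr -m . -d -e hex <file>   (look for trusted.glusterfs.dht.linkto)
list_sticky() {
    find "$1" -type f -perm -1000
}

# Example: list_sticky /bricks/brick0/b0
```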
REVIEW: http://review.gluster.org/16258 (cluster/dht: A hard link is lost during rebalance + lookup) posted (#1) for review on release-3.8-fb by Kevin Vigor (kvigor)
Marking this BZ Modified as the patch has been merged in release-3.8.
This bug is getting closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.8.8, please open a new bug report.

glusterfs-3.8.8 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2017-January/000064.html
[2] https://www.gluster.org/pipermail/gluster-users/