Distributed volume rebalance errors due to hardlinks to .glusterfs/...

Using GlusterFS 3.4.2, I sometimes get "file has hardlinks" errors in the GlusterFS *-rebalance.log files:

[2014-03-06 03:33:54.782401] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47588.rvf: file has hardlinks
[2014-03-06 03:33:56.893538] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47029.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:33:56.893971] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47029.rvf: file has hardlinks
[2014-03-06 03:34:00.237993] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47009.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:34:00.238369] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47009.rvf: file has hardlinks
[2014-03-06 03:34:02.815369] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:34:02.815624] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf: file has hardlinks

These warnings prevent the re-balancing of the affected files. When this happens, it seems to occur for most or all files on the volume, making rebalance ineffective.

It is understandable that re-balancing a file with hard links could be problematic, as all the links would have to be moved together to properly maintain the linkage. Since the volume is a Samba/FUSE mounted CIFS share, there should not be any hard links made by the client. Yet a stat on any file always shows two links:

# stat /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
  File: `/samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf'
  Size: 65536 Blocks: 128 IO Block: 131072 regular file
Device: 39h/57d Inode: 10611842014504481175 Links: 2
Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: (2123366912/NASLAB+domain admins)
Access: 2014-01-22 13:32:15.518616000 -0800
Modify: 2014-02-13 21:31:37.035164000 -0800
Change: 2014-02-13 21:31:37.035212651 -0800

But there are no other files shown as linked:

# find /samba/U5-4/ -samefile /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf -ls
10611842014504481175 64 -rw-rw---- 2 root NASLAB+domain admins 65536 Feb 13 21:31 /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf

It seems that the 'glusterfs' client mount always shows 2 Links in 'stat' output for some reason.

When I look at the file on the brick, there are also hard links:

# stat /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
  File: `/exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf'
  Size: 65536 Blocks: 128 IO Block: 4096 regular file
Device: 8c0h/2240d Inode: 60078978 Links: 3
Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: (2123366912/NASLAB+domain admins)
Access: 2014-01-22 13:32:15.518616000 -0800
Modify: 2014-02-13 21:31:37.035164000 -0800
Change: 2014-02-13 21:31:37.035212651 -0800

In this case the hard links do exist, and they are in the .glusterfs directory tree:

# find /exports/nas-segment-0002/U5-4/ -samefile /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf -ls
60078978 64 -rw-rw---- 3 root NASLAB+domain admins 65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
60078978 64 -rw-rw---- 3 root NASLAB+domain admins 65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/.glusterfs/fd/9f/fd9f0601-3c28-49ad-86c4-569d4a6b63a0
60078978 64 -rw-rw---- 3 root NASLAB+domain admins 65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197

# getfattr -d -m 'trusted' /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
trusted.gfid=0s746arlB1RKqTRNYWlxrxlw==

I am surprised that there are any '.glusterfs' directories at all. I've read that these are primarily used for replica volume healing, but this volume has never been a replica; it is only distributed:

# gluster volume info U5-4
Volume Name: U5-4
Type: Distribute
Volume ID: 9edcf168-7e8c-4fb4-b032-c6942d3ce8dd
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.10.60.235:/exports/nas-segment-0001/U5-4
Brick2: 10.10.60.235:/exports/nas-segment-0002/U5-4
Brick3: 10.10.60.60:/exports/nas-segment-0001/U5-4
Brick4: 10.10.60.232:/exports/nas-segment-0001/U5-4
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
nfs.addr-namelookup: off

I found that if I deleted the '.glusterfs' directory trees manually, the volume rebalance would then work without any "file has hardlinks" errors, but I shouldn't really be doing this.

Should GlusterFS be blocking the re-balancing of files that have hard links to the .glusterfs hierarchy that it created itself?
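The manual stat and find -samefile check above can be repeated for a whole brick in one pass. The following is a minimal read-only sketch, not a GlusterFS tool: the function names links_by_inode and glusterfs_links are made up for illustration, and it assumes only what the output above shows, namely that the extra links live under a .glusterfs directory at the brick root.

```python
import os
import stat
from collections import defaultdict


def links_by_inode(brick_root):
    """Group every regular file path under brick_root by (device, inode)."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_ino)].append(path)
    return groups


def glusterfs_links(brick_root):
    """Map each data file to the .glusterfs paths that share its inode.

    Hypothetical helper: it only reports what is on disk and does not
    decide which links are legitimate.
    """
    internal = os.path.join(brick_root, ".glusterfs") + os.sep
    report = {}
    for paths in links_by_inode(brick_root).values():
        inside = sorted(p for p in paths if p.startswith(internal))
        outside = sorted(p for p in paths if not p.startswith(internal))
        for data_path in outside:
            report[data_path] = inside
    return report
```

Calling glusterfs_links("/exports/nas-segment-0002/U5-4") on the brick host would return, for each file, the list of .glusterfs names that point at the same inode, which is the same information as the find -samefile output but for every file at once.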
Case 1:
-------
If rebalance is triggered by executing "gluster volume rebalance volume-name start", a file that has hard links (not counting its one link file under .glusterfs) will not be migrated.

Case 2:
-------
If a brick is removed, files will be migrated even if they have hard links.

In case 1: any new file created on a gluster volume has only one hard link to the actual file under .glusterfs. This hard link under .glusterfs is present for all types of volumes (distributed and replicate-distribute volumes), and rebalance should not be blocked as long as there is just one link present inside the .glusterfs directory.

But in your system, there are two hard links under .glusterfs.

Link-1:
--------
60078978 /exports/nas-segment-0002/U5-4/.glusterfs/fd/9f/fd9f0601-3c28-49ad-86c4-569d4a6b63a0

Link-2:
--------
60078978 /exports/nas-segment-0002/U5-4/.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197

Link-2 is the legitimate one, because the gfid of the file and the hard-link name are the same. The presence of Link-1 is suspicious. I will update Bugzilla if I can come up with cases that can lead to such a situation. If you can describe what operations were performed at the mount, or provide any script that ran at the mount point, it will be helpful.
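The match between the gfid and the Link-2 name can be checked by decoding the trusted.gfid value from the getfattr output in the report. This is a minimal sketch: the function names are made up for illustration, the "0s" prefix is getfattr's marker for a base64-encoded value, and the .glusterfs/<first two hex digits>/<next two>/<gfid> layout is inferred from the two paths listed in this report rather than taken from GlusterFS documentation.

```python
import base64
import os
import uuid


def gfid_from_xattr(value):
    """Decode a getfattr-style base64 value ('0s' prefix) into a UUID."""
    if not value.startswith("0s"):
        raise ValueError("expected a base64 value with the '0s' prefix")
    raw = base64.b64decode(value[2:])  # a gfid is 16 raw bytes
    return uuid.UUID(bytes=raw)


def expected_glusterfs_path(brick_root, gfid):
    """Build the .glusterfs link path implied by a gfid.

    Layout assumed from the paths shown in this report.
    """
    text = str(gfid)
    return os.path.join(brick_root, ".glusterfs", text[0:2], text[2:4], text)


gfid = gfid_from_xattr("0s746arlB1RKqTRNYWlxrxlw==")
print(gfid)
# ef8e9aae-5075-44aa-9344-d616971af197
print(expected_glusterfs_path("/exports/nas-segment-0002/U5-4", gfid))
# /exports/nas-segment-0002/U5-4/.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197
```

The decoded value matches the Link-2 name, so any other .glusterfs name sharing the inode (here fd9f0601-3c28-49ad-86c4-569d4a6b63a0) is the unexpected one.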
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify whether newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.
GlusterFS 3.4.x has reached end-of-life. If this bug still exists in a later release please reopen this and change the version or open a new bug.