Bug 1073616 - Distributed volume rebalance errors due to hardlinks to .glusterfs/...
Summary: Distributed volume rebalance errors due to hardlinks to .glusterfs/...
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.4.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-03-06 19:44 UTC by Jeff Byers
Modified: 2015-10-07 13:50 UTC
CC: 4 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-10-07 13:49:43 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Jeff Byers 2014-03-06 19:44:25 UTC
Distributed volume rebalance errors due to hardlinks to .glusterfs/...

Using GlusterFS 3.4.2, I sometimes get "file has
hardlinks" errors in the GlusterFS *-rebalance.logs:

[2014-03-06 03:33:54.782401] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47588.rvf: file has hardlinks
[2014-03-06 03:33:56.893538] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47029.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:33:56.893971] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47029.rvf: file has hardlinks
[2014-03-06 03:34:00.237993] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47009.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:34:00.238369] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47009.rvf: file has hardlinks
[2014-03-06 03:34:02.815369] I [dht-rebalance.c:666:dht_migrate_file] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf: attempting to move from U5-4-client-1 to U5-4-client-2
[2014-03-06 03:34:02.815624] W [dht-rebalance.c:227:__is_file_migratable] 0-U5-4-dht: /cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf: file has hardlinks

These errors prevent the rebalancing of the affected
files. When this happens, it seems to occur for most or
all files on the volume, making rebalance ineffective.
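
For reference, the number of skipped files can be counted
in the rebalance log (assuming the default log location
under /var/log/glusterfs/ and this volume's name):

# grep -c 'file has hardlinks' /var/log/glusterfs/U5-4-rebalance.log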

It is understandable that re-balancing a file with hard
links could be problematic, as all of the links would have
to be moved together to properly maintain the linkage.

Since the volume is a Samba/FUSE mounted CIFS share, there
should not be any hard links made by the client.

Yet, a stat on any file always shows two links:

# stat /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
  File: `/samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf'
  Size: 65536           Blocks: 128        IO Block: 131072 regular file
Device: 39h/57d Inode: 10611842014504481175  Links: 2
Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (2123366912/NASLAB+domain admins)
Access: 2014-01-22 13:32:15.518616000 -0800
Modify: 2014-02-13 21:31:37.035164000 -0800
Change: 2014-02-13 21:31:37.035212651 -0800

But no other files are shown as linked:

# find /samba/U5-4/ -samefile /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf -ls
10611842014504481175   64 -rw-rw----   2 root     NASLAB+domain admins    65536 Feb 13 21:31 /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf

It seems that the 'glusterfs' client mount always shows 2
Links in 'stat' output for some reason.
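
For reference, the link count alone can be read through the
FUSE mount with GNU stat's %h format (hard-link count):

# stat -c '%h' /samba/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
2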

When I look at the file on the brick, there are also hard
links:

# stat /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
  File: `/exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf'
  Size: 65536           Blocks: 128        IO Block: 4096   regular file
Device: 8c0h/2240d      Inode: 60078978    Links: 3
Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (2123366912/NASLAB+domain admins)
Access: 2014-01-22 13:32:15.518616000 -0800
Modify: 2014-02-13 21:31:37.035164000 -0800
Change: 2014-02-13 21:31:37.035212651 -0800

and in this case, the hard links do exist and are in the
.glusterfs directory tree:

# find /exports/nas-segment-0002/U5-4/ -samefile /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf -ls
60078978   64 -rw-rw----   3 root     NASLAB+domain admins    65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
60078978   64 -rw-rw----   3 root     NASLAB+domain admins    65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/.glusterfs/fd/9f/fd9f0601-3c28-49ad-86c4-569d4a6b63a0
60078978   64 -rw-rw----   3 root     NASLAB+domain admins    65536 Feb 13 21:31 /exports/nas-segment-0002/U5-4/.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197

# getfattr -d -m 'trusted' /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
trusted.gfid=0s746arlB1RKqTRNYWlxrxlw==
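
For what it's worth, the same gfid can be read in hex,
which maps directly to the expected path under .glusterfs
(the first two bytes name the two directory levels):

# getfattr -n trusted.gfid -e hex /exports/nas-segment-0002/U5-4/cifs_share/VMW2K8R2-211-140/VMW2K8R2-211WRC47518.rvf
trusted.gfid=0xef8e9aae507544aa9344d616971af197

so the one expected link is
.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197,
which matches the second of the two links found above.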

I am surprised that there are any '.glusterfs' directories
at all. I've read that these are primarily used for replica
volume healing, but this volume has never been a replica;
it has always been distributed only:

# gluster volume info U5-4
Volume Name: U5-4
Type: Distribute
Volume ID: 9edcf168-7e8c-4fb4-b032-c6942d3ce8dd
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.10.60.235:/exports/nas-segment-0001/U5-4
Brick2: 10.10.60.235:/exports/nas-segment-0002/U5-4
Brick3: 10.10.60.60:/exports/nas-segment-0001/U5-4
Brick4: 10.10.60.232:/exports/nas-segment-0001/U5-4
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
nfs.addr-namelookup: off

I found that if I deleted the '.glusterfs' directory trees
manually, the volume rebalance would then work without any
"file has hardlinks" errors, but I shouldn't really be
doing this.

Should GlusterFS be blocking the re-balancing of files
that have hard links to the .glusterfs hierarchy that it
created itself?

Comment 1 vsomyaju 2014-06-27 07:37:37 UTC
Case 1:
-------
If rebalance is triggered by executing "gluster volume rebalance volume-name
start", then files that have hard links (excluding the one link under .glusterfs) will not be migrated.

Case 2:
-------
But if a brick is removed, files will be migrated even if they have hard links.
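
For illustration only, the two cases correspond to commands
like the following (the brick name is just an example taken
from the volume info in the description):

# gluster volume rebalance U5-4 start
# gluster volume remove-brick U5-4 10.10.60.232:/exports/nas-segment-0001/U5-4 start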

In case 1: any new file created under a gluster volume will contain only one hard link to the actual file, under .glusterfs.
 
A hard link of the file under .glusterfs is present for all volume types (distributed and distributed-replicate volumes).

Rebalance should not be blocked as long as there is just one link present
inside the .glusterfs directory.
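
One way to scan a brick for files in the problematic state
(more links than the single expected one under .glusterfs)
is GNU find's -links test; note that legitimate
user-created hard links would match as well:

# find /exports/nas-segment-0002/U5-4 -path '/exports/nas-segment-0002/U5-4/.glusterfs' -prune -o -type f -links +2 -print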

But on your system, there are two hard links under .glusterfs.

Link-1:
--------
60078978   
/exports/nas-segment-0002/U5-4/.glusterfs/fd/9f/fd9f0601-3c28-49ad-86c4-569d4a6b63a0

Link-2:
--------
60078978    
/exports/nas-segment-0002/U5-4/.glusterfs/ef/8e/ef8e9aae-5075-44aa-9344-d616971af197


Link-2 is the legitimate one, because the file's gfid and the hard-link name are the same.
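
This can be verified by decoding the base64 gfid shown in
the description (the '0s' prefix denotes base64 encoding):

# echo '746arlB1RKqTRNYWlxrxlw==' | base64 -d | xxd -p
ef8e9aae507544aa9344d616971af197

which is ef8e9aae-5075-44aa-9344-d616971af197, i.e. Link-2.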

The presence of Link-1 is suspicious.
I will update the bugzilla if I can come up with cases which could lead to such a situation.

It would be helpful if you could provide the operations that were performed at the mount point, or any script which ran there.

Comment 2 Niels de Vos 2015-05-17 22:00:12 UTC
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained, at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify if newer versions are affected with the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.

Comment 3 Kaleb KEITHLEY 2015-10-07 13:49:43 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release please reopen this and change the version or open a new bug.

Comment 4 Kaleb KEITHLEY 2015-10-07 13:50:53 UTC
GlusterFS 3.4.x has reached end-of-life.

If this bug still exists in a later release please reopen this and change the version or open a new bug.

