Bug 1305490 - [Tier]: Creation of hardlink on tiered volume failed with EXISTS
Status: POST
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: 3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assigned To: Mohammed Rafi KC
QA Contact: nchilaka
Whiteboard: tier-fuse-nfs-samba
Keywords: ZStream
Depends On:
Blocks: 1316517 1268895
Reported: 2016-02-08 07:15 EST by Rahul Hinduja
Modified: 2018-02-07 19:11 EST

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
A race condition between tier migration and hard link creation results in the hard link operation failing with a 'File exists' error, and logging 'Stale file handle' messages on the client. This does not impact functionality, and file access works as expected. This race occurs when a file is migrated to the cold tier after a hard link has been created on the cold tier, but before a hard link is created to the data on the hot tier. In this situation, the attempt to create a hard link on the hot tier fails. However, because the migration converts the hard link on the cold tier to a data file, and a hard link to the file already exists on the cold tier, the links exist and work as expected.
Story Points: ---
Clone Of:
Clones: 1316517
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Rahul Hinduja 2016-02-08 07:15:18 EST
Description of problem:
=======================

Created 2000 files on a tiered volume, which started demotion of files. Then tried to create a hard link for each of the 2000 files; one of the link operations reported a failure:

[root@dj fuse]# for i in {1..2000}; do ln file.$i hardlink_1.$i  ; done
ln: failed to create hard link ‘hardlink_1.1398’: File exists
[root@dj fuse]#


But the file is present only once:

[root@dj fuse]# ll hardlink_1.1398
-rw-r--r--. 2 root root 1048576 Feb  8  2016 hardlink_1.1398
[root@dj fuse]# 


[root@dj fuse]# ls -l hardlink_1.1398
-rw-r--r--. 2 root root 1048576 Feb  8  2016 hardlink_1.1398
[root@dj fuse]# stat hardlink_1.1398
  File: ‘hardlink_1.1398’
  Size: 1048576   	Blocks: 2048       IO Block: 131072 regular file
Device: 26h/38d	Inode: 10987475693105410840  Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:fusefs_t:s0
Access: 2016-02-08 05:14:59.915525000 +0530
Modify: 2016-02-08 05:14:59.941525000 +0530
Change: 2016-02-08 05:25:33.436849308 +0530
 Birth: -
[root@dj fuse]# 
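
To confirm across the whole set that each link really exists despite the reported error, the source and link inodes can be compared. The following is a minimal sketch, assuming the file naming used in the loop above (file.N and hardlink_1.N); the function name check_hardlinks and the mount path in the comment are hypothetical and not part of the original report.

```python
import os


def check_hardlinks(directory, count, src_prefix="file.", dst_prefix="hardlink_1."):
    """Return the indices whose link is missing or does not share an inode with its source."""
    bad = []
    for i in range(1, count + 1):
        src = os.path.join(directory, f"{src_prefix}{i}")
        dst = os.path.join(directory, f"{dst_prefix}{i}")
        try:
            src_stat = os.stat(src)
            dst_stat = os.stat(dst)
        except FileNotFoundError:
            # Either the source or the link does not exist at all.
            bad.append(i)
            continue
        same_inode = (src_stat.st_dev, src_stat.st_ino) == (dst_stat.st_dev, dst_stat.st_ino)
        # A working hard link shares the inode and raises the link count to at least 2.
        if not same_inode or dst_stat.st_nlink < 2:
            bad.append(i)
    return bad


# Hypothetical usage on the client mount point:
#   check_hardlinks("/mnt/fuse", 2000)
```

An empty result would match the observation in this report: the link for file.1398 exists and works even though ln reported a failure.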

Client logs:
============

[2016-02-07 23:00:47.357412] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-master-client-12: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-02-07 23:00:47.358235] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-master-client-12: Connected to master-client-12, attached to remote volume '/rhs/brick3/tier0'.
[2016-02-07 23:00:47.358262] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-master-client-12: Server and Client lk-version numbers are not same, reopening the fds
[2016-02-07 23:00:47.362914] I [fuse-bridge.c:5123:fuse_graph_setup] 0-fuse: switched to graph 0
[2016-02-07 23:00:47.363436] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-master-client-12: Server lk version = 1
[2016-02-07 23:00:47.363591] I [fuse-bridge.c:4040:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2016-02-07 23:08:33.028518] W [MSGID: 114031] [client-rpc-fops.c:2812:client3_3_link_cbk] 0-master-client-13: remote operation failed: (/file.1398 -> /hardlink_1.1398) [Stale file handle]
[2016-02-07 23:08:33.028713] W [MSGID: 114031] [client-rpc-fops.c:2812:client3_3_link_cbk] 0-master-client-12: remote operation failed: (/file.1398 -> /hardlink_1.1398) [Stale file handle]
[2016-02-07 23:08:33.029666] W [fuse-bridge.c:464:fuse_entry_cbk] 0-glusterfs-fuse: 39657: LINK() /hardlink_1.1398 => -1 (Stale file handle)
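
For triage on a larger client log, the failed LINK calls can be extracted mechanically. This is a rough sketch that assumes the fuse_entry_cbk warning format exactly as it appears in the excerpt above; the pattern and the function name failed_links are illustrative and may need adjusting for other GlusterFS versions.

```python
import re

# Matches the FUSE-level warning logged when a LINK call returns an error,
# in the format shown in the client log excerpt above.
LINK_FAILURE = re.compile(
    r"^\[(?P<time>[^\]]+)\] W \[fuse-bridge\.c:\d+:fuse_entry_cbk\] "
    r"\S+: \d+: LINK\(\) (?P<path>\S+) => -1 \((?P<error>[^)]+)\)"
)


def failed_links(lines):
    """Return (time, path, error) for every failed LINK found in the given log lines."""
    results = []
    for line in lines:
        match = LINK_FAILURE.match(line.strip())
        if match:
            results.append((match["time"], match["path"], match["error"]))
    return results
```

Passing an open log file object as lines works, since the function only iterates over it.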


Hot tier Bricks logs:
=====================

[2016-02-07 23:40:16.698624] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2016-02-07 23:40:16.796903] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2016-02-07 23:40:16.801922] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-02-07 23:40:16.803407] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-02-07 23:40:46.441479] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-master-server: accepted client from dj.lab.eng.blr.redhat.com-27723-2016/02/07-23:00:47:130217-master-client-12-0-0 (version: 3.7.5)
[2016-02-07 23:41:17.403764] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-master-server: accepted client from mia.lab.eng.blr.redhat.com-15904-2016/02/08-04:31:18:472434-master-client-12-0-0 (version: 3.7.5)
[2016-02-07 23:48:32.136403] I [MSGID: 115062] [server-rpc-fops.c:1200:server_link_cbk] 0-master-server: 62496: LINK (null) (c629cb4c-7528-4943-987b-5aedc1c97318) -> 00000000-0000-0000-0000-000000000001/hardlink_1.1398 ==> (Stale file handle) [Stale file handle]


Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.7.5-19.el7rhgs.x86_64


How reproducible:
=================

1/1


Steps to Reproduce:
===================
1. Create a tiered volume.
2. Create 2000 files so that demotions start.
3. Once demotions have started, create hard links for all 2000 files.

Actual results:
===============

Hard link creation failed with a 'File exists' error on the command line. Functionally, however, the hard link is created.


Expected results:
=================

No failure should be reported on the command line.
Comment 3 Mohammed Rafi KC 2016-02-08 07:37:27 EST
This looks like a race between tier migration and hardlink creation. If the file is in the hot tier, we first create the link on the cold tier against the linkfile, and then create the link on the hot tier. If the file is completely moved from the hot tier to the cold tier just before the hardlink is created on the hot tier, the hot tier operation fails with an ESTALE error, because the brick has already forgotten the inode.

Since the hardlink was already created on the destination (the cold tier), migration converts it to the actual file, so everything works as expected.
Comment 13 Mohammed Rafi KC 2016-03-10 09:00:34 EST
Upstream patch: http://review.gluster.org/#/c/13672/

This patch resolves the race condition with linkfile creation.

When the file is in the hot tier and the linkfile is in the cold tier:


1) A hardlink is created on the cold tier for the linkfile.
2) If link creation on the hot tier succeeds, return success to the application.
3) Else, if link creation on the hot tier fails:
 4) Do a lookup on the hot tier with the source path.
 5) If the source is present:
    5.1 if the gfid differs            // the file may have been recreated with a different gfid
        5.1.1 succeed the link call
    5.2 else if the gfid is the same
        5.2.1 fail the link call
 6) Else, if the source is not present       // either migrated or deleted
    6.1 Do a lookup on the dst loc in the cold tier
    6.2 if dst is not present
        6.2.1 The created link was deleted. Fail the link call
    6.3 else if dst is present
        6.3.1 if it is a linkto file // the source could have been deleted
            6.3.1.1 if the gfid is the same
               6.3.1.1.1 delete the linkfile
               6.3.1.1.2 fail the link call
            6.3.1.2 else if the gfid differs
               6.3.1.2.1 fail the link call
        6.3.2 else if it is a regular file
            6.3.2.1 if the gfid is the same
               6.3.2.1.1 succeed the link call
            6.3.2.2 else if the gfid differs
               6.3.2.2.1 fail the link call

7) End
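
The decision tree above can be restated as a small pure function. This is only an illustrative sketch in Python, not the code from the upstream patch (GlusterFS itself is written in C). The names Lookup, Outcome, and resolve_link are hypothetical, and it is an assumption here that "gfid" in each comparison refers to the gfid of the inode the link call was issued on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lookup:
    """Result of a successful lookup (hypothetical type, for illustration only)."""
    gfid: str
    is_linkto: bool = False


@dataclass
class Outcome:
    success: bool                  # what is returned to the application
    delete_linkfile: bool = False  # whether the stale cold tier linkfile is removed


def resolve_link(gfid: str, hot_link_ok: bool,
                 hot_src: Optional[Lookup], cold_dst: Optional[Lookup]) -> Outcome:
    """Decide the result of a link call after the cold tier link was created (step 1).

    gfid     -- gfid of the inode the link call was issued on (assumption)
    hot_src  -- lookup of the source path on the hot tier, None if absent
    cold_dst -- lookup of the destination on the cold tier, None if absent
    """
    if hot_link_ok:                          # step 2
        return Outcome(True)
    if hot_src is not None:                  # step 5: source still on the hot tier
        # 5.1 different gfid: succeed; 5.2 same gfid: fail
        return Outcome(hot_src.gfid != gfid)
    if cold_dst is None:                     # step 6.2: created link was deleted
        return Outcome(False)
    same = cold_dst.gfid == gfid
    if cold_dst.is_linkto:                   # step 6.3.1: fail, clean up if same gfid
        return Outcome(False, delete_linkfile=same)
    return Outcome(same)                     # step 6.3.2: regular file


# The race from this bug: the hot tier link fails, the source has migrated,
# and dst on the cold tier is now a regular file with the same gfid.
print(resolve_link("c629cb4c", False, None, Lookup("c629cb4c")))
```

In that last case the sketch reports success, which corresponds to the behaviour this bug asks for: no error on the command line when the link exists and works.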
Comment 16 Mike McCune 2016-03-28 19:28:22 EDT
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please contact mmccune@redhat.com with any questions.
