| Summary: | [Tier]: Creation of hardlink on tiered volume failed with EXISTS | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Rahul Hinduja <rhinduja> | |
| Component: | tier | Assignee: | Mohammed Rafi KC <rkavunga> | |
| Status: | CLOSED WONTFIX | QA Contact: | Nag Pavan Chilakam <nchilaka> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | rhgs-3.1 | CC: | hgowtham, nbalacha, rhs-bugs, rkavunga, sankarshan | |
| Target Milestone: | --- | Keywords: | ZStream | |
| Target Release: | --- | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | tier-fuse-nfs-samba | |||
| Fixed In Version: | | Doc Type: | Known Issue | |
| Doc Text: | A race condition between tier migration and hard link creation results in the hard link operation failing with a 'File exists' error, and logging 'Stale file handle' messages on the client. This does not impact functionality, and file access works as expected. This race occurs when a file is migrated to the cold tier after a hard link has been created on the cold tier, but before a hard link is created to the data on the hot tier. In this situation, the attempt to create a hard link on the hot tier fails. However, because the migration converts the hard link on the cold tier to a data file, and a hard link to the file already exists on the cold tier, the links exist and work as expected. | |||
| Story Points: | --- | |||
| Clone Of: | ||||
| : | 1316517 (view as bug list) | Environment: | ||
| Last Closed: | 2018-11-08 19:20:36 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1268895, 1316517 | |||
This looks like a race between tier migration and hardlink creation. If the file is on the hot tier, we create the linkto file on the cold tier first and then create the link on the hot tier. If the file has been completely moved from the hot tier to the cold tier just before the hardlink creation on the hot tier, that creation fails with ESTALE, because the inode on the brick side has already been forgotten. Since the hardlink was already created on the destination (cold tier), it is converted into the actual data file, so everything works as expected.

Upstream patch: http://review.gluster.org/#/c/13672/

This patch resolves the race condition in linkto-file creation when the file is on the hot tier and the linkto file is on the cold tier:

1) Hardlink is created on the cold tier for the linkto file.
2) If link creation on the hot tier succeeds, return success to the application.
3) Else, if link creation on the hot tier fails:
4) Do a lookup on the hot tier with the source path.
5) If the source is present:
   5.1) If the gfid differs (the file may have been recreated with a different gfid):
        5.1.1) Treat the link call as successful.
   5.2) Else, if the gfid is the same:
        5.2.1) Fail the link call.
6) Else, if the source is not present (either migrated or deleted):
   6.1) Do a lookup on the destination path in the cold tier.
   6.2) If the destination is not present:
        6.2.1) The created link was deleted; fail the link call.
   6.3) Else, if the destination is present:
        6.3.1) If it is a linkto file (the source could have been deleted):
               6.3.1.1) If the gfid is the same:
                        6.3.1.1.1) Delete the linkto file.
                        6.3.1.1.2) Fail the link call.
               6.3.1.2) Else, if the gfid differs:
                        6.3.1.2.1) Fail the link call.
        6.3.2) Else, if it is a regular file:
               6.3.2.1) If the gfid is the same:
                        6.3.2.1.1) Treat the link call as successful.
               6.3.2.2) Else, if the gfid is different:
                        6.3.2.2.1) Fail the link call.
7) End

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.

As tier is not being actively developed, I'm closing this bug. Feel free to reopen it if necessary.
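For anyone hitting this in the field, the state described above (the link ends up as a real data file on the cold tier even though the client reported 'File exists') can be checked directly on the bricks. The snippet below is a minimal sketch, assuming the brick paths from this setup (/rhs/brick*/tier0) and the standard DHT linkto xattr; adjust the paths and file name for your own layout.

# Run on the storage nodes, not the client.
# A DHT/tier linkto file is zero bytes with the sticky bit set (---------T)
# and carries the trusted.glusterfs.dht.linkto xattr; a real data file has
# the actual size and no such xattr.
for brick in /rhs/brick*/tier0; do
    echo "== $brick =="
    ls -l "$brick/hardlink_1.1398" 2>/dev/null
    getfattr -n trusted.glusterfs.dht.linkto -e text "$brick/hardlink_1.1398" 2>/dev/null
    # file.1398 and hardlink_1.1398 should report the same gfid once migration
    # has converted the cold-tier linkto file into the data file.
    getfattr -n trusted.gfid -e hex "$brick/hardlink_1.1398" 2>/dev/null
done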
Description of problem:
=======================
On a tiered volume, created 2000 files and demotion of the files started. Tried to create hardlinks for all 2000 files; it reports a failure:

[root@dj fuse]# for i in {1..2000}; do ln file.$i hardlink_1.$i ; done
ln: failed to create hard link ‘hardlink_1.1398’: File exists
[root@dj fuse]#

But the file is present only once:

[root@dj fuse]# ll hardlink_1.1398
-rw-r--r--. 2 root root 1048576 Feb 8 2016 hardlink_1.1398
[root@dj fuse]#
[root@dj fuse]# ls -l hardlink_1.1398
-rw-r--r--. 2 root root 1048576 Feb 8 2016 hardlink_1.1398
[root@dj fuse]# stat hardlink_1.1398
  File: ‘hardlink_1.1398’
  Size: 1048576    Blocks: 2048    IO Block: 131072    regular file
Device: 26h/38d    Inode: 10987475693105410840    Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Context: system_u:object_r:fusefs_t:s0
Access: 2016-02-08 05:14:59.915525000 +0530
Modify: 2016-02-08 05:14:59.941525000 +0530
Change: 2016-02-08 05:25:33.436849308 +0530
 Birth: -
[root@dj fuse]#

Client logs:
============
[2016-02-07 23:00:47.357412] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-master-client-12: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-02-07 23:00:47.358235] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-master-client-12: Connected to master-client-12, attached to remote volume '/rhs/brick3/tier0'.
[2016-02-07 23:00:47.358262] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-master-client-12: Server and Client lk-version numbers are not same, reopening the fds
[2016-02-07 23:00:47.362914] I [fuse-bridge.c:5123:fuse_graph_setup] 0-fuse: switched to graph 0
[2016-02-07 23:00:47.363436] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-master-client-12: Server lk version = 1
[2016-02-07 23:00:47.363591] I [fuse-bridge.c:4040:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2016-02-07 23:08:33.028518] W [MSGID: 114031] [client-rpc-fops.c:2812:client3_3_link_cbk] 0-master-client-13: remote operation failed: (/file.1398 -> /hardlink_1.1398) [Stale file handle]
[2016-02-07 23:08:33.028713] W [MSGID: 114031] [client-rpc-fops.c:2812:client3_3_link_cbk] 0-master-client-12: remote operation failed: (/file.1398 -> /hardlink_1.1398) [Stale file handle]
[2016-02-07 23:08:33.029666] W [fuse-bridge.c:464:fuse_entry_cbk] 0-glusterfs-fuse: 39657: LINK() /hardlink_1.1398 => -1 (Stale file handle)
(END)

Hot tier brick logs:
====================
[2016-02-07 23:40:16.698624] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2016-02-07 23:40:16.796903] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2016-02-07 23:40:16.801922] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-02-07 23:40:16.803407] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-02-07 23:40:46.441479] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-master-server: accepted client from dj.lab.eng.blr.redhat.com-27723-2016/02/07-23:00:47:130217-master-client-12-0-0 (version: 3.7.5)
[2016-02-07 23:41:17.403764] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-master-server: accepted client from mia.lab.eng.blr.redhat.com-15904-2016/02/08-04:31:18:472434-master-client-12-0-0 (version: 3.7.5)
[2016-02-07 23:48:32.136403] I [MSGID: 115062] [server-rpc-fops.c:1200:server_link_cbk] 0-master-server: 62496: LINK (null) (c629cb4c-7528-4943-987b-5aedc1c97318) -> 00000000-0000-0000-0000-000000000001/hardlink_1.1398 ==> (Stale file handle) [Stale file handle]

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.5-19.el7rhgs.x86_64

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Have a tiered volume
2. Create 2k files such that demotions start
3. Start creating hardlinks for all the 2k files (a scripted sketch of these steps follows below)

Actual results:
===============
Hardlink creation errors out with 'File exists' on the CLI, but functionally the hardlink does get created.

Expected results:
=================
No failure should be reported on the CLI.
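A scripted version of the reproduction steps, as a rough sketch only. The host names (node1/node2), brick paths, and the tier option names are assumptions based on the glusterfs 3.7 / RHGS 3.1 tiering CLI and may need adjusting for the actual environment.

#!/bin/bash
# Rough reproduction sketch; host names, brick paths and option names are assumptions.
VOL=master
MNT=/mnt/fuse

# 1. Have a tiered volume: a distributed-replicate cold tier with a hot tier attached.
gluster volume create $VOL replica 2 \
    node1:/rhs/brick1/$VOL node2:/rhs/brick1/$VOL \
    node1:/rhs/brick2/$VOL node2:/rhs/brick2/$VOL
gluster volume start $VOL
gluster volume attach-tier $VOL replica 2 \
    node1:/rhs/brick3/tier0 node2:/rhs/brick3/tier0
# Shorten the demote interval so demotions start soon after the files are
# written (option name assumed from the 3.7 tier implementation).
gluster volume set $VOL cluster.tier-demote-frequency 60

mkdir -p $MNT
mount -t glusterfs node1:/$VOL $MNT
cd $MNT

# 2. Create 2000 files so that they land on the hot tier and demotions begin.
for i in {1..2000}; do
    dd if=/dev/zero of=file.$i bs=1M count=1 status=none
done

# 3. While demotion is in progress, create hardlinks for all 2000 files.
#    With the race present, some links fail with "File exists" even though
#    the link is actually created.
for i in {1..2000}; do
    ln file.$i hardlink_1.$i
done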