Description of problem:
=====================
Renaming a file onto another existing file that is in a different tier but in the same DHT hash range appears to cause split-brain. The file is corrupted and shows up as two copies on the FUSE mount client.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-server-3.7.5-5.el7rhgs.x86_64

How reproducible:
==================
easily

Steps to Reproduce:
===================
1. Create a tier volume and mount it over FUSE.
2. Create a big file of 1GB, say GB.txt.
3. Create some zero-byte files, say z{1..10}.
4. Note down all the files which share the same brick as GB.txt in the hot tier.
5. Keep all files idle and wait for them to get demoted.
6. Once all files are demoted, note down the files which share the same brick as GB.txt in the cold tier.
7. Identify a file which shares a brick with GB.txt in both the cold and the hot tier. Let's assume the file is z4.
8. Touch all of z{1..10} to get them to the hot tier.
9. Rename GB.txt to z4 using the "mv" command.
10. After proceeding past the confirmation prompt, two instances of z4 can be seen on the mount. Also check the client mount logs.

Client fuse logs:
===============
[root@mia newname]# ll
total 9116425
-rw-r--r--. 1 root root 1555868318 Nov  2  2015 ff2
-rw-r--r--. 1 root root 1555868318 Nov  2  2015 ff4
-rw-r--r--. 1 root root 1555868318 Nov  2  2015 FnF7.mkv
-rw-r--r--. 1 root root 1555868318 Nov  2 07:52 k1
-rw-r--r--. 1 root root          0 Nov  2 07:52 k10
-rw-r--r--. 1 root root 1555868318 Nov  2  2015 k2
-rw-r--r--. 1 root root 1555868318 Nov  2  2015 k2
-rw-r--r--. 1 root root          0 Nov  2 07:52 k3
-rw-r--r--. 1 root root          0 Nov  2 07:52 k4
-rw-r--r--. 1 root root          0 Nov  2 07:52 k5
-rw-r--r--. 1 root root          0 Nov  2 07:52 k6
-rw-r--r--. 1 root root          0 Nov  2 07:52 k7
-rw-r--r--. 1 root root          0 Nov  2 07:52 k8
-rw-r--r--. 1 root root          0 Nov  2 07:52 k9
-rw-r--r--. 1 root root       6358 Nov  2  2015 stat.log
[root@mia newname]#

[2015-11-02 02:20:05.500755] I [MSGID: 109066] [dht-rename.c:1411:dht_rename] 0-newname-tier-dht: renaming /ff3 (hash=newname-hot-dht/cache=newname-cold-dht) => /k1 (hash=newname-hot-dht/cache=newname-hot-dht)
[2015-11-02 02:20:05.505746] E [MSGID: 108008] [afr-transaction.c:1981:afr_transaction] 0-newname-replicate-1: Failing SETATTR on gfid d423e54f-85cc-4725-b495-60addde165e1: split-brain observed. [Input/output error]
[2015-11-02 02:20:05.505792] E [MSGID: 109031] [dht-linkfile.c:306:dht_linkfile_setattr_cbk] 0-newname-cold-dht: Failed to set attr uid/gid on /ff3 :<gfid:00000000-0000-0000-0000-000000000000> [Input/output error]
[2015-11-02 02:20:05.505827] I [MSGID: 109066] [dht-rename.c:1411:dht_rename] 0-newname-hot-dht: renaming /ff3 (hash=newname-replicate-2/cache=newname-replicate-2) => /k1 (hash=newname-replicate-2/cache=newname-replicate-2)
[2015-11-02 02:23:31.481425] I [MSGID: 109066] [dht-rename.c:1411:dht_rename] 0-newname-tier-dht: renaming /ff1 (hash=newname-hot-dht/cache=newname-cold-dht) => /k2 (hash=newname-hot-dht/cache=newname-hot-dht)
[2015-11-02 02:23:31.485198] I [MSGID: 109066] [dht-rename.c:1411:dht_rename] 0-newname-hot-dht: renaming /ff1 (hash=newname-replicate-3/cache=newname-replicate-3) => /k2 (hash=newname-replicate-2/cache=newname-replicate-2)
[2015-11-02 02:23:31.486837] W [MSGID: 109065] [dht-rename.c:1231:dht_rename_lock_cbk] 0-newname-hot-dht: acquiring inodelk failed rename (/ff1:d0b5d1c0-ba5d-40f9-af2a-9e7fe745bf4d:newname-replicate-3 /k2:c79c0457-6285-4fd1-8235-7a9fa655c625:newname-replicate-2), returning EBUSY [Stale file handle]
[2015-11-02 02:23:31.486879] I [MSGID: 109030] [dht-rename.c:729:dht_rename_cbk] 0-newname-tier-dht: /ff1: Rename (linkto file) on newname-hot-dht failed, (gfid = d0b5d1c0-ba5d-40f9-af2a-9e7fe745bf4d) [Stale file handle]
[2015-11-02 02:31:31.261176] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-newname-client-5: remote operation failed. Path: /k10 (72b447d2-4434-4088-9554-2b13f2cc8dd8) [No such file or directory]
[2015-11-02 02:31:31.261306] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-newname-client-4: remote operation failed. Path: /k10 (72b447d2-4434-4088-9554-2b13f2cc8dd8) [No such file or directory]

============== See logs of files ff1 and k2, as ff1 was renamed to k2 ==============
[2015-11-02 09:30:00.679944] E [MSGID: 109037] [tier.c:1498:tier_start] 0-newname-tier-dht: Promotion failed
[2015-11-02 09:31:10.788134] I [MSGID: 109028] [dht-rebalance.c:3607:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 2353.00 secs
[2015-11-02 09:31:10.788177] I [MSGID: 109028] [dht-rebalance.c:3611:gf_defrag_status_get] 0-glusterfs: Files migrated: 13, size: 0, lookups: 76, failures: 2, skipped: 0
[2015-11-02 09:31:10.826864] I [MSGID: 109028] [dht-rebalance.c:3607:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 2353.00 secs
[2015-11-02 09:31:10.826898] I [MSGID: 109028] [dht-rebalance.c:3611:gf_defrag_status_get] 0-glusterfs: Files migrated: 13, size: 0, lookups: 76, failures: 2, skipped: 0
[2015-11-02 09:32:22.542170] I [MSGID: 109028] [dht-rebalance.c:3607:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 2425.00 secs
[2015-11-02 09:32:22.542215] I [MSGID: 109028] [dht-rebalance.c:3611:gf_defrag_status_get] 0-glusterfs: Files migrated: 24, size: 0, lookups: 87, failures: 2, skipped: 0
[2015-11-02 09:32:22.566528] I [MSGID: 109028] [dht-rebalance.c:3607:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 2425.00 secs
[2015-11-02 09:32:22.566531] I [MSGID: 109028] [dht-rebalance.c:3611:gf_defrag_status_get] 0-glusterfs: Files migrated: 24, size: 0, lookups: 87, failures: 2, skipped: 0
[2015-11-02 09:36:00.922513] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:36:00.923177] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:36:01.102854] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:38:00.126505] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:38:00.127246] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:38:00.127412] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:40:00.151145] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:40:00.151954] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:40:00.152134] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:42:00.178071] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:42:00.178843] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:42:00.179016] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:44:00.202865] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:44:00.203542] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:44:00.203674] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:46:00.227746] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:46:00.228477] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:46:00.228607] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:48:00.252896] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:48:00.253641] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:48:00.253827] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:50:00.277610] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:50:00.278400] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:50:00.278553] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:52:00.305940] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:52:00.306664] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:52:00.306812] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:54:00.330672] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:54:00.331513] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:54:00.331655] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:56:00.358181] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:56:00.358968] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:56:00.359139] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 09:58:00.382393] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 09:58:00.383448] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 09:58:00.383565] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
[2015-11-02 10:00:00.405530] W [MSGID: 109023] [dht-rebalance.c:530:__dht_rebalance_create_dst_file] 0-newname-tier-dht: /k2: failed to lookup file (Stale file handle)
[2015-11-02 10:00:00.406339] E [MSGID: 109037] [tier.c:523:tier_migrate_using_query_file] 0-newname-tier-dht: ERROR -28 in current migration k2 /k2
[2015-11-02 10:00:00.406457] E [MSGID: 109037] [tier.c:1488:tier_start] 0-newname-tier-dht: Demotion failed
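
For anyone triaging logs like the ones above, a small script can pull out the split-brain entries and count the recurring failures. This is only a sketch: the regular expression assumes the line layout seen in this report ([timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message), and the names parse_line and summarize are made up for illustration; they are not part of GlusterFS.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the lines quoted in this report:
# [timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[MSGID:\s*(?P<msgid>\d+)\]\s+"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count entries per (level, msgid, function) and collect split-brain records."""
    counts = Counter()
    split_brain = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip shell output and anything else that is not a log entry
        counts[(record["level"], record["msgid"], record["func"])] += 1
        if "split-brain" in record["msg"]:
            split_brain.append(record)
    return counts, split_brain
```

Passing an open file object for the client mount log to summarize() returns the counts and the split-brain records; on the excerpt above, the single split-brain record is the SETATTR failure on 0-newname-replicate-1.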
sosreports are available at the location below. Refer to volume "newname".
[nchilaka@rhsqe-repo bug.1277088]$ pwd
/home/repo/sosreports/nchilaka/bug.1277088
Tested with build glusterfs-server-3.7.5-8. Tried renaming (moving) files in the hot tier onto files in the cold tier, and likewise files in the cold tier onto files in the hot tier. After each rename operation the mount shows the file correctly, so marking this bug as verified.

Note: If the client is running a lower version (glusterfs-api-3.7.5-5), the rename operation fails with a device busy error.
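
The "mount shows file correctly" check can be scripted by looking for names that appear more than once in a directory listing, which is how the original failure showed up (k2 listed twice). A minimal sketch, assuming os.listdir passes through duplicate entries returned by the FUSE mount; duplicate_names and check_mount are hypothetical helper names, not an existing test utility.

```python
import os
from collections import Counter


def duplicate_names(names):
    """Return, sorted, every name that occurs more than once in a listing."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def check_mount(directory):
    """List a directory on the mount and report duplicated entries.

    Assumption: os.listdir does not de-duplicate what readdir returns.
    """
    return duplicate_names(os.listdir(directory))
```

An empty list from check_mount() on the mount directory after the rename means no duplicate entries were seen. It does not by itself prove that the file contents are intact.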
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html