Description of problem:
========================
On a tiered volume that has files under migration, issuing an rm -rf deletes all the files, including those under migration, but leaves the link-to file (in the hashed subvol) undeleted. The link-to files are later converted to regular files and occupy disk space unnecessarily.
Since we are deleting the original (cached) file, I don't see a point in keeping the hashed file anymore. We need to have locks removed there too.

Version-Release number of selected component (if applicable):
============================================================
glusterfs-server-3.7.5-0.3.el7rhgs.x86_64

How reproducible:
==================
Very easy, always

Steps to Reproduce:
====================
1. Create, start and mount a tiered volume.
2. Create some files which take a while to get promoted/demoted, so let each file be at least 800MB. Create about 20 such files.
3. Now let the demote cycle start.
4. Once the demote cycle starts, it can be seen that the files are being demoted, as below, in the cached and hashed subvols (see file is.7):

[root@zod glusterfs]# ll /rhs/brick*/rosa*/
/rhs/brick1/rosa/:
total 1876672
-rw-r--r--. 2 root root 614400000 Oct 29 12:00 is.1
-rw-r--r--. 2 root root 614400000 Oct 29 12:00 is.3
---------T. 2 root root 614400000 Oct 29 12:06 is.7
-rw-r--r--. 2 root root 614400000 Oct 28 19:32 new.14

/rhs/brick2/rosa/:
total 1800000
-rw-r--r--. 2 root root 614400000 Oct 29 12:00 is.2
-rw-r--r--. 2 root root 614400000 Oct 29 12:01 is.4
-rw-r--r--. 2 root root 614400000 Oct 29 12:01 is.6

/rhs/brick6/rosa_hot/:
total 9388992
-rw-r--r--. 2 root root 614400000 Oct 29 12:02 is.10
-rw-r--r--. 2 root root 398327808 Oct 29 12:04 is.22
-rw-r-Sr-T. 2 root root 614400000 Oct 29 12:01 is.7

5. Now from the fuse mount, before all files are demoted, issue an rm -rf to delete all files.
6. It can be seen that all files are deleted except for the files which were under migration.
7. Now if you check the backend brick immediately, it can be seen that it is a link-to file which is not deleted. After a few seconds this link-to file is converted to a normal read-write file, as below:

[root@zod glusterfs]# ll /rhs/brick*/rosa*/
/rhs/brick1/rosa/:
total 582400
---------T. 2 root root 614400000 Oct 29 12:07 is.7

==after few seconds========
[root@zod glusterfs]#
[root@zod glusterfs]# ll /rhs/brick*/rosa*/
/rhs/brick1/rosa/:
total 600000
-rw-r--r--. 2 root root 614400000 Oct 29 12:01 is.7

8. If you monitor the client fuse logs, it can be seen that a possible split brain is observed:

[2015-10-29 11:41:18.567156] W [MSGID: 114031] [client-rpc-fops.c:1569:client3_3_fstat_cbk] 0-rosa-client-2: remote operation failed [No such file or directory]
[2015-10-29 11:41:18.571387] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-rosa-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 360ed98c-d031-4631-a1fc-0fface82400f. (Possible split-brain)
[2015-10-29 11:41:18.575262] E [MSGID: 109040] [dht-helper.c:1020:dht_migration_complete_check_task] 0-rosa-cold-dht: (null): failed to lookup the file on rosa-cold-dht [Stale file handle]
[2015-10-29 11:41:18.578245] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-rosa-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 360ed98c-d031-4631-a1fc-0fface82400f. (Possible split-brain)

Actual results:
==============
1) The linkto file is converted to a regular file
2) Disk space is wasted because of this
3) A split brain is possibly seen
4) Also, later I can see a different bit-rot version on the replicas (I didn't enable bitrot)

[root@zod glusterfs]# getfattr -d -m . -e hex /rhs/brick*/rosa*/*
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/rosa/is.7
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000aa25b
trusted.gfid=0x6db6cae40a784af38da9af842243ffe8
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x00000000249f00000000000000000001
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

replica:
[root@yarrow glusterfs]# getfattr -d -m . -e hex /rhs/brick*/rosa*/*
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick1/rosa/is.7
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f9a0003bc6d
trusted.gfid=0x6db6cae40a784af38da9af842243ffe8
trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri=0x00000000249f00000000000000000001
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

Expected results:
===================
None of the issues should be seen
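
For anyone reading the getfattr dumps above: the hex value of trusted.tier-gfid.linkto is just a NUL-terminated subvolume name. A minimal sketch for decoding such values follows; the helper name decode_hex_xattr is made up for illustration and is not part of any gluster tooling.

```python
def decode_hex_xattr(value: str) -> str:
    """Decode a `getfattr -e hex` value (e.g. 0x726f...00) into text.

    Hypothetical helper for reading the dumps in this report.
    """
    if value.lower().startswith("0x"):
        value = value[2:]
    raw = bytes.fromhex(value)
    # The stored string carries a trailing NUL byte; drop it before decoding.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


if __name__ == "__main__":
    # Value copied from the is.7 dump in this report.
    print(decode_hex_xattr("0x726f73612d686f742d64687400"))  # rosa-hot-dht
```

So the leftover is.7 on the cold brick still points at rosa-hot-dht, even though the file on the hot tier has already been deleted.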
The following are the xattrs during the delete of files:

[root@zod glusterfs]# head -n 853 /heels.log |tail -n 100
/rhs/brick1/rosa/:
total 0

/rhs/brick2/rosa/:
total 510080
---------T. 2 root root 614400000 Oct 29 13:16 heaven.3

/rhs/brick6/rosa_hot/:
total 0

/rhs/brick7/rosa_hot/:
total 0
# file: rhs/brick2/rosa/heaven.3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000d37e6
trusted.gfid=0x644b07152673448f8b29cb3e43940f13
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

/rhs/brick1/rosa/:
total 0

/rhs/brick2/rosa/:
total 568960
---------T. 2 root root 614400000 Oct 29 13:16 heaven.3

/rhs/brick6/rosa_hot/:
total 0

/rhs/brick7/rosa_hot/:
total 0
# file: rhs/brick2/rosa/heaven.3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000010000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000d37e6
trusted.gfid=0x644b07152673448f8b29cb3e43940f13
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

/rhs/brick1/rosa/:
total 0

/rhs/brick2/rosa/:
total 600000
-rw-r--r--. 2 root root 614400000 Oct 29 13:13 heaven.3

/rhs/brick6/rosa_hot/:
total 0

/rhs/brick7/rosa_hot/:
total 0
# file: rhs/brick2/rosa/heaven.3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000d37e6
trusted.gfid=0x644b07152673448f8b29cb3e43940f13
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

/rhs/brick1/rosa/:
total 0

/rhs/brick2/rosa/:
total 600000
-rw-r--r--. 2 root root 614400000 Oct 29 13:13 heaven.3

/rhs/brick6/rosa_hot/:
total 0

/rhs/brick7/rosa_hot/:
total 0
# file: rhs/brick2/rosa/heaven.3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000d37e6
trusted.gfid=0x644b07152673448f8b29cb3e43940f13
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400

/rhs/brick1/rosa/:
total 0

/rhs/brick2/rosa/:
total 600000
-rw-r--r--. 2 root root 614400000 Oct 29 13:13 heaven.3

/rhs/brick6/rosa_hot/:
total 0

/rhs/brick7/rosa_hot/:
total 0
# file: rhs/brick2/rosa/heaven.3
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000562f7f97000d37e6
trusted.gfid=0x644b07152673448f8b29cb3e43940f13
trusted.pgfid.00000000-0000-0000-0000-000000000001=0x00000001
trusted.tier-gfid.linkto=0x726f73612d686f742d64687400
[root@zod glusterfs]#
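
The snapshots above show the pattern that gives the leftover away: the mode flips from ---------T to -rw-r--r-- while the trusted.tier-gfid.linkto xattr stays in place. A rough sketch of that check is below. The function name and the three labels are invented for illustration, and the rule is only inferred from the listings in this report, not taken from the DHT code.

```python
def classify_brick_file(ls_mode: str, xattrs: dict) -> str:
    """Classify a brick entry from its `ls -l` mode string and xattr names.

    Hypothetical helper; labels are illustrative only:
      "link-to"            mode ---------T and a *.linkto xattr present
      "converted-leftover" normal permissions but a *.linkto xattr remains
      "regular"            no *.linkto xattr at all
    """
    # Characters 1..9 are the permission bits; character 0 is the file type
    # and a trailing "." (SELinux marker) may follow.
    perms = ls_mode[1:10]
    has_linkto = any(name.endswith(".linkto") for name in xattrs)
    if not has_linkto:
        return "regular"
    if perms == "--------T":
        return "link-to"
    return "converted-leftover"


if __name__ == "__main__":
    linkto = {"trusted.tier-gfid.linkto": "0x726f73612d686f742d64687400"}
    print(classify_brick_file("---------T.", linkto))
    print(classify_brick_file("-rw-r--r--.", linkto))
```

With the heaven.3 data above, the first two snapshots would classify as link-to and the last three as converted-leftover.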
tier logs:
===========
[2015-10-29 07:46:23.327109] I [MSGID: 109038] [tier.c:476:tier_migrate_using_query_file] 0-rosa-tier-dht: Tier 0 src_subvol rosa-hot-dht file heaven.3
[2015-10-29 07:46:23.328847] I [dht-rebalance.c:1103:dht_migrate_file] 0-rosa-tier-dht: /heaven.3: attempting to move from rosa-hot-dht to rosa-cold-dht
[2015-10-29 07:46:44.142458] W [dht-rebalance.c:1247:dht_migrate_file] 0-rosa-tier-dht: /heaven.3: failed to fsync on rosa-cold-dht (Structure needs cleaning)
[2015-10-29 07:46:44.144700] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-rosa-client-7: remote operation failed. Path: <gfid:644b0715-2673-448f-8b29-cb3e43940f13> (644b0715-2673-448f-8b29-cb3e43940f13) [No such file or directory]
[2015-10-29 07:46:44.144923] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-rosa-client-6: remote operation failed. Path: <gfid:644b0715-2673-448f-8b29-cb3e43940f13> (644b0715-2673-448f-8b29-cb3e43940f13) [No such file or directory]
[2015-10-29 07:46:44.145032] W [MSGID: 109023] [dht-rebalance.c:1317:dht_migrate_file] 0-rosa-tier-dht: Migrate file failed:/heaven.3: failed to get xattr from rosa-hot-dht (No such file or directory)
[2015-10-29 07:46:44.145091] E [MSGID: 108008] [afr-transaction.c:1975:afr_transaction] 0-rosa-replicate-2: Failing FSETATTR on gfid 644b0715-2673-448f-8b29-cb3e43940f13: split-brain observed. [Input/output error]
[2015-10-29 07:46:44.145470] W [MSGID: 109023] [dht-rebalance.c:1356:dht_migrate_file] 0-rosa-tier-dht: Migrate file failed:/heaven.3: failed to perform setattr on rosa-hot-dht [Input/output error]
[2015-10-29 07:46:44.146381] E [MSGID: 109037] [tier.c:492:tier_migrate_using_query_file] 0-rosa-tier-dht: ERROR -28 in current migration heaven.3 /heaven.3
[2015-10-29 07:46:44.150682] E [MSGID: 109037] [tier.c:442:tier_migrate_using_query_file] 0-rosa-tier-dht: ERROR in current lookup
[2015-10-29 07:46:44.153524] E [MSGID: 109037] [tier.c:442:tier_migrate_using_query_file] 0-rosa-tier-dht: ERROR in current lookup
[2015-10-29 07:46:44.153656] E [MSGID: 109037] [tier.c:1446:tier_start] 0-rosa-tier-dht: Demotion failed
[2015-10-29 07:48:00.161457] I [MSGID: 109038] [tier.c:1010:tier_build_migration_qfile] 0-rosa-tier-dht: Failed to remove /var/run/gluster/rosa-tier-dht/demotequeryfile-rosa-tier-dht
^C
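
To pull only the error-level entries out of a tier log like the one above, something along these lines may help. It is a sketch that assumes every entry follows the shape seen in this report, i.e. [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message, with the MSGID part optional. The regex and the names parse_log_line and error_entries are my own, not gluster tooling.

```python
import re

# Pattern inferred from the log lines quoted in this report (an assumption).
LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]]+)\]\s+(?P<domain>\S+):\s+(?P<msg>.*)$"
)


def parse_log_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def error_entries(lines):
    """Keep only the entries logged at level E."""
    parsed = (parse_log_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] == "E"]


if __name__ == "__main__":
    sample = [
        "[2015-10-29 07:46:23.328847] I [dht-rebalance.c:1103:dht_migrate_file] "
        "0-rosa-tier-dht: /heaven.3: attempting to move from rosa-hot-dht to rosa-cold-dht",
        "[2015-10-29 07:46:44.153656] E [MSGID: 109037] [tier.c:1446:tier_start] "
        "0-rosa-tier-dht: Demotion failed",
    ]
    for entry in error_entries(sample):
        print(entry["ts"], entry["msgid"], entry["src"], entry["msg"])
```

Lines that do not fit the pattern (for example the trailing ^C) are skipped rather than raising an error.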
upstream patch : http://review.gluster.org/12829
https://code.engineering.redhat.com/gerrit/64015
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-0193.html