Description of problem:
========================
On a tiered volume with a 2x3 dis-rep cold tier and a 2x3 dis-rep hot tier, files were modified (all files truncated to size 0) while some bricks were being brought offline. The following observations were made:

1) AFR extended attributes were not marked on the source bricks to indicate write failures on the bricks that were brought offline.
2) Since the extended attributes were not marked, the self-heal daemon did not heal the files.
3) heal info showed '0' entries for all bricks.
4) Calculating arequal-checksum from the mount healed the files from sink to source, i.e. the sink brick held the larger files while the source bricks held the 0-sized files, and no AFR extended attributes were marked on the files to distinguish source from sink.
5) All the files were on the hot tier.

Version-Release number of selected component (if applicable):
===========================================================
glusterfs-3.7.5-13.el6rhs.x86_64

How reproducible:
==================
1/3

Steps to Reproduce:
===================
1. Create a 2x3 dis-rep cold and hot tiered volume. Start the volume. Create a fuse mount.
2. Create files on the mount point.
3. Repeat the following steps for each operation (modify the files, truncate the files to size 0):
   a. Start the operation on the mount.
   b. Bring down certain bricks from each subvolume.
   c. After the modification of the files is complete, calculate arequal-checksum.
   d. Bring the bricks back online.
   e. Wait for self-heal to complete.
   f. Once self-heal is complete, calculate arequal-checksum.
   g. Compare the checksums calculated at (c) and (f). They should be the same.

Actual results:
===============
The checksums mismatched for the truncate operation.
Expected results:
===================
checksums should match

Additional info:
====================
rhsauto021.lab.eng.blr.redhat.com:/bricks/brick1/testvol_tier2/ : Online brick
rhsauto020.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier1/ : Offline brick
rhsauto019.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier0/ : Online brick

###############################################################################
Extended attributes of files when the brick was down and truncate succeeded on other 2 bricks
###############################################################################
2015-12-28 17:57:40,846 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs
2015-12-28 17:57:40,847 INFO run Executing getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/* on rhsauto021.lab.eng.blr.redhat.com
2015-12-28 17:57:40,867 INFO run "getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: RETCODE is 0
2015-12-28 17:57:40,867 INFO run "getfattr -d -e hex -m .
/bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick1/testvol_tier2/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick1/testvol_tier2/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick1/testvol_tier2/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick1/testvol_tier2/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick1/testvol_tier2/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick1/testvol_tier2/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick1/testvol_tier2/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 
trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00067eca trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:57:40,867 ERROR run "getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:57:40,868 INFO run Executing ls -l /bricks/brick1/testvol_tier2/* on rhsauto021.lab.eng.blr.redhat.com 2015-12-28 17:57:40,890 INFO run "ls -l /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:40,890 INFO run "ls -l /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_10 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_2 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_4 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_5 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_7 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_9 -rwxr-xr-x. 2 root root 66826 Dec 28 12:17 /bricks/brick1/testvol_tier2/file_dir_ops.sh 2015-12-28 17:57:40,890 INFO run Executing find /bricks/brick2/testvol_tier1 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:57:40,926 INFO run "find /bricks/brick2/testvol_tier1 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:40,926 INFO get_number_of_entries_in_brick Number of entries on rhsauto020.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier1: 7 2015-12-28 17:57:40,926 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs 2015-12-28 17:57:40,927 INFO run Executing getfattr -d -e hex -m . 
/bricks/brick2/testvol_tier1/* on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:57:40,948 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:40,948 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick2/testvol_tier1/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick2/testvol_tier1/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick2/testvol_tier1/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick2/testvol_tier1/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick2/testvol_tier1/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick2/testvol_tier1/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 
trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick2/testvol_tier1/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:57:40,949 ERROR run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:57:40,949 INFO run Executing ls -l /bricks/brick2/testvol_tier1/* on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:57:40,972 INFO run "ls -l /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:40,972 INFO run "ls -l /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 5120 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_10 -rw-r--r--. 2 root root 1024 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_2 -rw-r--r--. 2 root root 2048 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_4 -rw-r--r--. 2 root root 2560 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_5 -rw-r--r--. 2 root root 3584 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_7 -rw-r--r--. 2 root root 4608 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_9 -rwxr-xr-x. 
2 root root 66826 Dec 28 12:17 /bricks/brick2/testvol_tier1/file_dir_ops.sh 2015-12-28 17:57:40,972 INFO run Executing find /bricks/brick2/testvol_tier0 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:57:40,995 INFO run "find /bricks/brick2/testvol_tier0 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:40,995 INFO get_number_of_entries_in_brick Number of entries on rhsauto019.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier0: 7 2015-12-28 17:57:40,996 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs 2015-12-28 17:57:40,996 INFO run Executing getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/* on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:57:41,015 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:41,015 INFO run "getfattr -d -e hex -m . 
/bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick2/testvol_tier0/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick2/testvol_tier0/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick2/testvol_tier0/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick2/testvol_tier0/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick2/testvol_tier0/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick2/testvol_tier0/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick2/testvol_tier0/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 
trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:57:41,016 ERROR run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:57:41,016 INFO run Executing ls -l /bricks/brick2/testvol_tier0/* on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:57:41,036 INFO run "ls -l /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:57:41,037 INFO run "ls -l /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_10 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_2 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_4 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_5 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_7 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_9 -rwxr-xr-x. 2 root root 66826 Dec 28 12:17 /bricks/brick2/testvol_tier0/file_dir_ops.sh ############################################################################### Extended attributes of files when the brick came online and self-heal complete and before arequal-checksum ############################################################################### 2015-12-28 17:58:33,323 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs 2015-12-28 17:58:33,324 INFO run Executing getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/* on rhsauto021.lab.eng.blr.redhat.com 2015-12-28 17:58:33,342 INFO run "getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,343 INFO run "getfattr -d -e hex -m . 
/bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick1/testvol_tier2/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick1/testvol_tier2/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick1/testvol_tier2/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick1/testvol_tier2/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick1/testvol_tier2/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick1/testvol_tier2/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x03000000000000005681299d000d3c76 trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick1/testvol_tier2/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 
trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00067eca trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:58:33,343 ERROR run "getfattr -d -e hex -m . /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:58:33,343 INFO run Executing ls -l /bricks/brick1/testvol_tier2/* on rhsauto021.lab.eng.blr.redhat.com 2015-12-28 17:58:33,365 INFO run "ls -l /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,365 INFO run "ls -l /bricks/brick1/testvol_tier2/*" on rhsauto021.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_10 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_2 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_4 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_5 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_7 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick1/testvol_tier2/D_file_9 -rwxr-xr-x. 2 root root 66826 Dec 28 12:17 /bricks/brick1/testvol_tier2/file_dir_ops.sh 2015-12-28 17:58:33,365 INFO run Executing find /bricks/brick2/testvol_tier1 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:58:33,391 INFO run "find /bricks/brick2/testvol_tier1 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,391 INFO get_number_of_entries_in_brick Number of entries on rhsauto020.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier1: 7 2015-12-28 17:58:33,391 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs 2015-12-28 17:58:33,392 INFO run Executing getfattr -d -e hex -m . 
/bricks/brick2/testvol_tier1/* on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:58:33,413 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,414 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick2/testvol_tier1/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick2/testvol_tier1/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick2/testvol_tier1/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick2/testvol_tier1/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick2/testvol_tier1/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick2/testvol_tier1/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 
trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick2/testvol_tier1/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283d00065050 trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:58:33,414 ERROR run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:58:33,414 INFO run Executing ls -l /bricks/brick2/testvol_tier1/* on rhsauto020.lab.eng.blr.redhat.com 2015-12-28 17:58:33,435 INFO run "ls -l /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,435 INFO run "ls -l /bricks/brick2/testvol_tier1/*" on rhsauto020.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 5120 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_10 -rw-r--r--. 2 root root 1024 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_2 -rw-r--r--. 2 root root 2048 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_4 -rw-r--r--. 2 root root 2560 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_5 -rw-r--r--. 2 root root 3584 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_7 -rw-r--r--. 2 root root 4608 Dec 28 12:17 /bricks/brick2/testvol_tier1/D_file_9 -rwxr-xr-x. 
2 root root 66826 Dec 28 12:17 /bricks/brick2/testvol_tier1/file_dir_ops.sh 2015-12-28 17:58:33,436 INFO run Executing find /bricks/brick2/testvol_tier0 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:58:33,459 INFO run "find /bricks/brick2/testvol_tier0 -mindepth 1 | grep -ve '.glusterfs\|.trashcan' | wc -l" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,459 INFO get_number_of_entries_in_brick Number of entries on rhsauto019.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier0: 7 2015-12-28 17:58:33,460 INFO get_number_of_entries_in_brick Extended attributes of all the files/dirs 2015-12-28 17:58:33,460 INFO run Executing getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/* on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:58:33,481 INFO run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,481 INFO run "getfattr -d -e hex -m . 
/bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDOUT is # file: bricks/brick2/testvol_tier0/D_file_10 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xb9d22d7c03364bb18ca00b65fe90823a # file: bricks/brick2/testvol_tier0/D_file_2 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xb7a25ccb4234485699458536e2a36151 # file: bricks/brick2/testvol_tier0/D_file_4 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x1f4860de1ed041c6bfe0ebe9814474ef # file: bricks/brick2/testvol_tier0/D_file_5 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0xbe8e0a4886e24d1bb6ea94a17b71ea51 # file: bricks/brick2/testvol_tier0/D_file_7 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x748fe3c075544b8793c99d80d2b83052 # file: bricks/brick2/testvol_tier0/D_file_9 security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x4864b70bf9e24c138b2b97147597351e # file: bricks/brick2/testvol_tier0/file_dir_ops.sh security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000 
trusted.afr.dirty=0x000000000000000000000000 trusted.bit-rot.version=0x02000000000000005681283a000d56ea trusted.gfid=0x44b51c0fa0c142c9b048200d17c821d7 2015-12-28 17:58:33,481 ERROR run "getfattr -d -e hex -m . /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDERR is getfattr: Removing leading '/' from absolute path names 2015-12-28 17:58:33,481 INFO run Executing ls -l /bricks/brick2/testvol_tier0/* on rhsauto019.lab.eng.blr.redhat.com 2015-12-28 17:58:33,504 INFO run "ls -l /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: RETCODE is 0 2015-12-28 17:58:33,504 INFO run "ls -l /bricks/brick2/testvol_tier0/*" on rhsauto019.lab.eng.blr.redhat.com: STDOUT is -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_10 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_2 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_4 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_5 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_7 -rw-r--r--. 2 root root 0 Dec 28 12:23 /bricks/brick2/testvol_tier0/D_file_9 -rwxr-xr-x. 2 root root 66826 Dec 28 12:17 /bricks/brick2/testvol_tier0/file_dir_ops.sh
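The pattern visible in the logs above (different file sizes across the replica, yet every trusted.afr.dirty value is all zeros) can be checked mechanically. The following is only a sketch, not part of the test suite: it assumes an AFR changelog xattr value is 12 bytes holding three big-endian 32-bit counters (data, metadata, entry), and the function names are made up for illustration.

```python
import struct


def decode_afr_xattr(hex_value):
    """Decode a getfattr hex value such as 0x000000000000000000000000.

    Assumption: the value is 12 bytes, three big-endian 32-bit counters
    in the order data, metadata, entry.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def sizes_differ_without_markers(sizes, xattr_values):
    """Return True when replica copies differ in size but no counter is set.

    sizes: file size of the same file on each brick of the replica.
    xattr_values: the AFR xattr hex strings read from those bricks.
    """
    counters_clear = all(
        not any(decode_afr_xattr(value).values()) for value in xattr_values
    )
    return len(set(sizes)) > 1 and counters_clear


if __name__ == "__main__":
    # D_file_10 as seen on testvol_tier2, testvol_tier1 and testvol_tier0.
    zero = "0x000000000000000000000000"
    print(sizes_differ_without_markers([0, 5120, 0], [zero, zero, zero]))
```

For D_file_10 the sizes are 0, 5120 and 0 with all-zero xattrs on every brick, which is exactly the case the helper flags.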
Created attachment 1110131
shell script to create/modify/truncate files

Command Example: file_dir_ops.sh data_ops <mountpoint> truncate1 10 0M
Created attachment 1110720
Distaf log.

After looking at the distaf logs with Pranith, it was found that the xfs godown method used to take down the brick mounts takes a long time to bring down the bricks of the replica (about one minute between successive godowns). Since the modification/truncate tests run asynchronously, they could have completed before all the bricks were down. Also, since the godown method shuts down xfs, the truncates might not have been synced to disk, which is why the file size is what it was before the truncate when the brick comes back up.

In AFR, if there are no pending xattrs, the bigger file is selected as the source, and the heal happens when the file is accessed from the mount. This explains the arequal checksum mismatch, the lack of pending xattrs on the files, and the lack of heals by the self-heal daemon.

After modifying the test to use BRICK_TAKEDOWN_METHOD="service_kill" instead of xfs godown, the test passed in all 3 runs.

Since this is expected behaviour, it is not a blocker bug per se. But we could provide a fix for 3.1.3 by doing an fsync after truncate in the afr transaction. If the fsync fails, the post-op will set pending xattrs for the killed brick and the heal will happen in the right direction.
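To make the two points above concrete, here is a small standalone sketch. It is not AFR code: pick_source_without_pending only mimics the "bigger file wins when there are no pending xattrs" rule described in this comment, and truncate_and_fsync shows the truncate-then-fsync idea on a local file using plain POSIX calls. Both function names are hypothetical.

```python
import os
import tempfile


def pick_source_without_pending(sizes_by_brick):
    """Mimic the rule from this comment: with no pending xattrs,
    the brick holding the biggest copy is treated as the heal source.

    Simplification: ties go to the first brick in the mapping.
    """
    if not sizes_by_brick:
        raise ValueError("no bricks given")
    return max(sizes_by_brick, key=sizes_by_brick.get)


def truncate_and_fsync(path, length=0):
    """Truncate a file and flush it to disk before reporting success.

    If os.fsync raises OSError, the caller knows the truncate may not
    be durable on this copy and can record that as a failure.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.ftruncate(fd, length)
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Sizes of D_file_10 on the three bricks, taken from the logs.
    sizes = {"testvol_tier2": 0, "testvol_tier1": 5120, "testvol_tier0": 0}
    print(pick_source_without_pending(sizes))

    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 5120)
    truncate_and_fsync(tmp.name)
    print(os.path.getsize(tmp.name))
    os.unlink(tmp.name)
```

With the sizes from the logs, the stale 5120-byte copy on testvol_tier1 is the one picked as source, which matches the wrong-direction heal seen after arequal-checksum.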
Moving this to 3.1.4 after triaging.