Description of problem:
=======================
In a 2 x 2 distribute-replicate volume, create/delete/rename operations on files and directories were performed while bricks were down. The bricks were then brought back online and self-heal completed. After self-heal, some of the renamed files are missing from the mount point.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.7.1-1.el6rhs.x86_64

How reproducible:
==============
Often

Steps to Reproduce:
======================
Step 1)
Create a 2 x 2 distribute-replicate volume. Start the volume. Create a fuse mount.
Bring down brick1.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "create" 1
Calculate arequal-checksum (after_data_creation).
Bring back brick1 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Step 2)
Bring down brick2.
Calculate arequal-checksum (before_data_creation).
Compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation). The arequal-checksums should match.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "delete" 1
Calculate arequal-checksum (after_data_creation).
Bring back brick2 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Step 3)
Bring down brick3.
Calculate arequal-checksum (before_data_creation).
Compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation). The arequal-checksums should match.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "rename" 1
Calculate arequal-checksum (after_data_creation).
Bring back brick3 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Step 4)
Bring down brick4.
Calculate arequal-checksum (before_data_creation).
Compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation). The arequal-checksums should match.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "create" 2
Calculate arequal-checksum (after_data_creation).
Bring back brick4 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Step 5)
Bring down brick1 and brick3.
Calculate arequal-checksum (before_data_creation).
Compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation). The arequal-checksums should match.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "delete" 2
Calculate arequal-checksum (after_data_creation).
Bring back brick1 and brick3 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Step 6)
Bring down brick1 and brick4.
Calculate arequal-checksum (before_data_creation).
Compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation). The arequal-checksums should match.
From the client, execute: entry_self_heal.sh <abs_path_mountpoint> "rename" 2
Calculate arequal-checksum (after_data_creation).
Bring back brick1 and brick4 ( service glusterd restart ).
Trigger self-heal.
After self-heal is complete, calculate arequal-checksum (after_self_heal).
Compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal). The arequal-checksums should match.

Actual results:
==================
:: [ FAIL ] :: Files /arequal-data/rhsauto053.lab.eng.blr.redhat.com_gluster-mount_arequal_checksum_after_rename_2.log and /arequal-data/rhsauto053.lab.eng.blr.redhat.com_gluster-mount_arequal_checksum_after_self_heal_rename_2.log should not differ

:: [ 18:55:17 ] :: arequal checksum of after_rename_2

Entry counts
Regular files   : 640
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 683

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 8d5ae90e794c15b2befdda509790c2f7
Directories     : d02060104765b76
Symbolic links  : 0
Other           : 0
Total           : 3ea5355feaaa8c33

:: [ 18:55:17 ] :: arequal checksum of after_self_heal_rename_2

Entry counts
Regular files   : 594
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 637

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 53abb6943e3d8c97da3cc7cb50ac2171
Directories     : d02060104765b76
Symbolic links  : 0
Other           : 0
Total           : 8495775e6ae7f690

:: [ 18:55:18 ] :: Checking if there are any duplicate entries under /gluster-mount
:: [ PASS ] :: Duplicate entries not found under /gluster-mount
:: [ 18:55:19 ] :: Checking if there are missing entries under /gluster-mount after_rename_2 to after_self_heal_rename_2
:: [ FAIL ] :: Missing entries found under /gluster-mount after_rename_2 to after_self_heal_rename_2
:: [ 18:55:19 ] :: Listing all the Missing entries on mount /gluster-mount after_rename_2 to after_self_heal_rename_2
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_1
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_11
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_12
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_14
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_15
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_2
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_3
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_4
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_6
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_7
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_9
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_1
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_11
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_12
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_14
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_15
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_2
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_3
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_4
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_6
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_7
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_9
:: [ 18:55:20 ] :: Checking if there are additional entries under /gluster-mount after_self_heal_rename_2 to after_rename_2
:: [ PASS ] :: No Additional entries found under /gluster-mount after_self_heal_rename_2 to after_rename_2
:: [ 18:55:21 ] :: Total Number of files and directories in the volume : 637

Expected results:
====================
arequal-checksum should match.
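
For readers who want to quantify the mismatch, here is a minimal Python sketch that parses the "Entry counts" section of the two arequal reports above and shows which counts changed. It is not part of the test harness: the helper names parse_entry_counts and count_diff are hypothetical, and the parser assumes the report layout seen in this bug ("Entry counts" followed by "name : value" fields, then "Metadata checksums").

```python
import re

def parse_entry_counts(report: str) -> dict:
    """Return the 'Entry counts' fields of an arequal-style report as integers.

    Hypothetical helper: assumes the section sits between the literal
    headings 'Entry counts' and 'Metadata checksums', as in this bug.
    """
    section = report.split("Entry counts", 1)[1].split("Metadata checksums", 1)[0]
    # Each field looks like 'Regular files : 640'.
    pairs = re.findall(r"([A-Za-z ]+?)\s*:\s*(\d+)", section)
    return {name.strip(): int(value) for name, value in pairs}

def count_diff(before: dict, after: dict) -> dict:
    """Return how many entries of each kind disappeared between two reports."""
    return {
        kind: before[kind] - after.get(kind, 0)
        for kind in before
        if before[kind] != after.get(kind, 0)
    }

# Entry counts copied from the two reports in this bug.
AFTER_RENAME = """
Entry counts
Regular files   : 640
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 683

Metadata checksums
"""

AFTER_SELF_HEAL = """
Entry counts
Regular files   : 594
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 637

Metadata checksums
"""

lost = count_diff(parse_entry_counts(AFTER_RENAME), parse_entry_counts(AFTER_SELF_HEAL))
print(lost)  # {'Regular files': 46, 'Total': 46}
```

The 46 lost regular files match the 46 missing entries listed above; the directory count is unchanged.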
Link to the gluster logs : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1231732/
Link to the beaker job : https://beaker.engineering.redhat.com/jobs/983606
Created attachment 1039853 [details] Sc
Created attachment 1045375 [details] Logs from mgmt node and brick0.
Created attachment 1045376 [details] Logs from client and brick1.
Created attachment 1045377 [details] Logs from brick2 and brick3.
The logs provided by Shwetha contain the list of files on the bricks after each operation. Checking them confirms that the files are present on the bricks but missing from the mount.

RCA is done and a patch has been sent upstream for review: http://review.gluster.org/#/c/11498/

Clearing needinfo on Shwetha.
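
The kind of check described in this comment (entries present on the bricks but not visible through the mount) can be sketched in Python as below. This is only an illustration, not the procedure actually used for the RCA: the helper names list_entries and missing_from_mount are hypothetical, the brick and mount paths in the usage comment are placeholders, and the sketch only skips the .glusterfs directory at the brick root. Any other brick-internal bookkeeping entries would need extra filtering.

```python
import os

def list_entries(root, skip=(".glusterfs",)):
    """Return the set of file and directory paths under root, relative to root.

    Hypothetical helper: directories named in `skip` are not descended into
    and are not reported.
    """
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not enter them.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in dirnames + filenames:
            entries.add(os.path.relpath(os.path.join(dirpath, name), root))
    return entries

def missing_from_mount(brick_roots, mount_root):
    """Return entries found on at least one brick but not visible on the mount."""
    on_bricks = set()
    for brick in brick_roots:
        on_bricks |= list_entries(brick)
    return sorted(on_bricks - list_entries(mount_root))

# Example call (placeholder paths, adjust to the actual setup):
# missing_from_mount(["/bricks/brick1", "/bricks/brick2"], "/gluster-mount")
```

A non-empty result for a fully healed volume would point at the same symptom reported here: data that exists on the backend but is not reachable from the client.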
Patch posted for review downstream : https://code.engineering.redhat.com/gerrit/#/c/52357/

Upstream links :
master : http://review.gluster.org/11498/
3.7 : http://review.gluster.org/11544/
Verified the test on a 2 x 3 distribute-replicate volume with build "glusterfs-3.7.1-8.el6rhs.x86_64". The bug is fixed. Moving the bug to the verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html