Description of problem:
==========================
On a 2x3 cold and hot tiered volume, brought down one brick from each subvolume and copied files to the mount. Brought the bricks back online. Self-heal completed on the hot and cold tiers. Brought down a different brick from each subvolume and tried to access the copied files. A few of the copied files were missing.

Observation:
===========
For the files that are missing, the data file exists on the hot tier, but one of the bricks in the cold-tier subvolume does not have the 'link-to' file.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.5-13.el7rhgs.x86_64

How reproducible:
=================
Tried once

Steps to Reproduce:
====================
1. Create a 2x3 dis-rep cold and hot tiered volume. Start the volume. Create a fuse mount.
2. Repeat the following steps for each operation (create files/dirs, copy files/dirs from the files created):
   a. start the operation on the mount
   b. bring down certain bricks from each subvolume
   c. after the operation is complete, calculate arequal-checksum
   d. bring back the bricks
   e. wait for self-heal to complete
   f. once self-heal is complete, calculate arequal-checksum
   g. compare the checksums calculated at (c) and (f); they should be the same
   h. bring down a different set of bricks from each subvolume
   i. calculate arequal-checksum
   j. compare the checksums calculated at (f) and (i); they should be the same

Actual results:
================
Arequal checksum mismatched.
2015-12-29 15:49:37,797 INFO compare_arequal_checksum_mount Arequal-Checksum on mount cutlass.lab.eng.blr.redhat.com:/mnt/glusterfs : 'after-self-heal-copy' is:
Entry counts
Regular files : 1057
Directories : 69
Symbolic links : 0
Other : 0
Total : 1126
Metadata checksums
Regular files : 48974c
Directories : 24d74c
Symbolic links : 3e9
Other : 3e9
Checksums
Regular files : 1cbd48496b3a49731c6b8c8b25536b98
Directories : 4b071170126a5d7f
Symbolic links : 0
Other : 0
Total : 4bd1d5b25c037f94

2015-12-29 15:49:37,797 INFO compare_arequal_checksum_mount Arequal-Checksum on mount cutlass.lab.eng.blr.redhat.com:/mnt/glusterfs : 'before-next-op-rename' is:
Entry counts
Regular files : 1052
Directories : 69
Symbolic links : 0
Other : 0
Total : 1121
Metadata checksums
Regular files : 2cb0
Directories : 24d74c
Symbolic links : 3e9
Other : 3e9
Checksums
Regular files : e3f89df229df6151b02a6fadfa0895fa
Directories : 2858412f24757255
Symbolic links : 0
Other : 0
Total : 7b8ab370f7a286fe

2015-12-29 15:49:37,797 ERROR compare_arequal_checksum_mount Checksums on mount cutlass.lab.eng.blr.redhat.com:/mnt/glusterfs of 'after-self-heal-copy' and 'before-next-op-rename' doesn't match
2015-12-29 15:49:37,797 INFO run Executing find /mnt/glusterfs | uniq -d on cutlass.lab.eng.blr.redhat.com
2015-12-29 15:49:40,095 INFO run "find /mnt/glusterfs | uniq -d" on cutlass.lab.eng.blr.redhat.com: RETCODE is 0
2015-12-29 15:49:40,095 INFO get_duplicate_entries_from_mount No Duplicate Entries found under cutlass.lab.eng.blr.redhat.com:/mnt/glusterfs
2015-12-29 15:49:40,095 ERROR get_missing_entries_from_mount Missing entries from mount when comparing entries 'after-self-heal-copy' and entries 'before-next-op-rename':
/mnt/glusterfs/E_file_copy_32
/mnt/glusterfs/E_file_copy_33
/mnt/glusterfs/E_file_copy_30
/mnt/glusterfs/E_file_copy_31
/mnt/glusterfs/E_file_copy_35

Expected results:
===================
arequal-checksums should match

Additional info:
==================
Volume Name: testvol
Type: Tier
Volume ID: 50b291c4-68ec-4b40-8ca3-cd2a1524f43f
Status: Started
Number of Bricks: 12
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 3 = 6
Brick1: rhsauto020.lab.eng.blr.redhat.com:/bricks/brick3/testvol_tier5
Brick2: rhsauto019.lab.eng.blr.redhat.com:/bricks/brick3/testvol_tier4
Brick3: rhsauto022.lab.eng.blr.redhat.com:/bricks/brick1/testvol_tier3
Brick4: rhsauto021.lab.eng.blr.redhat.com:/bricks/brick1/testvol_tier2
Brick5: rhsauto020.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier1
Brick6: rhsauto019.lab.eng.blr.redhat.com:/bricks/brick2/testvol_tier0
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 3 = 6
Brick7: rhsauto019.lab.eng.blr.redhat.com:/bricks/brick0/testvol_brick0
Brick8: rhsauto020.lab.eng.blr.redhat.com:/bricks/brick0/testvol_brick1
Brick9: rhsauto021.lab.eng.blr.redhat.com:/bricks/brick0/testvol_brick2
Brick10: rhsauto022.lab.eng.blr.redhat.com:/bricks/brick0/testvol_brick3
Brick11: rhsauto019.lab.eng.blr.redhat.com:/bricks/brick1/testvol_brick4
Brick12: rhsauto020.lab.eng.blr.redhat.com:/bricks/brick1/testvol_brick5
Options Reconfigured:
performance.readdir-ahead: on
features.ctr-enabled: on
cluster.tier-mode: cache
cluster.watermark-low: 75
cluster.watermark-hi: 90

[root@rhsauto019:/etc/yum.repos.d] Dec-29-2015 10:34:30 $ls -l /bricks/brick*/testvol_*/E_file_copy_33
-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick3/testvol_tier4/E_file_copy_33
[root@rhsauto019:/etc/yum.repos.d] Dec-29-2015 10:34:32 $

[root@rhsauto020:/etc/yum.repos.d] Dec-29-2015 10:34:30 $ls -l /bricks/brick*/testvol_*/E_file_copy_33
---------T. 2 root root 0 Dec 29 10:08 /bricks/brick0/testvol_brick1/E_file_copy_33
-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick3/testvol_tier5/E_file_copy_33
[root@rhsauto020:/etc/yum.repos.d] Dec-29-2015 10:34:32 $

[root@rhsauto021:/etc/yum.repos.d] Dec-29-2015 10:34:30 $ls -l /bricks/brick*/testvol_*/E_file_copy_33
---------T. 2 root root 0 Dec 29 10:08 /bricks/brick0/testvol_brick2/E_file_copy_33
[root@rhsauto021:/etc/yum.repos.d] Dec-29-2015 10:34:32 $

[root@rhsauto022:/etc/yum.repos.d] Dec-29-2015 10:34:30 $ls -l /bricks/brick*/testvol_*/E_file_copy_33
-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick1/testvol_tier3/E_file_copy_33
[root@rhsauto022:/etc/yum.repos.d] Dec-29-2015 10:34:32 $
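
The brick listings above can be cross-checked mechanically. The following is a minimal Python sketch, not part of the test framework used in this report: `classify`, `bricks_missing`, `REPORT_LS` and `COLD_SUBVOL` are hypothetical names introduced here. It assumes that a zero-byte entry with no permission bits and the sticky bit set (shown as a run of dashes ending in `T`) is a link-to file, and that Brick7, Brick8 and Brick9 form one replica set of the cold tier because they are listed consecutively in the volume info.

```python
import re

# One `ls -l` line: mode, link count, owner, group, size, three date fields, path.
LS_LINE = re.compile(
    r"^(?P<mode>[-a-zA-Z]+)\.?\s+\d+\s+\S+\s+\S+\s+(?P<size>\d+)"
    r"\s+\S+\s+\d+\s+\S+\s+(?P<path>/\S+)$"
)


def classify(host, ls_output):
    """Map (host, brick_dir) -> 'linkto' or 'data' for each parsable `ls -l` line."""
    found = {}
    for line in ls_output.splitlines():
        m = LS_LINE.match(line.strip())
        if not m:
            continue  # skip prompts and blank lines
        brick_dir = m.group("path").rsplit("/", 1)[0]
        # Assumption: no permission bits + sticky bit + zero size means link-to file.
        is_linkto = bool(re.fullmatch(r"-+T", m.group("mode"))) and int(m.group("size")) == 0
        found[(host, brick_dir)] = "linkto" if is_linkto else "data"
    return found


def bricks_missing(replica_set, found):
    """Return the bricks of a replica set on which the file was not listed at all."""
    return [brick for brick in replica_set if brick not in found]


# `ls -l` output for E_file_copy_33, copied from the listings in this report.
REPORT_LS = {
    "rhsauto019": "-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick3/testvol_tier4/E_file_copy_33",
    "rhsauto020": (
        "---------T. 2 root root 0 Dec 29 10:08 /bricks/brick0/testvol_brick1/E_file_copy_33\n"
        "-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick3/testvol_tier5/E_file_copy_33"
    ),
    "rhsauto021": "---------T. 2 root root 0 Dec 29 10:08 /bricks/brick0/testvol_brick2/E_file_copy_33",
    "rhsauto022": "-rw-r--r--. 2 root root 33792 Dec 29 10:08 /bricks/brick1/testvol_tier3/E_file_copy_33",
}

# Assumed cold-tier replica set (Brick7, Brick8, Brick9 from the volume info).
COLD_SUBVOL = [
    ("rhsauto019", "/bricks/brick0/testvol_brick0"),
    ("rhsauto020", "/bricks/brick0/testvol_brick1"),
    ("rhsauto021", "/bricks/brick0/testvol_brick2"),
]

found = {}
for host, text in REPORT_LS.items():
    found.update(classify(host, text))

missing = bricks_missing(COLD_SUBVOL, found)
print(missing)
```

Under these assumptions the only cold-tier brick without an entry for E_file_copy_33 is rhsauto019:/bricks/brick0/testvol_brick0, which is consistent with the observation that one brick of the cold-tier subvolume lacks the link-to file.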
SOSREPORT : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1294632/
Shwetha, sos-reports don't have client logs in them. Could you please provide the client logs?
*** Bug 1294732 has been marked as a duplicate of this bug. ***
Related bug: BZ 1294597