Bug 1231732 - Renamed Files are missing after self-heal
Summary: Renamed Files are missing after self-heal
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: replicate
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Anuradha
QA Contact: spandura
URL:
Whiteboard:
Depends On:
Blocks: 1202842 1238508 1240183
 
Reported: 2015-06-15 10:19 UTC by spandura
Modified: 2016-09-20 02:01 UTC

Fixed In Version: glusterfs-3.7.1-8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1238508
Environment:
Last Closed: 2015-07-29 05:03:23 UTC
Target Upstream Version:


Attachments
Sc (4.06 KB, application/x-shellscript)
2015-06-17 09:59 UTC, spandura
Logs from mgmt node and brick0. (7.36 MB, application/x-tar)
2015-07-02 07:31 UTC, Anuradha
Logs from client and brick1. (9.64 MB, application/x-tar)
2015-07-02 07:33 UTC, Anuradha
Logs from brick2 and brick3. (13.11 MB, application/x-tar)
2015-07-02 07:35 UTC, Anuradha


Links
System: Red Hat Product Errata
ID: RHSA-2015:1495
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-07-29 08:26:26 UTC

Description spandura 2015-06-15 10:19:09 UTC
Description of problem:
=======================
In a 2 x 2 distribute-replicate volume, create/delete/rename operations were performed on files and directories while bricks were down. The bricks were then brought back online and self-heal completed. After self-heal, some of the renamed files were missing from the mount point.

Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.7.1-1.el6rhs.x86_64

How reproducible:
==============
Often

Steps to Reproduce:
======================
step1)

    create 2 x 2 distribute-replicate volume. start the volume.
    create fuse mount.
    bring down brick1.
    From client execute entry_self_heal.sh <abs_path_mountpoint> "create" 1
    calculate arequal-checksum (after_data_creation)
    bring back brick1 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match


    Step2)
    bring down brick2. 
    calculate arequal-checksum (before_data_creation)
    compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation) . The arequal-checksums should match
    From client execute entry_self_heal.sh <abs_path_mountpoint> "delete" 1
    calculate arequal-checksum (after_data_creation)
    bring back brick2 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match

    step3)
    bring down brick3.
    calculate arequal-checksum (before_data_creation)
    compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation) . The arequal-checksums should match
    From client execute entry_self_heal.sh <abs_path_mountpoint> "rename" 1
    calculate arequal-checksum (after_data_creation)
    bring back brick3 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match

    step4)
    bring down brick4
    calculate arequal-checksum (before_data_creation)
    compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation) . The arequal-checksums should match
    From client execute entry_self_heal.sh <abs_path_mountpoint> "create" 2
    calculate arequal-checksum (after_data_creation)
    bring back brick4 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match

    step5)
    bring down brick1 and brick3
    calculate arequal-checksum (before_data_creation)
    compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation) . The arequal-checksums should match
    From client execute entry_self_heal.sh <abs_path_mountpoint> "delete" 2
    calculate arequal-checksum (after_data_creation)
    bring back brick1 and brick3 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match

    step6)
    bring down brick1 and brick4.
    calculate arequal-checksum (before_data_creation)
    compare arequal-checksum (after_self_heal calculated above) and arequal-checksum (before_data_creation) . The arequal-checksums should match
    From client execute entry_self_heal.sh <abs_path_mountpoint> "rename" 2
    calculate arequal-checksum (after_data_creation)
    bring back brick1 and brick4 ( service glusterd restart )
    trigger self-heal
    After self-heal is complete , calculate arequal-checksum (after_self_heal)
    compare arequal-checksum (after_data_creation) and arequal-checksum (after_self_heal) . The arequal-checksums should match 
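
The six steps above follow one pattern, so they can be sketched as a small data-driven loop. This is only an illustration, not the attached script: the Step class and the run_step function are hypothetical names, and the four callables (checksum, bring_down, run_workload, bring_up_and_heal) are placeholders for whatever the test harness actually uses to compute the arequal-checksum, stop a brick, run entry_self_heal.sh, and restart glusterd and trigger self-heal.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    bricks_down: Tuple[str, ...]  # bricks taken offline for this step
    operation: str                # operation argument of entry_self_heal.sh
    iteration: int                # trailing numeric argument (1 or 2)


# The six steps from the reproduction procedure above.
STEPS = [
    Step(("brick1",), "create", 1),
    Step(("brick2",), "delete", 1),
    Step(("brick3",), "rename", 1),
    Step(("brick4",), "create", 2),
    Step(("brick1", "brick3"), "delete", 2),
    Step(("brick1", "brick4"), "rename", 2),
]


def run_step(
    step: Step,
    checksum: Callable[[], str],
    bring_down: Callable[[str], None],
    run_workload: Callable[[str, int], None],
    bring_up_and_heal: Callable[[Tuple[str, ...]], None],
    previous: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Run one step; return (checksum after self-heal, failure messages)."""
    failures: List[str] = []
    for brick in step.bricks_down:
        bring_down(brick)
    # From step 2 onwards, the checksum taken with bricks down must match
    # the after_self_heal checksum of the previous step.
    if previous is not None and checksum() != previous:
        failures.append("checksum changed after bringing bricks down")
    run_workload(step.operation, step.iteration)
    after_ops = checksum()
    bring_up_and_heal(step.bricks_down)
    after_heal = checksum()
    if after_heal != after_ops:
        failures.append(
            f"after_{step.operation}_{step.iteration} and "
            f"after_self_heal_{step.operation}_{step.iteration} differ"
        )
    return after_heal, failures
```

A driver would iterate over STEPS and pass each returned checksum as previous to the next call. In the run reported here, the failure appears at the last entry ("rename" 2).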

Actual results:
==================
:: [   FAIL   ] :: Files /arequal-data/rhsauto053.lab.eng.blr.redhat.com_gluster-mount_arequal_checksum_after_rename_2.log and /arequal-data/rhsauto053.lab.eng.blr.redhat.com_gluster-mount_arequal_checksum_after_self_heal_rename_2.log should not differ 
:: [ 18:55:17 ] :: arequal checksum of after_rename_2

Entry counts
Regular files   : 640
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 683

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 8d5ae90e794c15b2befdda509790c2f7
Directories     : d02060104765b76
Symbolic links  : 0
Other           : 0
Total           : 3ea5355feaaa8c33
:: [ 18:55:17 ] :: arequal checksum of after_self_heal_rename_2

Entry counts
Regular files   : 594
Directories     : 43
Symbolic links  : 0
Other           : 0
Total           : 637

Metadata checksums
Regular files   : 3e9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 53abb6943e3d8c97da3cc7cb50ac2171
Directories     : d02060104765b76
Symbolic links  : 0
Other           : 0
Total           : 8495775e6ae7f690
:: [ 18:55:18 ] :: Checking if there are any duplicate entries under /gluster-mount
:: [   PASS   ] :: Duplicate entries not found under /gluster-mount 
:: [ 18:55:19 ] :: Checking if there are missing entries under /gluster-mount  after_rename_2 to after_self_heal_rename_2
:: [   FAIL   ] :: Missing entries found under /gluster-mount  after_rename_2 to after_self_heal_rename_2 
:: [ 18:55:19 ] :: Listing all the Missing entries on mount /gluster-mount  after_rename_2 to after_self_heal_rename_2
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_17/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_18/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_19/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_20/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_1
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_11
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_12
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_14
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_15
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_2
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_3
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_4
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_6
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_7
-/gluster-mount/E_dir_new_2_2_2_5/E_file_new_2_2_2_9
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_10
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_13
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_5
-/gluster-mount/E_dir_new_2_2_2_6/E_file_new_2_2_2_8
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_1
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_11
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_12
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_14
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_15
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_2
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_3
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_4
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_6
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_7
-/gluster-mount/E_dir_new_2_2_2_9/E_file_new_2_2_2_9
:: [ 18:55:20 ] :: Checking if there are additional entries under /gluster-mount  after_self_heal_rename_2 to after_rename_2
:: [   PASS   ] :: No Additional entries found under /gluster-mount  after_self_heal_rename_2 to after_rename_2 
:: [ 18:55:21 ] :: Total Number of files and directories in the volume : 637
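
The missing, additional and duplicate entry checks in the log above amount to comparing two path listings of the mount. How the harness implements them is not shown in this report; the following is a minimal sketch with a hypothetical compare_listings function, assuming each listing is a plain list of paths. The numbers in the log are consistent with this view: 683 - 637 = 46 entries, which equals the 46 missing paths listed and the drop in regular files (640 to 594).

```python
from collections import Counter
from typing import Dict, Iterable, List


def compare_listings(before: Iterable[str], after: Iterable[str]) -> Dict[str, List[str]]:
    """Compare two path listings of the same mount point."""
    before_list, after_list = list(before), list(after)
    before_set, after_set = set(before_list), set(after_list)
    return {
        # present before self-heal, absent afterwards
        "missing": sorted(before_set - after_set),
        # absent before self-heal, present afterwards
        "additional": sorted(after_set - before_set),
        # listed more than once after self-heal
        "duplicates": sorted(p for p, n in Counter(after_list).items() if n > 1),
    }


# Illustrative input built from two of the paths reported as missing.
before = [
    "/gluster-mount/E_dir_new_2_2_2_11",
    "/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_10",
    "/gluster-mount/E_dir_new_2_2_2_11/E_file_new_2_2_2_13",
]
after = ["/gluster-mount/E_dir_new_2_2_2_11"]
report = compare_listings(before, after)
for path in report["missing"]:
    print("-" + path)
```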

Expected results:
====================
arequal-checksum should match.
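
To see which fields of two arequal-checksum outputs disagree, the text can be parsed into sections and compared field by field. This is a sketch that assumes the output layout shown under "Actual results" (a section title line followed by "name : value" lines); parse_arequal and diff_arequal are hypothetical helper names and not part of arequal itself.

```python
from typing import Dict, List, Optional, Tuple

Sections = Dict[str, Dict[str, str]]


def parse_arequal(text: str) -> Sections:
    """Parse arequal-style output into {section: {field: value}}."""
    sections: Sections = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and harness log lines (":: [ ... ] :: ...").
        if not line or line.startswith("::"):
            continue
        if ":" in line:
            if current is None:
                continue  # field outside any section, ignore
            key, _, value = line.partition(":")
            sections[current][key.strip()] = value.strip()
        else:
            current = line  # a line without ':' starts a new section
            sections[current] = {}
    return sections


def diff_arequal(a: Sections, b: Sections) -> List[Tuple[str, str, str, Optional[str]]]:
    """Return (section, field, value_in_a, value_in_b) for each mismatch."""
    diffs = []
    for section, fields in a.items():
        for key, value in fields.items():
            other = b.get(section, {}).get(key)
            if other != value:
                diffs.append((section, key, value, other))
    return diffs


# Excerpts of the two outputs from this report.
AFTER_RENAME = """
Entry counts
Regular files   : 640
Directories     : 43
Total           : 683

Checksums
Directories     : d02060104765b76
Total           : 3ea5355feaaa8c33
"""
AFTER_SELF_HEAL = """
Entry counts
Regular files   : 594
Directories     : 43
Total           : 637

Checksums
Directories     : d02060104765b76
Total           : 8495775e6ae7f690
"""
mismatches = diff_arequal(parse_arequal(AFTER_RENAME), parse_arequal(AFTER_SELF_HEAL))
for section, key, old, new in mismatches:
    print(f"{section} / {key}: {old} -> {new}")
```

For the excerpts above, the directory count and directory checksum agree, while the regular file count and the totals differ. That matches the observation that only files, not directories, went missing.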

Comment 2 spandura 2015-06-16 06:00:17 UTC
Link to the gluster logs : http://rhsqe-repo.lab.eng.blr.redhat.com/bugs_necessary_info/1231732/

Link to the beaker job : https://beaker.engineering.redhat.com/jobs/983606

Comment 3 spandura 2015-06-17 09:59:51 UTC
Created attachment 1039853 [details]
Sc

Comment 7 Anuradha 2015-07-02 07:31:58 UTC
Created attachment 1045375 [details]
Logs from mgmt node and brick0.

Comment 8 Anuradha 2015-07-02 07:33:39 UTC
Created attachment 1045376 [details]
Logs from client and brick1.

Comment 9 Anuradha 2015-07-02 07:35:40 UTC
Created attachment 1045377 [details]
Logs from brick2 and brick3.

Comment 10 Anuradha 2015-07-02 07:39:27 UTC
The logs provided by Shwetha list the files on each brick after every operation. They confirm that the files are present on the bricks but missing from the mount.
The RCA is done and a patch has been sent upstream for review:
http://review.gluster.org/#/c/11498/

Clearing needinfo on Shwetha.

Comment 11 Anuradha 2015-07-06 08:28:14 UTC
Patch posted for review downstream:
https://code.engineering.redhat.com/gerrit/#/c/52357/

Upstream links :
master : http://review.gluster.org/11498/
3.7    : http://review.gluster.org/11544/

Comment 12 spandura 2015-07-13 04:40:39 UTC
Verified the test on a 2 x 3 distribute-replicate volume with build "glusterfs-3.7.1-8.el6rhs.x86_64". The bug is fixed; moving it to the verified state.

Comment 13 errata-xmlrpc 2015-07-29 05:03:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

