Bug 990458

Summary: Dist-geo-rep: Metadata Checksum of directories doesn't match after the sync
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: M S Vishwanath Bhat <vbhat>
Component: geo-replication
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED EOL
QA Contact: M S Vishwanath Bhat <vbhat>
Severity: high
Priority: high
Version: 2.1
CC: avishwan, chrisw, csaba, david.macdonald, mzywusko, rhs-bugs, rwheeler, sdharane, smanjara, vagarwal
Keywords: Reopened, ZStream
Hardware: x86_64
OS: Linux
Whiteboard: consistency
Doc Type: Bug Fix
Last Closed: 2015-11-25 08:48:36 UTC
Type: Bug

Description M S Vishwanath Bhat 2013-07-31 09:04:04 UTC
Description of problem:
The metadata checksum of directories does not match after the geo-rep sync. All other checksums match.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.12rhs.beta6-1.el6rhs.x86_64

How reproducible:
Once. Not sure if it's reproducible.

Steps to Reproduce:
1. Create and start two 2x2 distributed-replicate volumes, one as master and one as slave.
2. Set the changelog-dir to a separate XFS partition.
3. Untar the Linux kernel on the master mount point.
4. Create and start a geo-rep session, then start untarring the Linux kernel again in a different directory.
5. Once the untar is done, overwrite all the .txt files with "Hello World".
6. Once the sync is done, get the arequal-checksum on both master and slave.
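
For anyone re-running these steps, the overwrite in step 5 could be scripted roughly as below. This is only a sketch: the function name overwrite_txt_files is hypothetical, and the trailing newline in the payload is an assumption, since the report does not say how the files were overwritten.

```python
import os


def overwrite_txt_files(root, payload="Hello World\n"):
    """Overwrite every regular *.txt file under root with payload; return the count."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so only real files on the volume are rewritten.
            if name.endswith(".txt") and os.path.isfile(path) and not os.path.islink(path):
                with open(path, "w") as fh:
                    fh.write(payload)
                count += 1
    return count
```

It would be called on the directory holding the second untar on the master mount, for example overwrite_txt_files("/mnt/master/<untar-dir>"), where the directory name is a placeholder.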

Actual results:
The metadata checksums of directories did not match.

[root@typhoon ~]# /opt/qa/tools/arequal-checksum /mnt/master/

Entry counts
Regular files   : 86032
Directories     : 5506
Symbolic links  : 4
Other           : 0
Total           : 91542

Metadata checksums
Regular files   : 3e9
Directories     : 3e9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 861bffd15ba69368ed8a27e0b66ec291
Directories     : 6f6f66
Symbolic links  : 0
Other           : 0
Total           : 6b91d831eda73e9f


[root@lightning ~]# /opt/qa/tools/arequal-checksum /mnt/slave/

Entry counts
Regular files   : 86032
Directories     : 5506
Symbolic links  : 4
Other           : 0
Total           : 91542

Metadata checksums
Regular files   : 3e9
Directories     : cd9
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 861bffd15ba69368ed8a27e0b66ec291
Directories     : 6f6f66
Symbolic links  : 0
Other           : 0
Total           : 6b91d831eda73e9f

For the metadata checksum of directories, the master has '3e9' and the slave has 'cd9'.
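
To avoid comparing the two outputs by eye, they can be diffed field by field. The sketch below parses the two outputs pasted above and reports every field that differs. The helper names parse_arequal and diff_arequal are hypothetical, and the parser assumes only the layout visible in this report (a section title line followed by "name : value" lines).

```python
def parse_arequal(text):
    """Parse arequal-checksum output into {section: {entry_type: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            if current is not None:
                sections[current][key.strip()] = value.strip()
        else:
            # A line without a colon starts a new section, e.g. "Entry counts".
            current = line
            sections[current] = {}
    return sections


def diff_arequal(master, slave):
    """Return (section, entry_type, master_value, slave_value) for each mismatch."""
    mismatches = []
    for section, fields in master.items():
        for key, m_val in fields.items():
            s_val = slave.get(section, {}).get(key)
            if m_val != s_val:
                mismatches.append((section, key, m_val, s_val))
    return mismatches


# The two outputs from this report, copied verbatim.
COMMON_HEAD = """
Entry counts
Regular files   : 86032
Directories     : 5506
Symbolic links  : 4
Other           : 0
Total           : 91542

Metadata checksums
Regular files   : 3e9
Directories     : {dirs}
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 861bffd15ba69368ed8a27e0b66ec291
Directories     : 6f6f66
Symbolic links  : 0
Other           : 0
Total           : 6b91d831eda73e9f
"""
MASTER = COMMON_HEAD.format(dirs="3e9")
SLAVE = COMMON_HEAD.format(dirs="cd9")

mismatches = diff_arequal(parse_arequal(MASTER), parse_arequal(SLAVE))
for section, key, m_val, s_val in mismatches:
    print(f"{section} / {key}: master={m_val} slave={s_val}")
# prints: Metadata checksums / Directories: master=3e9 slave=cd9
```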

Expected results:
All the checksums should match.

Additional info:

I'm not entirely sure what arequal-checksum does to calculate the checksum. I have collected all the logs. If arequal-checksum is found to do something which is not allowed for geo-rep or if it's invalid, I will move the bug to notabug.
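
Independently of how arequal-checksum computes its value, the directories whose metadata differs can be located by walking both mounts and comparing stat results. The sketch below compares permission bits, uid, and gid. Which attributes arequal-checksum actually folds into its metadata checksum is not stated in this report, so that attribute set is an assumption, and the function names are hypothetical.

```python
import os
import stat


def dir_metadata(root):
    """Map each directory path (relative to root) to (permission bits, uid, gid)."""
    meta = {}
    for dirpath, _dirnames, _filenames in os.walk(root):
        st = os.lstat(dirpath)
        rel = os.path.relpath(dirpath, root)
        meta[rel] = (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
    return meta


def mismatched_dirs(master_root, slave_root):
    """Return sorted relative paths whose metadata differs or that exist on one side only."""
    master = dir_metadata(master_root)
    slave = dir_metadata(slave_root)
    return sorted(p for p in master.keys() | slave.keys()
                  if master.get(p) != slave.get(p))
```

With the mounts from this report it would be invoked as mismatched_dirs("/mnt/master", "/mnt/slave"). Both mounts have to be visible from the same host for this to work, which is not the setup shown above (master and slave are mounted on different machines).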

There were many rsync error code 23 messages in the geo-replication log file. There were also many warning messages from the auxiliary mount point on the slave.

Comment 2 Venky Shankar 2013-08-12 05:01:09 UTC

*** This bug has been marked as a duplicate of bug 980910 ***

Comment 3 M S Vishwanath Bhat 2013-08-12 08:42:58 UTC
Bug https://bugzilla.redhat.com/show_bug.cgi?id=980910 is about changes in metadata not being synced to the slave. Here I am not changing any file metadata: I simply untar the Linux kernel, and after the sync the metadata checksums don't match. No setxattr, chmod, chown, etc.

Going by the description, this is not a duplicate. Or is it a duplicate because the code changes to fix both bugs are the same?

Comment 4 M S Vishwanath Bhat 2013-08-13 10:49:35 UTC
I don't think this is a duplicate, at least going by the descriptions of the two bugs.

This used to work before; I hadn't hit this issue until recently, so IMO this is a kind of regression. I'm reopening because the problem here is not a change in metadata failing to sync; it is the initial metadata itself. If you think the code changes for both bugs are the same and that is why it's a duplicate, then mark it as a duplicate again.

That said, I use arequal-checksum to verify the metadata, and that's where it's failing. I'm not 100% sure how it calculates that.

Comment 5 Vijaykumar Koppad 2014-01-28 11:51:08 UTC
This has happened again in the build glusterfs-3.4.0.57.

There was a mismatch in the metadata checksum for directories after doing chmod, chown, or chgrp on all the files. This is reproduced consistently.

Comment 6 shilpa 2014-06-30 06:58:15 UTC
I see this error consistently in the xsync crawl for symlinks. Version: glusterfs-3.6.0.22.
But looking at the arequal checksums on master and slave, they appear to be the same.

arequal-checksum of master is : 
 
Entry counts
Regular files   : 10000
Directories     : 2011
Symbolic links  : 11900
Other           : 0
Total           : 23911

Metadata checksums
Regular files   : 3b68f87510
Directories     : c52eb589e28
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : f110f5d2d4bf9c7f58f92f101b8bee0e
Directories     : 7f35770119653171
Symbolic links  : 7219756330546964
Other           : 0
Total           : a4c5d8a0e6052a64

arequal-checksum of geo_rep_slave slavevol: 
 
Entry counts
Regular files   : 10000
Directories     : 1451
Symbolic links  : 2470
Other           : 0
Total           : 13921

Metadata checksums
Regular files   : ceb82592c0
Directories     : 16f1ead1b00
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : f110f5d2d4bf9c7f58f92f101b8bee0e
Directories     : 2e47026e1f64257a
Symbolic links  : 771e3f5e2b67340e
Other           : 0
Total           : f0b0e7f2fb376305

Meta data checksum for regular files doesn't match between master and  slavevol
Meta data checksum for directories doesn't match between master and slavevol
Failed to sync all the files from master to slavevol
Checksum for directories doesn't match between master and slavevol

Comment 7 shilpa 2014-06-30 07:09:46 UTC
After the sync is complete and the checksums match, we still see a checksum error:

 Regular files   : 10000
 Directories     : 2011
 Symbolic links  : 11900
 Other           : 0
 Total           : 23911
 Metadata checksums
 Regular files   : 3b68f87510
 Directories     : c52eb589e28
 Symbolic links  : 3e9
 Other           : 3e9
 Checksums
 Regular files   : f110f5d2d4bf9c7f58f92f101b8bee0e
 Directories     : 7f35770119653171
 Symbolic links  : 7219756330546964
 Other           : 0
 Total           : a4c5d8a0e6052a64
 arequal-checksum of geo_rep_slave slavevol:
 Entry counts
 Regular files   : 10000
 Directories     : 2011
 Symbolic links  : 11900
 Other           : 0
 Total           : 23911
 Metadata checksums
 Regular files   : 3b68f87510
 Directories     : c558e679bd0
 Symbolic links  : 3e9
 Other           : 3e9
 Checksums
 Regular files   : f110f5d2d4bf9c7f58f92f101b8bee0e
 Directories     : 7f35770119653171
 Symbolic links  : 7219756330546964
 Other           : 0
 Total           : a4c5d8a0e6052a64
 Meta data checksum for directories doesn't match between master and slavevol

Comment 8 Vijaykumar Koppad 2014-07-17 12:29:05 UTC
This is still happening in the build glusterfs-3.6.0.24-1.el6rhs, in both the xsync crawl and the changelog crawl.

================================================================================
arequal-checksum of master is : 
 
Entry counts
Regular files   : 7006
Directories     : 594
Symbolic links  : 801
Other           : 0
Total           : 8401

Metadata checksums
Regular files   : 2f7b
Directories     : f922
Symbolic links  : 5a815a
Other           : 3e9

Checksums
Regular files   : e8f8e25dd381796ea69c6e28ba814418
Directories     : 3630166833605726
Symbolic links  : 3a03770349194039
Other           : 0
Total           : 4257ed1e13792a69

arequal-checksum of geo_rep_slave slave: 
 
Entry counts
Regular files   : 7006
Directories     : 594
Symbolic links  : 801
Other           : 0
Total           : 8401

Metadata checksums
Regular files   : 2f7b
Directories     : e2ca
Symbolic links  : 5a815a
Other           : 3e9

Checksums
Regular files   : e8f8e25dd381796ea69c6e28ba814418
Directories     : 3630166833605726
Symbolic links  : 3a03770349194039
Other           : 0
Total           : 4257ed1e13792a69

================================================================================

Comment 9 Aravinda VK 2015-01-29 10:15:42 UTC
Looks like a duplicate of bug 1146256.

Comment 10 Aravinda VK 2015-11-25 08:48:36 UTC
Closing this bug since RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.

Comment 11 Aravinda VK 2015-11-25 08:50:33 UTC
Closing this bug since RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.