Bug 1724043

Summary: [geo-rep]: Checksum mismatch when 2x2 vols are converted to arbiter
Product: [Community] GlusterFS
Reporter: hari gowtham <hgowtham>
Component: geo-replication
Assignee: hari gowtham <hgowtham>
Status: CLOSED UPSTREAM
QA Contact:
Severity: urgent
Docs Contact:
Priority: urgent
Version: mainline
CC: amukherj, bkunal, bugs, csaba, khiremat, ksubrahm, nchilaka, pasik, pkarampu, ravishankar, rcyriac, rhs-bugs, sheggodu, storage-qa-internal, sunkumar, vdas
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1683893
Environment:
Last Closed: 2020-03-12 14:27:45 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1683893, 1686568
Bug Blocks: 1687672, 1687687, 1687746

Comment 1 Worker Ant 2019-06-26 06:58:13 UTC
REVIEW: https://review.gluster.org/22945 ([WIP] Georep: make passive worker sync self-heal traffic) posted (#1) for review on master by hari gowtham

Comment 2 hari gowtham 2019-06-26 07:01:19 UTC
Description of problem:
=======================
While converting 2x2 volumes to 2x(2+1) (arbiter) volumes, a checksum mismatch was seen between master and slave:

[root@dhcp43-143 ~]# ./arequal-checksum -p /mnt/master/

Entry counts
Regular files   : 10000
Directories     : 2011
Symbolic links  : 11900
Other           : 0
Total           : 23911

Metadata checksums
Regular files   : 5ce564791c
Directories     : 288ecb21ce24
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 8e69e8576625d36f9ee1866c92bfb6a3
Directories     : 4a596e7e1e792061
Symbolic links  : 756e690d61497f6a
Other           : 0
Total           : 2fbf69488baa3ac7


[root@dhcp43-143 ~]# ./arequal-checksum -p /mnt/slave/

Entry counts
Regular files   : 10000
Directories     : 2011
Symbolic links  : 11900
Other           : 0
Total           : 23911

Metadata checksums
Regular files   : 5ce564791c
Directories     : 288ecb21ce24
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 53c64bd1144f6d9855f0af3edb55e614
Directories     : 4a596e7e1e792061
Symbolic links  : 756e690d61497f6a
Other           : 0
Total           : 3901e39cb02ad487



Everything matches except under "Checksums", where the Regular files and Total values differ between master and slave.
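
To make the comparison above repeatable, the two arequal-checksum outputs can be diffed field by field. The following is only a minimal sketch: the helper names (parse_arequal, diff_arequal) are hypothetical and not part of arequal or GlusterFS, and the parser assumes the layout shown in this report (a section title line without a colon, followed by "name : value" lines). The sample text is copied from the two runs above, trimmed to the "Metadata checksums" and "Checksums" sections.

```python
# Sketch only: compares two arequal-checksum outputs as captured in this report.
# parse_arequal and diff_arequal are hypothetical helper names.

MASTER_OUTPUT = """
Metadata checksums
Regular files   : 5ce564791c
Directories     : 288ecb21ce24
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 8e69e8576625d36f9ee1866c92bfb6a3
Directories     : 4a596e7e1e792061
Symbolic links  : 756e690d61497f6a
Other           : 0
Total           : 2fbf69488baa3ac7
"""

SLAVE_OUTPUT = """
Metadata checksums
Regular files   : 5ce564791c
Directories     : 288ecb21ce24
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : 53c64bd1144f6d9855f0af3edb55e614
Directories     : 4a596e7e1e792061
Symbolic links  : 756e690d61497f6a
Other           : 0
Total           : 3901e39cb02ad487
"""


def parse_arequal(text):
    """Parse arequal-style output into {section: {field: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            if current is None:
                continue  # ignore "name : value" lines before any section title
            key, _, value = line.partition(":")
            sections[current][key.strip()] = value.strip()
        else:
            # A non-empty line without a colon is treated as a section title.
            current = line
            sections[current] = {}
    return sections


def diff_arequal(master_text, slave_text):
    """Return (section, field, master_value, slave_value) for each mismatch."""
    master = parse_arequal(master_text)
    slave = parse_arequal(slave_text)
    mismatches = []
    for section, fields in master.items():
        for field, m_val in fields.items():
            s_val = slave.get(section, {}).get(field)
            if m_val != s_val:
                mismatches.append((section, field, m_val, s_val))
    return mismatches


if __name__ == "__main__":
    for section, field, m_val, s_val in diff_arequal(MASTER_OUTPUT, SLAVE_OUTPUT):
        print(f"{section} / {field}: master={m_val} slave={s_val}")
```

On the data from this report, the only fields reported are "Regular files" and "Total" under "Checksums". Shell prompt lines must be stripped before parsing, since the sketch would otherwise read them as section titles.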



Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.12.2-45.el7rhgs.x86_64

How reproducible:
=================
2/2

Steps to Reproduce:
====================
1. Create and start a geo-rep session with master and slave being 2x2
2. Mount the vols and start pumping data
3. Disable and stop self healing (prior to add-brick)

# gluster volume set VOLNAME cluster.data-self-heal off
# gluster volume set VOLNAME cluster.metadata-self-heal off
# gluster volume set VOLNAME cluster.entry-self-heal off
# gluster volume set VOLNAME self-heal-daemon off

4. Add brick to the master and slave to convert them to 2x(2+1) arbiter vols
5. Start rebalance on master and slave

6. Re-enable self healing:

# gluster volume set VOLNAME cluster.data-self-heal on
# gluster volume set VOLNAME cluster.metadata-self-heal on
# gluster volume set VOLNAME cluster.entry-self-heal on
# gluster volume set VOLNAME self-heal-daemon on

7. Wait for rebalance to complete
8. Check the checksum between master and slave
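
The self-heal toggling in steps 3 and 6 can be scripted so that the same four options are always switched together. This is a rough sketch, not a tested reproducer: the function names and the volume name "mastervol" are hypothetical, the option names are taken verbatim from the steps above, and actually running the commands assumes a node where the gluster CLI is installed and the volume exists.

```python
# Sketch only: builds the "gluster volume set" commands from steps 3 and 6.
# self_heal_commands, apply_commands and "mastervol" are hypothetical names.
import subprocess

# Option names exactly as they appear in the reproduction steps.
SELF_HEAL_OPTIONS = [
    "cluster.data-self-heal",
    "cluster.metadata-self-heal",
    "cluster.entry-self-heal",
    "self-heal-daemon",
]


def self_heal_commands(volname, enabled):
    """Return one 'gluster volume set' argv list per self-heal option."""
    state = "on" if enabled else "off"
    return [
        ["gluster", "volume", "set", volname, option, state]
        for option in SELF_HEAL_OPTIONS
    ]


def apply_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("# " + " ".join(argv))
        if not dry_run:
            # Requires the gluster CLI on this host; raises on non-zero exit.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Step 3: disable self healing before add-brick (dry run, prints only).
    apply_commands(self_heal_commands("mastervol", enabled=False))
    # Step 6: re-enable self healing after rebalance has been started.
    apply_commands(self_heal_commands("mastervol", enabled=True))
```

With dry_run left at its default the sketch only prints the commands, which makes it easy to check them against the steps before running anything on a cluster.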


Actual results:
===============
The checksums do not fully match: the Regular files and Total checksums differ between master and slave.


Expected results:
================
Checksum should match

Comment 3 Sunny Kumar 2020-02-04 09:16:45 UTC
*** Bug 1686568 has been marked as a duplicate of this bug. ***

Comment 4 Worker Ant 2020-03-12 14:27:45 UTC
This bug has been moved to https://github.com/gluster/glusterfs/issues/1049 and will be tracked there from now on. Visit the GitHub issue URL for further details.