Bug 1894758

Summary: [DR] Remote data sync to the secondary site never completes
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: SATHEESARAN <sasundar>
Component: rhhi
Assignee: Gobinda Das <godas>
Status: CLOSED CURRENTRELEASE
QA Contact: SATHEESARAN <sasundar>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: rhhiv-1.8
CC: godas, kmajumde, rcyriac, rhs-bugs
Target Milestone: ---   
Target Release: RHHI-V 1.8.z Async Update   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1895277
Environment:
Last Closed: 2021-01-11 07:12:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1895277    
Bug Blocks:    
Attachments:
Description                      Flags
ENGINE.LOG                       none
supervdsm.log_grafton10          none
supervdsm.log_grafton10          none
supervdsm.log_grafton7.tar.gz    none
supervdsm.log_grafton8.tar.gz    none
supervdsm.log_grafton9.tar.gz    none
engine.log with debug enabled    none

Description SATHEESARAN 2020-11-05 01:06:03 UTC
Description of problem:
------------------------
The RHHI-V DR mechanism uses gluster geo-replication to sync data to the remote site. Geo-replication itself works: the checkpoint is reached, which means the data has been synced successfully to the secondary site. However, RHV Manager at the primary site fails to recognize that the geo-rep data sync has completed and waits indefinitely.
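
For illustration only, here is a minimal sketch of how "checkpoint reached" could be derived from geo-replication status. It is not the engine's actual logic. The XML sample is hypothetical and only modelled on the shape of `gluster volume geo-replication ... status detail --xml` output; the element names (`pair`, `status`, `checkpoint_completed`), the host names, and the rule that only Active workers are checked are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names and values are assumptions, not captured output.
SAMPLE_STATUS_XML = """\
<cliOutput>
  <geoRep>
    <volume>
      <name>vmstore</name>
      <sessions>
        <session>
          <pair>
            <master_node>host1</master_node>
            <status>Active</status>
            <checkpoint_completed>Yes</checkpoint_completed>
          </pair>
          <pair>
            <master_node>host2</master_node>
            <status>Passive</status>
            <checkpoint_completed>N/A</checkpoint_completed>
          </pair>
        </session>
      </sessions>
    </volume>
  </geoRep>
</cliOutput>
"""


def checkpoint_reached(status_xml: str) -> bool:
    """Return True when every Active worker reports a completed checkpoint."""
    root = ET.fromstring(status_xml)
    # Assumption: only Active workers carry a meaningful checkpoint state.
    active = [
        pair for pair in root.iter("pair")
        if pair.findtext("status", "").strip() == "Active"
    ]
    if not active:
        # No Active worker means there is nothing to confirm the sync.
        return False
    return all(
        pair.findtext("checkpoint_completed", "").strip().lower() == "yes"
        for pair in active
    )
```

With the sample above, `checkpoint_reached(SAMPLE_STATUS_XML)` evaluates to `True`, because the only Active worker reports `Yes`.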

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHV Manager 4.4.3
RHVH 4.4.3
RHGS 3.5.3

How reproducible:
-------------------
Always

Steps to Reproduce:
-------------------
1. Create a primary site with a 3-node RHHI-V deployment
2. Create a secondary site with a 3-node RHHI-V deployment, with only the volumes created and no storage domains
3. Create a VM with a 40 GB OS disk and install RHEL 8.3 on it
4. Create a geo-rep session from the primary site to the secondary site
5. Create a schedule to sync the data to the secondary site
6. Wait for the scheduled geo-rep session to be triggered
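
The report does not say how the session and schedule in steps 4 to 6 were created. As a rough sketch, the gluster CLI equivalents of one sync cycle could be assembled as below. The volume and host names are hypothetical, the ordering of the commands is an assumption, and prerequisites such as passwordless SSH between the sites are omitted.

```python
def georep_sync_commands(primary_vol: str, secondary_host: str, secondary_vol: str):
    """Build the argv lists for one illustrative geo-rep sync cycle."""
    base = [
        "gluster", "volume", "geo-replication",
        primary_vol, f"{secondary_host}::{secondary_vol}",
    ]
    return [
        base + ["create", "push-pem"],           # create the session (step 4)
        base + ["start"],                        # start syncing (what the schedule triggers)
        base + ["config", "checkpoint", "now"],  # mark the point in time to sync up to
        base + ["status", "detail"],             # inspect progress and checkpoint state
    ]


# Hypothetical names, for illustration only.
for argv in georep_sync_commands("vmstore", "secondary1.example.com", "vmstore_dr"):
    print(" ".join(argv))
```

The function only builds the command lines and does not run them, so it is safe to try on a machine without gluster installed.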

Actual results:
---------------
The geo-rep session starts and syncs the data successfully, but RHV Manager (engine) fails to detect that the sync has completed.

Expected results:
-----------------
Once gluster geo-replication successfully completes the data sync, the engine should detect the completion and trigger the appropriate events.
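
One way to picture the expected behaviour is a bounded polling loop that reports either completion or a timeout instead of waiting forever. This is a sketch under stated assumptions and does not show how the engine is implemented: the function name, the event names `SYNC_COMPLETED` and `SYNC_TIMED_OUT`, and the idea of a timeout are all hypothetical.

```python
import time
from typing import Callable


def wait_for_sync(
    is_complete: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll is_complete() until it returns True or timeout_s elapses.

    Returns a hypothetical event name describing the outcome.
    """
    deadline = clock() + timeout_s
    while True:
        if is_complete():
            return "SYNC_COMPLETED"
        if clock() >= deadline:
            # Surface the failure instead of waiting indefinitely.
            return "SYNC_TIMED_OUT"
        sleep(interval_s)
```

The `sleep` and `clock` parameters are injectable so the loop can be exercised without real delays; in practice `is_complete` would wrap whatever check reports that the geo-rep checkpoint has been reached.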

Comment 3 SATHEESARAN 2020-11-05 01:51:51 UTC
Created attachment 1726770 [details]
ENGINE.LOG

Comment 6 SATHEESARAN 2020-11-05 07:34:54 UTC
Created attachment 1726791 [details]
supervdsm.log_grafton10

Comment 7 SATHEESARAN 2020-11-05 07:36:04 UTC
Created attachment 1726792 [details]
supervdsm.log_grafton10

Comment 8 SATHEESARAN 2020-11-05 07:42:11 UTC
Created attachment 1726794 [details]
supervdsm.log_grafton7.tar.gz

Comment 9 SATHEESARAN 2020-11-05 07:42:48 UTC
Created attachment 1726795 [details]
supervdsm.log_grafton8.tar.gz

Comment 10 SATHEESARAN 2020-11-05 07:44:22 UTC
Created attachment 1726796 [details]
supervdsm.log_grafton9.tar.gz

Comment 11 SATHEESARAN 2020-11-05 07:49:26 UTC
Comment on attachment 1726791 [details]
supervdsm.log_grafton10

This is not the right log file

Comment 12 SATHEESARAN 2020-11-05 07:50:09 UTC
Comment on attachment 1726792 [details]
supervdsm.log_grafton10

Incorrect logfile

Comment 13 SATHEESARAN 2020-11-05 14:59:12 UTC
Created attachment 1726900 [details]
engine.log with debug enabled

Comment 20 SATHEESARAN 2020-12-04 10:54:48 UTC
Tested with 4.4.3.12-0.1.el8ev and glusterfs-6.0-49.el8rhgs, with the glusterfs-selinux package installed.

Geo-replication successfully syncs the data from the primary gluster volume to the secondary gluster volume,
using rsync as the sync-method.

After the sync, the disaster-recovery roles also work as expected and the VMs start successfully on the
secondary site.

Comment 21 SATHEESARAN 2021-01-11 07:12:51 UTC
Closing this bug as the fix is shipped with the latest RHHI-V 1.8.2 release.