Bug 1403394

Summary: Database replication is failing for LVDC
Product: Red Hat CloudForms Management Engine
Reporter: Josh Carter <jocarter>
Component: Replication
Assignee: Gregg Tanzillo <gtanzill>
Status: CLOSED DUPLICATE
QA Contact: Dave Johnson <dajohnso>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 5.6.0
CC: cpelland, jhardy, jocarter, ncarboni, obarenbo
Target Milestone: GA
Keywords: TestOnly, ZStream
Target Release: cfme-future
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1412279
Environment:
Last Closed: 2017-01-11 22:16:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1412279

Description Josh Carter 2016-12-09 23:00:06 UTC
Description of problem:

Database replication is failing for LVDC after the upgrade from 3.2 to 4.2.

Comment 4 Nick Carboni 2016-12-14 22:38:58 UTC
This was caused by a combination of factors.

- The latency between the remote server running the replication worker and the global was about 40 ms.
- Replication was slow enough that, in many cases, replicating one batch of data took longer than the replication subprocess heartbeat timeout, so the process was killed.
- https://bugzilla.redhat.com/show_bug.cgi?id=1404028 was causing a very large backlog (taggings and events) during provisioning.
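
To illustrate how a 40 ms round trip can push a single batch past a heartbeat timeout, here is a rough back-of-the-envelope sketch in Python. Only the 40 ms latency comes from the analysis above. The batch size, the assumption of one network round trip per row, and the 120-second timeout are hypothetical values chosen for illustration, not settings taken from the appliance.

```python
LATENCY_MS = 40  # round-trip latency reported in this comment

# Hypothetical values, for illustration only (not taken from the product).
HYPOTHETICAL_BATCH_ROWS = 5000
HYPOTHETICAL_ROUND_TRIPS_PER_ROW = 1
HYPOTHETICAL_TIMEOUT_MS = 120 * 1000


def batch_duration_ms(rows, round_trips_per_row, latency_ms):
    """Lower bound on the time to replicate one batch, counting network latency only."""
    return rows * round_trips_per_row * latency_ms


def exceeds_timeout(rows, round_trips_per_row, latency_ms, timeout_ms):
    """True if latency alone keeps the worker busy longer than the heartbeat timeout."""
    return batch_duration_ms(rows, round_trips_per_row, latency_ms) > timeout_ms


duration = batch_duration_ms(
    HYPOTHETICAL_BATCH_ROWS, HYPOTHETICAL_ROUND_TRIPS_PER_ROW, LATENCY_MS
)
print(f"batch takes at least {duration / 1000:.0f} s")
print("exceeds timeout:", exceeds_timeout(
    HYPOTHETICAL_BATCH_ROWS,
    HYPOTHETICAL_ROUND_TRIPS_PER_ROW,
    LATENCY_MS,
    HYPOTHETICAL_TIMEOUT_MS,
))
```

Under these assumed numbers, a worker that only heartbeats between batches would stay silent for longer than the timeout, and a larger backlog only makes each batch more likely to hit that limit.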

As a fix for this issue, we will look at how frequently the replication process heartbeats, to avoid the second factor above.
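
One way to picture the proposed direction is a worker that heartbeats during a batch rather than only between batches. The following is a minimal Python sketch of that idea, not the actual CloudForms replication code: `replicate_batch`, `apply_row`, `heartbeat`, and `heartbeat_interval` are all hypothetical names introduced here for illustration.

```python
import time


def replicate_batch(rows, apply_row, heartbeat, heartbeat_interval=5.0,
                    clock=time.monotonic):
    """Apply every row in a batch, heartbeating while the batch is in progress.

    After each row, a heartbeat is sent if at least `heartbeat_interval`
    seconds have passed since the previous one. A heartbeat is also sent
    at the start and at the end of the batch.
    """
    last_beat = clock()
    heartbeat()
    for row in rows:
        apply_row(row)  # hypothetical callback that replicates a single row
        now = clock()
        if now - last_beat >= heartbeat_interval:
            heartbeat()
            last_beat = now
    heartbeat()
```

With this shape, the longest gap between heartbeats is bounded by the interval plus the time to apply one row, instead of by the time to apply the whole batch, so a slow link or a large backlog no longer makes the worker look dead.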