Bug 1321126

Summary: Replication changelog can incorrectly skip over updates
Product: Red Hat Enterprise Linux 6
Reporter: Noriko Hosoi <nhosoi>
Component: 389-ds-base
Assignee: Noriko Hosoi <nhosoi>
Status: CLOSED ERRATA
QA Contact: Viktor Ashirov <vashirov>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.0
CC: arubin, ekeck, gparente, jnansi, lkrispen, nkinder, rmeggins, rmj, tlavigne
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: 389-ds-base-1.2.11.15-82.el6
Doc Type: Bug Fix
Doc Text:
Previously, the changelog iterator buffer could, in some scenarios, point to an incorrect position when it was reloaded. This could cause replication to skip parts of the changelog, and consequently some changes were not replicated. This bug has been fixed, and replication data loss due to an incorrectly reloaded changelog buffer no longer occurs.
Story Points: ---
Clone Of:
: 1354331
Environment:
Last Closed: 2017-03-21 10:20:50 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1354331

Description Noriko Hosoi 2016-03-24 17:49:13 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/48766

In an MMR (multi-master replication) environment where all the masters are under heavy load, the replication changelog cache/buffer mechanism does not always use the correct anchor CSN, and some updates are not sent to the consumer.

Typically it is the very first "bulk load" read from the changelog at the start of a replication session that is affected, but the problem can also occur in subsequent bulk loads during the same session.
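
To illustrate the kind of failure described above, here is a toy Python model of a buffered changelog reader. It is only a sketch and is not the 389-ds-base code: CSNs are modelled as plain integers, and the names load_buffer, replicate, sent_per_load and anchor_from_sent are hypothetical. The faulty mode shown here (reloading from the last change loaded rather than the last change sent) is an assumed example of a wrong anchor; the actual root cause is described in the upstream ticket.

```python
from bisect import bisect_right


def load_buffer(changelog, anchor, size):
    """Return up to `size` changes whose CSN is strictly greater than `anchor`.

    `changelog` is a sorted list of integer CSNs (a simplification).
    """
    start = bisect_right(changelog, anchor)
    return changelog[start:start + size]


def replicate(changelog, size, sent_per_load, anchor_from_sent):
    """Send the changelog in bulk loads and return what the consumer received.

    Each load sends at most `sent_per_load` changes from a buffer of `size`
    entries before the buffer is reloaded from the anchor CSN.
    """
    received = []
    anchor = 0  # CSN 0 stands for "nothing sent yet" in this toy model
    while True:
        buf = load_buffer(changelog, anchor, size)
        if not buf:
            return received
        sent = buf[:sent_per_load]
        received.extend(sent)
        # Correct anchor: the last change actually sent to the consumer.
        # Faulty anchor (assumed for illustration): the last change loaded,
        # whether or not it was sent.
        anchor = sent[-1] if anchor_from_sent else buf[-1]


changelog = list(range(1, 11))  # CSNs 1..10
good = replicate(changelog, size=4, sent_per_load=3, anchor_from_sent=True)
bad = replicate(changelog, size=4, sent_per_load=3, anchor_from_sent=False)
skipped = sorted(set(changelog) - set(bad))
print("correct anchor:", good)
print("faulty anchor: ", bad, "skipped:", skipped)
```

With the correct anchor the consumer receives every change. With the faulty anchor the reader silently jumps past changes that were loaded but never sent, and nothing in the loop reports an error, which matches the symptom reported here.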

Comment 1 Noriko Hosoi 2016-03-31 17:01:57 UTC
Justification: This bug is severe because it could cause data loss on some random replication consumer servers without being noticed.

Comment 4 Noriko Hosoi 2016-07-08 01:23:23 UTC
Justification:

https://bugzilla.redhat.com/show_bug.cgi?id=1321124#c19
(In reply to German Parente from comment #18)
> hi Marcel,
> 
> you are right that this bug has been identified only once in case 01632462.
> But I think the reason of having this in 7.2.Z is that there's a case where
> we could have inconsistency in the different nodes under replication and
> without noticing. That's the reason why engineering wanted this bug fixed in
> rhel7.2.z

This is a very severe bug that impacts all customers who use replication.  The main issue, already mentioned by German, is that updates are silently skipped.  This leads to data inconsistency, which basically means replication is broken.  This is a worst-case scenario for customers using replication, because replication is not working correctly but there are no warnings/errors.  So the customer has no idea things are broken, and when storing/replicating things like credit card/bank information, passwords, or other sensitive data, this inconsistency is unacceptable.

Comment 12 errata-xmlrpc 2017-03-21 10:20:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0667.html