Bug 1370300

Summary: set proper update status to replication agreement in case of failure
Product: Red Hat Enterprise Linux 7
Reporter: Noriko Hosoi <nhosoi>
Component: 389-ds-base
Assignee: mreynolds
Status: CLOSED ERRATA
QA Contact: Viktor Ashirov <vashirov>
Severity: unspecified
Docs Contact: Aneta Šteflová Petrová <apetrova>
Priority: unspecified
Version: 7.3
CC: lmiksik, mreynolds, msauton, nkinder, pbokoc, rmeggins, sramling
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 389-ds-base-1.3.5.10-10.el7
Doc Type: Bug Fix
Doc Text:
Replication agreement update status now includes details about replication agreement failures
The replication agreement update status previously displayed only a generic message after an error occurred, which made troubleshooting the replication agreement failure difficult. Now, the update status includes a detailed error message. As a result, all replication agreement update failures are correctly and precisely logged.
Last Closed: 2016-11-03 20:45:20 UTC

Description Noriko Hosoi 2016-08-25 20:57:04 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/48957

The replication agreement contains a last update status field, which can be read by ldapsearch.
Unfortunately, if send_updates() returns a fatal error, the update status is always set to:

{{{
agmt_set_last_update_status(prp->agmt, -1, 0, "Incremental update has failed and requires administrator action");
}}}
However, send_updates() has 13 different reasons to return UPDATE_FATAL_ERROR, and it logs more detail about each one to the error log. That detail could be propagated to the caller or set directly on the agreement.
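
To illustrate the proposal, here is a minimal Python sketch (the server itself is written in C) in which a send_updates()-style function returns a reason together with the fatal result, so the caller can store that reason in the agreement status instead of the generic text. Only send_updates, UPDATE_FATAL_ERROR, and the generic message come from the ticket; the Agreement class, UPDATE_OK, the parameter names, and the example reason are hypothetical.

```python
from typing import Optional, Tuple

UPDATE_OK = "UPDATE_OK"                    # hypothetical success value
UPDATE_FATAL_ERROR = "UPDATE_FATAL_ERROR"  # name from the ticket; the real C value differs

GENERIC_STATUS = "Incremental update has failed and requires administrator action"


class Agreement:
    """Hypothetical stand-in for a replication agreement."""

    def __init__(self) -> None:
        self.last_update_status: Optional[Tuple[int, int, str]] = None

    def set_last_update_status(self, ldap_rc: int, repl_rc: int, message: str) -> None:
        # Parameter names are assumptions; the ticket only shows the values -1 and 0.
        self.last_update_status = (ldap_rc, repl_rc, message)


def send_updates(fail_reason: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (result, reason); the reason describes why a fatal error occurred."""
    if fail_reason is not None:
        return UPDATE_FATAL_ERROR, fail_reason
    return UPDATE_OK, None


def run_incremental_update(agmt: Agreement, fail_reason: Optional[str] = None) -> None:
    result, reason = send_updates(fail_reason)
    if result == UPDATE_FATAL_ERROR:
        # Prefer the specific reason; fall back to the generic text if none was given.
        agmt.set_last_update_status(-1, 0, reason or GENERIC_STATUS)
```

Calling run_incremental_update(agmt, "some specific reason") leaves that reason in agmt.last_update_status, which is the behavior the ticket asks for.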

Comment 2 Viktor Ashirov 2016-09-01 08:56:33 UTC
This change breaks repl-monitor.pl, because it treats every status containing the word "Error" as a failure and marks it in red (see attached html):

[root@ibm-x3650m4-02-vm-03 ~]# ldapsearch -D "cn=directory manager" -w Secret123 -h localhost -p 38003 -x -o ldif-wrap=no -b "cn=config" nsds5replicaLastUpdateStatus | grep -i nsds5replicaLastUpdateStatus
# requesting: nsds5replicaLastUpdateStatus 
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update started
nsds5replicaLastUpdateStatus: Error (0) Replica acquired successfully: Incremental update succeeded
	
./ldap/admin/src/scripts/repl-monitor.pl.in:
    if ($status =~ /error/i) {
        $redfontstart = "<font color='red'>";
        $redfontend = "</font>";
    }
    elsif ($status =~ /^(\d+) /) {
        if ( $1 != 0 ) {
            # warning
            $redfontstart = "<font color='#FF7777'>";
            $redfontend = "</font>";
        }
    }
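
One possible way to avoid this false positive is to look at the numeric code in parentheses rather than at the word "Error". The Python sketch below only illustrates that idea; it is not the upstream fix. The function name classify_status and the "ok"/"error" labels are hypothetical, and it assumes the "Error (N) message" format shown in the ldapsearch output above.

```python
import re

# Matches the "Error (N)" prefix of an nsds5replicaLastUpdateStatus value.
_CODE_RE = re.compile(r"^Error \((-?\d+)\)")


def classify_status(status: str) -> str:
    """Return "ok" or "error" for a status string (hypothetical labels)."""
    match = _CODE_RE.match(status)
    if match:
        # A code of 0 means success, even though the text starts with "Error".
        return "ok" if int(match.group(1)) == 0 else "error"
    # No recognizable code: fall back to the keyword check.
    return "error" if re.search("error", status, re.IGNORECASE) else "ok"
```

With this check, "Error (0) Replica acquired successfully: Incremental update succeeded" is classified as "ok", while any status with a nonzero code is still flagged.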

Comment 4 mreynolds 2016-09-01 17:00:19 UTC
Fixed upstream

Comment 5 Sankar Ramalingam 2016-09-13 13:41:16 UTC
1). Checked replication monitor output from reliab15 report and it shows no problem with the error messages for successful replica status.

http://storm.idmqe.lab.eng.bos.redhat.com/qa/archive/ds/rhel73/reliab15/run10_1.3.5.10-10.el7.bz1321124-baseline/out/repl-monitor-output/09100421.html

2). Checked the replica status after changing the replication agreement to use a few wrong values. It now shows distinct error messages:
[root@ratangad MMR_WINSYNC]# ldapsearch -D "cn=directory manager" -w Secret123 -h localhost -p 1189 -x -o ldif-wrap=no -b "cn=mapping tree,cn=config" nsds5replicaLastUpdateStatus | grep -i nsds5replicaLastUpdateStatus
# requesting: nsds5replicaLastUpdateStatus 
nsds5replicaLastUpdateStatus: Error (-1) Problem connecting to replica - LDAP error: Can't contact LDAP server (connection error)
nsds5replicaLastUpdateStatus: Error (32) Problem connecting to replica - LDAP error: No such object (connection error)
nsds5replicaLastUpdateStatus: Error (-1) Problem connecting to replica - LDAP error: Can't contact LDAP server (connection error)
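
For scripted checks, output like the above can be split into a numeric code and a message. The following Python sketch assumes the "Error (N) message" format shown in these results; the function name parse_statuses is hypothetical.

```python
import re
from typing import List, Tuple

# One status line as printed by ldapsearch with ldif-wrap=no.
_LINE_RE = re.compile(
    r"^nsds5replicaLastUpdateStatus:\s*Error \((-?\d+)\)\s*(.*)$",
    re.IGNORECASE,
)


def parse_statuses(ldif_text: str) -> List[Tuple[int, str]]:
    """Extract (code, message) pairs from ldapsearch output."""
    results = []
    for line in ldif_text.splitlines():
        match = _LINE_RE.match(line.strip())
        if match:  # comment lines such as "# requesting: ..." do not match
            results.append((int(match.group(1)), match.group(2)))
    return results
```

Filtering the result for nonzero codes, for example `[s for s in parse_statuses(output) if s[0] != 0]`, lists only the agreements that report a failure.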

Based on the results above, marking the bug as Verified.

Comment 6 Sankar Ramalingam 2016-09-13 13:41:57 UTC
[root@ratangad MMR_WINSYNC]# rpm -qa |grep -i 389-ds-base
389-ds-base-1.3.5.10-10.el7.x86_64
389-ds-base-debuginfo-1.3.5.10-6.el7.x86_64
389-ds-base-libs-1.3.5.10-10.el7.x86_64
389-ds-base-devel-1.3.5.10-10.el7.x86_64

Comment 7 Sankar Ramalingam 2016-09-13 18:40:45 UTC
I changed the replication agreement's credentials to test that the replicas report the proper update status.

The error message is misleading: it reports both error 49 and errno 0. Is this expected behavior?

[13/Sep/2016:23:58:09.079096268 +051800] slapi_ldap_bind - Error: could not bind id [cn=SyncManager,cn=config] authentication mechanism [SIMPLE]: error 49 (Invalid credentials) errno 0 (Success)
[13/Sep/2016:23:58:11.059955989 +051800] NSMMReplicationPlugin - Finished total update of replica "agmt="cn=1189_to_1389_on_ratangad.eng.blr.redhat.com" (ratangad:1389)". Sent 126 entries.
[14/Sep/2016:00:00:56.279623167 +051800] slapi_ldap_bind - Error: could not bind id [cn=SyncManager,cn=config] authentication mechanism [SIMPLE]: error 49 (Invalid credentials) errno 0 (Success)

Comment 8 mreynolds 2016-09-13 18:52:22 UTC
(In reply to Sankar Ramalingam from comment #7)

This is probably a bug in that log message, but it has nothing to do with this bug. This bug was about improving the replication agreement status message (not the error log messages).

Comment 9 Sankar Ramalingam 2016-09-14 14:44:43 UTC
(In reply to mreynolds from comment #8)

Thanks, Mark! I filed a bug for the error message: https://bugzilla.redhat.com/show_bug.cgi?id=1376057

Comment 13 errata-xmlrpc 2016-11-03 20:45:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2594.html