Bug 962885

Summary: RHEL 6.2 to 6.4 ipa upgrade selinuxusermap data not replicating
Product: Red Hat Enterprise Linux 6 Reporter: Scott Poore <spoore>
Component: 389-ds-base Assignee: Rich Megginson <rmeggins>
Status: CLOSED ERRATA QA Contact: Sankar Ramalingam <sramling>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.4 CC: jgalipea, jrusnack, lnovich, mkosek, nhosoi, nkinder, rmeggins
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 389-ds-base-1.2.11.15-22.el6 Doc Type: Bug Fix
Doc Text:
Cause: IPA upgrade changes nsslapd-port to 0 for security reasons, to completely deny any traffic on the unencrypted port.
Consequence: The nsslapd-port is also used to construct the RUV used by replication. The replication startup code checks the existing RUV against the current hostname and port, finds that it is changed, removes the RUV, and removes the changelog. This causes a loss of changes and a partial reset of the state of the replica. A supplier will then attempt to send changes that already exist. A certain combination of these will cause the supplier to get into an endless loop attempting to send the same duplicate update over and over again.
Fix: At replication startup, if the port number is 0, do not remove the RUV element for the localhost, just assume the port number should not be changed.
Result: Changing the nsslapd-port to 0 will not remove the local replica from the RUV, and replication will continue to work.
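The startup check described in the Doc Text can be sketched roughly as follows. This is only an illustrative model in Python, not the server's C code: the function name `keep_local_ruv_element` and its parameters are hypothetical, and the real check may differ in detail.

```python
from urllib.parse import urlparse


def keep_local_ruv_element(ruv_purl: str, current_host: str, current_port: int) -> bool:
    """Return True if the local RUV element (and the changelog) should be kept.

    Hypothetical helper that models the replication startup check.
    """
    if current_port == 0:
        # Behaviour after the fix: port 0 means the unencrypted port was
        # disabled, not that the replica moved, so keep the RUV element.
        return True
    parsed = urlparse(ruv_purl)
    # Otherwise the element is kept only if host and port still match.
    return parsed.hostname == current_host.lower() and parsed.port == current_port


purl = "ldap://ipaqavma.testrelm.com:389"
print(keep_local_ruv_element(purl, "ipaqavma.testrelm.com", 389))  # unchanged port
print(keep_local_ruv_element(purl, "ipaqavma.testrelm.com", 0))    # port set to 0 by the upgrade
print(keep_local_ruv_element(purl, "ipaqavma.testrelm.com", 636))  # port really changed
```

Without the `current_port == 0` branch, the second call would compare 0 against 389, report a mismatch, and lead to the RUV and changelog removal described under Consequence.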
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-11-21 21:07:37 UTC Type: Bug

Description Scott Poore 2013-05-14 17:13:51 UTC
Description of problem:

When I upgrade an IPA replica before the master, I don't see selinuxusermap entries added on master replicated to replica.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-11.el6.x86_64
ipa-server-3.0.0-26.el6_4.2.x86_64

How reproducible:
Seen consistently in upgrade test automation, but not consistently in manual testing.

Steps to Reproduce:

1. On RHEL6.2, install IPA server, then replica, then client
2. add repo config pointing to a rhel6.4 repo on all servers
3. Upgrade IPA client with yum update ipa-server
4. Upgrade IPA replica with yum update ipa-server
5. Upgrade IPA master with yum update ipa-server
On MASTER:
6. ipa user-add seluser1 --first=sel --last=user1
7. ipa selinuxusermap-add   --hostcat=all --selinuxuser=staff_u:s0-s0:c0.c1023 serule1
8. ipa selinuxusermap-add-user --users=seluser1 serule1
On REPLICA:
9. ipa selinuxusermap-show serule1

NOTE: In the automated tests where this has been seen, it occurs after a few earlier install/upgrade/uninstall tests in which the error was not seen.

Actual results:
Not able to see, on the replica, the selinuxusermap entries that were created on the master.

[root@ipaqavmc slapd-TESTRELM-COM]# ipa selinuxusermap-show serule1
ipa: ERROR: serule1: SELinux User Map rule not found


Expected results:
Should see it on replica, same as master:

[root@ipaqavmb slapd-TESTRELM-COM]# ipa selinuxusermap-show serule1
  Rule name: serule1
  SELinux User: staff_u:s0-s0:c0.c1023
  Host category: all
  Enabled: TRUE
  Users: jordan

Additional info:

Comment 2 Scott Poore 2013-05-14 17:31:49 UTC
Created attachment 747768 [details]
slapd log from master

Comment 3 Scott Poore 2013-05-14 17:32:09 UTC
Created attachment 747769 [details]
slapd log from replica

Comment 7 Martin Kosek 2013-05-15 08:51:03 UTC
Is this a 389-ds-base issue? Should I change the component?

Comment 8 Rich Megginson 2013-05-15 13:37:09 UTC
Yes. The directory server supplier is sending over the same changes twice. The reason is that the RUV returned from the consumer (the Consumer RUV in the supplier error log) is bogus: the RUV element for the supplier (rid 4) is empty and even has the wrong port number in the pURL. The RUV element for the supplier should contain the max CSN of the most recent changes sent over. What I find odd is that we did a full complement of replication testing for RHEL 6.4 and did not see this issue.

Comment 9 Martin Kosek 2013-05-15 13:46:44 UTC
Ok. Changing the component. This bug does sound like something we would want for RHEL-6.5 (also based on your additional investigation).

Comment 10 Rich Megginson 2013-05-15 15:50:44 UTC
This is a case of a duplicate ADD - the entries were added directly to the replica earlier:

[14/May/2013:12:57:00 -0400] conn=7 op=25 ADD dn="cn=selinux,dc=testrelm,dc=com"
[14/May/2013:12:57:00 -0400] conn=7 op=25 RESULT err=0 tag=105 nentries=0 etime=0 csn=51926cdd000000030000

The replica was unable to send this change to the master because there was a problem with replication:

[14/May/2013:12:57:00 -0400] slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -2 (Local error)
[14/May/2013:12:57:00 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Replication bind with GSSAPI auth failed: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Cannot determine realm for numeric host address))

Replication resumes:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Replication bind with GSSAPI auth resumed
Schema replication issue:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Warning: unable to replicate schema: rc=1

The above add, and several other changes, appear to be missing from the supplier RUV:

[14/May/2013:12:59:12 -0400] - _cl5PositionCursorForReplay (agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389)): Supplier RUV:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replicageneration} 51926881000000040000
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replica 3 ldap://ipaqavma.testrelm.com:389} 51926d51000000030000 51926d61000000030000 51926d60

Note that the min csn in the RUV element for the replica (rid 3) is 51926d51000000030000, which is greater than 51926cdd000000030000.  But the consumer has this:

[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replica 3 ldap://ipaqavma.testrelm.com:389} 51926887000800030000 51926cc4000000030000 00000000

51926cc4000000030000 is less than 51926cdd000000030000, so the master has not seen that change yet.
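For readers who want to check the CSN ordering above by hand, here is a minimal Python sketch. It assumes the usual 389-ds CSN string layout of 20 hex digits (8 for the timestamp, 4 for the sequence number, 4 for the replica ID, 4 for the sub-sequence number); the `CSN` type and `parse_csn` helper are names made up for this illustration.

```python
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: int   # seconds since the epoch
    seqnum: int      # sequence number within that second
    rid: int         # replica ID where the change originated
    subseqnum: int   # sub-sequence number


def parse_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields (assumed layout)."""
    if len(text) != 20:
        raise ValueError("expected a 20-character CSN, got %r" % text)
    return CSN(
        int(text[0:8], 16),
        int(text[8:12], 16),
        int(text[12:16], 16),
        int(text[16:20], 16),
    )


local_add = parse_csn("51926cdd000000030000")     # the ADD on cn=selinux
consumer_max = parse_csn("51926cc4000000030000")  # max CSN for rid 3 in the consumer RUV
supplier_min = parse_csn("51926d51000000030000")  # min CSN for rid 3 in the supplier RUV

print(local_add.rid)
# Tuples compare field by field, so this checks the ordering described above.
print(consumer_max < local_add < supplier_min)
```

The ADD falls strictly between the consumer's max CSN and the supplier's min CSN, which is the gap discussed in this comment.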

The replica attempts to replay these changes to the master:
[14/May/2013:12:59:12 -0400] agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389) - session start: anchorcsn=51926cc4000000030000
[14/May/2013:12:59:12 -0400] agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389) - clcache_load_buffer: rc=-30988
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - changelog program - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): CSN 51926aa6000000040000 not found and no purging, probably a reinit

So, for some reason, the replica doesn't have 51926cc4000000030000.  This is a change which originated on the replica (rid 3) - not sure why it isn't found.  Since it isn't found, it tries to use the min csn from the supplier, which is 51926d51000000030000, which skips the changes made earlier.
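The fallback described above can be modelled in a few lines of Python. This is a deliberately simplified sketch, not the changelog code: `replay_start` is a hypothetical helper, the changelog is assumed to be a plain list of CSN strings for rid 3, and the list contents are just the CSNs quoted in this comment.

```python
from typing import List


def replay_start(changelog: List[str], consumer_max: str, supplier_min: str) -> str:
    """Pick the CSN at which a replication session starts (simplified model)."""
    if consumer_max in changelog:
        # Normal case: resume right after what the consumer already has.
        return consumer_max
    # Anchor not found: fall back to the supplier's min CSN.
    return supplier_min


# Assumed changelog contents after the reset: only the supplier RUV min and max.
changelog = ["51926d51000000030000", "51926d61000000030000"]
consumer_max = "51926cc4000000030000"
supplier_min = "51926d51000000030000"

anchor = replay_start(changelog, consumer_max, supplier_min)

# Changes that originated on rid 3; the first one is the ADD on cn=selinux.
originated = [
    "51926cdd000000030000",
    "51926d51000000030000",
    "51926d61000000030000",
]
# Fixed-width lowercase hex strings sort in CSN order, so plain string
# comparison is enough to find what lies between the two positions.
skipped = [csn for csn in originated if consumer_max < csn < anchor]

print(anchor)
print(skipped)
```

In this model the session starts at the supplier's min CSN, and the ADD with CSN 51926cdd000000030000 lands in the skipped range.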

Comment 11 Nathan Kinder 2013-05-15 16:01:37 UTC
Upstream ticket:
https://fedorahosted.org/389/ticket/47362

Comment 17 Rich Megginson 2013-05-20 15:20:31 UTC
TET RHEL64
Sending        testcases/DS/6.0/mmrepl/accept/accept.sh
Transmitting file data .
Committed revision 7625.

Comment 18 Rich Megginson 2013-05-20 15:23:00 UTC
TET trunk
Sending        testcases/DS/6.0/mmrepl/accept/accept.sh
Transmitting file data .
Committed revision 7626.

Comment 24 Sankar Ramalingam 2013-08-19 09:41:25 UTC
As per comment #17, marking this bug with qe_test_coverage+ flag.

Comment 25 Ján Rusnačko 2013-08-28 09:51:44 UTC
TestCase [trac47362] result-> [PASS] 

with 389-ds-base-1.2.11.15-22

Comment 26 errata-xmlrpc 2013-11-21 21:07:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html