Hide Forgot
Description of problem: Testing IPA Upgrades, I'm seeing DNS errors (and later resolution stops working). I'm seeing a lot of named LDAP errors: Feb 4 02:59:35 vm-idm-023 named[16726]: LDAP instance 'ipa' is being synchronized, please ignore message 'all zones loaded' Feb 4 02:59:35 vm-idm-023 named[16726]: LDAP error: Critical extension is unavailable: unable to start SyncRepl session: is RFC 4533 supported by LDAP server? Feb 4 02:59:35 vm-idm-023 named[16726]: ldap_syncrepl will reconnect in 60 seconds Also, I just noticed that during the upgrade I see this: Add failure missing required attribute "objectclass" Looking in /var/log/ipaupgrade.log: 2015-02-03T20:52:12Z INFO New entry: cn=Content Synchronization,cn=plugins,cn=config 2015-02-03T20:52:12Z DEBUG --------------------------------------------- 2015-02-03T20:52:12Z DEBUG Initial value 2015-02-03T20:52:12Z DEBUG dn: cn=Content Synchronization,cn=plugins,cn=config 2015-02-03T20:52:12Z DEBUG only: set nsslapd-pluginEnabled to 'on', current value [] 2015-02-03T20:52:12Z DEBUG only: updated value ['on'] 2015-02-03T20:52:12Z DEBUG --------------------------------------------- 2015-02-03T20:52:12Z DEBUG Final value after applying updates 2015-02-03T20:52:12Z DEBUG dn: cn=Content Synchronization,cn=plugins,cn=config 2015-02-03T20:52:12Z DEBUG nsslapd-pluginEnabled: 2015-02-03T20:52:12Z DEBUG on 2015-02-03T20:52:12Z ERROR Add failure missing required attribute "objectclass" 2015-02-03T20:52:12Z INFO Updating existing entry: cn=kerberos,dc=testrelm,dc=test Thierry Bordaz found that this is a problem with adding ReplSync support: " Both error messages (are related to the same problem). There is a failure to ADD the ReplSync config entry. [04/Feb/2015:02:22:11 +051800] conn=12 op=65 RESULT err=0 tag=101 nentries=1 etime=0 [04/Feb/2015:02:22:11 +051800] conn=12 op=66 SRCH base="cn=Content Synchronization,cn=plugins,cn=config" scope=0 filter="(objectClass=*)" attrs="aci * attributeTypes objectClasses" [04/Feb/2015:02:22:11 +051800] conn=12 op=66 RESULT err=32 tag=101 nentries=0 etime=0 [04/Feb/2015:02:22:11 +051800] conn=12 op=67 ADD dn="cn=Content Synchronization,cn=plugins,cn=config" [04/Feb/2015:02:22:12 +051800] conn=12 op=67 RESULT err=65 tag=105 nentries=0 etime=1 According to the returned message, the added entry was missing the 'objectclass' attribute and so the ADD failed. The later in ipaupgrade errors were reported related to DNS/named, because it requires ReplSync support. But as ReplSync was not configured/enabled, 389-ds was failing to decode ReplSync controls [04/Feb/2015:02:59:34 +051800] conn=100 op=3 SRCH base="(null)" scope=2 filter="(|(objectClass=idnsConfigObject)(objectClass=idnsZone)(objectClass=idnsForwardZone)(objectClass=idnsRecord))", failed to decode LDAP controls [04/Feb/2015:02:59:34 +051800] conn=100 op=3 RESULT err=12 tag=101 nentries=0 etime=0 So this DNS errors are a different bug (not related to schema update). This new bug seems to be that the ADD 'content synchronization' plugin entry was missing objectclass " Version-Release number of selected component (if applicable): ipa-server-4.1.0-16.el7.x86_64 389-ds-base-1.3.3.1-13.el7.x86_64 How reproducible: often but, not 100% consistent. Steps to Reproduce: 1. Setup IPA Master on RHEL 7.0 2. Setup IPA Replica on RHEL 7.0 3. Upgrade IPA Master to RHEL 7.1 4. Upgrade IPA Replica to RHEL 7.1 5. dig <MASTER> Actual results: After upgrade DNS no longer works on one of the IPA servers. And I see a lot of named ldap errors in /var/log/messages like listed above. Expected results: DNS works and resolves hostnames as expected with no ldap errors in log. Additional info:
Waiting on Scott, to provide reproduction or more logs for the failure. It should not happen under normal conditions, unless Directory Server upgrade script failed and did not create cn=Content Synchronization,cn=plugins,cn=config.
Created attachment 988985 [details] setup-ds.pl output
Content Syncronization did appear to be created on the master when upgraded: [root@vm-idm-003 ~]# ldapsearch -x -D "cn=Directory Manager" -w Secret123 -b "cn=Content Synchronization,cn=plugins,cn=config" # extended LDIF # # LDAPv3 # base <cn=Content Synchronization,cn=plugins,cn=config> with scope subtree # filter: (objectclass=*) # requesting: ALL # # Content Synchronization, plugins, config dn: cn=Content Synchronization,cn=plugins,cn=config objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject cn: Content Synchronization nsslapd-pluginPath: libcontentsync-plugin nsslapd-pluginInitfunc: sync_init nsslapd-pluginType: object nsslapd-pluginEnabled: off nsslapd-plugin-depends-on-named: Retro Changelog Plugin nsslapd-pluginId: none nsslapd-pluginVersion: none nsslapd-pluginVendor: none nsslapd-pluginDescription: none # search result search: 2 result: 0 Success # numResponses: 2 # numEntries: 1 What is the error from the ipaupgrade.log though? Is that not related here?
Ludwig investigated this bug, and he is pretty confident that it is a race condition between shutdown of ds and running the update script. Additional info from Ludwig's investigation: The DS upgrade runs in offline mode, so it will stop the servers and call setup-ds.pl. If the server is still running setup-ds exits and aborts upgrade. It looks like 389-ds-base rpm handles stopping dirsrv before running the setup-ds.pl script: /bin/systemctl stop dirsrv.target > $output 2>&1 || : echo remove pid files . . . > $output 2>&1 || : /bin/rm -f /var/run/dirsrv*.pid /var/run/dirsrv*.startpid # do the upgrade echo upgrading instances . . . > $output 2>&1 || : /usr/sbin/setup-ds.pl -l $output -u -s General.UpdateMode=offline > $output 2>&1 || : in a test on my machine I got: shutting down all instances . . . tail: /tmp/setup-389.log: file truncated remove pid files . . . tail: /tmp/setup-389.log: file truncated upgrading instances . . . tail: /tmp/setup-389.log: file truncated [15/02/10:16:47:16] - [Setup] Info Error: offline mode selected but the server [slapd-elkris] is still running. [15/02/10:16:47:16] - [Setup] Fatal Error: could not update the directory server. [15/02/10:16:47:16] - [Setup] Fatal Exiting . . . so the server should have been down, but it looks like it wasn't yet
Could it be the same issue as: https://fedorahosted.org/freeipa/ticket/4709#comment:6 ? systemctl stop dirsrv.target does not wait until instances are stopped
I guess it's possible that it's related to that freeipa ticket about dirsrv.target not waiting till it's complete. I was able to successfully get several reproductions today using some slower VMs that I thought I'd seen the problem on more often. That seemed to work. Is there a way I can modify dirsrv.target to force it to wait until everything stops before returning? If so, I can test that case against my slower VMs and see if it stops the error from occurring. Thanks, Scott
Hi Scott, Ludwig provided fixes on 389-ds-base.spec file. I've applied them with some modifications and created this scratch build. http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8734627 Could it be possible to run your upgrade test? Thanks! --noriko
Running tests now. I'll see how it goes. 20 runs should hopefully be enough to confirm. Especially since I saw in in 3/10 from the last batch. It'll take me some time to get that done though. I'll let you know when it is. Thanks, Scott
I've been having issues with test servers. I'm still trying to get good clean tests here. I'll give an update tomorrow to see how my tests are running.
FYI, I've been working through some test and test host issues. I have however been able to get a lot of good runs so far with no DNS errors that I was seeing here. I have 4 final test runs going now. When those finish, if I don't see this error, I think we're good. I'll update the bug when I have those results. Thanks, Scott
Ok, my last job finally finished and I have not seen this error again with the test fix. I think we can go with that fix. Let me know if you need me to test something else for this.
Verification steps provided by Scott is quite straight forward to verify from IPA. We can do that from DS QE team. Please share any other pointers to verify the same using standalone 389-ds-base.
(In reply to Sankar Ramalingam from comment #17) > Verification steps provided by Scott is quite straight forward to verify > from IPA. We can do that from DS QE team. Please share any other pointers to > verify the same using standalone 389-ds-base. There's only a difference is in the spec file which is being used in the upgrade. Before the fix, the next operation has started before shutting down the server was not completed. The fix makes the shutdown wait, then the next operation begins. We added debug options to the fix (in the 7.1.z build which hasn't been made yet), it might be useful to see the output you can get by doing... export DEBUGPOSTTRANS=/path/to/upgrade_log_file export DEBUGPOSTSETUPOPT=ddddd Run rpm with -U option. See /path/to/upgrade_log_file once the upgrade is done. You should be able to see this output in the log file. It does not prove the shutdown was completed in each systemctl stop, but at least we could say the code was executed. + echo stopping instance $inst >> $output 2>&1 || : + /bin/systemctl stop $inst >> $output 2>&1 || :
Hit the bug twice (from two *fresh* installation) using 389-ds-base-1.3.3.1-15.el7_1.x86_64 where it's supposed to be fixed Step to reproduce: 1) Install RHEL7 2) yum install ipa-server-3.3.3-28.el7_0.3.x86_64 bind bind-dyndb-ldap (need IPA3.X because of bug in RHEVM ) 3) ipa-server-install --setup-dns (no errors) dnsrecord-show works find and all records are in place but bind still doesn't resolve them. Plugins is disabled because 289ds was running at the moment of config file was modified
do you have the content sync plugin in the dse.ldif or not ?
I had it but it was disabled as well as Retro Changelog Plugin. Workaround was to enable both manually.
then it is a different problem, this bug was taht a failing DS upgrade did not add the plugin to the dse.ldif. If you have it, disabled or not, then some other step in ipa install is failing
Scott, have you seen recently DNS errors in your upgrade tests?
Viktor, So far we haven't seen this bug again when testing with RHEL7.2 upgrades. So, I think this can be closed. If we do run into this again during upgrade testing, we can reopen it. Let me know if you prefer I close this and I will. Thanks, Scott
Verified. Version :: 389-ds-base-1.3.4.0-9.el7.x86_64 ipa-server-4.2.0-3.el7.x86_64 Results :: So far we haven't seen these errors again with the patch.
(In reply to Scott Poore from comment #31) > Verified. > > Version :: > > 389-ds-base-1.3.4.0-9.el7.x86_64 > ipa-server-4.2.0-3.el7.x86_64 > > Results :: > > So far we haven't seen these errors again with the patch. Thank you, Scott!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2351.html