Bug 803930
Summary: | ipa not starting after upgrade because of missing data | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Scott Poore <spoore> |
Component: | 389-ds-base | Assignee: | Rich Megginson <rmeggins> |
Status: | CLOSED ERRATA | QA Contact: | IDM QE LIST <seceng-idm-qe-list> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.3 | CC: | jgalipea, mkosek, nhosoi, nkinder, rmeggins |
Target Milestone: | rc | Keywords: | TestBlocker |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | 389-ds-base-1.2.10.2-4.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause: During an upgrade, the directory server could be started before setup-ds.pl -u had upgraded the database.
Consequence: Error messages like the following appear in the errors log:
ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=TESTRELM.COM to RDN
Fix: Make sure upgrade does not start up the server until it has finished doing setup-ds.pl -u with the server off, to properly upgrade the database.
Result: Upgrades do not cause any error messages after starting the server.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2012-06-20 07:14:57 UTC | Type: | --- |
Attachments: |
Description
Scott Poore
2012-03-16 02:33:05 UTC
Created attachment 570467 [details]
ipaupgade.log file
Created attachment 570468 [details]
/var/log/dirsrv/slapd-TESTRELM-COM/errors log file
Problems occur between the final shutdown of 389-ds-base 1.2.9.14 and the start of 1.2.10.2. It looks like the db upgrade failed.

So is there a way on the current system to try to track that down, or will I need to start over and maybe set some debugging/loglevel for the dirsrv before the upgrade?

Also, this particular scenario was tested on a KVM guest that is a clone that has a working IPA server install. The clone starts off with its time off a bit, so before the yum upgrade I was running the following to correct the time:

service ntpd stop
service ntpdate start
service ntpd start
ipactl restart

I was doing that so that IPA components started with the correct time before I ran the upgrade. Could that have anything to do with this issue?

I don't think the time issue is related. We'll need to get the 389-ds-team input on how to proceed.

Created attachment 570635 [details]
db2ldif export of TESTRELM-COM instance
Ok, attached db export from the following command:

[root@spoore-dvm1 ~]# ns-slapd db2ldif -D /etc/dirsrv/slapd-TESTRELM-COM -s dc=testrelm,dc=com -a /tmp/TESTRELM-COM.export.ldif -r
[16/Mar/2012:10:21:21 -0500] - /etc/dirsrv/slapd-TESTRELM-COM/dse.ldif: nsslapd-maxdescriptors: nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 1024 (the current process limit). Server will use a setting of 1024.
[16/Mar/2012:10:21:21 -0500] - Config Warning: - nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 1024 (the current process limit). Server will use a setting of 1024.
[16/Mar/2012:10:21:21 -0500] - Backend Instance(s):
[16/Mar/2012:10:21:21 -0500] - userRoot
[16/Mar/2012:10:21:21 -0500] schema-compat-plugin - warning: no entries set up under ou=SUDOers, dc=testrelm,dc=com
[16/Mar/2012:10:21:21 -0500] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=testrelm,dc=com--no CoS Templates found, which should be added before the CoS Definition.
ldiffile: /tmp/TESTRELM-COM.export.ldif
[16/Mar/2012:10:21:21 -0500] - export userRoot: Processed 246 entries (100%).
[16/Mar/2012:10:21:21 -0500] - Waiting for 4 database threads to stop
[16/Mar/2012:10:21:22 -0500] - All database threads now stopped

Upstream ticket: https://fedorahosted.org/freeipa/ticket/2541

Unit test works fine. Steps:
1. Installed ipa-server-2.1.3-9.el6.x86_64 (w/ 389-ds-base.1.2.9.14).
2. Upgraded just 389-ds-base to 389-ds-base-1.2.10.4-1.el6.x86_64.

$ setup-ds.pl -u
[16/Mar/2012:23:01:22 -0700] - /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif: nsslapd-maxdescriptors: nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 1024 (the current process limit). Server will use a setting of 1024.
[16/Mar/2012:23:01:22 -0700] - Config Warning: - nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 1024 (the current process limit). Server will use a setting of 1024.
[16/Mar/2012:23:01:22 -0700] - check_and_set_import_cache: pagesize: 4096, pages: 2014359, procpages: 55536
[16/Mar/2012:23:01:22 -0700] - Import allocates 3222972KB import cache.
[16/Mar/2012:23:01:22 -0700] Upgrade DN Format - userRoot: Start upgrade dn format.
[16/Mar/2012:23:01:22 -0700] Upgrade DN Format - Instance userRoot in /var/lib/dirsrv/slapd-EXAMPLE-COM/db/userRoot is up-to-date
[...]
Finished successful update of directory server.
Please restart your directory servers.
Exiting . . .
Log file is '/tmp/setupXrP5vR.log'

It seems the failure could be the combination of IPA upgrades. Trying to reproduce the problem by upgrading the entire IPA system...

So when you run "ipa user-find", what do you get? And, do you have all output from the install and update? Could I see that to compare to what I'm seeing? You could attach or just email it directly to me.

Can you send me a copy of the /root/anaconda-ks.cfg file to see if I can use that to test?

I still have had no luck even if I upgrade 389-ds-base first. If SELinux is Enforcing, I get a known error from another BZ. If Permissive, I am now seeing something new (or something I hadn't noticed before). Now named isn't starting, causing ipactl to abort. I'm still looking into that one.

[root@ipaqavmc ~]# ipactl restart
Restarting Directory Service
Shutting down dirsrv:
    PKI-IPA... server already stopped[FAILED]
    TESTRELM-COM... server already stopped[FAILED]
  *** Error: 2 instance(s) unsuccessfully stopped[FAILED]
Starting dirsrv:
    PKI-IPA...[ OK ]
    TESTRELM-COM...[ OK ]
Restarting KDC Service
Stopping Kerberos 5 KDC: [FAILED]
Starting Kerberos 5 KDC: [ OK ]
Restarting KPASSWD Service
Stopping Kerberos 5 Admin Server: [FAILED]
Starting Kerberos 5 Admin Server: [ OK ]
Restarting DNS Service
Stopping named: [ OK ]
Starting named: [FAILED]
Failed to restart DNS Service
Shutting down
Stopping Kerberos 5 KDC: [ OK ]
Stopping Kerberos 5 Admin Server: [ OK ]
Stopping named: [ OK ]
Stopping httpd: [FAILED]
Stopping pki-ca: [ OK ]
Shutting down dirsrv:
    PKI-IPA...[ OK ]
    TESTRELM-COM...[ OK ]
Aborting ipactl

# rpm -q ipa-server 389-ds-base
ipa-server-2.2.0-4.el6.x86_64
389-ds-base-1.2.10.2-3.el6.x86_64
# getenforce
Enforcing
# rpm -qa | egrep -i --color selinux
selinux-policy-3.7.19-139.el6.noarch
selinux-policy-targeted-3.7.19-139.el6.noarch
ipa-server-selinux-2.2.0-4.el6.x86_64
pki-selinux-9.0.3-21.el6_2.noarch
libselinux-utils-2.0.94-5.2.el6.x86_64
libselinux-devel-2.0.94-5.2.el6.x86_64
libselinux-python-2.0.94-5.2.el6.x86_64
libselinux-2.0.94-5.2.el6.x86_64
# ipa user-find
--------------
1 user matched
--------------
  User login: admin
  Last name: Administrator
  Home directory: /home/admin
  Login shell: /bin/bash
  UID: 1609800000
  GID: 1609800000
  Account disabled: False
  Password: True
  Kerberos keys available: True
----------------------------
Number of entries returned 1
----------------------------

Created attachment 571251 [details]
anaconda-ks.cfg on the test VM
Another set of steps that worked fine:
1. Install ipa-server-2.1.3-9 w/ 389-ds-base.1.2.9.14-1.
2. Run ipa-server-install.
3. Upgrade ipa and 389-ds without running post scripts.
# rpm -Uvh --nopost --nopostun --nodeps ipa-*
Preparing...                ########################################### [100%]
   1:ipa-python             ########################################### [ 17%]
   2:ipa-client             ########################################### [ 33%]
   3:ipa-admintools         ########################################### [ 50%]
   4:ipa-server             ########################################### [ 67%]
   5:ipa-server-selinux     ########################################### [ 83%]
   6:ipa-debuginfo          ########################################### [100%]
# rpm -Uvh --nopost --nopostun --nodeps 389-ds-base*
Preparing...                ########################################### [100%]
   1:389-ds-base-libs       ########################################### [ 25%]
   2:389-ds-base            warning: /etc/sysconfig/dirsrv created as /etc/sysconfig/dirsrv.rpmnew
                            ########################################### [ 50%]
   3:389-ds-base-devel      ########################################### [ 75%]
   4:389-ds-base-debuginfo  ########################################### [100%]
# service dirsrv stop
4. Ran setup-ds.pl, which was successful.
Finished successful update of directory server.
Please restart your directory servers.
5. Ran the ipa upgrade scripts:
/sbin/chkconfig --add ipa
/usr/sbin/ipa-upgradeconfig
/usr/sbin/ipa-ldap-updater --upgrade
/sbin/service ipa condrestart
The last command restarted the DS successfully, and no errors were logged in the DS error log.

I'll take your anaconda-ks.cfg and try to rebuild my guest closer to that. One question though, was that everything? Not %packages section? I'll see if that yields anything for me. Thanks

I ran "/usr/sbin/ipa-ldap-updater --upgrade" before running setup-ds.pl, which broke the DS database:
# egrep get_and_add errors
[20/Mar/2012:03:03:49 -0700] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=EXAMPLE.COM to RDN
[...]
It seems the problem is that the ipa upgrade script (executed at the %post server phase in ipa.spec) runs prior to the 389-ds-base upgrade (setup-ds.pl -u, executed at the %posttrans phase).

(In reply to comment #24)
> I'll take your anaconda-ks.cfg and try to rebuild my guest closer to that. One
> question though, was that everything?
You mean the attached anaconda-ks.cfg? Yeah, that's it. I attached the file as is...
> Not %packages section?
I don't see it, indeed...

ok, then, I'll try to sort through the named issues I'm seeing when I tried upgrading 389-ds-base* before ipa*. I'll see if I can track that down or at least narrow down the problem to get a lead for where to look next. Thanks for all of your help here, Noriko.

Noriko, Did you use --setup-dns when running ipa-server-install? Thanks, Scott

Ok, I was able to get the upgrade to work when I did not include DNS for the setup/config of the IPA server. So, the upgrade with yum pointing to a 6.3 repo worked when I left out the DNS options and ran this for the initial install:
ipa-server-install --hostname=$hostname_s.$DOMAIN -r $RELM -n $DOMAIN -p $ADMINPW -P $ADMINPW -a $ADMINPW -U
Anyone know of DNS related upgrade issues? Looking further.

Scott, as you figured, I did not use --setup-dns in my testing. So, finally, we are sharing the same ground. ;)
I think you found 2 problems.
1. SELinux failure
2. DS database corruption in the upgrade
> I was able to get the upgrade to work when I did not include DNS for the
> setup/config of the IPA server
This means without --setup-dns, you could successfully upgrade IPA if you upgrade 389-ds-base first, then upgrade IPA, right? (If you upgrade 389-ds-base as a part of IPA, the upgrade still fails, doesn't it?)
If you run ipa-server-install with --setup-dns, which problem do you see, 1 (SELinux) or 2 (DS db)? If 2, even if you upgrade 389-ds-base first, then upgrade IPA, the upgrade fails???
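The manual sequence reported as working earlier in this thread (packages upgraded without scriptlets, dirsrv stopped, setup-ds.pl -u run, then the IPA upgrade scripts) can be sketched as a dry-run script. The commands are the ones quoted in the thread; DRY_RUN, run and upgrade_in_order are hypothetical names added only for illustration, and the rpm globs assume the new package files sit in the current directory. This is a sketch of the ordering, not the packaged fix.

```shell
#!/bin/bash
# Sketch only: replays the manual upgrade order quoted in this thread.
# DRY_RUN, run() and upgrade_in_order() are hypothetical helpers.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

upgrade_in_order() {
    # Install the new packages without running their scriptlets.
    run rpm -Uvh --nopost --nopostun --nodeps ipa-* || return 1
    run rpm -Uvh --nopost --nopostun --nodeps 389-ds-base* || return 1
    # Upgrade the database while the directory server is stopped.
    run service dirsrv stop || return 1
    run setup-ds.pl -u || return 1
    # Only now run the IPA upgrade scripts, which restart the server.
    run /sbin/chkconfig --add ipa || return 1
    run /usr/sbin/ipa-upgradeconfig || return 1
    run /usr/sbin/ipa-ldap-updater --upgrade || return 1
    run /sbin/service ipa condrestart || return 1
}

# Dry run by default: prints the commands in order without executing them.
upgrade_in_order
```

The point of the ordering is that setup-ds.pl -u appears before ipa-ldap-updater --upgrade; the reverse order is the one reported in this thread to break the database.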
I believe from the conversation we had earlier that it's been narrowed down, as Noriko mentioned, to the ipa updater running before the setup-ds.pl. Here's my response to the last questions though, and my test results (mostly from today or tests also re-run today for consistency):

Yes, without --setup-dns, it worked if I upgraded 389-ds-base first.

Think I found a fix/workaround for upgrades when DNS is configured: upgrade bind and bind-dyndb-ldap, then 389-ds-base, then ipa.

yum -y update bind bind-dyndb-ldap
yum -y update '389-ds-base*'
yum -y update 'ipa*'

Ok, listing possible combinations I've tried so far and successes/failures:

1.) Permissive w/ DNS; yum update ipa*
    Failed with DS database corruption. This is what I started with that led to opening this ticket.
2.) Enforcing w/ DNS; yum update ipa*
    Failed AVC denial. This is what I tried originally. This failed more than once related to bug 803054 which seems to relate to bug 799102. Ran it just now the same way I ran the other tests below and I get the DNS failure there.
3.) Permissive w/ DNS; yum update 389-ds-base*; yum update ipa*
    Failed with DNS related errors, unable to start named, as in comment 20.
4.) Enforcing w/ DNS; yum update 389-ds-base*; yum update ipa*
    Failed with DNS related errors, unable to start named, as in comment 20.
5.) Permissive w/ DNS; upgrading bind bind-dyndb-ldap, then 389-ds-base, then ipa.
    Success.
6.) Enforcing w/ DNS; upgrading bind bind-dyndb-ldap, then 389-ds-base, then ipa.
    Success.
7.) Permissive w/ DNS; yum update bind bind-dyndb-ldap; yum update ipa*
    Failed with DB corruption. dirsrv won't start.
8.) Enforcing w/ DNS; yum update bind bind-dyndb-ldap; yum update ipa*
    Failed with DB corruption. dirsrv won't start.
9.) Permissive w/o DNS; upgrading ipa (everything as dependency)
    Failed with DB corruption. dirsrv won't start.
10.) Enforcing w/o DNS; upgrading ipa (everything as dependency)
    Failed with DB corruption. dirsrv won't start.
11.) Permissive w/o DNS; upgrading 389-ds-base, then ipa.
    Success.
12.) Enforcing w/o DNS; upgrading 389-ds-base, then ipa.
    Success.

Great matrix! Thanks a lot, Scott!

ok - can reproduce with just 389-ds-base - here's how:
1. install the RHEL 6.2 389-ds-base (if building from upstream, use the latest 1.2.9 branch code)
2. create instance
3. enable usn plugin
4. add data from Example.ldif
5. stop ds
6. upgrade to RHEL 6.3 389-ds-base (if building from upstream, use 1.2.10 branch); if using rpm, do rpm --noscripts
7. start ds
   Note: the DBVERSION is now bdb/4.7/libback-ldbm/newidl/rdn-format-2/dn-4514 - so the server now thinks the entryrdn is using the new format even though we have not upgraded it yet
8. add index entries for freeipa/install/updates/20-indices.update
9. launch an index task for those attributes: memberuid memberof memberHost memberUser

In the errors log you will see errors like this:
[20/Mar/2012:19:22:51 -0600] - userRoot: Indexing attribute: memberuid
[20/Mar/2012:19:22:51 -0600] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN ou=People to RDN
[20/Mar/2012:19:22:51 -0600] - ldbm2index: Skip ID 4
[20/Mar/2012:19:22:51 -0600] - Parent entry (ID 4) of entry. (ID 6, rdn: uid=scarter) does not exist.
[20/Mar/2012:19:22:51 -0600] - We recommend to export the backend instance userRoot and reimport it.

Note that I did not have to delete anything - ou=People is not a tombstone.

I think the basic problem is this: After upgrading the binaries, we must not start the server until the database has been upgraded. Unless we can guarantee this, we will have problems.

First quick tests, DB looks good:
1.) Permissive w/ DNS; yum update ipa*
    Failed with DNS error but no _get_and_add DB errors.
2.) Enforcing w/ DNS; yum update ipa*
    Failed with DNS error but no _get_and_add DB errors.
After the failure, I upgraded bind and bind-dyndb-ldap for both and the problem was resolved.

# rpm -q bind bind-dyndb-ldap
bind-9.8.2-0.6.rc1.el6.x86_64
bind-dyndb-ldap-1.1.0-0.3.b1.el6.x86_64

Created attachment 571874 [details]
Manual Results Verified for Permissive with DNS and yum update bind then yum update ipa
Created attachment 571875 [details]
Manual Results Verified for Enforcing with DNS and yum update bind then yum update ipa
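The package ordering that succeeded in the test matrix earlier in this thread (cases 5 and 6 with DNS, cases 11 and 12 without) can be sketched as a small helper that prints the yum commands in order. The yum command lines are the ones quoted in the thread; update_plan and its "dns" argument are hypothetical names used only for illustration. It prints the plan and does not run yum.

```shell
#!/bin/bash
# Sketch only: update_plan is a hypothetical helper that prints the
# yum update order reported as successful in the test matrix.
update_plan() {
    # Pass "dns" if the IPA server was installed with --setup-dns.
    local dns="$1"
    if [ "$dns" = "dns" ]; then
        # With DNS configured, bind and bind-dyndb-ldap go first.
        echo "yum -y update bind bind-dyndb-ldap"
    fi
    # 389-ds-base is upgraded before ipa in both cases.
    echo "yum -y update '389-ds-base*'"
    echo "yum -y update 'ipa*'"
}

# Print the plan for a server installed with --setup-dns.
update_plan dns
```

Upgrading 389-ds-base before ipa lets its database upgrade finish before the IPA upgrade scripts start the directory server.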
Verified.

Version :: 389-ds-base-1.2.10.2-4.el6.x86_64

Manual Test Results ::

1.) Permissive w/ DNS; yum update ipa*
Success when upgrading bind and bind-dyndb-ldap before ipa*. No dirsrv errors seen:
[root@spoore-dvm1 ~]# rpm -q ipa-server 389-ds-base bind bind-dyndb-ldap
ipa-server-2.2.0-4.el6.x86_64
389-ds-base-1.2.10.2-4.el6.x86_64
bind-9.8.2-0.6.rc1.el6.x86_64
bind-dyndb-ldap-1.1.0-0.3.b1.el6.x86_64
[root@spoore-dvm1 ~]# getenforce
Permissive
[root@spoore-dvm1 ~]# egrep _get_and_add /var/log/dirsrv/slapd-TESTRELM-COM/errors
[root@spoore-dvm1 ~]#
See attachment 571874 [details] for full results.

2.) Enforcing w/ DNS; yum update ipa*
[root@spoore-dvm2 ~]# rpm -q ipa-server 389-ds-base bind bind-dyndb-ldap
ipa-server-2.2.0-4.el6.x86_64
389-ds-base-1.2.10.2-4.el6.x86_64
bind-9.8.2-0.6.rc1.el6.x86_64
bind-dyndb-ldap-1.1.0-0.3.b1.el6.x86_64
[root@spoore-dvm2 ~]# getenforce
Enforcing
[root@spoore-dvm2 ~]# egrep _get_and_add /var/log/dirsrv/slapd-TESTRELM-COM/errors
[root@spoore-dvm2 ~]#
See attachment 571875 [details] for full results.

Since no DB errors were seen with these upgrades, this can be verified and closed.

Upstream spec fixed as well:
master: https://fedorahosted.org/freeipa/changeset/00ce15b7442914be859c9e0912d0d02a836fe649
ipa-2-2: https://fedorahosted.org/freeipa/changeset/92961a6aa6cdfcd31037fc1f200411ed3103577a

*** Bug 803452 has been marked as a duplicate of this bug. ***

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: Upgrading directory server.
Consequence: See error messages like this in the errors log:
ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=TESTRELM.COM to RDN
Fix: Make sure upgrade does not start up the server until it has finished doing setup-ds.pl -u with the server off, to properly upgrade the database.
Result: Upgrades do not cause any error messages after starting the server.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-0813.html |
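The verification in this report comes down to searching the dirsrv errors log for the _get_and_add_parent_rdns message. A minimal sketch of that check follows; count_rdn_errors is a hypothetical helper name, and the demo uses a temporary file holding one error line quoted in this thread rather than a real server log.

```shell
#!/bin/bash
# Sketch only: count_rdn_errors is a hypothetical helper that counts
# lines carrying the error signature discussed in this report.
count_rdn_errors() {
    local log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so the status is ignored here.
    grep -c '_get_and_add_parent_rdns' "$log" || true
}

# Demo on a temporary file containing one error line from this report.
tmp="$(mktemp)"
printf '%s\n' '[20/Mar/2012:03:03:49 -0700] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=EXAMPLE.COM to RDN' > "$tmp"
count_rdn_errors "$tmp"   # prints 1
rm -f "$tmp"
```

On an upgraded server the argument would be the instance's errors log, for example /var/log/dirsrv/slapd-TESTRELM-COM/errors as used in the verification above; a count of 0 corresponds to the empty egrep output shown there.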