Bug 803452

Summary: ipa upgrade with services down errors on configuring ipa_memcached
Product: Red Hat Enterprise Linux 6
Reporter: Scott Poore <spoore>
Component: ipa
Assignee: Rob Crittenden <rcritten>
Status: CLOSED DUPLICATE
QA Contact: IDM QE LIST <seceng-idm-qe-list>
Severity: unspecified
Priority: unspecified
Version: 6.3
CC: jgalipea, mkosek
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2012-03-27 19:46:57 UTC
Attachments (description, flags):
  full yum update output, none
  ipaupgade.log file, none
  dirsrv error log, none

Description Scott Poore 2012-03-14 18:20:58 UTC
Created attachment 570061
full yum update output

Description of problem:

Upgrading IPA to version 2.2.0-3 on RHEL 6.2 fails with an error while configuring ipa_memcached.

In this case, services were down before the upgrade, /var/run/slapd-TESTRELM-COM.socket was removed, and resolv.conf was pointed to a different name server.

ipactl stop
rm /var/run/slapd-TESTRELM-COM.socket
yum update 'ipa*'

Version-Release number of selected component (if applicable):
RHEL6.2 build with IPA 2.1.3 being upgraded to 2.2.0-3.

How reproducible:
Currently unknown.  First time seeing this and haven't reproduced yet.

Steps to Reproduce:
1.  <Start with RHEL6.2 build>
2.  <setup IPA 2.1.3 server from base OS repos>
3.  ipactl stop
4.  rm /var/run/slapd-<REALMINSTANCE>.socket
5.  <add RHEL6.3 repos and/or repos containing IPA 2.2.0-3 rpms>
6.  vi /etc/resolv.conf # point to known good DNS server if necessary
7.  yum -y update 'ipa*'
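
The command-line portion of these steps (3, 4 and 7) can be sketched as a small Python helper. This is only an illustration: the function name and the choice to return argument lists are assumptions, not part of the original test, and steps 1, 2, 5 and 6 are left out because they depend on the lab setup.

```python
def reproduction_commands(realm_instance):
    """Return the commands from steps 3, 4 and 7 as argument lists.

    realm_instance is the dirsrv instance name, e.g. "TESTRELM-COM".
    Nothing is executed here; the caller decides whether to run them.
    """
    socket_path = "/var/run/slapd-%s.socket" % realm_instance
    return [
        ["ipactl", "stop"],               # step 3
        ["rm", socket_path],              # step 4
        ["yum", "-y", "update", "ipa*"],  # step 7
    ]
```

The helper only builds the commands. Running them on a live server stops IPA and removes the LDAPI socket, so they should only be passed to something like subprocess.call on a disposable test machine.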
  
Actual results:

Error and traceback seen:

<snip>

  Updating   : ipa-server-2.2.0-3.el6.x86_64                              21/44 
Upgraded /etc/httpd/conf.d/ipa.conf to version 4
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
Traceback (most recent call last):
  File "/usr/sbin/ipa-upgradeconfig", line 289, in <module>
    sys.exit(main())
  File "/usr/sbin/ipa-upgradeconfig", line 282, in main
    memcache.create_instance('MEMCACHE', fqdn, None, ipautil.realm_to_suffix(krbctx.default_realm))
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 325, in create_instance
    self.start_creation("Configuring %s" % self.service_name)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 257, in start_creation
    method()
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 338, in __enable
    self.dm_password, self.suffix)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 311, in ldap_enable
    self.admin_conn.addEntry(entry)
  File "/usr/lib/python2.6/site-packages/ipaserver/ipaldap.py", line 496, in addEntry
    self.__handle_errors(e, arg_desc=arg_desc)
  File "/usr/lib/python2.6/site-packages/ipaserver/ipaldap.py", line 312, in __handle_errors
    raise errors.NotFound(reason=arg_desc)
ipalib.errors.NotFound: entry=dn: cn=MEMCACHE,cn=dell-pe2950-01.testrelm.com,cn=masters,cn=ipa,cn=etc,dc=testrelm,dc=com
cn: MEMCACHE
ipaconfigstring: enabledService
ipaconfigstring: startOrder 39
objectclass: nsContainer
objectclass: ipaConfigObject
</snip>
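
For reference, the DN in the NotFound message can be rebuilt from the realm and host name. The sketch below is a stand-alone approximation: realm_to_suffix is a local re-implementation named after the ipautil helper seen in the traceback, while service_entry_dn and parent_dn are hypothetical names. An LDAP add is normally rejected with "no such object" when the parent entry does not exist, so one plausible reading is that the cn=<fqdn>,cn=masters,... container was missing when the add was attempted.

```python
def realm_to_suffix(realm):
    """Turn a Kerberos realm such as "TESTRELM.COM" into "dc=testrelm,dc=com"."""
    return ",".join("dc=" + part.lower() for part in realm.split("."))


def service_entry_dn(service, fqdn, suffix):
    """Build a service DN using the layout shown in the NotFound message."""
    return "cn=%s,cn=%s,cn=masters,cn=ipa,cn=etc,%s" % (service, fqdn, suffix)


def parent_dn(dn):
    """Naive parent lookup: drop the first RDN (assumes no escaped commas)."""
    return dn.split(",", 1)[1]
```

With realm "TESTRELM.COM" and host "dell-pe2950-01.testrelm.com" this reproduces the DN from the traceback, and parent_dn gives the container that would have to exist for the add to succeed.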


And, afterwards, IPA will not start:

[root@dell-pe2950-01 ipa-upgrade]# ipactl start
Starting Directory Service
Starting dirsrv: 
    PKI-IPA...[  OK  ]
    TESTRELM-COM...[  OK  ]
Failed to read data from Directory Service: Failed to get list of services to probe status!
Configured hostname 'dell-pe2950-01.testrelm.com' does not match any master server in LDAP:
No master found because of error: {'matched': 'dc=testrelm,dc=com', 'desc': 'No such object'}
Shutting down
Shutting down dirsrv: 
    PKI-IPA...[  OK  ]
    TESTRELM-COM...[  OK  ]
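
The 'matched' value in the error above is the part of the DN that the directory server could resolve. A rough way to see what is missing is to subtract it from the DN being looked up. The sketch below assumes the lookup targets the master entry under cn=masters,cn=ipa,cn=etc (a layout taken from the DN in the earlier traceback, not from the ipactl source). missing_rdns is a hypothetical helper and splits on commas naively.

```python
def missing_rdns(target_dn, matched_dn):
    """Return the RDNs of target_dn that lie below matched_dn.

    Naive comma split; assumes neither DN contains escaped commas.
    """
    target = [rdn.strip().lower() for rdn in target_dn.split(",")]
    matched = [rdn.strip().lower() for rdn in matched_dn.split(",")]
    cut = len(target) - len(matched)
    if cut < 0 or target[cut:] != matched:
        raise ValueError("matched DN is not an ancestor of the target DN")
    return target[:cut]
```

If only dc=testrelm,dc=com is matched, every RDN from cn=etc downward is reported as missing, which would be consistent with the suffix entry existing while the tree below it is unreadable.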

Expected results:

Clean upgrade and IPA can be started after upgrade.

Additional info:

/var/log/messages contains some KDC/LDAP messages from sssd:

Mar 14 13:18:44 dell-pe2950-01 [sssd[ldap_child[13376]]]: Failed to initialize credentials using keytab [(null)]: Cannot contact any KDC for realm 'TESTRELM.COM'. Unable to create GSSAPI-encrypted LDAP connection.
Mar 14 13:18:44 dell-pe2950-01 [sssd[ldap_child[13376]]]: Cannot contact any KDC for requested realm

Comment 1 Scott Poore 2012-03-14 18:22:31 UTC
Created attachment 570062
ipaupgade.log file

Comment 2 Scott Poore 2012-03-14 18:23:16 UTC
Created attachment 570063
dirsrv error log

Comment 3 Dmitri Pal 2012-03-16 21:57:50 UTC
Upstream ticket:
https://fedorahosted.org/freeipa/ticket/2543

Comment 4 Rob Crittenden 2012-03-23 15:38:42 UTC
I'm unable to reproduce this with 2.2.0-5. Can you re-test?

Comment 5 Scott Poore 2012-03-26 18:10:15 UTC
I have been able to reproduce this, but it was with the 1.2.10.2-3 version of 389-ds-base, which had the corruption issue from bug 803930. Which version of 389-ds-base did your test use? I am wondering if that is causing the issue here. Some of the errors look pretty close, I think...

Anyway, here's the traceback. Let me know if you want me to grab the logs too, or to try with a fixed/newer version of dirsrv.


  Updating   : ipa-server-2.2.0-5.el6.x86_64                                                                     33/70 
Upgraded /etc/httpd/conf.d/ipa.conf to version 4
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
Traceback (most recent call last):
  File "/usr/sbin/ipa-upgradeconfig", line 302, in <module>
    sys.exit(main())
  File "/usr/sbin/ipa-upgradeconfig", line 293, in main
    memcache.create_instance('MEMCACHE', fqdn, None, ipautil.realm_to_suffix(krbctx.default_realm))
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 325, in create_instance
    self.start_creation("Configuring %s" % self.service_name)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 257, in start_creation
    method()
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 338, in __enable
    self.dm_password, self.suffix)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 311, in ldap_enable
    self.admin_conn.addEntry(entry)
  File "/usr/lib/python2.6/site-packages/ipaserver/ipaldap.py", line 496, in addEntry
    self.__handle_errors(e, arg_desc=arg_desc)
  File "/usr/lib/python2.6/site-packages/ipaserver/ipaldap.py", line 312, in __handle_errors
    raise errors.NotFound(reason=arg_desc)
ipalib.errors.NotFound: entry=dn: cn=MEMCACHE,cn=dell-pe-sc1435-02.testrelm.com,cn=masters,cn=ipa,cn=etc,dc=testrelm,dc=com
cn: MEMCACHE
ipaconfigstring: enabledService
ipaconfigstring: startOrder 39
objectclass: nsContainer
objectclass: ipaConfigObject


  Updating   : ipa-server-selinux-2.2.0-5.el6.x86_64                                                             34/70

Comment 6 Rob Crittenden 2012-03-26 19:12:56 UTC
I isolated the upgrade to just ipa. I started with a fully updated system, installed ipa-2.1.3-9, installed the server, ran ipactl stop, then ran yum update ipa-server.

It would seem that the database is up because we didn't fail with a connect error. I'd be curious to know if the other data is readable.

Comment 7 Scott Poore 2012-03-27 01:45:53 UTC
I've tried this a few different ways, several times, and wasn't able to get it to work. The steps were roughly:

1. Build 6.2 server.
2. point to 6.3 repos, a repo for 2.2.0-5, a repo for 389-ds-base 1.2.10.2-4.
3. yum update
4. yum install bind expect krb5-workstation bind-dyndb-ldap krb5-pkinit-openssl
5. yum install ipa-server
6. yum remove ipa* pki* 
7. yum install krb5-server-ldap
8. remove new repo config files so server just points to 6.2 repos
9. yum install ipa-server
10. ipa-server-install --setup-dns --forwarder=$DNSFORWARD --hostname=$hostname_s.$DOMAIN -r $RELM -n $DOMAIN -p $ADMINPW -P $ADMINPW -a $ADMINPW -U

When I do this though it looks like it can't create the certificate server instance:


2012-03-26 21:11:41,920 DEBUG   [2/16]: configuring certificate server instance
2012-03-26 21:11:42,286 DEBUG args=/usr/bin/perl /usr/bin/pkisilent 'ConfigureCA' '-cs_hostname' 'ibm-wildhorse-01.testrelm.com' '-cs_port' '9445' '-client_certdb_dir' '/tmp/tmp-Il26De' '-client_certdb_pwd' XXXXXXXX '-preop_pin' 'oKMI3TdvP6EiRXCdoPeT' '-domain_name' 'IPA' '-admin_user' 'admin' '-admin_email' 'root@localhost' '-admin_password' XXXXXXXX '-agent_name' 'ipa-ca-agent' '-agent_key_size' '2048' '-agent_key_type' 'rsa' '-agent_cert_subject' 'CN=ipa-ca-agent,O=TESTRELM.COM' '-ldap_host' 'ibm-wildhorse-01.testrelm.com' '-ldap_port' '7389' '-bind_dn' 'cn=Directory Manager' '-bind_password' XXXXXXXX '-base_dn' 'o=ipaca' '-db_name' 'ipaca' '-key_size' '2048' '-key_type' 'rsa' '-key_algorithm' 'SHA256withRSA' '-save_p12' 'true' '-backup_pwd' XXXXXXXX '-subsystem_name' 'pki-cad' '-token_name' 'internal' '-ca_subsystem_cert_subject_name' 'CN=CA Subsystem,O=TESTRELM.COM' '-ca_ocsp_cert_subject_name' 'CN=OCSP Subsystem,O=TESTRELM.COM' '-ca_server_cert_subject_name' 'CN=ibm-wildhorse-01.testrelm.com,O=TESTRELM.COM' '-ca_audit_signing_cert_subject_name' 'CN=CA Audit,O=TESTRELM.COM' '-ca_sign_cert_subject_name' 'CN=Certificate Authority,O=TESTRELM.COM' '-external' 'false' '-clone' 'false'
2012-03-26 21:11:42,287 DEBUG stdout=libpath=/usr/lib64
#######################################################################
CRYPTO INIT WITH CERTDB:/tmp/tmp-Il26De
tokenpwd:XXXXXXXX
#############################################
Attempting to connect to: ibm-wildhorse-01.testrelm.com:9445
Exception in LoginPanel(): java.lang.NullPointerException
ERROR: ConfigureCA: LoginPanel() failure
ERROR: unable to create CA

#######################################################################

2012-03-26 21:11:42,287 DEBUG stderr=Exception: Unable to Send Request:java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
	at java.net.Socket.connect(Socket.java:546)
	at java.net.Socket.connect(Socket.java:495)
	at java.net.Socket.<init>(Socket.java:392)
	at java.net.Socket.<init>(Socket.java:235)
	at HTTPClient.sslConnect(HTTPClient.java:326)
	at ConfigureCA.LoginPanel(ConfigureCA.java:244)
	at ConfigureCA.ConfigureCAInstance(ConfigureCA.java:1157)
	at ConfigureCA.main(ConfigureCA.java:1672)
java.lang.NullPointerException
	at ConfigureCA.LoginPanel(ConfigureCA.java:245)
	at ConfigureCA.ConfigureCAInstance(ConfigureCA.java:1157)
	at ConfigureCA.main(ConfigureCA.java:1672)

2012-03-26 21:11:42,287 CRITICAL failed to configure ca instance Command '/usr/bin/perl /usr/bin/pkisilent 'ConfigureCA' '-cs_hostname' 'ibm-wildhorse-01.testrelm.com' '-cs_port' '9445' '-client_certdb_dir' '/tmp/tmp-Il26De' '-client_certdb_pwd' XXXXXXXX '-preop_pin' 'oKMI3TdvP6EiRXCdoPeT' '-domain_name' 'IPA' '-admin_user' 'admin' '-admin_email' 'root@localhost' '-admin_password' XXXXXXXX '-agent_name' 'ipa-ca-agent' '-agent_key_size' '2048' '-agent_key_type' 'rsa' '-agent_cert_subject' 'CN=ipa-ca-agent,O=TESTRELM.COM' '-ldap_host' 'ibm-wildhorse-01.testrelm.com' '-ldap_port' '7389' '-bind_dn' 'cn=Directory Manager' '-bind_password' XXXXXXXX '-base_dn' 'o=ipaca' '-db_name' 'ipaca' '-key_size' '2048' '-key_type' 'rsa' '-key_algorithm' 'SHA256withRSA' '-save_p12' 'true' '-backup_pwd' XXXXXXXX '-subsystem_name' 'pki-cad' '-token_name' 'internal' '-ca_subsystem_cert_subject_name' 'CN=CA Subsystem,O=TESTRELM.COM' '-ca_ocsp_cert_subject_name' 'CN=OCSP Subsystem,O=TESTRELM.COM' '-ca_server_cert_subject_name' 'CN=ibm-wildhorse-01.testrelm.com,O=TESTRELM.COM' '-ca_audit_signing_cert_subject_name' 'CN=CA Audit,O=TESTRELM.COM' '-ca_sign_cert_subject_name' 'CN=Certificate Authority,O=TESTRELM.COM' '-external' 'false' '-clone' 'false'' returned non-zero exit status 255
2012-03-26 21:11:42,289 DEBUG Configuration of CA failed
  File "/usr/sbin/ipa-server-install", line 1151, in <module>
    sys.exit(main())

  File "/usr/sbin/ipa-server-install", line 954, in main
    subject_base=options.subject)

  File "/usr/lib/python2.6/site-packages/ipaserver/install/cainstance.py", line 537, in configure_instance
    self.start_creation("Configuring certificate server", 210)

  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 248, in start_creation
    method()

  File "/usr/lib/python2.6/site-packages/ipaserver/install/cainstance.py", line 680, in __configure_instance
    raise RuntimeError('Configuration of CA failed')

Comment 8 Scott Poore 2012-03-27 13:29:36 UTC
Ok, I'm testing again with a repo containing ipa 2.2.0-5 and 389-ds-base 1.2.10.2-4. Much better results this time. I did notice, though, that the steps were slightly off. Apparently deleting the socket file is not required, since I just realized that step was left out of the test I just ran.

Comment 9 Rob Crittenden 2012-03-27 13:40:25 UTC
Ah, yeah, this problem is related to the fact that the latest pki-silent package now does shell escaping. This is not compatible with ipa-2.1.3 which also does the escaping.

If you are up for it you can modify /usr/lib/python2.6/site-packages/ipaserver/install/cainstance.py and delete line 672 which reads:

            args[2:] = [ipautil.shell_quote(i) for i in args[2:]]

It will install.
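
To illustrate why escaping twice is a problem, here is a minimal sketch using Python's standard shlex module as a stand-in for both ipautil.shell_quote and whatever quoting pki-silent applies. Neither of those implementations is reproduced here, and quote_args is a hypothetical name. The subject string is one of the arguments from the pkisilent command line in comment 7.

```python
import shlex


def quote_args(args):
    """Shell-quote every argument once."""
    return [shlex.quote(arg) for arg in args]


subject = "CN=CA Audit,O=TESTRELM.COM"
once = quote_args([subject])   # quoted by one layer only
twice = quote_args(once)       # quoted again by a second layer

# A shell-style parse of the singly quoted form recovers the original value.
round_trip_once = shlex.split(once[0])
# The doubly quoted form parses back to the quoted string, with the
# single quotes now part of the argument itself.
round_trip_twice = shlex.split(twice[0])
```

After one round of quoting the receiving program sees the intended subject. After two rounds it sees a value wrapped in literal quote characters, which is the kind of mismatch that removing the extra quoting step avoids.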

Comment 10 Scott Poore 2012-03-27 17:26:53 UTC
Should I do that for an upgraded pki-silent before upgrading ipa?  And/or was that an ipa-only upgrade or everything?   

Did you see errors about ldap.SERVER_DOWN?

I installed 6.2 and IPA. Then I looked at the cainstance.py line, and it was there in the version (pki-silent-9.0.3-20.el6.noarch) included in RHEL 6.2. Then I pointed to 6.3 repos containing ipa-server-2.2.0-5 and 389-ds-base-1.2.10.2-3. Then I ran a full "yum update 'ipa*'" and got a stack trace with an ldap.SERVER_DOWN error.

Comment 11 Rob Crittenden 2012-03-27 17:44:33 UTC
The older pki-silent (pre 9.0.3-23) does not escape shell commands. In ipa 2.1.3 we have a call in cainstance.py to escape arguments. pki-silent 9.0.3-23+ escapes its own commands and will fail if ipa pre-escapes them, so in 2.2.0-5 we no longer do this.

In order to run an older ipa-server with a newer pki-ca you need to disable the shell escaping in 2.1.3 as outlined in c#9.
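
The version rule described here can be written down as a small check. This is a sketch under stated assumptions: the 9.0.3-23 threshold comes from this comment, the function names are hypothetical, and the comparison is a simplified numeric one rather than RPM's real version comparison algorithm.

```python
import re

# From the comment above: pki-silent 9.0.3-23 and later escape their own commands.
SELF_ESCAPING_SINCE = (9, 0, 3, 23)

# name-version-release.arch, e.g. "pki-silent-9.0.3-20.el6.noarch"
_NVRA = re.compile(
    r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$")


def version_key(nvra):
    """Return (major, minor, patch, release_number) for simple numeric versions."""
    match = _NVRA.match(nvra)
    if match is None:
        raise ValueError("not a name-version-release.arch string: %r" % nvra)
    version = tuple(int(part) for part in match.group("version").split("."))
    release = re.match(r"\d+", match.group("release"))
    if release is None:
        raise ValueError("release does not start with a number: %r" % nvra)
    return version + (int(release.group()),)


def caller_must_escape(pki_silent_nvra):
    """True when pki-silent is older than the build that escapes for itself."""
    return version_key(pki_silent_nvra) < SELF_ESCAPING_SINCE
```

For the RHEL 6.2 package mentioned in comment 10, pki-silent-9.0.3-20.el6.noarch, this returns True, meaning the escaping in ipa 2.1.3 is still needed with that build.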

The upgrade works flawlessly for me, upgrading just the ipa packages.

ldap.SERVER_DOWN is a different error than originally reported, can you include more details?

Comment 12 Scott Poore 2012-03-27 18:16:40 UTC
Ok, I did my test upgrading everything at once when I saw the ldap.SERVER_DOWN error.   So, I didn't have a newer pki-ca with ipa 2.1.3.  You were suggesting the change from comment 9 to fix the issue I mentioned in comment 7?

If so, I misunderstood.  I will have to go back to reproduce following comment 7 with your workaround for comment 9.

This is what I saw in the output (although I think it may be a self-inflicted issue from upgrading everything at once):

  Updating   : ipa-server-2.2.0-5.el6.x86_64                                                                     33/70 
Upgraded /etc/httpd/conf.d/ipa.conf to version 4
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
Traceback (most recent call last):
  File "/usr/sbin/ipa-upgradeconfig", line 302, in <module>
    sys.exit(main())
  File "/usr/sbin/ipa-upgradeconfig", line 293, in main
    memcache.create_instance('MEMCACHE', fqdn, None, ipautil.realm_to_suffix(krbctx.default_realm))
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 325, in create_instance
    self.start_creation("Configuring %s" % self.service_name)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 257, in start_creation
    method()
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 338, in __enable
    self.dm_password, self.suffix)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 297, in ldap_enable
    self.ldap_connect()
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 79, in ldap_connect
    self.admin_conn = self.__get_conn(None, None, ldapi=True, realm=self.realm)
  File "/usr/lib/python2.6/site-packages/ipaserver/install/service.py", line 290, in __get_conn
    raise e
ldap.SERVER_DOWN: {'desc': "Can't contact LDAP server"}
  Updating   : ipa-server-selinux-2.2.0-5.el6.x86_64                                                             34/70
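
The two kinds of traceback in this report fail at different points in ldap_enable: the earlier ones end in ipalib.errors.NotFound raised from addEntry, while this one ends in ldap.SERVER_DOWN raised from ldap_connect. A rough triage helper, with a hypothetical name and wording that is purely illustrative, could separate the two cases like this:

```python
def classify_upgrade_failure(traceback_text):
    """Very rough triage based on the exception name found in the traceback."""
    if "ldap.SERVER_DOWN" in traceback_text:
        # The connection itself failed, so no LDAP operation was attempted.
        return "directory server unreachable"
    if "ipalib.errors.NotFound" in traceback_text:
        # The connection worked, but the add was rejected by the server.
        return "directory server reachable, entry or parent missing"
    return "unknown"
```

This mirrors the reasoning in comment 6: a NotFound error implies the database was up, whereas SERVER_DOWN points at the server not being contactable at all.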

Comment 13 Scott Poore 2012-03-27 19:41:37 UTC
Ok, things look much better now.

Using your comment 9 fix, I was able to update everything and get ipa 2.1.3 installed/working:

# rpm -q ipa-server 389-ds-base bind bind-dyndb-ldap
ipa-server-2.1.3-9.el6.x86_64
389-ds-base-1.2.10.2-3.el6.x86_64
bind-9.8.2-0.6.rc1.el6.x86_64
bind-dyndb-ldap-1.1.0-0.4.b1.el6.x86_64

Note that's with the 1.2.10.2-3 version of 389-ds-base too.

Then I stopped everything and upgraded ipa only:
# ipactl stop
Stopping CA Service
Stopping pki-ca: [  OK  ]
Stopping HTTP Service
Stopping httpd: [  OK  ]
Stopping DNS Service
Stopping named: .[  OK  ]
Stopping KPASSWD Service
Shutting down ipa_kpasswd: [  OK  ]
Stopping KDC Service
Stopping Kerberos 5 KDC: [  OK  ]
Stopping Directory Service
Shutting down dirsrv: 
    PKI-IPA...[  OK  ]
    TESTRELM-COM...[  OK  ]

# yum update ipa-server

Far different results this time (i.e. it worked):
<snip>
  Updating   : ipa-server-2.2.0-5.el6.x86_64                                                      4/10 
Upgraded /etc/httpd/conf.d/ipa.conf to version 4
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
done configuring ipa_memcached.
  Updating   : ipa-server-selinux-2.2.0-5.el6.x86_64                                              5/10 
</snip>

And to confirm:

# ipactl status
Directory Service: RUNNING
KDC Service: RUNNING
KPASSWD Service: RUNNING
DNS Service: RUNNING
MEMCACHE Service: RUNNING
HTTP Service: RUNNING
CA Service: RUNNING

# ipa user-find
--------------
1 user matched
--------------
  User login: admin
  Last name: Administrator
  Home directory: /home/admin
  Login shell: /bin/bash
  UID: 797800000
  GID: 797800000
  Account disabled: False
  Password: True
  Kerberos keys available: True
----------------------------
Number of entries returned 1
----------------------------
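
The ipactl status check above could be automated along these lines. This is a minimal sketch that assumes the output format shown in this comment ("<name> Service: <state>"). The function names are hypothetical, and RUNNING is the only state taken from this report.

```python
def parse_ipactl_status(output):
    """Map each "<name> Service" line of ipactl status output to its state."""
    status = {}
    for line in output.splitlines():
        name, sep, state = line.partition(":")
        if sep and name.strip().endswith("Service"):
            status[name.strip()] = state.strip()
    return status


def all_running(status):
    """True only if at least one service was parsed and every one is RUNNING."""
    return bool(status) and all(state == "RUNNING" for state in status.values())
```

Treating an empty result as a failure is a deliberate choice here: if ipactl prints an error instead of a service list, the check should not report success.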

Comment 14 Scott Poore 2012-03-27 19:46:12 UTC
Also, a note I thought I entered earlier today:

The dirsrv error log attached from the original test shows the same errors that we saw with the database corruption in bug 803930:

$ grep get_and_add var.log.dirsrv.slapd-TESTRELM-COM.errors 
[14/Mar/2012:12:43:07 -0400] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=TESTRELM.COM to RDN
[14/Mar/2012:12:43:07 -0400] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=TESTRELM.COM to RDN
[14/Mar/2012:12:43:07 -0400] ldif2dbm - _get_and_add_parent_rdns: Failed to convert DN cn=TESTRELM.COM to RDN
...

So, based on the dirsrv log, the ipa-only upgrade, and the ipa upgrade with 389-ds-base-1.2.10.2-4, I think we can confirm that this is just a different manifestation of bug 803930. The only difference in the test was that IPA services were off before the upgrade. Database corruption is still believed to have been the actual culprit here.
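
The grep above can also be expressed as a small Python check, which may be handy when scanning several error logs. This is a sketch only: the signature string is copied from the log lines quoted in this comment, and the function name is hypothetical.

```python
# Message fragment copied from the dirsrv error log lines quoted above.
SIGNATURE = "_get_and_add_parent_rdns: Failed to convert DN"


def count_rdn_conversion_failures(lines):
    """Count log lines that contain the RDN conversion failure message."""
    return sum(1 for line in lines if SIGNATURE in line)
```

It accepts any iterable of lines, so an open file object for the slapd errors log can be passed directly. A non-zero count only indicates that the same message is present; it does not by itself prove the database is corrupted.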

I'm going to close this as a duplicate of that bug.

Comment 15 Scott Poore 2012-03-27 19:46:57 UTC

*** This bug has been marked as a duplicate of bug 803930 ***