Bug 1404409 - DS returns inconsistent data with an error when using GSSAPI
Summary: DS returns inconsistent data with an error when using GSSAPI
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: 389-ds
Version: 26
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: wibrown@redhat.com
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1403352 1430250 1432149
TreeView+ depends on / blocked
 
Reported: 2016-12-13 18:45 UTC by Tomas Krizek
Modified: 2020-09-13 21:57 UTC (History)
7 users (show)

Fixed In Version: 389-ds-base-1.3.6.2-2.f26
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1432149 (view as bug list)
Environment:
Last Closed: 2017-03-17 01:07:47 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
updated bind template (1.23 KB, text/plain)
2016-12-14 08:51 UTC, Tomas Krizek
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Github 389ds 389-ds-base issues 2230 0 None None None 2020-09-13 21:57:20 UTC

Description Tomas Krizek 2016-12-13 18:45:28 UTC
Description of problem:
When trying to retrieve data from Directory Server using GSSAPI, the DS returns inconsistent number of responses and then the query ends with ldap_result: Can't contact LDAP server (-1)

Version-Release number of selected component (if applicable):
389-ds-base-1.3.6.1-1.fc26.x86_64

How reproducible:
always (with named user in IPA setup)

Steps to Reproduce:
1. install IPA server
2. ipa-dns-install --auto-forwarders
3. The above command should fail when attempting to restart named, which is a symptom caused by the issue above.
4. SHELL=/bin/bash sudo -u named -s
5. kdestroy -A
6. klist -kt /etc/named.conf DNS/$(hostname)
7. ldapsearch -h $(hostname) -Y GSSAPI

Repeat step 7 and observe results.

Actual results:
The number of responses is inconsistent and the query ends with ldap_result: Can't contact LDAP server (-1)

Expected results:
DS successfully returns the correct number of results without an error in ldap_result.

Additional info:
This happens both for the UNIX socket as well as the TCP socket.

Comment 4 Noriko Hosoi 2016-12-13 19:56:38 UTC
A small correction in #c0

Step 6 is supposed to be:
kinit -kt /etc/named.keytab DNS/$(hostname)

Comment 5 Tomas Krizek 2016-12-14 08:51:19 UTC
Created attachment 1231518 [details]
updated bind template

Comment 6 Tomas Krizek 2016-12-14 08:58:47 UTC
I also forgot one extra step that is required to reproduce this issue on a clean machine. Before executing Step 2, please place the attached bind template into /usr/share/ipa/bind.named.conf.template to avoid encountering bug 1403352.

Also, make sure to execute steps 5 -7 in the shell running under named user (started in step 4). As Noriko mentioned, please also specify the keytab as the argument in step 6 instead of named.conf.

Sorry for the confusion.

Comment 8 Tomas Krizek 2017-02-10 08:24:53 UTC
The DS version is 389-ds-base-1.3.6.1-1.fc26.x86_64, as stated in the bug report.

Comment 10 Fedora End Of Life 2017-02-28 10:46:44 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 11 Tomas Krizek 2017-03-01 11:59:17 UTC
Hi William, 

thanks for the test build. Unfortunately, I'm not able to test whether the issue is still present or not while using DS as a part of freeipa. I wasn't able to install freeipa on F25 with this build and installing freeipa first and upgrading just the DS component didn't work either.

Since there are currently some broken dependencies on F26 for freeipa, I can't test there either.

But since the issue wasn't related to IPA, but rather the DS itself when using GSSAPI, I believe your load test should be enough to verify the functionality. When I ran into this issue, almost every request returned different results.

Thanks.

Comment 12 Tomas Krizek 2017-03-08 15:27:05 UTC
William,

would it be possible to provide a testing build for rawhide, so I can verify the issue has been resolved?

I can't use your copr repo, because there are missing/broken dependencies for rawhide:

Error: 
 Problem 1: conflicting requests
  - nothing provides libasan.so.3()(64bit) needed by 389-ds-base-1.3.6.1-20170227.fc25.x86_64
 Problem 2: conflicting requests
  - nothing provides libasan.so.3()(64bit) needed by 389-ds-base-snmp-1.3.6.1-20170227.fc25.x86_64
 Problem 3: conflicting requests
  - nothing provides python2-lib389 needed by python2-389-ds-base-tests-1.3.6.1-20170227.fc25.noarch

Comment 15 Tomas Krizek 2017-03-10 11:27:55 UTC
William,

I'm still unable to install DS with the packages from your copr repo.

The packages do not have any broken dependencies, but when I attempt to install FreeIPA server on Fedora 26, the installation hangs for a few minutes when it attempts to install Directory Server and then it fails with the following error:


Configuring directory server (dirsrv). Estimated time: 1 minute
  [1/47]: creating directory server user
  [2/47]: creating directory server instance
  [error] RuntimeError: failed to create ds instance Command '/usr/sbin/setup-ds.pl --silent --logfile - -f /tmp/tmpcEMwnL' returned non-zero exit status 1
ipa.ipapython.install.cli.install_tool(Server): ERROR    failed to create ds instance Command '/usr/sbin/setup-ds.pl --silent --logfile - -f /tmp/tmpcEMwnL' returned non-zero exit status 1

I've tried these versions and both were affected:
389-ds-base-1.3.6.1-20170309git137606d37.fc26.x86_64
389-ds-base-1.3.6.1-20170308gitecfbfc8ec.fc26.x86_64


This issue is *not* present when using the released version:
389-ds-base-1.3.6.1-1.fc26.1.x86_64

Comment 18 Tomas Krizek 2017-03-13 11:48:03 UTC
(In reply to mreynolds from comment #17)
> New builds are ready:
> 
> F26
> https://koji.fedoraproject.org/koji/buildinfo?buildID=867529

I can't install FreeIPA server with this build either. There is some issue with external bind. When our ipa-server-install attempts to use external bind, it fails.

[13/Mar/2017:10:03:38.021857728 +0100] conn=4 op=0 BIND dn="cn=Directory Manager" method=sasl version=3 mech=EXTERNAL
[13/Mar/2017:10:03:38.021964141 +0100] conn=4 op=0 RESULT err=49 tag=97 nentries=0 etime=0 - No such suffix (cn=Directory Manager)

Comment 19 Adam Williamson 2017-03-13 22:04:12 UTC
Yeah, it happens that a test ran in openQA which attempted deployment with latest packages from updates-testing, and that failed in what sounds like the same way Tomas saw. I see the same error in /var/log/dirsrv/slapd-DOMAIN-LOCAL/access :

[13/Mar/2017:17:23:04.601301050 -0400] conn=4 op=0 RESULT err=49 tag=97 nentries=0 etime=0 - No such suffix (cn=Directory Manager)

and this in ipaserver-install.log:

2017-03-13T21:23:04Z DEBUG Could not connect to the Directory Server on ipa001.domain.local: Insufficient access:  Invalid credentials
2017-03-13T21:23:04Z DEBUG   File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 172, in execute
    return_value = self.run()
  File "/usr/lib/python2.7/site-packages/ipapython/install/cli.py", line 318, in run
    cfgr.run()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 310, in run
    self.execute()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 334, in execute
    for nothing in self._executor():
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 376, in __runner
    exc_handler(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 405, in _handle_execute_exception
    self._handle_exception(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 395, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 366, in __runner
    step()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 363, in <lambda>
    step = lambda: next(self.__gen)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
    value = gen.send(prev_value)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 597, in _configure
    next(executor)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 376, in __runner
    exc_handler(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 405, in _handle_execute_exception
    self._handle_exception(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 460, in _handle_exception
    self.__parent._handle_exception(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 395, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 457, in _handle_exception
    super(ComponentBase, self)._handle_exception(exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 395, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 366, in __runner
    step()
  File "/usr/lib/python2.7/site-packages/ipapython/install/core.py", line 363, in <lambda>
    step = lambda: next(self.__gen)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
    six.reraise(*exc_info)
  File "/usr/lib/python2.7/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
    value = gen.send(prev_value)
  File "/usr/lib/python2.7/site-packages/ipapython/install/common.py", line 63, in _install
    for nothing in self._installer(self.parent):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/install.py", line 1372, in main
    install(self)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/install.py", line 270, in decorated
    func(installer)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/server/install.py", line 759, in install
    ntpinstance.ntp_ldap_enable(host_name, ds.suffix, realm_name)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/ntpinstance.py", line 43, in ntp_ldap_enable
    ntp.ldap_enable('NTP', fqdn, None, base_dn)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 473, in ldap_enable
    self.ldap_connect()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 177, in ldap_connect
    conn.do_bind(self.dm_password, autobind=self.autobind)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1681, in do_bind
    self.do_external_bind(pw_name, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1671, in do_external_bind
    self.__bind_with_wait(self.external_bind, timeout, user_name)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1650, in __bind_with_wait
    bind_func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 1098, in external_bind
    '', auth_tokens, server_controls, client_controls)
  File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python2.7/site-packages/ipapython/ipaldap.py", line 973, in error_handler
    raise errors.ACIError(info="%s %s" % (info, desc))

2017-03-13T21:23:04Z DEBUG The ipa-server-install command failed, exception: ACIError: Insufficient access:  Invalid credentials
2017-03-13T21:23:04Z ERROR Insufficient access:  Invalid credentials
2017-03-13T21:23:04Z ERROR The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information

Comment 23 wibrown@redhat.com 2017-03-14 04:21:36 UTC
Continue to configure the system with these values? [no]: yes

The following operations may take some minutes to complete.
Please wait until the prompt is returned.

Configuring NTP daemon (ntpd)
  [1/4]: stopping ntpd
  [2/4]: writing configuration
  [3/4]: configuring ntpd to start on boot
  [4/4]: starting ntpd
Done configuring NTP daemon (ntpd).
Configuring directory server (dirsrv). Estimated time: 1 minute
  [1/47]: creating directory server user
  [2/47]: creating directory server instance
  [3/47]: updating configuration in dse.ldif
  [4/47]: restarting directory server
  [5/47]: adding default schema
  [6/47]: enabling memberof plugin
  [7/47]: enabling winsync plugin
  [8/47]: configuring replication version plugin
  [9/47]: enabling IPA enrollment plugin
  [10/47]: enabling ldapi
  [11/47]: configuring uniqueness plugin
  [12/47]: configuring uuid plugin
  [13/47]: configuring modrdn plugin
  [14/47]: configuring DNS plugin
  [15/47]: enabling entryUSN plugin
  [16/47]: configuring lockout plugin
  [17/47]: configuring topology plugin
  [18/47]: creating indices
  [19/47]: enabling referential integrity plugin
  [20/47]: configuring certmap.conf
  [21/47]: configure autobind for root
  [22/47]: configure new location for managed entries
  [23/47]: configure dirsrv ccache
  [24/47]: enabling SASL mapping fallback
  [25/47]: restarting directory server
  [26/47]: adding sasl mappings to the directory
  [27/47]: adding default layout
  [28/47]: adding delegation layout
  [29/47]: creating container for managed entries
  [30/47]: configuring user private groups
  [31/47]: configuring netgroups from hostgroups
  [32/47]: creating default Sudo bind user
  [33/47]: creating default Auto Member layout
  [34/47]: adding range check plugin
  [35/47]: creating default HBAC rule allow_all
  [36/47]: adding sasl mappings to the directory
  [37/47]: adding entries for topology management
  [38/47]: initializing group membership
  [39/47]: adding master entry
  [40/47]: initializing domain level
  [41/47]: configuring Posix uid/gid generation
  [42/47]: adding replication acis
  [43/47]: enabling compatibility plugin
  [44/47]: activating sidgen plugin
  [45/47]: activating extdom plugin
  [46/47]: tuning directory server
  [47/47]: configuring directory to start on boot
Done configuring directory server (dirsrv).
Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes 30 seconds
  [1/31]: creating certificate server user
  [2/31]: configuring certificate server instance
  [3/31]: stopping certificate server instance to update CS.cfg
  [4/31]: backing up CS.cfg
  [5/31]: disabling nonces
  [6/31]: set up CRL publishing
  [7/31]: enable PKIX certificate path discovery and validation
  [8/31]: starting certificate server instance
  [9/31]: creating RA agent certificate database
  [10/31]: importing CA chain to RA certificate database
  [11/31]: fixing RA database permissions
  [12/31]: setting up signing cert profile
  [13/31]: setting audit signing renewal to 2 years
  [14/31]: restarting certificate server
  [15/31]: requesting RA certificate from CA
  [16/31]: issuing RA agent certificate
  [17/31]: adding RA agent as a trusted user
  [18/31]: authorizing RA to modify profiles
  [19/31]: authorizing RA to manage lightweight CAs
  [20/31]: Ensure lightweight CAs container exists
  [21/31]: configure certmonger for renewals
  [22/31]: configure certificate renewals
  [23/31]: configure RA certificate renewal
  [24/31]: configure Server-Cert certificate renewal
  [25/31]: Configure HTTP to proxy connections
  [26/31]: restarting certificate server
  [27/31]: migrating certificate profiles to LDAP
  [28/31]: importing IPA certificate profiles
  [29/31]: adding default CA ACL
  [30/31]: adding 'ipa' CA entry
  [31/31]: updating IPA configuration
Done configuring certificate server (pki-tomcatd).
Configuring directory server (dirsrv). Estimated time: 10 seconds
  [1/3]: configuring ssl for ds instance
  [2/3]: restarting directory server
  [3/3]: adding CA certificate entry
Done configuring directory server (dirsrv).
Configuring Kerberos KDC (krb5kdc). Estimated time: 30 seconds
  [1/9]: adding kerberos container to the directory
  [2/9]: configuring KDC
  [3/9]: initialize kerberos container
WARNING: Your system is running out of entropy, you may experience long delays
  [4/9]: adding default ACIs
  [5/9]: creating a keytab for the directory
  [6/9]: creating a keytab for the machine
  [7/9]: adding the password extension to the directory
  [8/9]: starting the KDC
  [9/9]: configuring KDC to start on boot
Done configuring Kerberos KDC (krb5kdc).
Configuring kadmin
  [1/2]: starting kadmin 
  [2/2]: configuring kadmin to start on boot
Done configuring kadmin.
Configuring ipa_memcached
  [1/2]: starting ipa_memcached 
  [2/2]: configuring ipa_memcached to start on boot
Done configuring ipa_memcached.
Configuring ipa-otpd
  [1/2]: starting ipa-otpd 
  [2/2]: configuring ipa-otpd to start on boot
Done configuring ipa-otpd.
Configuring ipa-custodia
  [1/5]: Generating ipa-custodia config file
  [2/5]: Making sure custodia container exists
  [3/5]: Generating ipa-custodia keys
  [4/5]: starting ipa-custodia 
  [5/5]: configuring ipa-custodia to start on boot
Done configuring ipa-custodia.
Configuring the web interface (httpd). Estimated time: 1 minute
  [1/21]: setting mod_nss port to 443
  [2/21]: setting mod_nss cipher suite
  [3/21]: setting mod_nss protocol list to TLSv1.0 - TLSv1.2
  [4/21]: setting mod_nss password file
  [5/21]: enabling mod_nss renegotiate
  [6/21]: adding URL rewriting rules
  [7/21]: configuring httpd
  [8/21]: configure certmonger for renewals
  [9/21]: setting up httpd keytab
  [10/21]: setting up ssl
  [11/21]: importing CA certificates from LDAP
  [12/21]: setting up browser autoconfig
  [13/21]: publish CA cert
  [14/21]: clean up any existing httpd ccache
  [15/21]: configuring SELinux for httpd
  [16/21]: create KDC proxy user
  [17/21]: create KDC proxy config
  [18/21]: enable KDC proxy
  [19/21]: restarting httpd
  [20/21]: configuring httpd to start on boot
  [21/21]: enabling oddjobd
Done configuring the web interface (httpd).
Applying LDAP updates
Upgrading IPA:
  [1/9]: stopping directory server
  [2/9]: saving configuration
  [3/9]: disabling listeners
  [4/9]: enabling DS global lock
  [5/9]: starting directory server
  [6/9]: upgrading server
  [7/9]: stopping directory server
  [8/9]: restoring configuration
  [9/9]: starting directory server
Done.
Restarting the directory server
Restarting the KDC
ipa         : ERROR    unable to resolve host name freeipa.example.com. to IP address, ipa-ca DNS record will be incomplete
Please add records in this file to your DNS system: /tmp/ipa.system.records.FRrf6C.db
Restarting the web server
Configuring client side components
Using existing certificate '/etc/ipa/ca.crt'.
Client hostname: freeipa.example.com
Realm: EXAMPLE.COM
DNS Domain: example.com
IPA Server: freeipa.example.com
BaseDN: dc=example,dc=com

Skipping synchronizing time with NTP server.
New SSSD config will be created
Configured sudoers in /etc/nsswitch.conf
Configured /etc/sssd/sssd.conf
trying https://freeipa.example.com/ipa/json
Forwarding 'schema' to json server 'https://freeipa.example.com/ipa/json'
trying https://freeipa.example.com/ipa/session/json
Forwarding 'ping' to json server 'https://freeipa.example.com/ipa/session/json'
Forwarding 'ca_is_enabled' to json server 'https://freeipa.example.com/ipa/session/json'
Systemwide CA database updated.
Adding SSH public key from /etc/ssh/ssh_host_ecdsa_key.pub
Adding SSH public key from /etc/ssh/ssh_host_ed25519_key.pub
Adding SSH public key from /etc/ssh/ssh_host_rsa_key.pub
Forwarding 'host_mod' to json server 'https://freeipa.example.com/ipa/session/json'
Could not update DNS SSHFP records.
SSSD enabled
Configured /etc/openldap/ldap.conf
Configured /etc/ssh/ssh_config
Configured /etc/ssh/sshd_config
Configuring example.com as NIS domain.
Client configuration complete.

==============================================================================
Setup complete

Next steps:
	1. You must make sure these network ports are open:
		TCP Ports:
		  * 80, 443: HTTP/HTTPS
		  * 389, 636: LDAP/LDAPS
		  * 88, 464: kerberos
		UDP Ports:
		  * 88, 464: kerberos
		  * 123: ntp

	2. You can now obtain a kerberos ticket using the command: 'kinit admin'
	   This ticket will allow you to use the IPA tools (e.g., ipa user-add)
	   and the web user interface.

Be sure to back up the CA certificates stored in /root/cacert.p12
These files are required to create replicas. The password for these
files is the Directory Manager password

Comment 24 mreynolds 2017-03-14 13:19:10 UTC
Fixed upstream.


Scratch build:

https://koji.fedoraproject.org/koji/taskinfo?taskID=18377415

Official Koji build (still building...):

https://koji.fedoraproject.org/koji/taskinfo?taskID=18377564

Comment 25 mreynolds 2017-03-14 13:42:11 UTC
Bodhi update:

https://bodhi.fedoraproject.org/updates/FEDORA-2017-b9fbd8b768

Comment 26 Fedora Update System 2017-03-14 17:08:47 UTC
389-ds-base-1.3.6.2-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-b9fbd8b768

Comment 28 mreynolds 2017-03-14 18:10:28 UTC
The shutdown issue is caused by the content sync plugin not handling its thread count correctly when connections are aborted.  This bug has been present since 389-ds-base-1.3.3

This is being addressed in issue:

https://pagure.io/389-ds-base/issue/49170

Comment 29 Adam Williamson 2017-03-14 18:14:02 UTC
well, that's clearly a different thing from this bug, so we should probably track it separately, right?

Comment 30 mreynolds 2017-03-14 18:16:31 UTC
(In reply to Adam Williamson from comment #29)
> well, that's clearly a different thing from this bug, so we should probably
> track it separately, right?

Correct, I will be filing a new bug.  I was just adding the tracking info for the reporter.

Comment 31 Adam Williamson 2017-03-14 18:26:05 UTC
RH folks on this bug: can you please check through your private comments and make them public unless there's a real reason for them to be private? This is a *Fedora* bug, and not only that, it's a Fedora *blocker* bug (by inheritance). It's really quite important that non-RH people be able to know what's going on.

There are a few comments that include RH-internal hostnames and stuff and it's reasonable for those to be private (though perhaps they could be sanitized, or a public summary posted, instead), but there are several comments for which I can see no reason to mark them as private at all.

Thanks!

Comment 32 mreynolds 2017-03-14 18:49:28 UTC
Bug to track shutdown problem:

https://bugzilla.redhat.com/show_bug.cgi?id=1432206

Comment 33 Adam Williamson 2017-03-14 20:49:27 UTC
I ran a test with 389-ds 1.3.6.2-1.fc26 and it failed for yet a *different* reason. There seems to have been an error near the end of server deployment which was treated as non-fatal. From ipaserver-install.log:

2017-03-14T19:35:43Z DEBUG Changing admin password
2017-03-14T19:35:43Z DEBUG Starting external process
2017-03-14T19:35:43Z DEBUG args=/usr/bin/ldappasswd -h ipa001.domain.local -ZZ -x -D cn=Directory Manager -y /var/lib/ipa/tmpD4VoiL -T /var/lib/ipa/tmplVltDm uid=admin,cn=users,cn=accounts,dc=domain,dc=local
2017-03-14T19:35:43Z DEBUG Process finished, return code=1
2017-03-14T19:35:43Z DEBUG stdout=
2017-03-14T19:35:43Z DEBUG stderr=ldap_start_tls: Operations error (1)
        additional info: Other operations are still pending on the connection.

2017-03-14T19:35:43Z DEBUG Unable to set admin password Command '/usr/bin/ldappasswd -h ipa001.domain.local -ZZ -x -D cn=Directory Manager -y /var/lib/ipa/tmpD4VoiL -T /var/lib/ipa/tmplVltDm uid=admin,cn=users,cn=accounts,dc=domain,dc=local' returned non-zero exit status 1

then it went ahead and tried to do the client setup, and that failed due to a similar error. From ipaclient-install.log:

017-03-14T19:35:44Z DEBUG Verifying that ipa001.domain.local (realm DOMAIN.LOCAL) is an IPA server
2017-03-14T19:35:44Z DEBUG Init LDAP connection to: ipa001.domain.local
2017-03-14T19:35:44Z DEBUG Unhandled LDAPError: OPERATIONS_ERROR: {'info': 'Other operations are still pending on the connection.', 'desc': 'Operations error'}
2017-03-14T19:35:44Z DEBUG Error checking LDAP: Operations error: Other operations are still pending on the connection.
2017-03-14T19:35:44Z WARNING Skip ipa001.domain.local: cannot verify if this is an IPA server
2017-03-14T19:35:44Z DEBUG Discovery result: UNKNOWN_ERROR; server=None, domain=domain.local, kdc=ipa001.domain.local, basedn=None
2017-03-14T19:35:44Z DEBUG Validated servers: 
2017-03-14T19:35:44Z ERROR Failed to verify that ipa001.domain.local is an IPA Server.
2017-03-14T19:35:44Z ERROR This may mean that the remote server is not up or is not reachable due to network or firewall settings.
2017-03-14T19:35:44Z INFO Please make sure the following ports are opened in the firewall settings:
     TCP: 80, 88, 389
     UDP: 88 (at least one of TCP/UDP ports 88 has to be open)
Also note that following ports are necessary for ipa-client working properly after enrollment:
     TCP: 464
     UDP: 464, 123 (if NTP enabled)
2017-03-14T19:35:44Z WARNING Installation failed. As this is IPA server, changes will not be rolled back.

Comment 34 Adam Williamson 2017-03-14 20:51:46 UTC
Actually that may not have been treated as a fatal error either, as the openQA test actually *timed out* and gave up at that point; it looks like ipa-server-install was still running. But clearly it wasn't deploying properly.

Comment 35 Adam Williamson 2017-03-14 21:30:43 UTC
So I re-ran the test in openQA with updated 389-ds and got the same result: "Other operations are still pending" errors. On closer inspection, the server install process *does* complete. In the openQA test we are using rolekit to deploy the server, so the command run is:

echo '{"admin_password":"monkeys123","dns_forwarders":{"ipv4":["10.5.126.21"],"ipv6":["10.5.126.22"]}}' | rolectl deploy domaincontroller --name=domain.local --settings-stdin

that command exits with code 99, and rolekit seems to log that ipa-server-install exits 256.

The entirety of /var from one of these failures can be found at https://openqa.stg.fedoraproject.org/tests/80024/file/role_deploy_domain_controller-var_log.tar.gz for inspection.

Comment 36 Nathan Kinder 2017-03-14 22:05:09 UTC
The ipa-server-install.log file shows this failure, which is concerning (but doesn't stop the installer from proceeding):

-------------------------------------------------------------------------------
2017-03-14T21:18:50Z DEBUG args=/usr/bin/ldappasswd -h ipa001.domain.local -ZZ -x -D cn=Directory Manager -y /var/lib/ipa/tmpcoeqop -T /var/lib/ipa/tmpDi9E46 uid=admin,cn=users,cn=accounts,dc=domain,dc=local
2017-03-14T21:18:50Z DEBUG Process finished, return code=1
2017-03-14T21:18:50Z DEBUG stdout=
2017-03-14T21:18:50Z DEBUG stderr=ldap_start_tls: Operations error (1)
	additional info: Other operations are still pending on the connection.

2017-03-14T21:18:50Z DEBUG Unable to set admin password Command '/usr/bin/ldappasswd -h ipa001.domain.local -ZZ -x -D cn=Directory Manager -y /var/lib/ipa/tmpcoeqop -T /var/lib/ipa/tmpDi9E46 uid=admin,cn=users,cn=accounts,dc=domain,dc=local' returned non-zero exit status 1
-------------------------------------------------------------------------------

Immediately after this, ipa-server-install attempts to run ipa-client-install, which fails and ultimately triggers the installer to exit.  The ipa-client-install.log shows that there is an LDAP operations error:

-------------------------------------------------------------------------------
2017-03-14T21:18:51Z DEBUG Init LDAP connection to: ipa001.domain.local
2017-03-14T21:18:51Z DEBUG Unhandled LDAPError: OPERATIONS_ERROR: {'info': 'Other operations are still pending on the connection.', 'desc': 'Operations error'}
2017-03-14T21:18:51Z DEBUG Error checking LDAP: Operations error: Other operations are still pending on the connection.
2017-03-14T21:18:51Z WARNING Skip ipa001.domain.local: cannot verify if this is an IPA server
2017-03-14T21:18:51Z DEBUG Discovery result: UNKNOWN_ERROR; server=None, domain=domain.local, kdc=ipa001.domain.local, basedn=None
2017-03-14T21:18:51Z DEBUG Validated servers: 
2017-03-14T21:18:51Z ERROR Failed to verify that ipa001.domain.local is an IPA Server.
2017-03-14T21:18:51Z ERROR This may mean that the remote server is not up or is not reachable due to network or firewall settings.
2017-03-14T21:18:51Z INFO Please make sure the following ports are opened in the firewall settings:
     TCP: 80, 88, 389
     UDP: 88 (at least one of TCP/UDP ports 88 has to be open)
Also note that following ports are necessary for ipa-client working properly after enrollment:
     TCP: 464
     UDP: 464, 123 (if NTP enabled)
2017-03-14T21:18:51Z WARNING Installation failed. As this is IPA server, changes will not be rolled back.
-------------------------------------------------------------------------------

There are a number of CoS plugin related errors in the ns-slapd errors log, as well as some content-sync-plugin messages:

-------------------------------------------------------------------------------
[14/Mar/2017:17:17:18.363748750 -0400] - ERR - cos-plugin - cos_cache_query_attr - cos attribute krbPwdPolicyReference failed schema check on dn: cn=dns,dc=domain,dc=local
[14/Mar/2017:17:17:18.367692831 -0400] - ERR - cos-plugin - cos_cache_query_attr - cos attribute krbPwdPolicyReference failed schema check on dn: cn=dns,dc=domain,dc=local
[14/Mar/2017:17:17:18.385636077 -0400] - ERR - cos-plugin - cos_cache_query_attr - cos attribute krbPwdPolicyReference failed schema check on dn: cn=dns,dc=domain,dc=local
...
[14/Mar/2017:17:19:33.476041525 -0400] - ERR - slapi_connection_acquire - conn=43 fd=76 Attempt to acquire connection in the closing state
[14/Mar/2017:17:19:33.476724598 -0400] - ERR - content-sync-plugin - sync_send_results - conn=43 op=0 Could not acquire the connection - aborted
-------------------------------------------------------------------------------

Mark mentioned in comment#28 that the shutdown issue was related to the content-sync plugin.  Is it possible that this is causing the IPA installation failure as well?  Mark just checked in a fix for the shutdown issue a few hours ago as seen here:

  https://pagure.io/389-ds-base/issue/49170

This fix is not in the most recent build in koji (389-ds-base-1.3.6.2-1.fc26).  It would be great to see if a test build containing Mark's latest fix addresses this issue.

Comment 37 Adam Williamson 2017-03-14 22:43:12 UTC
Nathan: yes, those are exactly the errors I pointed out above.

I personally doubt the shutdown issue is the cause here as the entire server deployment attempt takes less time to finish (8 minutes) than the shutdown attempt should take to time out (10 minutes), and I didn't think the server deployment process actually tried to shut down the directory server at all. But I could be wrong!

Comment 38 Adam Williamson 2017-03-14 22:44:16 UTC
Oh, ISWYM - you're suggesting they could have a common cause. It's possible, for sure. Mark said he is waiting for one other fix before doing an official build with the shutdown fix, I am waiting for that, so I can re-run the test with it.

Comment 39 wibrown@redhat.com 2017-03-14 23:46:26 UTC
So it's likely that the aborted connection is the real problem here, and then the shutdown block is because of that abort.

We need to tackle it as two issues. First is the shutdown block, and the other is the aborted connection. Given that the connection management is my area, I'll look into this. I think there is not a lot of info here, so we'll see how I go.

If this is critical, the new connection framework can be temporarily disabled so we can ship. It's a one line, and the old connection code is all still there.

Comment 40 Adam Williamson 2017-03-15 00:28:53 UTC
https://pagure.io/389-ds-base/issue/49170 appears to have a fix for what you term 'the shutdown block', but not (if I'm reading it correctly) what you term 'the aborted connection'.

Yes, this is critical. To simplify the Fedora release criteria, one of the Alpha release criteria is 'FreeIPA's gotta basically work', as in we have to be able to deploy the server, enrol clients into the domain, and do basic FreeIPA-y things like create users and log in to the clients using those user accounts.

Right now this is not the case, and the Alpha go/no-go meeting - at which we decide whether to ship the Alpha on time, or delay - is on Thursday. In order to ship on time we have to have a release candidate that meets all the criteria built and tested by the time of the meeting.

So obviously, if we're gonna ship Alpha on time, we need package builds and Bodhi updates that make FreeIPA actually work to be in place by, well, ideally later tonight (NA time), but at the latest, early tomorrow (as the compose process takes ~6 hours, then we have to do the actual testing).

Comment 41 wibrown@redhat.com 2017-03-15 00:52:24 UTC
We have a potential fix

https://pagure.io/389-ds-base/issue/49171

Comment 42 wibrown@redhat.com 2017-03-15 00:53:14 UTC
If this fix does not correct the issue, we will discuss rolling back the feature that is causing the issue.

Comment 43 Adam Williamson 2017-03-15 01:05:22 UTC
I'm running a scratch build with the patches from 49170 and 49171 applied. I will run the openQA tests against that build once it's done.

Comment 44 wibrown@redhat.com 2017-03-15 01:05:52 UTC
Thanks, let me know how it goes.

Comment 45 Adam Williamson 2017-03-15 01:49:37 UTC
The server deployment test passed, this time, but I forgot to run the client enrolment tests in parallel. Running with those now.

Comment 46 Adam Williamson 2017-03-15 04:14:51 UTC
So now it seems like deployment and enrolment work, but one of the client-side tests - running 'getent passwd admin', which is expected to produce some info on that account - fails. Will look into it further.

This is exactly the kind of thing that happens when errors early in a complex process are left to fester for months (https://bugzilla.redhat.com/show_bug.cgi?id=1403352 ), BTW, FreeIPA folks: as the later parts of the process don't get tested, multiple bugs in those spring up without our knowledge in the meantime...

Comment 47 wibrown@redhat.com 2017-03-15 04:20:33 UTC
That's really positive that this passes now. I'm going to link this issue with the fix, and move it to POST. Are you okay with this being resolved?

Comment 48 Fedora Update System 2017-03-15 04:23:52 UTC
389-ds-base-1.3.6.2-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-b9fbd8b768

Comment 49 Nathan Kinder 2017-03-15 05:20:13 UTC
Resetting this to ASSIGNED and removing the "fixed-in-version", as the build mentioned in comment#48 does not contain the fixes from comment#43 that have been tested to resolve the issue.  A new non-scratch build is needed to pick up these patches.

Comment 50 Adam Williamson 2017-03-15 05:50:05 UTC
wibrown: sure, we can open a new bug for the latest problem. I just don't have the genki to do it tonight, I'll look into it tomorrow.

Comment 51 wibrown@redhat.com 2017-03-15 06:03:01 UTC
(In reply to Adam Williamson from comment #50)
> wibrown: sure, we can open a new bug for the latest problem. I just don't
> have the genki to do it tonight, I'll look into it tomorrow.


No problem: I think it would be best to engage the FreeIPA team on the new issue you have found, as it may or may not be related to DS and it's components. It's best to let them make that analysis.

Thanks again for all your help and diligence on this issue.

Comment 52 Fedora Update System 2017-03-15 14:23:48 UTC
389-ds-base-1.3.6.2-2.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-d0ad136c8c

Comment 53 mreynolds 2017-03-15 14:25:15 UTC
New fedora package has been built (389-ds-base-1.3.6.2-2.f26)

https://koji.fedoraproject.org/koji/taskinfo?taskID=18395610

Comment 54 Fedora Update System 2017-03-16 00:50:39 UTC
389-ds-base-1.3.6.2-2.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-d0ad136c8c

Comment 55 Fedora Update System 2017-03-17 01:07:47 UTC
389-ds-base-1.3.6.2-2.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.