Bug 1432917

Summary: connection handling is looping on timeout event
Product: [Fedora] Fedora Reporter: Tomas Krizek <tkrizek>
Component: 389-ds-baseAssignee: mreynolds
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 26CC: awilliam, edewata, mreynolds, nhosoi, nkinder, rmeggins, tbordaz, wibrown
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 389-ds-base-1.3.6.3-4.f26 389-ds-base-1.3.6.3-4.fc26 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-04-01 17:29:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tomas Krizek 2017-03-16 12:16:41 UTC
Description of problem:

directory server log is getting flooded with the following warnings:

WARN - ns_handle_pr_read_ready - Received idletime out with c->c_idletimeout as 0. Ignoring.


Version-Release number of selected component (if applicable):
389-ds-base-1.3.6.2-2.fc26.x86_64

How reproducible:
always

Steps to Reproduce:
1. dnf update -y
2. dnf config-manager --enable=updates-testing
3. dnf install -y freeipa-server freeipa-server-dns
4. ipa-server-install -a $password -p $password --domain $domain --realm $realm -U--setup-dns --auto-forwarders

Actual results:
IPA serve with DNS is installed and the mentioned warning message is logged in directory server error log.

Expected results:
No such warning messages in error log.

Comment 1 wibrown@redhat.com 2017-03-16 12:29:21 UTC
Okay, I need to work out why this is looping. I think in an ideal world we would have a default timeout set so this wouldn't be an issue.

I need to see why libevent is throwing a timeout event back on this when it has a 0 timeout. I'll see if I can reproduce.

Comment 3 wibrown@redhat.com 2017-03-16 12:45:00 UTC
Upstream ticket:
https://pagure.io/389-ds-base/issue/49174

Comment 4 wibrown@redhat.com 2017-03-16 13:06:17 UTC
@tbordaz: see.

https://pagure.io/389-ds-base/issue/49174#comment-431838

Comment 5 Tomas Krizek 2017-03-16 13:57:36 UTC
This also affects FreeIPA setup with DNS. Although SyncRepl seems to be working after the server is installed, there are the following errors in the named-pkcs11 log generated by bind-dyndb-ldap.

Mar 16 14:34:35 vm.example.com named-pkcs11[18630]: LDAP error: Can't contact LDAP server: connection error
Mar 16 14:34:35 vm.example.com named-pkcs11[18630]: retrying LDAP operation (modifying(replace)) on entry 'idnsname=dom.example.com.,cn=dns,dc=dom,dc=example,dc=com'
Mar 16 14:34:35 vm.example.com named-pkcs11[18630]: LDAP error: Can't contact LDAP server: while modifying(replace) entry 'idnsname=dom.example.com.,cn=dns,dc=dom,dc=example,dc=com'
Mar 16 14:34:35 vm.example.com named-pkcs11[18630]: unsupported operation: object class in resource record template DN 'idnsname=_ntp._udp,idnsname=idnsname=dom.example.com.,cn=dns,dc=dom,dc=example,dc=com' changed: rndc reload might be necessary

If nunc-stans is turned off in cn=config,

nsslapd-enable-nunc-stans: off

and dirsrv and named-pkcs11 is restarted, the errors disappear and the DNS seems to work properly.

Comment 6 mreynolds 2017-03-16 14:06:46 UTC
I'm going to disable nunc-stans by default and do another build.  Once we get these issues sorted out we will re-enable for the next build.

Comment 7 mreynolds 2017-03-16 14:59:36 UTC
Since this is not a blocker for fedora 26, I'm not doing to do another build with nunc-stans disabled.

Comment 8 Adam Williamson 2017-03-16 17:35:07 UTC
Do we know what the practical consequences of this bug are? Our (Fedora's) automated FreeIPA tests appear to be passing now, but it would be nice to know what this might be causing...

Comment 9 thierry bordaz 2017-03-16 17:41:40 UTC
(In reply to Adam Williamson from comment #8)
> Do we know what the practical consequences of this bug are? Our (Fedora's)
> automated FreeIPA tests appear to be passing now, but it would be nice to
> know what this might be causing...

Hi Adam,

You are right, some tests can pass successfully. In syncRepl tests (DNS), syncRepl was still able to send updates although the server hit that bug.

two symptoms are:

- constant CPU consumption (but quite low <10%)
- DS error logs flooded with warnings : ns_handle_pr_read_ready - Received idletime out with c->c_idletimeout as 0. Ignoring.

Comment 10 Adam Williamson 2017-03-17 21:52:17 UTC
Confirming I see this in openQA testing also.

Comment 11 wibrown@redhat.com 2017-03-22 04:13:14 UTC
I have found the root cause of this issue, and provided a patch upstream. Mark will rebuild this for Fedora shortly for testing. 

https://pagure.io/389-ds-base/issue/49174

Comment 12 Tomas Krizek 2017-03-23 10:48:21 UTC
I've tested 389-ds-base-1.3.6.3-4.fc26 and I can confirm the issue has been fixed.

Thanks!

Comment 13 Fedora Update System 2017-03-23 15:39:39 UTC
389-ds-base-1.3.6.3-4.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-846f41cfeb

Comment 14 Fedora Update System 2017-03-24 14:51:51 UTC
389-ds-base-1.3.6.3-4.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-846f41cfeb

Comment 15 Fedora Update System 2017-04-01 17:29:06 UTC
389-ds-base-1.3.6.3-4.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.