Bug 1142150

Summary: bind hangs after reload/GSSAPI Error: The referenced context has expired (Success)
Product: Red Hat Enterprise Linux 7
Reporter: Arpit Tolani <atolani>
Component: bind
Assignee: Tomáš Hozza <thozza>
Status: CLOSED ERRATA
QA Contact: qe-baseos-daemons
Severity: medium
Priority: medium
Version: 7.1
CC: psklenar, pspacek
Target Milestone: rc
Keywords: Patch
Hardware: Unspecified
OS: Unspecified
Fixed In Version: bind-9.9.4-17.el7
Doc Type: Bug Fix
Doc Text:
Cause: BIND incorrectly handled errors returned by dynamic databases (through the dyndb API).
Consequence: BIND could deadlock on shutdown, for example when bug #1142176 was triggered.
Fix: The dyndb API was fixed so that it no longer causes a deadlock on BIND's shutdown if a dynamic database previously returned an error.
Result: BIND now shuts down normally, even if, for example, bug #1142176 is triggered.
Last Closed: 2015-03-05 08:18:24 UTC
Type: Bug
Bug Blocks: 1142152, 1142176
Attachments (description, flags):
SRPM with bind-dyndb-ldap plugin for testing of this bug (flags: none)
Patch for the issue. Thanks to Petr Spacek! (flags: none)

Description Arpit Tolani 2014-09-16 09:11:29 UTC
Description of problem:
bind hangs after reload/GSSAPI Error: The referenced context has expired (Success)

About once a week, the bind daemon ends up in a hung state. The process is still present and appears to accept requests from clients, but it does not answer any DNS queries. The daemon can only be stopped by killing the process with kill -9. After bind is started again, it works fine until the problem recurs.
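
For anyone who wants to notice the hang before users do, here is a minimal probe sketch. It is not part of bind or of any fix; the helper names build_query and answers_dns, the query name example.com, and the 2-second timeout are assumptions made for illustration. It sends one DNS query over UDP and reports whether any matching reply arrives in time, which is the symptom described above (the daemon is present but never answers).

```python
import socket
import struct


def build_query(name: str, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def answers_dns(server: str, port: int = 53, name: str = "example.com",
                timeout: float = 2.0) -> bool:
    """Return True if a reply with the same query ID arrives within the timeout."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:
            # Covers timeouts as well as "port unreachable" style errors.
            return False
    return len(reply) >= 2 and reply[:2] == query[:2]
```

A cron job or monitoring check could call answers_dns("127.0.0.1") on the IPA server and raise an alert when it returns False. The probe only checks that some reply comes back, not that the answer is correct.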

Version-Release number of selected component (if applicable):
bind-9.9.4-14.el7.x86_64

How reproducible:
Every time logrotate runs.

Steps to Reproduce:
1. Configure IPA server with DNS
2. Wait till logrotate starts rotating. 

Additional info:
It is related to https://fedorahosted.org/bind-dyndb-ldap/ticket/131

Comment 1 Tomáš Hozza 2014-09-16 09:54:04 UTC
Thank you for your report.

I already discussed this issue with Petr Spacek, and it should be fairly easy to fix. It is an error in the dyndb patch that adds the API for bind-dyndb-ldap.

I'll talk to QA guys and try to get it into 7.1.

Comment 2 Petr Spacek 2014-09-16 10:13:23 UTC
This problem is caused by two separate bugs: This one and bind-dyndb-ldap bug #131.

bind-dyndb-ldap was already fixed upstream so the fix will be pulled in as part of bind-dyndb-ldap rebase.

We need to fix both bugs to completely solve the issue.

Comment 4 Tomáš Hozza 2014-09-17 18:22:16 UTC
Created attachment 938587
SRPM with bind-dyndb-ldap plugin for testing of this bug

Build and install this bind-dyndb-ldap plugin package to trigger this bug in the BIND DYNDB API.

A scratch build can be found here:
https://brewweb.devel.redhat.com/taskinfo?taskID=7982364

Comment 5 Tomáš Hozza 2014-09-17 18:23:55 UTC
(In reply to Tomas Hozza from comment #4)

To rebuild just run:
$ brew build --scratch <target> <path_to_SRPM>

<target> for 7.1 is "rhel-7.1-candidate"

Comment 6 Tomáš Hozza 2014-09-17 18:24:44 UTC
Created attachment 938588
Patch for the issue. Thanks to Petr Spacek!

Comment 7 Tomáš Hozza 2014-09-17 18:30:11 UTC
Steps to reproduce for QA:

1. Install bind.
2. Build attachment 938587 for your architecture.
3. Install the bind-dyndb-ldap package built from attachment 938587.
4. Add the following section to /etc/named.conf:

dynamic-db "my_db_name" {
	library "ldap.so";
	arg "uri ldap://ldap.example.com";
	arg "base cn=dns, dc=example, dc=com";
	arg "auth_method none";
};

5. Run 'named -u named -fg' as root.
6. named will start.
7. Run 'rndc reload' from another console and watch for the error in the output:

[root@localhost ~]# rndc reload
rndc: 'reload' failed: out of memory

8. Press CTRL+C in the terminal where you started named, or run 'rndc halt'.


Actual result in step 8:
named freezes, and the only way to stop it is kill -9.

Expected result in step 8 (and with the attached patch):
named exits normally.
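
For automating the check in step 8, the following is a rough sketch only, not the QA team's actual test. The helper name exits_after_sigint and the 10-second timeout are assumptions. It sends SIGINT (what CTRL+C delivers) to an already running process and reports whether the process exits in time, falling back to SIGKILL when it does not.

```python
import signal
import subprocess


def exits_after_sigint(proc: subprocess.Popen, timeout: float = 10.0) -> bool:
    """Send SIGINT and report whether the process exits within the timeout."""
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # Mirrors the actual result above: a hung process needs kill -9.
        proc.kill()
        proc.wait()
        return False


# Hypothetical usage following steps 5 to 8 (run as root, not executed here):
#   named = subprocess.Popen(["named", "-u", "named", "-fg"])
#   subprocess.run(["rndc", "reload"])    # expected to fail, as in step 7
#   print(exits_after_sigint(named))      # False would indicate the hang
```

With an unpatched bind the helper would be expected to return False after the failed reload, and True with the patched build. A real test would also need to wait for named to finish starting before running rndc reload.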

Comment 12 errata-xmlrpc 2015-03-05 08:18:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0357.html