Bug 161661

Summary: named_sdb can depend on local ldap, but server isn't started until later
Product: [Fedora] Fedora
Reporter: Dan Cox <dan>
Component: bind
Assignee: Jason Vas Dias <jvdias>
Status: CLOSED NOTABUG
QA Contact: Ben Levenson <benl>
Severity: medium
Docs Contact:
Priority: medium    
Version: 4   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Last Closed: 2005-07-12 23:00:17 UTC

Description Dan Cox 2005-06-25 03:50:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050524 Fedora/1.0.4-4 Firefox/1.0.4

Description of problem:
This affects the named_sdb (LDAP) backend.

A common configuration is to point the BIND zone definitions at an LDAP server running on localhost.

The problem is that named starts at S11, but the LDAP server is not started until S27. As a result, the name server loads with only a warning in the logs and then returns SERVFAIL for all subsequent requests.
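
One way to check the ordering on a given host is to look at the start links in a runlevel directory. The following is only a sketch: the function names are hypothetical, and it assumes SysV-style SNNname links matching the S11/S27 priorities mentioned above.

```shell
#!/bin/sh
# Print the SysV start priority (the NN in SNNname) of a service in one
# runlevel directory, or nothing if the service has no start link there.
# Function names are illustrative, not part of any initscript package.
start_priority() {
    rcdir=$1
    service=$2
    for link in "$rcdir"/S[0-9][0-9]"$service"; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}        # e.g. S11named
        prio=${name#S}          # e.g. 11named
        echo "${prio%$service}" # e.g. 11
    done
}

# Succeed if service $2 starts before service $3 in runlevel directory $1.
starts_before() {
    a=$(start_priority "$1" "$2")
    b=$(start_priority "$1" "$3")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}
```

For example, `starts_before /etc/rc3.d ldap named` (assuming runlevel 3 and that both services are enabled there) should fail on a host with the ordering described in this report.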


Version-Release number of selected component (if applicable):
bind-9.3.1-4

How reproducible:
Always

Steps to Reproduce:
Set ENABLE_SDB=yes in /etc/sysconfig/named, install the bind-sdb package, and use a local LDAP database, e.g.:

zone "example.net" {
    type            master;
    database        "ldap ldap://localhost/ou=LAN,ou=DNS,dc=example,dc=net 172800";
};


Actual Results:  An error like the following appears in the system logs. Queries to named return SERVFAIL even after the LDAP server is running.

named_sdb[23079]: LDAP sdb zone 'example.net': search failed, filter (&(zoneName=example.net)(relativeDomainName=\$\(smtpserver\)))

Expected Results:  There are 2 options I can think of:

1. Named should fail to start on LDAP query failure. When named tries to load a non-existent or invalid flat file zone configuration, it will fail. The inability to query the LDAP server should also result in named failing to start. This will at least keep the server from starting up and returning ServFail to all clients.

2. OR issue an automatic kill -HUP `cat /var/run/named.pid` AFTER the LDAP server has started. Perhaps a modification of the LDAP init script to detect named_sdb in an LDAP/localhost configuration and then issue the HUP or something else along those lines.
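
As a rough illustration of option 2, the HUP could be wrapped in a small helper that is called once the LDAP server is up. This is a sketch only: the function name is hypothetical, the default pid file path is the one quoted above, and where such a call would live (for instance the LDAP init script) is left open.

```shell
#!/bin/sh
# Send SIGHUP to named if its pid file names a live process.
# Returns non-zero if the pid file is missing, malformed, or stale.
# The function name is illustrative; the default path is taken from the report.
reload_named_if_running() {
    pidfile=${1:-/var/run/named.pid}
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content
    esac
    kill -0 "$pid" 2>/dev/null || return 1   # process no longer exists
    kill -HUP "$pid"
}
```

Checking the pid first avoids signalling an unrelated process or failing noisily when named is not running at all.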

Additional info:

Comment 1 Jason Vas Dias 2005-07-12 23:00:17 UTC
On further consideration, I don't think this is a bug.

It is up to server administrators to configure a service start order
that suits their site configuration: both the "named" and "ldap"
initscripts have 'chkconfig: - XX XX' lines, i.e., they are by default
not started at all, and it is up to administrators to issue
'chkconfig --level XXX <service> on' commands to have them started at boot.
The order in which they run is therefore also up to the administrator: anyone
who chooses to run named_sdb with the ldap sdb should start ldap before named.

By default, named is started immediately after the 'network' service, so that
it can be used as the only nameserver in resolv.conf, as the host's local 
resolver - that should not be changed. Administrators are free to change the
start up order as they see fit.

All configurable Linux subsystems can be misconfigured in many ways - we cannot
make the code ever more complex to account for every misconfiguration of it.

Your solution (1) above is unacceptable: it is "traditional" named behaviour
that when it encounters a non-syntax zone file error (e.g. a zone containing
"CNAME and other data") it returns SERVFAIL for queries of all names in the
zone. Making named refuse to start on such errors is unlikely to be accepted
upstream, and changing this behaviour would require a major redesign of named.

Your solution (2) above is also unacceptable, because users may restart the
LDAP service for many reasons and may not want that to restart named as well.
And what if the ldap server is on a remote host, as it often is?

Really, the only solution is for administrators who want to use sdb ldap to
ensure that the ldap server is running before named is started, either by
configuring ldap to be started on the same host before named, or by inserting
a check in the named initscript, such as running an ldapsearch against their
ldap zone and not starting named (or taking some other action) if it fails.
This is best left to per-site configuration.
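
A site-local check of this kind might look roughly like the following. It is a sketch under assumptions, not a tested initscript patch: the function names are hypothetical, the URI and base DN are the ones from the example zone in the description, and it assumes the LDAP server permits an anonymous simple bind (-x).

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Probe: base-scope search on the zone's base DN.
# Assumes anonymous simple bind is allowed; adjust for the site.
ldap_zone_reachable() {
    ldapsearch -x -H "$1" -b "$2" -s base '(objectClass=*)' dn
}

# Possible use near the top of start() in a local copy of the named initscript:
#   wait_for 5 2 ldap_zone_reachable ldap://localhost \
#       'ou=LAN,ou=DNS,dc=example,dc=net' || {
#       echo "LDAP zone not reachable, not starting named" >&2
#       exit 1
#   }
```

Keeping the retry loop separate from the probe lets a site swap in a different check without touching the loop.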

Comment 2 Dan Cox 2005-07-13 00:47:29 UTC
Actually I had another thought...

Consider this:
1. Start the name server when the LDAP server IS available.
2. Shut down the LDAP server or otherwise make it unreachable. named will then
issue SERVFAIL as expected.
3. Bring the LDAP server back online. named will automatically start working
again, no HUP required.

So when LDAP is unavailable at startup, named gets permanently stuck in the
SERVFAIL state even though the LDAP server becomes available later, whereas an
outage after startup recovers on its own. I will send this to the dns-ldap
maintainers list and see what they think.
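
To watch the response code at each of the three steps above, something like the following could be used. This is only a sketch: the helper name is hypothetical, and it relies on the "status:" field that dig prints in its header comment line.

```shell
#!/bin/sh
# Extract the response code from dig's header comment line, e.g.
#   ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1234
# Reads dig output on stdin and prints the first status found.
# The function name is illustrative.
rcode_from_dig() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Possible use against the example zone, assuming named listens on 127.0.0.1:
#   dig +noall +comments @127.0.0.1 example.net SOA | rcode_from_dig
```

Running the query after each step should show whether named moves between SERVFAIL and NOERROR without a HUP.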