Bug 2218105

Summary: named.service failing to start during boot: could not get query source dispatcher
Product: Red Hat Enterprise Linux 9 Reporter: Amey <abetkike>
Component: bind Assignee: Petr Menšík <pemensik>
Status: CLOSED NOTABUG QA Contact: rhel-cs-infra-services-qe <rhel-cs-infra-services-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.1 CC: sbalasub
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
URL: https://systemd.io/NETWORK_ONLINE/
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-08-09 12:19:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Amey 2023-06-28 07:29:48 UTC
Description of problem:
named.service fails to start during boot, but it can be started manually once the system is up.

Version-Release number of selected component (if applicable):
bind-9.16.23-5.el9_1.x86_64

Actual results:

named.service fails during boot.

Expected results:
named.service should start automatically at boot (provided it is enabled).

Additional info:

The issue is resolved by adding the following to the service unit file:

------
[Unit]
After=network-online.target
Wants=network-online.target
------

Since bind depends on the network, can these dependencies be added to the unit file shipped in the bind package?
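
The dependencies above can also be applied locally through a systemd drop-in, which leaves the packaged unit file untouched. The following is a minimal sketch and not something taken from this report: the write_named_dropin helper and the file name network-online.conf are hypothetical, and the demonstration writes into a scratch directory so it can be run without changing the system.

```shell
# write_named_dropin ROOT
# Create ROOT/named.service.d/network-online.conf with the ordering
# dependencies quoted above. On a real system ROOT would be
# /etc/systemd/system (hypothetical helper, for illustration only).
write_named_dropin() {
    dropin_dir="$1/named.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/network-online.conf" <<'EOF'
[Unit]
After=network-online.target
Wants=network-online.target
EOF
}

# Demonstration in a scratch directory, so nothing on the system changes.
scratch="$(mktemp -d)"
write_named_dropin "$scratch"
cat "$scratch/named.service.d/network-online.conf"
```

On a real system the same file would go under /etc/systemd/system/named.service.d/ as root, followed by "systemctl daemon-reload". A system using named-chroot.service would need the equivalent drop-in for that unit.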

Comment 2 Petr Menšík 2023-08-08 13:40:09 UTC
The differences between network.target and network-online.target are described in [1].

named uses configured addresses in two different ways. The listen-on and listen-on-v6 clauses watch for dynamic address changes, so they can accept addresses that are not yet available on the system when named starts.

However, an address given in query-source <address> has to be present when named starts, because named uses it for every query. Without it, named cannot even make the priming query required on startup. If an address specified by any *-source option is not ready at startup, the service has to wait until that address becomes available, which is done by adding After=network-online.target.
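
To check whether a given configuration is affected, it may help to list the *-source options that name an explicit address. The sketch below is a rough heuristic under stated assumptions: list_source_options is a hypothetical helper, the sample configuration and the documentation address 192.0.2.10 are made up, and the grep patterns do not follow include statements or understand comments and multi-line statements.

```shell
# list_source_options FILE
# Print the lines of FILE that set a *-source option (query-source,
# notify-source, transfer-source and their -v6 variants) to something
# other than the wildcard "*". Heuristic only, not a named.conf parser.
list_source_options() {
    grep -En '^[[:space:]]*[a-z-]+-source(-v6)?[[:space:]]' "$1" |
        grep -Ev -e '-source(-v6)?[[:space:]]+(address[[:space:]]+)?\*'
}

# Demonstration on a made-up configuration fragment.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
options {
    listen-on { 192.0.2.10; };
    query-source address 192.0.2.10;
    query-source-v6 address *;
    notify-source *;
};
EOF
list_source_options "$sample"
```

In this sample only the query-source line is reported. The listen-on line is skipped because listen-on tolerates addresses that appear later, and the wildcard entries are skipped because they do not name a specific address.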

The systemd article [1] describes some issues with network-online.target and why it is not used in the default configuration. We will not change the default named.service or named-chroot.service to listen on explicit non-localhost addresses. If the customer needs an explicit source address, then a dependency on network-online.target must be configured in addition to that configuration change. The default configuration does not need it and will not need it. Configuring a specific outgoing address is not required in common use cases.

Therefore we will not change named.service and named-chroot.service to depend on network-online.target.

I would close this bug unless something additional is reported.

1. https://systemd.io/NETWORK_ONLINE/

Comment 3 Petr Menšík 2023-08-09 09:37:41 UTC
IPv6 related note:

If the host uses SLAAC-assigned addresses and such an address is used in a configuration option such as query-source-v6, then depending on network-online.target does not fix the problem by itself. NetworkManager (NM) can be made to wait for an IPv6 address explicitly with the following command:

# nmcli connection modify $con ipv6.may-fail no

Replace $con with the UUID of the NM connection to be modified, as listed by the "nmcli connection show" command. This causes an unwanted delay if the IPv6 router advertisements stop for any reason, but it ensures that IPv6 connectivity is ready before boot continues. This is not an issue in this customer case, but I think it is worth mentioning here.
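
The command above can be wrapped so that the resulting setting is read back after the change. This is only a sketch: require_ipv6, fake_nmcli and the NMCLI override are hypothetical names introduced here so the commands can be dry-run without touching a real connection, and the UUID is a placeholder.

```shell
# NMCLI is a hypothetical override used only for dry runs; by default
# the real nmcli binary is called.
NMCLI="${NMCLI:-nmcli}"

# require_ipv6 CON
# Mark IPv6 as mandatory for the NM connection CON (a UUID or name from
# "nmcli connection show"), then print the stored ipv6.may-fail value.
require_ipv6() {
    con="$1"
    "$NMCLI" connection modify "$con" ipv6.may-fail no &&
        "$NMCLI" -g ipv6.may-fail connection show "$con"
}

# Stand-in that prints the command line instead of running nmcli.
fake_nmcli() {
    printf 'nmcli %s\n' "$*"
}

# Dry run in a subshell, so the NMCLI override does not leak out.
( NMCLI=fake_nmcli; require_ipv6 00000000-0000-0000-0000-000000000000 )
```

With the real nmcli, the function would be called as root with the actual connection UUID. The change normally takes effect the next time the connection is activated.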

Comment 4 Petr Menšík 2023-08-09 12:19:36 UTC
Created a named-specific support article [1] for this issue. It has been reported multiple times already for different releases, but the primary cause remains the same. If any option other than listen-on or listen-on-v6 specifies an exact address to be used, then the appropriate named service needs to be modified to start after connectivity is obtained.

Obviously, that can be done only on systems that do not rely on the local named being up and working in order to reach network-online.target. In most common cases that is not a problem, but the administrator configuring the system is responsible for ensuring this does not happen.

1. https://access.redhat.com/solutions/7027975