1739441 – "systemctl reload named" doesn't return error if named.conf has wrong configuration

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	bind
Sub Component:
Version:	rawhide
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Petr Menšík
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:	1735787
Blocks:

Reported:	2019-08-09 10:21 UTC by Petr Menšík
Modified:	2019-12-13 01:04 UTC
CC List:	8 users
Fixed In Version:	bind-9.11.12-6.fc32 bind-9.11.13-2.fc31 bind-9.11.13-2.fc30
Clone Of:	1488218
Environment:
Last Closed:	2019-11-29 00:54:13 UTC
Type:	---
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	systemd systemd pull 13098	0	'None'	closed	core: never propagate reload failure to service result	2020-11-23 07:51:09 UTC

Description Petr Menšík 2019-08-09 10:21:03 UTC

+++ This bug was initially created as a clone of Bug #1488218 +++

Description of problem:
kill -HUP PID_OF_NAMED doesn't return error if named.conf has wrong configuration 

Version-Release number of selected component (if applicable):

bind-9.9.4-51

How reproducible:


Steps to Reproduce:
- add an include statement to the bind configuration:
include "myextraconfig";
- do NOT create the "myextraconfig" file
- confirm the broken bind configuration with "named-checkconf"
- run "systemctl reload named"
- or run "kill -HUP <Named PID>"
- check the exit code with "echo $?"; it returns 0 instead of 1

Actual results:
exit code 0

Expected results:
It should return 1

--- Additional comment from Tomáš Hozza 🤓 on 2017-09-05 09:18:59 CEST ---

Hello.

"kill" just sends a signal to a process. It does not check what the process actually does with it. Therefore the fact that kill returns 0 is expected and not a bug.
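
The point above can be checked without named at all. The following is a minimal sketch, assuming a POSIX shell; the background loop is a hypothetical stand-in for a daemon whose reload handler always fails, not named itself:

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon: it traps SIGHUP and its "reload"
# always fails. This is not named; it only illustrates signal delivery.
sh -c 'trap "echo reload failed >&2" HUP; while :; do sleep 1 & wait $!; done' &
pid=$!

# Give the child a moment to install its trap.
sleep 1

# kill reports only whether the signal could be sent, not what the
# receiving process did in its handler.
status=0
kill -HUP "$pid" || status=$?
echo "kill exit status: $status"

# Let the handler run, then clean up the stand-in process.
sleep 1
kill -TERM "$pid" 2>/dev/null || true
```

The exit status printed here is 0 even though the stand-in's handler reports a failure, which is the same behaviour the reporter observed with named.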

--- Additional comment from Ramesh Sahoo on 2017-09-13 18:50:42 CEST ---


If someone runs the reload command, it must return a result, either success or error. If kill -HUP cannot provide that, we should not use it at all. If we do keep it, my suggestion is to replace the reload command with the one below, which definitely returns a status.

named-checkconf > /dev/null 2>&1 && { rndc reload > /dev/null 2>&1 || /bin/kill -HUP $MAINPID; } || exit 1

--- Additional comment from Petr Menšík on 2017-09-15 12:19:16 CEST ---

Hi Ramesh.

The current state is a bit unfortunate. The rndc reload command returns an error when a configuration error is found. rndc configuration is enabled by default, but kill is used in case rndc cannot connect to the server. The fallback cannot tell the difference between a reload that failed because of a configuration error and a failure to connect to the server at all. If you are sure rndc works, and want its failure treated as an error, you can remove " || /bin/kill -HUP $MAINPID" from ExecReload.
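
Why the fallback hides the error can be shown with stand-ins. This is only a sketch: fake_rndc_reload and fake_kill_hup are hypothetical shell functions that mimic the exit statuses described above, they do not call the real rndc or kill:

```shell
#!/bin/sh
# Hypothetical stand-ins, not the real tools: fake_rndc_reload fails with
# status 1 whatever the cause, and fake_kill_hup always succeeds.
fake_rndc_reload() {
    # $1 is the simulated cause: "config-error" or "no-connection"
    echo "rndc: reload failed ($1)" >&2
    return 1
}
fake_kill_hup() {
    return 0
}

# Same shape as the ExecReload fallback: try rndc, else send SIGHUP.
reload_with_fallback() {
    fake_rndc_reload "$1" 2>/dev/null || fake_kill_hup
}

for cause in config-error no-connection; do
    status=0
    reload_with_fallback "$cause" || status=$?
    echo "$cause -> exit status $status"
done
```

Both causes end with exit status 0, because the right-hand side of "||" succeeds in either case and its status replaces the failure from the left-hand side.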

A systemd drop-in file can be used for this.
Create the file /etc/systemd/system/named.service.d/reload.conf with the contents:

; Simply rely on rndc to be properly configured to communicate with local server
[Service]
ExecReload=
ExecReload=/usr/sbin/rndc reload

If used like that, any error in reload is reported by systemctl. Some users may not want to use rndc at all, however.
Alternatively, use named-checkconf the same way as on start: if named-checkconf fails, no reload command or signal is sent.

; More compatible solution
[Service]
ExecReload=
ExecReload=/bin/bash -c 'if [ ! "$DISABLE_ZONE_CHECKING" == "yes" ]; then /usr/sbin/named-checkconf -z /etc/named.conf; else echo "Checking of zone files is disabled"; fi'
ExecReload=/bin/sh -c '/usr/sbin/rndc reload > /dev/null 2>&1 || /bin/kill -HUP $MAINPID'

Run "systemctl daemon-reload" to load the drop-in into the running systemd.
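
Installing such a drop-in could be scripted roughly as follows. This is a sketch only: write_reload_dropin is a hypothetical helper name, it writes the rndc-only variant from this comment, and the example call is a dry run into a temporary directory rather than /etc:

```shell
#!/bin/sh
# Write the rndc-only drop-in into the directory given as $1.
# On a real system that would be /etc/systemd/system/named.service.d
# (root required); any other directory is only a dry run.
write_reload_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/reload.conf" <<'EOF'
; Simply rely on rndc to be properly configured to communicate with local server
[Service]
ExecReload=
ExecReload=/usr/sbin/rndc reload
EOF
}

# Dry run into a temporary directory so nothing under /etc is touched.
tmpdir=$(mktemp -d)
write_reload_dropin "$tmpdir/named.service.d"
cat "$tmpdir/named.service.d/reload.conf"
```

To apply it for real, call the helper as root with /etc/systemd/system/named.service.d, then run "systemctl daemon-reload" followed by "systemctl reload named".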

Some runtime errors are not caught by named-checkconf, however. It would be nice if rndc reload could return different exit codes for a failed operation reported by the server and a failed connection to the server. kill -HUP could then be used only when the connection to the server failed, not when the server had already reported an error on reload. No such feature exists even in the latest development releases, however.

--- Additional comment from Petr Menšík on 2019-07-30 11:34:48 CEST ---

rndc null can be used to verify the connection to named without actually doing anything. This makes it possible to detect that the connection to named failed, for example when the control channel is disabled in the configuration.

; Compatible solution reporting errors on rndc reload failure
[Service]
ExecReload=
ExecReload=/bin/bash -c 'if [ ! "$DISABLE_ZONE_CHECKING" == "yes" ]; then /usr/sbin/named-checkconf -z /etc/named.conf; else echo "Checking of zone files is disabled"; fi'
ExecReload=/bin/sh -c 'if /usr/sbin/rndc null > /dev/null 2>&1 ; then /usr/sbin/rndc reload; else /bin/kill -HUP $MAINPID; fi'

This way, rndc null errors are ignored and kill is used instead. If null succeeds, reload is requested, and failures in that are reported.
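
The three possible outcomes of that ExecReload line can be walked through with stand-ins. A minimal sketch, assuming hypothetical functions whose results are driven by the CHANNEL and CONFIG variables instead of a real server:

```shell
#!/bin/sh
# Hypothetical stand-ins for "rndc null" and "rndc reload"; their outcome
# is driven by the CHANNEL and CONFIG variables, not by a real server.
fake_rndc_null()   { [ "$CHANNEL" = "up" ]; }
fake_rndc_reload() { [ "$CONFIG" = "ok" ]; }
fake_kill_hup()    { return 0; }

# Same shape as the ExecReload line above: probe first, then either
# reload through the control channel or fall back to the signal.
reload_named() {
    if fake_rndc_null; then
        fake_rndc_reload
    else
        fake_kill_hup
    fi
}

report() {
    CHANNEL=$1 CONFIG=$2
    status=0
    reload_named || status=$?
    echo "channel=$CHANNEL config=$CONFIG -> exit status $status"
}

report up ok
report up broken
report down broken
```

Only the "up broken" case ends with a non-zero status. With the channel down, the signal fallback still hides a broken configuration, which matches the limitation discussed in the earlier comments.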

--- Additional comment from Petr Menšík on 2019-08-08 11:22:59 CEST ---

Oh, the proposed solution with if would not work because of a systemd bug [1]: it makes a failing reload terminate the service, which is not desired at all. The return code has to be ignored until this is fixed, which makes it impossible to report a reload failure in any way other than a log message. RHEL 8 bug #1735787 fixed it.

Until that is fixed, it may just log the error into the log file:
# /etc/systemd/system/named.service.d/reload.conf
; vim:set ft=systemd:
[Service]
ExecReload=
ExecReload=-/bin/sh -c 'if /usr/sbin/rndc null > /dev/null 2>&1; then /usr/sbin/rndc reload || true; else /bin/kill -HUP $MAINPID; fi'


1. https://github.com/systemd/systemd/pull/13098

Comment 1 Petr Menšík 2019-08-09 10:42:36 UTC

Partial support for displaying errors from rndc reload is in rawhide. Reporting errors on reload can be done once systemd is fixed. Until then, run rndc reload directly to see the errors.

Comment 2 Ben Cotton 2019-10-31 18:58:09 UTC

This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 3 Petr Menšík 2019-11-20 23:38:57 UTC

Already built on rawhide

Comment 4 Fedora Update System 2019-11-27 12:07:12 UTC

FEDORA-2019-c703d2304a has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-c703d2304a

Comment 5 Fedora Update System 2019-11-27 12:12:57 UTC

FEDORA-2019-73a8737068 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2019-73a8737068

Comment 6 Ben Cotton 2019-11-27 14:17:32 UTC

Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 7 Ben Cotton 2019-11-27 15:02:38 UTC

This bug was accidentally closed due to a query error. Reopening.

Comment 8 Fedora Update System 2019-11-28 01:44:29 UTC

bind-9.11.13-2.fc31, bind-dyndb-ldap-11.2-2.fc31, dnsperf-2.3.2-2.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-73a8737068

Comment 9 Fedora Update System 2019-11-28 02:21:37 UTC

bind-9.11.13-2.fc30, bind-dyndb-ldap-11.1-20.fc30, dhcp-4.3.6-38.fc30, dnsperf-2.3.2-2.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-c703d2304a

Comment 10 Fedora Update System 2019-11-29 00:54:13 UTC

bind-9.11.13-2.fc31, bind-dyndb-ldap-11.2-2.fc31, dnsperf-2.3.2-2.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.

Comment 11 Fedora Update System 2019-12-13 01:04:10 UTC

bind-9.11.13-2.fc30, bind-dyndb-ldap-11.1-20.fc30, dhcp-4.3.6-38.fc30, dnsperf-2.3.2-2.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.
