Description of problem:
"systemctl start corosync-qnetd" returns 0 even if corosync-qnetd failed to start. "systemctl status corosync-qnetd" shows status correctly.

Version-Release number of selected component (if applicable):
corosync-qnetd-2.4.0-4.el7

How reproducible:
always, easily

Steps to Reproduce:
[root@rh72-node3:~]# ls -l /etc/corosync/qdevice/net/
total 0
[root@rh72-node3:~]# systemctl start corosync-qnetd.service
[root@rh72-node3:~]# echo $?
0
[root@rh72-node3:~]# systemctl status corosync-qnetd.service
● corosync-qnetd.service - Corosync Qdevice Network daemon
   Loaded: loaded (/usr/lib/systemd/system/corosync-qnetd.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2016-09-16 15:34:01 CEST; 8s ago
  Process: 6602 ExecStart=/usr/bin/corosync-qnetd -f $COROSYNC_QNETD_OPTIONS (code=exited, status=1/FAILURE)
 Main PID: 6602 (code=exited, status=1/FAILURE)

Sep 16 15:34:01 rh72-node3 systemd[1]: Started Corosync Qdevice Network daemon.
Sep 16 15:34:01 rh72-node3 systemd[1]: Starting Corosync Qdevice Network daemon...
Sep 16 15:34:01 rh72-node3 systemd[1]: corosync-qnetd.service: main process exited, code=exited, status=1/FAILURE
Sep 16 15:34:01 rh72-node3 systemd[1]: Unit corosync-qnetd.service entered failed state.
Sep 16 15:34:01 rh72-node3 systemd[1]: corosync-qnetd.service failed.

Actual results:
[root@rh72-node3:~]# systemctl start corosync-qnetd.service
[root@rh72-node3:~]# echo $?
0

Expected results:
[root@rh72-node3:~]# systemctl start corosync-qnetd.service
[root@rh72-node3:~]# echo $?
1
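The problem is caused by the behavior of the "simple" unit type: systemd treats the service as started as soon as the process has been spawned, so a failure that happens right afterwards is not reflected in the exit code of "systemctl start". The solution seems to be to migrate to "notify". It's 7.4 material.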
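To illustrate the difference described in the comment above, here is a minimal Python sketch. It is only an analogy and not systemd code: start_simple, start_forking and FAILING_DAEMON are hypothetical names. It models a manager that reports success right after spawning (roughly what Type=simple does) versus one that waits for the initial process to exit and uses its exit code (roughly what Type=forking does).

```python
import subprocess
import sys

# Stand-in for a daemon that fails immediately on startup (assumption for the demo).
FAILING_DAEMON = [sys.executable, "-c", "import sys; sys.exit(1)"]


def start_simple(cmd):
    """Simple-like start: the job result is decided as soon as the process is spawned."""
    proc = subprocess.Popen(cmd)
    job_result = 0  # reported before the process has done any real work
    exit_status = proc.wait()  # the failure only becomes visible later, in "status"
    return job_result, exit_status


def start_forking(cmd):
    """Forking-like start: wait for the initial process and report its exit code."""
    proc = subprocess.Popen(cmd)
    return proc.wait()


if __name__ == "__main__":
    print("simple-like :", start_simple(FAILING_DAEMON))
    print("forking-like:", start_forking(FAILING_DAEMON))
```

In the simple-like case the caller sees 0 even though the process exited with 1, which matches the behavior reported in this bug.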
Created attachment 1545733 [details]
qnetd: Check existence of NSS DB dir before fork

Previously, when a user tried to start corosync-qnetd without an initialized NSS database, a generic (not very helpful and misleading) NSS error was logged: "NSS error (-8015): The certificate/key database is in an old, unsupported format.".

The solution is to check whether it's possible to open the NSS DB directory and to display the (usually much more informative) result of the strerror function. The check is called before fork, so the init system can return an error code during start.

To make error reporting work with systemd, it's also necessary to change the unit type from simple to forking.

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
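The idea in the commit message above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual patch (corosync-qnetd is written in C): check_nss_db_dir, start_daemon and main_loop are hypothetical names. The point is only that the directory check runs before fork, so the initial process can still exit with a non-zero status.

```python
import os
import sys


def check_nss_db_dir(path):
    """Try to open the directory; return (0, None) or (errno, readable message)."""
    try:
        with os.scandir(path):
            pass
    except OSError as err:
        message = "Can't open NSS DB directory (%d): %s" % (err.errno, err.strerror)
        return err.errno, message
    return 0, None


def start_daemon(nss_db_dir, main_loop):
    """Run the check before forking so the caller gets a meaningful exit code."""
    code, message = check_nss_db_dir(nss_db_dir)
    if code != 0:
        print(message, file=sys.stderr)
        return 1  # initial process fails; nothing has been forked yet
    pid = os.fork()
    if pid > 0:
        return 0  # parent: startup checks passed, report success
    main_loop()  # child: failures from here on happen after fork
    os._exit(0)
```

As the QA note in this report says, only failures detected before the fork can influence the exit code seen by the init system; anything that goes wrong inside main_loop behaves as before.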
For QA:
The patch solves only the described scenario, where the NSS DB doesn't exist (which is probably the most common reason for failure). All other "failures" are handled after fork, so the behavior is the same as before.

What I've tested:
# ls -la /etc/corosync/qnetd
ls: cannot access /etc/corosync/qnetd: No such file or directory
# systemctl start corosync-qnetd; echo $?
Job for corosync-qnetd.service failed because the control process exited with error code. See "systemctl status corosync-qnetd.service" and "journalctl -xe" for details.
1
# journalctl _COMM=corosync-qnetd
* corosync-qnetd[*]: Can't open NSS DB directory (2): No such file or directory

After creating the CA:
# systemctl start corosync-qnetd; echo $?
0
# systemctl status corosync-qnetd; echo $?
● corosync-qnetd.service - Corosync Qdevice Network daemon
   Loaded: loaded (/usr/lib/systemd/system/corosync-qnetd.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-03-19 16:51:27 CET; 2s ago
     Docs: man:corosync-qnetd
  Process: 17632 ExecStart=/usr/bin/corosync-qnetd $COROSYNC_QNETD_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 17633 (corosync-qnetd)
...
0

Please note that the error message is now (hopefully) more understandable: "No such file or directory" vs "NSS error (-8015): The certificate/key database is in an old, unsupported format.".
Created attachment 1546365 [details]
Use RuntimeDirectory instead of tmpfiles.d

This reverts part of commit 32123f6bb2ebc4f9ac7865945cc85a9c9b903dc6.

A simple directive is a much lighter solution to the same problem, and automatically follows the specified User. I copied the 0770 modes from the corresponding init scripts; they could use a little documentation.

Signed-off-by: Ferenc Wágner <wferi>
Reviewed-by: Jan Friesse <jfriesse>
(cherry picked from commit c733e9417ef1d2f31268e9b6f99a8fc7712fcea7)
Created attachment 1546367 [details]
configure: add --with-initconfigdir option

Default value is /etc/sysconfig and resulting INITCONFIGDIR is used to reduce duplication in init system integration code.

Signed-off-by: Ferenc Wágner <wferi>
Reviewed-by: Jan Friesse <jfriesse>
(cherry picked from commit d7208e88370d2bce40b45224a3971eeb68c22d3c)
BEFORE (corosync-qnetd-2.4.0-4.el7)
======

## Check that the NSS DB doesn't exist

[root@host-027 ~]# ls -l /etc/corosync/qnetd
ls: cannot access /etc/corosync/qnetd: No such file or directory

## Start the service without the NSS DB, which should fail

[root@host-027 ~]# systemctl start corosync-qnetd.service
[root@host-027 ~]# echo $?
0

## systemctl shows the status correctly

[root@host-027 ~]# systemctl status corosync-qnetd.service
● corosync-qnetd.service - Corosync Qdevice Network daemon
   Loaded: loaded (/usr/lib/systemd/system/corosync-qnetd.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2019-06-07 01:58:57 CDT; 2s ago
  Process: 8941 ExecStart=/usr/bin/corosync-qnetd -f $COROSYNC_QNETD_OPTIONS (code=exited, status=1/FAILURE)
 Main PID: 8941 (code=exited, status=1/FAILURE)

Jun 07 01:58:57 host-027.virt.lab.msp.redhat.com systemd[1]: Started Corosync Qdevice Network daemon.
Jun 07 01:58:57 host-027.virt.lab.msp.redhat.com systemd[1]: corosync-qnetd.service: main process ex...RE
Jun 07 01:58:57 host-027.virt.lab.msp.redhat.com systemd[1]: Unit corosync-qnetd.service entered fai...e.
Jun 07 01:58:57 host-027.virt.lab.msp.redhat.com systemd[1]: corosync-qnetd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

AFTER (corosync-qnetd-2.4.3-6.el7)
=====

## The NSS DB doesn't exist

[root@host-027 ~]# ls -l /etc/corosync/qnetd
ls: cannot access /etc/corosync/qnetd: No such file or directory

## systemctl start now fails, as it should

[root@host-027 ~]# systemctl start corosync-qnetd.service
Job for corosync-qnetd.service failed because the control process exited with error code. See "systemctl status corosync-qnetd.service" and "journalctl -xe" for details.
[root@host-027 ~]# echo $?
1

[root@host-027 ~]# journalctl _COMM=corosync-qnetd
...
* corosync-qnetd[9809]: Can't open NSS DB directory (2): No such file or directory
...
RESULT
======

After the fix, the corosync-qnetd service returns an error status when it is started without an NSS DB.

Note: Both versions work fine when the NSS DB has been created:
# corosync-qnetd-certutil -i

Verified for version corosync-2.4.3-6.el7
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2245