Bug 1149916
Summary: | 'systemctl start corosync.service' takes 60 seconds to time out when there is no corosync.conf file | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> | ||||
Component: | corosync | Assignee: | Jan Friesse <jfriesse> | ||||
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | ||||
Severity: | low | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 7.1 | CC: | ccaulfie, cfeist, cluster-maint, cmarthal, jkortus | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | corosync-2.3.4-3.el7 | Doc Type: | Bug Fix | ||||
Doc Text: |
Cause:
The user starts the corosync systemd unit with an invalid (or missing) corosync.conf.
Consequence:
The unit may take a very long time to fail (1 minute with the default configuration).
Fix:
The unit file calls the init script. The init script now checks the return code from executing corosync. If the return code is non-zero, the init script fails and does not try to wait for an IPC connection with the (never started) corosync.
Result:
The time for the systemd unit to fail is reduced to a few seconds.
|
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2015-03-05 08:27:20 UTC | Type: | Bug | ||||
Attachments: |
|
Description
Corey Marthaler
2014-10-06 22:54:28 UTC
One note: we tried running the following command directly:

systemctl start corosync.service

It also took 60s before the service failed.

'pcs cluster start' just runs 'systemctl start corosync.service', checks the output, and, if it succeeds, runs 'systemctl start pacemaker.service'.

The unit takes a really long time to fail. I just noticed it yesterday when I manually edited corosync.conf, made a syntax error there, and started corosync. The systemctl call should fail much sooner; no idea what it is waiting for :).

Created attachment 944647
Proposed patch
The init script now checks the return code of the corosync command. If it
fails, the ipc_wait section is skipped, resulting in a much faster failure
of the init script.
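For illustration, the logic described above can be sketched roughly as follows. This is a minimal sketch, not the attached patch: the names start, wait_for_ipc, ipc_ready, DAEMON and IPC_WAIT_TRIES are hypothetical, and ipc_ready is only a placeholder for whatever IPC check the real init script performs.

```shell
#!/bin/sh
# Hypothetical sketch of the patched start logic. All names here are
# illustrative and are NOT copied from the shipped corosync init script.

: "${DAEMON:=corosync}"        # command to launch (overridable for testing)
: "${IPC_WAIT_TRIES:=60}"      # assumed upper bound on 1-second IPC polls

# Placeholder for the real IPC readiness check (assumption).
ipc_ready() {
    return 1
}

# Poll until IPC is ready or the retry budget is used up.
wait_for_ipc() {
    tries=0
    while [ "$tries" -lt "$IPC_WAIT_TRIES" ]; do
        ipc_ready && return 0
        sleep 1
        tries=$((tries + 1))
    done
    return 1
}

start() {
    # Launch the daemon and capture its exit code.
    rc=0
    "$DAEMON" "$@" || rc=$?

    # Key change: if the daemon itself failed (for example because
    # corosync.conf is missing or invalid), fail right away instead of
    # waiting for an IPC connection that can never come up.
    if [ "$rc" -ne 0 ]; then
        echo "Starting $DAEMON failed (exit code $rc); skipping IPC wait" >&2
        return "$rc"
    fi

    wait_for_ipc
}
```

Without the return-code check, a failed launch would fall through to the polling loop and only give up after the full retry budget, which would match the roughly 60-second delay reported above.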
Test results (with patched init script):

- Missing corosync.conf:

# time systemctl start corosync
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.

real 0m0.138s
user 0m0.005s
sys 0m0.009s

- Syntax error (corosync version set to 22):

# time systemctl start corosync
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.

real 0m0.172s
user 0m0.009s
sys 0m0.007s

- Correct corosync config file:

# time systemctl start corosync

real 0m0.748s
user 0m0.008s
sys 0m0.008s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0365.html