Bug 1149916 - 'systemctl start corosync.service' takes 60 seconds to time out when there is no corosync.conf file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: corosync
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Jan Friesse
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-10-06 22:54 UTC by Corey Marthaler
Modified: 2015-03-05 08:27 UTC
CC List: 5 users

Fixed In Version: corosync-2.3.4-3.el7
Doc Type: Bug Fix
Doc Text:
Cause: The user starts the corosync systemd unit with an invalid or missing corosync.conf.
Consequence: The unit may take a very long time to fail (1 minute with the default configuration).
Fix: The unit file calls the init script. The init script now checks the return code of the corosync executable. If the return code is non-zero, the init script fails immediately and does not wait for an IPC connection with the corosync process that never started.
Result: The time for the systemd unit to fail is reduced to a few seconds.
Clone Of:
Environment:
Last Closed: 2015-03-05 08:27:20 UTC
Target Upstream Version:
Embargoed:


Attachments
Proposed patch (916 bytes, patch)
2014-10-07 15:53 UTC, Jan Friesse


Links
System: Red Hat Product Errata
ID: RHBA-2015:0365
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: corosync bug fix and enhancement update
Last Updated: 2015-03-05 12:51:37 UTC

Description Corey Marthaler 2014-10-06 22:54:28 UTC
Description of problem:
I forgot to set up a cluster on these nodes and instead just ran 'pcs cluster start'.
[root@host-111 ~]# time  pcs cluster start
Starting Cluster...
Redirecting to /bin/systemctl start  corosync.service
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.

Error: unable to start corosync

real    1m0.696s
user    0m0.106s
sys     0m0.039s


Oct  6 17:43:26 host-111 systemd: Starting Corosync Cluster Engine...
Oct  6 17:43:26 host-111 corosync[2340]: [MAIN  ] Can't read file /etc/corosync/corosync.conf reason = (No such file or directory)
Oct  6 17:43:26 host-111 corosync[2340]: [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1273.
Oct  6 17:44:26 host-111 corosync: Starting Corosync Cluster Engine (corosync): [FAILED]^M[  OK  ]
Oct  6 17:44:26 host-111 systemd: corosync.service: control process exited, code=exited status=1
Oct  6 17:44:26 host-111 systemd: Failed to start Corosync Cluster Engine.
Oct  6 17:44:26 host-111 systemd: Unit corosync.service entered failed state.




Version-Release number of selected component (if applicable):
[root@host-111 ~]# rpm -qi corosync
Name        : corosync
Version     : 2.3.4
Release     : 2.el7
Architecture: x86_64
Install Date: Mon 06 Oct 2014 11:45:38 AM CDT
Group       : System Environment/Base
Size        : 472938
License     : BSD
Signature   : (none)
Source RPM  : corosync-2.3.4-2.el7.src.rpm
Build Date  : Fri 12 Sep 2014 10:59:11 AM CDT
Build Host  : x86-030.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : http://www.corosync.org/
Summary     : The Corosync Cluster Engine and Application Programming Interfaces

Comment 1 Chris Feist 2014-10-07 00:35:34 UTC
One note, we tried running the following command directly:

systemctl start corosync.service

And it also took 60s before the service failed.

'pcs cluster start' just runs 'systemctl start corosync.service', checks the output and if it succeeds runs 'systemctl start pacemaker.service'.
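
A rough sketch of that sequence, for illustration only. pcs is not implemented as this shell function; `cluster_start` and the `SYSTEMCTL` override variable are hypothetical names, and the sketch checks the exit status rather than parsing output, which is an assumption. The override exists so the flow can be tried with a stub instead of touching real services.

```shell
#!/bin/sh
# Hypothetical sketch of the start sequence described above.
# SYSTEMCTL can be pointed at a stub (e.g. "true" or "false") for dry runs.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

cluster_start() {
    # Start corosync first; pacemaker is only attempted if that succeeds.
    if ! $SYSTEMCTL start corosync.service; then
        echo "Error: unable to start corosync"
        return 1
    fi
    $SYSTEMCTL start pacemaker.service
}
```

Because the second service is only started after the first call returns, any delay in 'systemctl start corosync.service' failing is passed straight through to the caller.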

Comment 2 Jaroslav Kortus 2014-10-07 14:20:31 UTC
The unit takes a really long time to fail. I noticed it yesterday when I manually edited corosync.conf, made a syntax error there, and started corosync. The systemctl call should fail much sooner; no idea what it is waiting for :).

Comment 3 Jan Friesse 2014-10-07 15:53:18 UTC
Created attachment 944647
Proposed patch

The init script now checks the return code of the corosync command. If the
command fails, the ipc_wait section is skipped, so the init script fails
much faster.
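
The idea can be sketched roughly as follows. This is a minimal illustration and not the attached patch: `start_daemon`, `wait_for_ipc`, and `DAEMON_CMD` are hypothetical names, and the real IPC wait loop is replaced by a placeholder.

```shell
#!/bin/sh
# Hypothetical sketch: skip the IPC wait when the daemon fails to launch.
# DAEMON_CMD stands in for the real corosync invocation (assumption).
DAEMON_CMD="${DAEMON_CMD:-false}"

wait_for_ipc() {
    # Placeholder for the real wait loop; it only reports that it ran.
    echo "waiting for IPC"
    return 0
}

start_daemon() {
    rc=0
    $DAEMON_CMD || rc=$?
    if [ "$rc" -ne 0 ]; then
        # The daemon never started, so waiting for IPC could only time out.
        echo "start failed with status $rc"
        return 1
    fi
    wait_for_ipc
}
```

Without the status check, the wait step would run even though nothing is listening, which matches the 60-second delay seen in the original report.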

Comment 5 Jan Friesse 2014-10-08 11:33:45 UTC
Test results (with patched init script):
- missing corosync.conf:

# time systemctl start corosync 
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.

real	0m0.138s
user	0m0.005s
sys	0m0.009s

- Syntax error (corosync version set to 22)
# time systemctl start corosync 
Job for corosync.service failed. See 'systemctl status corosync.service' and 'journalctl -xn' for details.

real	0m0.172s
user	0m0.009s
sys	0m0.007s

- Correct corosync config file
# time systemctl start corosync 

real	0m0.748s
user	0m0.008s
sys	0m0.008s
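
A check like the one above could be scripted for regression testing. The helper below is a hedged sketch: `fails_within` is a hypothetical name, and it measures elapsed time with whole-second resolution via `date +%s`, so it is only suitable for coarse limits of a few seconds.

```shell
#!/bin/sh
# fails_within LIMIT CMD...
# Returns 0 if CMD exits non-zero in under LIMIT seconds,
# 1 if it failed but took too long, 2 if it unexpectedly succeeded.
fails_within() {
    limit=$1
    shift
    start=$(date +%s)
    if "$@" >/dev/null 2>&1; then
        return 2
    fi
    end=$(date +%s)
    [ $((end - start)) -lt "$limit" ]
}
```

For example, on a node with no corosync.conf, `fails_within 5 systemctl start corosync` would be expected to return 0 with the patched init script, assuming the timings shown in the results above.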

Comment 9 errata-xmlrpc 2015-03-05 08:27:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0365.html

