Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets there.

Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against the components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED".

If you cannot log in to RH Jira, please consult article #7032570. Failing that, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user-management inquiry. The e-mail creates a ServiceNow ticket with Red Hat.

Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and have "MigratedToJIRA" set in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of the form "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). The same link will be available in a blue banner at the top of the page informing you that the bug has been migrated.

Bug 1165423

Summary: Pacemaker doesn't wait for systemd resource stop
Product: Red Hat Enterprise Linux 7
Reporter: Mitchell <mitchellbosecke>
Component: pacemaker
Assignee: Andrew Beekhof <abeekhof>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 7.0
CC: abeekhof, cfeist, cluster-maint, cluster-qe, djansa, dvossel, fdinitto, jeckersb, jherrman, mnovacek, mspqa-list, ovasik, rohara, tlavigne, ykawada
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pacemaker-1.1.12-13.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1111747
Environment:
Last Closed: 2015-03-05 10:00:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mitchell 2014-11-18 23:13:44 UTC
Exactly like the original bug this one was cloned from, except that pacemaker does not wait for the systemd resource to fully stop before returning.

+++ This bug was initially created as a clone of Bug #1111747 +++

Description of problem:
It looks like pacemaker isn't waiting for the systemd resource to start and always assumes the start has succeeded.

Version-Release number of selected component (if applicable):
pacemaker-1.1.10-31.el7_0.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a service file (or edit an existing one). I'm currently using a mysqld systemd unit file, but anything would work; it looks like this (I've just used /usr/bin/sleep 20 for the command):

[Unit]
Description=MariaDB database server
After=syslog.target
After=network.target

[Service]
Type=forking
User=mysql
Group=mysql

ExecStart=/usr/bin/sleep 20

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

# Place temp files in a secure directory, not /tmp
PrivateTmp=true

[Install]
WantedBy=multi-user.target


2. systemctl start <servicename>
3. crm_resource -r mysqld --force-start -V

Actual results:
systemctl start <servicename> takes about 20 seconds to run
crm_resource -r mysqld --force-start -V immediately returns

Expected results:
crm_resource -r mysqld --force-start should also take 20 seconds to run
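The mismatch between the two commands can be demonstrated with a small timing harness (illustrative only; `sleep` stands in for the slow service here, and the real systemctl/crm_resource comparison is shown only as comments):

```python
import subprocess
import time

def timed_run(cmd):
    """Run cmd and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    rc = subprocess.run(cmd).returncode
    return rc, time.monotonic() - start

# On an affected node one would compare, e.g.:
#   timed_run(["systemctl", "start", "mysqld"])                   # ~20 s
#   timed_run(["crm_resource", "-r", "mysqld", "--force-start"])  # near-instant (the bug)
# Here plain `sleep` stands in for the slow service:
rc, elapsed = timed_run(["sleep", "0.2"])
print(rc, elapsed >= 0.2)  # prints: 0 True
```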

Additional info:
This was first discovered when a user attempted to start the mysqld systemd unit through pacemaker. Pacemaker would start it and immediately think the start had succeeded (even though the service took about 40 seconds to come up), then run a monitor operation which noticed the service wasn't up, then try to start it again, which also immediately "succeeded", then monitor again, and so on.

--- Additional comment from David Vossel on 2014-06-20 18:48:42 EDT ---

I've experienced weird behavior like this as well. We are fighting two systemd-related issues in pacemaker.

1. Most of the time, people aren't taking into account that these systemd unit files are being used in an HA environment. With HA, we execute start and then usually execute the first monitor operation ("systemctl status") immediately afterwards. If the unit file allows any gap of time between starting and the status being detectable, that unit file will look to pacemaker as if it is failing.

2. Something about how we are using the systemd D-Bus API isn't waiting for the 'ExecStartPost' command in a unit file to complete before considering the start operation complete. Apparently (according to what Chris is reporting) there may be other issues here as well, where we don't even wait for 'ExecStart' to complete.


Moving forward, we need to fix whatever is causing us to return too early during the start operation (item 2), and as part of this fix we should figure out a way to make everyone else's broken systemd scripts work (item 1). Perhaps we should internally execute start on the service and not consider the service started until we can successfully execute status on it. That way we are guaranteed that we can monitor the service successfully before we say the start operation has completed.

-- Vossel
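A minimal sketch of the start-then-verify approach proposed above (illustrative only; the real fix lives in pacemaker's C code and its systemd D-Bus layer, and the callables here are injected stand-ins for StartUnit and "systemctl status"):

```python
import time

def start_and_verify(start, status, timeout=300.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Start a service, then poll status until it reports running.

    `start` and `status` are injected callables. The start operation
    is not reported as complete until a status check succeeds, so any
    subsequent monitor op is guaranteed to be able to see the service.
    """
    start()
    deadline = clock() + timeout
    while clock() < deadline:
        if status():
            return True   # service is demonstrably up and monitorable
        sleep(interval)
    return False          # never became monitorable: treat start as failed

# Example: a fake service that only reports running on the third poll.
polls = {"n": 0}
def fake_start():
    pass
def fake_status():
    polls["n"] += 1
    return polls["n"] >= 3

assert start_and_verify(fake_start, fake_status, timeout=5, interval=0)
```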

--- Additional comment from John Eckersberg on 2014-06-23 09:42:16 EDT ---

This is also happening with services of Type=notify.  Here's a minimal service that sleeps for 5 seconds and then notifies readiness, to demonstrate:

# cat /etc/systemd/system/notify-test.service
[Unit]
Description=notify test

[Service]
Type=notify
ExecStart=/usr/bin/python -c 'import time; import systemd.daemon; time.sleep(5); systemd.daemon.notify("READY=1"); time.sleep(3600)'

As expected, it takes 5 seconds to start:

# time systemctl start notify-test.service

real    0m5.404s
user    0m0.059s
sys     0m0.010s

But it is considered started immediately by pacemaker, as described in the original report:

# time crm_resource -r notify-test --force-start -V
Operation start for notify-test (systemd::notify-test) returned 0

real    0m0.336s
user    0m0.006s
sys     0m0.005s

The result is pacemaker repeatedly and rapidly trying to restart the service when it is a managed resource.  Note the frequency of restarts when I follow the journal and then enable the service in pacemaker:

# journalctl -f -u notify-test &
[2] 3166
# pcs resource enable notify-test
Jun 23 09:38:24 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:25 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:25 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:25 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:26 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:26 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:26 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:26 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:26 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:26 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:27 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:27 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:27 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:27 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:27 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:28 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:28 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:28 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:28 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:29 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:29 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:29 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:30 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:30 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:30 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:30 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:30 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:31 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:31 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:31 control1.example.org systemd[1]: Stopped notify test.
Jun 23 09:38:32 control1.example.org systemd[1]: Starting notify test...
Jun 23 09:38:32 control1.example.org systemd[1]: Stopping notify test...
Jun 23 09:38:32 control1.example.org systemd[1]: Stopped notify test.
...etc...
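For context, the readiness handshake that Type=notify relies on can be sketched with a plain unix datagram socket (illustrative only; real services use systemd.daemon.notify / sd_notify(3), and systemd itself provides the listener behind $NOTIFY_SOCKET):

```python
import os
import socket
import tempfile

sock_path = os.path.join(tempfile.mkdtemp(), "notify.sock")

# systemd's side: listen for readiness datagrams on the notify socket
manager = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
manager.bind(sock_path)

# the service's side: roughly what systemd.daemon.notify("READY=1")
# does under the hood -- send the state string as a single datagram
def sd_notify(state, notify_socket):
    s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    s.sendto(state.encode(), notify_socket)
    s.close()

sd_notify("READY=1", sock_path)
msg = manager.recv(1024).decode()
print(msg)  # prints: READY=1 -- only now does `systemctl start` return
```

Pacemaker's bug was effectively skipping this wait: it treated the D-Bus start call's return as success without waiting for the unit to reach its ready state.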

--- Additional comment from Andrew Beekhof on 2014-06-25 08:52:32 EDT ---

We're returning early because the systemd API call returns early.
The rules for every other kind of resource are "don't return until you're done".
I guess systemd is special, so we're going to have to hack our way around this.

--- Additional comment from Andrew Beekhof on 2014-06-26 06:03:39 EDT ---

IMHO, we can just use start-delay for this in 7.0 and fix properly for 7.1

--- Additional comment from Andrew Beekhof on 2014-06-27 00:07:03 EDT ---

Fixed upstream:

https://github.com/beekhof/pacemaker/commit/3bd6c30

`crm_resource --force-start` will still return immediately though, as it bypasses the cluster completely and goes direct to the underlying resource.

--- Additional comment from David Vossel on 2014-06-30 09:54:04 EDT ---

(In reply to Andrew Beekhof from comment #5)
> IMHO, we can just use start-delay for this in 7.0 and fix properly for 7.1

There are limited scenarios where start-delay is safe.  Typically, start-delay is used on the recurring monitor operations to keep them from attempting to monitor a resource that may not be completely up yet.  It is less well understood that you'd also have to use start-delay on the 'start' operation of any resources ordered to start after the resource that has trouble starting correctly.

Example:
A resource takes 40 seconds to start, but the actual start op returns immediately.  If there is an ordering constraint 'start delayed_rsc then start whatever_rsc', the 'whatever_rsc' resource would need the start-delay option set on its 'start' operation to account for the issues 'delayed_rsc' has.

Outside of a few simple use-cases, start-delay can produce a complex and fragile config that may prove to be unreliable.

-- Vossel
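To make the example above concrete, a toy timeline model (the resource names and the 40-second figure come from the example; the model itself is hypothetical):

```python
def dependent_start_time(dependency_start_returns_at=0.0,
                         dependent_start_delay=0.0):
    """When the cluster fires the dependent resource's start op.

    With the constraint 'start delayed_rsc then start whatever_rsc',
    the cluster starts whatever_rsc as soon as delayed_rsc's start op
    returns, plus any start-delay configured on whatever_rsc's own
    start operation.
    """
    return dependency_start_returns_at + dependent_start_delay

DEPENDENCY_ACTUALLY_READY_AT = 40.0  # delayed_rsc really needs 40 s

# With the buggy early return (start op returns at t=0) and no
# start-delay, whatever_rsc starts while delayed_rsc is still down:
assert dependent_start_time() < DEPENDENCY_ACTUALLY_READY_AT

# Masking the bug requires start-delay on the dependent's *start* op
# too, not just on monitors -- the fragile config being warned about:
assert dependent_start_time(dependent_start_delay=40.0) >= DEPENDENCY_ACTUALLY_READY_AT
```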

--- Additional comment from David Vossel on 2014-06-30 10:02:22 EDT ---

(In reply to Andrew Beekhof from comment #6)
> Fixed upstream:
> 
> https://github.com/beekhof/pacemaker/commit/3bd6c30

patch looks good.  We could follow up with adding a 'sleep' to the lrmd regression test's sample systemd unit file so we can be certain this new logic is exercised.

--- Additional comment from Andrew Beekhof on 2014-06-30 21:31:47 EDT ---

(In reply to David Vossel from comment #8)
> (In reply to Andrew Beekhof from comment #6)
> > Fixed upstream:
> > 
> > https://github.com/beekhof/pacemaker/commit/3bd6c30
> 
> patch looks good.  We could follow up with adding a 'sleep' to the lrmd
> regression test's sample systemd unit file so we can be certain this new
> logic is exercised.

I tested with the service Chris supplied in the description and saw the expected retries in the logs.  So I'm certain it works :)

Comment 2 Andrew Beekhof 2014-11-20 22:38:27 UTC
Sigh. I should have thought of that.

Comment 3 Andrew Beekhof 2014-11-21 00:28:03 UTC
Upstream patch:
   https://github.com/beekhof/pacemaker/commit/a9ec918

qe-ack+ please?

Comment 5 Andrew Beekhof 2015-01-06 06:35:15 UTC
Forgot to update BZ

Comment 7 michal novacek 2015-01-07 16:54:33 UTC
I have verified that with pacemaker-1.1.12-13.el7 the resource is stopped
only after the systemd unit has stopped (not immediately).

----
# cat /etc/systemd/system/notify-test.service 
[Unit]
Description=notify test (5)

[Service]
Type=notify
ExecStart=/usr/bin/python -c 'import systemd.daemon; from time import sleep; systemd.daemon.notify("READY=1"); sleep(3600)'
ExecStop=/usr/bin/python -c 'import systemd.daemon; from time import sleep; sleep(10); systemd.daemon.notify("READY=1")'

# systemctl start notify-test; systemctl status notify-test
notify-test.service - notify test (5)
   Loaded: loaded (/etc/systemd/system/notify-test.service; static)
   Active: active (running) since Wed 2015-01-07 10:49:20 CST; 9ms ago
 Main PID: 3971 (python)
   CGroup: /system.slice/notify-test.service
           └─3971 /usr/bin/python -c import systemd.daemon; from time import sleep; systemd.daemon.notify("READY=1"); sleep(3600)

Jan 07 10:49:20 node-01 systemd[1]: Starting notify test (5)...
Jan 07 10:49:20 node-01 systemd[1]: Started notify test (5).

# time systemctl stop notify-test
real    0m10.059s
user    0m0.001s
sys     0m0.006s

# pcs resource show notify-test
 Resource: notify-test (class=systemd type=notify-test)
  Meta Attrs: target-role=Stopped 
  Operations: monitor interval=60s (notify-test-monitor-interval-60s)

# pcs resource
 notify-test    (systemd:notify-test):  Started 

# pcs resource disable notify-test

# tail /var/log/messages
...
Jan  7 10:51:59 node-01 systemd: Stopping notify test (5)
Jan  7 10:52:09 node-01 systemd: Stopped notify test (5).
Jan  7 10:52:11 node-01 crmd[1116]: notice: process_lrm_event: Operation notify-test_stop_0: ok (node=BOHOe7z-node01, call=9, rc=0, cib-update=16, confirmed=true)
...

Comment 9 errata-xmlrpc 2015-03-05 10:00:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0440.html