Bug 1436696 - pacemaker can detect a systemd unit reload as not running
Summary: pacemaker can detect a systemd unit reload as not running
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 7.4
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-03-28 13:00 UTC by Markus Frosch
Modified: 2017-08-01 17:54 UTC
CC List: 3 users

Fixed In Version: pacemaker-1.1.16-3.el7
Doc Type: Bug Fix
Doc Text:
Cause: If a systemd-based cluster resource were in the "reloading" state when monitored by Pacemaker, Pacemaker would consider that to be a monitor failure.
Consequence: A systemd-based resource could be unnecessarily recovered by the cluster.
Fix: Pacemaker now considers "reloading" to be a successful monitor state for a systemd-based resource.
Result: Reloads of a systemd-based resource do not trigger recovery.
Clone Of:
Environment:
Last Closed: 2017-08-01 17:54:39 UTC
Target Upstream Version:
Embargoed:




Links
Cluster Labs 5307 - Last Updated: 2017-03-28 13:00:46 UTC
Red Hat Product Errata RHEA-2017:1862 - Status: SHIPPED_LIVE - Summary: pacemaker bug fix and enhancement update - Last Updated: 2017-08-01 18:04:15 UTC

Description Markus Frosch 2017-03-28 13:00:47 UTC
Description of problem:

When a reload via systemd and the monitor action of pacemaker run at the same time, pacemaker detects the resource as "not running".

This results in an increased fail count and pacemaker restarting the service.

This has been addressed upstream, but not yet released: https://github.com/ClusterLabs/pacemaker/pull/1191

Is there a chance to include the patch in an upcoming pacemaker update?
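
For context: while a unit's ExecReload command is running, systemd reports the unit's ActiveState as "reloading" rather than "active", which is what a concurrently running monitor observes. A minimal way to see this (an illustrative sketch, assuming a unit whose reload takes several seconds, such as the test unit defined in comment 2 below):

# Start a slow reload in the background, then query the state mid-reload
systemctl reload bz1436696 &
systemctl show bz1436696 -p ActiveState
# While ExecReload is still running, this prints: ActiveState=reloading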

Comment 2 Ken Gaillot 2017-03-28 14:15:04 UTC
Yes, actually the fix is already in the build planned for 7.4.

QA: To test, create a dummy systemd service on all nodes with:

# systemd-python provides systemd.daemon.notify(), used by the Type=notify service
yum install systemd-python
# Note the quoted heredoc delimiter: it prevents the shell from expanding
# $MAINPID while writing the file
cat > /usr/lib/systemd/system/bz1436696.service <<'EOF'
[Unit]
Description=BZ#1436696 Test Unit

[Service]
Type=notify
ExecStart=/usr/bin/python -c 'import time, systemd.daemon; systemd.daemon.notify("READY=1"); time.sleep(86400)'
ExecStop=/bin/sh -c '[ -n "$MAINPID" ] && kill -s KILL $MAINPID'
# The 10-second reload keeps the unit in the "reloading" state longer than
# the 9-second monitor interval configured below
ExecReload=/bin/sh -c 'sleep 10'
EOF
systemctl daemon-reload

Then configure a resource:

# pcs resource create bz1436696test systemd:bz1436696 op monitor interval=9s

Then on the node running the service:

# systemctl reload bz1436696

Before the fix, it should result in a monitor error; after the fix, there should be no error.
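
As an extra check beyond the steps above, the recorded fail count distinguishes the two outcomes; a sketch assuming the pcs fail-count interface shipped with RHEL 7:

# Non-zero (or present) before the fix, zero/absent after the fix
pcs resource failcount show bz1436696test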

Comment 5 michal novacek 2017-05-24 15:30:39 UTC
I have verified that reloading a systemd unit at the time of a monitor does not
cause the monitoring action to fail in pacemaker-1.1.16-9.

-----

1/ Create a new dummy systemd service whose reload action (a sleep) takes
    longer than the monitor interval of the resource [1]

2/ Create a cluster with fencing and the newly created systemd resource, whose
    monitor interval is shorter than the dummy service's reload time [2]

3/ Start the resource 
[root@virt-135 ~]# pcs resource
...
 bz1436696test  (systemd:bz1436696):    Started virt-135

[root@virt-135 ~]# systemctl is-active bz1436696
active


before the patch (pacemaker-1.1.15-11.el7)
==========================================


[root@virt-135 ~]# systemctl reload bz1436696
Job for bz1436696.service canceled.

[root@virt-135 ~]# date
Wed May 24 17:19:15 CEST 2017

[root@virt-135 ~]# grep bz1436696test_monitor_9000 /var/log/cluster/corosync.log
May 24 17:18:35 [8480] virt-135  cib:     info: cib_perform_op: +  /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='bz1436696test']/lrm_rsc_op[@id='bz1436696test_monitor_9000']:  @transition-key=5:18:0:415724a7-08e1-4dea-b809-56372ffe1866, @transition-magic=0:0;5:18:0:415724a7-08e1-4dea-b809-56372ffe1866, @call-id=93, @last-rc-change=1495639115, @exec-time=3, @queue-time=1
> May 24 17:19:20 [8485] virt-135 crmd:     info: process_lrm_event:      Result of monitor operation for bz1436696test on virt-135: 7 (not running) | call=93 key=bz1436696test_monitor_9000 confirmed=false cib-update=93
May 24 17:19:20 [8485] virt-135 crmd:     info: process_lrm_event:      Result of monitor operation for bz1436696test on virt-135: Cancelled | call=93 key=bz1436696test_monitor_9000 confirmed=true


after the patch (pacemaker-1.1.16-9.el7)
========================================

[root@virt-135 ~]# date
Wed May 24 17:26:47 CEST 2017
[root@virt-135 ~]# systemctl reload bz1436696
[root@virt-135 ~]# echo $?
0

[root@virt-135 ~]# grep bz1436696test_monitor /var/log/cluster/corosync.log
May 24 17:25:30 [31737] virt-135 crmd:   notice: te_rsc_command:        Initiating monitor operation bz1436696test_monitor_0 on virt-134 | action 15
May 24 17:25:30 [31732] virt-135  cib:     info: cib_perform_op:        ++                                                                <lrm_rsc_op id="bz1436696test_last_0" operation_key="bz1436696test_monitor_0" operation="monitor" crm-debug-origin="do_update_resource" crm_feature_set="3.0.12" transition-key="15:1:7:ffa92c87-7fe1-4557-bdcb-3b3610c343e7" transition-magic="0:7;15:1:7:ffa92c87-7fe1-4557-bdcb-3b3610c343e7" on_node="virt-134" call-id="30" rc-code="7" op-stat
May 24 17:25:30 [31737] virt-135 crmd:     info: match_graph_event:     Action bz1436696test_monitor_0 (15) confirmed on virt-134 (rc=7)

(no monitor failures after systemctl reload was run)
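
As a further cross-check (not part of the verification above), a one-shot status query should show the resource still started on the original node with no failed actions:

# Expect "Started virt-135" and no failed-action entries for the resource
crm_mon -1 | grep bz1436696test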

---

(1) systemd service
[root@virt-135 ~]# yum -y install systemd-python

[root@virt-135 ~]# systemctl cat bz1436696
# /usr/lib/systemd/system/bz1436696.service
[Unit]
Description=BZ#1436696 Test Unit

[Service]
Type=notify
ExecStart=/usr/bin/python -c 'import time, systemd.daemon; systemd.daemon.notify("READY=1"); time.sleep(86400)'
ExecStop=/bin/sh -c '[ -n "" ] && kill -s KILL '
ExecReload=/bin/sh -c 'sleep 10'

# /run/systemd/system/bz1436696.service.d/50-pacemaker.conf
[Unit]
Description=Cluster Controlled bz1436696
Before=pacemaker.service

[Service]
Restart=no
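
(The 50-pacemaker.conf drop-in above is created at runtime by Pacemaker for any systemd resource it manages: Restart=no leaves all recovery decisions to the cluster rather than systemd, and Before=pacemaker.service orders shutdown so that systemd does not stop the unit until pacemaker.service itself has stopped, letting the cluster stop the resource cleanly first.)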


(2) resource setup
[root@virt-135 ~]# pcs resource
 Clone Set: dlm-clone [dlm]
     Started: [ virt-134 virt-135 virt-136 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ virt-134 virt-135 virt-136 ]
 bz1436696test  (systemd:bz1436696):    Started virt-135

[root@virt-135 ~]# pcs resource --full
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
               start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
 Resource: bz1436696test (class=systemd type=bz1436696)
  Operations: monitor interval=9s (bz1436696test-monitor-interval-9s)
              start interval=0s timeout=100 (bz1436696test-start-interval-0s)
              stop interval=0s timeout=100 (bz1436696test-stop-interval-0s)

Comment 6 errata-xmlrpc 2017-08-01 17:54:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1862

