Bug 1300729

Summary:	Pacemaker failed to start a systemd resource: 'not running'
Product:	Red Hat Enterprise Linux 7
Component:	pacemaker
Version:	7.2
Hardware:	x86_64
OS:	Linux
Status:	CLOSED NOTABUG
Severity:	high
Priority:	unspecified
Target Milestone:	rc
Reporter:	Matti Linnanvuori <mattilinnanvuori>
Assignee:	Ken Gaillot <kgaillot>
QA Contact:	cluster-qe <cluster-qe>
CC:	abeekhof, cluster-maint, jruemker
Doc Type:	Bug Fix
Type:	Bug
Last Closed:	2016-01-21 15:47:32 UTC

Description Matti Linnanvuori 2016-01-21 14:56:38 UTC

Description of problem:

Pacemaker failed to start a systemd resource: 'not running'

Version-Release number of selected component (if applicable):

pacemaker 1.1.13-10.el7

How reproducible:

Not reproducible.

Steps to Reproduce:
1. Create a three-node cluster with a standby node.
2. Add a systemd resource.
3. Watch the resource.

Actual results:

The resource failed to start on an online node.

Expected results:

The resource should start on an online node.

Additional info:

sudo pcs status
Cluster name: MDCS
Last updated: Thu Jan 21 14:00:03 2016		Last change: Thu Jan 21 13:31:31 2016 by hacluster via crm_attribute on tauti
Stack: corosync
Current DC: tauti (version 1.1.13-10.el7-44eb2dd) - partition with quorum
3 nodes and 13 resources configured

Node tauti: standby
Node teema: standby
Online: [ tauko ]

Full list of resources:

 DMS-IP	(ocf::heartbeat:IPaddr2):	Started tauko
 Resource Group: DMS
     apache2	(systemd:httpd):	Started tauko
     DMS-GW	(lsb:dms):	Started tauko
 Resource Group: PMC
     pmc-routing	(systemd:pmc-routing):	Stopped
     pmc-email-amqp-dispatcher	(systemd:pmc-email-amqp-dispatcher):	Stopped
     pmc-email-main	(systemd:pmc-email-main):	Stopped
     pmc-smpp-receive-json	(systemd:pmc-smpp-receive-json):	Stopped
     pmc-smpp-receive-dlr	(systemd:pmc-smpp-receive-dlr):	Stopped
     pmc-smpp-receive-msg	(systemd:pmc-smpp-receive-msg):	Stopped
     postfix	(systemd:postfix):	Stopped
 Resource Group: kannel
     kannel-bearerbox	(systemd:kannel-bearerbox):	Started tauko
     kannel-smsbox	(systemd:kannel-smsbox):	Started tauko
     kannel-wapbox	(systemd:kannel-wapbox):	Started tauko

Failed Actions:
* pmc-routing_start_0 on tauko 'not running' (7): call=60, status=complete, exitreason='none',
    last-rc-change='Wed Jan 20 16:07:59 2016', queued=0ms, exec=2008ms


PCSD Status:
  tauko: Online
  tauti: Online
  teema: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

tauko /var/log/messages:
Jan 20 16:08:01 localhost crmd[1491]:  notice: Operation pmc-routing_start_0: not running (node=tauko, call=60, rc=7, cib-update=32, confirmed=true)

tauti /var/log/messages:
Jan 20 16:07:59 localhost crmd[2145]:  notice: Transition 6 (Complete=20, Pending=0, Fired=0, Skipped=3, Incomplete=25, Source=/var/lib/pacemaker/pengine/pe-error-162.bz2): Stopped
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   DMS-GW#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-routing#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-email-amqp-dispatcher#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-email-main#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-json#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-dlr#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-msg#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   postfix#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   kannel-bearerbox#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   kannel-smsbox#011(tauko)
Jan 20 16:07:59 localhost pengine[2144]:  notice: Start   kannel-wapbox#011(tauko)
Jan 20 16:07:59 localhost crmd[2145]:  notice: Initiating action 12: start DMS-GW_start_0 on tauko
Jan 20 16:07:59 localhost pengine[2144]:  notice: Calculated Transition 7: /var/lib/pacemaker/pengine/pe-input-268.bz2
Jan 20 16:07:59 localhost crmd[2145]:  notice: Initiating action 13: monitor DMS-GW_monitor_60000 on tauko
Jan 20 16:07:59 localhost crmd[2145]:  notice: Initiating action 18: start pmc-routing_start_0 on tauko
Jan 20 16:08:01 localhost crmd[2145]: warning: Action 18 (pmc-routing_start_0) on tauko failed (target: 0 vs. rc: 7): Error
Jan 20 16:08:01 localhost crmd[2145]:  notice: Transition aborted by pmc-routing_start_0 'modify' on tauko: Event failed (magic=0:7;18:7:0:da931aba-558d-4290-a05b-6f5971f308e0, cib=0.290.59, source=match_graph_event:381, 0)
Jan 20 16:08:01 localhost crmd[2145]: warning: Action 18 (pmc-routing_start_0) on tauko failed (target: 0 vs. rc: 7): Error
Jan 20 16:08:01 localhost crmd[2145]:  notice: Transition 7 (Complete=7, Pending=0, Fired=0, Skipped=1, Incomplete=21, Source=/var/lib/pacemaker/pengine/pe-input-268.bz2): Stopped
Jan 20 16:08:01 localhost pengine[2144]: warning: Processing failed op start for pmc-routing on tauko: not running (7)
Jan 20 16:08:01 localhost pengine[2144]: warning: Processing failed op start for pmc-routing on tauko: not running (7)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Recover pmc-routing#011(Started tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   pmc-email-amqp-dispatcher#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   pmc-email-main#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-json#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-dlr#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   pmc-smpp-receive-msg#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   postfix#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   kannel-bearerbox#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   kannel-smsbox#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Start   kannel-wapbox#011(tauko)
Jan 20 16:08:01 localhost pengine[2144]:  notice: Calculated Transition 8: /var/lib/pacemaker/pengine/pe-input-269.bz2
Jan 20 16:08:01 localhost pengine[2144]: warning: Processing failed op start for pmc-routing on tauko: not running (7)
Jan 20 16:08:01 localhost pengine[2144]: warning: Processing failed op start for pmc-routing on tauko: not running (7)
Jan 20 16:08:01 localhost pengine[2144]: warning: Forcing pmc-routing away from tauko after 1000000 failures (max=1000000)
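
For readers skimming the log above, the key lines are the crmd warnings of the form "Action N (...) failed (target: 0 vs. rc: 7)". The following is a minimal Python sketch, not part of the original report, that pulls the operation, node, and return code out of such a line and labels the code with its OCF-style name. The function name parse_failed_action and the regular expression are hypothetical helpers written for this illustration; only the log line itself comes from the report.

```python
import re

# OCF-style return codes that Pacemaker uses when reporting operation results.
# Only the commonly seen codes are listed here.
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Hypothetical pattern matching the crmd warning format seen in this report.
FAILED_ACTION = re.compile(
    r"Action (?P<id>\d+) \((?P<op>\S+)\) on (?P<node>\S+) failed "
    r"\(target: (?P<target>\d+) vs\. rc: (?P<rc>\d+)\)"
)


def parse_failed_action(line):
    """Return a dict describing a failed action, or None if the line does not match."""
    match = FAILED_ACTION.search(line)
    if match is None:
        return None
    rc = int(match.group("rc"))
    return {
        "operation": match.group("op"),
        "node": match.group("node"),
        "expected": int(match.group("target")),
        "rc": rc,
        "meaning": OCF_RC.get(rc, "unknown"),
    }


# Log line copied from the tauti /var/log/messages excerpt above.
line = (
    "Jan 20 16:08:01 localhost crmd[2145]: warning: Action 18 "
    "(pmc-routing_start_0) on tauko failed (target: 0 vs. rc: 7): Error"
)
result = parse_failed_action(line)
print(result)
```

Here the start operation expected return code 0 but received 7, which matches the 'not running' text shown under Failed Actions.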

Comment 2 John Ruemker 2016-01-21 15:47:32 UTC

(In reply to Matti Linnanvuori from comment #0)
> Description of problem:
> 
> Pacemaker failed to start a systemd resource: 'not running'
> 

Hello,
Identifying the true cause of the problem you described, and any available means of resolving it, requires further investigation. That is best done in a support case rather than here in Bugzilla, which is intended for reporting bugs and undesired behaviors in the product itself.

Please engage Red Hat Global Support Services through one of the methods described at:

  https://access.redhat.com/start/how-to-engage-red-hat-support

From there, we'll collect some additional information from you and take a closer look at the specifics of this incident to help you resolve the underlying problem.

To give you some guidance in the meantime: the behavior you described does not look unexpected from Pacemaker itself. It attempted to start the pmc-routing systemd service and the start failed. By default, the cluster property start-failure-is-fatal is set to true, so a single start failure bans the resource from that node and the cluster tries to start it on another node. Since both of the other nodes are in standby, there is nowhere else to start it, so the cluster gives up.
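
The reasoning above can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions and is not how Pacemaker's scheduler is implemented: the function names eligible_nodes and record_start_failure are hypothetical, and the model only captures two rules, namely that standby nodes cannot run resources and that a fatal start failure raises the node's fail count to INFINITY (1000000, the value visible in the "Forcing pmc-routing away from tauko" log line).

```python
# Toy model only: NOT Pacemaker's scheduler, just the reasoning in this comment.
INFINITY = 1000000  # numeric value of INFINITY as it appears in the log


def eligible_nodes(nodes, standby, fail_counts, migration_threshold=INFINITY):
    """Return the nodes still allowed to run the resource.

    A node is excluded if it is in standby or if its fail count has
    reached the migration threshold.
    """
    return [
        node
        for node in nodes
        if node not in standby
        and fail_counts.get(node, 0) < migration_threshold
    ]


def record_start_failure(fail_counts, node, start_failure_is_fatal=True):
    """Record a failed start on `node`.

    With start-failure-is-fatal enabled, one failure counts as INFINITY;
    otherwise the fail count is simply incremented.
    """
    if start_failure_is_fatal:
        fail_counts[node] = INFINITY
    else:
        fail_counts[node] = fail_counts.get(node, 0) + 1
    return fail_counts


# Cluster state taken from the pcs status output in the description.
nodes = ["tauko", "tauti", "teema"]
standby = {"tauti", "teema"}
fail_counts = {}

before = eligible_nodes(nodes, standby, fail_counts)
record_start_failure(fail_counts, "tauko")
after = eligible_nodes(nodes, standby, fail_counts)

print(before, after)  # ['tauko'] []
```

Before the failure, tauko is the only candidate; after it, the list is empty, which is why every member of the PMC group is reported as Stopped.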

The real problem here is that a systemd start of pmc-routing failed.  We'll need to look more closely at why that is and address it.  This is what we can help you with in a support case.

Since there doesn't appear to be anything in this data suggesting that there is a problem in need of a fix in pacemaker, I'm going to close this out.  If we discover in the course of our investigation through a support case that there is some unexpected behavior in one of our products, then Red Hat Global Support Services will coordinate with the developers of that component to look into it and pursue any necessary fixes. 

Regards,
John Ruemker, RHCA
Principal Software Maintenance Engineer
Red Hat Global Support Services