1472928 – On pacemaker remote node stonith is set unconditionally

Summary: On pacemaker remote node stonith is set unconditionally

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	puppet-tripleo
Sub Component:
Version:	11.0 (Ocata)
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	z2
Target Release:	11.0 (Ocata)
Assignee:	Michele Baldessari
QA Contact:	Udi Shkalim
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-07-19 15:43 UTC by Marian Krcmarik
Modified:	2017-09-13 21:43 UTC
CC List:	7 users
Fixed In Version:	puppet-tripleo-6.5.0-6.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-09-13 21:43:17 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1696336	None	None	None	2017-07-19 15:43:39 UTC
OpenStack gerrit	485251	None	None	None	2017-07-20 06:08:27 UTC
Red Hat Product Errata	RHBA-2017:2721	normal	SHIPPED_LIVE	Red Hat OpenStack Platform 11.0 director Bug Fix Advisory	2017-09-14 01:39:22 UTC

Description Marian Krcmarik 2017-07-19 15:43:40 UTC

Description of problem:
We have the following code currently in the tripleo pacemaker_remote manifest:
class tripleo::profile::base::pacemaker_remote (
  $remote_authkey,
  $pcs_tries = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step = hiera('step'),
) {
  class { '::pacemaker::remote':
    remote_authkey => $remote_authkey,
  }
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  class { '::pacemaker::stonith':
    disable => !$enable_fencing_real,
    tries => $pcs_tries,
  }
....

It makes no sense to enforce the stonith property on the remote nodes; we should probably enforce it only on $pacemaker_master. While this works in general, it creates extra CIB changes for nothing, and we did see an issue when working with container HA (due to the remotes not being up yet).
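
As a rough illustration of the suggestion above, the remote profile could simply stop declaring the stonith class and leave the cluster-wide property to the full cluster nodes. This is only a sketch under that assumption and not necessarily the patch that was merged; the parameter list is copied from the snippet above, and the part of the class elided in this report stays elided.

```puppet
# Hypothetical sketch: pacemaker_remote profile without the stonith property.
# Not necessarily the merged fix; parameters are kept as in the report.
class tripleo::profile::base::pacemaker_remote (
  $remote_authkey,
  $pcs_tries      = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step           = hiera('step'),
) {
  # Configure only the pacemaker_remote service on this node.
  class { '::pacemaker::remote':
    remote_authkey => $remote_authkey,
  }

  # Still computed, in case the elided remainder of the class uses it.
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  # No '::pacemaker::stonith' declaration here: stonith-enabled is a
  # cluster-wide property, so setting it from every remote node only
  # produces redundant CIB changes.

  # Remainder of the class is elided in the report and left out here.
}
```

With this shape, Pcmk_property[property--stonith-enabled] is no longer declared on the remote nodes, so the catalog there should not need Exec[wait-for-settle].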

Version-Release number of selected component (if applicable):


How reproducible:
"Always" in recent days

Steps to Reproduce:
1. Deploy OSP11 on pacemaker remote nodes with composable roles of rabbitmq and galera.

Actual results:
Error: Could not find dependency Exec[wait-for-settle] for Pcmk_property[property--stonith-enabled] at /etc/puppet/modules/pacemaker/manifests/property.pp:78

Expected results:
Successful deployment

Additional info:

Comment 6 Udi Shkalim 2017-08-31 09:59:06 UTC

Verified on: puppet-tripleo-6.5.0-8.el7ost.noarch

Deployment passed:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Wed Aug 30 16:30:07 2017
Last change: Wed Aug 30 16:24:51 2017 by root via cibadmin on controller-0

6 nodes configured
34 resources configured

Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ messaging-0 messaging-1 messaging-2 ]

Full list of resources:

 messaging-0    (ocf::pacemaker:remote):        Started controller-0
 messaging-1    (ocf::pacemaker:remote):        Started controller-1
 messaging-2    (ocf::pacemaker:remote):        Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ messaging-0 messaging-1 messaging-2 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-2 ]
     Slaves: [ controller-0 controller-1 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 ip-192.168.24.12       (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.107  (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):       Started controller-2
 ip-172.17.1.14 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-172.17.3.18 (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Comment 8 errata-xmlrpc 2017-09-13 21:43:17 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2721
