Bug 1472928 - On pacemaker remote node stonith is set unconditionally
Summary: On pacemaker remote node stonith is set unconditionally
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z2
Target Release: 11.0 (Ocata)
Assignee: Michele Baldessari
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-19 15:43 UTC by Marian Krcmarik
Modified: 2017-09-13 21:43 UTC (History)

Fixed In Version: puppet-tripleo-6.5.0-6.el7ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-13 21:43:17 UTC
Target Upstream Version:
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Launchpad 1696336 | 0 | None | None | None | 2017-07-19 15:43:39 UTC
OpenStack gerrit 485251 | 0 | None | None | None | 2017-07-20 06:08:27 UTC
Red Hat Product Errata RHBA-2017:2721 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 11.0 director Bug Fix Advisory | 2017-09-14 01:39:22 UTC

Description Marian Krcmarik 2017-07-19 15:43:40 UTC
Description of problem:
We have the following code currently in the tripleo pacemaker_remote manifest:
class tripleo::profile::base::pacemaker_remote (
  $remote_authkey,
  $pcs_tries = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step = hiera('step'),
) {
  class { '::pacemaker::remote':
    remote_authkey => $remote_authkey,
  }
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  class { '::pacemaker::stonith':
    disable => !$enable_fencing_real,
    tries => $pcs_tries,
  }
....

It makes no sense to enforce the stonith property on the remote nodes; we should probably just enforce
it on $pacemaker_master. While the current code works in general, it creates extra CIB changes for nothing, and we did see an issue when working with container HA (because the remotes were not up yet).
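
One way to read the suggestion above is sketched below. It is an illustration only, not the change that was merged for this bug. The idea is to drop the '::pacemaker::stonith' declaration from the remote profile and declare it only on the node that bootstraps the cluster. The class name example::stonith_on_master and the $bootstrap_node parameter are hypothetical and exist only for this sketch; the hiera lookups and the $enable_fencing_real expression are copied from the manifest quoted above.

```puppet
# Hypothetical wrapper class, not part of puppet-tripleo.
class example::stonith_on_master (
  $bootstrap_node,                                  # hypothetical: short hostname of the bootstrap node
  $pcs_tries      = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step           = hiera('step'),
) {
  # Assumption: only one full cluster member matches $bootstrap_node,
  # and pacemaker remote nodes never do.
  $pacemaker_master = ($::hostname == downcase($bootstrap_node))

  # Same condition as in the manifest quoted above.
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  # Declare the stonith class (and with it the stonith-enabled
  # property) only on the bootstrap node.
  if $pacemaker_master {
    class { '::pacemaker::stonith':
      disable => !$enable_fencing_real,
      tries   => $pcs_tries,
    }
  }
}
```

Under these assumptions, remote nodes would no longer declare Pcmk_property[property--stonith-enabled] at all, so they would neither produce extra CIB changes nor need the Exec[wait-for-settle] dependency named in the error below.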

Version-Release number of selected component (if applicable):


How reproducible:
"Always" in recent days

Steps to Reproduce:
1. Deploy OSP11 on pacemaker remote nodes with composable roles of rabbitmq and galera.

Actual results:
Error: Could not find dependency Exec[wait-for-settle] for Pcmk_property[property--stonith-enabled] at /etc/puppet/modules/pacemaker/manifests/property.pp:78

Expected results:
Successful deployment

Additional info:

Comment 6 Udi Shkalim 2017-08-31 09:59:06 UTC
verified on: puppet-tripleo-6.5.0-8.el7ost.noarch

Deployment passed:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Wed Aug 30 16:30:07 2017
Last change: Wed Aug 30 16:24:51 2017 by root via cibadmin on controller-0

6 nodes configured
34 resources configured

Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ messaging-0 messaging-1 messaging-2 ]

Full list of resources:

 messaging-0    (ocf::pacemaker:remote):        Started controller-0
 messaging-1    (ocf::pacemaker:remote):        Started controller-1
 messaging-2    (ocf::pacemaker:remote):        Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ messaging-0 messaging-1 messaging-2 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-2 ]
     Slaves: [ controller-0 controller-1 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 ip-192.168.24.12       (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.107  (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):       Started controller-2
 ip-172.17.1.14 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-172.17.3.18 (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Comment 8 errata-xmlrpc 2017-09-13 21:43:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2721

