Bug 1472928 - On pacemaker remote node stonith is set unconditionally
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 11.0 (Ocata)
Hardware: Unspecified OS: Unspecified
Priority: high Severity: high
Target Milestone: z2
Target Release: 11.0 (Ocata)
Assigned To: Michele Baldessari
QA Contact: Udi Shkalim
Keywords: Triaged, ZStream
Depends On:
Blocks:
Reported: 2017-07-19 11:43 EDT by Marian Krcmarik
Modified: 2017-09-13 17:43 EDT (History)

See Also:
Fixed In Version: puppet-tripleo-6.5.0-6.el7ost
Last Closed: 2017-09-13 17:43:17 EDT
Type: Bug

External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1696336  None      None    None     2017-07-19 11:43 EDT
OpenStack gerrit  485251   None      None    None     2017-07-20 02:08 EDT

Description Marian Krcmarik 2017-07-19 11:43:40 EDT
Description of problem:
We have the following code currently in the tripleo pacemaker_remote manifest:
class tripleo::profile::base::pacemaker_remote (
  $remote_authkey,
  $pcs_tries = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step = hiera('step'),
) {
  class { '::pacemaker::remote':
    remote_authkey => $remote_authkey,
  }
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  class { '::pacemaker::stonith':
    disable => !$enable_fencing_real,
    tries => $pcs_tries,
  }
....

It makes no sense to enforce the stonith property on the remote nodes; we should probably enforce it only on $pacemaker_master. While the current code works in general, it creates extra CIB changes for nothing, and we did see an issue when working with container HA (because the remotes were not up yet).
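
The direction suggested above could look roughly like the following. This is a minimal sketch, not the merged patch: the remote profile would simply stop declaring ::pacemaker::stonith, and the property would be set only where $pacemaker_master is true. The class name example::stonith_on_master and its $pacemaker_master parameter are hypothetical and exist only for illustration; how $pacemaker_master is computed is left to the cluster profile and is not shown here.

```puppet
# Hypothetical wrapper class (name and $pacemaker_master parameter are
# assumptions, not part of puppet-tripleo). It reuses the hiera keys and
# the ::pacemaker::stonith parameters quoted in the report.
class example::stonith_on_master (
  $pacemaker_master,
  $pcs_tries      = hiera('pcs_tries', 20),
  $enable_fencing = hiera('enable_fencing', false),
  $step           = hiera('step'),
) {
  # Same computation as in the reported manifest.
  $enable_fencing_real = str2bool($enable_fencing) and $step >= 5

  # Only the master node touches the stonith-enabled property, so the
  # pacemaker remote nodes generate no CIB change for it.
  if $pacemaker_master {
    class { '::pacemaker::stonith':
      disable => !$enable_fencing_real,
      tries   => $pcs_tries,
    }
  }
}
```

With a guard like this, a remote node never evaluates the Pcmk_property resource named in the error under "Actual results", which is the resource that fails to find Exec[wait-for-settle].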

Version-Release number of selected component (if applicable):


How reproducible:
"Always" in recent days

Steps to Reproduce:
1. Deploy OSP11 with pacemaker remote nodes, using composable roles for rabbitmq and galera.

Actual results:
Error: Could not find dependency Exec[wait-for-settle] for Pcmk_property[property--stonith-enabled] at /etc/puppet/modules/pacemaker/manifests/property.pp:78

Expected results:
Successful deployment

Additional info:
Comment 6 Udi Shkalim 2017-08-31 05:59:06 EDT
Verified on: puppet-tripleo-6.5.0-8.el7ost.noarch

Deployment passed:
[root@controller-2 ~]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-2 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Wed Aug 30 16:30:07 2017
Last change: Wed Aug 30 16:24:51 2017 by root via cibadmin on controller-0

6 nodes configured
34 resources configured

Online: [ controller-0 controller-1 controller-2 ]
RemoteOnline: [ messaging-0 messaging-1 messaging-2 ]

Full list of resources:

 messaging-0    (ocf::pacemaker:remote):        Started controller-0
 messaging-1    (ocf::pacemaker:remote):        Started controller-1
 messaging-2    (ocf::pacemaker:remote):        Started controller-2
 Clone Set: rabbitmq-clone [rabbitmq]
     Started: [ messaging-0 messaging-1 messaging-2 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera]
     Masters: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 Master/Slave Set: redis-master [redis]
     Masters: [ controller-2 ]
     Slaves: [ controller-0 controller-1 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 ip-192.168.24.12       (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-10.0.0.107  (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.1.19 (ocf::heartbeat:IPaddr2):       Started controller-2
 ip-172.17.1.14 (ocf::heartbeat:IPaddr2):       Started controller-0
 ip-172.17.3.18 (ocf::heartbeat:IPaddr2):       Started controller-1
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2):       Started controller-2
 Clone Set: haproxy-clone [haproxy]
     Started: [ controller-0 controller-1 controller-2 ]
     Stopped: [ messaging-0 messaging-1 messaging-2 ]
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Comment 8 errata-xmlrpc 2017-09-13 17:43:17 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2721
