Description of problem:
puppet-pacemaker instanceha does not work correctly with more than 10 nodes due to regex issues
------------------------------------------------------------------------

Deploying instance-ha with 12 compute nodes: compute-1 always has issues in a 12-node deployment. With 4 compute nodes it is fine. We also checked with 10 compute nodes, and that is fine, too.
~~~
pcs status
(...)
 Clone Set: compute-unfence-trigger-clone [compute-unfence-trigger]
     Started: [ compute-0 compute-10 compute-11 compute-2 compute-3 compute-4 compute-5 compute-6 compute-7 compute-8 compute-9 ]
     Stopped: [ compute-1 controller-0 controller-1 controller-2 ]
(...)
~~~

On compute-1, instance HA is enabled in hiera, but the puppet run never created the node property:
~~~
[root@compute-1 ~]# hiera -c /etc/puppet/hiera.yaml tripleo::instanceha
true
[root@compute-1 ~]#
[root@compute-1 ~]# journalctl | grep instanceha-role
[root@compute-1 ~]#
~~~
Vs.
~~~
[root@compute-8 ~]# journalctl | grep instanceha-role | head -1
Nov 27 15:10:50 compute-8 puppet-user[33673]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-compute-8-compute-instanceha-role]/ensure) created
[root@compute-8 ~]#
~~~

~~~
[root@compute-1 ~]# cat /etc/puppet/modules/tripleo/manifests/profile/pacemaker/compute_instanceha.pp
# == Class: tripleo::profile::pacemaker::compute_instanceha
#
# Configures Compute nodes for Instance HA
#
# === Parameters:
#
# [*step*]
#   (Optional) The current step in deployment. See tripleo-heat-templates
#   for more details.
#   Defaults to hiera('step')
#
# [*pcs_tries*]
#   (Optional) The number of times pcs commands should be retried.
#   Defaults to hiera('pcs_tries', 20)
#
# [*enable_instanceha*]
#   (Optional) Boolean driving the Instance HA controlplane configuration
#   Defaults to false
#
class tripleo::profile::pacemaker::compute_instanceha (
  $step              = Integer(hiera('step')),
  $pcs_tries         = hiera('pcs_tries', 20),
  $enable_instanceha = hiera('tripleo::instanceha', false),
) {
  if $step >= 2 and $enable_instanceha {
    pacemaker::property { 'compute-instanceha-role-node-property':
      property => 'compute-instanceha-role',
      value    => true,
      tries    => $pcs_tries,
      node     => $::hostname,
    }
  }
}
~~~

The CIB shows the property for every compute node except compute-1:
~~~
Nov 27 15:10:50 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-8-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:10:50 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-3-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:10:56 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-6-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:10:57 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-4-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:03 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-2-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:08 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-10-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:08 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-7-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:09 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-5-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:10 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-9-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:17 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-11-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:11:20 [39123] controller-0 cib: info: cib_perform_op: ++ <nvpair id="nodes-compute-0-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
Nov 27 15:23:55 [39123] controller-0 cib: info: cib_perform_op: ++ <expression attribute="compute-instanceha-role" id="location-compute-unfence-trigger-clone-rule-expr" operation="ne" value="true"/>
Nov 27 15:24:01 [39123] controller-0 cib: info: cib_perform_op: ++ <expression attribute="compute-instanceha-role" id="location-nova-evacuate-rule-expr" operation="eq" value="true"/>
(overcloud-Queens) [root@controller-0 ~]# cibadmin -Q | grep compute-instanceha-role
        <nvpair id="nodes-compute-3-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-8-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-6-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-4-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-2-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-5-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-7-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-10-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-9-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-11-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <nvpair id="nodes-compute-0-compute-instanceha-role" name="compute-instanceha-role" value="true"/>
        <expression attribute="compute-instanceha-role" id="location-compute-unfence-trigger-clone-rule-expr" operation="ne" value="true"/>
        <expression attribute="compute-instanceha-role" id="location-nova-evacuate-rule-expr" operation="eq" value="true"/>
(overcloud-Queens) [root@controller-0 ~]#
~~~

And I can "fix" this manually by running:
~~~
[root@compute-1 ~]# pcs property set --node compute-1 compute-instanceha-role=true
[root@compute-1 ~]# pcs property show
(...)
 compute-0: compute-instanceha-role=true
 compute-1: compute-instanceha-role=true
 compute-10: compute-instanceha-role=true
 compute-11: compute-instanceha-role=true
 compute-2: compute-instanceha-role=true
 compute-3: compute-instanceha-role=true
 compute-4: compute-instanceha-role=true
 compute-5: compute-instanceha-role=true
 compute-6: compute-instanceha-role=true
 compute-7: compute-instanceha-role=true
 compute-8: compute-instanceha-role=true
 compute-9: compute-instanceha-role=true
(...)
~~~

~~~
pcs status
(...)
 Clone Set: compute-unfence-trigger-clone [compute-unfence-trigger]
     Started: [ compute-0 compute-1 compute-10 compute-11 compute-2 compute-3 compute-4 compute-5 compute-6 compute-7 compute-8 compute-9 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
(...)
~~~

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

Dimitri Savineau thinks it's due to the following:
~~~
I never tried instance-ha but looking at the puppet-pacemaker module this could
come from an issue on how the property is checked in pcs based on the hostname.

Because compute-1 is a subset of compute-10 and compute-11 and the code uses a
"| grep hostname" [1] if compute-10 or compute-11 is configured before compute-1
then the property is not set.

[1] https://github.com/openstack/puppet-pacemaker/blob/master/lib/puppet/provider/pcmk_property/default.rb#L50
~~~
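
The substring collision suspected in the analysis above can be reproduced outside the cluster. The following is only a minimal sketch in Python, not the provider's actual Ruby code and not the fix that shipped: `substring_check`, `anchored_check` and the sample `PROPERTY_LINES` are hypothetical names and data, modelled on the `pcs property show` output format seen in this report. It shows why an unanchored match on the hostname reports the property as already present for compute-1 once compute-10 or compute-11 has been configured, and how matching the whole node name avoids that.

```python
import re

PROP = "compute-instanceha-role"

# Hypothetical node-attribute lines, as they might look before compute-1
# has been configured (format modelled on `pcs property show` in this report).
PROPERTY_LINES = [
    " compute-10: compute-instanceha-role=true",
    " compute-11: compute-instanceha-role=true",
]


def substring_check(lines, node, prop):
    """Mimic an unanchored `| grep <prop> | grep <node>` style existence test."""
    return any(prop in line and node in line for line in lines)


def anchored_check(lines, node, prop):
    """Match only lines whose node name is exactly `node` (followed by ':')."""
    pattern = re.compile(
        r"^\s*" + re.escape(node) + r":\s.*\b" + re.escape(prop) + r"="
    )
    return any(pattern.search(line) for line in lines)


if __name__ == "__main__":
    # False positive: "compute-1" is a substring of "compute-10".
    print(substring_check(PROPERTY_LINES, "compute-1", PROP))
    # The anchored match does not confuse compute-1 with compute-10/11.
    print(anchored_check(PROPERTY_LINES, "compute-1", PROP))
    # It still finds nodes that really are configured.
    print(anchored_check(PROPERTY_LINES, "compute-10", PROP))
```

With the substring test, a provider would conclude that the resource already exists and skip creating it, which would be consistent with the missing "created" entry in the compute-1 journal.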
Verified on an OSP13 deployment with 11 IHA compute nodes, using compute-1 and compute-10 (one hostname being a substring of the other) as the test subjects for a successful deployment.

(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version
2018-12-07.1(undercloud) [stack@undercloud-0 ~]$

#pcs status:
     Started: [ overcloud-novacomputeiha-0 overcloud-novacomputeiha-1 overcloud-novacomputeiha-10 overcloud-novacomputeiha-2 overcloud-novacomputeiha-3 overcloud-novacomputeiha-4 overcloud-novacomputeiha-5 overcloud-novacomputeiha-6 overcloud-novacomputeiha-7 overcloud-novacomputeiha-8 overcloud-novacomputeiha-9 ]
     Stopped: [ controller-0 ]

(undercloud) [stack@undercloud-0 ~]$ ansible compute -b -mshell -a'journalctl | grep instanceha-role 2>/dev/null|head -1'
 [WARNING]: Found both group and host with same name: undercloud

compute-1 | SUCCESS | rc=0 >>
Dec 11 22:09:03 overcloud-novacomputeiha-1 puppet-user[24188]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-1-compute-instanceha-role]/ensure) created

compute-3 | SUCCESS | rc=0 >>
Dec 11 22:09:12 overcloud-novacomputeiha-3 puppet-user[24202]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-3-compute-instanceha-role]/ensure) created

compute-4 | SUCCESS | rc=0 >>
Dec 11 22:09:02 overcloud-novacomputeiha-4 puppet-user[24128]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-4-compute-instanceha-role]/ensure) created

compute-2 | SUCCESS | rc=0 >>
Dec 11 22:08:52 overcloud-novacomputeiha-2 puppet-user[23984]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-2-compute-instanceha-role]/ensure) created

compute-0 | SUCCESS | rc=0 >>
Dec 11 22:09:12 overcloud-novacomputeiha-0 puppet-user[24000]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-0-compute-instanceha-role]/ensure) created

compute-6 | SUCCESS | rc=0 >>
Dec 11 22:09:07 overcloud-novacomputeiha-6 puppet-user[24242]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-6-compute-instanceha-role]/ensure) created

compute-5 | SUCCESS | rc=0 >>
Dec 11 22:08:58 overcloud-novacomputeiha-5 puppet-user[24166]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-5-compute-instanceha-role]/ensure) created

compute-7 | SUCCESS | rc=0 >>
Dec 11 22:09:12 overcloud-novacomputeiha-7 puppet-user[41842]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-7-compute-instanceha-role]/ensure) created

compute-9 | SUCCESS | rc=0 >>
Dec 11 22:09:05 overcloud-novacomputeiha-9 puppet-user[24135]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-9-compute-instanceha-role]/ensure) created

compute-8 | SUCCESS | rc=0 >>
Dec 11 22:09:12 overcloud-novacomputeiha-8 puppet-user[41965]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-8-compute-instanceha-role]/ensure) created

compute-10 | SUCCESS | rc=0 >>
Dec 11 22:08:37 overcloud-novacomputeiha-10 puppet-user[23967]: (/Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha/Pacemaker::Property[compute-instanceha-role-node-property]/Pcmk_property[property-overcloud-novacomputeiha-10-compute-instanceha-role]/ensure) created
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0068