Description of problem:

pcs constraint creation failed due to wrong naming of the VIP resources. The os-collect-config log shows several messages such as:

  Error: Resource 'ip-fd00:fd00:fd00:2000:f816:3eff:fe11:920' does not exist

while the resource name (after applying the patch in BZ#1298391) is:

  ip-fd00.fd00.fd00.2000.f816.3eff.fe11.920 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0

Version-Release number of selected component (if applicable):

I'm doing the test following the instructions in
https://etherpad.openstack.org/p/tripleo-ipv6-support and enabling pacemaker
by passing an additional $THT/environments/puppet-pacemaker.yaml environment
file.

openstack-puppet-modules-7.0.1-1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
Deploy an IPv6-enabled overcloud.

Actual results:
The deployment fails.

Expected results:
The deployment succeeds.

Additional info:
Attaching the os-collect-config journal where the errors show up.

[root@overcloud-controller-0 ~]# pcs status | grep ip
Cluster name: tripleo_cluster
 ip-192.0.2.23                              (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-2001.db8.fd00.1000.f816.3eff.fee9.4a21  (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-fd00.fd00.fd00.3000.f816.3eff.fe1b.5223 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.fed8.4b9b (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-fd00.fd00.fd00.4000.f816.3eff.fe55.5f05 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.fe11.920  (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
Created attachment 1114941 [details] os-collect-config
From the pacemaker puppet module resource::ip:

  # pcs dislikes colons from IPv6 addresses. Replacing them with dots.
  $resource_name = regsubst($ip_address, '(:)', '.', 'G')

When THT creates the constraint, it does not do the same munging; it simply passes in:

  first_resource => "ip-${control_vip}",

which causes the mismatch here. It seems to me that either we munge the IP to make a pcs-compliant name, or we explicitly set a name when creating the resource::ip via the name param for that class (and then use the same name in the constraint ref). Not sure which is better, but I think either would solve the issue.
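For illustration, the substitution the regsubst call above performs can be sketched outside Puppet. This is a minimal Python equivalent (the function name is ours, not part of any module); it reproduces the resource names seen in the pcs status output:

```python
def pcs_resource_name(ip_address: str) -> str:
    """Mimic puppet-pacemaker's munging: pcs rejects colons in resource
    names, so every ':' in an IPv6 VIP is replaced with a dot, i.e. the
    equivalent of regsubst($ip_address, '(:)', '.', 'G')."""
    return "ip-" + ip_address.replace(":", ".")

# IPv6 VIPs get pcs-compliant names:
print(pcs_resource_name("fd00:fd00:fd00:2000:f816:3eff:fe11:920"))
# -> ip-fd00.fd00.fd00.2000.f816.3eff.fe11.920

# A compressed address ('::') yields a doubled dot:
print(pcs_resource_name("2001::59"))  # -> ip-2001..59

# IPv4 addresses contain no colons and pass through unchanged:
print(pcs_resource_name("192.0.2.23"))  # -> ip-192.0.2.23
```

Whichever side does the munging, the constraint reference and the resource name must come out of the same function, which is the crux of the bug.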
(In reply to Jason Guiditta from comment #2)
> From the pacemaker puppet module resource::ip:
>
>   # pcs dislikes colons from IPv6 addresses. Replacing them with dots.
>   $resource_name = regsubst($ip_address, '(:)', '.', 'G')

I'm not sure that we can do the required munging inside of TripleO Heat Templates; it might be required to do a text replacement inside of Puppet. I think Puppet actually takes the VIP and creates this name; I don't think we output it from THT.
I'm going to munge it inside puppet.
It has already been fixed upstream:
https://github.com/redhat-openstack/puppet-pacemaker/commit/01c6000db5040055372021ad5a3231840ccb8bba
Tested the patch and it functions properly:

pacemaker::resource::ip {'ip-2001::59':
  ip_address         => '2001::59',
  nic                => 'eth1',
  cidr_netmask       => '',
  post_success_sleep => 0,
  tries              => 1,
  try_sleep          => 1,
  require            => 'Class[pacemaker::corosync]',
}

pcs resource show ip-2001..59

 Resource: ip-2001..59 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=2001::59 nic=eth1
  Operations: start interval=0s timeout=20s (ip-2001..59-start-interval-0s)
              stop interval=0s timeout=20s (ip-2001..59-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-2001..59-monitor-interval-10s)
(In reply to Sofer Athlan-Guyot from comment #6)
> Tested the patch and it functions properly:
>
> pacemaker::resource::ip {'ip-2001::59':
>   ip_address         => '2001::59',
>   nic                => 'eth1',
>   cidr_netmask       => '',
>   post_success_sleep => 0,
>   tries              => 1,
>   try_sleep          => 1,
>   require            => 'Class[pacemaker::corosync]',
> }
>
> pcs resource show ip-2001..59
>
>  Resource: ip-2001..59 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=2001::59 nic=eth1

I already had the patch when testing (see the resource names in the initial description, e.g. ip-2001.db8.fd00.1000.f816.3eff.fee9.4a21). The problem is that when running the pcs constraint command it uses the name containing ':', e.g.:

/Exec[Creating order constraint public_vip-then-haproxy]/returns: change from notrun to 0 failed: /usr/sbin/pcs constraint order start ip-2001:db8:fd00:1000:f816:3eff:fee9:4a21 then start haproxy-clone kind=Optional returned 1 instead of one of [0]
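In other words, the resource is created under the munged name but the constraint is built from the raw VIP. A small Python sketch of the mismatch (the variable names and the helper are ours, for illustration; the command text is taken from the failing Exec above):

```python
def pcs_resource_name(ip: str) -> str:
    # pcs rejects colons in resource names, so puppet-pacemaker
    # replaces each ':' with '.' when creating the resource.
    return "ip-" + ip.replace(":", ".")

vip = "2001:db8:fd00:1000:f816:3eff:fee9:4a21"

# What the failing manifest effectively ran: the raw VIP, colons intact,
# which names a resource that does not exist in the CIB.
broken = f"pcs constraint order start ip-{vip} then start haproxy-clone kind=Optional"

# What it has to run to match the resource pacemaker::resource::ip created:
fixed = f"pcs constraint order start {pcs_resource_name(vip)} then start haproxy-clone kind=Optional"

print(broken)  # references ip-2001:db8:... -> "Resource ... does not exist"
print(fixed)   # references ip-2001.db8.... -> matches the munged name
```

This is why the follow-up fix lands in the constraint-creation side of the puppet module rather than in THT: the same substitution must be applied everywhere a resource name is derived from a VIP.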
Yep, I'm working on it right now; thanks for the clarification.
Waiting for review in https://github.com/redhat-openstack/puppet-pacemaker/pull/70

This has been tested with the following manifest:

class {'::pacemaker::corosync':
  cluster_name    => 'basic_cluster',
  cluster_members => 'node1 node2',
}

class {'::pacemaker::stonith':
  disable => true,
}

pacemaker::resource::ip {'ip-2001::59':
  ip_address         => '2001::59',
  nic                => 'eth1',
  cidr_netmask       => '',
  post_success_sleep => 0,
  tries              => 1,
  try_sleep          => 1,
  require            => 'Class[pacemaker::corosync]',
}

pacemaker::resource::ip {'ip-2001::60':
  ip_address         => '2001::60',
  nic                => 'eth1',
  cidr_netmask       => '',
  post_success_sleep => 0,
  tries              => 1,
  try_sleep          => 1,
  require            => 'Class[pacemaker::corosync]',
}

# testing location
pacemaker::constraint::location { 'ipv6-on-node1':
  resource => 'ip-2001::59',
  location => 'node1',
  score    => '100',
}

# testing colocation
pacemaker::constraint::colocation { 'ipv6-on-same-node':
  source  => 'ip-2001::60',
  target  => 'ip-2001::59',
  score   => 'INFINITY',
  require => ['Pacemaker::resource::ip[ip-2001::60]',
              'Pacemaker::resource::ip[ip-2001::60]'],
}

# testing order
pacemaker::constraint::base { 'ipv6-59-before-ipv6-60':
  constraint_type => 'order',
  first_resource  => 'ip-2001::59',
  second_resource => 'ip-2001::60',
  first_action    => 'start',
  second_action   => 'start',
}

and with this one for the deletion:

class {'::pacemaker::corosync':
  cluster_name    => 'basic_cluster',
  cluster_members => 'node1 node2',
}

class {'::pacemaker::stonith':
  disable => true,
}

# testing location
pacemaker::constraint::location { 'ipv6-on-node1':
  ensure   => absent,
  resource => 'ip-2001::59',
  location => 'node1',
  score    => '100',
}

# testing colocation
pacemaker::constraint::colocation { 'ipv6-on-same-node':
  ensure => absent,
  source => 'ip-2001::60',
  target => 'ip-2001::59',
  score  => 'INFINITY',
}

# testing order
pacemaker::constraint::base { 'ipv6-59-before-ipv6-60':
  ensure          => absent,
  constraint_type => 'order',
  first_resource  => 'ip-2001::59',
  second_resource => 'ip-2001::60',
  first_action    => 'start',
  second_action   => 'start',
}

Can somebody validate this on an OSP7 deployment?
[root@overcloud-controller-0 ~]# rpm -qa | grep puppet-modules
openstack-puppet-modules-2015.1.8-41.el7ost.noarch

[root@overcloud-controller-0 ~]# pcs constraint list --full | grep ip-
  start ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871 then start haproxy-clone (kind:Optional) (id:order-ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871-haproxy-clone-Optional)
  start ip-192.0.2.6 then start haproxy-clone (kind:Optional) (id:order-ip-192.0.2.6-haproxy-clone-Optional)
  start ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc then start haproxy-clone (kind:Optional) (id:order-ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc-haproxy-clone-Optional)
  start ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb then start haproxy-clone (kind:Optional) (id:order-ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb-haproxy-clone-Optional)
  start ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af then start haproxy-clone (kind:Optional) (id:order-ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af-haproxy-clone-Optional)
  start ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873 then start haproxy-clone (kind:Optional) (id:order-ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873-haproxy-clone-Optional)
  ip-192.0.2.6 with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-192.0.2.6-haproxy-clone-INFINITY)
  ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873 with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873-haproxy-clone-INFINITY)
  ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc-haproxy-clone-INFINITY)
  ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb-haproxy-clone-INFINITY)
  ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871 with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871-haproxy-clone-INFINITY)
  ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af with haproxy-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af-haproxy-clone-INFINITY)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0265.html