Bug 1298506

Summary: Pacemaker IPv6 vips fail to start
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Emilien Macchi <emacchi>
Status: CLOSED ERRATA
QA Contact: yeylon <yeylon>
Severity: high
Priority: urgent
Version: 7.0 (Kilo)
CC: dmacpher, emacchi, gfidente, jslagle, mburns, michele, rhel-osp-director-maint, srevivo, yeylon
Target Milestone: y3
Target Release: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-107.el7ost
Doc Type: Bug Fix
Doc Text:
Pacemaker VIP resources failed to start in an IPv6-based Overcloud deployment because they were created with an IPv4-based setting (/32) for the VIP netmask. This fix determines if the Overcloud uses IPv6 and sets the VIP netmasks to the appropriate values (/64 in most cases). The Pacemaker VIPs now start successfully in the Overcloud.
Last Closed: 2016-02-18 16:49:48 UTC
Type: Bug
Attachments:
/var/log/messages (flags: none)

Description Marius Cornea 2016-01-14 09:57:31 UTC
Created attachment 1114744
/var/log/messages

Description of problem:
In an IPv6 deployment, the Pacemaker VIPs fail to start.

Version-Release number of selected component (if applicable):
I'm testing by following the instructions at:
https://etherpad.openstack.org/p/tripleo-ipv6-support
and enabling Pacemaker by passing the additional environment file $THT/environments/puppet-pacemaker.yaml

How reproducible:
100%

Steps to Reproduce:
1. Apply workarounds for BZ#1295986, BZ#1297850 and BZ#1298391 
2. Deploy an IPv6-enabled overcloud

Actual results:
The vip resources fail to start with errors like the following:
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071)[5442]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a)[5444]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57)[5443]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071)[5442]: INFO: [findif] failed
IPaddr2(ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a)[5444]: INFO: [findif] failed
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57)[5443]: INFO: [findif] failed
lrmd[536]:  notice: ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0:5442:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
lrmd[536]:  notice: ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0:5443:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
lrmd[536]:  notice: ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0:5444:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0: not running (node=overcloud-controller-0, call=24, rc=7, cib-update=52, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0:24 [ ocf-exit-reason:Unable to find nic or netmask.\n ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0: not running (node=overcloud-controller-0, call=28, rc=7, cib-update=53, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0:28 [ ocf-exit-reason:Unable to find nic or netmask.\n ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0: not running (node=overcloud-controller-0, call=32, rc=7, cib-update=54, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0:32 [ ocf-exit-reason:Unable to find nic or netmask.\n ]
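For triage, the affected resources can be pulled out of /var/log/messages with a short script. This is a minimal sketch, not part of the original report: the function name failed_vip_resources is hypothetical, and the pattern assumes the IPaddr2 error lines look exactly like the ones quoted above.

```python
import re

# Matches the IPaddr2 error lines quoted above and captures the resource ID.
# Assumption: the log format is identical to the excerpt in this report.
_FAILED_VIP = re.compile(
    r"IPaddr2\((?P<resource>[^)]+)\)\[\d+\]: ERROR: Unable to find nic or netmask"
)


def failed_vip_resources(log_text):
    """Return the sorted, de-duplicated resource IDs that hit the error."""
    return sorted({m.group("resource") for m in _FAILED_VIP.finditer(log_text)})
```

Calling failed_vip_resources(open("/var/log/messages").read()) on the controller would list each failing VIP resource once, regardless of how many times the error repeats.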

Expected results:
The VIP resources start.

Additional info:
Attaching /var/log/messages.

Comment 1 Marius Cornea 2016-01-14 10:15:10 UTC
It looks like the resources are created in an IPv4 manner (cidr_netmask=32):

[root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071
 Resource: ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=fd00:fd00:fd00:2000:f816:3eff:fe35:5071 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-monitor-interval-10s)
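A rough way to see why a /32 prefix is a poor fit for this address: masked with /32, the VIP falls in a network that no controller interface is likely to be configured on. The sketch below is a simplified model and not the actual findif logic of the IPaddr2 agent; the function name vip_matches_subnet is hypothetical, and the /64 interface subnet is an assumption (the report does not show the interface configuration).

```python
import ipaddress


def vip_matches_subnet(vip, cidr_netmask, interface_subnet):
    """True if the VIP, masked with cidr_netmask, lands exactly on interface_subnet."""
    vip_network = ipaddress.ip_interface(f"{vip}/{cidr_netmask}").network
    return vip_network == ipaddress.ip_network(interface_subnet)


vip = "fd00:fd00:fd00:2000:f816:3eff:fe35:5071"
subnet = "fd00:fd00:fd00:2000::/64"  # assumed interface subnet, not from the report

print(vip_matches_subnet(vip, 32, subnet))  # netmask as configured in this report
print(vip_matches_subnet(vip, 64, subnet))  # netmask typical for an IPv6 subnet
```

Under these assumptions only the /64 case matches, which is consistent with the agent being unable to find a nic or netmask for the /32 resource.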

Comment 2 Michele Baldessari 2016-01-14 13:30:07 UTC
So I did a quick test: I added an IPv6 resource without a netmask:
pcs resource create ip-fe80-294-5aff-abcd-1234  ocf:heartbeat:IPaddr2  nic=br-ex ip=fe80::294:5aff:abcd:1234   op monitor interval=30s

and it did the right thing. So I believe (I can't find the exact puppet/heat line right now) that we are forcing /32 in both the IPv4 and IPv6 cases.
We should either distinguish between them, or leave the netmask out and let the resource agent find the matching one.

I'll be around today (not tomorrow) for any tests or questions about this.
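The "distinguish them" option can be sketched as follows. This is an illustration only, not the actual change made in tripleo-heat-templates or the puppet manifests: the helper names are hypothetical, the /64 default for IPv6 is an assumption (real deployments may use other prefix lengths), and the resource naming simply mirrors the IDs seen in this report.

```python
import ipaddress


def default_cidr_netmask(ip):
    """Pick a prefix length by address family: 64 for IPv6 (assumed), 32 for IPv4."""
    return 64 if ipaddress.ip_address(ip).version == 6 else 32


def vip_create_command(ip):
    """Build a pcs command string for the VIP; nothing is executed here."""
    # Resource IDs in this report replace ':' with '.' in the address.
    name = "ip-" + ip.replace(":", ".")
    return (
        f"pcs resource create {name} ocf:heartbeat:IPaddr2 "
        f"ip={ip} cidr_netmask={default_cidr_netmask(ip)}"
    )


print(vip_create_command("fd00:fd00:fd00:2000:f816:3eff:fe35:5071"))
print(vip_create_command("192.0.2.6"))
```

The alternative mentioned in the comment, omitting cidr_netmask entirely and letting the resource agent work it out, would drop the last argument instead of branching on the address family.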

Comment 3 Emilien Macchi 2016-01-14 15:09:02 UTC
I confirm we need to fix that in THT so we can configure the cidr_netmask parameter when creating the VIP resource.

Comment 4 Emilien Macchi 2016-01-14 15:11:04 UTC
I confirm we need to fix that in THT so we can configure the cidr_netmask parameter when creating the VIP resource.

Comment 5 Hugh Brock 2016-01-15 09:13:56 UTC
Jirka, can you do the t-h-t part of this this morning?

Comment 6 Dan Sneddon 2016-01-15 09:16:46 UTC
(In reply to Emilien Macchi from comment #4)
> I confirm we need to fix that in THT so we can configure cidr_netmask
> parameter when creating the vip resource.

Can you be more specific about which outputs are getting a /32 netmask appended? I can track it back to THT from the output, but which hieradata values are ending up with the bad netmask?

Comment 8 Marius Cornea 2016-01-19 11:34:57 UTC
[root@overcloud-controller-0 ~]# pcs status | grep ip-
 ip-192.0.2.6	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0

openstack-tripleo-heat-templates-0.8.6-106.el7ost.noarch

Comment 10 Marius Cornea 2016-01-21 11:41:33 UTC
openstack-tripleo-heat-templates-0.8.6-110.el7ost.noarch

[root@overcloud-controller-0 ~]# pcs resource show ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca
 Resource: ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=2001:db8:fd00:1000:f816:3eff:fe08:99ca cidr_netmask=64 
  Operations: start interval=0s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-start-interval-0s)
              stop interval=0s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-monitor-interval-10s)

Comment 12 errata-xmlrpc 2016-02-18 16:49:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0264.html