Bug 1298506 - Pacemaker IPv6 vips fail to start
Summary: Pacemaker IPv6 vips fail to start
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: y3
Target Release: 7.0 (Kilo)
Assignee: Emilien Macchi
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-01-14 09:57 UTC by Marius Cornea
Modified: 2016-04-18 07:14 UTC (History)
9 users

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-107.el7ost
Doc Type: Bug Fix
Doc Text:
Pacemaker failed to start in an IPv6-based Overcloud deployment due to using IPv4-based settings (/32) for the VIP netmask. This fix determines if the Overcloud uses IPv6 and sets the VIP netmasks to the appropriate values (/64 in most cases). Pacemaker now starts successfully in the Overcloud.
Clone Of:
Environment:
Last Closed: 2016-02-18 16:49:48 UTC
Target Upstream Version:


Attachments (Terms of Use)
/var/log/messages (556.71 KB, text/plain)
2016-01-14 09:57 UTC, Marius Cornea


Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 267647 None MERGED Set /64 cidr_netmask for pcmk VIPs when IPv6 2020-05-07 10:46:19 UTC
Red Hat Product Errata RHBA-2016:0264 normal SHIPPED_LIVE Red Hat Enterprise Linux OSP 7 director Bug Fix Advisory 2016-02-18 21:41:29 UTC

Description Marius Cornea 2016-01-14 09:57:31 UTC
Created attachment 1114744 [details]
/var/log/messages

Description of problem:
With an IPv6 deployment, the Pacemaker VIPs fail to start.

Version-Release number of selected component (if applicable):
I'm doing the test following the instructions in:
https://etherpad.openstack.org/p/tripleo-ipv6-support
and enabling pacemaker by passing an additional $THT/environments/puppet-pacemaker.yaml environment file

How reproducible:
100%

Steps to Reproduce:
1. Apply workarounds for BZ#1295986, BZ#1297850 and BZ#1298391 
2. Deploy ipv6 enabled overcloud

Actual results:
The vip resources fail to start with errors like the following:
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071)[5442]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a)[5444]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57)[5443]: ERROR: Unable to find nic or netmask.
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071)[5442]: INFO: [findif] failed
IPaddr2(ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a)[5444]: INFO: [findif] failed
IPaddr2(ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57)[5443]: INFO: [findif] failed
lrmd[536]:  notice: ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0:5442:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
lrmd[536]:  notice: ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0:5443:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
lrmd[536]:  notice: ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0:5444:stderr [ ocf-exit-reason:Unable to find nic or netmask. ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0: not running (node=overcloud-controller-0, call=24, rc=7, cib-update=52, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071_monitor_0:24 [ ocf-exit-reason:Unable to find nic or netmask.\n ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0: not running (node=overcloud-controller-0, call=28, rc=7, cib-update=53, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.2000.f816.3eff.fe3e.a57_monitor_0:28 [ ocf-exit-reason:Unable to find nic or netmask.\n ]
crmd[540]:  notice: Operation ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0: not running (node=overcloud-controller-0, call=32, rc=7, cib-update=54, confirmed=true)
crmd[540]:  notice: overcloud-controller-0-ip-fd00.fd00.fd00.4000.f816.3eff.fef8.ad5a_monitor_0:32 [ ocf-exit-reason:Unable to find nic or netmask.\n ]

Expected results:
The vip resources get started

Additional info:
Attaching /var/log/messages.

Comment 1 Marius Cornea 2016-01-14 10:15:10 UTC
It looks like the resources are created in an IPv4 manner (cidr_netmask=32):

[root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071
 Resource: ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=fd00:fd00:fd00:2000:f816:3eff:fe35:5071 cidr_netmask=32 
  Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000.f816.3eff.fe35.5071-monitor-interval-10s)
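A cidr_netmask of 32 is a host netmask for IPv4; for an IPv6 VIP the prefix length should be 64 here, as the eventual fix (the gerrit change linked above) does. A minimal Python sketch of that selection logic, assuming the hypothetical helper name vip_cidr_netmask (not from the actual templates):

```python
import ipaddress

def vip_cidr_netmask(vip):
    """Pick a cidr_netmask for a Pacemaker IPaddr2 VIP.

    Mirrors the shape of the fix described in this bug: IPv4 VIPs
    keep the /32 host netmask, while IPv6 VIPs get /64 (the prefix
    length used by the overcloud networks in this deployment).
    """
    addr = ipaddress.ip_address(vip)
    return 64 if addr.version == 6 else 32

# One of the failing VIPs from the logs above resolves to /64:
print(vip_cidr_netmask("fd00:fd00:fd00:2000:f816:3eff:fe35:5071"))  # 64
print(vip_cidr_netmask("192.0.2.6"))  # 32
```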

Comment 2 Michele Baldessari 2016-01-14 13:30:07 UTC
So I did a quick test: if we add an IPv6 resource without a netmask:
pcs resource create ip-fe80-294-5aff-abcd-1234  ocf:heartbeat:IPaddr2  nic=br-ex ip=fe80::294:5aff:abcd:1234   op monitor interval=30s

and it did the right thing. So I believe (I can't find the exact piece of puppet/heat code right now) that we are forcing /32 in both the IPv4 and IPv6 cases; we should either distinguish between them, or leave the netmask out and let the resource agent pick the matching one.

I'll be around today (not tomorrow) for any tests or questions about this.

Comment 4 Emilien Macchi 2016-01-14 15:11:04 UTC
I confirm we need to fix that in THT so we can configure cidr_netmask parameter when creating the vip resource.

Comment 5 Hugh Brock 2016-01-15 09:13:56 UTC
Jirka, can you do the t-h-t part of this this morning?

Comment 6 Dan Sneddon 2016-01-15 09:16:46 UTC
(In reply to Emilien Macchi from comment #4)
> I confirm we need to fix that in THT so we can configure cidr_netmask
> parameter when creating the vip resource.

Can you be more specific about the actual outputs that are getting the /32 netmask appended? I can trace it back to THT from the output, but which hieradata values are ending up with the bad netmask?

Comment 8 Marius Cornea 2016-01-19 11:34:57 UTC
[root@overcloud-controller-0 ~]# pcs status | grep ip-
 ip-192.0.2.6	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-2001.db8.fd00.1000.f816.3eff.feb3.25fb	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.4000.f816.3eff.fe88.dbfc	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.feeb.38af	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.3000.f816.3eff.fe2e.5873	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0
 ip-fd00.fd00.fd00.2000.f816.3eff.fe4a.871	(ocf::heartbeat:IPaddr2):	Started overcloud-controller-0

openstack-tripleo-heat-templates-0.8.6-106.el7ost.noarch

Comment 10 Marius Cornea 2016-01-21 11:41:33 UTC
openstack-tripleo-heat-templates-0.8.6-110.el7ost.noarch

[root@overcloud-controller-0 ~]# pcs resource show ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca
 Resource: ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=2001:db8:fd00:1000:f816:3eff:fe08:99ca cidr_netmask=64 
  Operations: start interval=0s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-start-interval-0s)
              stop interval=0s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-2001.db8.fd00.1000.f816.3eff.fe08.99ca-monitor-interval-10s)

Comment 12 errata-xmlrpc 2016-02-18 16:49:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0264.html

