Bug 1298198

Summary: Connectivity checks are broken for IPv6 deployments
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Jiri Stransky <jstransk>
Status: CLOSED ERRATA
QA Contact: yeylon <yeylon>
Severity: high
Docs Contact:
Priority: urgent
Version: 7.0 (Kilo)
CC: adarazs, dmacpher, dprince, jslagle, mandreou, mburns, rhel-osp-director-maint, shardy, srevivo
Target Milestone: y3
Keywords: Regression
Target Release: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-99.el7ost
Doc Type: Bug Fix
Doc Text:
The deployment validation script that tests node availability only supported IPv4. This caused connectivity checks for an IPv6-based Overcloud to fail. This fix modifies the validation script to detect whether the IP address is v4 or v6 and run the respective connectivity check commands. Now connectivity checks succeed for IPv6-based Overclouds.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 16:49:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2016-01-13 13:11:28 UTC
Description of problem:
The connectivity checks run during deployment don't work for IPv6 deployments. The os-collect-config log shows the following:
[2016-01-12 14:33:06,714] (heat-config) [DEBUG] ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6 

Version-Release number of selected component (if applicable):
I'm doing the test following the instructions in:
https://etherpad.openstack.org/p/tripleo-ipv6-support
and enabling pacemaker by passing an additional $THT/environments/puppet-pacemaker.yaml environment file

How reproducible:
100%

Steps to Reproduce:
1. Deploy ipv6 enabled overcloud
2. Watch journalctl -l -u os-collect-config

Actual results:
The checks are not run. For IPv4 addressing we can see something like:
 [2016-01-13 08:00:48,755] (heat-config) [INFO] Trying to ping 172.16.19.11 for local network 172.16.19.0/24...SUCCESS
 Trying to ping 172.16.20.13 for local network 172.16.20.0/24...SUCCESS
 Trying to ping 172.16.21.12 for local network 172.16.21.0/24...SUCCESS
 Trying to ping 172.16.22.11 for local network 172.16.22.0/24...SUCCESS
 Trying to ping 172.16.23.11 for local network 172.16.23.0/24...SUCCESS

With the ipv6 deployment we get:
[DEBUG] ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6
ipcalc: unable to show setting for IPv6 

Expected results:
The connectivity checks are run for IPv6 as well.

Comment 1 Dan Prince 2016-01-13 15:23:22 UTC
Yeah. I'm pretty sure the ./validation-scripts/all-nodes.sh script only supports IPv4 at the moment. We will need to look into adding that support upstream.

In the meantime you could simply disable the validation check like this:

diff --git a/all-nodes-validation.yaml b/all-nodes-validation.yaml
index a738337..a69042d 100644
--- a/all-nodes-validation.yaml
+++ b/all-nodes-validation.yaml
@@ -19,7 +19,9 @@ resources:
       inputs:
         - name: ping_test_ips
           default: {get_param: PingTestIps}
-      config: {get_file: ./validation-scripts/all-nodes.sh}
+      config: |
+        #!/bin/sh
+        exit 0


----

Alternately, you could modify the resource registry so that the validation nested stack is set to a Noop resource:

OS::TripleO::AllNodes::Validation: OS::Heat::None

I'm not sure if the OS::Heat::None resource is in the product yet, but if it is available it is arguably cleaner than patching out the all-nodes-validation.yaml above.

Comment 2 Steven Hardy 2016-01-13 15:52:03 UTC
Note that OS::Heat::None is new for liberty/OSP8, so if this is OSP7 (kilo), a different resource_registry mapping will be needed, e.g:

cp /usr/share/openstack-tripleo-heat-templates/all-nodes-validation.yaml disable_validation.yaml

Edit the template to disable the script, or point to another script which does nothing, e.g:

--- a/all-nodes-validation.yaml
+++ b/all-nodes-validation.yaml
@@ -19,7 +19,9 @@ resources:
       inputs:
         - name: ping_test_ips
           default: {get_param: PingTestIps}
-      config: {get_file: ./validation-scripts/all-nodes.sh}
+      config: |
+         #!/bin/sh
+         echo "validation disabled"


Then have an environment file like:

resource_registry:
    OS::TripleO::AllNodes::Validation: disable_validation.yaml
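
As a rough sketch of the last step, the environment file could be written from a shell like this. The file name disable_validation_env.yaml is only an illustrative assumption, and the sketch assumes disable_validation.yaml sits in the same directory so that the relative path in the mapping resolves.

```shell
#!/bin/sh
# Sketch only: write an environment file that points the validation
# resource at the local, edited copy of the template.
# ENV_FILE is a hypothetical name chosen for this example.
ENV_FILE="${ENV_FILE:-disable_validation_env.yaml}"

# Quoted heredoc delimiter so nothing inside is expanded by the shell.
cat > "$ENV_FILE" <<'EOF'
resource_registry:
    OS::TripleO::AllNodes::Validation: disable_validation.yaml
EOF

echo "Wrote $ENV_FILE"
```

The resulting file would then be passed to the deploy command with an additional -e option, alongside the other environment files already in use.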

Comment 3 Marios Andreou 2016-01-15 12:52:58 UTC
proposed fixup:

upstream: https://review.openstack.org/268103 Adds basic v6 capability to the deploy validation test (pings)
https://bugs.launchpad.net/tripleo/+bug/1534578 The deploytime connectivity (ping) tests are failing in an ipv6 based deploy

downstream: https://code.engineering.redhat.com/gerrit/65649 Adds v6 capability to the deploy validation test (pings)

Comment 4 Marios Andreou 2016-01-15 12:54:13 UTC
proposed fixup:

upstream:
https://review.openstack.org/268103 Adds basic v6 capability to the deploy validation test (pings)
https://bugs.launchpad.net/tripleo/+bug/1534578 The deploytime connectivity (ping) tests are failing in an ipv6 based deploy

downstream:
https://code.engineering.redhat.com/gerrit/65649 Adds v6 capability to the deploy validation test (pings)
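
For readers who cannot open the reviews, the general idea of choosing the ping command by address family might be sketched as below. This is not the code from the linked reviews: the function names ping_command_for and ping_retry are hypothetical, and the sketch assumes that an address containing a colon is IPv6 and that a ping6 binary is available (on some systems `ping -6` may be needed instead).

```shell
#!/bin/sh
# Sketch only: pick a ping command based on the address family.
# Function names are hypothetical and not taken from all-nodes.sh.

# Print "ping6" for an address containing a colon (treated as IPv6),
# otherwise print "ping".
ping_command_for() {
    case "$1" in
        *:*) echo "ping6" ;;
        *)   echo "ping" ;;
    esac
}

# Try to reach one address, retrying up to 10 times before giving up.
ping_retry() {
    addr="$1"
    cmd=$(ping_command_for "$addr")
    count=0
    printf 'Trying to ping %s...' "$addr"
    until "$cmd" -W 10 -c 1 "$addr" >/dev/null 2>&1; do
        count=$((count + 1))
        if [ "$count" -ge 10 ]; then
            echo "FAILURE"
            return 1
        fi
        sleep 1
    done
    echo "SUCCESS"
}
```

A caller would invoke ping_retry once per address to test, for example with an address from each isolated network, and treat a non-zero return as a failed validation.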

Comment 5 Marios Andreou 2016-01-15 15:04:26 UTC
apologies, downstream review updated, is now at https://code.engineering.redhat.com/gerrit/#/c/65658/

Comment 8 Marius Cornea 2016-01-19 10:58:43 UTC
*** Bug 1299242 has been marked as a duplicate of this bug. ***

Comment 9 Marius Cornea 2016-01-19 11:01:12 UTC
openstack-tripleo-heat-templates-0.8.6-106.el7ost.noarch

[root@overcloud-compute-0 heat-admin]# journalctl -l -u os-collect-config | grep 'Trying to ping'
Jan 19 05:26:31 overcloud-compute-0.localdomain os-collect-config[8383]: + echo -n 'Trying to ping default gateway 192.0.2.1...'
Jan 19 05:26:31 overcloud-compute-0.localdomain os-collect-config[8383]: Trying to ping default gateway 192.0.2.1...+ local COUNT=0
Jan 19 05:28:45 overcloud-compute-0.localdomain os-collect-config[8383]: [2016-01-19 05:28:45,175] (heat-config) [INFO] {"deploy_stdout": "Trying to ping fd00:fd00:fd00:2000:f816:3eff:feeb:3100 for local network fd00:fd00:fd00:2000::/64...SUCCESS\nTrying to ping fd00:fd00:fd00:3000:f816:3eff:feca:b10a for local network fd00:fd00:fd00:3000::/64...SUCCESS\nTrying to ping fd00:fd00:fd00:5000:f816:3eff:fe71:8edd for local network fd00:fd00:fd00:5000::/64...SUCCESS\n", "deploy_stderr": "", "deploy_status_code": 0}
Jan 19 05:28:45 overcloud-compute-0.localdomain os-collect-config[8383]: [2016-01-19 05:28:45,171] (heat-config) [INFO] Trying to ping fd00:fd00:fd00:2000:f816:3eff:feeb:3100 for local network fd00:fd00:fd00:2000::/64...SUCCESS
Jan 19 05:28:45 overcloud-compute-0.localdomain os-collect-config[8383]: Trying to ping fd00:fd00:fd00:3000:f816:3eff:feca:b10a for local network fd00:fd00:fd00:3000::/64...SUCCESS
Jan 19 05:28:45 overcloud-compute-0.localdomain os-collect-config[8383]: Trying to ping fd00:fd00:fd00:5000:f816:3eff:fe71:8edd for local network fd00:fd00:fd00:5000::/64...SUCCESS

Comment 11 errata-xmlrpc 2016-02-18 16:49:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0264.html