Bug 1246596

Summary: Add support for network validation tests
Product: Red Hat OpenStack
Reporter: Angus Thomas <athomas>
Component: openstack-tripleo-heat-templates
Assignee: Dan Prince <dprince>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: high
Priority: urgent
Version: Director
CC: calfonso, djuran, dsneddon, mandreou, mburns, mcornea, michele, nbarcet, ohochman, rhel-osp-director-maint, whayutin
Target Milestone: y1
Keywords: Triaged, ZStream
Target Release: 7.0 (Kilo)
Hardware: All
OS: Linux
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-59.el7ost
Doc Type: Bug Fix
Last Closed: 2015-10-08 12:15:43 UTC
Type: Bug

Description Angus Thomas 2015-07-24 16:31:24 UTC
In order to ensure that network-related problems can be caught quickly at overcloud deployment time, we need to include the network validation tests which have recently been posted upstream:

https://review.openstack.org/#/c/204781/
https://review.openstack.org/#/c/204806/

These will verify that the interfaces which os-net-config has created can be used to ping the undercloud controller, before the configuration of openstack services begins. If that fails, the deployment will terminate with detailed error reporting.
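
For readers who want a concrete picture of the check described above, here is a minimal sketch in Python. It is not the upstream script from the linked reviews. The function names, the `targets` structure, the injectable `ping` callable and the FAILURE wording are all assumptions made for illustration; `system_ping` assumes the Linux iputils `ping` options `-c` and `-W`.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def system_ping(ip: str) -> bool:
    """Send one ICMP echo request with the system ping; True if a reply arrived."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],  # one packet, 2 second timeout (Linux iputils)
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def validate_networks(
    targets: Iterable[Tuple[str, str]],
    ping: Callable[[str], bool] = system_ping,
) -> Tuple[int, List[str]]:
    """Ping each (ip, description) target in order and stop at the first failure.

    Returns (status_code, report_lines). A non-zero status code is what would
    cause the deployment step to fail before service configuration starts.
    """
    report: List[str] = []
    for ip, description in targets:
        if ping(ip):
            report.append(f"Trying to ping {ip} for {description}...SUCCESS")
        else:
            report.append(f"Trying to ping {ip} for {description}...FAILURE")
            return 1, report
    return 0, report
```

Passing a fake `ping` callable makes the logic easy to exercise without touching a real network; the default falls back to the system `ping` command.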

Comment 4 Marios Andreou 2015-07-28 12:01:59 UTC
FYI, I tested the two upstream tripleo-heat-templates patches above. I cloned the current downstream and cherry-picked them onto it.

My env was otherwise current poodle. I hit a problem with tuskar, since the new 'validation-scripts' directory isn't included in the paths to grab role-extra data from. I have a review out for that @ https://review.openstack.org/#/c/204781/. With this applied and the roles recreated as described in [1], I got the expected output on compute/controller:

/var/log/messages:2118:Jul 28 07:27:35 localhost os-collect-config: [2015-07-28 07:27:35,828] (heat-config) [INFO] {"deploy_stdout": "Trying to ping 172.16.0.7 for local network 172.16.0.0/24...SUCCESS\nTrying to ping 172.16.1.9 for local network 172.16.1.0/24...SUCCESS\nTrying to ping 172.16.2.10 for local network 172.16.2.0/24...SUCCESS\nTrying to ping default gateway 192.0.2.1...SUCCESS\n", "deploy_stderr": "", "deploy_status_code": 0}



[1] https://github.com/rdo-management/instack-undercloud/blob/c072ac1e16f3f75dc229c7cffae8acab0a52c1c9/doc/source/advanced_deployment/reload_roles_and_plan.rst
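
The heat-config entry quoted above carries the script output as a JSON payload after the `[INFO]` tag. A minimal sketch of pulling the per-network results out of such a line, assuming the line layout shown in this comment; `parse_heat_config_line` and `summarize_pings` are hypothetical helper names, not part of any tool mentioned here.

```python
import json


def parse_heat_config_line(line: str) -> dict:
    """Return the JSON payload that follows the '[INFO] ' tag of a heat-config log line."""
    _, _, payload = line.partition("[INFO] ")
    return json.loads(payload)


def summarize_pings(payload: dict) -> dict:
    """Map each 'Trying to ping ...' message to True for SUCCESS, False otherwise."""
    results = {}
    for entry in payload["deploy_stdout"].splitlines():
        target, separator, outcome = entry.partition("...")
        if separator:  # ignore lines that do not follow the 'message...OUTCOME' shape
            results[target] = outcome == "SUCCESS"
    return results
```

Checking `deploy_status_code` in the parsed payload alongside the per-line summary gives a quick way to confirm that every network was reachable.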

Comment 5 Marios Andreou 2015-07-28 12:05:34 UTC
Whoops, sorry, the link above is wrong. My review is at https://review.openstack.org/#/c/206502/ (I linked to Dan's patch instead).

Comment 6 chris alfonso 2015-08-26 16:41:17 UTC
*** Bug 1255453 has been marked as a duplicate of this bug. ***

Comment 9 Michele Baldessari 2015-09-14 10:24:21 UTC
Currently, on openstack-tripleo-heat-templates-0.8.6-58.el7ost.noarch, the deployment fails as follows:
openstack overcloud deploy --templates --control-scale 3 --control-flavor vm --compute-scale 3 --compute-flavor baremetal  --ntp-server 10.16.255.1
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
ERROR: openstack Could not fetch contents for file:///usr/share/openstack-tripleo-heat-templates/validation-scripts/all-nodes.sh

We're missing the all-nodes.sh file in the rpm:
$ rpm -ql openstack-tripleo-heat-templates | grep validation 
/usr/share/openstack-tripleo-heat-templates/all-nodes-validation.yaml
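
The mismatch above (a template that references validation-scripts/all-nodes.sh while the rpm only ships all-nodes-validation.yaml) is the kind of thing a small packaging check could catch. A minimal sketch, assuming the list of required paths is supplied by hand; `list_package_files` and `missing_from_package` are hypothetical helper names, and only `rpm -ql` itself comes from the comment above.

```python
import subprocess
from typing import Iterable, List


def list_package_files(package: str) -> List[str]:
    """Return the file list of an installed rpm, as printed by 'rpm -ql'."""
    output = subprocess.run(
        ["rpm", "-ql", package],
        capture_output=True,
        text=True,
        check=True,  # raise if rpm reports that the package is not installed
    ).stdout
    return output.splitlines()


def missing_from_package(required: Iterable[str], packaged: Iterable[str]) -> List[str]:
    """Return the required paths that the package does not ship, in input order."""
    shipped = set(packaged)
    return [path for path in required if path not in shipped]
```

On an affected system, `missing_from_package(required, list_package_files("openstack-tripleo-heat-templates"))` would be expected to list the all-nodes.sh path.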

Comment 11 Mike Burns 2015-09-14 12:42:32 UTC
Spec file issue fixed and rpm rebuilt

Comment 17 wes hayutin 2015-09-15 12:55:53 UTC
Thanks Dan, Marios,
Should network-validation be turned off for virt deployments?

Comment 19 Dan Sneddon 2015-09-15 18:28:10 UTC
(In reply to wes hayutin from comment #17)
> Thanks Dan, Marios,
> Should network-validation be turned off for virt deployments?

I think so, since it was designed to validate a production network, and in virt we make compromises with the network topology that confuse the validation script. This script is still very useful for bare metal deployments with network isolation.

Comment 20 Dan Prince 2015-09-17 13:53:19 UTC
To be clear the new validations were designed to work fine both with and without network isolation. Also, they should work in both virt and non-virt environments.

This:  https://review.openstack.org/#/c/204781/

----

We spoke about this on IRC today and we think the issue with testing this downstream is actually caused by this missing revert (which already landed upstream):

We need to backport this: https://review.openstack.org/#/c/205206/

Comment 21 Omri Hochman 2015-09-18 20:00:45 UTC
Verified with: openstack-tripleo-heat-templates-0.8.6-62.el7ost.noarch

I specified the wrong VLAN in network-environment.yaml (changed StorageMgmtNetworkVlanID: 203 --> StorageMgmtNetworkVlanID: 233).

I started the deployment and got the following error after ~25 minutes:
-------------------------------------------------------------------
Stack failed with status: Resource CREATE failed: Error: resources.CephStorageAllNodesValidationDeployment.resources[0]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 1

ERROR: openstack Heat Stack create failed


[stack@undercloud ~]$ heat resource-list overcloud -n 5 | grep -v COMPLETE
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| resource_name                               | physical_resource_id                          | resource_type                                     | resource_status    | updated_time         | parent_resource                             |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
| CephStorageAllNodesValidationDeployment     | 9e66cf81-c708-43ec-8bba-8ca70d03e859          | OS::Heat::StructuredDeployments                   | CREATE_FAILED      | 2015-09-18T23:42:50Z |                                             |
| ComputeNodesPostDeployment                  | 611f66ad-7e57-47bf-9466-9479f1fd9524          | OS::TripleO::ComputePostDeployment                | CREATE_FAILED      | 2015-09-18T23:42:50Z |                                             |
| ControllerAllNodesValidationDeployment      | 7a3f2705-2448-4226-8cd1-6c3ae8bf56f1          | OS::Heat::StructuredDeployments                   | CREATE_FAILED      | 2015-09-18T23:42:50Z |                                             |
| ControllerNodesPostDeployment               | ae3e5483-e395-4632-9bcd-92d7f291f016          | OS::TripleO::ControllerPostDeployment             | CREATE_FAILED      | 2015-09-18T23:42:50Z |                                             |
| 0                                           | 8118e256-0913-4470-82f8-3d0eba7630d6          | OS::Heat::StructuredDeployment                    | CREATE_FAILED      | 2015-09-18T23:54:45Z | CephStorageAllNodesValidationDeployment     |
| 1                                           | e50c006c-5b2c-4385-a0ba-159e007c992c          | OS::Heat::StructuredDeployment                    | CREATE_FAILED      | 2015-09-18T23:55:07Z | ControllerAllNodesValidationDeployment      |
| 2                                           | 1db8dccb-7015-4529-bf96-100a6005700e          | OS::Heat::StructuredDeployment                    | CREATE_FAILED      | 2015-09-18T23:55:07Z | ControllerAllNodesValidationDeployment      |
| ControllerOvercloudServicesDeployment_Step6 |                                               | OS::Heat::StructuredDeployments                   | CREATE_IN_PROGRESS | 2015-09-18T23:55:14Z | ControllerNodesPostDeployment               |
+---------------------------------------------+-----------------------------------------------+---------------------------------------------------+--------------------+----------------------+---------------------------------------------+
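
When the resource list is long, it can help to filter the table programmatically. A minimal sketch of parsing the ASCII table printed by the command above and keeping only the failed rows, assuming the column names shown in this output; `parse_resource_table` and `failed_resources` are hypothetical helper names, not part of the heat client.

```python
from typing import Dict, List


def parse_resource_table(text: str) -> List[Dict[str, str]]:
    """Turn a '|'-delimited ASCII table into a list of dicts keyed by the header row."""
    rows: List[Dict[str, str]] = []
    header = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip the +----+ border lines and anything that is not a table row
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def failed_resources(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the rows whose resource_status ends in _FAILED."""
    return [row for row in rows if row["resource_status"].endswith("_FAILED")]
```

Rows with a non-empty `parent_resource` are the nested deployments; in the output above they point back to the AllNodesValidationDeployment resources that failed.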

Comment 23 errata-xmlrpc 2015-10-08 12:15:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862