Bug 1239130 - [RFE] Heat environment sanity check
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: medium
Target Milestone: ---
Target Release: 10.0 (Newton)
Assigned To: Hugh Brock
QA Contact: Shai Revivo
Keywords: FutureFeature, Triaged
Depends On:
Blocks:
Reported: 2015-07-03 12:47 EDT by Marius Cornea
Modified: 2016-09-30 03:58 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
The director does not provide network validation before or during a deployment. This means a deployment with a bad network configuration can run for two hours with no output and can result in failure. A network validation script is currently in development and will be released in the future.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-30 03:58:45 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Marius Cornea 2015-07-03 12:47:37 EDT
Description of problem:

Today we hit an issue where the overcloud deployment timed out after 2 hours. The cause was a broken network template for the compute role (it contained a port for the storage management network). It would be nice to have a sanity check that validates the heat environment before deployment, so you don't have to wait 2 hours to find a problem that shows up in the early stages of the deployment.
Comment 3 chris alfonso 2015-07-06 12:09:23 EDT
Please provide the exact steps and the resulting failure so we can see whether checks can be added.
Comment 5 Marius Cornea 2015-07-07 05:44:03 EDT
Deploy the overcloud by passing -e ~/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml.

/usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml is the default file.

network-environment.yaml includes this compute.yaml NIC template[1]. The template contains a VLAN interface with an IP address from StorageMgmtIpSubnet, but in network-isolation.yaml the compute role doesn't have a port for StorageMgmt.

As a result this error[2] shows up on the deployed compute nodes.

[1] http://pastebin.test.redhat.com/295175
[2] http://pastebin.test.redhat.com/295181
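
The mismatch described in this comment could be caught before deployment with a check along the following lines. This is only a minimal sketch, not the director's validation code. It assumes the templates have already been parsed into plain Python dicts, that subnet parameters follow the `<Network>IpSubnet` naming seen above, and that role ports appear in the resource registry under keys of the form `OS::TripleO::<Role>::Ports::<Network>Port`. The function names and the sample data are hypothetical, since the pastebin contents are not reproduced here.

```python
def referenced_subnet_params(node):
    """Recursively collect get_param names that end in 'IpSubnet'."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "get_param" and isinstance(value, str) and value.endswith("IpSubnet"):
                found.add(value)
            else:
                found |= referenced_subnet_params(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_subnet_params(item)
    return found


def missing_ports(role, resource_registry, nic_config):
    """Return (parameter, expected registry key) pairs with no matching port."""
    missing = []
    for param in sorted(referenced_subnet_params(nic_config)):
        network = param[: -len("IpSubnet")]
        # Assumed key convention; adjust to the registry actually in use.
        port_key = "OS::TripleO::%s::Ports::%sPort" % (role, network)
        if port_key not in resource_registry:
            missing.append((param, port_key))
    return missing


# Hypothetical sample data: the compute role has no StorageMgmt port.
sample_registry = {
    "OS::TripleO::Compute::Ports::InternalApiPort": "../network/ports/internal_api.yaml",
    "OS::TripleO::Compute::Ports::StoragePort": "../network/ports/storage.yaml",
    "OS::TripleO::Compute::Ports::TenantPort": "../network/ports/tenant.yaml",
}
sample_nic_config = [
    {"type": "vlan", "addresses": [{"ip_netmask": {"get_param": "StorageIpSubnet"}}]},
    {"type": "vlan", "addresses": [{"ip_netmask": {"get_param": "StorageMgmtIpSubnet"}}]},
]

for param, key in missing_ports("Compute", sample_registry, sample_nic_config):
    print("NIC template uses %s but the registry has no %s" % (param, key))
```

A check like this only compares names, so it would run in well under a second and could fail the deploy command before any node is provisioned.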
Comment 6 Marius Cornea 2015-07-13 14:38:38 EDT
Another check we should cover:

When creating OVS bridges, only one interface should be a member of the bridge unless bonds are used.

Here's an example of a bad template that might lead to loops:

resources:
  OsNetConfigImpl:
    type: OS::Heat::StructuredConfig
    properties:
      group: os-apply-config
      config:
        os_net_config:
          network_config:
            -
              type: ovs_bridge
              name: br-storage
              use_dhcp: true
              members:
                -
                  type: interface
                  name: eth0
                  use_dhcp: false
                -
                  type: interface
                  name: eth1
                  use_dhcp: false
                -
                  type: interface
                  name: eth2
                  # force the MAC address of the bridge to this interface
                  primary: true
                  addresses:
                  -
                    ip_netmask: {get_param: StorageIpSubnet}
                  -
                    ip_netmask: {get_param: StorageMgmtIpSubnet}
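
A rough sketch of how this bridge check might look is below. It is not the director's implementation. It assumes the `network_config` list from the template above has already been loaded into Python dicts and lists, and it treats interfaces nested under a bond member as acceptable because only direct `interface` members of the bridge are counted. The function name `find_multi_interface_bridges` is hypothetical.

```python
def find_multi_interface_bridges(network_config):
    """Return (bridge name, interface names) for bridges with more than one
    direct member of type 'interface'."""
    problems = []
    for entry in network_config:
        if entry.get("type") != "ovs_bridge":
            continue
        members = entry.get("members", [])
        # Only direct interface members count; bonded interfaces are nested
        # one level deeper, under the bond's own 'members' list.
        interfaces = [m.get("name") for m in members if m.get("type") == "interface"]
        if len(interfaces) > 1:
            problems.append((entry.get("name"), interfaces))
    return problems


# The bad template from this comment, reduced to the fields the check reads.
bad_network_config = [
    {
        "type": "ovs_bridge",
        "name": "br-storage",
        "use_dhcp": True,
        "members": [
            {"type": "interface", "name": "eth0", "use_dhcp": False},
            {"type": "interface", "name": "eth1", "use_dhcp": False},
            {"type": "interface", "name": "eth2", "primary": True},
        ],
    }
]

for bridge, interfaces in find_multi_interface_bridges(bad_network_config):
    print("%s has multiple unbonded interfaces: %s" % (bridge, ", ".join(interfaces)))
```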
Comment 10 Jaromir Coufal 2016-09-30 03:58:45 EDT
This should be fixed with validations.
