Bug 1268415

Summary: rhel-osp-director: unable to configure overcloud after creation using the OS::TripleO::NodeExtraConfigPost resource.
Product: Red Hat OpenStack
Reporter: Alexander Chuzhoy <sasha>
Component: python-rdomanager-oscplugin
Assignee: Steve Baker <sbaker>
Status: CLOSED ERRATA
QA Contact: Alexander Chuzhoy <sasha>
Severity: unspecified
Docs Contact:
Priority: high
Version: unspecified
CC: calfonso, dmacpher, dnavale, jcall, jcoufal, jraju, jslagle, mburns, mcornea, ohochman, rhel-osp-director-maint, sbaker, zbitter
Target Milestone: y2
Keywords: Triaged
Target Release: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-rdomanager-oscplugin-0.0.10-16.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the base resource registry environment was included for all overcloud stack updates, which meant customizations could be lost unless all environment files were repeated in order when calling "openstack overcloud deploy". With this update, it is possible to call "openstack overcloud deploy" with no environments without losing customizations. If any environment files are specified, then all environment files must be specified again in the desired order.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-21 16:49:57 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
heat-engine.log from the undercloud. none

Description Alexander Chuzhoy 2015-10-02 19:03:04 UTC
rhel-osp-director: unable to configure overcloud after creation using the OS::TripleO::NodeExtraConfigPost resource.
The procedure to configure the overcloud after creation is outlined here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html-single/Director_Installation_and_Usage/index.html#sect-Configuring_after_Overcloud_Creation



Environment:
python-heatclient-0.6.0-1.el7ost.noarch
instack-undercloud-2.1.2-29.el7ost.noarch
openstack-heat-engine-2015.1.1-5.el7ost.noarch
openstack-heat-api-cfn-2015.1.1-5.el7ost.noarch
openstack-heat-api-cloudwatch-2015.1.1-5.el7ost.noarch
openstack-puppet-modules-2015.1.8-21.el7ost.noarch
openstack-heat-api-2015.1.1-5.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.1-5.el7ost.noarch
openstack-heat-common-2015.1.1-5.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-71.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch



Steps to reproduce:
1. Deploy the overcloud
2. Attempt to change the overcloud configuration using the steps outlined here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html-single/Director_Installation_and_Usage/index.html#sect-Configuring_after_Overcloud_Creation


Result:
The script doesn't run on the overcloud nodes.

Comment 2 Zane Bitter 2015-10-02 19:14:19 UTC
This is likely because the client is not uploading the new version of the script in the "files" section of the environment, so Heat is retaining the existing version.

If environment files are not specified explicitly, then there's no real way for the client to know which scripts need to be uploaded. However, Sasha reports that it's not working even though the environment file is explicitly specified. So we might not be uploading any modified files at all.
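
A minimal sketch of the mechanism described in this comment, not the actual client code. It assumes the client walks the resource_registry of each environment file passed with -e and reads every mapped template from disk into a "files" map that is sent with the stack update. The names collect_files and read_file are hypothetical, and only mappings that end in .yaml are treated as template paths.

```python
import os


def collect_files(environments, read_file):
    """Build a {path: contents} map for templates named in resource_registry.

    `environments` is a list of (env_path, env_dict) pairs, where env_dict is
    the parsed YAML of one environment file. `read_file` is any callable that
    returns the contents of a path (hypothetical; injected so the sketch does
    no real I/O).
    """
    files = {}
    for env_path, env in environments:
        base_dir = os.path.dirname(os.path.abspath(env_path))
        for target in env.get("resource_registry", {}).values():
            # Skip mappings to other resource types such as OS::Heat::None.
            if not isinstance(target, str) or not target.endswith(".yaml"):
                continue
            # Relative template paths are resolved against the environment file.
            full_path = os.path.normpath(os.path.join(base_dir, target))
            files[full_path] = read_file(full_path)
    return files
```

Under this assumption, an environment that is not passed on the command line is never walked, so none of its templates are re-read and Heat keeps the copies it already has.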

Comment 3 John Call 2015-10-05 15:43:24 UTC
Bug 1267855 may be related

Comment 7 Alexander Chuzhoy 2015-12-02 17:08:37 UTC
FailedQA:
Environment:
python-rdomanager-oscplugin-0.0.10-19.el7ost.noarch

Deployed successfully HA overcloud with: openstack overcloud deploy --templates --control-scale 3 --compute-scale 1  --neutron-network-type gre --neutron-tunnel-types gre  --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml


Attempted to run the script in a couple of ways:
openstack overcloud deploy --templates -e ~/post_config.yaml                                         
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates                                        
Stack failed with status: resources.Networks: resources.TenantNetwork: Conflict: resources.TenantSubnet: Unable to complete operation on subnet 638903d8-f61c-4d2e-8aad-46dfb62e8dcc. One or more ports have an IP allocation from this subnet.                                                                                                    
ERROR: openstack Heat Stack update failed. 



openstack overcloud deploy --templates --control-scale 3 --compute-scale 1  --neutron-network-type gre --neutron-tunnel-types gre  --ntp-server 10.5.26.10 --timeout 90 -e post_config.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
Deploying templates in the directory /usr/share/openstack-tripleo-heat-templates
Stack failed with status: resources.Networks: resources.StorageMgmtNetwork: Conflict: resources.StorageMgmtSubnet: Unable to complete operation on subnet 37d91214-f32f-464c-b0a5-e9ad75d4a52a. One or more ports have an IP allocation from this subnet.
ERROR: openstack Heat Stack update failed.




[stack@instack ~]$ cat post_config.yaml
resource_registry:
  OS::TripleO::NodeExtraConfigPost: /home/stack/post.yaml
parameter_defaults:
  message1: hello


[stack@instack ~]$ cat post.yaml
heat_template_version: 2014-10-16

parameters:
  servers:
    type: json

  nameserver_ip:
    type: string

resources:

  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/sh
            echo "message _MESSAGE_" >> /root/file_sasha
          params:
            _MESSAGE_: {get_param: message1}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers:  {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE','UPDATE']

Comment 8 Steve Baker 2015-12-02 21:57:28 UTC
Alexander, if you specify an extra environment then you must specify all of them again as well. If our docs don't say this then they need to be corrected. So if you deployed with:

  openstack overcloud deploy --templates --control-scale 3 --compute-scale 1  --neutron-network-type gre --neutron-tunnel-types gre  --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml

Then you should add the extra config with:

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e ~/post_config.yaml

Until the Heat API supports multiple environments this will be necessary.
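
To illustrate why the full list has to be repeated, here is a rough sketch, assuming the client merges the parsed environment files in command-line order into one environment, with later files overriding earlier ones. The function merge_environments and the contents of the three dictionaries are illustrative stand-ins, not the real files or the real client code.

```python
def merge_environments(environments):
    """Merge parsed environment dicts in order; later files override earlier ones."""
    merged = {"resource_registry": {}, "parameter_defaults": {}}
    for env in environments:
        for section in merged:
            merged[section].update(env.get(section, {}))
    return merged


# Illustrative stand-ins for the three -e files in the command above.
network_isolation = {
    "resource_registry": {"OS::TripleO::Network::Tenant": "network/tenant.yaml"},
}
network_environment = {
    "parameter_defaults": {"TenantNetCidr": "172.16.0.0/24"},
}
post_config = {
    "resource_registry": {"OS::TripleO::NodeExtraConfigPost": "/home/stack/post.yaml"},
    "parameter_defaults": {"message1": "hello"},
}

full = merge_environments([network_isolation, network_environment, post_config])
partial = merge_environments([post_config])

# Registry mappings that an update with only post_config no longer carries.
dropped = sorted(set(full["resource_registry"]) - set(partial["resource_registry"]))
print(dropped)  # ['OS::TripleO::Network::Tenant']
```

In this sketch, passing only the post-config environment leaves the network mapping out of the update, which would be consistent with Heat trying to remove the isolated networks and hitting the subnet conflict reported in comment 7.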

Looking at the docs there are many places where the following warning has been added:

  IMPORTANT
  If you passed any extra environment files when you created the Overcloud,
  pass them again here using the -e or --environment-file option to avoid
  making undesired changes to the Overcloud.

However this warning is missing from https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html-single/Director_Installation_and_Usage/index.html#sect-Configuring_after_Overcloud_Creation so I think a docs fix is needed here.

Comment 9 Alexander Chuzhoy 2015-12-03 17:05:22 UTC
FailedQA:

Re-attempted:
###################################################################
Deployment command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1  --neutron-network-type gre --neutron-tunnel-types gre  --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
###################################################################
Update command:
openstack overcloud deploy --templates  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e post.yaml
###################################################################
cat post.yaml:
resource_registry:
  OS::TripleO::NodeExtraConfigPost: message.yaml
parameter_defaults:
  message1: hello_world
###################################################################
cat message.yaml:
heat_template_version: 2014-10-16

parameters:
  servers:
    type: json
  message1:
    type: string

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/sh
            echo "message: _MESSAGE_" >> /root/sasha
          params:
            _MESSAGE_: {get_param: message1}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers:  {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE','UPDATE']

###################################################################

Result:
Stack failed with status: resources.ControllerNodesPostDeployment: resources.ControllerOvercloudServicesDeployment_Step4: Error: resources[2]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6                                                                               
ERROR: openstack Heat Stack update failed.     
###################################################################


Checking the nodes, I see that the file was created as expected on only one node; the rest don't have it.

Comment 10 Alexander Chuzhoy 2015-12-03 17:08:31 UTC
Created attachment 1101864 [details]
heat-engine.log from the undercloud.

Comment 11 Zane Bitter 2015-12-04 00:18:38 UTC
Status code 6 is from puppet. You should look at the puppet output. You can get it by showing the software deployment that failed in the Heat API.
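
For reference, a small sketch of how the code can be read, assuming the deployment ran puppet with --detailed-exitcodes (an assumption about how the hook invokes puppet, not something shown in this report). The helper describe_puppet_exit_code is hypothetical; the code meanings follow puppet's documented detailed exit codes.

```python
# Meanings of puppet's exit codes when run with --detailed-exitcodes.
PUPPET_DETAILED_EXIT_CODES = {
    0: "run succeeded with no changes and no failures",
    1: "run failed",
    2: "run succeeded and some resources were changed",
    4: "run succeeded but some resources failed",
    6: "run succeeded with both changed and failed resources",
}


def describe_puppet_exit_code(code):
    """Return a short description of a puppet detailed exit code."""
    return PUPPET_DETAILED_EXIT_CODES.get(code, "unrecognized exit code")


print(describe_puppet_exit_code(6))
```

Under that assumption, a status of 6 means puppet applied some changes and at least one resource failed, so the failing resource should be named in the puppet output of the failed deployment.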

Comment 12 Alexander Chuzhoy 2015-12-04 21:49:26 UTC
Verified:
Environment:
python-rdomanager-oscplugin-0.0.10-19.el7ost.noarch


Was able to complete the task successfully.


 Deployment command: openstack overcloud deploy --templates --control-scale 3 --compute-scale 1  --neutron-network-type gre --neutron-tunnel-types gre  --ntp-server x.x.x.x --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml


Update command:
openstack overcloud deploy --templates  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e post.yaml


cat post.yaml
resource_registry:
  OS::TripleO::NodeExtraConfigPost: test.yaml
parameter_defaults:
  message1: hello_world



cat test.yaml
heat_template_version: 2014-10-16

parameters:
  servers:
    type: json
  message1:
    type: string

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/sh
            echo "message _MESSAGE_" >> /root/test
          params:
            _MESSAGE_: {get_param: message1}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers:  {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE','UPDATE']

Comment 15 errata-xmlrpc 2015-12-21 16:49:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2650