Bug 1400250 - Overcloud with *PostConfig hooks fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Dan Macpherson
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Duplicates: 1427077
Depends On:
Blocks: 1354654 1364298 1454747 1454748
Reported: 2016-11-30 18:02 UTC by Dan Macpherson
Modified: 2022-08-09 14:16 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-31 16:52:23 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-8530 0 None None None 2022-08-09 14:16:13 UTC

Description Dan Macpherson 2016-11-30 18:02:00 UTC
Talked to shardy about this issue. I'm testing out deploying an Overcloud with 1 Controller, 1 Compute, and 3 Ceph nodes, and then integrating it into a Ceph Storage Console. This requires installing the Ceph Storage Console Agent on the Controller and Ceph nodes.

However, the ControllerPostConfig and CephStoragePostConfig hooks seem to fail with reference to the server param in the ObjectStorage resource (which isn't being used).

2016-11-30 17:32:22Z [overcloud.AllNodesDeploySteps.ControllerPostConfig.RHSCSetupControllerDeployment]: CREATE_FAILED  resources.RHSCSetupControllerDeployment: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:22Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig.RHSCSetupCeph]: CREATE_COMPLETE  state changed
2016-11-30 17:32:22Z [overcloud.AllNodesDeploySteps.ControllerPostConfig]: CREATE_FAILED  Resource CREATE failed: resources.RHSCSetupControllerDeployment: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:22Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig.RHSCSetupCephDeployments]: CREATE_IN_PROGRESS  state changed
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig.RHSCSetupCephDeployments]: CREATE_FAILED  resources.RHSCSetupCephDeployments: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig]: CREATE_FAILED  Resource CREATE failed: resources.RHSCSetupCephDeployments: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.BlockStoragePostConfig]: CREATE_COMPLETE  state changed
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.ComputePostConfig]: CREATE_COMPLETE  state changed
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.ObjectStoragePostConfig]: CREATE_COMPLETE  state changed
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.ControllerPostConfig]: CREATE_FAILED  resources.RHSCSetupControllerDeployment: resources.ControllerPostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps.CephStoragePostConfig]: CREATE_FAILED  CREATE aborted
2016-11-30 17:32:23Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED  Resource CREATE failed: resources.RHSCSetupControllerDeployment: resources.ControllerPostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:24Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED  resources.RHSCSetupControllerDeployment: resources.AllNodesDeploySteps.resources.ControllerPostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-11-30 17:32:24Z [overcloud]: CREATE_FAILED  Resource CREATE failed: resources.RHSCSetupControllerDeployment: resources.AllNodesDeploySteps.resources.ControllerPostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string

So a couple of things worth noting:

1. I'm not using the ObjectStorage role, so it's weird that it's making reference to an unrelated role. I even tried using a custom roles_data file without ObjectStorage, but it threw the same error for the Compute role.

2. Not sure why the server param is brought up. The *PostConfig hooks require the servers param, which is a json map. So I'm not sure why it's expecting a single server.

Will attach the full overcloud plan to this BZ for debugging.

Comment 2 Andreas Karis 2016-12-30 22:00:48 UTC
I'm hitting this as well in a lab:

2016-12-30 20:57:31Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi-ControllerDeployment_Step5-bq3ihegv64sn.1]: UPDATE_COMPLETE  state changed
2016-12-30 20:57:34Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi-ControllerDeployment_Step5-bq3ihegv64sn]: UPDATE_COMPLETE  Stack UPDATE completed successfully
2016-12-30 20:57:36Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ControllerDeployment_Step5]: UPDATE_COMPLETE  state changed
2016-12-30 20:57:39Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig]: CREATE_IN_PROGRESS  state changed
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig]: CREATE_IN_PROGRESS  Stack CREATE started
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig.ExtraConfig]: CREATE_IN_PROGRESS  state changed
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig.ExtraConfig]: CREATE_COMPLETE  state changed
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig.ExtraDeployments]: CREATE_IN_PROGRESS  state changed
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig.ExtraDeployments]: CREATE_FAILED  resources.ExtraDeployments: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-12-30 20:57:40Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig]: CREATE_FAILED  Resource CREATE failed: resources.ExtraDeployments: Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-12-30 20:57:41Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi.ComputePostConfig]: CREATE_FAILED  resources.ExtraDeployments: resources.ComputePostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-12-30 20:57:42Z [overcloud-AllNodesDeploySteps-mqdlu47sqwfi]: UPDATE_FAILED  resources.ExtraDeployments: resources.ComputePostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-12-30 20:57:43Z [AllNodesDeploySteps]: UPDATE_FAILED  resources.AllNodesDeploySteps: resources.ExtraDeployments: resources.ComputePostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string
2016-12-30 20:57:43Z [overcloud]: UPDATE_FAILED  resources.AllNodesDeploySteps: resources.ExtraDeployments: resources.ComputePostConfig.Property error: resources.ObjectStorage.properties.server: Value must be a string

 Stack overcloud UPDATE_FAILED 


===============

my configuration is outlined in:
https://access.redhat.com/solutions/2838361

Comment 3 Andreas Karis 2016-12-30 22:47:15 UTC
To have the info here as well, these are the failing templates:

/home/stack/templates/enable-cpu-pinning.yaml

resource_registry:
  OS::TripleO::Tasks::ComputePostConfig: update-grub.yaml

parameter_defaults:
  # Replace device with the name of the device that contains the boot record, usually sda. 
  compute_root_disk: '/dev/sda'
  # Use the list of CPU cores reserved for guest processes as a parameter of this argument.
  compute_isol_cpu: '2,3'

  controllerExtraConfig:
    nova::scheduler::filter::scheduler_default_filters: ['RetryFilter', 'AvailabilityZoneFilter', 'RamFilter',
      'ComputeFilter', 'ComputeCapabilitiesFilter', 'ImagePropertiesFilter', 'CoreFilter',
      'NUMATopologyFilter', 'AggregateInstanceExtraSpecsFilter']

  NovaComputeExtraConfig:
    nova::compute::vcpu_pin_set: [ '2', '3' ]
    nova::compute::reserved_host_memory: '1024'

/home/stack/templates/update-grub.yaml

heat_template_version: 2014-10-16

description: >
  Extra hostname configuration

parameters:
  servers:
    type: json
  compute_root_disk:
    type: string
  compute_isol_cpu:
    type: string
  input_values:
    type: json
    description: input values for the software deployments

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/bash
            if ! grep -q isolcpus /proc/cmdline; then
            grubby --update-kernel=ALL --args="isolcpus=_ISOL_CPU_"
            grub2-install _ROOT_DISK_
            fi
          params:
            _ROOT_DISK_: {get_param: compute_root_disk}
            _ISOL_CPU_: {get_param: compute_isol_cpu}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      servers:  {get_param: servers}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE','UPDATE']
      input_values: {get_param: input_values}

Modify openstack overcloud deploy (...)

openstack overcloud deploy --templates \
(...)
-e ${template_base_dir}/enable-cpu-pinning.yaml \
(...)

Comment 4 Andreas Karis 2017-01-26 13:15:45 UTC
I raised the severity of this. The postconfig hooks are a pretty basic, documented feature and another customer just ran into this same issue.

Is there any ETA for this?

Regards,

Andreas

Comment 5 Jiri Stransky 2017-01-26 17:29:27 UTC
I'll move this to DF DFG for further triage.

From a brief scan of the templates I couldn't see what could be causing this. Just a few Qs that might help: Did you also not deploy any object storage nodes in your deployment? Did you remove the object storage role from roles_data.yaml as well?

Comment 13 Chris Paquin 2017-02-17 22:30:21 UTC
Hello, I wanted to provide an update to my previous comment. We have now figured out how to get our templates working. It seems that any time you use a type that is specific to a role (new in OSP 10; see sections 4.2/4.3 of the advanced deployment guide), you need to add an input_values parameter and the role name in the servers lookup to your templates.

See examples below.

This template is called directly via -e in our deploy script.

[stack@tpavcpclouduc1 templates]$ cat node_extra_config_post_compute.yaml
resource_registry:
  OS::TripleO::Tasks::ComputePostConfig: /home/stack/templates/NodeExtraConfigPost-Compute.yaml

It sets the value of OS::TripleO::Tasks::ComputePostConfig, which then calls the template below. Note input_values in both the parameters and the properties. Also note that the role type is included in the get_param for servers.

heat_template_version: 2014-10-16
description: 'Extra Post-Deployment Config, Compute Only'
parameters:
  servers:
    type: json
  input_values:
    type: json

# Note depends_on may be used for serialization if ordering is important
resources:
  MultiPathConfig:
    type: /home/stack/templates/extraconfig/post_deploy/apply-compute-multipath.yaml
    properties:
        servers: {get_param: [servers,Compute]}
        input_values: {get_param: input_values}

  ADCompConfig:
    type: /home/stack/templates/extraconfig/post_deploy/apply-compute-activedirectory.yaml
    properties:
        servers: {get_param: [servers,Compute]}
        input_values: {get_param: input_values}



outputs:
  deploy_stdout:
    value:
      list_join:
      - ''
      - - {get_attr: [MultiPathConfig, deploy_stdout]}
        - {get_attr: [ADCompConfig, deploy_stdout]}

In the template above we are calling two additional templates. We will look at one of them below.

heat_template_version: 2014-10-16

description: >
  Apply multipath config to overcloud systems

parameters:
  servers:
    type: json
  input_values:
    type: json

resources:
  MultiPathConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: {get_file: /home/stack/templates/extraconfig/post_deploy/post_deploy_compute-multipath.sh}

  MultiPathDeployment:
    type: OS::Heat::SoftwareDeployments
    properties:
      name: MultiPathDeployment
      #servers: {get_param: [servers,Compute]}
      servers: {get_param: [servers]}
      config:  {get_resource: MultiPathConfig}
      actions: ['CREATE','UPDATE'] # do this on CREATE and UPDATE
      input_values: {get_param: input_values}
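
Applied to the update-grub.yaml template from comment 3, the same change might look like the sketch below. It assumes, as the working templates above suggest, that a role-specific hook receives `servers` as a map keyed by role name, so the Compute entry has to be selected before it reaches OS::Heat::SoftwareDeployments. This has not been verified against a deployment; apart from the `servers` lookup and a tidied-up `grep` test, it matches the template from comment 3.

```yaml
heat_template_version: 2014-10-16

description: >
  Sketch: update-grub.yaml adapted for a role-specific PostConfig hook

parameters:
  servers:
    type: json
  compute_root_disk:
    type: string
  compute_isol_cpu:
    type: string
  input_values:
    type: json
    description: input values for the software deployments

resources:
  ExtraConfig:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        str_replace:
          template: |
            #!/bin/bash
            if ! grep -q isolcpus /proc/cmdline; then
              grubby --update-kernel=ALL --args="isolcpus=_ISOL_CPU_"
              grub2-install _ROOT_DISK_
            fi
          params:
            _ROOT_DISK_: {get_param: compute_root_disk}
            _ISOL_CPU_: {get_param: compute_isol_cpu}

  ExtraDeployments:
    type: OS::Heat::SoftwareDeployments
    properties:
      # Assumption: servers is keyed by role name, so pick the Compute entry
      # rather than passing the whole map through.
      servers: {get_param: [servers, Compute]}
      config: {get_resource: ExtraConfig}
      actions: ['CREATE', 'UPDATE']
      input_values: {get_param: input_values}
```

The resource_registry entry in enable-cpu-pinning.yaml would stay as it is, since it already maps OS::TripleO::Tasks::ComputePostConfig to this file.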

Comment 14 Chris Paquin 2017-02-17 22:41:44 UTC
Additional Note:

It appears that you only need to add input_values and the specific role type when using a resource type where the role is declared.

For example:

When using this type - OS::TripleO::Tasks::ControllerPreConfig - you will need to add input_values and the specific role type to the get_param:[servers]

But when using a pre-composable roles type, such as OS::TripleO::NodeExtraConfig, neither needs to be included.

This means you *should* be able to port your OSP 8/9 templates to OSP 10 without issues, unless you want to take advantage of the new role types defined in the Advanced Overcloud Customization Guide [1] for OSP 10.

Note that the template examples in the documentation [1] do not show an example using input_values or get_param:[servers]. The documentation should probably be updated to indicate that these values are required for certain role types. Examples should be included as well.



[1] https://access.redhat.com/documentation/en/red-hat-openstack-platform/10/single/advanced-overcloud-customization

Comment 15 Dan Macpherson 2017-02-28 05:12:02 UTC
I'm going to move this to my queue because this looks like a documentation issue now more than an engineering issue.

The differences between NodeExtraConfig and the other *PreConfig hooks are what tripped me up. Will try to re-document them to accommodate these differences.

FYI, the main differences seem to be:

- NodeExtraConfig uses the 'server' param (string) while the other *PreConfig hooks use the 'servers' param (json).
- NodeExtraConfig doesn't require the 'input_values' param while the other *PreConfig hooks do.
- NodeExtraConfig uses the SoftwareDeployment resource while the *PreConfig hooks use the SoftwareDeployments resource. Each requires properties based on the params provided.
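
For comparison, a NodeExtraConfig template following the three points above might look like the minimal sketch below. The resource names ExampleConfig and ExampleDeployment and the placeholder script are hypothetical; only the `server` string parameter, the absence of `input_values`, and the singular OS::Heat::SoftwareDeployment resource are taken from the list.

```yaml
heat_template_version: 2014-10-16

description: >
  Sketch of a NodeExtraConfig (pre-configuration, all roles) hook

parameters:
  server:
    type: string            # a single server ID, not a json map

resources:
  ExampleConfig:            # hypothetical resource name
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/bash
        echo "pre-configuration placeholder"

  ExampleDeployment:        # hypothetical resource name
    type: OS::Heat::SoftwareDeployment
    properties:
      server: {get_param: server}
      config: {get_resource: ExampleConfig}
      actions: ['CREATE', 'UPDATE']

outputs:
  deploy_stdout:
    description: Output of the deployment script
    value: {get_attr: [ExampleDeployment, deploy_stdout]}
```

A role-specific *PreConfig template would instead declare `servers` and `input_values` as json parameters and pass them to OS::Heat::SoftwareDeployments.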

Chris, any further differences I should know about?

Comment 16 Dan Macpherson 2017-02-28 05:49:19 UTC
Also, one other thing I noticed with the postconfig hooks. The differences are similar to the preconfig hooks, except that both NodeExtraConfigPost and *PostConfig use the 'servers' param (json).

The odd one out seems to be NodeExtraConfig, which uses the 'server' param (string).
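
One reading that fits the original error message, offered as an assumption rather than a confirmed diagnosis: a role-specific *PostConfig hook receives `servers` as a map keyed by role name, roughly in the shape sketched below. The server IDs are placeholders.

```yaml
# Assumed shape of the `servers` value seen by a *PostConfig hook.
# Server IDs are placeholders, not real values.
servers:
  Controller:
    '0': 'controller-0-server-id'
  Compute:
    '0': 'compute-0-server-id'
  CephStorage:
    '0': 'cephstorage-0-server-id'
  ObjectStorage: {}         # role defined but no nodes deployed
```

If OS::Heat::SoftwareDeployments creates one deployment per key of the map it is given, then passing the whole map through with `{get_param: servers}` would yield a nested resource named after a role (for example ObjectStorage) whose `server` property is a map rather than a string. That would match "resources.ObjectStorage.properties.server: Value must be a string", and it would also explain why the error moved to the Compute role once ObjectStorage was removed from roles_data.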

Comment 20 Alex Schultz 2017-03-03 20:47:26 UTC
*** Bug 1427077 has been marked as a duplicate of this bug. ***

Comment 21 Dan Macpherson 2017-03-31 16:31:43 UTC
Hi Chris,

Sorry for the delay on your feedback (had to refocus on OSP11 for the last few weeks). Have updated the *PreConfig and *PostConfig sections and both should now use the right syntax for the server param.

PreConfig:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/advanced_overcloud_customization/configuration_hooks#sect-Customizing_Overcloud_PreConfiguration

PostConfig:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/advanced_overcloud_customization/configuration_hooks#sect-Customizing_Overcloud_PostConfiguration

How does it look? Anything further required?

Comment 22 Chris Paquin 2017-03-31 16:36:24 UTC
Dan, looks good. Thanks.

Comment 23 Dan Macpherson 2017-03-31 16:52:23 UTC
Thanks, Chris!

Closing this BZ. If any further changes are required, please feel free to reopen and let me know.

