* Description of problem:
When using the environments/hyperconverged-ceph.yaml [1] to deploy an HCI (nova/ceph on one node) overcloud, compute nodes are deployed with working OSDs, but nova compute services are not configured. For example, nova.conf is empty, nova processes are not running on the compute nodes, and nova's systemd unit files show nova services to be disabled. The deploy completes successfully but nova doesn't work. Other services, e.g. glance and cinder (backed by Ceph) do work. It's as if the parameter_merge_strategies merge strategy [1] is behaving like overwrite instead [2].

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/newton/environments/hyperconverged-ceph.yaml#L10-L11
[2] http://docs.openstack.org/developer/heat/template_guide/environment.html#environment-merging

* Version-Release number of selected component (if applicable):
python-tripleoclient-5.2.0-2.el7ost.noarch
python-heatclient-1.5.0-1.el7ost.noarch

* How reproducible:
Seems deterministic. Reproduced 3 times using puddles from 18th and 19th of Oct

* Steps to Reproduce:
Do an HCI deployment with environments/hyperconverged-ceph.yaml

* Actual results:
Overcloud without Nova compute services

* Expected results:
Overcloud with Nova compute services

* Additional info:
Using "Scenario 2 - hyperconverged ceph deployment" as per http://hardysteven.blogspot.com/2016/08/tripleo-composable-services-101.html produces the needed result.
[stack@hci-director ~]$ swift download overcloud user-environment.yaml user-environment.yaml [auth 0.369s, headers 0.666s, total 0.666s, 0.014 MB/s] [stack@hci-director ~]$ more user-environment.yaml parameter_merge_strategies: {ComputeServices: merge} resource_registry: {'OS::TripleO::BlockStorage::Ports::ExternalPort': network/ports/noop.yaml 'OS::TripleO::BlockStorage::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::BlockStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::BlockStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO:: 'OS::TripleO::CephStorage::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::Cep 'OS::TripleO::CephStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::CephStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::C 'OS::TripleO::Compute::Net::SoftwareConfig': user-files/2ad0c78891e7891f89696d44ff44b9c5-co 'OS::TripleO::Compute::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::Compute 'OS::TripleO::Compute::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::Compute::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::Compu 'OS::TripleO::Controller::Net::SoftwareConfig': user-files/c538381d5c820ef0f4caf2385fd0548d 'OS::TripleO::Controller::Ports::ExternalPort': network/ports/external.yaml, 'OS::TripleO:: 'OS::TripleO::Controller::Ports::RedisVipPort': network/ports/vip.yaml, 'OS::TripleO::Contr 'OS::TripleO::Controller::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::Co 'OS::TripleO::ControllerConfig': puppet/controller-config-pacemaker.yaml, 'OS::TripleO::Net 'OS::TripleO::Network::InternalApi': network/internal_api.yaml, 'OS::TripleO::Network::Port 'OS::TripleO::Network::Ports::InternalApiVipPort': network/ports/internal_api.yaml, 'OS::TripleO::Network::Ports::RedisVipPort': network/ports/vip.yaml, 'OS::TripleO::Network: 
'OS::TripleO::Network::Ports::StorageVipPort': network/ports/storage.yaml, 'OS::TripleO::Ne 'OS::TripleO::Network::StorageMgmt': network/storage_mgmt.yaml, 'OS::TripleO::Network::Tena 'OS::TripleO::NodeExtraConfigPost': user-files/e541a433b6110f5b9ef4218a1de4443a-post-deploy 'OS::TripleO::NodeUserData': user-files/03e8fd271d16960f8955efecb6447694-first-boot-templat 'OS::TripleO::Services::CephClient': puppet/services/ceph-client.yaml, 'OS::TripleO::Servic 'OS::TripleO::Services::CephOSD': puppet/services/ceph-osd.yaml, 'OS::TripleO::Services::Ci 'OS::TripleO::Services::HAproxy': puppet/services/pacemaker/haproxy.yaml, 'OS::TripleO::Ser 'OS::TripleO::Services::Pacemaker': puppet/services/pacemaker.yaml, 'OS::TripleO::Services: 'OS::TripleO::Services::Redis': puppet/services/pacemaker/database/redis.yaml, 'OS::TripleO 'OS::TripleO::SwiftStorage::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::SwiftStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::SwiftStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO:: 'OS::TripleO::Tasks::ControllerPostPuppet': extraconfig/tasks/post_puppet_pacemaker.yaml, 'OS::TripleO::Tasks::ControllerPostPuppetRestart': extraconfig/tasks/post_puppet_pacemaker_ 'OS::TripleO::Tasks::ControllerPrePuppet': extraconfig/tasks/pre_puppet_pacemaker.yaml} [stack@hci-director ~]$
Hi,

parameter_merge_strategies:
  ComputeServices: deep_merge

Solves the issue as if something changed in template depth level :

deep_merge
Json values are deep merged. Not useful for other types like comma delimited lists and strings. If specified for them, it falls back to merge.

openstack hypervisor list
+----+---------------------------------+
| ID | Hypervisor Hostname             |
+----+---------------------------------+
| 1  | overcloud-compute-2.localdomain |
| 2  | overcloud-compute-1.localdomain |
| 3  | overcloud-compute-0.localdomain |
+----+---------------------------------+

Cheers,
Greg
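To make the quoted strategy descriptions concrete, here is a minimal Python sketch of how overwrite, merge and deep_merge treat lists and JSON-style values. It is a simplified model based on the Heat documentation text, not Heat's actual implementation; the function name merge_param is hypothetical.

```python
import copy


def merge_param(old, new, strategy="overwrite"):
    """Toy model of Heat's parameter merge strategies (not Heat's code)."""
    if strategy == "overwrite" or old is None:
        return copy.deepcopy(new)
    if isinstance(old, dict) and isinstance(new, dict):
        merged = copy.deepcopy(old)
        for key, value in new.items():
            nested = isinstance(merged.get(key), dict) and isinstance(value, dict)
            if strategy == "deep_merge" and nested:
                # deep_merge recurses into nested JSON values
                merged[key] = merge_param(merged[key], value, "deep_merge")
            else:
                # plain merge only updates the top-level keys
                merged[key] = copy.deepcopy(value)
        return merged
    if isinstance(old, list) and isinstance(new, list):
        # lists are extended; deep_merge falls back to merge here
        return old + new
    if isinstance(old, str) and isinstance(new, str):
        # strings are concatenated; deep_merge falls back to merge here
        return old + new
    raise TypeError("cannot merge values of different types")


print(merge_param(["A"], ["B"], "overwrite"))
print(merge_param(["A"], ["B"], "merge"))
print(merge_param(["A"], ["B"], "deep_merge"))
print(merge_param({"a": {"x": 1}}, {"a": {"y": 2}}, "deep_merge"))
```

In this model merge and deep_merge give the same result for a list parameter, so the two strategies only differ for nested JSON values.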
(In reply to Gregory Charot from comment #2)
> Hi,
>
> parameter_merge_strategies:
> ComputeServices: deep_merge
>
> Solves the issue as if something changed in template depth level :

It does not solve the issue for me. Also, the params we are merging are still lists, so 'merge' should work as it did before.

I am still investigating what the root cause could be.
Dougal, can you help us with this? As per comment #1, user-environment.yaml seems to preserve parameter_merge_strategies. Is there any chance we are not passing it at deployment time?
Discussed with Giulio, he is tracking down a potential issue in tripleo-common.
I am debugging the POST request and it looks like tripleoclient is passing via heatclient parameter_merge_strategies as intended: 2016-10-21 08:43:27.954 10409 DEBUG heatclient.common.http [-] curl -g -i -X POST -H 'X-Auth-User: admin' -H 'X-Auth-Token: {SHA1}5b4a7dc7215ce4ce0f124f0f454a3d1672c9fc9b' -H 'X-Region-Name: regionOne' -H 'Accept: application/json' -H 'User-Agent: python-heatclient' -H 'Content-Type: application/json' -d '{"stack_name": "overcloud", "environment": {"parameter_defaults": {"MysqlMaxConnections": 8192, "ControllerCount": 3, ... }, "parameter_merge_strategies": {"ComputeServices": "merge"}, "resource_registry": {"OS::TripleO::Services::Timezone": "http://192.168.1.1:8080/v1/AUTH_b0ffd5e578ee44ebb9bfcc9a5425426a/overcloud/puppet/services/time/timezone.yaml", ...}} the list of parameter_defaults and resource_registry is longer but parameter_merge_strategies seems to be there as wanted.
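The same check can be scripted instead of read off the debug log. This is a minimal sketch that assumes the JSON body of the stack-create POST has been captured as a string; the helper name and the abbreviated sample body are illustrative, and only the key names are taken from the log above.

```python
import json


def merge_strategies_in_payload(body):
    """Return the parameter_merge_strategies mapping from a request body."""
    payload = json.loads(body)
    environment = payload.get("environment", {})
    return environment.get("parameter_merge_strategies", {})


# Abbreviated stand-in for the real request body, which is much longer.
sample_body = json.dumps({
    "stack_name": "overcloud",
    "environment": {
        "parameter_defaults": {"MysqlMaxConnections": 8192, "ControllerCount": 3},
        "parameter_merge_strategies": {"ComputeServices": "merge"},
        "resource_registry": {},
    },
})

strategies = merge_strategies_in_payload(sample_body)
print(strategies)
```

An empty result would mean the strategy was dropped before the request reached Heat. A non-empty one, as here, points the investigation further along.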
Given we can already merge _defaults from different environment files, an alternative approach to the Heat change is to pass the services list via tripleo registry which is used as env file at deployment.
We agreed to go with the THT fix [1], i.e. move the template defaults to the base env file, rather than making any changes in heat. There is a related upstream fix in heat [2] to avoid duplicates if merge_strategies are specified in the base env file. However, we probably would not need to backport it to newton.

[1] https://review.openstack.org/#/c/391064/
[2] https://review.openstack.org/#/c/390064/
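One way to read this decision is that merge strategies combine values coming from environment files, while a default declared only in the template is not part of that merge. The Python sketch below is a toy model of that reading, not Heat or tripleo-common code; the function name and the shortened service names are placeholders.

```python
def effective_value(template_default, env_files, name, strategy="merge"):
    """Toy model: merge across env files; fall back to the template default
    only when no environment file sets the parameter."""
    merged = None
    for env in env_files:
        value = env.get("parameter_defaults", {}).get(name)
        if value is None:
            continue
        if merged is None or strategy == "overwrite":
            merged = list(value)
        else:
            merged = merged + list(value)
    return list(template_default) if merged is None else merged


# Placeholder service names, shortened for readability.
template_default = ["NovaCompute", "NovaLibvirt"]
hci_env = {"parameter_defaults": {"ComputeServices": ["CephOSD"]}}

# Defaults live only in the template: the HCI env value replaces them.
before = effective_value(template_default, [hci_env], "ComputeServices")

# Defaults moved into a base env file: merge now has two env values to combine.
base_env = {"parameter_defaults": {"ComputeServices": template_default}}
after = effective_value([], [base_env, hci_env], "ComputeServices")

print(before)
print(after)
```

Under this assumption the first result contains only the Ceph OSD service, which matches the symptom of compute nodes with OSDs but no nova configuration.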
This fix is blocked by the broken upstream CI job gate-tripleo-ci-centos-7-ovb-ha.
Status: THT fix [1] merged upstream in master (ocata). Waiting for the same fix to be +2'd in the backport to Newton [2].

[1] https://review.openstack.org/#/c/391064/
[2] https://review.openstack.org/#/c/394442/
I am still having the reported issue with my first test. The nova.conf on my computes is empty [0] after the deploy; Ceph OSD services are running, however. I have checked that I have the versions that should contain the fix [1]. I am double checking that I lined up my deploy environment overrides correctly [2]. Also, sharing my user-environment.yaml from swift [3].

John

[0]
[stack@hci-director ~]$ ansible osds -b -m shell -a "wc -l /etc/nova/nova.conf"
192.168.1.29 | SUCCESS | rc=0 >>
0 /etc/nova/nova.conf

192.168.1.34 | SUCCESS | rc=0 >>
0 /etc/nova/nova.conf

192.168.1.36 | SUCCESS | rc=0 >>
0 /etc/nova/nova.conf

192.168.1.32 | SUCCESS | rc=0 >>
0 /etc/nova/nova.conf

[1] Desired fixes are in:

[stack@hci-director ~]$ sudo rpm -qa | grep openstack-tripleo-heat
openstack-tripleo-heat-templates-5.0.0-1.5.el7ost.noarch
[stack@hci-director ~]$

[root@hci-director ~]# rpm -q python-heatclient
python-heatclient-1.5.0-1.el7ost.noarch
[root@hci-director ~]#

[stack@hci-director ~]$ sudo rpm -qa | grep openstack-heat
openstack-heat-templates-0-0.6.1e6015dgit.el7ost.noarch
openstack-heat-engine-7.0.0-5.el7ost.noarch
openstack-heat-api-cfn-7.0.0-5.el7ost.noarch
openstack-heat-api-7.0.0-5.el7ost.noarch
openstack-heat-common-7.0.0-5.el7ost.noarch
[stack@hci-director ~]$

openstack-heat-7.0.0-5.el7ost is the NVR and openstack-heat-{engine,api-cfn,api,common}-7.0.0-5.el7ost.noarch are the RPMs.
[2] My deploy command: time openstack overcloud deploy --templates \ -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/hyperconverged-ceph.yaml \ -e ~/custom-templates/network.yaml \ -e ~/custom-templates/ceph.yaml \ --control-flavor control \ --control-scale 3 \ --compute-flavor compute \ --compute-scale 4 \ Where network.yaml and ceph.yaml are at https://github.com/RHsyseng/hci/tree/master/custom-templates. [3] swift download overcloud user-environment.yaml parameter_merge_strategies: {ComputeServices: merge} resource_registry: {'OS::TripleO::BlockStorage::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::BlockStorage::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::BlockStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::BlockStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::BlockStorage::Ports::TenantPort': network/ports/noop.yaml, 'OS::TripleO::CephStorage::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::CephStorage::Ports::InternalApiPort': network/ports/noop.yaml, 'OS::TripleO::CephStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::CephStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::CephStorage::Ports::TenantPort': network/ports/noop.yaml, 'OS::TripleO::Compute::Net::SoftwareConfig': user-files/home/stack/custom-templates/nic-configs/compute-nics.yaml, 'OS::TripleO::Compute::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::Compute::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::Compute::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::Compute::Ports::StoragePort': 
network/ports/storage.yaml, 'OS::TripleO::Compute::Ports::TenantPort': network/ports/tenant.yaml, 'OS::TripleO::Controller::Net::SoftwareConfig': user-files/home/stack/custom-templates/nic-configs/controller-nics.yaml, 'OS::TripleO::Controller::Ports::ExternalPort': network/ports/external.yaml, 'OS::TripleO::Controller::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::Controller::Ports::RedisVipPort': network/ports/vip.yaml, 'OS::TripleO::Controller::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::Controller::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::Controller::Ports::TenantPort': network/ports/tenant.yaml, 'OS::TripleO::ControllerConfig': puppet/controller-config-pacemaker.yaml, 'OS::TripleO::Network::External': network/external.yaml, 'OS::TripleO::Network::InternalApi': network/internal_api.yaml, 'OS::TripleO::Network::Ports::ExternalVipPort': network/ports/external.yaml, 'OS::TripleO::Network::Ports::InternalApiVipPort': network/ports/internal_api.yaml, 'OS::TripleO::Network::Ports::RedisVipPort': network/ports/vip.yaml, 'OS::TripleO::Network::Ports::StorageMgmtVipPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::Network::Ports::StorageVipPort': network/ports/storage.yaml, 'OS::TripleO::Network::Storage': network/storage.yaml, 'OS::TripleO::Network::StorageMgmt': network/storage_mgmt.yaml, 'OS::TripleO::Network::Tenant': network/tenant.yaml, 'OS::TripleO::NodeExtraConfigPost': user-files/home/stack/custom-templates/post-deploy-template.yaml, 'OS::TripleO::NodeUserData': user-files/home/stack/custom-templates/first-boot-template.yaml, 'OS::TripleO::Services::CephClient': puppet/services/ceph-client.yaml, 'OS::TripleO::Services::CephMon': puppet/services/ceph-mon.yaml, 'OS::TripleO::Services::CephOSD': puppet/services/ceph-osd.yaml, 'OS::TripleO::Services::CinderVolume': puppet/services/pacemaker/cinder-volume.yaml, 'OS::TripleO::Services::HAproxy': puppet/services/pacemaker/haproxy.yaml, 
'OS::TripleO::Services::MySQL': puppet/services/pacemaker/database/mysql.yaml, 'OS::TripleO::Services::Pacemaker': puppet/services/pacemaker.yaml, 'OS::TripleO::Services::RabbitMQ': puppet/services/pacemaker/rabbitmq.yaml, 'OS::TripleO::Services::Redis': puppet/services/pacemaker/database/redis.yaml, 'OS::TripleO::SwiftStorage::Ports::ExternalPort': network/ports/noop.yaml, 'OS::TripleO::SwiftStorage::Ports::InternalApiPort': network/ports/internal_api.yaml, 'OS::TripleO::SwiftStorage::Ports::StorageMgmtPort': network/ports/storage_mgmt.yaml, 'OS::TripleO::SwiftStorage::Ports::StoragePort': network/ports/storage.yaml, 'OS::TripleO::SwiftStorage::Ports::TenantPort': network/ports/noop.yaml, 'OS::TripleO::Tasks::ControllerPostPuppet': extraconfig/tasks/post_puppet_pacemaker.yaml, 'OS::TripleO::Tasks::ControllerPostPuppetRestart': extraconfig/tasks/post_puppet_pacemaker_restart.yaml, 'OS::TripleO::Tasks::ControllerPrePuppet': extraconfig/tasks/pre_puppet_pacemaker.yaml}
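The empty nova.conf check in [0] above can be turned into a small script when re-testing several builds. This is a sketch that assumes the ansible ad-hoc output has the two-line shape shown in that comment (a host header line followed by the wc -l result); the function name is hypothetical, and the non-zero line count in the sample is invented for illustration.

```python
import re


def empty_nova_conf_hosts(ansible_output):
    """Return hosts whose /etc/nova/nova.conf was reported with 0 lines."""
    hosts = []
    current = None
    for line in ansible_output.splitlines():
        header = re.match(r"^(\S+) \| SUCCESS \| rc=0 >>", line)
        if header:
            current = header.group(1)
            continue
        count = re.match(r"^(\d+) /etc/nova/nova\.conf$", line.strip())
        if count and current and int(count.group(1)) == 0:
            hosts.append(current)
    return hosts


# First host copied from the comment above; the second one is hypothetical.
sample = (
    "192.168.1.29 | SUCCESS | rc=0 >>\n"
    "0 /etc/nova/nova.conf\n"
    "\n"
    "192.168.1.34 | SUCCESS | rc=0 >>\n"
    "412 /etc/nova/nova.conf\n"
)

print(empty_nova_conf_hosts(sample))
```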
The actual bug in tripleoclient/tripleo-common is tracked upstream via LP bug 1635409. Until that is fixed, we can put the entire list of services needed on the Compute role in the environment file, so that the user experience does not change (people just need to deploy passing the additional environment file) and when the upstream fix is finished we'll go back to using merge_strategies.
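As a rough illustration of that workaround, the sketch below builds an environment that lists every Compute service explicitly and carries no parameter_merge_strategies section. The service list is deliberately abbreviated and illustrative rather than the real Newton default for the Compute role, the helper name is hypothetical, and JSON is printed only to avoid a YAML library dependency.

```python
import json

# Abbreviated, illustrative subset; the real Compute role has more services.
DEFAULT_COMPUTE_SERVICES = [
    "OS::TripleO::Services::CephClient",
    "OS::TripleO::Services::NovaCompute",
    "OS::TripleO::Services::NovaLibvirt",
]


def build_hci_environment(default_services, extra_services):
    """Return an environment dict with the full, explicit service list."""
    services = list(default_services)
    for service in extra_services:
        if service not in services:
            services.append(service)
    return {
        "resource_registry": {
            "OS::TripleO::Services::CephOSD": "puppet/services/ceph-osd.yaml",
        },
        "parameter_defaults": {"ComputeServices": services},
    }


env = build_hci_environment(
    DEFAULT_COMPUTE_SERVICES, ["OS::TripleO::Services::CephOSD"]
)
print(json.dumps(env, indent=2))
```

The trade-off is that the explicit list has to be kept in sync with the role's defaults by hand, which is presumably why the comment plans to return to merge strategies once the upstream fix lands.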
According to our records, this should be resolved by openstack-tripleo-heat-templates-5.1.0-7.el7ost. This build is available now.
According to our records, this should be resolved by openstack-heat-7.0.0-7.el7ost. This build is available now.
Verified on openstack-tripleo-heat-templates-5.2.0-9.el7ost.noarch.