Created attachment 1052240
heat deployment-show output

Description of problem:
I'm starting with a 1 controller, 1 compute deployment in a virt environment without network isolation. Updating the stack to 3 controllers, 1 compute node and 1 ceph node fails.

Version-Release number of selected component (if applicable):
instack-undercloud-2.1.2-19.el7ost.noarch
openstack-tripleo-heat-templates-0.8.6-35.el7ost.noarch
openstack-heat-templates-0-0.6.20150605git.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy 1 controller, 1 compute
openstack overcloud deploy --plan-uuid 3b7779f5-8206-4913-909e-eb6e1e9d9f63 --control-scale 1 --compute-scale 1 --ceph-storage-scale 0 --block-storage-scale 0 --swift-storage-scale 0
2. Update the stack to 3 controllers, 1 compute and 1 ceph node
openstack overcloud deploy --plan-uuid 3b7779f5-8206-4913-909e-eb6e1e9d9f63 --control-scale 3 --ceph-storage-scale 1

Actual results:
Stack update fails.

Expected results:
Stack update succeeds using the new configuration.

Additional info:
Attaching the output of heat deployment-show.
I tested this on version 8 and the issue is still present.

I started with a 1 ctrl, 1 compute deployment with network isolation:

openstack overcloud deploy --templates $THT \
  -e $THT/environments/network-isolation.yaml \
  -e ~/templates/network-environment.yaml \
  -e ~/templates/firstboot-environment.yaml \
  --control-scale 1 \
  --compute-scale 1 \
  --ntp-server clock.redhat.com \
  --libvirt-type qemu

and tried to scale out to 3 controllers:

openstack overcloud deploy --templates $THT \
  -e $THT/environments/network-isolation.yaml \
  -e ~/templates/network-environment.yaml \
  -e ~/templates/firstboot-environment.yaml \
  --control-scale 3 \
  --compute-scale 1 \
  --ntp-server clock.redhat.com \
  --libvirt-type qemu

The deployment appears to be stuck at step:
overcloud-ControllerNodesPostDeployment-y237iugnyxw3-ControllerLoadBalancerDeployment_Step1-6epiqilgdz2m

From what I can see on overcloud-controller-1, it gets stuck at:

[DEBUG] Running /var/lib/heat-config/hooks/puppet < /var/run/heat-config/deployed/c4e44ac7-8fe5-4b4b-9b0d-e6a3a969846d.json
Mar 01 10:47:43 overcloud-controller-1.localdomain passwd[26694]: pam_unix(passwd:chauthtok): password changed for hacluster

Running puppet manually shows that it keeps looping on the wait-for-settle check:

Debug: Exec[wait-for-settle](provider=posix): Executing '/usr/sbin/pcs status | grep -q 'partition with quorum' > /dev/null 2>&1'
Debug: Executing '/usr/sbin/pcs status | grep -q 'partition with quorum' > /dev/null 2>&1'
Debug: /Stage[main]/Pacemaker::Corosync/Exec[wait-for-settle]/returns: Sleeping for 10.0 seconds between tries

At this stage no corosync.conf file exists in /etc/corosync/, so I guess we're missing a configuration step before trying to get the nodes into the cluster.
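To tell the two situations apart when debugging by hand on a new controller (corosync.conf missing versus quorum not yet reached), a rough sketch like the one below may help. It only mirrors the `pcs status | grep -q 'partition with quorum'` check from the puppet debug output above; it is not what puppet-pacemaker actually runs. The function name `wait_for_quorum` and the variables `COROSYNC_CONF`, `PCS_STATUS_CMD`, `MAX_TRIES` and `SLEEP_SECS` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: a bounded version of the wait-for-settle loop.
# All variable names below are illustrative assumptions, not TripleO settings.
wait_for_quorum() {
    conf="${COROSYNC_CONF:-/etc/corosync/corosync.conf}"
    status_cmd="${PCS_STATUS_CMD:-/usr/sbin/pcs status}"
    max_tries="${MAX_TRIES:-30}"
    sleep_secs="${SLEEP_SECS:-10}"

    # Without corosync.conf the node was never configured for the cluster,
    # so polling for quorum cannot succeed. Fail fast with a distinct code.
    if [ ! -f "$conf" ]; then
        echo "missing $conf: cluster not configured on this node" >&2
        return 2
    fi

    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Same check as in the puppet debug output.
        if $status_cmd 2>/dev/null | grep -q 'partition with quorum'; then
            return 0
        fi
        sleep "$sleep_secs"
        try=$((try + 1))
    done

    echo "no quorum after $max_tries tries" >&2
    return 1
}
```

Calling `wait_for_quorum` with the defaults on overcloud-controller-1 in the state described above would be expected to return 2 straight away, since /etc/corosync/corosync.conf is absent, rather than sleeping between retries.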
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
*** Bug 1336588 has been marked as a duplicate of this bug. ***
In OSP 10, with composable services and custom roles, this is possible.