Red Hat Bugzilla – 1356777 – rhel-osp-director: scale down of computes fails after upgrade 8.0->9.0, some resources are unmanaged/stopped on controllers.
Was this deployment in a sane state before the upgrade/scale attempt? I'm seeing issues starting galera right away in the logs, which makes me think the initial deployment failed. At that point I wouldn't expect anything else to work.
rhel-osp-director: scale down of computes fails after upgrade 8.0->9.0

Environment:
openstack-tripleo-heat-templates-2.0.0-15.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-15.el7ost.noarch
openstack-tripleo-heat-templates-kilo-2.0.0-15.el7ost.noarch
instack-undercloud-4.0.0-7.el7ost.noarch
openstack-puppet-modules-8.1.2-1.el7ost.noarch

Steps to reproduce:
1. Deploy 8.0 with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 --ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml
2. Upgrade to 9.0 (including updating the images for OC nodes).
3. Try to scale down the computes with:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml

Result:
2016-06-30 02:22:13 [overcloud-ComputeAllNodesValidationDeployment-rxtdysawxz72]: UPDATE_COMPLETE Stack UPDATE completed successfully
2016-06-30 02:22:14 [ComputeAllNodesValidationDeployment]: UPDATE_COMPLETE state changed
2016-06-30 02:53:15 [2]: SIGNAL_COMPLETE Unknown
2016-06-30 02:53:22 [1]: SIGNAL_COMPLETE Unknown
Stack overcloud UPDATE_FAILED
Deployment failed: Heat Stack update failed.
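As a quick sanity check on the reproduction steps, the two deploy commands can be compared option by option to confirm that only the compute count changes between step 1 and step 3. This is a minimal sketch, not part of the original report: the helper names `parse_options` and `changed_options` are hypothetical, and the comparison assumes both commands list their options in the same order.

```python
import shlex

# Commands copied from steps 1 and 3 of the report.
DEPLOY_8 = (
    "openstack overcloud deploy --templates --control-scale 3 --compute-scale 2 "
    "--ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan "
    "--ntp-server 10.5.26.10 --timeout 90 "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml "
    "-e network-environment.yaml"
)
SCALE_DOWN = (
    "openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 "
    "--ceph-storage-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan "
    "--ntp-server 10.5.26.10 --timeout 90 "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml "
    "-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml "
    "-e network-environment.yaml"
)


def parse_options(command):
    """Return an ordered list of (flag, value) pairs; value is None for bare flags."""
    tokens = shlex.split(command)
    options = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-"):
            # A flag takes a value only if the next token is not itself a flag.
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
            options.append((token, tokens[i + 1] if has_value else None))
            i += 2 if has_value else 1
        else:
            i += 1  # skip the "openstack overcloud deploy" words
    return options


def changed_options(before, after):
    """Pair options positionally and keep the pairs that differ (assumes same order)."""
    return [(a, b) for a, b in zip(parse_options(before), parse_options(after)) if a != b]


if __name__ == "__main__":
    for old, new in changed_options(DEPLOY_8, SCALE_DOWN):
        print(old, "->", new)
```

If anything other than `--compute-scale` shows up in the result, the scale-down run was not a pure scale-down and the failure may have a different trigger.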
pcs status outputs the following:

[root@overcloud-controller-0 ~]# pcs status
Cluster name: tripleo_cluster
Last updated: Fri Jul 15 03:18:01 2016
Last change: Fri Jul 15 01:50:34 2016 by root via cibadmin on overcloud-controller-0
Stack: corosync
Current DC: overcloud-controller-2 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 127 resources configured

Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Full list of resources:

 ip-192.168.200.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 (unmanaged)
 ip-10.19.94.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 (unmanaged)
 ip-10.19.95.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 (unmanaged)
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy (systemd:haproxy): Started overcloud-controller-0 (unmanaged)
     haproxy (systemd:haproxy): Started overcloud-controller-2 (unmanaged)
     haproxy (systemd:haproxy): Started overcloud-controller-1 (unmanaged)
 ip-192.168.0.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 (unmanaged)
 Master/Slave Set: galera-master [galera] (unmanaged)
     galera (ocf::heartbeat:galera): FAILED Master overcloud-controller-0 (unmanaged)
     galera (ocf::heartbeat:galera): Started overcloud-controller-2 (unmanaged)
     galera (ocf::heartbeat:galera): Started overcloud-controller-1 (unmanaged)
 Clone Set: memcached-clone [memcached] (unmanaged)
     memcached (systemd:memcached): Started overcloud-controller-0 (unmanaged)
     memcached (systemd:memcached): Started overcloud-controller-2 (unmanaged)
     memcached (systemd:memcached): Started overcloud-controller-1 (unmanaged)
 ip-10.19.94.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 (unmanaged)
 ip-10.19.184.180 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0 (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2 (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1 (unmanaged)
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Master/Slave Set: redis-master [redis] (unmanaged)
     redis (ocf::heartbeat:redis): Master overcloud-controller-0 (unmanaged)
     redis (ocf::heartbeat:redis): Started overcloud-controller-2 (unmanaged)
     redis (ocf::heartbeat:redis): Started overcloud-controller-1 (unmanaged)
 Clone Set: mongod-clone [mongod] (unmanaged)
     mongod (systemd:mongod): Started overcloud-controller-0 (unmanaged)
     mongod (systemd:mongod): Started overcloud-controller-2 (unmanaged)
     mongod (systemd:mongod): Started overcloud-controller-1 (unmanaged)
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup] (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-0 (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-2 (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-1 (unmanaged)
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup] (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-0 (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-2 (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-1 (unmanaged)
 openstack-cinder-volume (systemd:openstack-cinder-volume): Stopped (unmanaged)
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-metricd-clone [openstack-gnocchi-metricd] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-clone [openstack-heat-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-api-clone [openstack-sahara-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-sahara-engine-clone [openstack-sahara-engine] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-gnocchi-statsd-clone [openstack-gnocchi-statsd] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: delay-clone [delay] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: httpd-clone [httpd] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: neutron-server-clone [neutron-server] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]

Failed Actions:
* galera_promote_0 on overcloud-controller-0 'unknown error' (1): call=35, status=complete, exitreason='Failed initial monitor action', last-rc-change='Thu Jul 14 23:09:32 2016', queued=0ms, exec=8811ms
* openstack-nova-scheduler_start_0 on overcloud-controller-0 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none', last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199981ms
* openstack-nova-scheduler_start_0 on overcloud-controller-2 'OCF_TIMEOUT' (198): call=101, status=Timed Out, exitreason='none', last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199993ms
* openstack-nova-scheduler_start_0 on overcloud-controller-1 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none', last-rc-change='Fri Jul 15 00:00:41 2016', queued=0ms, exec=199992ms

PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
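With this many resources, it can help to reduce the pcs status text to the two things that matter for triage: which clone sets are stopped on every controller, and which actions failed. The following is a rough sketch that assumes the text layout shown above; `stopped_clone_sets` and `failed_actions` are hypothetical helper names, `SAMPLE` is a short excerpt of the output in this report, and the regular expressions are tuned to this output only, not to every pcs version.

```python
import re

# Short excerpt of the pcs status output from this report.
SAMPLE = """\
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy (systemd:haproxy): Started overcloud-controller-0 (unmanaged)
 Master/Slave Set: galera-master [galera] (unmanaged)
     galera (ocf::heartbeat:galera): FAILED Master overcloud-controller-0 (unmanaged)
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)
     Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Failed Actions:
* galera_promote_0 on overcloud-controller-0 'unknown error' (1): call=35, status=complete, exitreason='Failed initial monitor action'
* openstack-nova-scheduler_start_0 on overcloud-controller-0 'OCF_TIMEOUT' (198): call=102, status=Timed Out, exitreason='none'
"""


def stopped_clone_sets(text):
    """Names of clone sets whose header is immediately followed by 'Stopped: ['.

    Whitespace between header and status may be a space or a newline, so this
    works on both the multi-line console output and a flattened copy of it.
    """
    pattern = r"Clone Set: (\S+) \[[^\]]+\](?: \(unmanaged\))?\s+Stopped: \["
    return re.findall(pattern, text)


def failed_actions(text):
    """Tuples of (operation, node, reason, return code) from 'Failed Actions'."""
    pattern = r"\* (\S+) on (\S+) '([^']+)' \((\d+)\)"
    return [(op, node, reason, int(rc)) for op, node, reason, rc in re.findall(pattern, text)]


if __name__ == "__main__":
    print("Stopped clone sets:", stopped_clone_sets(SAMPLE))
    for op, node, reason, rc in failed_actions(SAMPLE):
        print(f"{op} on {node}: {reason} (rc={rc})")
```

To use it on a live controller, the captured output of `pcs status` would be passed in place of `SAMPLE`. Note that a clone set that is only partly stopped is not reported by this sketch, since it looks only for sets whose first status line is `Stopped`.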