Description of problem:
OSP8 -> OSP9 upgrade: pacemaker resources are stopped and unmanaged post upgrade:

[root@controller-0 heat-admin]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Wed Jun 21 16:55:27 2017
Last change: Wed Jun 21 16:34:43 2017 by root via cibadmin on controller-0

*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services

3 nodes and 115 resources configured: 15 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 ip-172.17.4.10 (ocf::heartbeat:IPaddr2): Started controller-0 (unmanaged)
 ip-192.168.24.6 (ocf::heartbeat:IPaddr2): Started controller-1 (unmanaged)
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy (systemd:haproxy): Started controller-1 (unmanaged)
     haproxy (systemd:haproxy): Started controller-0 (unmanaged)
     haproxy (systemd:haproxy): Started controller-2 (unmanaged)
 ip-172.17.3.10 (ocf::heartbeat:IPaddr2): Started controller-2 (unmanaged)
 ip-172.17.1.10 (ocf::heartbeat:IPaddr2): Started controller-0 (unmanaged)
 ip-10.0.0.101 (ocf::heartbeat:IPaddr2): Started controller-1 (unmanaged)
 ip-172.17.1.11 (ocf::heartbeat:IPaddr2): Started controller-2 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera] (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-1 (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-0 (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-2 (unmanaged)
 Clone Set: mongod-clone [mongod] (unmanaged)
     mongod (systemd:mongod): Started controller-1 (unmanaged)
     mongod (systemd:mongod): Started controller-0 (unmanaged)
     mongod (systemd:mongod): Started controller-2 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: memcached-clone [memcached] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup] (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started controller-1 (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started controller-0 (unmanaged)
     neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started controller-2 (unmanaged)
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup] (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started controller-1 (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started controller-0 (unmanaged)
     neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started controller-2 (unmanaged)
 Clone Set: openstack-heat-api-clone [openstack-heat-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: delay-clone [delay] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: httpd-clone [httpd] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-server-clone [neutron-server] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-volume (systemd:openstack-cinder-volume): Stopped (unmanaged)
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-2.0.0-57.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-29.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-57.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP8
2. Upgrade to OSP9

Actual results:
Upgrade completes successfully but the overcloud is not functional

Expected results:
Overcloud is working ok.

Additional info:
Hi,

there is the infamous bsn table error:

Jun 23 13:50:38 controller-0.localdomain "Table 'ovs_neutron.bsn_routerrules' doesn't exist\") [SQL: u'ALTER TABLE bsn_routerrules ADD COLUMN tenant_id VARCHAR(255)']\n", "deploy_status_code": 1}

during step 2, but the upgrade somehow still manages to reach step 4, which fails:

/Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Failed to call refresh: Command exceeded timeout\u001b[0m\n\u001b[1;31mError: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Command exceeded timeout\nWrapped exception:\nexecution expired\u001b[0m\n", "deploy_status_code": 6

and makes the upgrade fail. Because we never got to step 6, where the cluster is brought back to managed, all the resources are left unmanaged.

Moving it to Mathieu as he has dealt with this issue before.
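For anyone triaging a similar report, here is a minimal sketch of how a saved `pcs status` dump could be checked for the state described above. It is an illustration only and is not part of the upgrade scripts or the fix; the function name `summarize_pcs_status` and the sample text are hypothetical, and the matching relies only on strings that appear in the output pasted in this bug.

```python
import re

# Banner text as it appears in the `pcs status` output quoted in this bug.
MAINTENANCE_BANNER = "Resource management is DISABLED"


def summarize_pcs_status(text):
    """Summarize a `pcs status` dump (hypothetical helper, plain string matching)."""
    return {
        # True when the cluster-wide "management disabled" banner is present.
        "maintenance_banner": MAINTENANCE_BANNER in text,
        # Every resource or set line flagged "(unmanaged)".
        "unmanaged_entries": text.count("(unmanaged)"),
        # Sets stopped because the resource was explicitly disabled.
        "disabled_sets": len(re.findall(r"Stopped \(disabled\):", text)),
        # Sets that are stopped without being disabled.
        "stopped_sets": len(re.findall(r"Stopped:", text)),
    }


# Shortened sample built from lines of the output above.
sample = """\
*** Resource management is DISABLED ***
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy (systemd:haproxy): Started controller-1 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
"""

summary = summarize_pcs_status(sample)
print(summary)
```

If the banner is present together with many stopped sets, the upgrade most likely aborted before the step that hands the resources back to the cluster. Pacemaker typically prints that banner when the `maintenance-mode` cluster property is set, but that should be confirmed on the affected cluster rather than assumed from this sketch.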
I created the fix for the 9 branch
When is the fix scheduled to land?
The fix still needs a reviewer; I will ping folks this morning.
The fix has landed; it should be available in the next Z release.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1736