Bug 1463759 - OSP8 -> OSP9 upgrade: pacemaker resources are stopped and unmanaged post upgrade
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-bigswitch
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: zstream
Target Release: 9.0 (Mitaka)
Assigned To: mathieu bultel
QA Contact: Amit Ugol
Keywords: Regression, Triaged, ZStream
Depends On:
Blocks: 1472738
Reported: 2017-06-21 12:58 EDT by Marius Cornea
Modified: 2017-08-07 18:08 EDT

See Also:
Fixed In Version: python-networking-bigswitch-8.40.7-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1472738
Environment:
Last Closed: 2017-07-12 09:18:57 EDT
Type: Bug
Description Marius Cornea 2017-06-21 12:58:48 EDT
Description of problem:
OSP8 -> OSP9 upgrade: pacemaker resources are stopped and unmanaged post upgrade:

[root@controller-0 heat-admin]# pcs status
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Wed Jun 21 16:55:27 2017		Last change: Wed Jun 21 16:34:43 2017 by root via cibadmin on controller-0

              *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services

3 nodes and 115 resources configured: 15 resources DISABLED and 0 BLOCKED from being started due to failures

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 ip-172.17.4.10	(ocf::heartbeat:IPaddr2):	Started controller-0 (unmanaged)
 ip-192.168.24.6	(ocf::heartbeat:IPaddr2):	Started controller-1 (unmanaged)
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy	(systemd:haproxy):	Started controller-1 (unmanaged)
     haproxy	(systemd:haproxy):	Started controller-0 (unmanaged)
     haproxy	(systemd:haproxy):	Started controller-2 (unmanaged)
 ip-172.17.3.10	(ocf::heartbeat:IPaddr2):	Started controller-2 (unmanaged)
 ip-172.17.1.10	(ocf::heartbeat:IPaddr2):	Started controller-0 (unmanaged)
 ip-10.0.0.101	(ocf::heartbeat:IPaddr2):	Started controller-1 (unmanaged)
 ip-172.17.1.11	(ocf::heartbeat:IPaddr2):	Started controller-2 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Master/Slave Set: galera-master [galera] (unmanaged)
     galera	(ocf::heartbeat:galera):	Master controller-1 (unmanaged)
     galera	(ocf::heartbeat:galera):	Master controller-0 (unmanaged)
     galera	(ocf::heartbeat:galera):	Master controller-2 (unmanaged)
 Clone Set: mongod-clone [mongod] (unmanaged)
     mongod	(systemd:mongod):	Started controller-1 (unmanaged)
     mongod	(systemd:mongod):	Started controller-0 (unmanaged)
     mongod	(systemd:mongod):	Started controller-2 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: memcached-clone [memcached] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-l3-agent-clone [neutron-l3-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-engine-clone [openstack-heat-engine] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup] (unmanaged)
     neutron-ovs-cleanup	(ocf::neutron:OVSCleanup):	Started controller-1 (unmanaged)
     neutron-ovs-cleanup	(ocf::neutron:OVSCleanup):	Started controller-0 (unmanaged)
     neutron-ovs-cleanup	(ocf::neutron:OVSCleanup):	Started controller-2 (unmanaged)
 Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup] (unmanaged)
     neutron-netns-cleanup	(ocf::neutron:NetnsCleanup):	Started controller-1 (unmanaged)
     neutron-netns-cleanup	(ocf::neutron:NetnsCleanup):	Started controller-0 (unmanaged)
     neutron-netns-cleanup	(ocf::neutron:NetnsCleanup):	Started controller-2 (unmanaged)
 Clone Set: openstack-heat-api-clone [openstack-heat-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-api-clone [openstack-glance-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: delay-clone [delay] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: httpd-clone [httpd] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-glance-registry-clone [openstack-glance-registry] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-cinder-api-clone [openstack-cinder-api] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-server-clone [neutron-server] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-volume	(systemd:openstack-cinder-volume):	Stopped (unmanaged)
 Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: openstack-core-clone [openstack-core] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-2.0.0-57.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-29.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-57.el7ost.noarch


How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP8
2. Upgrade to OSP9 

Actual results:
Upgrade completes successfully, but the overcloud is not functional.

Expected results:
The overcloud is functional after the upgrade.

Additional info:
Comment 3 Sofer Athlan-Guyot 2017-06-26 08:00:24 EDT
Hi,

there is the infamous bsn table error:

Jun 23 13:50:38 controller-0.localdomain "Table 'ovs_neutron.bsn_routerrules' doesn't exist\") [SQL: u'ALTER TABLE bsn_routerrules ADD COLUMN tenant_id VARCHAR(255)']\n", "deploy_status_code": 1} 

during Step 2, but somehow the upgrade still manages to reach Step 4, which fails:

/Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Failed to call refresh: Command exceeded timeout\u001b[0m\n\u001b[1;31mError: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Command exceeded timeout\nWrapped exception:\nexecution expired\u001b[0m\n", "deploy_status_code": 6

and this makes the upgrade fail.  Since we never reached Step 6, where the cluster is brought back to managed, all the resources are left unmanaged.

Moving it to Mathieu as he has dealt with this issue before.
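
To check quickly whether a cluster was left in this state after a failed upgrade, the `pcs status` text can be scanned for the "Resource management is DISABLED" banner and for clone or master/slave sets that are unmanaged and stopped. The following is only a rough sketch, not part of the fix: the function name `summarize_pcs_status` and the result keys are made up for illustration, and the parsing assumes the text layout shown in the description of this bug.

```python
import re

# Banner text as it appears in the `pcs status` output in this report.
MAINTENANCE_BANNER = "Resource management is DISABLED"

# Header line of a clone or master/slave set, e.g.
# "Clone Set: haproxy-clone [haproxy] (unmanaged)"
SET_HEADER = re.compile(r"(?:Clone Set|Master/Slave Set): (\S+) \[[^\]]+\](.*)")


def summarize_pcs_status(text):
    """Hypothetical helper: summarize `pcs status` text for a post-upgrade check."""
    summary = {
        "maintenance": MAINTENANCE_BANNER in text,
        "unmanaged_sets": [],
        "disabled_sets": [],   # sets reported as "Stopped (disabled)"
        "stopped_sets": [],    # sets reported as plain "Stopped"
    }
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = SET_HEADER.match(stripped)
        if header:
            current = header.group(1)
            if "(unmanaged)" in header.group(2):
                summary["unmanaged_sets"].append(current)
            continue
        # A "Stopped..." line belongs to the most recent set header.
        if current and stripped.startswith("Stopped"):
            if stripped.startswith("Stopped (disabled)"):
                summary["disabled_sets"].append(current)
            else:
                summary["stopped_sets"].append(current)
    return summary


# Shortened excerpt of the output from the description, for illustration.
SAMPLE = """\
              *** Resource management is DISABLED ***
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     haproxy (systemd:haproxy): Started controller-1 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)
     Stopped (disabled): [ controller-0 controller-1 controller-2 ]
 Clone Set: neutron-server-clone [neutron-server] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
"""

if __name__ == "__main__":
    for key, value in summarize_pcs_status(SAMPLE).items():
        print(key, value)
```

On a controller, the input would be the saved output of `pcs status`. Standalone resources such as the `ip-*` addresses and `openstack-cinder-volume` are deliberately ignored here; only the sets are summarized.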
Comment 5 mathieu bultel 2017-06-26 09:54:58 EDT
I created the fix for the 9 branch
Comment 7 Yolanda Robla 2017-06-29 03:58:29 EDT
When is the fix scheduled to land?
Comment 9 mathieu bultel 2017-06-29 04:04:32 EDT
A reviewer is needed on the fix; I will ping folks this morning.
Comment 16 mathieu bultel 2017-06-30 08:00:04 EDT
The fix has landed; it should be available in the next Z release.
Comment 21 errata-xmlrpc 2017-07-12 09:18:57 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1736
