Bug 1353079
| Summary: | osp-director-9: Attempt to upgrade OSP 8.0-> 9.0 with SSL fails. | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Omri Hochman <ohochman> | ||||||
| Component: | documentation | Assignee: | Dan Macpherson <dmacpher> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | RHOS Documentation Team <rhos-docs> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | urgent | ||||||||
| Version: | 9.0 (Mitaka) | CC: | akaris, dbecker, dmacpher, ebarrera, goneri, jason.dobies, jcoufal, jjoyce, josorior, lbopf, markmc, mburns, mcornea, morazi, nkinder, ohochman, panbalag, racedoro, rhel-osp-director-maint, srevivo, tvignaud | ||||||
| Target Milestone: | ga | Keywords: | Documentation | ||||||
| Target Release: | 10.0 (Newton) | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2017-02-23 08:03:33 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
Description
Omri Hochman
2016-07-06 03:16:58 UTC
[root@overcloud-controller-0 ~]# pcs status
Cluster name: tripleo_cluster
Last updated: Wed Jul 6 03:15:35 2016 Last change: Wed Jul 6 00:52:01 2016 by root via cibadmin on overcloud-controller-2
Stack: corosync
Current DC: overcloud-controller-0 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
3 nodes and 112 resources configured
Online: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Full list of resources:
ip-10.19.184.210 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 (unmanaged)
ip-192.168.200.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 (unmanaged)
Clone Set: haproxy-clone [haproxy] (unmanaged)
haproxy (systemd:haproxy): Started overcloud-controller-2 (unmanaged)
haproxy (systemd:haproxy): Started overcloud-controller-0 (unmanaged)
haproxy (systemd:haproxy): Started overcloud-controller-1 (unmanaged)
ip-192.168.0.6 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 (unmanaged)
ip-10.19.104.11 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0 (unmanaged)
ip-10.19.105.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1 (unmanaged)
ip-10.19.104.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2 (unmanaged)
Master/Slave Set: redis-master [redis] (unmanaged)
redis (ocf::heartbeat:redis): Master overcloud-controller-2 (unmanaged)
redis (ocf::heartbeat:redis): Slave overcloud-controller-0 (unmanaged)
redis (ocf::heartbeat:redis): Slave overcloud-controller-1 (unmanaged)
Master/Slave Set: galera-master [galera] (unmanaged)
galera (ocf::heartbeat:galera): Master overcloud-controller-2 (unmanaged)
galera (ocf::heartbeat:galera): Master overcloud-controller-0 (unmanaged)
galera (ocf::heartbeat:galera): Master overcloud-controller-1 (unmanaged)
Clone Set: mongod-clone [mongod] (unmanaged)
mongod (systemd:mongod): Started overcloud-controller-2 (unmanaged)
mongod (systemd:mongod): Started overcloud-controller-0 (unmanaged)
mongod (systemd:mongod): Started overcloud-controller-1 (unmanaged)
Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2 (unmanaged)
rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0 (unmanaged)
rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1 (unmanaged)
Clone Set: memcached-clone [memcached] (unmanaged)
memcached (systemd:memcached): Started overcloud-controller-2 (unmanaged)
memcached (systemd:memcached): Started overcloud-controller-0 (unmanaged)
memcached (systemd:memcached): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler] (unmanaged)
openstack-nova-scheduler (systemd:openstack-nova-scheduler): Started overcloud-controller-2 (unmanaged)
openstack-nova-scheduler (systemd:openstack-nova-scheduler): Started overcloud-controller-0 (unmanaged)
openstack-nova-scheduler (systemd:openstack-nova-scheduler): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-l3-agent-clone [neutron-l3-agent] (unmanaged)
neutron-l3-agent (systemd:neutron-l3-agent): Started overcloud-controller-0 (unmanaged)
neutron-l3-agent (systemd:neutron-l3-agent): Started overcloud-controller-1 (unmanaged)
Stopped: [ overcloud-controller-2 ]
Clone Set: openstack-ceilometer-alarm-notifier-clone [openstack-ceilometer-alarm-notifier] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-heat-engine-clone [openstack-heat-engine] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-ceilometer-api-clone [openstack-ceilometer-api] (unmanaged)
openstack-ceilometer-api (systemd:openstack-ceilometer-api): Started overcloud-controller-2 (unmanaged)
openstack-ceilometer-api (systemd:openstack-ceilometer-api): Started overcloud-controller-0 (unmanaged)
openstack-ceilometer-api (systemd:openstack-ceilometer-api): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent] (unmanaged)
neutron-metadata-agent (systemd:neutron-metadata-agent): Started overcloud-controller-0 (unmanaged)
neutron-metadata-agent (systemd:neutron-metadata-agent): Started overcloud-controller-1 (unmanaged)
Stopped: [ overcloud-controller-2 ]
Clone Set: neutron-ovs-cleanup-clone [neutron-ovs-cleanup] (unmanaged)
neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-2 (unmanaged)
neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-0 (unmanaged)
neutron-ovs-cleanup (ocf::neutron:OVSCleanup): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-netns-cleanup-clone [neutron-netns-cleanup] (unmanaged)
neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-2 (unmanaged)
neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-0 (unmanaged)
neutron-netns-cleanup (ocf::neutron:NetnsCleanup): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-heat-api-clone [openstack-heat-api] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-cinder-scheduler-clone [openstack-cinder-scheduler] (unmanaged)
openstack-cinder-scheduler (systemd:openstack-cinder-scheduler): Started overcloud-controller-2 (unmanaged)
openstack-cinder-scheduler (systemd:openstack-cinder-scheduler): Started overcloud-controller-0 (unmanaged)
openstack-cinder-scheduler (systemd:openstack-cinder-scheduler): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-nova-api-clone [openstack-nova-api] (unmanaged)
openstack-nova-api (systemd:openstack-nova-api): Started overcloud-controller-2 (unmanaged)
openstack-nova-api (systemd:openstack-nova-api): Started overcloud-controller-0 (unmanaged)
openstack-nova-api (systemd:openstack-nova-api): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-heat-api-cloudwatch-clone [openstack-heat-api-cloudwatch] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-ceilometer-collector-clone [openstack-ceilometer-collector] (unmanaged)
openstack-ceilometer-collector (systemd:openstack-ceilometer-collector): Started overcloud-controller-2 (unmanaged)
openstack-ceilometer-collector (systemd:openstack-ceilometer-collector): Started overcloud-controller-0 (unmanaged)
openstack-ceilometer-collector (systemd:openstack-ceilometer-collector): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-keystone-clone [openstack-keystone] (unmanaged)
openstack-keystone (systemd:openstack-keystone): Started overcloud-controller-2 (unmanaged)
openstack-keystone (systemd:openstack-keystone): Started overcloud-controller-0 (unmanaged)
openstack-keystone (systemd:openstack-keystone): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-nova-consoleauth-clone [openstack-nova-consoleauth] (unmanaged)
openstack-nova-consoleauth (systemd:openstack-nova-consoleauth): Started overcloud-controller-2 (unmanaged)
openstack-nova-consoleauth (systemd:openstack-nova-consoleauth): Started overcloud-controller-0 (unmanaged)
openstack-nova-consoleauth (systemd:openstack-nova-consoleauth): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-glance-registry-clone [openstack-glance-registry] (unmanaged)
openstack-glance-registry (systemd:openstack-glance-registry): Started overcloud-controller-2 (unmanaged)
openstack-glance-registry (systemd:openstack-glance-registry): Started overcloud-controller-0 (unmanaged)
openstack-glance-registry (systemd:openstack-glance-registry): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-ceilometer-notification-clone [openstack-ceilometer-notification] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-cinder-api-clone [openstack-cinder-api] (unmanaged)
openstack-cinder-api (systemd:openstack-cinder-api): Started overcloud-controller-2 (unmanaged)
openstack-cinder-api (systemd:openstack-cinder-api): Started overcloud-controller-0 (unmanaged)
openstack-cinder-api (systemd:openstack-cinder-api): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent] (unmanaged)
neutron-dhcp-agent (systemd:neutron-dhcp-agent): Started overcloud-controller-2 (unmanaged)
neutron-dhcp-agent (systemd:neutron-dhcp-agent): Started overcloud-controller-0 (unmanaged)
neutron-dhcp-agent (systemd:neutron-dhcp-agent): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-glance-api-clone [openstack-glance-api] (unmanaged)
openstack-glance-api (systemd:openstack-glance-api): Started overcloud-controller-2 (unmanaged)
openstack-glance-api (systemd:openstack-glance-api): Started overcloud-controller-0 (unmanaged)
openstack-glance-api (systemd:openstack-glance-api): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-openvswitch-agent-clone [neutron-openvswitch-agent] (unmanaged)
neutron-openvswitch-agent (systemd:neutron-openvswitch-agent): Started overcloud-controller-2 (unmanaged)
neutron-openvswitch-agent (systemd:neutron-openvswitch-agent): Started overcloud-controller-0 (unmanaged)
neutron-openvswitch-agent (systemd:neutron-openvswitch-agent): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-nova-novncproxy-clone [openstack-nova-novncproxy] (unmanaged)
openstack-nova-novncproxy (systemd:openstack-nova-novncproxy): Started overcloud-controller-2 (unmanaged)
openstack-nova-novncproxy (systemd:openstack-nova-novncproxy): Started overcloud-controller-0 (unmanaged)
openstack-nova-novncproxy (systemd:openstack-nova-novncproxy): Started overcloud-controller-1 (unmanaged)
Clone Set: delay-clone [delay] (unmanaged)
delay (ocf::heartbeat:Delay): Started overcloud-controller-2 (unmanaged)
delay (ocf::heartbeat:Delay): Started overcloud-controller-0 (unmanaged)
delay (ocf::heartbeat:Delay): Started overcloud-controller-1 (unmanaged)
Clone Set: neutron-server-clone [neutron-server] (unmanaged)
neutron-server (systemd:neutron-server): Started overcloud-controller-2 (unmanaged)
neutron-server (systemd:neutron-server): Started overcloud-controller-0 (unmanaged)
neutron-server (systemd:neutron-server): Started overcloud-controller-1 (unmanaged)
Clone Set: httpd-clone [httpd] (unmanaged)
httpd (systemd:httpd): Started overcloud-controller-2 (unmanaged)
httpd (systemd:httpd): Started overcloud-controller-0 (unmanaged)
httpd (systemd:httpd): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-ceilometer-central-clone [openstack-ceilometer-central] (unmanaged)
openstack-ceilometer-central (systemd:openstack-ceilometer-central): Started overcloud-controller-2 (unmanaged)
openstack-ceilometer-central (systemd:openstack-ceilometer-central): Started overcloud-controller-0 (unmanaged)
openstack-ceilometer-central (systemd:openstack-ceilometer-central): Started overcloud-controller-1 (unmanaged)
Clone Set: openstack-ceilometer-alarm-evaluator-clone [openstack-ceilometer-alarm-evaluator] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn] (unmanaged)
Stopped: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
openstack-cinder-volume (systemd:openstack-cinder-volume): Started overcloud-controller-0 (unmanaged)
Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor] (unmanaged)
openstack-nova-conductor (systemd:openstack-nova-conductor): Started overcloud-controller-2 (unmanaged)
openstack-nova-conductor (systemd:openstack-nova-conductor): Started overcloud-controller-0 (unmanaged)
openstack-nova-conductor (systemd:openstack-nova-conductor): Started overcloud-controller-1 (unmanaged)
Failed Actions:
* openstack-ceilometer-alarm-evaluator_start_0 on overcloud-controller-2 'not installed' (5): call=617, status=Not installed, exitreason='none',
last-rc-change='Wed Jul 6 00:20:13 2016', queued=0ms, exec=87ms
* openstack-ceilometer-alarm-evaluator_start_0 on overcloud-controller-0 'not installed' (5): call=616, status=Not installed, exitreason='none',
last-rc-change='Wed Jul 6 00:20:13 2016', queued=0ms, exec=216ms
* openstack-ceilometer-alarm-evaluator_start_0 on overcloud-controller-1 'not installed' (5): call=639, status=Not installed, exitreason='none',
last-rc-change='Wed Jul 6 00:20:13 2016', queued=0ms, exec=99ms
PCSD Status:
overcloud-controller-0: Online
overcloud-controller-1: Online
overcloud-controller-2: Online
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@overcloud-controller-0 ~]#
Created attachment 1176713 [details]
heat-engine.log from undercloud
Created attachment 1176714 [details]
Adding the /var/log/messages from the controller
Reproduced again with: openstack-tripleo-heat-templates-liberty-2.0.0-13.el7ost.noarch
It happens during the AODH migration step:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ceph-storage-scale 2 --neutron-network-type vxlan --neutron-tunnel-types vxlan --ntp-server 10.5.26.10 --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml -e /home/stack/ssl-heat-templates/environments/enable-tls.yaml -e /home/stack/ssl-heat-templates/environments/inject-trust-anchor.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-aodh.yaml
Results:
-----------+---------------------+
| d288684b-b4d7-4606-bac5-c63611fb262b | overcloud | UPDATE_FAILED | 2016-02-10T18:26:12 | 2016-02-11T17:17:10 |
+--------------------------------------+------------+---------------+---------------------+---------------------+
This appears to be caused by the following hieradata in controller.yaml:
[root@overcloud-controller-0 hieradata]# sed -n 318p /etc/puppet/hieradata/controller.yaml
ceilometer::dispatcher::gnocchi::url: ://:
Most probably this is caused by the enable-tls.yaml not containing the new services in the EndpointMap.
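To check a controller for the same symptom, one option is to search the hieradata for URL values that have no scheme or host. The sketch below is illustrative only: the function name find_empty_urls is hypothetical, and it assumes an unresolved endpoint is rendered as `://:` (or `://`), as in the line quoted above.

```shell
# find_empty_urls is a hypothetical helper, not part of any OpenStack tool.
# It prints (with line numbers) every "key: value" line whose value is an
# empty URL skeleton such as "://:", meaning protocol, host and port were
# all left unresolved. The pattern is an assumption based on the sample above.
find_empty_urls() {
    # $1: path to a hieradata YAML file
    grep -nE ':[[:space:]]+://:?[[:space:]]*$' "$1"
}
```

On a controller this could be called as `find_empty_urls /etc/puppet/hieradata/controller.yaml`; no output means no value matched the pattern.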
I adjusted it like the following and the hieradata got populated with the URL containing the internal VIP:
[root@overcloud-controller-0 heat-admin]# sed -n 318p /etc/puppet/hieradata/controller.yaml
ceilometer::dispatcher::gnocchi::url: http://10.0.0.10:8041
diff templates/enable-tls.yaml templates/enable-tls.yaml.pre-upgrade
55,63d54
< AodhAdmin: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}
< AodhInternal: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}
< AodhPublic: {protocol: 'https', port: '13042', host: 'IP_ADDRESS'}
< GnocchiAdmin: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}
< GnocchiInternal: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}
< GnocchiPublic: {protocol: 'https', port: '13041', host: 'IP_ADDRESS'}
< SaharaAdmin: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}
< SaharaInternal: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}
< SaharaPublic: {protocol: 'https', port: '13386', host: 'IP_ADDRESS'}
According to Marius, the issue is that the upgrade changes the original enable-tls.yaml file, which is located under:
/usr/share/openstack-tripleo-heat-templates/environments/enable-tls.yaml
Then, when we ran the upgrade command (with -e), we were calling the local copy of the enable-tls.yaml file, which is located under:
/home/stack/ssl-heat-templates/environments/enable-tls.yaml
The file under /home/stack/ was not changed during the upgrade.
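One way to see which entries a local copy is missing is to compare the endpoint names in the two files. This is a rough sketch under stated assumptions: the function names list_endpoints and missing_endpoints are hypothetical, and it assumes every EndpointMap entry sits on a single line of the form `Name: {protocol: ...}`, as in the diff output in this report.

```shell
# Hypothetical helpers; they only read the files and never modify them.

# Print the endpoint names defined in an environment file, one per line.
# Assumes one-line entries such as:
#     AodhAdmin: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}
list_endpoints() {
    sed -n 's/^[[:space:]]*\([A-Za-z0-9][A-Za-z0-9]*\): *{protocol:.*$/\1/p' "$1" | sort -u
}

# Print endpoint names present in the packaged file ($1)
# but absent from the local copy ($2).
missing_endpoints() {
    ref_list=$(mktemp) || return 1
    own_list=$(mktemp) || { rm -f "$ref_list"; return 1; }
    list_endpoints "$1" > "$ref_list"
    list_endpoints "$2" > "$own_list"
    # comm -23 keeps only the lines unique to the first (sorted) list.
    comm -23 "$ref_list" "$own_list"
    rm -f "$ref_list" "$own_list"
}
```

With the paths from this report, the call would be `missing_endpoints /usr/share/openstack-tripleo-heat-templates/environments/enable-tls.yaml /home/stack/ssl-heat-templates/environments/enable-tls.yaml`. Empty output would suggest that the local copy already defines every endpoint name found in the packaged file.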
Workaround:
------------
Before running the upgrade command, run Sasha's sed command to fix the local enable-tls.yaml file under /home/stack/ssl-heat-templates/environments/enable-tls.yaml:
sed -i '/EndpointMap.*/a \ \ \ \ AodhAdmin: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}\n\ \ \ \ AodhInternal: {protocol: 'http', port: '8042', ho
'https', port: '13042', host: 'IP_ADDRESS'}\n\ \ \ \ GnocchiAdmin: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}\n\ \ \ \ GnocchiInternal: {protocol: 'http', port: '8041', host: 'IP_A
', port: '13041', host: 'IP_ADDRESS'\n\ \ \ \ SaharaAdmin: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}\n\ \ \ \ SaharaInternal: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'\n
13386', host: 'IP_ADDRESS'}' /home/stack/ssl-heat-templates/environments/enable-tls.yaml
Fixed sed command (the outer quotes are now double quotes, so the inner single quotes are preserved):
sed -i "/EndpointMap.*/a \ \ \ \ AodhAdmin: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}\n\ \ \ \ AodhInternal: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}\n\ \ \ \ AodhPublic: {protocol: 'https', port: '13042', host: 'IP_ADDRESS'}\n\ \ \ \ GnocchiAdmin: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}\n\ \ \ \ GnocchiInternal: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}\n\ \ \ \ GnocchiPublic: {protocol: 'https', port: '13041', host: 'IP_ADDRESS'}\n\ \ \ \ SaharaAdmin: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}\n\ \ \ \ SaharaInternal: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}\n\ \ \ \ SaharaPublic: {protocol: 'https', port: '13386', host: 'IP_ADDRESS'}" /home/stack/ssl-heat-templates/environments/enable-tls.yaml
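Because sed's `a` command appends its text every time it runs, running the command above twice would add the entries twice. A minimal guard could look like the sketch below. The function name needs_endpoint_patch is hypothetical, and treating the presence of an `AodhAdmin:` line as proof that the file was already patched is an assumption.

```shell
# needs_endpoint_patch is a hypothetical helper.
# Exit status 0: the file exists and has no AodhAdmin entry yet.
# Exit status 1: an AodhAdmin entry is already present.
# Exit status 2: the file does not exist.
# Using AodhAdmin as the only marker is an assumption; adjust as needed.
needs_endpoint_patch() {
    [ -f "$1" ] || return 2
    ! grep -q '^[[:space:]]*AodhAdmin:' "$1"
}
```

The sed command would then only be run when `needs_endpoint_patch /home/stack/ssl-heat-templates/environments/enable-tls.yaml` succeeds, ideally after taking a backup copy of the file.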
My attempt at fixing these problems was to use map_merge (https://review.openstack.org/#/c/308182/), but it wasn't well received. I can try to revisit that effort, though. Otherwise, I can write a tool that updates the map and keeps the certs, to be called whenever an update is needed.
*** Bug 1356077 has been marked as a duplicate of this bug. ***
I think this BZ might have slipped under the radar. Grabbing this BZ and checking for the fix.
Dan Macpherson, do we have any progress with the documentation of the workaround in our upgrade doc? The workaround has been known for 6 months. Given the number of customers upgrading from 8 to 9, this should be documented. Thanks. -ak
Thanks for highlighting this BZ, Andreas. I had some priority work I had to focus on for OSP10, but I've got that out of the way. I'll make this BZ a priority and work on it today.
Thanks a lot :-)
Have added the following note to the OSP9 Upgrade procedure:
"If using a custom endpoint map for enabling TLS/SSL in the overcloud, make sure to update the map with endpoints for the following new services:
OpenStack Telemetry Metrics (gnocchi)
OpenStack Telemetry Alarming (aodh)
OpenStack Clustering (sahara)
Check the latest TLS/SSL mappings from the core Heat template collection (see EndpointMap in /usr/share/openstack-tripleo-heat-templates/environments/enable-tls.yaml) and add the missing endpoints to the EndpointMap in your custom enable-tls.yaml file. For more information, see "Enabling SSL/TLS on the Overcloud" in the Red Hat OpenStack Platform Director Installation and Usage guide."
Link: https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/upgrading-red-hat-openstack-platform/#sect-Pre-Upgrade_Notes_for_Overcloud
Omri and Andreas, anything further required for this BZ?
No response in over two weeks. If nothing further to add, I'll close this BZ. If further changes are required, please feel free to reopen it.
Hi Eduard, Is there something in the documentation unresolved in relation to the case you posted? Just want to make sure we have everything covered. - Dan