Description of problem:
OSP10 -> OSP11 upgrade: the upgrade fails during the 'Setup gnocchi db during upgrade' task because httpd is stopped and Keystone is unreachable:

[stack@undercloud-0 ~]$ openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerUpgrade_Step5.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 6f01d508-664a-4a44-9970-490c9ab2ea34
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [set is_bootstrap_node fact] **********************************************
    ok: [localhost]

    TASK [Setup gnocchi db during upgrade] *****************************************
    fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["gnocchi-upgrade"], "delta": "0:00:01.794639", "end": "2017-12-11 13:08:55.898215", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2017-12-11 13:08:54.103576", "stderr": "Option \"metric_processing_delay\" from group \"storage\" is deprecated. Use option \"metric_processing_delay\" from group \"metricd\".", "stderr_lines": ["Option \"metric_processing_delay\" from group \"storage\" is deprecated. Use option \"metric_processing_delay\" from group \"metricd\"."], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/a0bdd5f0-6d02-4a17-878b-c869d968427a_playbook.retry

    PLAY RECAP *********************************************************************
    localhost : ok=30 changed=27 unreachable=0 failed=1

    (truncated, view all with --long)
  deploy_stderr: |

Checking the gnocchi-upgrade log on the first controller, we can spot:

2017-12-11 13:08:55.788 459449 CRITICAL gnocchi [-] ClientException: Authorization Failure. Authorization Failed: Service Unavailable (HTTP 503)
2017-12-11 13:08:55.788 459449 ERROR gnocchi Traceback (most recent call last):
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/bin/gnocchi-upgrade", line 10, in <module>
2017-12-11 13:08:55.788 459449 ERROR gnocchi     sys.exit(upgrade())
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/cli.py", line 70, in upgrade
2017-12-11 13:08:55.788 459449 ERROR gnocchi     s = storage.get_driver(conf)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 144, in get_driver
2017-12-11 13:08:55.788 459449 ERROR gnocchi     conf.incoming)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/gnocchi/storage/incoming/swift.py", line 36, in __init__
2017-12-11 13:08:55.788 459449 ERROR gnocchi     self.swift.put_container(self.MEASURE_PREFIX)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1755, in put_container
2017-12-11 13:08:55.788 459449 ERROR gnocchi     query_string=query_string)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1661, in _retry
2017-12-11 13:08:55.788 459449 ERROR gnocchi     self.url, self.token = self.get_auth()
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1613, in get_auth
2017-12-11 13:08:55.788 459449 ERROR gnocchi     timeout=self.timeout)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 669, in get_auth
2017-12-11 13:08:55.788 459449 ERROR gnocchi     auth_version=auth_version)
2017-12-11 13:08:55.788 459449 ERROR gnocchi   File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 581, in get_auth_keystone
2017-12-11 13:08:55.788 459449 ERROR gnocchi     raise ClientException('Authorization Failure. %s' % err)
2017-12-11 13:08:55.788 459449 ERROR gnocchi ClientException: Authorization Failure. Authorization Failed: Service Unavailable (HTTP 503)

At this point httpd is stopped on the first controller, so Keystone is unreachable:

[root@controller-0 heat-admin]# systemctl status httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/httpd.service.d
           └─openstack-dashboard.conf
   Active: inactive (dead) since Mon 2017-12-11 12:53:14 UTC; 30min ago
     Docs: man:httpd(8)
           man:apachectl(8)
 Main PID: 102032 (code=exited, status=0/SUCCESS)
   Status: "Total requests: 1733; Current requests/sec: 0; Current traffic: 0 B/sec"

Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/dot-luv/images/ui-icons_98d2fb_256x240.png'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/dot-luv/images/ui-bg_diagonals-thick_15_0b3e6f_40x40.png'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/dot-luv/images/ui-icons_9ccdfc_256x240.png'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/dot-luv/images/ui-bg_flat_40_292929_40x100.png'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/cupertino/theme.css'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/cupertino/jquery-ui.css'
Dec 11 11:39:52 controller-0 python[101287]: Copying '/usr/share/javascript/jquery_ui/themes/cupertino/jquery-ui.min.css'
Dec 11 11:40:01 controller-0 systemd[1]: Started The Apache HTTP Server.
Dec 11 12:53:12 controller-0 systemd[1]: Stopping The Apache HTTP Server...
Dec 11 12:53:14 controller-0 systemd[1]: Stopped The Apache HTTP Server.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.2.4-3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP10 with 3 controllers + 2 computes + 2 networker nodes
2. Upgrade to OSP11

Actual results:
major-upgrade-composable-steps.yaml fails when gnocchi-upgrade runs because Keystone is unreachable (httpd is stopped on the controllers).

Expected results:
Upgrade doesn't fail.

Additional info:
This issue cannot be reproduced when the deployment contains Ceph nodes, which leads me to believe that it is specific to environments where Gnocchi uses Swift as its backend.
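
To check quickly whether a given controller falls into the affected case, something like the sketch below could be used. It only reads the driver option from the [storage] group of a gnocchi.conf-style file. The function names and the sample text are mine, not anything shipped with Gnocchi, and the assumption that only the swift driver needs Keystone during gnocchi-upgrade is drawn from the traceback above rather than verified for other backends.

```python
import configparser


def storage_driver(conf_text):
    """Return the [storage] driver named in gnocchi.conf-style text, or None."""
    # interpolation=None so literal '%' characters in real config values
    # do not trip the parser; strict=False tolerates duplicate options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("storage", "driver", fallback=None)


def needs_keystone_for_upgrade(conf_text):
    """Assumption from the traceback: the swift driver authenticates against
    Keystone as soon as it is initialised, so gnocchi-upgrade needs httpd."""
    return storage_driver(conf_text) == "swift"


# Hypothetical sample, not copied from a real controller.
SAMPLE = """
[DEFAULT]
debug = False

[storage]
driver = swift
"""

if __name__ == "__main__":
    print(needs_keystone_for_upgrade(SAMPLE))
```

On a real node the text would come from the Gnocchi configuration file on the controller; the path is left out here because it is not stated in this report.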
I think the issue here is that we stop httpd in step1:
https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/services/gnocchi-api.yaml#L137-L139
but when gnocchi-upgrade runs in step5, it cannot authenticate against Keystone (which runs under httpd) because httpd is still stopped:
https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/services/gnocchi-api.yaml#L147-L150
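
The ordering problem can be illustrated with a small model. This is only a sketch of the reasoning, not how tripleo-heat-templates evaluates upgrade tasks: the function name and data layout are hypothetical, and the step numbers (httpd stopped in step 1, gnocchi-upgrade run in step 5) are the ones from this report. Ordering inside a single step is not modelled, so a restart in the same step as the task is conservatively treated as too late.

```python
def find_ordering_conflicts(stops, starts, tasks):
    """Report tasks that run while a service they need is stopped.

    stops / starts: {service_name: step_number}
    tasks: list of (task_name, step_number, needed_service)
    Returns a list of (task_name, task_step, service, stopped_at_step).
    """
    conflicts = []
    for name, step, service in tasks:
        stopped_at = stops.get(service)
        if stopped_at is None or stopped_at >= step:
            # Service was never stopped, or not before this task's step.
            continue
        started_at = starts.get(service)
        restarted_in_time = (
            started_at is not None and stopped_at < started_at < step
        )
        if not restarted_in_time:
            conflicts.append((name, step, service, stopped_at))
    return conflicts


if __name__ == "__main__":
    # Situation described in this bug: httpd goes down in step 1 and
    # nothing brings it back before gnocchi-upgrade runs in step 5.
    print(find_ordering_conflicts(
        stops={"httpd": 1},
        starts={},
        tasks=[("gnocchi-upgrade", 5, "httpd")],
    ))
```

With an entry such as starts={"httpd": 4} the same call returns an empty list, which is the property any fix would need to restore: Keystone has to be reachable again before gnocchi-upgrade runs, or the stop has to happen after it.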
Created attachment 1366017: sosreport controller-0
I have already done all the backports and built the fixed package. That's why I moved the bug to MODIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1627