Description of problem:
During the overcloud upgrade using director we encountered some issues in the keystone upgrade step. Command:

openstack overcloud deploy \
  --stack lab \
  --templates \
  --ntp-server time.ord1.rackspace.com \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e ~/templates/ips-from-pool-all.yaml \
  -e ~/templates/environments/network-environment.yaml \
  -e ~/templates/environments/storage-environment.yaml \
  -e ~/templates/wipe_disk_resource.yaml \
  -e ~/templates/rhel-registration/environment-rhel-registration.yaml \
  -e ~/templates/rhel-registration/rhel-registration-resource-registry.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-keystone-liberty-mitaka.yaml \
  --control-flavor control \
  --compute-flavor compute \
  --ceph-storage-flavor ceph-storage \
  --neutron-network-type vxlan \
  --neutron-tunnel-types vxlan \
  --control-scale 3 \
  --compute-scale 5 \
  --ceph-storage-scale 4

The problems were:

1. "keystone-manage bootstrap" [1] ran but failed, because "bootstrap" is not a valid subcommand in the OSP8 keystone package. We ran the step again and it passed, which is odd, since at that point keystone had still not been updated.

2. keystone was added as a WSGI module in Apache, but the openstack-keystone-clone service was still running in pacemaker, so when the httpd-clone resource was started it failed because the port was already in use [2].
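For problem 1, a quick way to confirm whether the installed keystone supports the "bootstrap" subcommand (it does not in OSP8/Liberty, which matches the "returned 2 instead of one of [0]" error) is a check along these lines. This is a sketch, not from the original report, and is guarded so it degrades gracefully on hosts where keystone is not installed:

```shell
# Sketch: does the installed keystone-manage support "bootstrap"?
# The Liberty (OSP8) package does not, so the puppet Exec fails with
# exit code 2; the Mitaka (OSP9) package does.
bootstrap_supported() {
  command -v keystone-manage >/dev/null 2>&1 &&
    keystone-manage bootstrap --help >/dev/null 2>&1
}

if bootstrap_supported; then
  echo "keystone-manage bootstrap is available on this host"
else
  echo "keystone-manage bootstrap is NOT available on this host"
fi
```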
We solved the second problem by stopping openstack-keystone-clone and starting httpd-clone manually, then re-ran the upgrade command. At that point the upgrade was able to continue and finalize the keystone upgrade step.

It looks like heat and puppet [3] take care of stopping keystone and starting Apache, but our suspicion is that a considerable sleep is needed between these tasks, because in OSP8 all OpenStack services depend on keystone and it takes about 2 minutes to stop them all.

I pulled the log entries generated during the upgrade, because the full sosreport is 255 MB and includes material from other tests we did after the upgrade. See the attached file upgrade.log:

# journalctl -u os-collect-config --since="2016-10-26 14:35" --until "2016-10-26 21:07" > upgrade.log

[1]
Oct 26 14:38:42 444729-controller00.localdomain os-collect-config[4033]: [2016-10-26 14:38:42,315] (heat-config) [INFO] Warning: Scope(Class[Keystone]): Execution of db_sync does not depend on $enabled anymore. Please use sync_db instead.
Oct 26 14:38:42 444729-controller00.localdomain os-collect-config[4033]: Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Failed to call refresh: keystone-manage bootstrap --bootstrap-password p4g6PcMxyrNu9xgCCuRjEE9hX returned 2 instead of one of [0]
Oct 26 14:38:42 444729-controller00.localdomain os-collect-config[4033]: Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: keystone-manage bootstrap --bootstrap-password p4g6PcMxyrNu9xgCCuRjEE9hX returned 2 instead of one of [0]
Oct 26 14:38:42 444729-controller00.localdomain os-collect-config[4033]: [2016-10-26 14:38:42,315] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/56509a24-5608-4899-8c79-b9fd936d2366.pp. [6]

[2]
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: Error: Could not start Service[httpd]: Execution of '/usr/bin/systemctl start httpd' returned 1: Job for httpd.service failed because the control process exited with error code.
See "systemctl status httpd.service" and "journalctl -xe" for details.
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: Wrapped exception:
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: Execution of '/usr/bin/systemctl start httpd' returned 1: Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: Error: /Stage[main]/Apache::Service/Service[httpd]/ensure: change from stopped to running failed: Could not start Service[httpd]: Execution of '/usr/bin/systemctl start httpd' returned 1: Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: Warning: /Stage[main]/Keystone::Deps/Anchor[keystone::service::end]: Skipping because of failed dependencies
Oct 26 15:29:46 444729-controller00.localdomain os-collect-config[4033]: [2016-10-26 15:29:46,831] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/0baac45a-7e7d-4faa-ba84-326b44769b9b.pp. [6]

[3]
/usr/share/openstack-tripleo-heat-templates/extraconfig/tasks/major_upgrade_keystone_liberty_mitaka.yaml
/usr/share/openstack-tripleo-heat-templates/extraconfig/tasks/liberty_to_mitaka_keystone_upgrade.pp

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-2.0.0-34.el7ost.noarch
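The failure in [2] (httpd unable to start because keystone still held its listeners) can be confirmed before retrying with a port check along these lines. This is a sketch, not part of the original report, and assumes the default keystone ports 5000 (public) and 35357 (admin):

```shell
# Sketch: verify the default keystone ports are free before starting
# httpd-clone. If a port is still bound, openstack-keystone-clone is likely
# still running and should be stopped in pacemaker first
# (e.g. "pcs resource disable openstack-keystone-clone").
port_in_use() {
  # Returns 0 if something is listening on the given TCP port.
  ss -tln 2>/dev/null | awk '{print $4}' | grep -q ":$1\$"
}

for port in 5000 35357; do
  if port_in_use "$port"; then
    echo "port $port is still in use; stop openstack-keystone-clone first"
  else
    echo "port $port is free; safe to start httpd-clone"
  fi
done
```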
Seems similar to https://bugzilla.redhat.com/show_bug.cgi?id=1354046; however, they are already on the fixed version:
Indeed this looks like bug 1354046 (CCing Michele). We are already checking that keystone disappears from pacemaker before re-managing httpd:
https://github.com/openstack/tripleo-heat-templates/blob/stable/mitaka/extraconfig/tasks/major_upgrade_pacemaker_migrations.sh#L67-L87
so it's not immediately obvious how the problem could have happened. More logs would help establish the exact time progression of events: ideally the relevant Apache logs showing the failure, /var/log/cluster/..., and the relevant portion of the os-collect-config log (the upgrade.log mentioned earlier).

Also, just to double-check regarding the workaround: you managed to work around the issue by removing the openstack-keystone-clone resource manually and running the major-upgrade-keystone-liberty-mitaka.yaml step again, right?
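For reference, the guard in the linked migrations script amounts to a poll-until-gone loop. A minimal sketch of that pattern follows; the status command is parameterized here so the loop can be exercised without a real cluster, whereas the real script checks "pcs status" output:

```shell
# Sketch: poll a status command until a given pacemaker resource no longer
# appears in its output, with a timeout. On a real controller node the
# check command would be something like "pcs status --full".
wait_resource_gone() {
  local resource="$1" check_cmd="$2" timeout="${3:-120}" waited=0
  while $check_cmd | grep -q "$resource"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $resource to disappear" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A timeout long enough to cover the ~2 minutes it takes to stop all keystone-dependent services (as reported above) would be the key tuning point.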
Jiri, yes, that's the workaround, or alternatively stopping keystone in pcs. I will collect those logs and update. Thanks!
Moving to the upgrades group; a workaround is available and this is not blocking any important cases. We will investigate a proper fix ASAP.
Hi,

I think I found the issue. I could reproduce the exact same error message:

Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Failed to call refresh: keystone-manage bootstrap --bootstrap-password 39EnVE8U7QaxGXYzpKhH47kXh returned 2 instead of one of [0]
Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: keystone-manage bootstrap --bootstrap-password 39EnVE8U7QaxGXYzpKhH47kXh returned 2 instead of one of [0]

by upgrading the puppet modules to the OSP9 director version before running the keystone migration.

The fix is to completely ignore one command in the documentation:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/paged/upgrading-red-hat-openstack-platform/chapter-3-director-based-environments-performing-upgrades-to-major-versions

In section "3.4.3. Upgrading Keystone", the step "Before running the upgrade, update the openstack-puppet-modules package on each node with the following command on the Undercloud:"

for i in $(nova list|grep ctlplane|awk -F' ' '{ print $12 }'|awk -F'=' '{ print $2 }'); do ssh -o StrictHostKeyChecking=no heat-admin@$i "sudo yum -y update openstack-puppet-modules" ; done

This is a documentation issue and has been raised:
- here: https://bugzilla.redhat.com/show_bug.cgi?id=1414917
- there: https://bugzilla.redhat.com/show_bug.cgi?id=1414784

Could you confirm that if you don't upgrade openstack-puppet-modules before doing the migration, the issue disappears and you no longer need to run the workaround?

Regards,
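Incidentally, the address extraction in the quoted loop depends on "nova list" column positions (awk field 12), which breaks if the table layout shifts. A less position-sensitive sketch, matching on the "ctlplane=" token directly (the sample table below is fabricated for illustration):

```shell
# Sketch: pull the ctlplane addresses out of "nova list" style output by
# matching the "ctlplane=" token rather than counting whitespace-separated
# fields. The sample input is made up for illustration.
extract_ctlplane_ips() {
  grep -o 'ctlplane=[0-9.]\+' | cut -d= -f2
}

sample='| 1 | overcloud-controller-0  | ACTIVE | ctlplane=192.0.2.10 |
| 2 | overcloud-novacompute-0 | ACTIVE | ctlplane=192.0.2.11 |'

echo "$sample" | extract_ctlplane_ips
# prints 192.0.2.10 and 192.0.2.11, one per line
```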
Can you confirm that, with the documentation fix mentioned above in place, we should close this BZ out?
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
Yes, it's only a documentation bug, and it has been fixed there: https://bugzilla.redhat.com/show_bug.cgi?id=1414784#c5
Closing, as it's fixed in the documentation.

*** This bug has been marked as a duplicate of bug 1414784 ***