Bug 1323987 - [docs] [director] Rerunning the initial overcloud deploy command from an upgraded undercloud fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assignee: Dan Macpherson
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-04-05 09:35 UTC by Marius Cornea
Modified: 2016-05-03 02:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-03 02:11:56 UTC
Target Upstream Version:



Description Marius Cornea 2016-04-05 09:35:48 UTC
Description of problem:
Rerunning the overcloud deploy command from an upgraded undercloud fails and leaves the overcloud in a non-functional state.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.14-5.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-5.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. Deploy a 7.3 overcloud. ~/templates/my-overcloud-7.3 is a copy of the templates in /usr/share/openstack-tripleo-heat-templates/.

export THT=~/templates/my-overcloud-7.3
openstack overcloud deploy --templates $THT \
-e $THT/environments/network-isolation-v6.yaml \
-e ~/templates/network-environment-7.3-v6.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 2 \
--ntp-server clock.redhat.com \
--libvirt-type qemu

2. Upgrade the undercloud.

3. Rerun the deploy command:
export THT=~/templates/my-overcloud-7.3
openstack overcloud deploy --templates $THT \
-e $THT/environments/network-isolation-v6.yaml \
-e ~/templates/network-environment-7.3-v6.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
--control-scale 3 \
--compute-scale 1 \
--ceph-storage-scale 2 \
--ntp-server clock.redhat.com \
--libvirt-type qemu

Actual results:
The deploy command fails and the overcloud is left in a non-functional state.

Expected results:
The deploy command succeeds and the overcloud is accessible.

Comment 2 Marius Cornea 2016-04-05 10:44:07 UTC
The problem appears to have the same cause as BZ#1321132, but since I was using a copy of the 7.3 templates, the patch for it wasn't present.

Comment 3 Marios Andreou 2016-04-05 11:51:54 UTC
To be clear, this is a mixed-version issue: an upgraded undercloud is used to manage an existing overcloud. To do this, the deployer uses backed-up versions of the (pre-undercloud-upgrade) templates and not the newly upgraded openstack-tripleo-heat-templates package installed in /usr/share.

However, the undercloud upgrade also updates the tripleo client (i.e. not just the templates), and that is what causes the issue seen here. Mcornea is right: it has the same behaviour and root cause as BZ#1321132. The tripleoclient now sets random passwords for rabbit, and since the fix from https://review.openstack.org/#/c/298834/ isn't applied, puppet tries and fails to restart the neutron-server service.

I think the workaround is to override the tripleo-passwords-file and use it to set the values to whatever your existing overcloud has. Since we are using backed-up templates precisely because we don't want to update/upgrade/change our overcloud just yet, it makes sense, to me at least, that we want to keep the existing passwords.

Alternatively, we could try to backport the fix from https://review.openstack.org/#/c/298834 - though that has the added complication of also having to work out how to make the post-puppet services restart happen. We do it for upgrades by setting the update_identifier as in https://review.openstack.org/#/c/297175/ - but that wouldn't happen during the 'normal' stack update attempted here.

Comment 4 Brad P. Crochet 2016-04-05 15:00:59 UTC
Recommended workaround:

If the undercloud is being upgraded to OSPd 8 and the admin wishes to continue managing a 7.3 overcloud via the deploy command, the following steps are necessary.

1. echo "OVERCLOUD_RABBITMQ_PASSWORD=guest" >> $HOME/tripleo-overcloud-passwords
2. Point all templates and environments to /usr/share/openstack-tripleo-heat-templates/kilo when running additional deployments.
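
The two steps above can be sketched as a small shell snippet. This is only an illustration, not a tested procedure: the helper name ensure_rabbit_password is hypothetical, and the "guest" value is taken from step 1 on the assumption that it matches the password the existing overcloud actually uses. Unlike a bare echo with >>, the helper skips the append when the exact line is already present, so it can be rerun without piling up duplicates.

```shell
#!/bin/sh
# Sketch of the workaround steps. ensure_rabbit_password is a hypothetical
# helper name; it is not part of any TripleO or director tooling.

# Step 1: append the RabbitMQ password line to the given passwords file,
# unless that exact line is already there.
ensure_rabbit_password() {
    passwords_file="$1"
    # Assumption: the existing 7.3 overcloud uses "guest" for RabbitMQ.
    line="OVERCLOUD_RABBITMQ_PASSWORD=guest"
    touch "$passwords_file"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$passwords_file"; then
        echo "$line" >> "$passwords_file"
    fi
}

# Step 2: point THT at the Kilo templates shipped with the upgraded package,
# so that later deploy commands can keep using $THT as in the original report.
THT=/usr/share/openstack-tripleo-heat-templates/kilo
export THT
```

Calling ensure_rabbit_password "$HOME/tripleo-overcloud-passwords" before rerunning the deploy command would cover step 1; the -e $THT/environments/... arguments from the original command then resolve against the Kilo templates.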

Comment 5 Dan Macpherson 2016-04-26 02:02:35 UTC
I included the workaround from comment #4 in section "10.1. Important Pre-Upgrade Notes":

https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/director-installation-and-usage/chapter-10-upgrading-the-environment

Marius and Brad -- Is there anything else we need to document for this issue? Is the workaround the only part needing documentation?

Comment 6 Marius Cornea 2016-04-26 07:26:17 UTC
Thanks, Dan. It looks good to me, the workaround should be enough to cover the initial report.

Comment 7 Dan Macpherson 2016-05-03 02:11:56 UTC
Cool, I'll close this BZ down, but please feel free to reopen it if further changes are required.

