1267318 – [Docs] [Director] Overcloud Updates failed due new resource type

Summary: [Docs] [Director] Overcloud Updates failed due new resource type

Keywords:
Status:	CLOSED DUPLICATE of bug 1286798
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	documentation
Sub Component:
Version:	7.0 (Kilo)
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	async
Target Release:	7.0 (Kilo)
Assignee:	Dan Macpherson
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-09-29 15:45 UTC by mathieu bultel
Modified:	2015-12-18 01:28 UTC
CC List:	17 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-12-18 01:28:58 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
OpenStack gerrit	240792	0	None	None	None	Never

Description mathieu bultel 2015-09-29 15:45:33 UTC

Description of problem:

100% reproducible. When upgrading from GA to the latest puddle, the overcloud update fails after the undercloud has been upgraded:

openstack overcloud update stack -i overcloud --template /usr/share/openstack-tripleo-heat-templates
starting package update on stack overcloud
ERROR: openstack ERROR: Unknown resource Type : OS::TripleO::Network::Ports::StorageVipPort

Comment 2 Zane Bitter 2015-09-29 15:58:49 UTC

The problem is that by upgrading the undercloud, we have also updated the tripleo-heat-templates. This requires changes to the environment files (e.g. to support new resource types, as we see here). The "openstack overcloud update" command, however, is designed to not pass any environment files but just retain the existing environment in Heat. This does not work if the templates have been modified.
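The mismatch described above can be pictured as a lookup that fails: the new templates ask for a resource type that the environment retained in Heat never mapped. The following is a minimal sketch of that idea only, not Heat's actual resolution logic (it ignores wildcard mappings, for example). The function name, `OS::Example::AlreadyMapped`, and `already_mapped.yaml` are hypothetical; only `StorageVipPort` comes from the error in the description.

```python
# Illustrative sketch: registry contents are made-up stand-ins, not the
# real contents of any tripleo-heat-templates release.

def missing_resource_types(required_types, stored_registry):
    """Return the required types that the stored environment does not map."""
    return sorted(t for t in required_types if t not in stored_registry)


# Environment retained in Heat from the original deployment (hypothetical).
stored_registry = {"OS::Example::AlreadyMapped": "already_mapped.yaml"}

# Types referenced by the upgraded templates (second entry is from the error).
required_types = [
    "OS::Example::AlreadyMapped",
    "OS::TripleO::Network::Ports::StorageVipPort",
]

missing = missing_resource_types(required_types, stored_registry)
print(missing)
```

Any type reported as missing here corresponds to an "Unknown resource Type" failure unless the environment files are passed again.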

A solution may be to simply require the user to do an update explicitly passing all of the environment files again (including explicitly specifying the default environment files) after an undercloud update. We may be able to improve on this by adding a command-line option to include the default env files so that the user doesn't have to figure out the correct paths to them. (Even better would be to have a confirmation step when this option is specified, to make sure that users remember to include all of their extra environment files too.)
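As a rough illustration of "passing all of the environment files again", the sketch below assembles an argument list in the shape of the invocation tested later in this report (comment 12), with the default environment file first and the user's extra files after it. It assumes that `-e` may be repeated once per file; the helper name and the extra file path are hypothetical.

```python
# Default environment file, taken from the command quoted in comment 12.
DEFAULT_ENV = (
    "/usr/share/openstack-tripleo-heat-templates/"
    "overcloud-resource-registry-puppet.yaml"
)


def build_update_command(stack, extra_env_files):
    """Build the argv for an update that passes every env file explicitly.

    Assumption: '-e' can be given once per environment file.
    """
    cmd = ["openstack", "overcloud", "update", "stack", stack, "-i", "--templates"]
    for env in [DEFAULT_ENV] + list(extra_env_files):
        cmd += ["-e", env]
    return cmd


# '/home/stack/my-extra-env.yaml' is a hypothetical user environment file.
cmd = build_update_command("overcloud", ["/home/stack/my-extra-env.yaml"])
print(" ".join(cmd))
```

Putting the default file first means later user files can override its mappings, which matches the usual expectation that later environment files take precedence.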

Assigning to the rhel-osp-director component for now. This may end up as a docs-only bug, or it may require changes to python-rdomanager-oscplugin.

Comment 3 Steven Hardy 2015-09-30 07:57:35 UTC

So I chatted to Jan about this, wrt how we might validate this before attempting the stack update.

Unfortunately, despite us having a preview_update_stack interface which could probably validate the PATCH update, I missed updating that interface recently when I implemented PATCH updates for update_stack proper.

I raised this upstream bug to track it. The two interfaces should behave consistently, and we may be able to backport the fix to preview_update_stack, as AFAICT it's only a refactoring change inside service.py.

https://bugs.launchpad.net/heat/+bug/1501207

Comment 4 Jan Provaznik 2015-09-30 11:46:39 UTC

Because the same situation (old environment with new template) may happen for all CLI commands (scaling down, pkg updates, re-deploy) it seems to me that it's best to instruct users to update the existing overcloud with new environment files right after undercloud machine upgrade.

A doc patch with this instruction is here:
https://review.openstack.org/229373

An alternative to adding an extra parameter for the default env files might be the pre-update validation mentioned by Steven. If the patch for preview_update_stack can be backported soon, I would lean towards adding a check to the CLI commands that calls preview_update_stack before running stack-update; if the preview fails, we could warn the user.
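The pre-update check could be sketched roughly as follows. This is not the CLI's or heatclient's actual code: `preview` and `update` are hypothetical callables standing in for the preview_update_stack and stack-update calls, and the choice to skip the update after a failed preview (rather than only warn) is an assumption.

```python
def update_with_preflight(preview, update, warn=print):
    """Run update() only if preview() succeeds; return True if the update ran."""
    try:
        preview()
    except Exception as exc:  # any preview failure means "do not proceed"
        warn("Preview failed, stack update not started: %s" % exc)
        return False
    update()
    return True


calls = []


def failing_preview():
    # Stand-in for a preview call; reuses the error text from the description.
    raise ValueError(
        "Unknown resource Type : OS::TripleO::Network::Ports::StorageVipPort"
    )


def fake_update():
    # Stand-in for the real stack update; just records that it was invoked.
    calls.append("update")


ran = update_with_preflight(failing_preview, fake_update)
print(ran, calls)
```

Keeping the two calls injectable makes the guard easy to reuse for the other CLI commands mentioned above (scaling down, package updates, re-deploy).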

Upstream bug for this issue:
https://bugs.launchpad.net/tripleo/+bug/1501296

Comment 5 Zane Bitter 2015-09-30 15:44:42 UTC

(In reply to Jan Provaznik from comment #4)
> Because the same situation (old environment with new template) may happen
> for all CLI commands (scaling down, pkg updates, re-deploy) it seems to me
> that it's best to instruct users to update the existing overcloud with new
> environment files right after undercloud machine upgrade.

I'm inclined to agree. My only reservation is that this could result in the user ending up with new templates but old packages until they then go and run a package update. Off the top of my head I can't think of any circumstances where that would cause a problem though.

> A doc patch with this instruction is here:
> https://review.openstack.org/229373

LGTM

Comment 6 chris alfonso 2015-09-30 16:11:18 UTC

Dan, please check the docs patch and let us know if you need more info for the product doc change.

Comment 7 Andrew Dahms 2015-10-08 01:00:15 UTC

Assigning to Dan for review.

Comment 9 Dan Macpherson 2015-10-15 02:00:50 UTC

Added Overcloud stack upgrade procedure:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_Overcloud_Stack.html

Zane, Steve, Jan - How does this look? Any further changes required?

Comment 10 Jan Provaznik 2015-10-15 11:25:05 UTC

Thanks Dan. Unfortunately it seems that this fix is not sufficient, because Zane's concern from comment 5 is probably already happening. The step at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_Overcloud_Stack.html failed for Gael Lambert because node resources failed with the error:
Error: Could not find class tripleo::packages for strg00-prv.localdomain on node strg00-prv.localdomain

The tripleo::packages puppet class is defined in openstack-puppet-modules-2015.1.8-19.el7ost.noarch, but I guess it was not yet in the 7.0 rpms.

So a potential fix would be to run the package update directly and explicitly pass it all the env files again (comment 2).

Comment 11 Mike Burns 2015-10-16 00:14:39 UTC

I think the solution here is pretty simple, actually.  This error (tripleo::packages not defined) is because of an old openstack-puppet-modules on the deployed machine.  This can be solved in 2 ways:

1.  user manually runs yum update openstack-puppet-modules on every machine
2.  we change update stack to do update of openstack-puppet-modules first, then proceed with the rest.  

#1 can be a good short term solution

#2 is likely the right long term solution.  OPM is a safe update in general, as far as openstack services go.  It's also going to hit us that THT requires something that is not in the old OPM at some point in the future, especially on major version upgrades.
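For the short-term option (#1), the per-node step could be generated along these lines. This is only a sketch: the `heat-admin` login, the use of `ssh` with `sudo`, and the second host name are assumptions for illustration; the first host name is the node from the error in comment 10.

```python
def opm_update_commands(hosts, user="heat-admin"):
    """Return one command string per node that updates openstack-puppet-modules.

    Assumption: nodes are reachable over ssh as `user`, who may use sudo.
    """
    template = "ssh %s@%s sudo yum -y update openstack-puppet-modules"
    return [template % (user, host) for host in hosts]


hosts = [
    "strg00-prv.localdomain",  # node named in the error in comment 10
    "node01.example.invalid",  # hypothetical second node
]

commands = opm_update_commands(hosts)
for line in commands:
    print(line)
```

The commands are only printed, not executed, so the list can be reviewed before anything is run against the overcloud nodes.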

Comment 12 Jan Provaznik 2015-10-16 07:05:23 UTC

We are testing whether running the following command works:
openstack overcloud update stack overcloud -i --templates -e /usr/share/openstack-tripleo-heat-templates/overcloud-resource-registry-puppet.yaml

Basically, this skips https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_Overcloud_Stack.html and goes directly to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_Overcloud_Packages.html, explicitly setting the default environment file.

If this works, it should just be a matter of updating the docs.

Comment 13 Cyril Lopez 2015-10-16 07:51:14 UTC

(In reply to Mike Burns from comment #11)
> I think the solution here is pretty simple, actually.  This error
> (tripleo::packages not defined) is because of an old
> openstack-puppet-modules on the deployed machine.  This can be solved in 2
> ways:
> 
> 1.  user manually runs yum update openstack-puppet-modules on every machine
> 2.  we change update stack to do update of openstack-puppet-modules first,
> then proceed with the rest.  
> 
> #1 can be a good short term solution
> 
> #2 is likely the right long term solution.  OPM is a safe update in general,
> as far as openstack services go.  It's also going to hit us that THT
> requires something that is not in the old OPM at some point in the future,
> especially on major version upgrades.

That's what we did, and it solved this problem well enough to run an openstack deploy, but we then hit another issue; see https://bugzilla.redhat.com/show_bug.cgi?id=1272347

Comment 14 Jan Provaznik 2015-10-19 13:28:40 UTC

Unfortunately, a fix for this can't be tested properly until https://bugzilla.redhat.com/show_bug.cgi?id=1272357 is fixed (that bug makes the update fail every time).

Comment 15 Jan Provaznik 2015-10-21 07:31:54 UTC

So far it seems that running "openstack overcloud update stack" directly (comment 12) solves the issue with the missing tripleo::packages class on OC nodes: openstack-puppet-modules is updated by yum update before puppet runs (at least for the 7.0->7.1 upgrade). So no pre-patching of OC nodes should be required for this BZ (only a doc update).

The update process is still failing, but in a much later phase (probably related to 1272347). I'll send a doc patch for this BZ once the update finishes successfully.

Comment 16 Jan Provaznik 2015-11-02 09:03:05 UTC

I can confirm that running the package update directly solves this particular issue for 7.0->7.1 upgrades.

An upstream doc patch:
https://review.openstack.org/240792

Comment 17 Dan Macpherson 2015-12-16 04:50:42 UTC

I think this BZ is obsolete due to this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1286798

Can anyone confirm this?

Comment 18 Giulio Fidente 2015-12-17 10:37:17 UTC

hi Dan, yes the update process tracked by #1286798 covers this particular issue too.

Comment 19 Dan Macpherson 2015-12-18 01:28:58 UTC

In that case, I'll close this BZ since the other bug is more relevant. However, if we need this BZ open for whatever reason, please feel free to reopen.

*** This bug has been marked as a duplicate of bug 1286798 ***
