Bug 1303094

Summary: Can't ignore updates to OS::Nova::Server
Product: Red Hat OpenStack
Reporter: Jiri Stransky <jstransk>
Component: openstack-tripleo-common
Assignee: Dougal Matthews <dmatthew>
Status: CLOSED CURRENTRELEASE
QA Contact: Alexander Chuzhoy <sasha>
Severity: unspecified
Docs Contact:
Priority: high
Version: 8.0 (Liberty)
CC: jcoufal, jschluet, jslagle, mandreou, mburns, rhel-osp-director-maint, sbaker, shardy, slinaber, srevivo, zbitter
Target Milestone: ga
Keywords: TestOnly
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-common-0.3.0-1.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1314429 (view as bug list)
Environment:
Last Closed: 2016-04-18 16:37:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1314429

Description Jiri Stransky 2016-01-29 14:27:58 UTC
Description of problem:

When attempting an overcloud upgrade from 7 poodle to 8 poodle with OSP-d, and after reverting/editing tripleo-heat-templates to work around bug 1303084, heat tries to update the overcloud by reprovisioning the nodes. This is most probably due to changes in the OS::Nova::Server resource properties. We probably need a way to tell Heat to ignore updates to OS::Nova::Server resources in this case.

Comment 1 Zane Bitter 2016-02-01 18:45:40 UTC
This is almost certainly not getting fixed in stable/liberty upstream, so we should investigate ways to not change stuff that we don't want to change.

Comment 2 Steven Hardy 2016-02-01 23:03:01 UTC
(In reply to Zane Bitter from comment #1)
> This is almost certainly not getting fixed in stable/liberty upstream, so we
> should investigate ways to not change stuff that we don't want to change.

It's not exactly that simple - the reason we changed the user-data was because heat removed the previous user-data payload which created the heat-admin user.

I'm not sure how we can not change things given that heat internally changed the user-data payload, thus the only possible solution was to change the templates to reinstate the now-removed heat-admin user.

Context is https://bugs.launchpad.net/heat/+bug/1229849

We "fixed" that in heat, but at the expense of breaking all existing deployments - there's no way to maintain that behaviour without modifying the template, and thus the user-data.

Comment 3 Zane Bitter 2016-02-01 23:49:10 UTC
Maybe Heat should never replace a server due to a change in the userdata that it made itself; only a change in the property value specified by the user? Would that solve the problem? Because that seems like what users would be expecting anyway.

Comment 4 Steve Baker 2016-02-02 20:48:44 UTC
As far as the heat-admin user setup is concerned, a diskimage-builder element needs to create a heat-admin user and set that up for standard cloud-init public key setup.

If this is for some reason impossible, heat-admin could be set up with a software deployment - no need for deliberately changing the template user_data. (This means https://review.openstack.org/#/c/220057/ should be reverted)

To mitigate server replacement on updates for stable/liberty, other than doing comment #3, can we modify tripleoclient to do a stack-preview and prompt the user with an "Are you *really* sure you want to replace these servers" if the output states that any servers will be replaced?
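A minimal sketch of that prompt logic, for illustration only. It assumes the preview result has already been fetched and is a dict with a "replaced" list whose entries carry "resource_name" and "resource_type" keys; that shape, and the function names servers_to_be_replaced and confirm_update, are assumptions rather than anything taken from tripleoclient.

```python
def servers_to_be_replaced(preview):
    """Return the names of OS::Nova::Server resources listed as replaced.

    `preview` is assumed to be a dict with a "replaced" list of resource
    dicts; this shape is hypothetical and may differ from real output.
    """
    return sorted(
        res["resource_name"]
        for res in preview.get("replaced", [])
        if res.get("resource_type") == "OS::Nova::Server"
    )


def confirm_update(preview, ask=input):
    """Return True if the update may proceed.

    The user is only prompted when at least one server would be replaced.
    `ask` is injectable so the prompt can be replaced in tests.
    """
    names = servers_to_be_replaced(preview)
    if not names:
        return True
    answer = ask(
        "These servers would be replaced: %s. "
        "Are you *really* sure you want to replace these servers? [y/N] "
        % ", ".join(names)
    )
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" on anything other than an explicit yes keeps an accidental Enter from reprovisioning nodes.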

I'm OK with user_data_update_policy property in Mitaka, but once that is in the user control I don't see the need for a heat.conf option - neither in good conscience could be backported to liberty.

Comment 5 Steven Hardy 2016-02-03 06:10:49 UTC
(In reply to Steve Baker from comment #4)
> As far as the heat-admin user setup is concerned, a diskimage-builder
> element needs to create a heat-admin user and set that up for standard
> cloud-init public key setup.

Yes we could do this, but there's still the need for runtime logic to set up the SSH key from the nova metadata, or the operator can't log in via the dib-created user (e.g. the templates still have to change whether it's userdata or a SoftwareDeployment).

One possibility would be to create some local dummy user-data via dib which e.g. gets picked up via the nocloud collector for cloud-init.

> If this is for some reason impossible, heat-admin could be set up with a
> software deployment - no need for deliberately changing the template
> user_data. (This means https://review.openstack.org/#/c/220057/ should be
> reverted)

Again, this is possible, but we'd need to do this very early (before the NetworkDeployment runs, as you often need to log in to debug if e.g. the NetworkDeployment can't signal back to heat).

> To mitigate server replacement on updates for stable/liberty, other than
> doing comment #3 can we modify tripleoclient to do a stack-preview and
> prompt the user for an "Are you *really* sure you want to replace these
> servers" if the output states that any servers will be replaced.

This won't work until update preview handles nested stacks, which I've been fixing via https://review.openstack.org/#/c/268997/ - not yet landed.

> I'm OK with user_data_update_policy property in Mitaka, but once that is in
> the user control I don't see the need for a heat.conf option - neither in
> good conscience could be backported to liberty.

The config file option could reasonably be backported IMHO, which is why I proposed it as two patches.  If you guys are -2 on that, I'll remove the config option and squash the patches.

I personally do think that user_data_update_policy combined with ResourceGroup will be useful. There are several use-cases besides the heat-admin case where data needs to be provided to new nodes very early. os-net-config mapping files are one example: user-data is a very convenient and simple method there, but it is impossible to use unless we have this level of control. In the TripleO case, having the option to set this globally in the single-purpose heat undercloud install makes sense, to me at least.
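As a rough illustration of that combination, the sketch below builds a template fragment in which every server in a ResourceGroup shares one user_data_update_policy. The property name comes from this thread; the "IGNORE" value, the template version string, and the image and flavor names are assumptions, and server_group is a hypothetical helper.

```python
import json


def server_group(count, user_data, policy="IGNORE"):
    """Build a ResourceGroup of servers that share one user_data policy.

    "IGNORE" is assumed to mean "do not replace the server when user_data
    changes"; check the Heat resource documentation for the real values.
    """
    return {
        "type": "OS::Heat::ResourceGroup",
        "properties": {
            "count": count,
            "resource_def": {
                "type": "OS::Nova::Server",
                "properties": {
                    "image": "overcloud-full",   # assumed image name
                    "flavor": "baremetal",       # assumed flavor name
                    "user_data_format": "RAW",
                    "user_data": user_data,
                    "user_data_update_policy": policy,
                },
            },
        },
    }


template = {
    "heat_template_version": "2016-04-08",  # assumed version string
    "resources": {"Compute": server_group(2, "#cloud-config\n")},
}

# JSON is a subset of YAML, so the dump can be read as a template file.
print(json.dumps(template, indent=2))
```

Setting the policy in one place per group is what makes early per-node data (such as mapping files) practical without risking replacement of existing nodes on a later update.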

Cheers.

Comment 6 Steve Baker 2016-02-03 20:30:43 UTC
(In reply to Steven Hardy from comment #5)
> (In reply to Steve Baker from comment #4)
> > As far as the heat-admin user setup is concerned, a diskimage-builder
> > element needs to create a heat-admin user and set that up for standard
> > cloud-init public key setup.
> 
> Yes we could do this, but there's still the need for runtime logic to setup
> the SSH key from the nova metadata, or the operator can't log in via the dib
> created user (e.g the templates still have to change whether it's userdata
> or a SoftwareDeployment).
> 
> One possibility would be to create some local dummy user-data via dib which
> e.g gets picked up via the nocloud collector for cloud-init.

All pristine cloud images come with a centos/cloud-user/fedora-user user set up for cloud-init key injection. I'm suggesting that we just replicate that for the heat-admin user in the image building. 

> > If this is for some reason impossible, heat-admin could be set up with a
> > software deployment - no need for deliberately changing the template
> > user_data. (This means https://review.openstack.org/#/c/220057/ should be
> > reverted)
> 
> Again, this is possible, but we'd need to do this very early (before the
> NetworkDeployment runs as often you need to log to debug if e.g the
> NetworkDeployment can't signal back to heat).

Early sounds appropriate

> > To mitigate server replacement on updates for stable/liberty, other than
> > doing comment #3 can we modify tripleoclient to do a stack-preview and
> > prompt the user for an "Are you *really* sure you want to replace these
> > servers" if the output states that any servers will be replaced.
> 
> This won't work until update preview handles nested stacks, which I've been
> fixing via https://review.openstack.org/#/c/268997/ - not yet landed.

OK

> > I'm OK with user_data_update_policy property in Mitaka, but once that is in
> > the user control I don't see the need for a heat.conf option - neither in
> > good conscience could be backported to liberty.
> 
> The config file option could reasonably be backported IMHO, which is why I
> proposed it as two patches.  If you guys are -2 on that, I'll remove the
> config option and squash the patches.

I thought the config change depended on the properties change

> I personally do think that user_data_update_policy combined with
> ResourceGroup will be useful - there are several other use-cases other than
> the heat-admin case where it's needed to provide data for new nodes very
> early, os-net-config mapping files is one example where user-data is a very
> convenient and simple method which is impossible to use unless we have this
> level of control (and in the TripleO case, having the option to set this
> globally in the single-purpose heat undercloud install makes sense, to me at
> least)

I'm not -2 on it, but we might get away with not needing it if we do the above.

Comment 7 Steven Hardy 2016-02-04 16:23:29 UTC
> I thought the config change depended on the properties change

No, we can land the config change, backport it, and then land the properties change only to the master branch.

Comment 8 Steve Baker 2016-02-23 04:22:35 UTC
Please take a look at my upstream proposed change to instack-undercloud.

Comment 9 Marios Andreou 2016-02-24 14:42:27 UTC
(In reply to Steve Baker from comment #8)
> Please take a look at my upstream proposed change to instack-undercloud.

For clarity and to avoid confusion: that change was abandoned, and the current approach is in tripleo-common, as linked above in the External Trackers (OpenStack Gerrit): https://review.openstack.org/#/c/283832

Comment 10 Dougal Matthews 2016-03-07 13:43:32 UTC
It looks like the patch Marios linked is merged. Is there anything else that needs to be done with this?

Comment 11 Steve Baker 2016-03-07 20:00:18 UTC
There is a follow-up change which prevents replacement on any property change. This also needs to land and be backported.

Comment 18 Alexander Chuzhoy 2016-04-14 14:52:09 UTC
Verified:
Environment:
openstack-tripleo-common-0.3.1-1.el7ost.noarch


To verify, I checked that each nova instance appears only once in the "nova list" output against the undercloud.
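One way to script such a check, sketched under the assumption that the "nova list" rows have already been parsed into (id, name) tuples. The parsing step, the sample IDs, and the helper name duplicated_names are all hypothetical.

```python
from collections import Counter


def duplicated_names(servers):
    """Return instance names that appear more than once.

    `servers` is an iterable of (id, name) tuples, assumed to have been
    parsed from "nova list" output by some other means.
    """
    counts = Counter(name for _server_id, name in servers)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Made-up rows for illustration; a name listed under two IDs would
# suggest that a node was reprovisioned.
rows = [
    ("id-1", "overcloud-controller-0"),
    ("id-2", "overcloud-novacompute-0"),
]
if duplicated_names(rows):
    print("duplicate instances:", duplicated_names(rows))
else:
    print("each instance appears once")
```

Comparing the instance IDs before and after the update would be a stricter variant of the same check.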