Bug 1459854

Summary: request backport of "Only recreate CHECK FAILED resources in ResourceGroup"
Product: Red Hat OpenStack
Component: openstack-heat
Version: 10.0 (Newton)
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Luca Miccini <lmiccini>
Assignee: Zane Bitter <zbitter>
QA Contact: Amit Ugol <augol>
CC: dbecker, dcadzow, mburns, nchandek, pablo.iranzo, rhel-osp-director-maint, rrasouli, sbaker, shardy, srevivo, zbitter
Target Milestone: z4
Target Release: 10.0 (Newton)
Keywords: FeatureBackport, Triaged, ZStream
Hardware: All
OS: Linux
Fixed In Version: openstack-heat-7.0.3-2.el7ost
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2017-09-06 17:13:53 UTC
Bug Blocks: 1459984

Description Luca Miccini 2017-06-08 11:15:37 UTC
Description of problem:

Deleting one server outside of Heat and then updating the stack causes all servers in the ResourceGroup to be re-created, not just the missing one.

This has been fixed upstream:

https://review.openstack.org/#/c/445662/9

and backported to newton:

https://review.openstack.org/#/c/452947/


Version-Release number of selected component (if applicable):

openstack-heat-api-7.0.2-1.el7ost.noarch 
openstack-heat-engine-7.0.2-1.el7ost.noarch 
openstack-heat-common-7.0.2-1.el7ost.noarch

How reproducible:

always

Steps to Reproduce:

1. create a stack

$ nova list
host-0.example.com  192.168.0.5
host-1.example.com  192.168.0.6

The hosts are:

  - in an OS::Heat::ResourceGroup
  - in a ServerGroup with an anti-affinity policy (see the template sketch below)
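For reference, a minimal template along these lines reproduces the layout described above. This is only a sketch: the image and flavor values are placeholders, and the actual stack seen in the logs ("shift", with root volumes) is more involved.

heat_template_version: 2016-10-14

resources:
  server_group:
    type: OS::Nova::ServerGroup
    properties:
      name: hosts-anti-affinity
      policies: [anti-affinity]

  hosts:
    type: OS::Heat::ResourceGroup
    properties:
      count: 2
      resource_def:
        type: OS::Nova::Server
        properties:
          name: host-%index%.example.com
          image: rhel-7      # placeholder
          flavor: m1.small   # placeholder
          scheduler_hints:
            group: { get_resource: server_group }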


2. Delete one node
$ nova delete host-0.example.com
$ nova list
host-1.example.com

Stack check correctly detects the missing node
$ heat action-check mystack  
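To see which nested resources the check flagged, something along these lines can be used (stack name as above; the nesting depth is only an example):

$ heat resource-list -n 5 mystack | grep CHECK_FAILED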

3. Run stack-update to recover the missing node
$ heat stack-update -x mystack


Actual results:

Two new nodes are created instead of one: a replacement for host-0 and a second host-1, whose name conflicts with the node that still exists.

$ nova list
host-1.example.com 192.168.0.6
host-0.example.com 192.168.0.11
host-1.example.com 192.168.0.12


logs:
2017-06-08 07:59:10Z [shift.openshift_infra_nodes.1]: CHECK_COMPLETE  Stack CHECK completed successfully. 'CHECK' not fully supported (see resources)  # SoftwareConfig & co
2017-06-08 07:59:10Z [shift.openshift_infra_nodes.1]: CHECK_COMPLETE  state changed
2017-06-08 07:59:10Z [shift.openshift_infra_nodes.0]: CHECK_FAILED  ['NotFound: resources[0].resources.root_volume: Volume 96187160-3c71-47de-98c9-2e3693e90003 could not be found. (HTTP 404) (Request-ID: req-696f3b71-1a4e-4984-a7cb-9d29b31157ad)']. 'CHECK' not fully supported (see resources)
2017-06-08 07:59:10Z [shift.openshift_infra_nodes]: CHECK_FAILED  Resource CHECK failed: ["['NotFound: resources[0].resources.root_volume: Volume 96187160-3c71-47de-98c9-2e3693e90003 could not be found. (HTTP 404) (Request-ID: req-696f3b71-1a4e-4984-a7cb-9d29b31157ad)']. 'CHECK' not fully supported (see resources)"]. 'C
2017-06-08 07:59:11Z [shift.openshift_infra_nodes]: CHECK_FAILED  ["['NotFound: resources.openshift_infra_nodes.resources[0].resources.root_volume: Volume 96187160-3c71-47de-98c9-2e3693e90003 could not be found. (HTTP 404) (Request-ID: req-696f3b71-1a4e-4984-a7cb-9d29b31157ad)']. 'CHECK' not fully supported (see resourc
2017-06-08 07:59:19Z [shift]: CHECK_FAILED  Resource CHECK failed: ['["[\'NotFound: resources.openshift_infra_nodes.resources[0].resources.root_volume: Volume 96187160-3c71-47de-98c9-2e3693e90003 could not be found. (HTTP 404) (Request-ID: req-696f3b71-1a4e-4984-a7cb-9d29b31157ad)\']. \'CHECK\' not

Expected results:

The stack is recovered to its original state: only the missing server is recreated.

Additional info:

Comment 10 Ronnie Rasouli 2017-09-03 14:15:29 UTC
After a stack update and an openstack stack check, the stack shows UPDATE_FAILED, but no additional nodes are created for that stack.

Comment 12 errata-xmlrpc 2017-09-06 17:13:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2655