Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1718265

Summary: AWS provider removes stopped instances when reconciling machines
Product: OpenShift Container Platform
Reporter: Michael Gugino <mgugino>
Component: Cloud Compute
Assignee: Michael Gugino <mgugino>
Status: CLOSED ERRATA
QA Contact: Jianwei Hou <jhou>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.1.z
CC: agarcial, jhou, ocs-bugs, ratamir, sponnaga, tkimura, xtian, zhsun
Target Milestone: ---
Keywords: OSE41z_next
Target Release: 4.1.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: 4.1.4
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1713010
Environment:
Last Closed: 2019-07-04 09:01:24 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1713010
Bug Blocks:

Description Michael Gugino 2019-06-07 11:43:18 UTC
+++ This bug was initially created as a clone of Bug #1713010 +++

Description of problem:

If a cloud instance backing a machine has stopped, and the machine is reconciled again later for some reason, the stopped instance will be deleted and a new instance will be created in its place.  This behavior is undocumented, likely unexpected, and probably something we should remove.
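
The change in behavior can be sketched as a small decision function. This is an illustration only, not the AWS provider's actual code, which does not appear in this bug. The function name `plan_reconcile`, the action strings, and the `replace_stopped` flag are hypothetical; the state names are the standard EC2 instance states. How states other than `running` and `stopped` are handled is an assumption.

```python
from typing import List, Optional


def plan_reconcile(instance_state: Optional[str], replace_stopped: bool) -> List[str]:
    """Return the (hypothetical) actions one reconcile pass would take.

    instance_state: EC2 state of the instance backing the machine,
        or None if no instance was found.
    replace_stopped: True models the behavior reported in this bug,
        False models the behavior after the fix.
    """
    if instance_state is None:
        # No backing instance at all: one has to be created.
        return ["create"]
    if instance_state in ("stopping", "stopped"):
        if replace_stopped:
            # Reported behavior: the stopped instance is deleted
            # and a new instance is created in its place.
            return ["delete", "create"]
        # Fixed behavior: the stopped instance is left alone.
        return ["keep"]
    # Assumption: any other state (e.g. "running") needs no replacement.
    return ["keep"]


if __name__ == "__main__":
    print(plan_reconcile("stopped", replace_stopped=True))
    print(plan_reconcile("stopped", replace_stopped=False))
```

With `replace_stopped=True` the sketch reproduces the delete-and-recreate sequence described above; with `replace_stopped=False` the stopped instance survives the reconcile and can be restarted later.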

--- Additional comment from Michael Gugino on 2019-06-07 11:42:25 UTC ---

Merged in master.

Comment 2 Michael Gugino 2019-06-17 13:53:00 UTC
How to verify (QE):

Prior to this patch:
1) Stop a worker instance in the AWS console.
2) Wait for the node to go unready.
3) Within a minute or two of the node going unready, you should see a new instance provisioned in the AWS console with the same tag.Name as the instance you stopped.
4) The old instance will be terminated.

With this patch:
1) Stop a worker instance in the AWS console.
2) Wait for the node to go unready.
3) A few minutes after the node goes unready, verify that the AWS console shows no new instances with the same tag.Name as the instance you stopped.
4) The instance will not be terminated and can be successfully restarted.
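
The post-patch check in steps 3 and 4 could be expressed as a small helper over a snapshot of the instance list. This is a rough sketch under stated assumptions: the record shape (`id`, `name`, `state` keys) and the function name `verify_no_replacement` are hypothetical, and the snapshot would have to be gathered separately from the AWS console or API.

```python
from typing import Dict, List


def verify_no_replacement(stopped_id: str, instances: List[Dict[str, str]]) -> bool:
    """Check a snapshot of instances taken a few minutes after the stop.

    Each record is a hypothetical dict with keys:
      "id"    - instance ID
      "name"  - value of the instance's Name tag
      "state" - EC2 state name (e.g. "running", "stopped", "terminated")
    Returns True if the stopped instance still exists, has not been
    terminated, and no other instance carries the same Name tag.
    """
    by_id = {inst["id"]: inst for inst in instances}
    original = by_id.get(stopped_id)
    if original is None:
        return False
    if original["state"] in ("shutting-down", "terminated"):
        # Step 4 fails: the stopped instance was terminated.
        return False
    # Step 3: no other instance may share the stopped instance's Name tag.
    duplicates = [
        inst for inst in instances
        if inst["id"] != stopped_id and inst["name"] == original["name"]
    ]
    return not duplicates


if __name__ == "__main__":
    snapshot = [
        {"id": "i-aaa", "name": "worker-a", "state": "stopped"},
        {"id": "i-bbb", "name": "worker-b", "state": "running"},
    ]
    print(verify_no_replacement("i-aaa", snapshot))
```

The instance IDs and Name tags in the usage example are made up for illustration.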

Comment 5 sunzhaohua 2019-06-28 03:33:28 UTC
Verified.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-06-27-204847   True        False         39m     Cluster version is 4.1.0-0.nightly-2019-06-27-204847

Stopped a worker instance in the AWS console. The node status became NotReady. After a few minutes, no new instances had been provisioned in the AWS console with the same tag.Name as the stopped instance.
After the stopped instance was restarted, the node became Ready again.

$ oc get node
NAME                                         STATUS     ROLES    AGE   VERSION
ip-10-0-131-22.us-east-2.compute.internal    NotReady   worker   55m   v1.13.4+c9e4f28ff
ip-10-0-136-24.us-east-2.compute.internal    Ready      master   60m   v1.13.4+c9e4f28ff
ip-10-0-157-50.us-east-2.compute.internal    Ready      worker   55m   v1.13.4+c9e4f28ff
ip-10-0-158-200.us-east-2.compute.internal   Ready      master   60m   v1.13.4+c9e4f28ff
ip-10-0-168-191.us-east-2.compute.internal   Ready      master   60m   v1.13.4+c9e4f28ff

$ oc get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-131-22.us-east-2.compute.internal    Ready    worker   58m   v1.13.4+c9e4f28ff
ip-10-0-136-24.us-east-2.compute.internal    Ready    master   64m   v1.13.4+c9e4f28ff
ip-10-0-157-50.us-east-2.compute.internal    Ready    worker   58m   v1.13.4+c9e4f28ff
ip-10-0-158-200.us-east-2.compute.internal   Ready    master   63m   v1.13.4+c9e4f28ff
ip-10-0-168-191.us-east-2.compute.internal   Ready    master   63m   v1.13.4+c9e4f28ff

Comment 7 Raz Tamir 2019-06-28 23:05:18 UTC
Hi Michael,
Any idea when this fix will land in 4.1.1 or 4.1.2?

Comment 8 Michael Gugino 2019-07-01 12:10:54 UTC
(In reply to Raz Tamir from comment #7)
> Hi Michael,
> Any idea when this fix will land in 4.1.1 or 4.1.2?

I believe it's now targeted for 4.1.4.

Comment 11 errata-xmlrpc 2019-07-04 09:01:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1635

Comment 12 Alberto 2019-07-26 14:19:05 UTC
*** Bug 1724968 has been marked as a duplicate of this bug. ***