Bug 1884195

Summary:	Possible to delete 2 masters simultaneously if kubelet unreachable
Product:	OpenShift Container Platform
Reporter:	Sam Batschelet <sbatsche>
Component:	Cloud Compute
Assignee:	Michael Gugino <mgugino>
Cloud Compute sub component:	Other Providers
QA Contact:	sunzhaohua <zhsun>
Status:	CLOSED ERRATA
Docs Contact:
Severity:	low
Priority:	medium
CC:	agarcial, mgugino, mimccune, wking, zhsun
Version:	4.6
Target Milestone:	---
Target Release:	4.5.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Story Points:	---
Clone Of:	1840358
Environment:
Last Closed:	2021-03-03 04:40:29 UTC
Type:	---
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Category:	---
oVirt Team:	---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---
Target Upstream Version:
Embargoed:
Bug Depends On:	1840358
Bug Blocks:

Comment 1 sunzhaohua 2020-11-09 05:37:02 UTC

This bug's PR is dev-approved and not yet merged, so I'm following DPTP-660 to do pre-merge verification by using cluster-bot to launch a cluster with the open PR.

clusterversion: 4.5.0-0.ci.test-2020-11-09-040048-ci-ln-t9lmsrb
1.  Stop the kubelet on 2/3 master nodes
2.  Delete first stopped master via machine-api
3.  Delete second stopped master via machine-api. Neither could be deleted.
$ oc get node
NAME                                         STATUS     ROLES    AGE   VERSION
ip-10-0-154-169.us-west-2.compute.internal   NotReady   master   75m   v1.18.3+10e5708
ip-10-0-170-160.us-west-2.compute.internal   Ready      worker   64m   v1.18.3+10e5708
ip-10-0-177-138.us-west-2.compute.internal   NotReady   master   75m   v1.18.3+10e5708
ip-10-0-184-104.us-west-2.compute.internal   Ready      worker   64m   v1.18.3+10e5708
ip-10-0-204-210.us-west-2.compute.internal   Ready      master   75m   v1.18.3+10e5708
ip-10-0-209-124.us-west-2.compute.internal   Ready      worker   64m   v1.18.3+10e5708

$ oc get machine
NAME                                                PHASE     TYPE        REGION      ZONE         AGE
ci-ln-t9lmsrb-d5d6b-2hj86-master-0                  Running   m5.xlarge   us-west-2   us-west-2a   77m
ci-ln-t9lmsrb-d5d6b-2hj86-master-1                  Running   m5.xlarge   us-west-2   us-west-2b   77m
ci-ln-t9lmsrb-d5d6b-2hj86-master-2                  Running   m5.xlarge   us-west-2   us-west-2a   77m
ci-ln-t9lmsrb-d5d6b-2hj86-worker-us-west-2a-9ds9f   Running   m4.xlarge   us-west-2   us-west-2a   68m
ci-ln-t9lmsrb-d5d6b-2hj86-worker-us-west-2a-lhbdd   Running   m4.xlarge   us-west-2   us-west-2a   68m
ci-ln-t9lmsrb-d5d6b-2hj86-worker-us-west-2b-krfnd   Running   m4.xlarge   us-west-2   us-west-2b   68m

$ oc delete machine ci-ln-t9lmsrb-d5d6b-2hj86-master-0
machine.machine.openshift.io "ci-ln-t9lmsrb-d5d6b-2hj86-master-0" deleted
^C
$ oc delete machine ci-ln-t9lmsrb-d5d6b-2hj86-master-2
machine.machine.openshift.io "ci-ln-t9lmsrb-d5d6b-2hj86-master-2" deleted
^C

So this bug is pre-merge verified. After the PR is merged, the bot should move the bug to VERIFIED automatically; if that does not happen, I will move it to VERIFIED manually.
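
The precondition for this test (two of three masters NotReady) can also be checked mechanically. The following is only a minimal sketch and was not part of the original verification: count_notready_masters is a hypothetical helper name, and it assumes the default `oc get node` column layout shown above (NAME, STATUS, ROLES, AGE, VERSION).

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): count master nodes whose
# STATUS column starts with NotReady. Reads `oc get node` text on stdin and
# assumes the default columns NAME STATUS ROLES AGE VERSION.
count_notready_masters() {
  awk 'NR > 1 && $2 ~ /^NotReady/ && $3 ~ /(^|,)master(,|$)/ { n++ }
       END { print n + 0 }'
}

# Demo on rows copied from the node listing above; prints 2.
count_notready_masters <<'EOF'
NAME                                         STATUS     ROLES    AGE   VERSION
ip-10-0-154-169.us-west-2.compute.internal   NotReady   master   75m   v1.18.3+10e5708
ip-10-0-170-160.us-west-2.compute.internal   Ready      worker   64m   v1.18.3+10e5708
ip-10-0-177-138.us-west-2.compute.internal   NotReady   master   75m   v1.18.3+10e5708
ip-10-0-204-210.us-west-2.compute.internal   Ready      master   75m   v1.18.3+10e5708
EOF
```

Against a live cluster the same function would be fed from `oc get node | count_notready_masters`. In the scenario above it should report 2 before either machine is deleted.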

Comment 2 Michael McCune 2020-12-04 21:34:56 UTC

The 2133 PR associated with this issue has the necessary labels to merge but is waiting on CI and on a discussion that is happening in its comments.

Comment 7 errata-xmlrpc 2021-03-03 04:40:29 UTC

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.5.33 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0428