1779801 – All nodes go NotReady after 24 hours, several pending CSRs

Summary: All nodes go NotReady after 24 hours, several pending CSRs

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-controller-manager
Sub Component:
Version:	4.2.0
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Target Release:	4.2.z
Assignee:	Maciej Szulik
QA Contact:	Walid A.
Docs Contact:
URL:
Whiteboard:
Depends On:	1755469
Blocks:

Reported:	2019-12-04 18:32 UTC by Maciej Szulik
Modified:	2023-03-24 16:18 UTC
CC List:	19 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1755469
Environment:
Last Closed:	2019-12-20 00:46:48 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-kube-controller-manager-operator pull 305	0	'None'	closed	[release-4.2] bug 1779801: watch for changes to input secrets from the operator namespace	2020-11-04 22:16:30 UTC
Red Hat Product Errata	RHBA-2019:4181	0	None	None	None	2019-12-20 00:46:58 UTC

Comment 2 Xingxing Xia 2019-12-09 02:02:55 UTC

Hi Walid, this is a clone of the original bug 1755469. Could you verify it? Thanks in advance.
One interesting thing: 4.2.2 was shown running well in bug 1755469#c42, even though the above PR is recent.

Comment 5 Walid A. 2019-12-13 05:33:43 UTC

Verified that all nodes are still Ready after 25+ hours on both AWS and Azure IPI-installed clusters:

Azure: 

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.10    True        False         25h     Cluster version is 4.2.10

# oc get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
walid4210zb-dxzqw-master-0                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-master-1                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-master-2                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus1-kfxzw   Ready    worker   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus2-wrml6   Ready    worker   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus3-csvpq   Ready    worker   26h   v1.14.6+888f9c630


AWS:

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.10    True        False         25h     Cluster version is 4.2.10

# oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-132-101.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-142-51.us-west-2.compute.internal    Ready    worker   26h   v1.14.6+888f9c630
ip-10-0-157-108.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-157-125.us-west-2.compute.internal   Ready    worker   26h   v1.14.6+888f9c630
ip-10-0-163-205.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-164-109.us-west-2.compute.internal   Ready    worker   26h   v1.14.6+888f9c630
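
The manual check above can also be scripted. The following is a minimal sketch, not part of the original verification: the helper names count_not_ready and pending_csrs are hypothetical, and the sketch assumes the default table output of `oc get nodes` (STATUS in the second column) and `oc get csr` (CONDITION in the last column). Both helpers read table text on stdin, so they can be tried on saved output without a cluster.

```shell
# Count nodes whose STATUS is not Ready.
# Input: `oc get nodes`-style table text on stdin (header row is skipped).
# A status such as "Ready,SchedulingDisabled" is still treated as Ready.
count_not_ready() {
  awk 'NR > 1 && NF >= 2 && $2 !~ /^Ready(,|$)/ { n++ } END { print n + 0 }'
}

# Print the names of CSRs whose CONDITION (last column) is Pending.
# Input: `oc get csr`-style table text on stdin (header row is skipped).
pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc client):
#   oc get nodes | count_not_ready
#   oc get csr | pending_csrs
```

On a cluster in the verified state shown above, the first helper would be expected to print 0 and the second to print nothing.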

Comment 7 errata-xmlrpc 2019-12-20 00:46:48 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4181
