Bug 1779801 - All nodes go NotReady after 24 hours, several pending CSRs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.2.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.2.z
Assignee: Maciej Szulik
QA Contact: Walid A.
URL:
Whiteboard:
Depends On: 1755469
Blocks:
Reported: 2019-12-04 18:32 UTC by Maciej Szulik
Modified: 2020-01-07 16:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1755469
Environment:
Last Closed: 2019-12-20 00:46:48 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-controller-manager-operator pull 305 | None | closed | [release-4.2] bug 1779801: watch for changes to input secrets from the operator namespace | 2020-05-13 22:48:17 UTC
Red Hat Product Errata | RHBA-2019:4181 | None | None | None | 2019-12-20 00:46:58 UTC

Comment 2 Xingxing Xia 2019-12-09 02:02:55 UTC
Hi Walid, this is a clone of the original bug 1755469. Could you verify it? Thanks in advance.
One interesting thing: 4.2.2 was shown running well in bug 1755469#c42, even though the PR above is recent.

Comment 5 Walid A. 2019-12-13 05:33:43 UTC
Verified that all nodes are still Ready after 25+ hours on both AWS and Azure IPI-installed clusters:

Azure: 

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.10    True        False         25h     Cluster version is 4.2.10

# oc get nodes
NAME                                        STATUS   ROLES    AGE   VERSION
walid4210zb-dxzqw-master-0                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-master-1                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-master-2                  Ready    master   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus1-kfxzw   Ready    worker   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus2-wrml6   Ready    worker   26h   v1.14.6+888f9c630
walid4210zb-dxzqw-worker-centralus3-csvpq   Ready    worker   26h   v1.14.6+888f9c630


AWS:
# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.10    True        False         25h     Cluster version is 4.2.10

# oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-132-101.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-142-51.us-west-2.compute.internal    Ready    worker   26h   v1.14.6+888f9c630
ip-10-0-157-108.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-157-125.us-west-2.compute.internal   Ready    worker   26h   v1.14.6+888f9c630
ip-10-0-163-205.us-west-2.compute.internal   Ready    master   26h   v1.14.6+888f9c630
ip-10-0-164-109.us-west-2.compute.internal   Ready    worker   26h   v1.14.6+888f9c630
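
The same check can be scripted rather than read by eye. The following is a minimal sketch, not part of the original verification: the helper names `not_ready_nodes` and `pending_csr_count` are hypothetical, and it assumes the column layouts shown above (STATUS as the second column of `oc get nodes`) and that CONDITION is the last column of `oc get csr`.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get nodes` output on stdin and prints the
# name of every node whose STATUS column is not exactly "Ready".
# Assumption: STATUS is the second column, as in the listings above.
not_ready_nodes() {
    awk 'NR > 1 && NF >= 2 && $2 != "Ready" { print $1 }'
}

# Hypothetical helper: reads `oc get csr` output on stdin and prints how many
# requests are still pending.
# Assumption: CONDITION is the last column and reads "Pending" for such rows.
pending_csr_count() {
    awk 'NR > 1 && $NF == "Pending" { n++ } END { print n + 0 }'
}
```

Typical use would be `oc get nodes | not_ready_nodes` and `oc get csr | pending_csr_count`. Note that a cordoned node (STATUS `Ready,SchedulingDisabled`) is also reported by the first helper, since only an exact `Ready` is accepted.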

Comment 7 errata-xmlrpc 2019-12-20 00:46:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4181

