Created attachment 1812272
must gather

Description of problem:
Fresh ocp4.9 installation shows error for kube-apiserver.

Version-Release number of selected component (if applicable):
Server Version: 4.9.0-0.nightly-2021-08-07-175228
Platform: AWS

How reproducible:
Occurred once

Steps to Reproduce:
1. Install ocp 4.9 environment.

Actual results:
Showing error as below:

message: "NodeInstallerDegraded: 1 nodes are failing on revision 4:\nNodeInstallerDegraded: installer: 30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/user-serving-cert-000\": dial tcp 172.30.0.1:443: connect: connection refused\nNodeInstallerDegraded: I0809 04:46:58.847903 1 copy.go:24] Failed to get secret openshift-kube-apiserver/user-serving-cert-000: Get \"https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/user-serving-cert-000\": dial tcp 172.30.0.1:443: connect: connection refused\nNodeInstallerDegraded: I0809 04:46:59.116854 1 copy.go:24] Failed to get secret openshift-kube-apiserver/user-serving-cert-000: Get \"https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/user-serving-cert-000\": dial tcp 172.30.0.1:443: connect: connection refused\nNodeInstallerDegraded: W0809 04:46:59.117906 1 recorder.go:198] Error creating event &Event{ObjectMeta:{installer-4-ip-10-0-167-134.us-east-2.compute.internal.169989f37bb86e32 \ openshift-kube-apiserver 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []},InvolvedObject:ObjectReference{Kind:Pod,Namespace:openshift-kube-apiserver,Name:installer-4-ip-10-0-167-134.us-east-2.compute.internal,UID:7c85fc09-408d-4e73-b175-6177507e47da,APIVersion:v1,ResourceVersion:,FieldPath:,},Reason:StaticPodInstallerFailed,Message:Installing revision 4: Get \"https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/user-serving-cert-000\": dial tcp 172.30.0.1:443: connect: connection refused,Source:EventSource{Component:static-pod-installer,Host:,},FirstTimestamp:2021-08-09 04:46:59.116887602 +0000 UTC m=+19.308927563,LastTimestamp:2021-08-09 04:46:59.116887602 +0000 UTC m=+19.308927563,Count:1,Type:Warning,EventTime:0001-01-01 00:00:00 +0000 UTC,Series:nil,Action:,Related:nil,ReportingController:,ReportingInstance:,}: Post \"https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/events\": dial tcp 172.30.0.1:443: connect: connection refused\nNodeInstallerDegraded: F0809 04:46:59.118050 1 cmd.go:96] failed to copy: Get \"https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/secrets/user-serving-cert-000\": dial tcp 172.30.0.1:443: connect: connection refused\nNodeInstallerDegraded: "
reason: NodeInstaller_InstallerPodFailed
status: "True"
type: Degraded

Expected results:
cluster operator kube-apiserver should not show any error

Additional info:
https://github.com/openshift/library-go/pull/1179
There are two PRs for this bug; the last one was merged 8 days ago. In the past 14 days, many such failures can still be found:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=NodeInstallerDegraded.*1+nodes+are+failing+on+revision&maxAge=336h&context=1&type=junit&name=4%5C.9&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep 'kube-apiserver.*1 nodes are failing on revision' | wc -l
30

In the past 7 days, after both PRs were merged, no such failure can be found:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?search=NodeInstallerDegraded.*1+nodes+are+failing+on+revision&maxAge=168h&context=1&type=junit&name=4%5C.9&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job' | grep 'kube-apiserver.*1 nodes are failing on revision'
No results found.

Based on the above, the PRs work as intended, so moving the bug to VERIFIED.
In addition, no such issues have been found in recent installations.
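
The verification above compares how many CI results match the failure pattern in a 14-day window against a 7-day window. As a rough sketch only, the same comparison could be scripted over saved copies of the two `w3m -dump` outputs. The function names and the idea of saving the dumps to strings are assumptions made for illustration; only the regular expression is taken from the grep command in the comment above.

```python
import re

# Pattern copied from the grep command used in the verification comment.
FAILURE_PATTERN = re.compile(r"kube-apiserver.*1 nodes are failing on revision")


def count_failure_lines(dump: str) -> int:
    """Count lines matching the pattern, like `grep PATTERN | wc -l`."""
    return sum(1 for line in dump.splitlines() if FAILURE_PATTERN.search(line))


def fix_looks_effective(wide_window_dump: str, recent_window_dump: str) -> bool:
    """Hypothetical helper: True when the wider window (which includes the
    period before the fix merged) still shows failures, while the recent
    window (entirely after the merge) shows none."""
    return (
        count_failure_lines(wide_window_dump) > 0
        and count_failure_lines(recent_window_dump) == 0
    )
```

A result of `True` only indicates that the pattern stopped appearing in the narrower window; it does not by itself prove that the PRs are the cause.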
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759