Bug 1856316
| Summary: | Installer fails because openshift-authentication never gets Available | | |
| --- | --- | --- | --- |
| Product: | OpenShift Container Platform | Reporter: | David Sanz <dsanzmor> |
| Component: | apiserver-auth | Assignee: | Standa Laznicka <slaznick> |
| Status: | CLOSED ERRATA | QA Contact: | pmali |
| Severity: | high | Priority: | high |
| Version: | 4.6 | Target Release: | 4.6.0 |
| Target Milestone: | --- | Keywords: | Reopened |
| Hardware: | Unspecified | OS: | Unspecified |
| CC: | anusaxen, aos-bugs, mfojtik, pasik, pmali, slaznick, wjiang, wsun, xxia, yunjiang | | |
| Last Closed: | 2020-10-27 16:13:56 UTC | Type: | Bug |
Description
David Sanz 2020-07-13 11:53:52 UTC
*** Bug 1856425 has been marked as a duplicate of this bug. ***

Changing the bug to QE status; it should be verified by QE before closing.

How did this pass CI if it prevents installation?

*** Bug 1856475 has been marked as a duplicate of this bug. ***

> How did this pass CI if it prevents installation?

It was a race condition, and I think we got very lucky in CI when testing the PR.
Reopening this bug, since I hit this problem again. (I'm not sure whether it is the same problem; let me know and I can open a new bug if this is a new issue.) The problem does not reproduce every time; I reproduced it 3 times out of 4.

LAST SEEN   TYPE      REASON             OBJECT                                         MESSAGE
<unknown>   Warning   FailedScheduling   pod/authentication-operator-7566665ccc-jv5c9   no nodes available to schedule pods
<unknown>   Warning   FailedScheduling   pod/authentication-operator-7566665ccc-jv5c9   no nodes available to schedule pods
<unknown>   Warning   FailedScheduling   pod/authentication-operator-7566665ccc-jv5c9   0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
<unknown>   Warning   FailedScheduling   pod/authentication-operator-7566665ccc-jv5c9   0/3 nodes are available: 3 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
<unknown>   Normal    Scheduled          pod/authentication-operator-7566665ccc-jv5c9   Successfully assigned openshift-authentication-operator/authentication-operator-7566665ccc-jv5c9 to ip-10-0-55-8.us-east-2.compute.internal
3h17m       Warning   FailedMount        pod/authentication-operator-7566665ccc-jv5c9   MountVolume.SetUp failed for volume "service-ca-bundle" : failed to sync configmap cache: timed out waiting for the condition
3h17m       Warning   FailedMount        pod/authentication-operator-7566665ccc-jv5c9   MountVolume.SetUp failed for volume "serving-cert" : failed to sync secret cache: timed out waiting for the condition
3h17m       Warning   FailedMount        pod/authentication-operator-7566665ccc-jv5c9   MountVolume.SetUp failed for volume "trusted-ca-bundle" : failed to sync configmap cache: timed out waiting for the condition
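For reference, the event listing above is the standard Kubernetes event view for the operator's namespace; a minimal sketch of how to pull the same data on a live cluster (the namespace and pod name are taken from the output above):

```shell
# List events in the authentication operator's namespace, oldest first,
# to see the FailedScheduling/FailedMount sequence in order.
oc get events -n openshift-authentication-operator --sort-by='.lastTimestamp'

# Describe the operator pod itself; the same events appear at the end of
# the output along with the pod's current conditions.
oc describe pod authentication-operator-7566665ccc-jv5c9 \
    -n openshift-authentication-operator
```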
level=error msg="Cluster operator kube-apiserver Degraded is True with NodeInstaller_InstallerPodFailed::StaticPods_Error: StaticPodsDegraded: pod/kube-apiserver-ip-10-0-76-192.us-east-2.compute.internal container \"kube-apiserver-check-endpoints\" is not ready: CrashLoopBackOff: back-off 5m0s restarting failed container=kube-apiserver-check-endpoints pod=kube-apiserver-ip-10-0-76-192.us-east-2.compute.internal_openshift-kube-apiserver(4524d59004962035e5c196e1396bc8f1)\nStaticPodsDegraded: pod/kube-apiserver-ip-10-0-76-192.us-east-2.compute.internal container \"kube-apiserver-check-endpoints\" is waiting: CrashLoopBackOff: back-off 5m0s restarting failed container=kube-apiserver-check-endpoints pod=kube-apiserver-ip-10-0-76-192.us-east-2.compute.internal_openshift-kube-apiserver(4524d59004962035e5c196e1396bc8f1)\nStaticPodsDegraded: pod/kube-apiserver-ip-10-0-58-64.us-east-2.compute.internal container \"kube-apiserver-check-endpoints\" is not ready: CrashLoopBackOff: back-off 5m0s restarting failed container=kube-apiserver-check-endpoints pod=kube-apiserver-ip-10-0-58-64.us-east-2.compute.internal_openshift-kube-apiserver(6cbbceab8ec9e144f33a7b1c41343b1c)\nStaticPodsDegraded: pod/kube-apiserver-ip-10-0-58-64.us-east-2.compute.internal container \"kube-apiserver-check-endpoints\" is waiting: CrashLoopBackOff: back-off 5m0s restarting failed container=kube-apiserver-check-endpoints pod=kube-apiserver-ip-10-0-58-64.us-east-2.compute.internal_openshift-kube-apiserver(6cbbceab8ec9e144f33a7b1c41343b1c)\nStaticPodsDegraded: pods \"kube-apiserver-ip-10-0-55-8.us-east-2.compute.internal\" not found\nNodeInstallerDegraded: 1 nodes are failing on revision 2:\nNodeInstallerDegraded: static pod of revision 2 has been installed, but is not ready while new revision 3 is pending" level=info msg="Cluster operator kube-apiserver Progressing is True with NodeInstaller: NodeInstallerProgressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 3" level=info msg="Cluster operator kube-apiserver Available is False with StaticPods_ZeroNodesActive: StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 3" level=error msg="Cluster operator kube-controller-manager Degraded is True with NodeInstaller_InstallerPodFailed: NodeInstallerDegraded: 1 nodes are failing on revision 3:\nNodeInstallerDegraded: static pod of revision 3 has been installed, but is not ready while new revision 4 is pending; 1 nodes are failing on revision 4:\nNodeInstallerDegraded: " level=info msg="Cluster operator kube-controller-manager Progressing is True with NodeInstaller: NodeInstallerProgressing: 2 nodes are at revision 0; 1 nodes are at revision 4; 0 nodes have achieved new revision 5" level=info msg="Cluster operator kube-storage-version-migrator Available is False with _NoMigratorPod: Available: deployment/migrator.openshift-kube-storage-version-migrator: no replicas are available" level=info msg="Cluster operator network Progressing is True with Deploying: DaemonSet \"openshift-multus/network-metrics-daemon\" is waiting for other operators to become ready" level=info msg="Cluster operator openshift-apiserver Available is False with APIServices_PreconditionNotReady: APIServicesAvailable: PreconditionNotReady" level=info msg="Cluster operator openshift-controller-manager Progressing is True with _DesiredStateNotYetAchieved: Progressing: daemonset/controller-manager: observed generation is 0, desired generation is 10.\nProgressing: daemonset/controller-manager: number available 
That looks like a different problem, more related to ingress than authentication.

Changing back to VERIFIED status, since https://bugzilla.redhat.com/show_bug.cgi?id=1856316#c10 describes a different issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196

*** Bug 1892187 has been marked as a duplicate of this bug. ***