Moving this to 4.5 since it hasn't reproduced in a week. If you hit it again, please provide the details and move it back to 4.4.
Today I hit a failure that, after analysis, appears to be the same issue: tried an IPI-on-GCP installation of 4.5.0-0.nightly-2020-03-17-011909 via https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Launch%20Environment%20Flexy/85156/ (where the kubeconfig link is shown). The installation failed; see https://openshift-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/Launch%20Environment%20Flexy/85156/console :

...
E0317 03:09:27.807942 561 reflector.go:153] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to list *v1.ConfigMap: Get https://api.xxia0317-3.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dbootstrap&limit=500&resourceVersion=0: dial tcp 146.148.47.225:6443: i/o timeout
level=error msg="Cluster operator authentication Degraded is True with IngressStateEndpoints_MissingSubsets::RouterCerts_NoRouterCertSecret: RouterCertsDegraded: secret/v4-0-config-system-router-certs -n openshift-authentication: could not be retrieved: secret \"v4-0-config-system-router-certs\" not found\nIngressStateEndpointsDegraded: No subsets found for the endpoints of oauth-server"
level=info msg="Cluster operator authentication Progressing is Unknown with NoData: "
level=info msg="Cluster operator authentication Available is Unknown with NoData: "
level=info msg="Cluster operator dns Progressing is True with Reconciling: Not all DNS DaemonSets available."
level=error msg="Cluster operator etcd Degraded is True with InstallerPodContainerWaiting_ContainerCreating::InstallerPodNetworking_FailedCreatePodSandBox::StaticPods_Error: InstallerPodContainerWaitingDegraded: Pod \"installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal\" on node \"xxia03-cqhk8-m-0.c.openshift-qe.internal\" container \"installer\" is waiting for 38m56.672639188s because \"\"\nInstallerPodNetworkingDegraded: Pod \"installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal\" on node \"xxia03-cqhk8-m-0.c.openshift-qe.internal\" observed degraded networking: (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal_openshift-etcd_1c23413d-8db3-4344-935d-8c7e5d9e91fc_0(03c858bd0e2c4d0299ecd9e2118f71b5a5f498139033f85debef6f77bf747645): netplugin failed with no error message\nStaticPodsDegraded: pods \"etcd-xxia03-cqhk8-m-0.c.openshift-qe.internal\" not found\nStaticPodsDegraded: pods \"etcd-xxia03-cqhk8-m-2.c.openshift-qe.internal\" not found\nStaticPodsDegraded: pods \"etcd-xxia03-cqhk8-m-1.c.openshift-qe.internal\" not found"
level=info msg="Cluster operator etcd Progressing is True with NodeInstaller: NodeInstallerProgressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 2"
level=info msg="Cluster operator etcd Available is False with StaticPods_ZeroNodesActive: StaticPodsAvailable: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 2"
level=error msg="Cluster operator kube-apiserver Degraded is True with InstallerPodContainerWaiting_ContainerCreating::InstallerPodNetworking_FailedCreatePodSandBox::StaticPods_Error: InstallerPodContainerWaitingDegraded: Pod \"installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal\" on node \"xxia03-cqhk8-m-0.c.openshift-qe.internal\" container \"installer\" is waiting for 36m45.196622777s because \"\"\nInstallerPodNetworkingDegraded: Pod \"installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal\" on node \"xxia03-cqhk8-m-0.c.openshift-qe.internal\" observed degraded networking: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal_openshift-kube-apiserver_4214c57b-231d-409f-8edc-3b11c909e8f5_0(37893ee187d0bfcab4944d58caebafa140990b06bb7ae6520d80120b517308a9): netplugin failed with no error message\nStaticPodsDegraded: pods \"kube-apiserver-xxia03-cqhk8-m-0.c.openshift-qe.internal\" not found\nStaticPodsDegraded: pods \"kube-apiserver-xxia03-cqhk8-m-2.c.openshift-qe.internal\" not found\nStaticPodsDegraded: pods \"kube-apiserver-xxia03-cqhk8-m-1.c.openshift-qe.internal\" not found"
...

The key error above is "installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal ... failed to create pod network sandbox ... netplugin failed with no error message".

# oc get no    # no workers are up yet
NAME                                       STATUS   ROLES    AGE    VERSION
xxia03-cqhk8-m-0.c.openshift-qe.internal   Ready    master   7h4m   v1.17.1
xxia03-cqhk8-m-1.c.openshift-qe.internal   Ready    master   7h3m   v1.17.1
xxia03-cqhk8-m-2.c.openshift-qe.internal   Ready    master   7h3m   v1.17.1

# oc get po -n openshift-kube-apiserver    # only shows this one pod
NAME                                                   READY   STATUS              RESTARTS   AGE
installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal   0/1     ContainerCreating   0          6h59m

# oc describe po installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal -n openshift-kube-apiserver
  Warning  FailedCreatePodSandBox  102s (x1461 over 6h44m)  kubelet, xxia03-cqhk8-m-0.c.openshift-qe.internal  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-2-xxia03-cqhk8-m-0.c.openshift-qe.internal_openshift-kube-apiserver_4214c57b-231d-409f-8edc-3b11c909e8f5_0(35073fac92a453debb270e8fdafe7c67324ab1e38e94c3f183971a5248c7c6a6): netplugin failed with no error message

# oc get po -n openshift-sdn -o wide
ovs-cfhzt   1/1   Running   0   6h54m   10.0.0.4   xxia03-cqhk8-m-2.c.openshift-qe.internal   <none>   <none>
ovs-sxkkr   1/1   Running   0   6h54m   10.0.0.3   xxia03-cqhk8-m-0.c.openshift-qe.internal   <none>   <none>
ovs-tlmkl   1/1   Running   0   6h54m   10.0.0.5   xxia03-cqhk8-m-1.c.openshift-qe.internal   <none>   <none>

# oc logs -n openshift-sdn ovs-sxkkr > ovs-sxkkr.log
# vi ovs-sxkkr.log    # many errors like the following
...
2020-03-17T09:24:30.395Z|04141|jsonrpc|WARN|unix#27609: receive error: Connection reset by peer
2020-03-17T09:24:30.395Z|04142|reconnect|WARN|unix#27609: connection dropped (Connection reset by peer)
...
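For completeness, below is a rough sketch of the extra data I would collect the next time this reproduces, so the SDN team can see what the CNI plugin and CRI-O were doing when sandbox creation failed. The node name is the failing master from this report; the placeholder <sdn-pod-on-m-0> and the output file names are just illustrations, and these are generic oc/crictl triage commands rather than steps confirmed against this particular cluster.

# full must-gather for the SDN team (assuming the default must-gather image is enough here)
# oc adm must-gather --dest-dir=./must-gather-netplugin

# SDN and OVS pod logs from the affected master (ovs-sxkkr is the pod observed above)
# oc logs -n openshift-sdn ovs-sxkkr > ovs-sxkkr.log
# oc get po -n openshift-sdn -o wide | grep xxia03-cqhk8-m-0    # find the sdn-* pod on the same node
# oc logs -n openshift-sdn <sdn-pod-on-m-0> -c sdn > sdn-m-0.log

# CRI-O and kubelet journals from the node, which should show why "netplugin failed with no error message"
# oc adm node-logs xxia03-cqhk8-m-0.c.openshift-qe.internal -u crio > crio-m-0.log
# oc adm node-logs xxia03-cqhk8-m-0.c.openshift-qe.internal -u kubelet > kubelet-m-0.log

# sandbox state on the node itself
# oc debug node/xxia03-cqhk8-m-0.c.openshift-qe.internal -- chroot /host crictl pods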
Tried many times and this issue was not reproduced in 4.5 (4.5.0-0.nightly-2020-04-21-103613). Verified this bug.
Created attachment 1696486 [details] ovs-pods-logs
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409