+++ This bug was initially created as a clone of Bug #1876091 +++

Description of problem:

Error: Failed to evict container: "": Failed to find container "etcd-signer" in state: no container with name or ID etcd-signer found: no such container

Version-Release number of the following components:

4.5.7
vSphere 6.7U3

How reproducible:

Unsure

Steps to Reproduce:
1. Following disconnected installation instructions on vSphere [1]
2. Bootstrap node fails with the error message in the results below
3.

[1] https://docs.openshift.com/container-platform/4.5/installing/installing_vsphere/installing-restricted-networks-vsphere.html#installation-initializing-manual_installing-restricted-networks-vsphere

Actual results:

[core@bootstrap ~]$ journalctl -b -f -u release-image.service -u bootkube.service
-- Logs begin at Fri 2020-09-04 19:24:26 UTC. --
[...]
Sep 04 20:29:10 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Skipped "secret-kube-apiserver-to-kubelet-signer.yaml" secrets.v1./kube-apiserver-to-kubelet-signer -n openshift-kube-apiserver-operator as it already exists
Sep 04 20:29:11 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Skipped "secret-loadbalancer-serving-signer.yaml" secrets.v1./loadbalancer-serving-signer -n openshift-kube-apiserver-operator as it already exists
Sep 04 20:29:11 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Skipped "secret-localhost-serving-signer.yaml" secrets.v1./localhost-serving-signer -n openshift-kube-apiserver-operator as it already exists
Sep 04 20:29:11 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Skipped "secret-service-network-serving-signer.yaml" secrets.v1./service-network-serving-signer -n openshift-kube-apiserver-operator as it already exists
Sep 04 20:29:21 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: E0904 20:29:21.735288 1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=3, ErrCode=NO_ERROR, debug=""
Sep 04 20:29:21 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: E0904 20:29:21.759732 1 reflector.go:251] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to watch *v1.Pod: Get https://localhost:6443/api/v1/pods?watch=true: dial tcp [::1]:6443: connect: connection refused
Sep 04 20:29:22 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: E0904 20:29:22.761579 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get https://localhost:6443/api/v1/pods: dial tcp [::1]:6443: connect: connection refused
Sep 04 20:29:23 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: E0904 20:29:23.763460 1 reflector.go:134] github.com/openshift/cluster-bootstrap/pkg/start/status.go:66: Failed to list *v1.Pod: Get https://localhost:6443/api/v1/pods: dial tcp [::1]:6443: connect: connection refused
Sep 04 20:48:37 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Error: error while checking pod status: timed out waiting for the condition
Sep 04 20:48:37 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Tearing down temporary bootstrap control plane...
Sep 04 20:48:37 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Error: error while checking pod status: timed out waiting for the condition
Sep 04 20:48:38 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Error: Failed to evict container: "": Failed to find container "etcd-signer" in state: no container with name or ID etcd-signer found: no such container
Sep 04 20:48:38 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
Sep 04 20:48:38 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: bootkube.service: Failed with result 'exit-code'.
Sep 04 20:48:43 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: bootkube.service: Service RestartSec=5s expired, scheduling restart.
Sep 04 20:48:43 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 4.
Sep 04 20:48:43 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: Stopped Bootstrap a Kubernetes cluster.
Sep 04 20:48:43 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: Started Bootstrap a Kubernetes cluster.
Sep 04 20:49:00 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: Starting etcd certificate signer...
Sep 04 20:49:01 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: 6a59a3e4c6a4a93d53756df48b49fea9e64149c059f105101d3b6262aabd9ac2
Sep 04 20:49:02 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: https://localhost:2379 is healthy: successfully committed proposal: took = 15.98495ms
Sep 04 20:49:02 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: etcd cluster up. Killing etcd certificate signer...
Sep 04 20:49:02 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: 6a59a3e4c6a4a93d53756df48b49fea9e64149c059f105101d3b6262aabd9ac2
Sep 04 20:49:02 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: Starting cluster-bootstrap...
Sep 04 20:49:03 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: Starting temporary bootstrap control plane...
Sep 04 20:49:03 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: Skipped "0000_00_cluster-version-operator_00_namespace.yaml" namespaces.v1./openshift-cluster-version -n as it already exists
Sep 04 20:49:03 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[33924]: Skipped "0000_00_cluster-version-operator_01_clusteroperator.crd.yaml" customresourcedefinitions.v1beta1.apiextensions.k8s.io/clusteroperators.config.openshift.io -n as it already exists
[...]
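
For triage, the journal output above can be reduced to the bootkube.sh "Error:" lines and the systemd restart counter. The following is only a rough sketch: the regular expression assumes the default short journalctl line format shown in this report, the names `LINE_RE`, `parse` and `summarize` are made up for illustration, and the sample lines are copied from the log above.

```python
import re

# Sample lines copied from the journal output in this report.
SAMPLE = """\
Sep 04 20:48:37 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Error: error while checking pod status: timed out waiting for the condition
Sep 04 20:48:37 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Tearing down temporary bootstrap control plane...
Sep 04 20:48:38 bootstrap.discocp4.lab.msp.redhat.com bootkube.sh[26740]: Error: Failed to evict container: "": Failed to find container "etcd-signer" in state: no container with name or ID etcd-signer found: no such container
Sep 04 20:48:43 bootstrap.discocp4.lab.msp.redhat.com systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 4.
"""

# Assumed layout: "<Mon DD HH:MM:SS> <host> <unit>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) (\S+) ([\w.\-]+)\[(\d+)\]: (.*)$"
)


def parse(line):
    """Split one journal line into fields; return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts, host, unit, pid, message = m.groups()
    return {"time": ts, "host": host, "unit": unit,
            "pid": int(pid), "message": message}


def summarize(text):
    """Collect bootkube.sh 'Error:' entries and the last restart counter seen."""
    errors = []
    restarts = None
    for line in text.splitlines():
        entry = parse(line)
        if entry is None:
            continue  # skip "[...]", blank lines, journal banners
        if entry["unit"] == "bootkube.sh" and entry["message"].startswith("Error:"):
            errors.append(entry)
        counter = re.search(r"restart counter is at (\d+)", entry["message"])
        if counter:
            restarts = int(counter.group(1))
    return errors, restarts


if __name__ == "__main__":
    errs, restarts = summarize(SAMPLE)
    for e in errs:
        print(e["time"], e["message"])
    print("restart counter:", restarts)
```

Feeding it the full `journalctl -u bootkube.service` output instead of `SAMPLE` should work the same way, as long as the line format matches the assumption above.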

[root@bootstrap ~]# crictl pods
POD ID         CREATED         STATE     NAME                                                                      NAMESPACE                            ATTEMPT
96913e0cd7de9  11 minutes ago  Ready     bootstrap-kube-apiserver-bootstrap.discocp4.lab.msp.redhat.com            kube-system                          1
71ce0c944231d  11 minutes ago  Ready     bootstrap-kube-scheduler-bootstrap.discocp4.lab.msp.redhat.com            kube-system                          1
2518d5fbaf566  11 minutes ago  Ready     bootstrap-kube-controller-manager-bootstrap.discocp4.lab.msp.redhat.com   kube-system                          1
560e173e96a84  11 minutes ago  Ready     bootstrap-cluster-version-operator-bootstrap.discocp4.lab.msp.redhat.com  openshift-cluster-version            1
1f59d9b34619f  11 minutes ago  Ready     cloud-credential-operator-bootstrap.discocp4.lab.msp.redhat.com           openshift-cloud-credential-operator  1
bf512c1a3af90  32 minutes ago  NotReady  bootstrap-kube-scheduler-bootstrap.discocp4.lab.msp.redhat.com            kube-system                          0
6887e35e74929  32 minutes ago  NotReady  bootstrap-kube-controller-manager-bootstrap.discocp4.lab.msp.redhat.com   kube-system                          0
dcc9121a53df1  32 minutes ago  NotReady  bootstrap-kube-apiserver-bootstrap.discocp4.lab.msp.redhat.com            kube-system                          0
c278592bc9851  32 minutes ago  NotReady  bootstrap-cluster-version-operator-bootstrap.discocp4.lab.msp.redhat.com  openshift-cluster-version            0
705aca2b677d9  32 minutes ago  Ready     bootstrap-machine-config-operator-bootstrap.discocp4.lab.msp.redhat.com   default                              0
f849fb2f7034b  33 minutes ago  Ready     etcd-bootstrap-member-bootstrap.discocp4.lab.msp.redhat.com               openshift-etcd                       0

[root@bootstrap ~]# crictl images
IMAGE                                                  TAG     IMAGE ID       SIZE
quay.io/openshift-release-dev/ocp-release@sha256       <none>  e7c443017e821  306MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  790b38ec6f81b  307MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  f67097361498f  283MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  3fcd563edad3b  255MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  b0d508e56910d  305MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  7e44a17a2951a  282MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  c5072ae56904b  308MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  0c893df5a716e  308MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  793d4a1e7161c  305MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  eaff45a171adb  307MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  d1bb18c7027ae  432MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  5afa4eae3d651  311MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  d1eec47fd97e5  326MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  32b54e50bc4bc  288MB
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256  <none>  d8375a61d36e3  674MB

Expected results:

Installation should run through and/or give us some details on why it's failing

Additional info:

Install config is as follows:

[root@tatooine ocp45]# cat install-config.yaml
apiVersion: v1
baseDomain: lab.msp.redhat.com
compute:
- hyperthreading: Disabled
  name: worker
  replicas: 2
controlPlane:
  hyperthreading: Disabled
  name: master
  replicas: 3
metadata:
  name: discocp4
platform:
  vsphere:
    vcenter: vcenter01.lab.msp.redhat.com
    username: ocp4
    password: OpenShift2020!
    datacenter: msp-lab
    defaultDatastore: storage03-iscsi-lun0
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
pullSecret: '{"auths": ...}'
sshKey: 'ssh-ed25519 AAAA...'
imageContentSources:
- mirrors:
  - registry.lab.msp.redhat.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.lab.msp.redhat.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

--- Additional comment from Scott Dodson on 2020-09-05 16:46:28 UTC ---

Please attach the log bundle generated from `openshift-install gather bootstrap` see --help if you're not familiar with the command.
When the installer failed it should've attempted to gather the bundle or emitted instructions to do so. That log bundle should be attached to any bug involving bootstrap failure.

--- Additional comment from Sam Yangsao on 2020-09-06 00:01:22 UTC ---

Log bundle attached.

--- Additional comment from Sam Yangsao on 2020-09-08 15:47:55 UTC ---

I was able to reproduce the issue again this morning, log bundle 2 attached from the bootstrap node.

--- Additional comment from Abhinav Dahiya on 2020-09-08 16:54:34 UTC ---

The etcd team creates the etcd signer, so I think they can help the best here.

Installed 4.5.0-0.nightly-2020-10-31-200727 with 19_Disconnected UPI on vSphere 7.0 with RHCOS and did not hit this issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5.18 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4425