Description of problem:
OpenShift install fails:
1. Machine Config Operator not running
2. Masters/control plane keep retrying to fetch their config from the Machine Config Operator, never completing boot
3. Bootstrap waits on etcd, and fails
4. Overall install fails

```
Jun 15 06:19:49 bootstrap-0 hyperkube[1375]: E0615 06:19:49.270441 1375 file.go:108] Unable to process watch event: can't process config file "/etc/kubernetes/manifests/machineconfigoperator-bootstrap-pod.yaml": /etc/kubernetes/manifests/machineconfigoperator-bootstrap-pod.yaml: couldn't parse as pod(Object 'Kind' is missing in 'null'), please check config file.
Jun 15 06:19:49 bootstrap-0 hyperkube[1375]: I0615 06:19:49.364262 1375 kubelet.go:1915] SyncLoop (ADD, "file"): "bootstrap-machine-config-operator-bootstrap-0_default(eb26c1cc7cd8e6521dfc95d1a59cd87f)"
Jun 15 06:19:49 bootstrap-0 hyperkube[1375]: W0615 06:19:49.364359 1375 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(eb26c1cc7cd8e6521dfc95d1a59cd87f) - node has conditions: [DiskPressure]
```

CRI-O issues (same install):

```
Jun 15 06:19:11 bootstrap-0 openshift.sh[1584]: kubectl create --filename ./99_binding-discovery.yaml failed. Retrying in 5 seconds...
Jun 15 06:19:11 bootstrap-0 systemd[1]: Stopping Open Container Initiative Daemon...
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507439 1375 controlbuf.go:382] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: W0615 06:19:11.507585 1375 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/run/crio/crio.sock: connect: no such file or directory". Reconnecting...
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507607 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0001cdc80, TRANSIENT_FAILURE
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507616 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0001cdc80, CONNECTING
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507624 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0001cdc80, TRANSIENT_FAILURE
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507650 1375 controlbuf.go:382] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: W0615 06:19:11.507709 1375 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/run/crio/crio.sock 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/run/crio/crio.sock: connect: no such file or directory". Reconnecting...
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507723 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00014a6e0, TRANSIENT_FAILURE
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507733 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00014a6e0, CONNECTING
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: I0615 06:19:11.507740 1375 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00014a6e0, TRANSIENT_FAILURE
Jun 15 06:19:11 bootstrap-0 systemd[1]: Stopped Open Container Initiative Daemon.
Jun 15 06:19:11 bootstrap-0 systemd[1]: Starting Open Container Initiative Daemon...
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: E0615 06:19:11.606854 1375 remote_runtime.go:173] ListPodSandbox with filter &PodSandboxFilter{Id:,State:&PodSandboxStateValue{State:SANDBOX_READY,},LabelSelector:map[string]string{},} from runtime service failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /var/run/crio/crio.sock: connect: no such file or directory"
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: E0615 06:19:11.607251 1375 kuberuntime_sandbox.go:210] ListPodSandbox failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /var/run/crio/crio.sock: connect: no such file or directory"
Jun 15 06:19:11 bootstrap-0 hyperkube[1375]: E0615 06:19:11.607315 1375 kubelet_pods.go:1022] Error listing containers: &status.statusError{Code:14, Message:"all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial unix /var/run/crio/crio.sock: connect: no such file or directory\"", Details:[]*any.Any(nil)}
Jun 15 06
```

Version-Release number of selected component (if applicable):

```
Installer:
openshift-install v4.1.1-201906040019-dirty
built from commit fb776038a1d90b2b83839ab5deb8579287972e11
release image quay.io/openshift-release-dev/ocp-release@sha256:e9415dbf80988553adc6c34740243805a21d92e3cdedeb2fd8d743ca56522a61
```

Environment:
1. Bare metal install - ESXi, PXE boot
2. dnsmasq providing IPs
3. Private ESXi network, with a RHEL control node providing NAT internet access
4. All nodes PXE boot and pull Ignition files from a web server

Full log from bootstrap: https://gist.github.com/glennswest/7828c2572feafd80b4d1541a2245a4ef

How reproducible:
Scripted; the same every time. Hard failure. (One week of countless attempts.)

Steps to Reproduce:
1. Create install-config.yaml per the bare metal doc
2. Use the latest RHCOS image and install components via PXE
3. Validate that the time is set correctly
4. Generate the Ignition configs each time in an empty directory

Actual results:
Failed install

Expected results:
Working cluster

Additional info:
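The services involved can be inspected directly on the bootstrap node; a minimal sketch using standard systemd/CRI tooling (the unit names match the logs above; `crictl` is assumed to be present on the RHCOS bootstrap image):

```
ssh core@bootstrap-0

# Follow the bootstrap control-plane rendering (the bootkube.sh output above):
journalctl -b -f -u bootkube.service

# The crio/kubelet errors quoted above come from these units:
journalctl -b -u crio.service -u kubelet.service

# Once crio's socket exists, list containers through the CRI:
sudo crictl ps -a
```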
Problem reproduced on 4.1.2:

```
Jun 15 18:50:15 bootstrap-0 bootkube.sh[1575]: Writing asset: /assets/kube-controller-manager-bootstrap/manifests/00_openshift-kube-controller-manager-operator-ns.yaml
Jun 15 18:50:32 bootstrap-0 hyperkube[1348]: E0615 18:50:32.388397 1348 file.go:108] Unable to process watch event: can't process config file "/etc/kubernetes/manifests/machineconfigoperator-bootstrap-pod.yaml": /etc/kubernetes/manifests/machineconfigoperator-bootstrap-pod.yaml: couldn't parse as pod(Object 'Kind' is missing in 'null'), please check config file.
Jun 15 18:50:32 bootstrap-0 hyperkube[1348]: I0615 18:50:32.477829 1348 kubelet.go:1915] SyncLoop (ADD, "file"): "bootstrap-machine-config-operator-bootstrap-0_default(eb26c1cc7cd8e6521dfc95d1a59cd87f)"
Jun 15 18:50:32 bootstrap-0 hyperkube[1348]: W0615 18:50:32.477923 1348 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(eb26c1cc7cd8e6521dfc95d1a59cd87f) - node has conditions: [DiskPressure]
```
The openshift-install binary for 4.1.2 is not updated to the 4.1.2 hash.
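For reference, the commit hash and pinned release image a given binary was built with can be checked with the installer's own version subcommand:

```
# Prints the build version, commit hash, and pinned release image:
./openshift-install version
```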
```
Jun 15 18:50:32 bootstrap-0 hyperkube[1348]: W0615 18:50:32.477923 1348 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(eb26c1cc7cd8e6521dfc95d1a59cd87f) - node has conditions: [DiskPressure]
```

The above says that the bootstrap node is under disk pressure; with that condition present, it seems expected to me that the pod is not admitted.
The node has lots of space:

```
[core@bootstrap-0 ~]$ df -h
Filesystem      Size  Used  Avail  Use%  Mounted on
devtmpfs        7.8G     0   7.8G    0%  /dev
tmpfs           7.9G     0   7.9G    0%  /dev/shm
tmpfs           7.9G  1.2M   7.9G    1%  /run
tmpfs           7.9G     0   7.9G    0%  /sys/fs/cgroup
/dev/sda3       319G  4.8G   315G    2%  /sysroot
/dev/sda2       976M   72M   838M    8%  /boot
tmpfs           1.6G     0   1.6G    0%  /run/user/1000
[core@bootstrap-0 ~]$
```

I don't see any disk pressure. I started at 200 GB (the doc says 160 GB is enough), then upgraded all nodes to 320 GB disks; it didn't change anything. They all have 16 GB of RAM as well.
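One thing worth ruling out (general kubelet behavior, not specific to this report): the DiskPressure condition is driven by the kubelet's nodefs signals, which cover free inodes as well as free bytes, so a healthy `df -h` alone does not disprove it:

```
# Check both byte and inode headroom on the root filesystem:
df -h /sysroot
df -i /sysroot
```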
There may be a race condition. While this does happen:

```
Jun 16 16:14:40 bootstrap-0 hyperkube[1388]: I0616 16:14:40.999242 1388 kubelet.go:1915] SyncLoop (ADD, "file"): "bootstrap-machine-config-operator-bootstrap-0_default(7b20d6bcdff316762adc6abcea4bda05)"
Jun 16 16:14:40 bootstrap-0 hyperkube[1388]: W0616 16:14:40.999310 1388 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(7b20d6bcdff316762adc6abcea4bda05) - node has conditions: [DiskPressure]
```

the partition is expanded at:

```
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: meta-data=/dev/sda3 isize=512 agcount=4, agsize=136768 blks
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: = sectsz=512 attr=2, projid32bit=1
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: = crc=1 finobt=1, sparse=1, rmapbt=0
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: = reflink=1
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: data = bsize=4096 blocks=547072, imaxpct=25
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: = sunit=0 swidth=0 blks
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: log =internal log bsize=4096 blocks=2560, version=2
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: = sectsz=512 sunit=0 blks, lazy-count=1
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: realtime =none extsz=4096 blocks=0, rtextents=0
Jun 16 16:13:00 bootstrap-0 coreos-growpart[1302]: data blocks changed from 547072 to 41680379
```

The node shows disk pressure from 16:12 through 16:18 only. Looking at it in a bit more detail:

```
Jun 16 16:12:50 bootstrap-0 hyperkube[1388]: I0616 16:12:50.705833 1388 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node bootstrap-0
Jun 16 16:12:50 bootstrap-0 hyperkube[1388]: I0616 16:12:50.799577 1388 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node bootstrap-0
Jun 16 16:12:50 bootstrap-0 hyperkube[1388]: I0616 16:12:50.809266 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:13:00 bootstrap-0 hyperkube[1388]: I0616 16:13:00.814215 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:13:10 bootstrap-0 hyperkube[1388]: I0616 16:13:10.821837 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:13:20 bootstrap-0 hyperkube[1388]: I0616 16:13:20.873645 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:13:40 bootstrap-0 hyperkube[1388]: I0616 16:13:40.729792 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:13:50 bootstrap-0 hyperkube[1388]: I0616 16:13:50.741057 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:00 bootstrap-0 hyperkube[1388]: I0616 16:14:00.751720 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:10 bootstrap-0 hyperkube[1388]: I0616 16:14:10.761797 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:23 bootstrap-0 hyperkube[1388]: I0616 16:14:23.026863 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:34 bootstrap-0 hyperkube[1388]: I0616 16:14:34.234750 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:40 bootstrap-0 hyperkube[1388]: W0616 16:14:40.999310 1388 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(7b20d6bcdff316762adc6abcea4bda05) - node has conditions: [DiskPressure]
Jun 16 16:14:44 bootstrap-0 hyperkube[1388]: I0616 16:14:44.297265 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:14:55 bootstrap-0 hyperkube[1388]: I0616 16:14:55.687715 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:05 bootstrap-0 hyperkube[1388]: I0616 16:15:05.698484 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:15 bootstrap-0 hyperkube[1388]: I0616 16:15:15.708075 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:25 bootstrap-0 hyperkube[1388]: I0616 16:15:25.717638 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:35 bootstrap-0 hyperkube[1388]: I0616 16:15:35.727809 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:45 bootstrap-0 hyperkube[1388]: I0616 16:15:45.737063 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:15:55 bootstrap-0 hyperkube[1388]: I0616 16:15:55.746290 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:05 bootstrap-0 hyperkube[1388]: I0616 16:16:05.755426 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:15 bootstrap-0 hyperkube[1388]: I0616 16:16:15.765174 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:25 bootstrap-0 hyperkube[1388]: I0616 16:16:25.775507 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:35 bootstrap-0 hyperkube[1388]: I0616 16:16:35.784761 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:45 bootstrap-0 hyperkube[1388]: I0616 16:16:45.794190 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:16:55 bootstrap-0 hyperkube[1388]: I0616 16:16:55.803434 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:05 bootstrap-0 hyperkube[1388]: I0616 16:17:05.812588 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:15 bootstrap-0 hyperkube[1388]: I0616 16:17:15.827270 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:25 bootstrap-0 hyperkube[1388]: I0616 16:17:25.835891 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:35 bootstrap-0 hyperkube[1388]: I0616 16:17:35.845064 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:45 bootstrap-0 hyperkube[1388]: I0616 16:17:45.854104 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:17:55 bootstrap-0 hyperkube[1388]: I0616 16:17:55.863340 1388 kubelet_node_status.go:446] Recording NodeHasDiskPressure event message for node bootstrap-0
Jun 16 16:18:05 bootstrap-0 hyperkube[1388]: I0616 16:18:05.879401 1388 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node bootstrap-0
Jun 16 16:18:15 bootstrap-0 hyperkube[1388]: I0616 16:18:15.889038 1388 kubelet_node_status.go:446] Recording NodeHasNoDiskPressure event message for node bootstrap-0
```
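That window lines up with the resize: before xfs_growfs runs, the root filesystem is only 547072 blocks x 4096 bytes, roughly 2.1 GiB, so the initial OS content can easily leave less than 10% free. Assuming the bootstrap kubelet runs with the stock Kubernetes 1.13 hard-eviction defaults (an assumption; I have not checked the rendered kubelet config), that is enough to raise DiskPressure until the resize finishes and the signal clears:

```
# Upstream kubelet hard-eviction defaults for Kubernetes 1.13 (assumed
# unchanged on the bootstrap node). On a ~2.1 GiB pre-resize rootfs,
# nodefs.available<10% is only ~210 MiB of headroom, so the initial OS
# content alone can trip DiskPressure:
--eviction-hard=memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%,imagefs.available<15%
```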
I believe there is some good evidence that this is a race condition. I tried reducing the vCPU count to 1, and I find that all the masters come up and the bootstrap machine config operator does seem to start in this case. I still don't have a working config, but it fails later in the process; this is consistent with a race, since with a single slow vCPU the kubelet comes up after the resize has already finished. Note that 1 vCPU is not a recommended config. The bootstrap log: https://gist.github.com/9d794b59d2b3f46696614ece0239ff4c I tried 4.1.0 and 4.1.1 and had similar results.
I tried increasing the vCPUs to 8 for all VMs, and hit the same problem:

```
Jun 17 06:07:51 bootstrap-0 bootkube.sh[1689]: Writing asset: /assets/kube-apiserver-bootstrap/manifests/00_openshift-kube-apiserver-operator-ns.yaml
Jun 17 06:07:58 bootstrap-0 bootkube.sh[1689]: Writing asset: /assets/kube-controller-manager-bootstrap/manifests/00_openshift-kube-controller-manager-operator-ns.yaml
Jun 17 06:08:15 bootstrap-0 hyperkube[1465]: I0617 06:08:15.391171 1465 kubelet.go:1915] SyncLoop (ADD, "file"): "bootstrap-machine-config-operator-bootstrap-0_default(7b20d6bcdff316762adc6abcea4bda05)"
Jun 17 06:08:15 bootstrap-0 hyperkube[1465]: W0617 06:08:15.391239 1465 eviction_manager.go:160] Failed to admit pod bootstrap-machine-config-operator-bootstrap-0_default(7b20d6bcdff316762adc6abcea4bda05) - node has conditions: [DiskPressure]
```

So only with 1 vCPU do I get any movement on the install, and 1 vCPU seems to cause problems later.
This doesn't seem to be an MCO issue as described, though; it is the kubelet/Kubernetes not admitting the MCO pod. I'll move this to Node; reassign it back if it's something related to the MCO specifically.
There is a race: the kubelet can start while coreos-growpart is starting or still running. The coreos-growpart service needs Type=oneshot so that the service is only marked as started after the main process has exited (i.e., after the resize event has completed).
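A minimal sketch of why the service type matters (illustration only, not the shipped unit; the actual unit appears in a later comment): with the default Type=simple, systemd considers a unit started as soon as its process forks, so anything ordered after it can race the resize; with Type=oneshot plus RemainAfterExit, the unit only counts as started once ExecStart has exited.

```
[Unit]
Description=Resize root partition and filesystem (illustrative sketch)
Before=kubelet.service

[Service]
# oneshot: the unit reaches "started" only after ExecStart exits,
# so Before=/After= ordering actually waits for the resize.
Type=oneshot
ExecStart=/usr/libexec/coreos-growpart /
# Keep the unit "active" afterwards so it is not re-triggered.
RemainAfterExit=yes
```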
I added the growpart service to the kubelet unit file here. After [1] merges and the tweak is made to coreos-growpart, the kubelet will wait for the resize event to complete before starting.

1. https://github.com/openshift/machine-config-operator/pull/861
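For reference, the kubelet-side variant is a small drop-in; this is a hypothetical sketch of that approach (the merged fix went the other way, ordering coreos-growpart.service Before=kubelet.service instead):

```
# /etc/systemd/system/kubelet.service.d/10-wait-for-growpart.conf (hypothetical)
[Unit]
# Pull in the resize unit and order the kubelet after it has finished:
Wants=coreos-growpart.service
After=coreos-growpart.service
```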
I closed my PR, which is not needed with Colin's fix.
> when will an image be available for test?

After https://bugzilla.redhat.com/show_bug.cgi?id=1720872#c13 merges. Hmm, this is about the bootstrap machine, so it will also require an update to the installer and newly published bootimages on the mirrors.
The changes are in 4.1.0-0.nightly-2019-06-19-033215:

```
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-06-19-033215   True        False         3h6m    Cluster version is 4.1.0-0.nightly-2019-06-19-033215

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-131-47.us-west-2.compute.internal    Ready    worker   3h12m   v1.13.4+9252851b0
ip-10-0-141-41.us-west-2.compute.internal    Ready    master   3h17m   v1.13.4+9252851b0
ip-10-0-146-21.us-west-2.compute.internal    Ready    master   3h17m   v1.13.4+9252851b0
ip-10-0-154-37.us-west-2.compute.internal    Ready    worker   3h12m   v1.13.4+9252851b0
ip-10-0-168-93.us-west-2.compute.internal    Ready    worker   3h11m   v1.13.4+9252851b0
ip-10-0-174-123.us-west-2.compute.internal   Ready    master   3h17m   v1.13.4+9252851b0

$ oc debug node/ip-10-0-131-47.us-west-2.compute.internal
Starting pod/ip-10-0-131-47us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl cat coreos-growpart.service
# /usr/lib/systemd/system/coreos-growpart.service
[Unit]
ConditionPathExists=!/var/lib/coreos-growpart.stamp
Before=sshd.service kubelet.service

[Service]
Type=oneshot
ExecStart=/usr/libexec/coreos-growpart /
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Glenn, could you confirm that the changes fix the issue you encountered? (I don't have access to the necessary bare metal resources to confirm.) After you confirm, I'll move this to VERIFIED.
Short answer: on the first try, it does appear to resolve the problem. I also accidentally included the workers, and that worked as well. Is that supposed to work now? (Big win for me.) Authentication seems to be not OK yet, but I have not done any of the steps after the wait for completion, and I believe there are some additional steps. I will rerun this several times to make sure it is rock solid. Also, I didn't change the installer, so it's:

```
openshift-install unreleased-master-995-gc6517384e71e5f09931c4da5e772fdec225d02ec-dirty
built from commit c6517384e71e5f09931c4da5e772fdec225d02ec
release image quay.io/openshift-release-dev/ocp-release@sha256:9c5f0df8b192a0d7b46cd5f6a4da2289c155fd5302dec7954f8f06c878160b8b
```

"Create Project: tiny"

```
DEBUG OpenShift Installer unreleased-master-995-gc6517384e71e5f09931c4da5e772fdec225d02ec-dirty
DEBUG Built from commit c6517384e71e5f09931c4da5e772fdec225d02ec
INFO Waiting up to 30m0s for the Kubernetes API at https://api.tiny.k.lo:6443...
DEBUG Still waiting for the Kubernetes API: Get https://api.tiny.k.lo:6443/version?timeout=32s: dial tcp 10.100.1.30:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: Get https://api.tiny.k.lo:6443/version?timeout=32s: dial tcp 10.100.1.30:6443: connect: connection refused
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: the server could not find the requested resource
DEBUG Still waiting for the Kubernetes API: Get https://api.tiny.k.lo:6443/version?timeout=32s: dial tcp 10.100.1.31:6443: connect: connection refused
INFO API v1.13.4+d4417a7 up
INFO Waiting up to 30m0s for bootstrapping to complete...
```
```
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
```

```
[root@ctl ocp4wlab]# ./clusteroperator.sh
NAME                                 VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                                 Unknown     Unknown       False      33s
cloud-credential                     4.1.2     True        False         False      4m42s
cluster-autoscaler                   4.1.2     True        False         False      4m44s
dns                                  4.1.2     True        False         False      3m48s
kube-apiserver                       4.1.2     True        True          True       115s
kube-controller-manager              4.1.2     True        True          False      111s
kube-scheduler                       4.1.2     True        True          False      2m5s
machine-api                          4.1.2     True        False         False      4m37s
machine-config                       4.1.2     True        False         False      3m4s
network                              4.1.2     True        False         False      4m26s
node-tuning                          4.1.2     True        False         False      24s
openshift-apiserver                  4.1.2     True        False         False      38s
openshift-controller-manager         4.1.2     True        False         False      3m48s
operator-lifecycle-manager           4.1.2     True        True          False      3m8s
operator-lifecycle-manager-catalog   4.1.2     True        True          False      3m9s
service-ca                           4.1.2     True        False         False      4m29s
service-catalog-apiserver            4.1.2     True        False         False      32s
service-catalog-controller-manager   4.1.2     True        False         False      33s
```

```
[root@ctl ocp4wlab]# ./clusteroperatorstatus.sh
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:59:15Z"
    generation: 1
    name: authentication
    resourceVersion: "8786"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/authentication
    uid: 2c430e6f-93c8-11e9-b6fb-0050561f3131
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      message: 'Degraded: failed handling the route: route has no host: &v1.Route{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"oauth-openshift", GenerateName:"", Namespace:"openshift-authentication", SelfLink:"/apis/route.openshift.io/v1/namespaces/openshift-authentication/routes/oauth-openshift", UID:"2c787efd-93c8-11e9-a434-0a580a80000e", ResourceVersion:"6572", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63696679156, loc:(*time.Location)(0x2b32340)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"app":"oauth-openshift"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.RouteSpec{Host:"", Subdomain:"", Path:"", To:v1.RouteTargetReference{Kind:"Service", Name:"oauth-openshift", Weight:(*int32)(0xc42077555c)}, AlternateBackends:[]v1.RouteTargetReference(nil), Port:(*v1.RoutePort)(0xc420833920), TLS:(*v1.TLSConfig)(0xc421278c60), WildcardPolicy:"None"}, Status:v1.RouteStatus{Ingress:[]v1.RouteIngress(nil)}}'
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: NoData
      status: Unknown
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: NoData
      status: Unknown
      type: Available
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: NoData
      status: Unknown
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: authentications
    - group: config.openshift.io
      name: cluster
      resource: authentications
    - group: config.openshift.io
      name: cluster
      resource: infrastructures
    - group: config.openshift.io
      name: cluster
      resource: oauths
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-authentication
      resource: namespaces
    - group: ""
      name: authentication-operator
      resource: namespaces
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:06Z"
    generation: 1
    name: cloud-credential
    resourceVersion: "2283"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/cloud-credential
    uid: 97dce012-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:55:06Z"
      message: No credentials requests reporting errors.
      reason: NoCredentialsFailing
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:55:11Z"
      message: 4 of 4 credentials requests provisioned and reconciled.
      reason: ReconcilingComplete
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:06Z"
      status: "True"
      type: Available
    extension: null
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:04Z"
    generation: 1
    name: cluster-autoscaler
    resourceVersion: "2938"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/cluster-autoscaler
    uid: 9649c8ad-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:55:04Z"
      message: at version 4.1.2
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:20Z"
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:20Z"
      status: "False"
      type: Degraded
    extension: null
    relatedObjects:
    - group: ""
      name: openshift-machine-api
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:59:49Z"
    generation: 1
    name: console
    resourceVersion: "8216"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/console
    uid: 40650cb4-93c8-11e9-b6fb-0050561f3131
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:51Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:59:51Z"
      message: "Progressing: route: waiting on route host\nProgressing: "
      reason: ProgressingSyncLoopProgressing
      status: "True"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:50Z"
      reason: NoData
      status: Unknown
      type: Available
    - lastTransitionTime: "2019-06-21T01:59:50Z"
      reason: AsExpected
      status: "True"
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: consoles
    - group: config.openshift.io
      name: cluster
      resource: consoles
    - group: config.openshift.io
      name: cluster
      resource: infrastructures
    - group: oauth.openshift.io
      name: console
      resource: oauthclients
    - group: ""
      name: openshift-console-operator
      resource: namespaces
    - group: ""
      name: openshift-console
      resource: namespaces
    - group: ""
      name: console-public
      namespace: openshift-config-managed
      resource: configmaps
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:38Z"
    generation: 1
    name: dns
    resourceVersion: "4107"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/dns
    uid: aa8eaf79-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:56:00Z"
      message: All desired DNS DaemonSets available and operand Namespace exists
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:56:02Z"
      message: Desired and available number of DNS DaemonSets are equal
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:56:00Z"
      message: At least 1 DNS DaemonSet available
      reason: AsExpected
      status: "True"
      type: Available
    extension: null
    relatedObjects:
    - group: ""
      name: openshift-dns-operator
      resource: namespaces
    - group: ""
      name: openshift-dns
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
    - name: coredns
      version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7e547e33db54dbcc33f72a556e739c4f9f0961098099dec6180398b4f0de03f5
    - name: openshift-cli
      version: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:403bc3725949c3d507065ababc37cf35c44b680441930dc8dfc48263aa3a9a61
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:12Z"
    generation: 1
    name: kube-apiserver
    resourceVersion: "8534"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/kube-apiserver
    uid: 9b4a6baf-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:56:35Z"
      message: 'StaticPodsDegraded: pods "kube-apiserver-control-plane-0" not found'
      reason: StaticPodsDegradedError
      status: "True"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:55:17Z"
      message: 'Progressing: 1 nodes are at revision 0; 1 nodes are at revision 2; 1 nodes are at revision 3'
      reason: Progressing
      status: "True"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:57:53Z"
      message: 'Available: 2 nodes are active; 1 nodes are at revision 0; 1 nodes are at revision 2; 1 nodes are at revision 3'
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:12Z"
      reason: AsExpected
      status: "True"
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: kubeapiservers
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-kube-apiserver-operator
      resource: namespaces
    - group: ""
      name: openshift-kube-apiserver
      resource: namespaces
    versions:
    - name: raw-internal
      version: 4.1.2
    - name: kube-apiserver
      version: 1.13.4
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:12Z"
    generation: 1
    name: kube-controller-manager
    resourceVersion: "8485"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/kube-controller-manager
    uid: 9b4c9fc6-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:58:48Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T02:00:00Z"
      message: 'Progressing: 3 nodes are at revision 3'
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:57:57Z"
      message: 'Available: 3 nodes are active; 3 nodes are at revision 3'
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:13Z"
      reason: AsExpected
      status: "True"
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: kubecontrollermanagers
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-kube-controller-manager
      resource: namespaces
    - group: ""
      name: openshift-kube-controller-manager-operator
      resource: namespaces
    versions:
    - name: raw-internal
      version: 4.1.2
    - name: kube-controller-manager
      version: 1.13.4
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:12Z"
    generation: 1
    name: kube-scheduler
    resourceVersion: "7764"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/kube-scheduler
    uid: 9b571971-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:34Z"
      message: 'StaticPodsDegraded: nodes/control-plane-0 pods/openshift-kube-scheduler-control-plane-0 container="scheduler" is not ready'
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:55:15Z"
      message: 'Progressing: 1 nodes are at revision 0; 2 nodes are at revision 4'
      reason: Progressing
      status: "True"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:57:43Z"
      message: 'Available: 2 nodes are active; 1 nodes are at revision 0; 2 nodes are at revision 4'
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:13Z"
      reason: AsExpected
      status: "True"
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: kubeschedulers
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-kube-scheduler
      resource: namespaces
    - group: ""
      name: openshift-kube-scheduler-operator
      resource: namespaces
    versions:
    - name: raw-internal
      version: 4.1.2
    - name: kube-scheduler
      version: 1.13.4
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:10Z"
    generation: 1
    name: machine-api
    resourceVersion: "6057"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/machine-api
    uid: 99dab2fe-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:55:12Z"
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:11Z"
      message: 'Cluster Machine API Operator is available at operator: 4.1.2'
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:11Z"
      status: "False"
      type: Degraded
    extension: null
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:17Z"
    generation: 1
    name: machine-config
    resourceVersion: "4545"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/machine-config
    uid: 9e7af8d1-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:56:44Z"
      message: Cluster has deployed 4.1.2
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:56:44Z"
      message: Cluster version is 4.1.2
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:17Z"
      status: "False"
      type: Degraded
    extension:
      master: all 3 nodes are at latest configuration rendered-master-359b2349a98d06ac7ed7d336fa6870fb
      worker: all 2 nodes are at latest configuration rendered-worker-fde4eb7c38bfdb2e6e1144b58b25d4b1
    relatedObjects:
    - group: ""
      name: openshift-machine-config-operator
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:53:25Z"
    generation: 1
    name: network
    resourceVersion: "2998"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/network
    uid: 5b4b9977-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:53:26Z"
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:55:22Z"
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:22Z"
      status: "True"
      type: Available
    extension: null
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:59:14Z"
    generation: 1
    name: node-tuning
    resourceVersion: "7146"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/node-tuning
    uid: 2b9aeb0f-93c8-11e9-b6fb-0050561f3131
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:24Z"
      message: Cluster has deployed "4.1.2"
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:59:16Z"
      message: Cluster version is "4.1.2"
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:14Z"
      status: "False"
      type: Degraded
    extension: null
    relatedObjects:
    - group: ""
      name: openshift-cluster-node-tuning-operator
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:11Z"
    generation: 1
    name: openshift-apiserver
    resourceVersion: "6476"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-apiserver
    uid: 9af44ad4-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:57:55Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:58:40Z"
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:10Z"
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:12Z"
      reason: AsExpected
      status: "True"
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: openshiftapiservers
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-apiserver-operator
      resource: namespaces
    - group: ""
      name: openshift-apiserver
      resource: namespaces
    - group: apiregistration.k8s.io
      name: v1.apps.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.authorization.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.build.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.image.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.oauth.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.project.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.quota.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.route.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.security.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.template.openshift.io
      resource: apiservices
    - group: apiregistration.k8s.io
      name: v1.user.openshift.io
      resource: apiservices
    versions:
    - name: operator
      version: 4.1.2
    - name: openshift-apiserver
      version: ""
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:11Z"
    generation: 1
    name: openshift-controller-manager
    resourceVersion: "5010"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-controller-manager
    uid: 9ae87d72-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:55:13Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:57:26Z"
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:56:00Z"
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:12Z"
      reason: NoData
      status: Unknown
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: openshiftcontrollermanagers
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-controller-manager-operator
      resource: namespaces
    - group: ""
      name: openshift-controller-manager
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T02:00:05Z"
    generation: 1
    name: openshift-samples
    resourceVersion: "8679"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-samples
    uid: 49f7b127-93c8-11e9-9df7-0050561f3030
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T02:00:05Z"
      status: "False"
      type: Available
    - lastTransitionTime: "2019-06-21T02:00:05Z"
      status: "False"
      type: Progressing
    extension: null
    relatedObjects:
    - group: samples.operator.openshift.io
      name: cluster
      resource: configs
    - group: ""
      name: openshift-cluster-samples-operator
      resource: namespaces
    - group: ""
      name: openshift
      resource: namespaces
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:56:39Z"
    generation: 1
    name: operator-lifecycle-manager
    resourceVersion: "4467"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/operator-lifecycle-manager
    uid: cf6bb803-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:56:40Z"
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:56:40Z"
      message: Deployed 0.9.0
      status: "True"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:56:40Z"
      status: "True"
      type: Available
    extension: null
    versions:
    - name: operator
      version: 4.1.2
    - name: operator-lifecycle-manager
      version: 0.9.0
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:56:39Z"
    generation: 1
    name: operator-lifecycle-manager-catalog
    resourceVersion: "4461"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/operator-lifecycle-manager-catalog
    uid: cef654c0-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:56:39Z"
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:56:39Z"
      message: Deployed 0.9.0
      status: "True"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:56:39Z"
      status: "True"
      type: Available
    extension: null
    versions:
    - name: operator
      version: 4.1.2
    - name: operator-lifecycle-manager
      version: 0.9.0
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:55:12Z"
    generation: 1
    name: service-ca
    resourceVersion: "3510"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/service-ca
    uid: 9b28293f-93c7-11e9-8f65-0050561f2525
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:55:24Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:55:35Z"
      message: 'Progressing: All service-ca-operator deployments updated'
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:55:19Z"
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:55:12Z"
      reason: NoData
      status: Unknown
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: servicecas
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-service-ca-operator
      resource: namespaces
    - group: ""
      name: openshift-service-ca
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:59:15Z"
    generation: 1
    name: service-catalog-apiserver
    resourceVersion: "6569"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-apiserver
    uid: 2bea2c7a-93c8-11e9-b6fb-0050561f3131
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:16Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:59:16Z"
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:16Z"
      message: 'Available: the apiserver is in the desired state (Removed).'
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: NoData
      status: Unknown
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: ""
      name: openshift-config
      resource: namespaces
    - group: ""
      name: openshift-config-managed
      resource: namespaces
    - group: ""
      name: openshift-service-catalog-apiserver-operator
      resource: namespaces
    - group: ""
      name: openshift-service-catalog-apiserver
      resource: namespaces
    - group: apiregistration.k8s.io
      name: v1beta1.servicecatalog.k8s.io
      resource: apiservices
    versions:
    - name: operator
      version: 4.1.2
- apiVersion: config.openshift.io/v1
  kind: ClusterOperator
  metadata:
    creationTimestamp: "2019-06-21T01:59:15Z"
    generation: 1
    name: service-catalog-controller-manager
    resourceVersion: "6557"
    selfLink: /apis/config.openshift.io/v1/clusteroperators/service-catalog-controller-manager
    uid: 2c4cf67c-93c8-11e9-b6fb-0050561f3131
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: AsExpected
      status: "False"
      type: Degraded
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: AsExpected
      status: "False"
      type: Progressing
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: AsExpected
      status: "True"
      type: Available
    - lastTransitionTime: "2019-06-21T01:59:15Z"
      reason: NoData
      status: Unknown
      type: Upgradeable
    extension: null
    relatedObjects:
    - group: operator.openshift.io
      name: cluster
      resource: servicecatalogcontrollermanagers
    - group: ""
      name: openshift-service-catalog-controller-manager-operator
      resource: namespaces
    - group: ""
      name: openshift-service-catalog-controller-manager
      resource: namespaces
    versions:
    - name: operator
      version: 4.1.2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```

Oops. :) Accidentally added in workers.

```
oc --config=${INSTALL_DIR}/auth/kubeconfig get clusteroperator
[root@ctl ocp4wlab]# oc --config=${INSTALL_DIR}/auth/kubeconfig get nodes
NAME              STATUS   ROLES    AGE     VERSION
compute-0         Ready    worker   10m     v1.13.4+9252851b0
compute-1         Ready    worker   10m     v1.13.4+9252851b0
control-plane-0   Ready    master   10m     v1.13.4+9252851b0
control-plane-1   Ready    master   9m56s   v1.13.4+9252851b0
control-plane-2   Ready    master   10m     v1.13.4+9252851b0
```

Is this problem resolved as well? In the 4.1 official doc this is not recommended.
Auth just takes a while; the CSRs for all machines were auto-approved.

```
NAME        AGE   REQUESTOR                                                                    CONDITION
csr-4cs4h   39m   system:node:control-plane-0                                                  Approved,Issued
csr-89d8r   39m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper    Approved,Issued
csr-9vrb4   39m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper    Approved,Issued
csr-fmdth   39m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper    Approved,Issued
csr-g45hp   38m   system:node:compute-1                                                        Approved,Issued
csr-g99wl   38m   system:node:control-plane-2                                                  Approved,Issued
csr-p2kbk   39m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper    Approved,Issued
csr-qh466   39m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper    Approved,Issued
csr-xckr2   38m   system:node:compute-0                                                        Approved,Issued
csr-xp76q   38m   system:node:control-plane-1                                                  Approved,Issued

[root@ctl ocp4wlab]# oc get clusteroperators
NAME                                 VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.2     True        False         False      25m
cloud-credential                     4.1.2     True        False         False      37m
cluster-autoscaler                   4.1.2     True        False         False      37m
console                              4.1.2     True        False         False      28m
dns                                  4.1.2     True        False         False      37m
image-registry                                 False       False         True       32m
ingress                              4.1.2     True        False         False      31m
kube-apiserver                       4.1.2     True        False         False      35m
kube-controller-manager              4.1.2     True        False         False      35m
kube-scheduler                       4.1.2     True        False         False      35m
machine-api                          4.1.2     True        False         False      37m
machine-config                       4.1.2     True        False         False      36m
marketplace                          4.1.2     True        False         False      31m
monitoring                           4.1.2     True        False         False      30m
network                              4.1.2     True        False         False      37m
node-tuning                          4.1.2     True        False         False      33m
openshift-apiserver                  4.1.2     True        False         False      33m
openshift-controller-manager         4.1.2     True        False         False      37m
openshift-samples                    4.1.2     True        False         False      27m
operator-lifecycle-manager           4.1.2     True        False         False      36m
operator-lifecycle-manager-catalog   4.1.2     True        False         False      36m
service-ca                           4.1.2     True        False         False      37m
service-catalog-apiserver            4.1.2     True        False         False      33m
service-catalog-controller-manager   4.1.2     True        False         False      33m
storage                              4.1.2     True        False         False      32m
```
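For completeness, a sketch of the remaining post-bootstrap steps from the 4.1 bare metal flow (the kubeconfig path assumes the install dir used above). The CSRs were auto-approved here, but if node CSRs ever sit in Pending they can be approved manually with the documented go-template:

```
export KUBECONFIG=${INSTALL_DIR}/auth/kubeconfig

# Approve any pending node CSRs (a no-op when auto-approval already handled them):
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' \
  | xargs --no-run-if-empty oc adm certificate approve

# Then wait for the cluster operators (including authentication) to settle:
openshift-install wait-for install-complete --log-level=debug
```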
Is this an upgrade happening automatically?

```
[root@ctl ocp4wlab]# ./clusteroperatorstatus.sh kube-apiserver
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-06-21T01:55:12Z"
  generation: 1
  name: kube-apiserver
  resourceVersion: "22623"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/kube-apiserver
  uid: 9b4a6baf-93c7-11e9-8f65-0050561f2525
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-06-21T02:04:20Z"
    message: |-
      StaticPodsDegraded: nodes/control-plane-0 pods/kube-apiserver-control-plane-0 container="kube-apiserver-6" is not ready
      StaticPodsDegraded: nodes/control-plane-0 pods/kube-apiserver-control-plane-0 container="kube-apiserver-cert-syncer-6" is not ready
    reason: AsExpected
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-06-21T02:39:01Z"
    message: 'Progressing: 1 nodes are at revision 5; 2 nodes are at revision 6'
    reason: Progressing
    status: "True"
    type: Progressing
  - lastTransitionTime: "2019-06-21T01:57:53Z"
    message: 'Available: 3 nodes are active; 1 nodes are at revision 5; 2 nodes are at revision 6'
    reason: AsExpected
    status: "True"
    type: Available
  - lastTransitionTime: "2019-06-21T01:55:12Z"
    reason: AsExpected
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: operator.openshift.io
    name: cluster
    resource: kubeapiservers
  - group: ""
    name: openshift-config
    resource: namespaces
  - group: ""
    name: openshift-config-managed
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver-operator
    resource: namespaces
  - group: ""
    name: openshift-kube-apiserver
    resource: namespaces
  versions:
  - name: raw-internal
    version: 4.1.2
  - name: kube-apiserver
    version: 1.13.4
  - name: operator
    version: 4.1.2
```
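For what it's worth (my reading, not confirmed elsewhere in this BZ): the "revision 5 / revision 6" counters above appear to be the operator's static-pod revisions, which roll forward as post-install configuration is picked up, without the cluster version changing. The rollout can be watched with standard oc commands:

```
# Static-pod revision status for the kube-apiserver operator:
oc get kubeapiserver cluster -o yaml

# Per-node apiserver pods and the overall operator picture:
oc get pods -n openshift-kube-apiserver -o wide
watch oc get clusteroperators
```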
Looks like it was an auto upgrade.

```
[root@ctl ocp4wlab]# oc get clusteroperators
NAME                                 VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.2     True        False         False      38m
cloud-credential                     4.1.2     True        False         False      50m
cluster-autoscaler                   4.1.2     True        False         False      50m
console                              4.1.2     True        False         False      41m
dns                                  4.1.2     True        False         False      49m
image-registry                       4.1.2     True        False         False      6m22s
ingress                              4.1.2     True        False         False      44m
kube-apiserver                       4.1.2     True        False         False      47m
kube-controller-manager              4.1.2     True        False         False      47m
kube-scheduler                       4.1.2     True        False         False      48m
machine-api                          4.1.2     True        False         False      50m
machine-config                       4.1.2     True        False         False      49m
marketplace                          4.1.2     True        False         False      44m
monitoring                           4.1.2     True        False         False      42m
network                              4.1.2     True        False         False      50m
node-tuning                          4.1.2     True        False         False      46m
openshift-apiserver                  4.1.2     True        False         False      46m
openshift-controller-manager         4.1.2     True        False         False      49m
openshift-samples                    4.1.2     True        False         False      39m
operator-lifecycle-manager           4.1.2     True        False         False      49m
operator-lifecycle-manager-catalog   4.1.2     True        False         False      49m
service-ca                           4.1.2     True        False         False      50m
service-catalog-apiserver            4.1.2     True        False         False      46m
service-catalog-controller-manager   4.1.2     True        False         False      46m
storage                              4.1.2     True        False         False      45m
```
Glenn, the changes that were made to address this BZ are in RHCOS itself. If you are running the worker nodes as RHCOS, they should have the same changes applied (as well as the master nodes in the control plane because they are also RHCOS). From what I can tell, it looks like this BZ has been successfully fixed, so I'm going to move it to VERIFIED. If you find additional problems, please file new BZs for them.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1589