Created attachment 1896284 [details]
kubelet log

Description of problem:
After an SDN migration and rollback on a cluster on AliCloud, the static pod kube-controller-manager cannot become ready. The pods are stuck in 'CreateContainerError' with the error:

error reserving ctr name k8s_cluster-policy-controller_kube-controller-manager-zzhao-alisdn3-pzk57-master-2_openshift-kube-controller-manager_6b7005530f1d4a02799194a880c80087_3 for id 23b521bccdd316f1a6ab720ee9694c626a7f8c7814fc336e8439af3b37881c93: name is reserved

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy a cluster on AliCloud using openshift-sdn.
2. Migrate the network provider to ovn-kubernetes (abridged patch commands below): https://docs.openshift.com/container-platform/4.10/networking/ovn_kubernetes_network_provider/migrate-from-openshift-sdn.html
3. After the migration completes successfully, roll the network provider back to openshift-sdn: https://docs.openshift.com/container-platform/4.10/networking/ovn_kubernetes_network_provider/rollback-to-openshift-sdn.html

Actual results:
Some static pods cannot become ready. The affected pods can be kube-controller-manager or kube-apiserver.

Expected results:
All pods work after the rollback.

Additional info:
The symptom looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1785399. This issue doesn't happen on other platforms, such as AWS, GCP, bare metal, and IBM Cloud.
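For reference, the core patch commands from the linked 4.10 migration procedure are roughly the following (an abridged sketch only; the documented procedure also covers waiting for the MCO rollout, rebooting nodes, and cleanup, and is authoritative):

  # Start the migration (the rollback uses the same pattern with OpenShiftSDN):
  oc patch Network.operator.openshift.io cluster --type=merge \
    --patch '{"spec":{"migration":{"networkType":"OVNKubernetes"}}}'
  # After the MachineConfigPools have rolled out, switch the default network:
  oc patch Network.config.openshift.io cluster --type=merge \
    --patch '{"spec":{"networkType":"OVNKubernetes"}}'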
Executing 'crictl rm -f -a' on the node can recover the pod. This is also a workaround mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1785399.

I also found that the container state reported by kubectl is not synced with the actual container state on the node. The container was up and running on the node, but the container log from kubectl only contained the previous container failure.
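For anyone hitting this, the recovery on the affected node looks roughly like this (a sketch; container names and IDs will differ):

  # Show the duplicate entries: one Exited and one Running container share a name
  crictl ps -a --name cluster-policy-controller
  # Inspect the actual state of a container
  crictl inspect <container-id> | jq .status.state
  # Workaround: force-remove all containers; the kubelet recreates them
  crictl rm -f -a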
CRI-O needs a reboot to detect the new networking setup. Is there something that reboots the nodes when there is a network change?
@rphillips Yes, there are two reboots during the rollback. One is triggered manually; the MCO triggers the other one. Also, after seeing this problem, I tried rebooting the node with the malfunctioning pods again, but it didn't help. The pods still could not become ready after the reboot. BTW, the kube-controller-manager and kube-apiserver pods use hostNetwork instead of CNI, so they shouldn't be directly affected by the SDN migration, which only swaps the CNI of the cluster.
Hey Peng, I'm looking into this issue in addition to my team members.

(In reply to Peng Liu from comment #3)
> I also found that the container state reported by kubectl is not synced with the actual container state on the node. The container was up and running on the node, but the container log from kubectl only contained the previous container failure.

I assume that "up and running" means that `crictl inspect $ID | jq .status.state` reports "CONTAINER_RUNNING", right? If that's the case, would it be possible to get the kubelet logs with increased verbosity (`-v=10`) as well as one of the affected container names?
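A quick way to gather that on the node (a sketch, assuming jq is available there):

  # Container names that exist more than once (e.g. one Exited, one Running)
  crictl ps -a -o json | jq -r '.containers[].metadata.name' | sort | uniq -d
  # Dump the kubelet logs for the relevant window
  journalctl -u kubelet --since "1 hour ago" > kubelet.log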
I've prepared a cluster for online debugging.
The test setup mentioned by Peng reveals the issue: we have two cluster-policy-controller containers, where one is up and running and the other one blocks the name that the kubelet tries to create:

> 47b97d9dd4801   e1bdd290e0d56520dc3f8a8af8fcdee28b1d1c99214b9ee3cb63c8faa256c4c1   Less than a second ago   Exited    cluster-policy-controller   2   621ba87014777   kube-controller-manager-pliu-alicloud-qp45q-master-0
> 293d816f6735e   e1bdd290e0d56520dc3f8a8af8fcdee28b1d1c99214b9ee3cb63c8faa256c4c1   48 minutes ago           Running   cluster-policy-controller   3   621ba87014777   kube-controller-manager-pliu-alicloud-qp45q-master-0

I'd expect the kubelet to do a cleanup of running containers before trying to create new ones, so this may be the bug here. Peng, can you please try to reproduce the issue with the latest 4.11 and tell me if it works there?
I've reproduced this issue with 4.11.0-0.ci-2022-07-27-174640.
I've built an experimental workaround into CRI-O to remove the container once we get the name reservation error message 10 times within 10 minutes:

https://github.com/cri-o/cri-o/pull/6097
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=46802628

Can you please try the 4.12 reproducer again with the modified package cri-o-1.25.0-13.rhaos4.12.gitdf67c83.el8 and enabled CRI-O debug logs:

http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/

You can install the RPMs locally by putting this .repo file in your /etc/yum.repos.d/ directory:

http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/cri-o-1.25.0-13.rhaos4.12.gitdf67c83.el8-scratch.repo

RPMs and build logs can be found in the following locations:

http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/aarch64/
http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/ppc64le/
http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/s390x/
http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/x86_64/
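On an RHCOS node the scratch build can be applied roughly like this (a sketch; the exact RPM filename under the x86_64/ directory is an assumption derived from the package NVR above, and rpm-ostree is assumed as the install method):

  curl -Lo /etc/yum.repos.d/crio-scratch.repo \
    http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/cri-o-1.25.0-13.rhaos4.12.gitdf67c83.el8-scratch.repo
  # Assumed RPM filename; replace with the actual file listed in the x86_64/ directory
  rpm-ostree override replace \
    http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/13.rhaos4.12.gitdf67c83.el8/x86_64/cri-o-1.25.0-13.rhaos4.12.gitdf67c83.el8.x86_64.rpm
  systemctl reboot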
Removing the blocker and reducing the urgency since we have the `crictl rm -fa` workaround.
I had to build a new version of the patch:

- https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=46827743
- http://brew-task-repos.usersys.redhat.com/repos/scratch/sgrunert/cri-o/1.25.0/14.rhaos4.12.gitdf67c83.el8/

Peng, can you please try it again and send me the CRI-O and kubelet debug logs?
The new build can remove the stale container from the node. However, the container still does not become ready in the API server:

NAME                                                READY   STATUS      RESTARTS   AGE     IP             NODE                           NOMINATED NODE   READINESS GATES
apiserver-watcher-pliu-alicloud-bxctb-master-0      1/1     Running     2          8h      10.0.101.108   pliu-alicloud-bxctb-master-0   <none>           <none>
apiserver-watcher-pliu-alicloud-bxctb-master-1      1/1     Running     3          8h      10.0.157.160   pliu-alicloud-bxctb-master-1   <none>           <none>
apiserver-watcher-pliu-alicloud-bxctb-master-2      1/1     Running     3          8h      10.0.101.109   pliu-alicloud-bxctb-master-2   <none>           <none>
kube-apiserver-guard-pliu-alicloud-bxctb-master-0   1/1     Running     0          7h8m    10.130.0.13    pliu-alicloud-bxctb-master-0   <none>           <none>
kube-apiserver-guard-pliu-alicloud-bxctb-master-1   1/1     Running     1          7h15m   10.128.0.23    pliu-alicloud-bxctb-master-1   <none>           <none>
kube-apiserver-guard-pliu-alicloud-bxctb-master-2   1/1     Running     1          7h12m   10.129.0.9     pliu-alicloud-bxctb-master-2   <none>           <none>
kube-apiserver-pliu-alicloud-bxctb-master-0         5/5     Running     14         8h      10.0.101.108   pliu-alicloud-bxctb-master-0   <none>           <none>
kube-apiserver-pliu-alicloud-bxctb-master-1         4/5     Running     16         8h      10.0.157.160   pliu-alicloud-bxctb-master-1   <none>           <none>
kube-apiserver-pliu-alicloud-bxctb-master-2         4/5     Running     16         8h      10.0.101.109   pliu-alicloud-bxctb-master-2   <none>           <none>
revision-pruner-7-pliu-alicloud-bxctb-master-0      0/1     Completed   0          7h11m   10.130.0.44    pliu-alicloud-bxctb-master-0   <none>           <none>
revision-pruner-7-pliu-alicloud-bxctb-master-1      0/1     Completed   0          7h17m   10.128.0.5    pliu-alicloud-bxctb-master-1   <none>           <none>
revision-pruner-7-pliu-alicloud-bxctb-master-2      0/1     Completed   0          7h14m   10.129.0.40    pliu-alicloud-bxctb-master-2   <none>           <none>

Here are the kubelet log (https://paste.c-net.org/PilotSocial) and the CRI-O log (https://paste.c-net.org/MuldoonPolitely).
Thank you Peng. I would need the full CRI-O logs for the reproducer; unfortunately it looks like we only captured 1 minute of them. Another question: does a restart of the kubelet resolve the issue?

Ryan, may I ask you to assist me with the kubelet part here? The facts right now:

- There is a container up and running (healthy) in the API server pod after the SDN migration (this is visible via crictl, for example).
- The kubelet still tries to create the container and fails because CRI-O reports "name is reserved" (see the logs above).

I know the kubelet caches the results, right? So I'm wondering if a restart of the kubelet helps here from the failing state. It feels like a bug in the kubelet, and my CRI-O workaround (https://github.com/cri-o/cri-o/pull/6097) tries to fix it from the opposite direction.
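Peng, if you get the chance, something like this on the affected node would answer the kubelet-restart question (a sketch):

  systemctl restart kubelet
  # then watch whether the "name is reserved" errors keep coming
  journalctl -u kubelet -f | grep -i "name is reserved"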
@sgrunert The CRI-O log is huge after turning on the debug level. Ping me on Slack if you need an environment for debugging.
We have some new findings while debugging this issue. The issue can be reproduced without doing any SDN migration at all: a MachineConfig update can also trigger it, for instance turning on CRI-O debug logging following https://access.redhat.com/solutions/5133191. We hit this issue when doing the SDN migration because the migration procedure also includes a MachineConfig update.

I set up the reproduction environment on AliCloud with QE's CI job https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Flexy-install. The test environment can only last for 12 hours.
Thanks Peng. This is the summary of the current state: we already have a kube-apiserver container up and running in the pod while the kubelet tries to create one with the same name, resulting in the "name is reserved" error:

sh-4.4# crictl ps -a --name kube-apiserver
CONTAINER       IMAGE                                                              CREATED                  STATE     NAME             ATTEMPT   POD ID          POD
6305d725a6bd3   9fa866b8c15bf5e536504da71b706caf1dc0c926ed21991f69425f6e41938ba1   Less than a second ago   Exited    kube-apiserver   3         90943b3fa5860   kube-apiserver-pliu-alicloud-fblsj-master-0
57beb7420744b   9fa866b8c15bf5e536504da71b706caf1dc0c926ed21991f69425f6e41938ba1   11 minutes ago           Running   kube-apiserver   4         90943b3fa5860   kube-apiserver-pliu-alicloud-fblsj-master-0

The workaround does not work as expected, so I have to find out why. Ryan, can you double-check the kubelet logs to see if we can find the root cause in the kubelet sync loop? All logs can be found here: https://drive.google.com/drive/folders/1C18N-k_vsP6CMagxAYs20WSkdltKwxF6?usp=sharing
Here is a new run without any cluster modification other than applying a MachineConfig 99-master-kubelet-loglevel:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-kubelet-loglevel
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
        - dropins:
            - contents: |
                [Service]
                Environment="KUBELET_LOG_LEVEL=10"
              name: 30-logging.conf
          enabled: true
          name: kubelet.service

The crictl output shows the problem:

CONTAINER       IMAGE                                                              CREATED                  STATE     NAME                                           ATTEMPT   POD ID          POD
f3a6fa482cd0b   0bc42fd43ea720db6380802cc39ce72f392e78bc0eb18afc0fa261e2ae2c8e55   Less than a second ago   Running   kube-apiserver-cert-regeneration-controller   1         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0
e287689759221   0bc42fd43ea720db6380802cc39ce72f392e78bc0eb18afc0fa261e2ae2c8e55   Less than a second ago   Running   kube-apiserver-cert-syncer                     1         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0
4065e776eb308   9fa866b8c15bf5e536504da71b706caf1dc0c926ed21991f69425f6e41938ba1   Less than a second ago   Exited    kube-apiserver                                 1         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0
ad56b755641d9   9fa866b8c15bf5e536504da71b706caf1dc0c926ed21991f69425f6e41938ba1   40 minutes ago           Running   kube-apiserver                                 2         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0
60b30abe43fda   0bc42fd43ea720db6380802cc39ce72f392e78bc0eb18afc0fa261e2ae2c8e55   40 minutes ago           Running   kube-apiserver-check-endpoints                 1         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0
d45d4d8dd6b32   0bc42fd43ea720db6380802cc39ce72f392e78bc0eb18afc0fa261e2ae2c8e55   40 minutes ago           Running   kube-apiserver-insecure-readyz                 1         a8cc75ec372da   kube-apiserver-pliu-alicloud-2n9bd-master-0

kubelet and CRI-O logs: https://drive.google.com/drive/folders/1rSvUMFP3bpROI9MVa1i19cUckKoGNwxU?usp=sharing
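To reproduce with the same trigger, applying the MachineConfig and waiting for the master pool to roll out is enough (a sketch):

  oc apply -f 99-master-kubelet-loglevel.yaml
  oc get mcp master -w   # wait until the pool reports UPDATED=True again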
Found out that the revert in https://github.com/cri-o/cri-o/pull/6111 fixes the problem. We're now discussing how to proceed.
We're now working on https://github.com/cri-o/cri-o/pull/6123 to solve the problem.
I see the fix has been merged upstream. May I ask when we can have it in OCP?
(In reply to Peng Liu from comment #22)
> I see the fix has been merged upstream. May I ask when we can have it in OCP?

The merge into main will land in 1.26 / OCP 4.13; we can backport it to 4.12 once it's verified by QA.
@sgrunert Right now we do not have v4.13 available in https://amd64.ocp.releases.ci.openshift.org/, so QE cannot do any testing. Moving the status back to ASSIGNED.
Alright, I'm opening a cherry-pick to 4.12 to be able to verify it: https://github.com/cri-o/cri-o/pull/6241
The PR has merged; we can verify once CI picks up CRI-O commit 76292062: https://github.com/cri-o/cri-o/commit/76292062dbcd6fc77569fcec45487551fa40d844
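To confirm a given cluster build contains the fix, checking the CRI-O commit on a node should be enough (a sketch; output fields may vary by version):

  oc debug node/<master-node> -- chroot /host crio version
  # the GitCommit field should show 76292062...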
@weliang The patch has been merged. Could you help verify it?
@pliu Two AliCloud cluster installation bugs blocked our verification:

https://issues.redhat.com/browse/OCPBUGS-2248
https://issues.redhat.com/browse/OCPBUGS-2388
Ok, let's wait for the installer fix.
Verification failed on 4.13.0-0.nightly-2023-01-01-223309.

[weliang@weliang openshift-tests-private]$ oc get all -n openshift-kube-apiserver
NAME                                                    READY   STATUS              RESTARTS        AGE
pod/apiserver-watcher-weliang-01053-rx8tq-master-0      1/1     Running             4               4h39m
pod/apiserver-watcher-weliang-01053-rx8tq-master-1      1/1     Running             4               4h37m
pod/apiserver-watcher-weliang-01053-rx8tq-master-2      1/1     Running             4               4h37m
pod/kube-apiserver-guard-weliang-01053-rx8tq-master-0   1/1     Running             0               134m
pod/kube-apiserver-guard-weliang-01053-rx8tq-master-1   1/1     Running             0               126m
pod/kube-apiserver-guard-weliang-01053-rx8tq-master-2   1/1     Running             0               130m
pod/kube-apiserver-weliang-01053-rx8tq-master-0         4/5     RunContainerError   25 (134m ago)   4h11m
pod/kube-apiserver-weliang-01053-rx8tq-master-1         4/5     Running             22              4h5m
pod/kube-apiserver-weliang-01053-rx8tq-master-2         4/5     RunContainerError   25 (130m ago)   4h8m
pod/revision-pruner-7-weliang-01053-rx8tq-master-0      0/1     Completed           0               135m
pod/revision-pruner-7-weliang-01053-rx8tq-master-1      0/1     Completed           0               130m
pod/revision-pruner-7-weliang-01053-rx8tq-master-2      0/1     Completed           0               132m

NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/apiserver   ClusterIP   172.30.71.194   <none>        443/TCP   4h37m

[weliang@weliang openshift-tests-private]$ oc describe pod/kube-apiserver-weliang-01053-rx8tq-master-0 -n openshift-kube-apiserver
Name:                 kube-apiserver-weliang-01053-rx8tq-master-0
Namespace:            openshift-kube-apiserver
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 weliang-01053-rx8tq-master-0/10.0.99.103
Start Time:           Thu, 05 Jan 2023 09:43:08 -0500
Labels:               apiserver=true
                      app=openshift-kube-apiserver
                      revision=7
Annotations:          kubectl.kubernetes.io/default-container: kube-apiserver
                      kubernetes.io/config.hash: 0fec70f8250dabd2139268425b6896e7
                      kubernetes.io/config.mirror: 0fec70f8250dabd2139268425b6896e7
                      kubernetes.io/config.seen: 2023-01-05T15:05:29.477569711Z
                      kubernetes.io/config.source: file
                      target.workload.openshift.io/management: {"effect": "PreferredDuringScheduling"}
Status:               Running
IP:                   10.0.99.103
IPs:
  IP:  10.0.99.103
Controlled By:  Node/weliang-01053-rx8tq-master-0
Init Containers:
  setup:
    Container ID:  cri-o://b99295901e4c00b71a9badb162fc7d10d77f47673eeecf14afe369286c8152c7
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/timeout
      220
      /bin/bash
      -ec
    Args:
      echo "Fixing audit permissions ..."
      chmod 0700 /var/log/kube-apiserver && touch /var/log/kube-apiserver/audit.log && chmod 0600 /var/log/kube-apiserver/*
      LOCK=/var/log/kube-apiserver/.lock
      echo "Acquiring exclusive lock ${LOCK} ..."
      # Waiting for 135s max for old kube-apiserver's watch-termination process to exit and remove the lock.
      # Two cases:
      # 1. if kubelet does not start the old and new in parallel (i.e. works as expected), the flock will always succeed without any time.
      # 2. if kubelet does overlap old and new pods for up to 130s, the flock will wait and immediate return when the old finishes.
      #
      # NOTE: We can increase 135s for a bigger expected overlap. But a higher value means less noise about the broken kubelet behaviour, i.e. we hide a bug.
      # NOTE: Do not tweak these timings without considering the livenessProbe initialDelaySeconds
      exec {LOCK_FD}>${LOCK} && flock --verbose -w 135 "${LOCK_FD}" || {
        echo "$(date -Iseconds -u) kubelet did not terminate old kube-apiserver before new one" >> /var/log/kube-apiserver/lock.log
        echo -n ": WARNING: kubelet did not terminate old kube-apiserver before new one."
        # We failed to acquire exclusive lock, which means there is old kube-apiserver running in system.
        # Since we utilize SO_REUSEPORT, we need to make sure the old kube-apiserver stopped listening.
        #
        # NOTE: This is a fallback for broken kubelet, if you observe this please report a bug.
        echo -n "Waiting for port 6443 to be released due to likely bug in kubelet or CRI-O "
        while [ -n "$(ss -Htan state listening '( sport = 6443 or sport = 6080 )')" ]; do
          echo -n "."
          sleep 1
          (( tries += 1 ))
          if [[ "${tries}" -gt 10 ]]; then
            echo "Timed out waiting for port :6443 and :6080 to be released, this is likely a bug in kubelet or CRI-O"
            exit 1
          fi
        done
        # This is to make sure the server has terminated independently from the lock.
        # After the port has been freed (requests can be pending and need 60s max).
        sleep 65
      }
      # We cannot hold the lock from the init container to the main container. We release it here. There is no risk, at this point we know we are safe.
      flock -u "${LOCK_FD}"
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 05 Jan 2023 20:03:19 -0500
      Finished:     Thu, 05 Jan 2023 20:03:19 -0500
    Ready:          True
    Restart Count:  4
    Requests:
      cpu:        5m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /var/log/kube-apiserver from audit-dir (rw)
Containers:
  kube-apiserver:
    Container ID:  cri-o://2f29f674782ee2f975537fb06e6ba732d4829f1276509d7badd547158edc470a
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70
    Port:          6443/TCP
    Host Port:     6443/TCP
    Command:
      /bin/bash
      -ec
    Args:
      LOCK=/var/log/kube-apiserver/.lock
      # We should be able to acquire the lock immediatelly. If not, it means the init container has not released it yet and kubelet or CRI-O started container prematurely.
      exec {LOCK_FD}>${LOCK} && flock --verbose -w 30 "${LOCK_FD}" || {
        echo "Failed to acquire lock for kube-apiserver. Please check setup container for details. This is likely kubelet or CRI-O bug."
        exit 1
      }
      if [ -f /etc/kubernetes/static-pod-certs/configmaps/trusted-ca-bundle/ca-bundle.crt ]; then
        echo "Copying system trust bundle ..."
        cp -f /etc/kubernetes/static-pod-certs/configmaps/trusted-ca-bundle/ca-bundle.crt /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
      fi
      exec watch-termination --termination-touch-file=/var/log/kube-apiserver/.terminating --termination-log-file=/var/log/kube-apiserver/termination.log --graceful-termination-duration=135s --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig -- hyperkube kube-apiserver --openshift-config=/etc/kubernetes/static-pod-resources/configmaps/config/config.yaml --advertise-address=${HOST_IP} -v=2 --permit-address-sharing
    State:          Waiting
      Reason:       RunContainerError
    Last State:     Terminated
      Reason:       Error
      Message:      ng.go:106] unable to get PriorityClass system-node-critical: Get "https://[::1]:6443/apis/scheduling.k8s.io/v1/priorityclasses/system-node-critical": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z. Retrying...
E0105 17:03:56.199404      14 storage_rbac.go:187] unable to initialize clusterroles: Get "https://[::1]:6443/apis/rbac.authorization.k8s.io/v1/clusterroles": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z
W0105 17:03:56.199456      14 storage_scheduling.go:106] unable to get PriorityClass system-node-critical: Get "https://[::1]:6443/apis/scheduling.k8s.io/v1/priorityclasses/system-node-critical": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z. Retrying...
F0105 17:03:56.199610      14 hooks.go:203] PostStartHook "scheduling/bootstrap-system-priority-classes" failed: unable to add default system priority classes: timed out waiting for the condition
E0105 17:03:56.327343      14 sdn_readyz_wait.go:107] Get "https://[::1]:6443/api/v1/namespaces/openshift-oauth-apiserver/endpoints/api": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z
E0105 17:03:56.327629      14 sdn_readyz_wait.go:107] Get "https://[::1]:6443/api/v1/namespaces/openshift-apiserver/endpoints/api": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z
E0105 17:03:56.327679      14 storage_rbac.go:187] unable to initialize clusterroles: Get "https://[::1]:6443/apis/rbac.authorization.k8s.io/v1/clusterroles": x509: certificate has expired or is not yet valid: current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z
I0105 17:03:56.493315       1 main.go:235] Termination finished with exit code 255
I0105 17:03:56.493417       1 main.go:188] Deleting termination lock file "/var/log/kube-apiserver/.terminating"
      Exit Code:    255
      Started:      Thu, 05 Jan 2023 20:03:20 -0500
      Finished:     Thu, 05 Jan 2023 12:03:56 -0500
    Ready:          False
    Restart Count:  4
    Requests:
      cpu:     265m
      memory:  1Gi
    Liveness:   http-get https://:6443/livez delay=45s timeout=10s period=10s #success=1 #failure=3
    Readiness:  http-get https://:6443/readyz delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:            kube-apiserver-weliang-01053-rx8tq-master-0 (v1:metadata.name)
      POD_NAMESPACE:       openshift-kube-apiserver (v1:metadata.namespace)
      STATIC_POD_VERSION:  7
      HOST_IP:              (v1:status.hostIP)
      GOGC:                100
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
      /var/log/kube-apiserver from audit-dir (rw)
  kube-apiserver-cert-syncer:
    Container ID:  cri-o://46d2a285aa4af70ae2f86c39257a125cf9f2e34d000b387a9fe5eb52c4bda0af
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Port:          <none>
    Host Port:     <none>
    Command:
      cluster-kube-apiserver-operator
      cert-syncer
    Args:
      --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig
      --namespace=$(POD_NAMESPACE)
      --destination-dir=/etc/kubernetes/static-pod-certs
    State:          Running
      Started:      Thu, 05 Jan 2023 20:03:21 -0500
    Ready:          True
    Restart Count:  4
    Requests:
      cpu:     5m
      memory:  50Mi
    Environment:
      POD_NAME:       kube-apiserver-weliang-01053-rx8tq-master-0 (v1:metadata.name)
      POD_NAMESPACE:  openshift-kube-apiserver (v1:metadata.namespace)
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
  kube-apiserver-cert-regeneration-controller:
    Container ID:  cri-o://0d6f91bda2f5cccedd0a27e7dcb39819f7f6e96b76bc365cf71747e0aa3e987f
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Port:          <none>
    Host Port:     <none>
    Command:
      cluster-kube-apiserver-operator
      cert-regeneration-controller
    Args:
      --kubeconfig=/etc/kubernetes/static-pod-resources/configmaps/kube-apiserver-cert-syncer-kubeconfig/kubeconfig
      --namespace=$(POD_NAMESPACE)
      -v=2
    State:          Running
      Started:      Thu, 05 Jan 2023 12:03:23 -0500
    Ready:          True
    Restart Count:  4
    Requests:
      cpu:     5m
      memory:  50Mi
    Environment:
      POD_NAMESPACE:  openshift-kube-apiserver (v1:metadata.namespace)
    Mounts:
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
  kube-apiserver-insecure-readyz:
    Container ID:  cri-o://fa288fb5de0e39c4b13e3f112b947e4cb761801c89ba15ab75b6116b483f30d6
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Port:          6080/TCP
    Host Port:     6080/TCP
    Command:
      cluster-kube-apiserver-operator
      insecure-readyz
    Args:
      --insecure-port=6080
      --delegate-url=https://localhost:6443/readyz
    State:          Running
      Started:      Thu, 05 Jan 2023 12:03:24 -0500
    Ready:          True
    Restart Count:  4
    Requests:
      cpu:        5m
      memory:     50Mi
    Environment:  <none>
    Mounts:       <none>
  kube-apiserver-check-endpoints:
    Container ID:  cri-o://f2fea0945620a234b492c9d936897e6eb83b57ea780482f99dc2c32e6f62a591
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc
    Port:          17697/TCP
    Host Port:     17697/TCP
    Command:
      cluster-kube-apiserver-operator
      check-endpoints
    Args:
      --kubeconfig
      /etc/kubernetes/static-pod-certs/configmaps/check-endpoints-kubeconfig/kubeconfig
      --listen
      0.0.0.0:17697
      --namespace
      $(POD_NAMESPACE)
      --v
      2
    State:          Running
      Started:      Thu, 05 Jan 2023 12:04:18 -0500
    Last State:     Terminated
      Reason:       Error
      Message:      W0105 17:03:50.570372       1 cmd.go:213] Using insecure, self-signed certificates
I0105 17:03:50.570747       1 crypto.go:601] Generating new CA for check-endpoints-signer@1672938230 cert, and key in /tmp/serving-cert-1963727250/serving-signer.crt, /tmp/serving-cert-1963727250/serving-signer.key
I0105 17:03:50.918995       1 observer_polling.go:159] Starting file observer
W0105 17:03:50.934666       1 builder.go:230] unable to get owner reference (falling back to namespace): pods "kube-apiserver-weliang-01053-rx8tq-master-0" is forbidden: User "system:serviceaccount:openshift-kube-apiserver:check-endpoints" cannot get resource "pods" in API group "" in the namespace "openshift-kube-apiserver"
I0105 17:03:50.934813       1 builder.go:262] check-endpoints version 4.13.0-202212240845.p0.gb6ca7dc.assembly.stream-b6ca7dc-b6ca7dcf808b9deb9a2ca8a1c67f8ceb475caf59
I0105 17:03:50.935452       1 dynamic_serving_content.go:113] "Loaded a new cert/key pair" name="serving-cert::/tmp/serving-cert-1963727250/tls.crt::/tmp/serving-cert-1963727250/tls.key"
W0105 17:03:51.720732       1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
F0105 17:03:51.720771       1 cmd.go:138] error initializing delegating authentication: unable to load configmap based request-header-client-ca-file: configmaps "extension-apiserver-authentication" is forbidden: User "system:serviceaccount:openshift-kube-apiserver:check-endpoints" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
      Exit Code:    255
      Started:      Thu, 05 Jan 2023 12:03:50 -0500
      Finished:     Thu, 05 Jan 2023 12:03:51 -0500
    Ready:          True
    Restart Count:  9
    Requests:
      cpu:     10m
      memory:  50Mi
    Liveness:   http-get https://:17697/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Readiness:  http-get https://:17697/healthz delay=10s timeout=10s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       kube-apiserver-weliang-01053-rx8tq-master-0 (v1:metadata.name)
      POD_NAMESPACE:  openshift-kube-apiserver (v1:metadata.namespace)
    Mounts:
      /etc/kubernetes/static-pod-certs from cert-dir (rw)
      /etc/kubernetes/static-pod-resources from resource-dir (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  resource-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/static-pod-resources/kube-apiserver-pod-7
    HostPathType:
  cert-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/static-pod-resources/kube-apiserver-certs
    HostPathType:
  audit-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/kube-apiserver
    HostPathType:
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     op=Exists
Events:
  Type     Reason      Age    From     Message
  ----     ------      ----   ----     -------
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     4h11m  kubelet  Created container setup
  Normal   Started     4h11m  kubelet  Started container setup
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     4h11m  kubelet  Created container kube-apiserver
  Normal   Started     4h11m  kubelet  Started container kube-apiserver
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     4h11m  kubelet  Created container kube-apiserver-cert-syncer
  Normal   Started     4h11m  kubelet  Started container kube-apiserver-cert-syncer
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Started     4h11m  kubelet  Started container kube-apiserver-check-endpoints
  Normal   Started     4h11m  kubelet  Started container kube-apiserver-cert-regeneration-controller
  Normal   Created     4h11m  kubelet  Created container kube-apiserver-insecure-readyz
  Normal   Created     4h11m  kubelet  Created container kube-apiserver-cert-regeneration-controller
  Normal   Started     4h11m  kubelet  Started container kube-apiserver-insecure-readyz
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     4h11m  kubelet  Created container kube-apiserver-check-endpoints
  Normal   Pulled      4h11m  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     3h5m   kubelet  Created container kube-apiserver-cert-syncer
  Normal   Started     3h5m   kubelet  Started container kube-apiserver-cert-syncer
  Normal   Pulled      3h5m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     3h5m   kubelet  Created container kube-apiserver-cert-regeneration-controller
  Normal   Started     3h5m   kubelet  Started container kube-apiserver-cert-regeneration-controller
  Normal   Pulled      3h5m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     3h5m   kubelet  Created container kube-apiserver-insecure-readyz
  Normal   Started     3h5m   kubelet  Started container kube-apiserver-insecure-readyz
  Normal   Pulled      3h5m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     3h5m   kubelet  Created container kube-apiserver-check-endpoints
  Normal   Started     3h5m   kubelet  Started container kube-apiserver-check-endpoints
  Normal   Pulled      173m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     173m   kubelet  Created container setup
  Normal   Started     173m   kubelet  Started container setup
  Normal   Started     173m   kubelet  Started container kube-apiserver-insecure-readyz
  Normal   Created     173m   kubelet  Created container kube-apiserver
  Normal   Started     173m   kubelet  Started container kube-apiserver
  Normal   Pulled      173m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     173m   kubelet  Created container kube-apiserver-cert-syncer
  Normal   Started     173m   kubelet  Started container kube-apiserver-cert-syncer
  Normal   Pulled      173m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     173m   kubelet  Created container kube-apiserver-cert-regeneration-controller
  Normal   Pulled      173m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Pulled      173m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     173m   kubelet  Created container kube-apiserver-insecure-readyz
  Normal   Started     173m   kubelet  Started container kube-apiserver-cert-regeneration-controller
  Normal   Created     173m   kubelet  Created container kube-apiserver-check-endpoints
  Normal   Started     173m   kubelet  Started container kube-apiserver-check-endpoints
  Warning  Unhealthy   173m   kubelet  Liveness probe failed: Get "https://10.0.99.103:17697/healthz": read tcp 10.0.99.103:56616->10.0.99.103:17697: read: connection reset by peer
  Warning  ProbeError  173m   kubelet  Readiness probe error: HTTP probe failed with statuscode: 403
body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:anonymous\" cannot get path \"/readyz\": RBAC: [clusterrole.rbac.authorization.k8s.io \"system:webhook\" not found, clusterrole.rbac.authorization.k8s.io \"system:openshift:public-info-viewer\" not found, clusterrole.rbac.authorization.k8s.io \"self-access-reviewer\" not found, clusterrole.rbac.authorization.k8s.io \"system:public-info-viewer\" not found, clusterrole.rbac.authorization.k8s.io \"system:oauth-token-deleter\" not found, clusterrole.rbac.authorization.k8s.io \"system:scope-impersonation\" not found]","reason":"Forbidden","details":{},"code":403}
  Warning  Unhealthy   173m   kubelet  Readiness probe failed: HTTP probe failed with statuscode: 403
  Warning  ProbeError  173m   kubelet  Liveness probe error: Get "https://10.0.99.103:17697/healthz": read tcp 10.0.99.103:56616->10.0.99.103:17697: read: connection reset by peer
body:
  Normal   Pulled      173m (x2 over 173m)  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Warning  ProbeError  173m   kubelet  Readiness probe error: HTTP probe failed with statuscode: 500
body: [+]ping ok
[+]log ok
[+]etcd ok
[+]etcd-readiness ok
[-]api-openshift-apiserver-available failed: reason withheld
[-]api-openshift-oauth-apiserver-available failed: reason withheld
[+]informer-sync ok
[+]poststarthook/openshift.io-openshift-apiserver-reachable ok
[+]poststarthook/openshift.io-oauth-apiserver-reachable ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/quota.openshift.io-clusterquotamapping ok
[+]poststarthook/openshift.io-deprecated-api-requests-filter ok
[+]poststarthook/openshift.io-startkubeinformers ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/storage-object-count-tracker-hook ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[-]poststarthook/rbac/bootstrap-roles failed: reason withheld
[-]poststarthook/scheduling/bootstrap-system-priority-classes failed: reason withheld
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-wait-for-first-sync ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/apiservice-openapiv3-controller ok
[+]shutdown ok
readyz check failed
  Warning  ProbeError  146m   kubelet  Readiness probe error: HTTP probe failed with statuscode: 500
body: [+]ping ok
[+]log ok
[-]etcd failed: reason withheld
[+]etcd-readiness ok
[+]api-openshift-apiserver-available ok
[+]api-openshift-oauth-apiserver-available ok
[+]informer-sync ok
[+]poststarthook/openshift.io-openshift-apiserver-reachable ok
[+]poststarthook/openshift.io-oauth-apiserver-reachable ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/quota.openshift.io-clusterquotamapping ok
[+]poststarthook/openshift.io-deprecated-api-requests-filter ok
[+]poststarthook/openshift.io-startkubeinformers ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/storage-object-count-tracker-hook ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-wait-for-first-sync ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/apiservice-openapiv3-controller ok
[+]shutdown ok
readyz check failed
  Warning  Unhealthy   146m (x11 over 173m)  kubelet  Readiness probe failed: HTTP probe failed with statuscode: 500
  Normal   Pulled      143m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     143m   kubelet  Created container setup
  Normal   Started     143m   kubelet  Started container setup
  Normal   Pulled      143m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     143m   kubelet  Created container kube-apiserver
  Normal   Started     143m   kubelet  Started container kube-apiserver
  Normal   Pulled      143m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     143m   kubelet  Created container kube-apiserver-check-endpoints
  Normal   Started     143m   kubelet  Started container kube-apiserver-cert-syncer
  Normal   Pulled      143m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     143m   kubelet  Created container kube-apiserver-cert-regeneration-controller
  Normal   Started     143m   kubelet  Started container kube-apiserver-cert-regeneration-controller
  Normal   Created     143m   kubelet  Created container kube-apiserver-cert-syncer
  Normal   Created     143m   kubelet  Created container kube-apiserver-insecure-readyz
  Normal   Started     143m   kubelet  Started container kube-apiserver-insecure-readyz
  Normal   Started     143m   kubelet  Started container kube-apiserver-check-endpoints
  Normal   Pulled      143m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Warning  ProbeError  143m   kubelet  Readiness probe error: Get "https://10.0.99.103:17697/healthz": dial tcp 10.0.99.103:17697: connect: connection refused
body:
  Warning  Unhealthy   143m   kubelet  Readiness probe failed: Get "https://10.0.99.103:17697/healthz": dial tcp 10.0.99.103:17697: connect: connection refused
  Warning  ProbeError  143m   kubelet  Readiness probe error: HTTP probe failed with statuscode: 403
body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:anonymous\" cannot get path \"/readyz\"","reason":"Forbidden","details":{},"code":403}
  Warning  Unhealthy   143m   kubelet  Readiness probe failed: HTTP probe failed with statuscode: 403
  Normal   Pulled      143m (x2 over 143m)  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Warning  ProbeError  143m   kubelet  Liveness probe error: Get "https://10.0.99.103:17697/healthz": dial tcp 10.0.99.103:17697: connect: connection refused
body:
  Warning  Unhealthy   143m   kubelet  Liveness probe failed: Get "https://10.0.99.103:17697/healthz": dial tcp 10.0.99.103:17697: connect: connection refused
  Normal   Created     135m   kubelet  Created container kube-apiserver-cert-regeneration-controller
  Normal   Started     135m   kubelet  Started container kube-apiserver-insecure-readyz
  Normal   Pulled      135m   kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     135m   kubelet  Created container kube-apiserver-insecure-readyz
  Normal   Started     135m   kubelet  Started container kube-apiserver-cert-regeneration-controller
  Normal   Created     135m (x2 over 135m)  kubelet  Created container kube-apiserver-check-endpoints
  Normal   Started     135m (x2 over 135m)  kubelet  Started container kube-apiserver-check-endpoints
  Warning  BackOff     134m (x3 over 135m)  kubelet  Back-off restarting failed container
  Normal   Pulled      134m (x3 over 135m)  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Warning  BackOff     130m (x20 over 134m)  kubelet  Back-off restarting failed container
  Normal   Pulled      1s (x600 over <invalid>)  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     <invalid>  kubelet  Created container setup
  Normal   Started     <invalid>  kubelet  Started container setup
  Normal   Created     <invalid>  kubelet  Created container kube-apiserver
  Normal   Started     <invalid>  kubelet  Started container kube-apiserver
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb3c465bdfa0ebd38bdb74cd8b80f131dfaf503a4f8c120b7fcb4440eccd6a70" already present on machine
  Normal   Created     <invalid>  kubelet  Created container setup
  Normal   Started     <invalid>  kubelet  Started container setup
  Normal   Created     <invalid>  kubelet  Created container kube-apiserver
  Normal   Started     <invalid>  kubelet  Started container kube-apiserver
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine
  Normal   Created     <invalid>  kubelet  Created container kube-apiserver-cert-syncer
  Normal   Started     <invalid>  kubelet  Started container kube-apiserver-cert-syncer
  Normal   Pulled      <invalid>  kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f4d6dde7aebc339e6fc537f11c8ce94ca0d5a43d01ef18aaa21c3b957819afc" already present on machine

[weliang@weliang openshift-tests-private]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-01-01-223309   True        False         4h3m    Error while reconciling 4.13.0-0.nightly-2023-01-01-223309: the cluster operator kube-apiserver is degraded
[weliang@weliang openshift-tests-private]$
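The x509 failures above say the node's current time is before the certificate's NotBefore (e.g. "current time 2023-01-05T17:03:56Z is before 2023-01-06T00:03:21Z"), so comparing the node clock against the kube-apiserver certificate validity windows might help narrow this down (a diagnostic sketch only; the cert directory comes from the Volumes section above):

  oc debug node/weliang-01053-rx8tq-master-0 -- chroot /host sh -c '
    date -u
    find /etc/kubernetes/static-pod-resources/kube-apiserver-certs -name "*.crt" \
      -exec openssl x509 -noout -subject -dates -in {} \; 2>/dev/null'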
Tested and verified in 4.12.0-rc.8
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.12.1 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0449