Note: 1. if you are dealing with Machines or MachineSet objects, please select the component as "Cloud Compute" under same product. 2. if you are dealing with kubelet / kubeletconfigs / container runtime configs, please select the component as "Node" under same product. Description of problem: During 4.10.20 to 4.11.0-rc.0 OCP upgrade, machine-config got stuck. Note that test pods are added to the SNO cluster before upgrade started. (All other components were successfully upgraded to 4.11.0-rc.0 via "oc get co") machine-config 4.10.20 True True False 21h Working towards 4.11.0-rc.0 From the machine config log, it looks like the reboot was triggered but stuck. update.go:1977] Rebooting node 2022-06-30T02:45:21.060670363+00:00 stderr F I0630 02:45:21.060638 796222 update.go:2007] Removing SIGTERM protection 2022-06-30T02:45:21.060772523+00:00 stderr F I0630 02:45:21.060734 796222 update.go:1977] initiating reboot: Node will reboot into config rendered-master-212bbb30b9272b2bff610f503c45c2d4 2022-06-30T02:45:22.426016359+00:00 stderr F I0630 02:45:22.425077 796222 daemon.go:934] Shutting down MachineConfigDaemon Version-Release number of MCO (Machine Config Operator) (if applicable): Platform : Openshift SNO on baremetal Are you certain that the root cause of the issue being reported is the MCO (Machine Config Operator)? Not sure How reproducible: Not sure yet. The node is stuck. Steps to Reproduce: 1. Deploy SNO with DU profile (rt kernel, 4 cpus reserved for platform, sriov, pao, logging, local-storage, ptp operators are deployed) 2. Launch test application. Test app specs are here: https://gitlab.cee.redhat.com/cnf/ran-test-app/-/blob/master/README.md 3. Trigger OCP upgrade from 4.10.20 to 4.11.0-rc.0. Actual results: upgrade stuck at machine-config Expected results: upgrade succeeded. Additional info: 1. Please consider attaching a must-gather archive (via oc adm must-gather). Please review must-gather contents for sensitive information before attaching any must-gathers to a Bugzilla report. You may also mark the bug private if you wish. ** must-gather stuck. $ oc adm must-gather [must-gather ] OUT the server is currently unable to handle the request (get imagestreams.image.openshift.io must-gather) [must-gather ] OUT [must-gather ] OUT Using must-gather plug-in image: registry.redhat.io/openshift4/ose-must-gather:latest When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information. ClusterID: eb9dfd6e-f481-4c9c-94c7-e171457f30fa ClusterVersion: Updating to "4.11.0-rc.0" from "4.10.20" for 14 hours: Working towards 4.11.0-rc.0: 678 of 802 done (84% complete) ClusterOperators: clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details [must-gather ] OUT namespace/openshift-must-gather-6bgrf created [must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-tqw4v created [must-gather ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-tqw4v deleted [must-gather ] OUT namespace/openshift-must-gather-6bgrf deleted When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information. ClusterID: eb9dfd6e-f481-4c9c-94c7-e171457f30fa ClusterVersion: Updating to "4.11.0-rc.0" from "4.10.20" for 14 hours: Working towards 4.11.0-rc.0: 678 of 802 done (84% complete) ClusterOperators: clusteroperator/machine-config is not upgradeable because One or more machine config pools are updating, please see `oc get mcp` for further details I0630 10:43:12.855484 294602 request.go:665] Waited for 1.112940325s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/machineconfiguration.openshift.io/v1?timeout=32s I0630 10:43:22.957101 294602 request.go:665] Waited for 2.363324029s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s I0630 10:43:33.493485 294602 request.go:665] Waited for 1.160487998s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/image.openshift.io/v1?timeout=32s Gathering data for ns/openshift-config... I0630 10:43:43.632459 294602 request.go:665] Waited for 1.355360452s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s Gathering data for ns/openshift-config-managed... I0630 10:43:55.217536 294602 request.go:665] Waited for 1.152238621s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s Gathering data for ns/openshift-authentication... I0630 10:44:24.782914 294602 request.go:665] Waited for 1.165069933s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/project.openshift.io/v1?timeout=32s Gathering data for ns/openshift-authentication-operator... I0630 10:44:55.663023 294602 request.go:665] Waited for 1.164915612s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/build.openshift.io/v1?timeout=32s Gathering data for ns/openshift-ingress... I0630 10:45:26.716237 294602 request.go:665] Waited for 1.153017709s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/build.openshift.io/v1?timeout=32s Gathering data for ns/openshift-oauth-apiserver... I0630 10:46:28.969526 294602 request.go:665] Waited for 1.144644393s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s Gathering data for ns/openshift-machine-api... I0630 10:49:35.345802 294602 request.go:665] Waited for 1.157733546s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s I0630 10:49:46.370165 294602 request.go:665] Waited for 1.119115839s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/node.k8s.io/v1beta1?timeout=32s I0630 10:49:56.381139 294602 request.go:665] Waited for 2.164421119s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s I0630 10:50:07.375210 294602 request.go:665] Waited for 1.163590657s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s I0630 10:50:17.420440 294602 request.go:665] Waited for 2.116520782s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/console.openshift.io/v1?timeout=32s Gathering data for ns/openshift-cloud-controller-manager-operator... I0630 10:51:25.320869 294602 request.go:665] Waited for 1.130220946s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s Gathering data for ns/openshift-cloud-controller-manager... I0630 10:51:37.085617 294602 request.go:665] Waited for 1.142090368s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/authorization.openshift.io/v1?timeout=32s Gathering data for ns/openshift-cloud-credential-operator... I0630 10:52:43.645444 294602 request.go:665] Waited for 1.155097855s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/quota.openshift.io/v1?timeout=32s I0630 10:52:54.447605 294602 request.go:665] Waited for 1.074508271s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/local.storage.openshift.io/v1alpha1?timeout=32s I0630 10:53:04.548578 294602 request.go:665] Waited for 2.363965936s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/quota.openshift.io/v1?timeout=32s I0630 10:53:15.075122 294602 request.go:665] Waited for 1.165457327s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 10:53:25.204234 294602 request.go:665] Waited for 2.52231623s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/discovery.k8s.io/v1?timeout=32s I0630 10:53:35.531610 294602 request.go:665] Waited for 1.111852868s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/machine.openshift.io/v1beta1?timeout=32s I0630 10:53:45.595112 294602 request.go:665] Waited for 2.364439956s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/quota.openshift.io/v1?timeout=32s I0630 10:53:56.123518 294602 request.go:665] Waited for 1.163525552s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s I0630 10:54:06.248639 294602 request.go:665] Waited for 2.520724339s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/local.storage.openshift.io/v1?timeout=32s I0630 10:54:16.577607 294602 request.go:665] Waited for 1.115648455s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/storage.k8s.io/v1beta1?timeout=32s I0630 10:54:26.639432 294602 request.go:665] Waited for 2.363268685s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s I0630 10:54:37.169591 294602 request.go:665] Waited for 1.162501452s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s I0630 10:54:47.297514 294602 request.go:665] Waited for 2.522570931s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/metal3.io/v1alpha1?timeout=32s I0630 10:54:57.622659 294602 request.go:665] Waited for 1.128991429s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/events.k8s.io/v1beta1?timeout=32s I0630 10:55:07.685249 294602 request.go:665] Waited for 2.362913891s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/authorization.openshift.io/v1?timeout=32s I0630 10:55:18.210089 294602 request.go:665] Waited for 1.136462189s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/template.openshift.io/v1?timeout=32s I0630 10:55:28.336045 294602 request.go:665] Waited for 2.518674786s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/admissionregistration.k8s.io/v1?timeout=32s I0630 10:55:38.673335 294602 request.go:665] Waited for 1.12273545s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/machine.openshift.io/v1beta1?timeout=32s I0630 10:55:48.739676 294602 request.go:665] Waited for 2.362928396s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s I0630 10:55:59.265118 294602 request.go:665] Waited for 1.166205557s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 10:56:09.392196 294602 request.go:665] Waited for 2.52630517s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/k8s.ovn.org/v1?timeout=32s I0630 10:56:19.717412 294602 request.go:665] Waited for 1.108719387s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/network.operator.openshift.io/v1?timeout=32s I0630 10:56:29.782508 294602 request.go:665] Waited for 2.362060191s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 10:56:40.311048 294602 request.go:665] Waited for 1.164439898s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 10:56:50.435484 294602 request.go:665] Waited for 2.520510366s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/snapshot.storage.k8s.io/v1?timeout=32s I0630 10:57:00.898594 294602 request.go:665] Waited for 1.122649166s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/node.k8s.io/v1?timeout=32s I0630 10:57:11.096044 294602 request.go:665] Waited for 2.365049566s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s Gathering data for ns/openshift-config-operator... I0630 10:57:45.446683 294602 request.go:665] Waited for 1.165750329s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s I0630 10:57:56.288475 294602 request.go:665] Waited for 1.115593748s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/autoscaling/v2beta2?timeout=32s I0630 10:58:06.386182 294602 request.go:665] Waited for 2.362392673s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s I0630 10:58:16.911859 294602 request.go:665] Waited for 1.164982193s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s Gathering data for ns/openshift-console-operator... I0630 10:58:48.769644 294602 request.go:665] Waited for 1.161807794s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s Gathering data for ns/openshift-console... I0630 10:59:00.582647 294602 request.go:665] Waited for 1.166190188s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/user.openshift.io/v1?timeout=32s Gathering data for ns/openshift-cluster-storage-operator... I0630 11:01:09.012555 294602 request.go:665] Waited for 1.161756731s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s I0630 11:01:19.930771 294602 request.go:665] Waited for 1.094433947s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/sriovfec.intel.com/v2?timeout=32s Gathering data for ns/openshift-dns-operator... I0630 11:02:26.735569 294602 request.go:665] Waited for 1.145569277s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 11:02:37.533106 294602 request.go:665] Waited for 1.123411735s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/sriovnetwork.openshift.io/v1?timeout=32s Gathering data for ns/openshift-dns... I0630 11:04:15.886968 294602 request.go:665] Waited for 1.122846532s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s I0630 11:04:26.712613 294602 request.go:665] Waited for 1.120478217s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/coordination.k8s.io/v1?timeout=32s Gathering data for ns/openshift-etcd-operator... I0630 11:05:02.806827 294602 request.go:665] Waited for 1.166340636s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/authorization.openshift.io/v1?timeout=32s Gathering data for ns/openshift-etcd... I0630 11:10:42.825816 294602 request.go:665] Waited for 1.16472433s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/build.openshift.io/v1?timeout=32s I0630 11:10:53.740948 294602 request.go:665] Waited for 1.119343391s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/machineconfiguration.openshift.io/v1?timeout=32s I0630 11:11:03.857999 294602 request.go:665] Waited for 2.363834647s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/template.openshift.io/v1?timeout=32s I0630 11:11:14.421783 294602 request.go:665] Waited for 1.165248251s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/build.openshift.io/v1?timeout=32s Gathering data for ns/openshift-image-registry... I0630 11:13:18.655609 294602 request.go:665] Waited for 1.166576812s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/quota.openshift.io/v1?timeout=32s Gathering data for ns/openshift-ingress-operator... I0630 11:14:20.761520 294602 request.go:665] Waited for 1.117392189s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s I0630 11:14:31.775616 294602 request.go:665] Waited for 1.120642365s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/coordination.k8s.io/v1?timeout=32s I0630 11:14:41.787680 294602 request.go:665] Waited for 2.166128064s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/authorization.openshift.io/v1?timeout=32s Gathering data for ns/openshift-ingress-canary... I0630 11:15:07.854690 294602 request.go:665] Waited for 1.165763071s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/apps.openshift.io/v1?timeout=32s Gathering data for ns/openshift-insights... I0630 11:15:38.851519 294602 request.go:665] Waited for 1.118131835s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s I0630 11:15:51.427579 294602 request.go:665] Waited for 1.124934139s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s I0630 11:16:01.599524 294602 request.go:665] Waited for 2.364174156s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/route.openshift.io/v1?timeout=32s Gathering data for ns/openshift-kube-apiserver-operator... I0630 11:16:40.905409 294602 request.go:665] Waited for 1.165771618s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/project.openshift.io/v1?timeout=32s Gathering data for ns/openshift-kube-apiserver... I0630 11:23:53.710063 294602 request.go:665] Waited for 1.047851299s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/oauth.openshift.io/v1?timeout=32s I0630 11:24:05.183974 294602 request.go:665] Waited for 1.106091181s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/local.storage.openshift.io/v1?timeout=32s I0630 11:24:15.353772 294602 request.go:665] Waited for 2.36489411s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/packages.operators.coreos.com/v1?timeout=32s I0630 11:24:26.107986 294602 request.go:665] Waited for 1.131169935s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/security.openshift.io/v1?timeout=32s I0630 11:24:36.878779 294602 request.go:665] Waited for 1.11633811s due to client-side throttling, not priority and fairness, request: GET:https://api.test-sno-1.lab.eng.rdu2.redhat.com:6443/apis/machine.openshift.io/v1beta1?timeout=32s Gathering data for ns/openshift-kube-controller-manager... ** pod logs cannot be accessed via oc cmd. Following is the end of the machine config logs. $ oc logs -n openshift-machine-config-operator machine-config-daemon-czltn machine-config-daemon Error from server (InternalError): Internal error occurred: Authorization error (user=system:kube-apiserver, verb=get, resource=nodes, subresource=proxy) By looking at pod logs on server, it seems the node got stuck at rebooting. 2022-06-30T02:45:15.700903222+00:00 stdout F Run "systemctl reboot" to start a reboot 2022-06-30T02:45:16.676546917+00:00 stderr F I0630 02:45:16.676219 796222 update.go:1977] Initiating switch from kernel realtime to realtime 2022-06-30T02:45:16.756636820+00:00 stderr F I0630 02:45:16.756584 796222 update.go:1977] Updating rt-kernel packages on host: ["update"] 2022-06-30T02:45:16.799364708+00:00 stderr F I0630 02:45:16.797676 796222 update.go:1962] Running: rpm-ostree update 2022-06-30T02:45:19.106876308+00:00 stdout F Checking out tree 188b97d...done 2022-06-30T02:45:19.186682609+00:00 stdout F Enabled rpm-md repositories: coreos-extensions 2022-06-30T02:45:19.192144631+00:00 stdout F rpm-md repo 'coreos-extensions' (cached); generated: 2022-06-27T23:38:35Z 2022-06-30T02:45:19.544713587+00:00 stdout F Importing rpm-md...done 2022-06-30T02:45:19.567872700+00:00 stdout F Resolving dependencies...done 2022-06-30T02:45:19.567946539+00:00 stdout F No upgrade available. 2022-06-30T02:45:21.052523320+00:00 stderr F I0630 02:45:21.052195 796222 update.go:1977] Rebooting node 2022-06-30T02:45:21.060670363+00:00 stderr F I0630 02:45:21.060638 796222 update.go:2007] Removing SIGTERM protection 2022-06-30T02:45:21.060772523+00:00 stderr F I0630 02:45:21.060734 796222 update.go:1977] initiating reboot: Node will reboot into config rendered-master-212bbb30b9272b2bff610f503c45c2d4 2022-06-30T02:45:22.426016359+00:00 stderr F I0630 02:45:22.425077 796222 daemon.go:934] Shutting down MachineConfigDaemon # last reboot reboot system boot 4.18.0-372.9.1.r Thu Jun 30 02:48 still running reboot system boot 4.18.0-305.49.1. Thu Jun 30 02:11 - 02:46 (00:35) 2. If a must-gather is unavailable, please provide the output of: $ oc get co machine-config -o yaml apiVersion: config.openshift.io/v1 kind: ClusterOperator metadata: annotations: exclude.release.openshift.io/internal-openshift-hosted: "true" include.release.openshift.io/self-managed-high-availability: "true" include.release.openshift.io/single-node-developer: "true" creationTimestamp: "2022-06-29T17:32:09Z" generation: 1 name: machine-config ownerReferences: - apiVersion: config.openshift.io/v1 kind: ClusterVersion name: version uid: 8fb7c8de-30b1-4e73-a8c9-c22c0f1dc50a resourceVersion: "324846" uid: e47d1b8f-0bd1-4df6-919d-af050de83a78 spec: {} status: conditions: - lastTransitionTime: "2022-06-30T02:32:40Z" message: Working towards 4.11.0-rc.0 status: "True" type: Progressing - lastTransitionTime: "2022-06-30T02:38:25Z" message: One or more machine config pools are updating, please see `oc get mcp` for further details reason: PoolUpdating status: "False" type: Upgradeable - lastTransitionTime: "2022-06-29T17:47:53Z" status: "False" type: Degraded - lastTransitionTime: "2022-06-29T17:47:54Z" message: Cluster has deployed [{operator 4.10.20}] status: "True" type: Available extension: master: 0 (ready 0) out of 1 nodes are updating to latest configuration rendered-master-212bbb30b9272b2bff610f503c45c2d4 worker: all 0 nodes are at latest configuration rendered-worker-1edcda45fdce3bf53a58ed6c4240b9a1 relatedObjects: - group: "" name: openshift-machine-config-operator resource: namespaces - group: machineconfiguration.openshift.io name: "" resource: machineconfigpools - group: machineconfiguration.openshift.io name: "" resource: controllerconfigs - group: machineconfiguration.openshift.io name: "" resource: kubeletconfigs - group: machineconfiguration.openshift.io name: "" resource: containerruntimeconfigs - group: machineconfiguration.openshift.io name: "" resource: machineconfigs - group: "" name: "" resource: nodes - group: "" name: openshift-kni-infra resource: namespaces - group: "" name: openshift-openstack-infra resource: namespaces - group: "" name: openshift-ovirt-infra resource: namespaces - group: "" name: openshift-vsphere-infra resource: namespaces versions: - name: operator version: 4.10.20 $ oc get mcp (and oc describe mcp/${degraded_pool} if pools are degraded) NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE master rendered-master-89d6c7214ba7d725b4fc5ef6ca90a7dd False True False 1 0 0 0 21h worker rendered-worker-1edcda45fdce3bf53a58ed6c4240b9a1 True False False 0 0 0 0 21h $ oc get mc NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE 00-master 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 00-worker 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 01-master-container-runtime 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 01-master-kubelet 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 01-worker-container-runtime 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 01-worker-kubelet 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 02-master-workload-partitioning 3.2.0 21h 04-accelerated-container-startup-master 3.2.0 21h 04-accelerated-container-startup-worker 3.2.0 21h 05-chronyd-dynamic-master 3.2.0 21h 05-chronyd-dynamic-worker 3.2.0 21h 06-kdump-enable-master 3.2.0 21h 06-kdump-enable-worker 3.2.0 21h 50-master-dnsmasq-configuration 2.2.0 21h 50-masters-chrony-configuration 2.2.0 21h 50-nto-master 3.2.0 20h 50-performance-openshift-node-performance-profile 3.2.0 20h 50-workers-chrony-configuration 2.2.0 21h 98-just-a-test 3.2.0 20h 99-master-generated-crio-seccomp-use-default 3.2.0 21h 99-master-generated-kubelet 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 20h 99-master-generated-registries 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 99-master-ssh 3.2.0 21h 99-worker-generated-crio-seccomp-use-default 3.2.0 21h 99-worker-generated-registries 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 21h 99-worker-ssh 3.2.0 21h container-mount-namespace-and-kubelet-conf-master 3.2.0 21h container-mount-namespace-and-kubelet-conf-worker 3.2.0 21h load-sctp-module-master 2.2.0 21h load-sctp-module-worker 2.2.0 21h rendered-master-212bbb30b9272b2bff610f503c45c2d4 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 12h rendered-master-218f5c48beb67310d12f4515b8730143 c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 15h rendered-master-520e8b478f45a630f7416bf7ea81f4ea c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 13h rendered-master-84b534be57edf1bc926648d86304f490 c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 20h rendered-master-89d6c7214ba7d725b4fc5ef6ca90a7dd c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 13h rendered-master-8c27739c8a0c413cff40d3e9e9a8c05f c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 20h rendered-master-901c4d8a629d3af906cc82ff5590fb29 c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 21h rendered-master-f3768ff5decc9e6f07f30cccf0aa4f6b c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 20h rendered-worker-02b840c1bd66613f9d9a4aaa4d07b906 c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 15h rendered-worker-1edcda45fdce3bf53a58ed6c4240b9a1 4aa21da3efc9b49ff4d8fec0dfe5a8598a19332a 3.2.0 12h rendered-worker-838ef9a68093ae2eb8d407531ade0887 c3ac7f07f1cae32456a7ab361e29f49af7eb0802 3.2.0 21h $ oc get pod -n openshift-machine-config-operator NAME READY STATUS RESTARTS AGE machine-config-controller-b8d7fbbc7-rzds7 2/2 Running 0 12h machine-config-daemon-czltn 2/2 Running 0 12h machine-config-operator-6d48c84476-gnvpr 1/1 Running 0 12h machine-config-server-xgnk5 1/1 Running 0 12h $ oc get node -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME test-sno-1.lab.eng.rdu2.redhat.com Ready master,worker 21h v1.23.5+3afdacb 10.1.190.48 <none> Red Hat Enterprise Linux CoreOS 410.84.202206212304-0 (Ootpa) 4.18.0-305.49.1.rt7.121.el8_4.x86_64 cri-o://1.23.3-6.rhaos4.10.git74543e3.el8 3. If a node is not accessible via API, please provide console/journal/kubelet logs of the problematic node tarball for /var/log: https://drive.google.com/drive/folders/1SzDn7OBld7zcMy6wGYmJqwgQZ1g8YFSi?usp=sharing 4. Are there RHEL nodes on the cluster? If yes, please upload the whole Ansible logs or Jenkins job
It looks like this is another instance of https://bugzilla.redhat.com/show_bug.cgi?id=2100456#c3 We're working on a fix with the NetworkManager team. *** This bug has been marked as a duplicate of bug 2100456 ***
Reproducible from 4.10.23 to 4.11.0. mcp updating got stuck at shutting down system. Upgrade eventually completed after manually rebooting the server. 2022-08-15T18:11:20.571928782+00:00 stdout F Removed: 2022-08-15T18:11:20.572000238+00:00 stdout F ModemManager-glib-1.10.8-2.el8.x86_64 2022-08-15T18:11:20.572071556+00:00 stdout F libmbim-1.20.2-1.el8.x86_64 2022-08-15T18:11:20.572389870+00:00 stdout F libqmi-1.24.0-1.el8.x86_64 2022-08-15T18:11:20.572442417+00:00 stdout F openvswitch2.16-2.16.0-76.el8fdp.x86_64 2022-08-15T18:11:20.572482569+00:00 stdout F redhat-release-coreos-410.84-2.el8.x86_64 2022-08-15T18:11:20.572522012+00:00 stdout F Added: 2022-08-15T18:11:20.572560226+00:00 stdout F WALinuxAgent-udev-2.3.0.2-2.el8_6.2.noarch 2022-08-15T18:11:20.572598527+00:00 stdout F glibc-gconv-extra-2.28-189.5.el8_6.x86_64 2022-08-15T18:11:20.572636793+00:00 stdout F libbpf-0.4.0-3.el8.x86_64 2022-08-15T18:11:20.572675518+00:00 stdout F nvme-cli-1.16-3.el8.x86_64 2022-08-15T18:11:20.572713555+00:00 stdout F openvswitch2.17-2.17.0-31.el8fdp.x86_64 2022-08-15T18:11:20.572751647+00:00 stdout F redhat-release-8.6-0.1.el8.x86_64 2022-08-15T18:11:20.572791421+00:00 stdout F redhat-release-eula-8.6-0.1.el8.x86_64 2022-08-15T18:11:20.572829516+00:00 stdout F shadow-utils-subid-2:4.6-16.el8.x86_64 2022-08-15T18:11:20.573613294+00:00 stdout F Run "systemctl reboot" to start a reboot 2022-08-15T18:11:22.282168508+00:00 stderr F I0815 18:11:22.281258 824440 update.go:1991] Initiating switch from kernel realtime to realtime 2022-08-15T18:11:22.288012328+00:00 stderr F I0815 18:11:22.287967 824440 update.go:1991] Updating rt-kernel packages on host: ["update"] 2022-08-15T18:11:22.291666674+00:00 stderr F I0815 18:11:22.291639 824440 update.go:1976] Running: rpm-ostree update 2022-08-15T18:11:24.414831307+00:00 stdout F Checking out tree 5e84581...done 2022-08-15T18:11:24.488883838+00:00 stdout F Enabled rpm-md repositories: coreos-extensions 2022-08-15T18:11:24.496463584+00:00 stdout F rpm-md repo 'coreos-extensions' (cached); generated: 2022-08-03T11:48:07Z 2022-08-15T18:11:24.711890306+00:00 stdout F Importing rpm-md...done 2022-08-15T18:11:24.724547635+00:00 stdout F Resolving dependencies...done 2022-08-15T18:11:24.724787364+00:00 stdout F No upgrade available. 2022-08-15T18:11:26.553489762+00:00 stderr F I0815 18:11:26.553443 824440 update.go:1991] Rebooting node 2022-08-15T18:11:26.560119163+00:00 stderr F I0815 18:11:26.560071 824440 update.go:2021] Removing SIGTERM protection 2022-08-15T18:11:26.560209041+00:00 stderr F I0815 18:11:26.560191 824440 update.go:1991] initiating reboot: Node will reboot into config rendered-master-563530429f1165228713dedc9658c3ab 2022-08-15T18:11:27.212174684+00:00 stderr F I0815 18:11:27.210247 824440 daemon.go:934] Shutting down MachineConfigDaemon