Version: Upgrade from 4.6.3 to 4.7.0-0.nightly-2020-11-11-080140

$ ./openshift-baremetal-install version
./openshift-baremetal-install 4.6.3
built from commit a4f0869e0d2a5b2d645f0f28ef9e4b100fa8f779
release image registry.svc.ci.openshift.org/ocp/release@sha256:14986d2b9c112ca955aaa03f7157beadda0bd3c089e5e1d56f28020d2dd55c52

Platform: IPI Baremetal

What happened?
Upgrade procedure stuck on "the cluster operator machine-config has not yet successfully rolled out".

What did you expect to happen?
Upgrade procedure to pass successfully.

How to reproduce it (as minimally and precisely as possible)?
1. Mirror release image to the disconnected registry.
2. Create ImageContentSourcePolicy.
3. Create ConfigMap for image signature.
4. Create custom upgrade graph.
5. Point CVO to custom upgrade graph.
6. Upgrade to 4.7 nightly.

Anything else we need to know?
I will attach must-gather.
From comment 1's must-gather:

$ yaml2json <cluster-scoped-resources/config.openshift.io/clusteroperators/machine-config.yaml | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message' | sort
2020-11-11T14:36:06Z Upgradeable=True AsExpected:
2020-11-11T19:55:08Z Available=False : Cluster not available for 4.7.0-0.nightly-2020-11-11-080140
2020-11-11T19:57:12Z Progressing=True : Working towards 4.7.0-0.nightly-2020-11-11-080140
2020-11-11T20:03:31Z Degraded=True MachineConfigControllerFailed: Unable to apply 4.7.0-0.nightly-2020-11-11-080140: timed out waiting for the condition during waitForControllerConfigToBeCompleted: controllerconfig is not completed: ControllerConfig has not completed: completed(false) running(false) failing(true)

Re-assigning to the machine-config folks.
From pods/machine-config-controller-744646d477-9r8l6/machine-config-controller/machine-config-controller/logs/current.log:

2020-11-12T07:42:04.871659204Z I1112 07:42:04.871509 1 template_controller.go:366] Error syncing controllerconfig machine-config-controller: failed to create MachineConfig for role master: failed to execute template: template: /etc/mcc/templates/common/on-prem/files/NetworkManager-resolv-prepender.yaml:52:22: executing "/etc/mcc/templates/common/on-prem/files/NetworkManager-resolv-prepender.yaml" at <.DNS.Spec.BaseDomain>: nil pointer evaluating *v1.DNS.Spec
There's been some churn in this template recently; see:
https://github.com/openshift/machine-config-operator/commits/f41b1d2ae7feea9aedfbd62143baefdf950c8569/templates/common/on-prem/files/NetworkManager-resolv-prepender.yaml

There's also another bug that I think this could be duped into:
https://bugzilla.redhat.com/show_bug.cgi?id=1901376

I'll assign this one to the same author, Ben, and let him dupe as he sees fit.
Even though this bug was opened first, there's a bit more discussion over on the other one, so let's track this there.

*** This bug has been marked as a duplicate of bug 1901376 ***