Description of problem:
During an upgrade from 4.8.29 to 4.9.19, the Prometheus operator failed to roll out updates to Prometheus and Alertmanager. As a result, Prometheus became unavailable.

Version-Release number of selected component (if applicable):
4.9.19
IPI install on vSphere in VMC

How reproducible:
Unknown; this was encountered once on the vSphere build cluster.

Steps to Reproduce:
1. Upgrade from 4.8.29 to 4.9.19

Actual results:
Prometheus and the associated cluster metrics and performance dashboards became unavailable.

Expected results:
The Prometheus operator should update the statefulsets without intervention.

Additional info:
The Prometheus operator log reported the failures below, and the statefulsets were stuck at Prometheus (0/2 available) and Alertmanager (0/3 available). The issue was remediated by deleting the statefulsets and letting the operator recreate them.

~~~
level=info ts=2022-02-08T17:19:39.289456243Z caller=operator.go:804 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2022-02-08T17:19:39.295713026Z caller=operator.go:742 component=alertmanageroperator key=openshift-monitoring/main msg="sync alertmanager"
level=info ts=2022-02-08T17:19:39.330897841Z caller=operator.go:804 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2022-02-08T17:19:39.42114441Z caller=operator.go:1306 component=prometheusoperator key=openshift-monitoring/k8s statefulset=prometheus-k8s shard=0 msg="recreating StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2022-02-08T17:19:39.426608664Z caller=operator.go:1221 component=prometheusoperator key=openshift-monitoring/k8s msg="sync prometheus"
level=info ts=2022-02-08T17:19:39.590102973Z caller=operator.go:1306 component=prometheusoperator key=openshift-monitoring/k8s statefulset=prometheus-k8s shard=0 msg="recreating StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2022-02-08T17:24:44.340355154Z caller=operator.go:1221 component=prometheusoperator key=openshift-monitoring/k8s msg="sync prometheus"
~~~
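For anyone triaging a similar cluster, a minimal sketch of how the repeated "recreating ... StatefulSet" messages could be counted from a saved copy of the operator log. This is an illustration only, not part of the operator or any OpenShift tooling: the helper names `parse_logfmt` and `count_recreations` are hypothetical, and the script assumes the log uses the `key=value` / `key="quoted value"` layout seen in the excerpt above.

```python
import re
from collections import Counter

# Matches logfmt pairs such as key=value or key="quoted value".
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Return the key/value pairs of one logfmt line as a dict (hypothetical helper)."""
    fields = {}
    for key, value in PAIR_RE.findall(line):
        # Strip the surrounding double quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def count_recreations(lines):
    """Count 'recreating ... StatefulSet' messages per (component, key)."""
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        msg = fields.get("msg", "")
        if msg.startswith("recreating") and "StatefulSet" in msg:
            counts[(fields.get("component"), fields.get("key"))] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the operator log in this report.
    sample = """\
level=info ts=2022-02-08T17:19:39.289456243Z caller=operator.go:804 component=alertmanageroperator key=openshift-monitoring/main msg="recreating AlertManager StatefulSet because the update operation wasn't possible" reason="Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"
level=info ts=2022-02-08T17:19:39.295713026Z caller=operator.go:742 component=alertmanageroperator key=openshift-monitoring/main msg="sync alertmanager"
"""
    for (component, key), n in count_recreations(sample.splitlines()).items():
        print(component, key, n)
```

A count that keeps growing for the same component and key while the statefulset stays at 0 available replicas would be consistent with the stuck state described here; a single occurrence on its own is probably not conclusive.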
From the operator logs, it's probably the same bug that has been reported in https://bugzilla.redhat.com/show_bug.cgi?id=2030539.
Upgraded from 4.8.29 to 4.9.0-0.nightly-2022-05-11-100812; the issue did not reproduce and monitoring works well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.33 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:2206