Tested with the PR: bound PVs for prometheus and scheduled both prometheus pods onto the same node; Upgradeable is False now.

# oc -n openshift-monitoring get pod -o wide | grep prometheus-k8s
prometheus-k8s-0   7/7   Running   0   116s   10.128.2.25   ip-10-0-246-211.us-west-1.compute.internal   <none>   <none>
prometheus-k8s-1   7/7   Running   0   116s   10.128.2.26   ip-10-0-246-211.us-west-1.compute.internal   <none>   <none>

# oc -n openshift-monitoring get pvc
NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-k8s-db-prometheus-k8s-0   Bound    pvc-64ba0456-9b74-4537-97df-93459b2d3bcf   10Gi       RWO            gp2            2m17s
prometheus-k8s-db-prometheus-k8s-1   Bound    pvc-b2fa154d-2fd3-4df4-9172-14607b9b9623   10Gi       RWO            gp2            2m17s

# token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
# oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://prometheus-k8s.openshift-monitoring.svc:9091/api/v1/alerts' | jq
...
      "alerts": [
        {
          "labels": {
            "alertname": "HighlyAvailableWorkloadIncorrectlySpread",
            "namespace": "openshift-monitoring",
            "severity": "warning",
            "workload": "prometheus-k8s"
          },
          "annotations": {
            "description": "Workload openshift-monitoring/prometheus-k8s is incorrectly spread across multiple nodes which breaks high-availability requirements. Since the workload is using persistent volumes, manual intervention is needed. Please follow the guidelines provided in the runbook of this alert to fix this issue.",
            "runbook_url": "https://github.com/openshift/runbooks/blob/master/alerts/HighlyAvailableWorkloadIncorrectlySpread.md",
            "summary": "Highly-available workload is incorrectly spread across multiple nodes and manual intervention is needed."
          },
          "state": "pending",
          "activeAt": "2021-11-18T03:40:35.421613011Z",
          "value": "1e+00"
        },

# oc adm upgrade
Cluster version is 4.9.0-0.ci.test-2021-11-18-023919-ci-ln-9b5cn5t-latest

Upgradeable=False

  Reason: WorkloadSinglePointOfFailure
  Message: Cluster operator monitoring should not be upgraded between minor versions: Highly-available workload in namespace openshift-monitoring, with label map["app.kubernetes.io/name":"prometheus"] and persistent storage enabled has a single point of failure. Manual intervention is needed to upgrade to the next minor version. For each highly-available workload that has a single point of failure please mark at least one of their PersistentVolumeClaim for deletion by annotating them with map["openshift.io/cluster-monitoring-drop-pvc":"yes"].

warning: Cannot display available updates:
  Reason: NoChannel
  Message: The update channel has not been configured.

# oc get co monitoring -oyaml
...
  - lastTransitionTime: "2021-11-18T03:40:10Z"
    message: |-
      Highly-available workload in namespace openshift-monitoring, with label map["app.kubernetes.io/name":"prometheus"] and persistent storage enabled has a single point of failure. Manual intervention is needed to upgrade to the next minor version. For each highly-available workload that has a single point of failure please mark at least one of their PersistentVolumeClaim for deletion by annotating them with map["openshift.io/cluster-monitoring-drop-pvc":"yes"].
    reason: WorkloadSinglePointOfFailure
    status: "False"
    type: Upgradeable
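As a side note, the Upgradeable condition shown in the full `oc get co monitoring -oyaml` output above can also be read back directly with a JSONPath filter instead of scanning the whole YAML. A minimal sketch (requires a live cluster and a logged-in `oc` session):

```shell
# Print only the status of the Upgradeable condition on the monitoring
# ClusterOperator; it reads "False" while the single-point-of-failure
# condition from this bug is present.
oc get clusteroperator monitoring \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].status}{"\n"}'
```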
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.8 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:4712
In case this bug needs a documentation update: only annotating the PVC with map["openshift.io/cluster-monitoring-drop-pvc":"yes"] cannot set Upgradeable back to True, and the PVC will be recreated quickly.
Correction to comment 7: annotating the PVC with map["openshift.io/cluster-monitoring-drop-pvc":"yes"] can set Upgradeable back to True. I added the annotation by editing one of the PVCs.
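For completeness, the annotation can also be applied non-interactively with `oc annotate` instead of `oc edit`. A sketch, assuming the PVC names from the `oc get pvc` output above (requires a live cluster and a logged-in `oc` session):

```shell
# Mark one of the two prometheus-k8s PVCs for deletion, as requested by the
# WorkloadSinglePointOfFailure message; --overwrite is only needed if the
# annotation is already present with a different value.
oc -n openshift-monitoring annotate pvc prometheus-k8s-db-prometheus-k8s-0 \
    openshift.io/cluster-monitoring-drop-pvc=yes --overwrite
```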