Description of problem:

This is a corner case; we could close it if the scenario is not valid.

With UWM and enableUserAlertmanagerConfig enabled in the cluster-monitoring-config configmap, and only the UWM Alertmanager enabled, a user project AlertmanagerConfig is not loaded into the UWM Alertmanager or the platform Alertmanager.

Enabled UWM and enableUserAlertmanagerConfig in the cluster-monitoring-config configmap:

# oc -n openshift-monitoring get cm cluster-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    enableUserWorkload: true
    alertmanagerMain:
      enableUserAlertmanagerConfig: true
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-22T02:47:57Z"
  name: cluster-monitoring-config
  namespace: openshift-monitoring
  resourceVersion: "73038"
  uid: bac78028-354f-4fdd-81a2-bb3b3601744b

Enabled the UWM Alertmanager only:

# oc -n openshift-user-workload-monitoring get cm user-workload-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    alertmanager:
      enabled: true
kind: ConfigMap
metadata:
  creationTimestamp: "2022-06-22T02:48:04Z"
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
  resourceVersion: "86637"
  uid: 96e32301-16d1-47e7-8ced-fe0668b678cb

Default alertmanager configuration in the UWM Alertmanager:

# oc -n openshift-user-workload-monitoring exec -c alertmanager alertmanager-user-workload-0 -- cat /etc/alertmanager/config/alertmanager.yaml
"receivers":
- "name": "Default"
"route":
  "group_by":
  - "namespace"
  "receiver": "Default"

Created an AlertmanagerConfig under user project ns1:
***********************
apiVersion: monitoring.coreos.com/v1beta1
kind: AlertmanagerConfig
metadata:
  name: example-routing
  namespace: ns1
spec:
  route:
    receiver: default
    groupBy: [job]
  receivers:
  - name: default
    webhookConfigs:
    - url: https://example.org/post
***********************

# oc -n ns1 get AlertmanagerConfig example-routing -oyaml
apiVersion: monitoring.coreos.com/v1beta1
kind: AlertmanagerConfig
metadata:
  creationTimestamp: "2022-06-22T02:59:37Z"
  generation: 1
  name: example-routing
  namespace: ns1
  resourceVersion: "64491"
  uid: e31c35ad-bf78-4bc3-ace5-63b4a1f70c65
spec:
  receivers:
  - name: default
    webhookConfigs:
    - url: https://example.org/post
  route:
    groupBy:
    - job
    receiver: default

The ns1 AlertmanagerConfig is not loaded into the UWM Alertmanager:

# oc -n openshift-user-workload-monitoring exec -c alertmanager alertmanager-user-workload-0 -- cat /etc/alertmanager/config/alertmanager.yaml
"receivers":
- "name": "Default"
"route":
  "group_by":
  - "namespace"
  "receiver": "Default"

It is not in the platform Alertmanager either:

# oc -n openshift-monitoring exec -c alertmanager alertmanager-main-0 -- cat /etc/alertmanager/config/alertmanager.yaml
"global":
  "resolve_timeout": "5m"
"inhibit_rules":
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = critical"
  "target_matchers":
  - "severity =~ warning|info"
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = warning"
  "target_matchers":
  - "severity = info"
- "equal":
  - "namespace"
  "source_matchers":
  - "alertname = InfoInhibitor"
  "target_matchers":
  - "severity = info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
- "name": "null"
"route":
  "group_by":
  - "namespace"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "Default"
  "repeat_interval": "12h"
  "routes":
  - "matchers":
    - "alertname = Watchdog"
    "receiver": "Watchdog"
  - "matchers":
    - "alertname = InfoInhibitor"
    "receiver": "null"
  - "matchers":
    - "severity = critical"
    "receiver": "Critical"

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-21-151125

How reproducible:
Always

Steps to Reproduce:
1. See the description.

Actual results:
With only the UWM Alertmanager enabled, the user project AlertmanagerConfig is not loaded into the UWM Alertmanager or the platform Alertmanager.

Expected results:
Not sure what the expected result should be.

Additional info:
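One way to check which AlertmanagerConfig resources the UWM Alertmanager is expected to pick up is to inspect the selectors on its Alertmanager CR (a diagnostic sketch; the CR name "user-workload" is inferred from the alertmanager-user-workload-0 pod above):

# oc -n openshift-user-workload-monitoring get alertmanager user-workload -o jsonpath='{.spec.alertmanagerConfigSelector}'
# oc -n openshift-user-workload-monitoring get alertmanager user-workload -o jsonpath='{.spec.alertmanagerConfigNamespaceSelector}'

If these selectors are unset, the operator would not select any user AlertmanagerConfig objects, which would match the behavior seen here (an assumption based on prometheus-operator selector semantics, not verified on this cluster).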
This is expected because settings from the UWM configmap take precedence. But CMO could surface the inconsistency in the Available condition with a specific reason/message (like we do already with the PrometheusDataPersistenceNotConfigured reason).
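For illustration, a minimal sketch of the non-conflicting setup, based on the configmaps in comment 0: keep 'alertmanager.enabled: true' in openshift-user-workload-monitoring/user-workload-monitoring-config and drop the 'enableUserAlertmanagerConfig' field from the platform configmap, since the dedicated UWM Alertmanager takes precedence anyway:

# oc -n openshift-monitoring get cm cluster-monitoring-config -oyaml
apiVersion: v1
data:
  config.yaml: |
    enableUserWorkload: true
    # alertmanagerMain.enableUserAlertmanagerConfig removed: the dedicated
    # Alertmanager in openshift-user-workload-monitoring handles user alerting
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring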
Tested with 4.12.0-0.nightly-2022-08-31-101631, following the steps in comment 0 (which do not attach a PV); the message is as below:

# oc get co monitoring -oyaml
...
  - lastTransitionTime: "2022-09-01T00:50:08Z"
    message: 'Prometheus is running without persistent storage which can lead to
      data loss during upgrades and cluster disruptions. Please refer to the official
      documentation to see how to configure storage for Prometheus: https://docs.openshift.com/container-platform/4.8/monitoring/configuring-the-monitoring-stack.html'
    reason: PrometheusDataPersistenceNotConfigured
    status: "False"
    type: Degraded

If we attach PVs for Prometheus and keep the same settings as comment 0, we see the UserAlertmanagerMisconfigured message. The issue is fixed; changing to VERIFIED.

# oc get co monitoring -oyaml
...
  - lastTransitionTime: "2022-09-01T09:54:38Z"
    message: 'Misconfigured Alertmanager: Alertmanager for user-defined alerting is
      enabled in the openshift-monitoring/cluster-monitoring-config configmap by setting
      ''enableUserAlertmanagerConfig: true'' field. This conflicts with a dedicated
      Alertmanager instance enabled in openshift-user-workload-monitoring/user-workload-monitoring-config.
      Alertmanager enabled in openshift-user-workload-monitoring takes precedence
      over the one in openshift-monitoring, so please remove the ''enableUserAlertmanagerConfig''
      field in openshift-monitoring/cluster-monitoring-config.'
    reason: UserAlertmanagerMisconfigured
    status: "False"
    type: Degraded
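For convenience when verifying, the Degraded condition message can be read directly instead of scanning the full clusteroperator output (one possible invocation using jsonpath filtering):

# oc get co monitoring -o jsonpath='{.status.conditions[?(@.type=="Degraded")].message}'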
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399