Bug 2060083 - CMO doesn't react to changes in clusteroperator console
Summary: CMO doesn't react to changes in clusteroperator console
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Simon Pasquier
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks: 2062441 2062452
 
Reported: 2022-03-02 16:49 UTC by aaleman
Modified: 2022-08-10 10:52 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 10:51:53 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-monitoring-operator pull 1575 0 None open Bug 2060083: React to changes in clusteroperators 2022-03-02 17:03:52 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:52:26 UTC

Description aaleman 2022-03-02 16:49:46 UTC
Description of problem:

CMO uses the status of the clusteroperator console as input for the Alertmanager and Prometheus configuration (the `--web.external-url` command-line flag), but it doesn't react to changes in that status.


Version-Release number of selected component (if applicable):

4.10.0-rc.7


How reproducible:


Steps to Reproduce:
1. Disable the CVO and the console operator so that they do not restore the console status.
2. Patch the console resource: `oc proxy & curl -v -XPATCH -H "Accept: application/json" -H "Content-Type: application/merge-patch+json" -H "User-Agent: kubectl/v1.23.4 (linux/amd64) kubernetes/e6c093d" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/consoles/cluster/status?fieldManager=kubectl-edit' --data '{"status":null}'`
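
The two steps above can be sketched as a small shell helper. This is an illustration rather than part of the original report: the function names, the two-second wait for the proxy, and the cleanup of the proxy process are assumptions, while the scale-down commands and the PATCH request are the ones that appear in this bug.

```shell
# Build the URL of the console config's status subresource behind `oc proxy`.
# $1: proxy address (defaults to the address used in this report).
console_status_url() {
  printf 'http://%s/apis/config.openshift.io/v1/consoles/cluster/status?fieldManager=kubectl-edit\n' "${1:-127.0.0.1:8001}"
}

# Step 1: stop the operators that would otherwise restore the status.
scale_down_operators() {
  oc -n openshift-cluster-version scale deploy cluster-version-operator --replicas=0
  oc -n openshift-console-operator scale deploy console-operator --replicas=0
}

# Step 2: null out the console status through a temporary proxy.
patch_console_status_null() {
  oc proxy &
  proxy_pid=$!
  sleep 2   # assumption: the proxy is listening after two seconds
  curl -v -XPATCH \
    -H "Accept: application/json" \
    -H "Content-Type: application/merge-patch+json" \
    "$(console_status_url)" --data '{"status":null}'
  kill "$proxy_pid"
}
```

Running `scale_down_operators` followed by `patch_console_status_null` against a disposable cluster should reproduce the starting condition; remember to scale both deployments back to one replica afterwards.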

Actual results:

The Alertmanager and Prometheus statefulsets are not updated.


Expected results:

The Alertmanager and Prometheus statefulsets are updated.


Additional info:

Comment 5 Junqi Zhao 2022-03-14 11:23:06 UTC
Tested with 4.11.0-0.nightly-2022-03-13-055724; CMO reacts to changes in clusteroperator console.
Before setting status.consoleURL to null for console/cluster:
# oc get console/cluster -o jsonpath="{.status.consoleURL}"
https://console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com

# oc -n openshift-monitoring get pod | grep -E "alertmanager-main|prometheus-k8s|cluster-monitoring-operator"
alertmanager-main-0                           6/6     Running   0          6h50m
alertmanager-main-1                           6/6     Running   0          6h51m
cluster-monitoring-operator-84cd5d7c7-8rrmd   2/2     Running   0          7h10m
prometheus-k8s-0                              6/6     Running   0          6h50m
prometheus-k8s-1                              6/6     Running   0          6h51m
# oc -n openshift-monitoring get sts alertmanager-main -oyaml | grep "web.external-url"
        - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get sts prometheus-k8s -oyaml | grep "web.external-url"
        - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod alertmanager-main-1 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod prometheus-k8s-0 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod prometheus-k8s-1 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
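
The repeated `grep` checks above can be collapsed into one helper. The following is a minimal sketch, assuming the flag appears as a YAML list item exactly as in the output above; `external_url` and `check_all` are hypothetical names, and `check_all` needs a logged-in `oc` session.

```shell
# Print the value of --web.external-url found in a manifest read from stdin.
# Prints nothing when the flag is absent.
external_url() {
  sed -n 's/^[[:space:]]*-[[:space:]]*--web\.external-url=//p'
}

# Report the flag for every statefulset and pod checked in this comment.
check_all() {
  for res in sts/alertmanager-main sts/prometheus-k8s \
             pod/alertmanager-main-0 pod/alertmanager-main-1 \
             pod/prometheus-k8s-0 pod/prometheus-k8s-1; do
    printf '%s: %s\n' "$res" \
      "$(oc -n openshift-monitoring get "$res" -o yaml | external_url)"
  done
}
```

An empty value after the resource name corresponds to the "no result" lines in this comment.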

scale down CVO/console-operator
# oc -n openshift-cluster-version scale deploy cluster-version-operator --replicas=0
# oc -n openshift-console-operator scale deploy console-operator --replicas=0

# oc proxy
open another terminal
# curl -v -XPATCH  -H "Accept: application/json" -H "Content-Type: application/merge-patch+json" -H "User-Agent: kubectl/v1.23.4 (linux/amd64) kubernetes/e6c093d" 'http://127.0.0.1:8001/apis/config.openshift.io/v1/consoles/cluster/status?fieldManager=kubectl-edit' --data '{"status":null}'


# oc get console/cluster -o jsonpath="{.status.consoleURL}"
no result

Wait until all the pods are Running; the Alertmanager and Prometheus statefulsets get updated.
# oc -n openshift-monitoring get pod | grep -E "alertmanager-main|prometheus-k8s|cluster-monitoring-operator"
alertmanager-main-0                           6/6     Running   0          58s
alertmanager-main-1                           6/6     Running   0          90s
cluster-monitoring-operator-84cd5d7c7-8rrmd   2/2     Running   0          7h27m
prometheus-k8s-0                              6/6     Running   0          69s
prometheus-k8s-1                              6/6     Running   0          86s

# oc -n openshift-monitoring get sts alertmanager-main -oyaml | grep "web.external-url"
no result

# oc -n openshift-monitoring get sts prometheus-k8s -oyaml | grep "web.external-url"
        - --web.external-url=https://prometheus-k8s.openshift-monitoring.svc:9091

# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep "web.external-url"
no result

# oc -n openshift-monitoring get pod alertmanager-main-1 -oyaml | grep "web.external-url"
no result

# oc -n openshift-monitoring get pod prometheus-k8s-0 -oyaml | grep "web.external-url"
    - --web.external-url=https://prometheus-k8s.openshift-monitoring.svc:9091

# oc -n openshift-monitoring get pod prometheus-k8s-1 -oyaml | grep "web.external-url"
    - --web.external-url=https://prometheus-k8s.openshift-monitoring.svc:9091



restore cluster
# oc -n openshift-cluster-version scale deploy cluster-version-operator --replicas=1
# oc -n openshift-console-operator scale deploy console-operator --replicas=1
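
After the operators are scaled back up, the console URL takes some time to reappear. A generic polling helper is one way to wait for it instead of checking by hand; this is only a sketch, and `retry_until` and `console_url_set` are hypothetical names that are not part of `oc` or of CMO.

```shell
# retry_until ATTEMPTS DELAY CMD...
# Run CMD until it succeeds, sleeping DELAY seconds between tries.
# Returns 0 on success, 1 once ATTEMPTS tries have failed.
retry_until() {
  _attempts=$1
  _delay=$2
  shift 2
  _try=0
  while [ "$_try" -lt "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    _try=$((_try + 1))
    sleep "$_delay"
  done
  return 1
}

# True once the console operator has repopulated status.consoleURL.
console_url_set() {
  [ -n "$(oc get console/cluster -o jsonpath='{.status.consoleURL}')" ]
}
```

For example, `retry_until 30 10 console_url_set` would poll every ten seconds for up to five minutes; the attempt count and delay are arbitrary choices.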

Wait a while; the Alertmanager and Prometheus statefulsets get updated.
# oc get console/cluster -o jsonpath="{.status.consoleURL}"
https://console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com
# oc -n openshift-monitoring get pod | grep -E "alertmanager-main|prometheus-k8s|cluster-monitoring-operator"
alertmanager-main-0                           6/6     Running   0          86s
alertmanager-main-1                           6/6     Running   0          118s
cluster-monitoring-operator-84cd5d7c7-8rrmd   2/2     Running   0          7h35m
prometheus-k8s-0                              6/6     Running   0          97s
prometheus-k8s-1                              6/6     Running   0          114s
# oc -n openshift-monitoring get sts alertmanager-main -oyaml | grep "web.external-url"
        - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get sts prometheus-k8s -oyaml | grep "web.external-url"
        - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod alertmanager-main-1 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod prometheus-k8s-0 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring
# oc -n openshift-monitoring get pod prometheus-k8s-1 -oyaml | grep "web.external-url"
    - --web.external-url=https:/console-openshift-console.apps.qe-ui411-0314.qe.devcluster.openshift.com/monitoring

Comment 10 errata-xmlrpc 2022-08-10 10:51:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

