Bug 1822442

Summary: telemeter-client deployment should be rolled back after telemetry-config configmap rolled back
Product: OpenShift Container Platform
Reporter: Junqi Zhao <juzhao>
Component: Monitoring
Assignee: Pawel Krupa <pkrupa>
Status: CLOSED WONTFIX
QA Contact: Junqi Zhao <juzhao>
Severity: low
Docs Contact:
Priority: low
Version: 4.5
CC: alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, spasquie, surbania
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-09-17 06:53:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
telemetry-config configmap and telemeter-client deploy file none

Description Junqi Zhao 2020-04-09 02:49:43 UTC
Created attachment 1677413 [details]
telemetry-config configmap and telemeter-client deploy file


Description of problem:
To test whether metrics added to the telemetry-config configmap are loaded, do the following:

oc -n openshift-cluster-version scale deploy cluster-version-operator --replicas=0
oc -n openshift-monitoring scale deploy cluster-monitoring-operator --replicas=0
oc -n openshift-monitoring scale deploy telemeter-client --replicas=0

Add a metric to the telemetry-config configmap, for example:
oc -n openshift-monitoring edit configmap telemetry-config
and add
    # reports machine cpu cores
    - '{__name__="machine_cpu_cores"}'

Then scale cluster-monitoring-operator and telemeter-client back up:
oc -n openshift-monitoring scale deploy cluster-monitoring-operator --replicas=1
oc -n openshift-monitoring scale deploy telemeter-client --replicas=1


The change is now present in both the telemetry-config configmap and the telemeter-client deployment:
# oc -n openshift-monitoring get configmap telemetry-config -oyaml | grep "machine_cpu_cores"
    - '{__name__="machine_cpu_cores"}'

# oc -n openshift-monitoring get deploy telemeter-client -oyaml | grep "machine_cpu_cores"
        - --match={__name__="machine_cpu_cores"}

Restore the cluster by scaling cluster-version-operator back up. The added metric is then removed from the telemetry-config configmap but remains in the telemeter-client deployment:

oc -n openshift-cluster-version scale deploy cluster-version-operator --replicas=1


# oc -n openshift-monitoring get configmap telemetry-config -oyaml | grep "machine_cpu_cores"
no result

# oc -n openshift-monitoring get deploy telemeter-client -oyaml | grep "machine_cpu_cores"
        - --match={__name__="machine_cpu_cores"}
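The two grep checks above can be generalized into a drift check that does not depend on knowing the metric name in advance. The following is only a sketch: the function names are made up for illustration, and it assumes the YAML output has the same layout as shown in this report (single-quoted list items in the configmap, unquoted `--match=` arguments in the deployment).

```shell
#!/bin/sh
# Sketch: list --match rules that are present in the telemeter-client
# deployment but absent from the telemetry-config configmap.
# All function names here are hypothetical helpers, not part of oc.

# Read configmap YAML on stdin and print one matcher per line.
# Assumes each rule is a single-quoted list item:  - '{...}'
configmap_matches() {
  sed -n "s/^[[:space:]]*-[[:space:]]*'\({.*}\)'[[:space:]]*\$/\1/p" | sort -u
}

# Read deployment YAML on stdin and print one matcher per line.
# Assumes each rule is an unquoted argument:  - --match={...}
deploy_matches() {
  sed -n 's/^[[:space:]]*-[[:space:]]*--match=\({.*}\)[[:space:]]*$/\1/p' | sort -u
}

# stale_matches CONFIGMAP_FILE DEPLOY_FILE
# Print matchers found in DEPLOY_FILE but not in CONFIGMAP_FILE.
stale_matches() {
  cm_tmp=$(mktemp) && dp_tmp=$(mktemp) || return 1
  configmap_matches < "$1" > "$cm_tmp"
  deploy_matches < "$2" > "$dp_tmp"
  # comm -13 keeps only the lines unique to the second (deployment) list.
  comm -13 "$cm_tmp" "$dp_tmp"
  rm -f "$cm_tmp" "$dp_tmp"
}

# Example usage against a live cluster (same commands as in this report):
#   oc -n openshift-monitoring get configmap telemetry-config -oyaml > cm.yaml
#   oc -n openshift-monitoring get deploy telemeter-client -oyaml > deploy.yaml
#   stale_matches cm.yaml deploy.yaml
```

Any line printed by `stale_matches` is a rule the deployment still carries after the configmap was rolled back. Empty output means the two are in sync under the layout assumptions stated above.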

The CMO pod's log shows only "Updating Telemeter client".

Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-08-194554

How reproducible:
Always

Steps to Reproduce:
1. See the description above.

Actual results:
telemeter-client deployment is not rolled back

Expected results:
The telemeter-client deployment should be rolled back after the telemetry-config configmap is rolled back.

Additional info:

Comment 8 Sergiusz Urbaniak 2020-08-21 13:46:42 UTC
UpcomingSprint: not enough time/capacity to tackle the issue this sprint.

Comment 10 Pawel Krupa 2020-09-17 06:53:46 UTC
After internal discussions we decided that the current approach is good enough. The Telemeter deployment shouldn't be reconciled just to prevent users from changing matches. Even if a user manages to find a way to change them, we are still protected on the receiving side.