Description of problem:
This bug was found while verifying bug 1952744.

Enabled user workload monitoring and upgraded from 4.7.10 to 4.8.0-0.nightly-2021-05-10-225140. In a 4.7.10 cluster there is a prometheus servicemonitor under openshift-user-workload-monitoring. Since 4.8 the prometheus servicemonitor is renamed to prometheus-user-workload, so the old prometheus servicemonitor should be deleted, but it still exists after the upgrade to 4.8.

NOTE: this has no functional effect in an OpenShift cluster. However, since it reports "Error on ingesting samples with different value but same timestamp" in an OSD cluster, as bug 1952744 mentioned, we should remove the prometheus servicemonitor from the openshift-user-workload-monitoring project.

level=warn ts=2021-04-23T02:51:03.446Z caller=scrape.go:1375 component="scrape manager" scrape_pool=openshift-user-workload-monitoring/prometheus-user-workload/0 target=https://10.130.6.24:9091/metrics msg="Error on ingesting samples with different value but same timestamp" num_dropped=7

***********************************
# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.10    True        False         4m23s   Cluster version is 4.7.10

# oc -n openshift-user-workload-monitoring get servicemonitor
NAME                  AGE
prometheus            9m2s
prometheus-operator   9m18s
thanos-sidecar        9m2s
***********************************

after upgrade to 4.8.0-0.nightly-2021-05-10-225140
*************************************************
# oc -n openshift-user-workload-monitoring get servicemonitor
NAME                       AGE
prometheus                 71m
prometheus-operator        71m
prometheus-user-workload   29m
thanos-ruler               23m
thanos-sidecar             71m

# oc -n openshift-user-workload-monitoring get servicemonitor prometheus -oyaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: "2021-05-11T06:27:59Z"
  generation: 1
  labels:
    k8s-app: prometheus
  name: prometheus
  namespace: openshift-user-workload-monitoring
  resourceVersion: "29261"
  uid: 2e00db1d-e711-4d7c-bbae-9bb02edd18cd
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    bearerTokenSecret:
      key: ""
    interval: 30s
    port: metrics
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: prometheus-user-workload.openshift-user-workload-monitoring.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      prometheus: user-workload

# oc -n openshift-user-workload-monitoring get servicemonitor prometheus-user-workload -oyaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: "2021-05-11T07:10:02Z"
  generation: 1
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 2.26.0
  name: prometheus-user-workload
  namespace: openshift-user-workload-monitoring
  resourceVersion: "47597"
  uid: 86f2236a-ec57-47a9-84e2-e85370e11e63
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    bearerTokenSecret:
      key: ""
    interval: 30s
    port: metrics
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: prometheus-user-workload.openshift-user-workload-monitoring.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: openshift-monitoring
      prometheus: user-workload
*************************************************

Version-Release number of selected component (if applicable):
upgrade from 4.7.10 to 4.8.0-0.nightly-2021-05-10-225140

How reproducible:
always

Steps to Reproduce:
1. Enable user workload monitoring and upgrade from 4.7.10 to 4.8.0-0.nightly-2021-05-10-225140:
   oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-05-10-225140 --allow-explicit-upgrade=true --force
2.
3.

Actual results:

Expected results:

Additional info:
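The post-upgrade check described above can be sketched as a small shell helper. This is only an illustration, not something shipped with oc or the monitoring operator: the function name find_stale_servicemonitors and its argument convention are hypothetical, and it assumes the default two-column table output of "oc get servicemonitor" shown in this report.

```shell
# Hypothetical helper (not part of oc or the monitoring operator).
# Reads `oc get servicemonitor` table output on stdin and prints every
# resource name that exactly matches one of the names passed in $1.
find_stale_servicemonitors() {
    stale_names="$1"   # space-separated names expected to be gone after upgrade
    # NR > 1 skips the "NAME AGE" header row; $1 is the NAME column.
    awk 'NR > 1 { print $1 }' | while read -r name; do
        for stale in $stale_names; do
            # Exact comparison, so "prometheus" does not match
            # "prometheus-operator" or "prometheus-user-workload".
            if [ "$name" = "$stale" ]; then
                echo "$name"
            fi
        done
    done
}
```

It could be used as "oc -n openshift-user-workload-monitoring get servicemonitor | find_stale_servicemonitors prometheus". Any output means the old servicemonitor is still present after the upgrade. The exact-match comparison is deliberate, because the new names share the "prometheus" prefix.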
Thanks for pointing this out. This should be removed now in latest builds of master.
upgrade from 4.7.10 to 4.8.0-0.nightly-2021-05-21-233425, no prometheus servicemonitor now

4.7.10
# oc -n openshift-user-workload-monitoring get servicemonitor
NAME                       AGE
prometheus-operator        139m
prometheus-user-workload   74m
thanos-ruler               68m
thanos-sidecar             139m

upgrade to 4.8.0-0.nightly-2021-05-21-233425
# oc -n openshift-user-workload-monitoring get servicemonitor
NAME                       AGE
prometheus-operator        139m
prometheus-user-workload   74m
thanos-ruler               68m
thanos-sidecar             139m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438