Bug 1960334

Summary: manifests: invalid selector in ServiceMonitor makes CVO hotloop
Product: OpenShift Container Platform Reporter: Vadim Rutkovsky <vrutkovs>
Component: Samples Assignee: Vadim Rutkovsky <vrutkovs>
Status: CLOSED ERRATA QA Contact: XiuJuan Wang <xiuwang>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.8   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 23:08:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1961518    

Description Vadim Rutkovsky 2021-05-13 17:06:58 UTC
An invalid `selector` is set in the ServiceMonitor for the samples-registry operator. This makes CVO re-apply the manifest on every sync iteration.
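
One plausible way such a hotloop can arise is sketched below, under assumptions this report does not confirm: if the API server drops selector fields that are not part of the schema, the stored object never equals the manifest, so a reconciler that compares the two sees a difference on every pass. The names `prune_selector` and `needs_apply`, and the example invalid selector, are hypothetical and do not reflect the actual CVO code or the actual manifest.

```python
# Toy model only: not the real CVO or API server logic.
# Assumption: the server keeps only known LabelSelector fields.
KNOWN_SELECTOR_FIELDS = {"matchLabels", "matchExpressions"}


def prune_selector(selector):
    """Return the selector as a schema-pruning server might store it."""
    return {k: v for k, v in selector.items() if k in KNOWN_SELECTOR_FIELDS}


def needs_apply(manifest_selector):
    """True if the stored selector would differ from the manifest."""
    return prune_selector(manifest_selector) != manifest_selector


# Hypothetical invalid selector: the label sits directly under `selector`.
invalid = {"name": "cluster-samples-operator"}
# Selector shape reported in Comment 3 after the fix.
valid = {"matchLabels": {"name": "cluster-samples-operator"}}

for _ in range(3):  # in this model, every sync iteration sees the same diff
    assert needs_apply(invalid)
assert not needs_apply(valid)
```

In this model the invalid manifest can never converge, because the stored object always lacks the pruned field; the valid one converges after the first apply.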

Comment 1 XiuJuan Wang 2021-05-17 10:33:59 UTC
Can't install openshift-samples with the PR-built image.
The must-gather log: http://virt-openshift-05.lab.eng.nay.redhat.com/xiuwang/pr374/

$ oc get co 
NAME                                       VERSION                                                  AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h16m
baremetal                                  4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
cloud-controller-manager                   4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
cloud-credential                           4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h46m
cluster-autoscaler                         4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
config-operator                            4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
console                                                                                                                                  
csi-snapshot-controller                    4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
dns                                        4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
etcd                                       4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h34m
image-registry                             4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h29m
ingress                                    4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h29m
insights                                   4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h29m
kube-apiserver                             4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h32m
kube-controller-manager                    4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h34m
kube-scheduler                             4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h34m
kube-storage-version-migrator              4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
machine-api                                4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h32m
machine-approver                           4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
machine-config                             4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h34m
marketplace                                4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
monitoring                                 4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      126m
network                                    4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h37m
node-tuning                                4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
openshift-apiserver                        4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h33m
openshift-controller-manager               4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
openshift-samples                                                                                                                        
operator-lifecycle-manager                 4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
operator-lifecycle-manager-catalog         4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m
operator-lifecycle-manager-packageserver   4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h33m
service-ca                                 4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h36m
storage                                    4.8.0-0.ci.test-2021-05-17-063626-ci-ln-6mxtkgb-latest   True        False         False      3h35m

Comment 3 XiuJuan Wang 2021-05-18 08:26:41 UTC
Tested on a 4.8.0-0.nightly-2021-05-17-231618 cluster; the servicemonitors.monitoring.coreos.com resource for the samples operator is updated.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-17-231618   True        False         5h15m   Cluster version is 4.8.0-0.nightly-2021-05-17-231618

https://prometheus-k8s-openshift-monitoring.apps.wxj-518svcmonitor.qe.devcluster.openshift.com/targets
serviceMonitor/openshift-cluster-samples-operator/cluster-samples-operator/0 (1/1 up)

$ oc get servicemonitors.monitoring.coreos.com -n openshift-cluster-samples-operator -o json | jq .items[0].spec
{
  "endpoints": [
    {
      "interval": "60s",
      "path": "/metrics",
      "scheme": "https",
      "targetPort": 60000,
      "tlsConfig": {
        "caFile": "/etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt",
        "serverName": "metrics.openshift-cluster-samples-operator.svc"
      }
    }
  ],
  "selector": {
    "matchLabels": {
      "name": "cluster-samples-operator"
    }
  }
}
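
The check on the spec printed above could be automated along these lines. This is a minimal sketch, assuming that a well-formed selector contains only the standard Kubernetes LabelSelector fields `matchLabels` and `matchExpressions`; the helper name `selector_problems` is hypothetical, and the embedded JSON is copied from the output above.

```python
import json

# Spec copied from the jq output in this comment.
SPEC_JSON = """
{
  "endpoints": [
    {
      "interval": "60s",
      "path": "/metrics",
      "scheme": "https",
      "targetPort": 60000,
      "tlsConfig": {
        "caFile": "/etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt",
        "serverName": "metrics.openshift-cluster-samples-operator.svc"
      }
    }
  ],
  "selector": {
    "matchLabels": {
      "name": "cluster-samples-operator"
    }
  }
}
"""

# Assumption: only these two fields are valid under `selector`.
ALLOWED = {"matchLabels", "matchExpressions"}


def selector_problems(spec):
    """Return a list of human-readable problems with spec['selector']."""
    selector = spec.get("selector")
    if not isinstance(selector, dict):
        return ["selector is missing or not an object"]
    problems = [f"unexpected selector field: {k}" for k in selector if k not in ALLOWED]
    labels = selector.get("matchLabels", {})
    if not isinstance(labels, dict) or not all(isinstance(v, str) for v in labels.values()):
        problems.append("matchLabels must map strings to strings")
    return problems


print(selector_problems(json.loads(SPEC_JSON)))  # prints []
```

An empty list means the selector has the expected shape; a selector such as `{"name": "..."}` placed directly under `selector` would be reported as an unexpected field.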

Comment 6 errata-xmlrpc 2021-07-27 23:08:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438