Bug 1999374 - Metrics not available on GUI for Compliance Operator
Summary: Metrics not available on GUI for Compliance Operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Compliance Operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Matt Rogers
QA Contact: Prashant Dhamdhere
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-31 04:01 UTC by xiyuan
Modified: 2021-11-10 07:37 UTC
CC: 4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-10 07:37:22 UTC
Target Upstream Version:
Embargoed:
xiyuan: needinfo-




Links
System ID Private Priority Status Summary Last Updated
Github openshift compliance-operator pull 694 0 None None None 2021-09-03 16:04:03 UTC
Red Hat Product Errata RHBA-2021:4530 0 None None None 2021-11-10 07:37:28 UTC

Description xiyuan 2021-08-31 04:01:26 UTC
Description of problem:
Install the Compliance Operator, trigger a scan via a ScanSettingBinding, and check the metrics from the CLI:
$  oc run --rm -i --restart=Never --image=registry.fedoraproject.org/fedora-minimal:latest -n openshift-compliance test-metrics -- bash -c 'curl -ks -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://metrics.openshift-compliance.svc:8585/metrics-co' | grep compliance
# HELP compliance_operator_compliance_remediation_status_total A counter for the total number of updates to the status of a ComplianceRemediation
# TYPE compliance_operator_compliance_remediation_status_total counter
compliance_operator_compliance_remediation_status_total{name="ocp4-moderate-oauth-or-oauthclient-token-maxage",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-chronyd-or-ntpd-set-maxpoll",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-chronyd-or-ntpd-specify-multiple-servers",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-chronyd-or-ntpd-specify-remote-server",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-configure-usbguard-auditbackend",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-coreos-vsyscall-kernel-argument",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-service-usbguard-enabled",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-master-usbguard-allow-hid-and-hub",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-chronyd-or-ntpd-set-maxpoll",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-chronyd-or-ntpd-specify-multiple-servers",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-chronyd-or-ntpd-specify-remote-server",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-configure-usbguard-auditbackend",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-coreos-vsyscall-kernel-argument",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-service-usbguard-enabled",state="NotApplied"} 1
compliance_operator_compliance_remediation_status_total{name="rhcos4-moderate-worker-usbguard-allow-hid-and-hub",state="NotApplied"} 1

However, after logging into the console and navigating to Observe -> Metrics, querying the metric compliance_operator_compliance_scan_status_total returns the error “No datapoints found”.
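To rule out a console-only rendering problem, the same metric can be queried directly against the cluster monitoring stack. This is a diagnostic sketch, assuming the default thanos-querier route in openshift-monitoring and a logged-in user with monitoring access:

```shell
# Query cluster monitoring directly for the metric (diagnostic sketch).
TOKEN=$(oc whoami -t)
HOST=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
curl -ks -H "Authorization: Bearer $TOKEN" \
  "https://$HOST/api/v1/query?query=compliance_operator_compliance_scan_status_total"
```

An empty "result" array here means Prometheus never ingested the metric, i.e. the problem is on the scrape/discovery side rather than in the console.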

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-29-010334 + compliance-operator.v0.1.39

How reproducible:
Always

Steps to Reproduce:
1. Install the Compliance Operator.
2. Trigger a scan with a ScanSettingBinding:
$ oc create -f -<<EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: my-ssb-r
profiles:
  - name: ocp4-moderate
    kind: Profile
    apiGroup: compliance.openshift.io/v1alpha1
  - name: rhcos4-moderate
    kind: Profile
    apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
EOF
3. Log in to the console, navigate to Observe -> Metrics, and query the metric compliance_operator_compliance_scan_status_total.

Actual results:
The query returns the error “No datapoints found” in the GUI.

Expected results:
The metrics are displayed in the GUI.

Additional info:
$ oc get namespace openshift-compliance --show-labels
NAME                   STATUS   AGE     LABELS
openshift-compliance   Active   3h30m   kubernetes.io/metadata.name=openshift-compliance,olm.operatorgroup.uid/331ab3eb-dfb4-47ef-8ecc-845e5d1a4d19=,olm.operatorgroup.uid/c04342d4-2aa4-4505-99ab-90cf3206e305=,openshift.io/cluster-monitoring=true
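The openshift.io/cluster-monitoring=true label is present, so the namespace is opted into cluster monitoring. The next thing to check is whether the prometheus-k8s service account can actually discover targets there; a quick check using `oc auth can-i` (should print "yes" for each resource once permissions are in place):

```shell
# Check whether the monitoring service account can discover scrape targets
# in the operator namespace.
SA=system:serviceaccount:openshift-monitoring:prometheus-k8s
for res in services endpoints pods; do
  echo -n "$res: "
  oc auth can-i list "$res" --as="$SA" -n openshift-compliance
done
```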
$ oc -n openshift-monitoring logs -c prometheus prometheus-k8s-0 | grep "compliance"
ts=2021-08-30T13:26:21.626Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:446: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:21.640Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:21.690Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:445: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:22.713Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:22.864Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:445: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:23.212Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:446: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:25.211Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:446: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:25.546Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:447: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"openshift-compliance\""
ts=2021-08-30T13:26:25.701Z caller=level.go:63 level=error component=k8s_client_runtime func=ErrorDepth msg="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:445: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:openshift-monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"openshift-compliance\""
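The forbidden errors above show that the prometheus-k8s service account has no RBAC in the openshift-compliance namespace, so Prometheus service discovery cannot see the metrics Service at all. A minimal sketch of the kind of grant that resolves these errors (the object names here are illustrative, not necessarily what the actual fix in the linked PR creates):

```yaml
# Illustrative Role/RoleBinding letting prometheus-k8s discover targets in
# openshift-compliance; names are hypothetical, not the operator's exact fix.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: openshift-compliance
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: openshift-compliance
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
```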

Comment 2 Matt Rogers 2021-09-03 16:04:03 UTC
Accepting and targeting 4.9 - Fix is ongoing https://github.com/openshift/compliance-operator/pull/694

Comment 11 Prashant Dhamdhere 2021-09-27 14:19:07 UTC
[Bug_verification]

Looks good to me. The Compliance Operator metrics are now reported in the GUI.

Verified on:

4.9.0-0.nightly-2021-09-25-094414 + compliance-operator.v0.1.41

$ oc get csv
NAME                              DISPLAY                            VERSION    REPLACES   PHASE
compliance-operator.v0.1.41       Compliance Operator                0.1.41                Succeeded
elasticsearch-operator.5.2.2-10   OpenShift Elasticsearch Operator   5.2.2-10              Succeeded

$ oc get pods 
NAME                                              READY   STATUS    RESTARTS      AGE
compliance-operator-656bb958f-kvzkk               1/1     Running   1 (16m ago)   16m
ocp4-openshift-compliance-pp-64dbd7c98f-8rcs5     1/1     Running   0             15m
rhcos4-openshift-compliance-pp-66575dc885-4sg4j   1/1     Running   0             15m


$ oc get pod compliance-operator-656bb958f-kvzkk -oyaml |grep -A3 "RELATED_IMAGE"
    - name: RELATED_IMAGE_OPENSCAP
      value: registry.redhat.io/compliance/openshift-compliance-openscap-rhel8@sha256:20656dd9b1e06a699f2294f4a9ac8e52606d9409d0ec75a055578e94117f4a5d
    - name: RELATED_IMAGE_OPERATOR
      value: registry.redhat.io/compliance/openshift-compliance-rhel8-operator@sha256:298e116c5840047f4c38ac9976a7016e52076b5398448fb977c83c1fae132d1d
    - name: RELATED_IMAGE_PROFILE
      value: registry.redhat.io/compliance/openshift-compliance-content-rhel8@sha256:b28ff0ae5ec3e8338ede1eea5379270f340e3b60d7bae509ebdf2b24f5289197
    - name: OPERATOR_CONDITION_NAME
      value: compliance-operator.v0.1.41

$ oc create -f -<<EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: ScanSettingBinding
> metadata:
>   name: my-ssb-r
> profiles:
>   - name: ocp4-moderate
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
>   - name: rhcos4-moderate
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
> settingsRef:
>   name: default
>   kind: ScanSetting
>   apiGroup: compliance.openshift.io/v1alpha1
> EOF
scansettingbinding.compliance.openshift.io/my-ssb-r created


$ oc get suite 
NAME       PHASE   RESULT
my-ssb-r   DONE    NON-COMPLIANT


$ oc get scan
NAME                     PHASE   RESULT
ocp4-moderate            DONE    NON-COMPLIANT
rhcos4-moderate-master   DONE    NON-COMPLIANT
rhcos4-moderate-worker   DONE    NON-COMPLIANT


$ oc run --rm -i --restart=Never --image=registry.fedoraproject.org/fedora-minimal:latest -n openshift-compliance test-metrics -- bash -c 'curl -ks -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://metrics.openshift-compliance.svc:8585/metrics-co' | grep "compliance_operator_compliance_scan_status_total"
# HELP compliance_operator_compliance_scan_status_total A counter for the total number of updates to the status of a ComplianceScan
# TYPE compliance_operator_compliance_scan_status_total counter
compliance_operator_compliance_scan_status_total{name="ocp4-moderate",phase="AGGREGATING",result="NOT-AVAILABLE"} 4
compliance_operator_compliance_scan_status_total{name="ocp4-moderate",phase="DONE",result="NON-COMPLIANT"} 1
compliance_operator_compliance_scan_status_total{name="ocp4-moderate",phase="LAUNCHING",result="NOT-AVAILABLE"} 1
compliance_operator_compliance_scan_status_total{name="ocp4-moderate",phase="PENDING",result=""} 1
compliance_operator_compliance_scan_status_total{name="ocp4-moderate",phase="RUNNING",result="NOT-AVAILABLE"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-master",phase="AGGREGATING",result="NOT-AVAILABLE"} 5
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-master",phase="DONE",result="NON-COMPLIANT"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-master",phase="LAUNCHING",result="NOT-AVAILABLE"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-master",phase="PENDING",result=""} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-master",phase="RUNNING",result="NOT-AVAILABLE"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-worker",phase="AGGREGATING",result="NOT-AVAILABLE"} 4
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-worker",phase="DONE",result="NON-COMPLIANT"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-worker",phase="LAUNCHING",result="NOT-AVAILABLE"} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-worker",phase="PENDING",result=""} 1
compliance_operator_compliance_scan_status_total{name="rhcos4-moderate-worker",phase="RUNNING",result="NOT-AVAILABLE"} 1

$ oc get ns openshift-compliance --show-labels
NAME                   STATUS   AGE   LABELS
openshift-compliance   Active   58m   kubernetes.io/metadata.name=openshift-compliance,olm.operatorgroup.uid/0ffdfe52-fbeb-44dc-97fa-4ccae9230644=,olm.operatorgroup.uid/978d5930-fc16-44df-98a1-a984d0c42634=,openshift.io/cluster-monitoring=true


Attaching some screenshots of the metrics displayed in the GUI.

Comment 18 errata-xmlrpc 2021-11-10 07:37:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Compliance Operator bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4530

