Bug 2060726 - Compliance operator does not generate alert notification for non-control namespace
Summary: Compliance operator does not generate alert notification for non-control name...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Compliance Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.12.0
Assignee: Matt Rogers
QA Contact: xiyuan
Docs Contact: Jeana Routh
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-04 06:15 UTC by Prashant Dhamdhere
Modified: 2022-12-22 19:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
* Previously, the Compliance Operator hard-coded notifications to the default namespace. As a result, notifications from the Operator would not appear if the Operator was installed in a different namespace. This issue is fixed in this release. (link:https://bugzilla.redhat.com/show_bug.cgi?id=2060726[*BZ#2060726*])
Clone Of:
Environment:
Last Closed: 2022-11-02 16:00:53 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ComplianceAsCode compliance-operator pull 70 | 0 | None | open | Bug 2060726: Use namespace when creating ServiceMonitor | 2022-07-18 15:26:51 UTC
Red Hat Product Errata | RHBA-2022:6657 | 0 | None | None | None | 2022-11-02 16:01:09 UTC

Description Prashant Dhamdhere 2022-03-04 06:15:17 UTC
Description of problem:
The Compliance Operator does not generate alert notifications if the operator is deployed in a
non-control namespace.

#  oc create -f - << EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: ScanSettingBinding
> metadata:
>   name: moderate-test
> profiles:
>   - name: ocp4-moderate
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
> settingsRef:
>   name: default
>   kind: ScanSetting
>   apiGroup: compliance.openshift.io/v1alpha1
> EOF
scansettingbinding.compliance.openshift.io/moderate-test created

# oc get suite 
NAME            PHASE   RESULT
moderate-test   DONE    NON-COMPLIANT

# oc get pods
NAME                                         READY   STATUS      RESTARTS      AGE
aggregator-pod-ocp4-moderate                 0/1     Completed   0             71s
compliance-operator-6fb484b5cd-g244t         1/1     Running     1 (10m ago)   10m
ocp4-compliance-test-pp-75d888d7db-2wgnv     1/1     Running     0             9m43s
ocp4-moderate-api-checks-pod                 0/2     Completed   0             113s
rhcos4-compliance-test-pp-867b989956-p6snl   1/1     Running     0             9m43s

# oc get ccr -ncompliance-test |grep compliance-notification
ocp4-moderate-compliance-notification-enabled                           PASS     medium

# oc get prometheusrules --all-namespaces -o json | jq '[.items[] | select(.metadata.name =="compliance") | .metadata.name]'
[
  "compliance",
  "compliance"
]

# oc get route alertmanager-main -n openshift-monitoring
NAME                HOST/PORT                                                                         PATH   SERVICES            PORT   TERMINATION          WILDCARD
alertmanager-main   alertmanager-main-openshift-monitoring.apps.coci632.qe.devcluster.openshift.com   /api   alertmanager-main   web    reencrypt/Redirect   None

# ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')

# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)"  https://$ALERT_MANAGER/api/v1/alerts |jq '.data[] | select(.labels.alertname | contains("NonCompliant"))'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 33720    0 33720    0     0  57928      0 --:--:-- --:--:-- --:--:-- 57938


Version-Release number of selected component (if applicable):
4.10.0 + compliance-operator.v0.1.48

How reproducible:
Always

Steps to Reproduce:

1. Deploy the latest version of the Compliance Operator
2. Create a ScanSettingBinding object

# oc project compliance-test
Now using project "compliance-test" on server "https://api.coci632.qe.devcluster.openshift.com:6443".

#  oc create -f - << EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: ScanSettingBinding
> metadata:
>   name: moderate-test
> profiles:
>   - name: ocp4-moderate
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
> settingsRef:
>   name: default
>   kind: ScanSetting
>   apiGroup: compliance.openshift.io/v1alpha1
> EOF
scansettingbinding.compliance.openshift.io/moderate-test created
 
3. Once the scan completes, check the suite and the ccr

# oc get suite 
NAME            PHASE   RESULT
moderate-test   DONE    NON-COMPLIANT

# oc get pods
NAME                                         READY   STATUS      RESTARTS      AGE
aggregator-pod-ocp4-moderate                 0/1     Completed   0             71s
compliance-operator-6fb484b5cd-g244t         1/1     Running     1 (10m ago)   10m
ocp4-compliance-test-pp-75d888d7db-2wgnv     1/1     Running     0             9m43s
ocp4-moderate-api-checks-pod                 0/2     Completed   0             113s
rhcos4-compliance-test-pp-867b989956-p6snl   1/1     Running     0             9m43s

# oc get ccr -ncompliance-test |grep compliance-notification
ocp4-moderate-compliance-notification-enabled                           PASS     medium

4. Check whether the alert notification is generated

# oc get prometheusrules --all-namespaces -o json | jq '[.items[] | select(.metadata.name =="compliance") | .metadata.name]'
[
  "compliance",
  "compliance"
]

# oc get route alertmanager-main -n openshift-monitoring
NAME                HOST/PORT                                                                         PATH   SERVICES            PORT   TERMINATION          WILDCARD
alertmanager-main   alertmanager-main-openshift-monitoring.apps.coci632.qe.devcluster.openshift.com   /api   alertmanager-main   web    reencrypt/Redirect   None

# ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')

# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)"  https://$ALERT_MANAGER/api/v1/alerts |jq '.data[] | select(.labels.alertname | contains("NonCompliant"))'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 33720    0 33720    0     0  57928      0 --:--:-- --:--:-- --:--:-- 57938


Actual results:
The Compliance Operator does not generate alert notifications if the operator is deployed in a non-control namespace.

Expected results:
The Compliance Operator should generate alert notifications when it is deployed in a non-control namespace as well.
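
The expected behaviour can be checked per namespace instead of reading the raw alert list. The following is a minimal sketch under stated assumptions: the helper name `noncompliant_suites_in_ns` is hypothetical, it matches the alert name exactly rather than with `contains`, and it relies only on the `alertname`, `namespace`, and `name` labels that appear in the alert output in this report.

```shell
# Hypothetical helper: read the Alertmanager /api/v1/alerts response on stdin
# and print the suite name of each NonCompliant alert raised for the
# namespace given as the first argument. Prints nothing if there is none.
noncompliant_suites_in_ns() {
  jq -r --arg ns "$1" \
    '.data[]
     | select(.labels.alertname == "NonCompliant" and .labels.namespace == $ns)
     | .labels.name'
}

# Usage against a live cluster, reusing ALERT_MANAGER from the steps above:
#   curl -sk -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" \
#     "https://$ALERT_MANAGER/api/v1/alerts" | noncompliant_suites_in_ns compliance-test
```

With the behaviour reported here, the call for `compliance-test` would print nothing, while the same call for `openshift-compliance` would print `moderate-test`.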

Additional info:

1. Does not generate alert notification for non-control namespace


# oc project compliance-test
Now using project "compliance-test" on server "https://api.coci632.qe.devcluster.openshift.com:6443".

# oc get csv
NAME                          DISPLAY               VERSION   REPLACES   PHASE
compliance-operator.v0.1.48   Compliance Operator   0.1.48               Succeeded

# oc get pods
NAME                                         READY   STATUS    RESTARTS        AGE
compliance-operator-6fb484b5cd-g244t         1/1     Running   1 (8m17s ago)   8m54s
ocp4-compliance-test-pp-75d888d7db-2wgnv     1/1     Running   0               7m38s
rhcos4-compliance-test-pp-867b989956-p6snl   1/1     Running   0               7m38s

#  oc create -f - << EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: ScanSettingBinding
> metadata:
>   name: moderate-test
> profiles:
>   - name: ocp4-moderate
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
> settingsRef:
>   name: default
>   kind: ScanSetting
>   apiGroup: compliance.openshift.io/v1alpha1
> EOF
scansettingbinding.compliance.openshift.io/moderate-test created

# oc get suite -w
NAME            PHASE       RESULT
moderate-test   LAUNCHING   NOT-AVAILABLE
moderate-test   RUNNING     NOT-AVAILABLE
moderate-test   LAUNCHING   NOT-AVAILABLE
moderate-test   RUNNING     NOT-AVAILABLE
moderate-test   AGGREGATING   NOT-AVAILABLE
moderate-test   DONE          NON-COMPLIANT
moderate-test   DONE          NON-COMPLIANT

# oc get suite 
NAME            PHASE   RESULT
moderate-test   DONE    NON-COMPLIANT

# oc get pods
NAME                                         READY   STATUS      RESTARTS      AGE
aggregator-pod-ocp4-moderate                 0/1     Completed   0             71s
compliance-operator-6fb484b5cd-g244t         1/1     Running     1 (10m ago)   10m
ocp4-compliance-test-pp-75d888d7db-2wgnv     1/1     Running     0             9m43s
ocp4-moderate-api-checks-pod                 0/2     Completed   0             113s
rhcos4-compliance-test-pp-867b989956-p6snl   1/1     Running     0             9m43s

# oc get ccr -ncompliance-test |grep compliance-notification
ocp4-moderate-compliance-notification-enabled                           PASS     medium

# oc get prometheusrules --all-namespaces -o json | jq '[.items[] | select(.metadata.name =="compliance") | .metadata.name]'
[
  "compliance",
  "compliance"
]

# oc get route alertmanager-main -n openshift-monitoring
NAME                HOST/PORT                                                                         PATH   SERVICES            PORT   TERMINATION          WILDCARD
alertmanager-main   alertmanager-main-openshift-monitoring.apps.coci632.qe.devcluster.openshift.com   /api   alertmanager-main   web    reencrypt/Redirect   None

# ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')

# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)"  https://$ALERT_MANAGER/api/v1/alerts |jq '.data[] | select(.labels.alertname | contains("NonCompliant"))'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 33720    0 33720    0     0  57928      0 --:--:-- --:--:-- --:--:-- 57938


2. Generates alert notification for openshift-compliance namespace only

# oc project openshift-compliance
Now using project "openshift-compliance" on server "https://api.coci632.qe.devcluster.openshift.com:6443".

# oc get csv
NAME                          DISPLAY               VERSION   REPLACES   PHASE
compliance-operator.v0.1.48   Compliance Operator   0.1.48               Succeeded

# oc get pods 
NAME                                              READY   STATUS    RESTARTS        AGE
compliance-operator-7f46b76c5d-26h2g              1/1     Running   1 (2m32s ago)   3m9s
ocp4-openshift-compliance-pp-8469fd7544-hhr64     1/1     Running   0               112s
rhcos4-openshift-compliance-pp-6b45c98fd8-rsjrp   1/1     Running   0               112s

#  oc create -f - << EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: moderate-test
profiles:
  - name: ocp4-moderate
    kind: Profile
    apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
EOF
scansettingbinding.compliance.openshift.io/moderate-test created

# oc get suite
NAME            PHASE   RESULT
moderate-test   DONE    NON-COMPLIANT

# oc get pods
NAME                                              READY   STATUS      RESTARTS        AGE
aggregator-pod-ocp4-moderate                      0/1     Completed   0               4m57s
compliance-operator-7f46b76c5d-26h2g              1/1     Running     1 (8m23s ago)   9m
ocp4-moderate-api-checks-pod                      0/2     Completed   0               5m37s
ocp4-openshift-compliance-pp-8469fd7544-hhr64     1/1     Running     0               7m43s
rhcos4-openshift-compliance-pp-6b45c98fd8-rsjrp   1/1     Running     0               7m43s

# oc get ccr -nopenshift-compliance |grep compliance-notification
ocp4-moderate-compliance-notification-enabled                           PASS     medium

# oc get prometheusrules --all-namespaces -o json | jq '[.items[] | select(.metadata.name =="compliance") | .metadata.name]'
[
  "compliance",
  "compliance"
]

# curl -k -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)"  https://$ALERT_MANAGER/api/v1/alerts |jq '.data[] | select(.labels.alertname | contains("NonCompliant"))'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 35936    0 35936    0     0  63191      0 --:--:-- --:--:-- --:--:-- 63267
{
  "labels": {
    "alertname": "NonCompliant",
    "endpoint": "metrics-co",
    "instance": "10.129.0.196:8585",
    "job": "metrics",
    "name": "moderate-test",
    "namespace": "openshift-compliance",
    "openshift_io_alert_source": "platform",
    "pod": "compliance-operator-7f46b76c5d-26h2g",
    "prometheus": "openshift-monitoring/k8s",
    "service": "metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "The compliance suite moderate-test returned as NON-COMPLIANT, ERROR, or INCONSISTENT",
    "summary": "The cluster is out-of-compliance"
  },
  "startsAt": "2022-03-04T05:55:00.825Z",
  "endsAt": "2022-03-04T06:03:37.902Z",
  "generatorURL": "https://prometheus-k8s-openshift-monitoring.apps.coci632.qe.devcluster.openshift.com/graph?g0.expr=compliance_operator_compliance_state%7Bname%3D~%22.%2B%22%7D+%3E+0&g0.tab=1",
  "status": {
    "state": "active",
    "silencedBy": null,
    "inhibitedBy": null
  },
  "receivers": [
    "Default"
  ],
  "fingerprint": "83c6a1886c47b1bd"
}

Comment 1 Jakub Hrozek 2022-03-10 13:47:27 UTC
It seems that everything should be created in the operator's namespace already; at least, looking at the patches that added the alerts, I don't see an obvious reason why it shouldn't work. Matt would probably know better, though.

That said, why do we try to test this use-case? IIRC even with ACM integration, the operator is installed into openshift-compliance and just watches resources in other namespaces, right?
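
One way to see where the operator's monitoring objects actually land is to list them with their namespaces rather than only their names (the prometheusrules query in the description prints names only). This is a minimal sketch, not the operator's own logic: the helper name `servicemonitor_namespaces` is hypothetical, and the ServiceMonitor name `metrics` in the usage comment is an assumption based on the `service` and `job` labels in the alert output.

```shell
# Hypothetical helper: read `oc get servicemonitor --all-namespaces -o json`
# output on stdin and print the namespace of every ServiceMonitor whose
# metadata.name equals the first argument.
servicemonitor_namespaces() {
  jq -r --arg name "$1" \
    '.items[] | select(.metadata.name == $name) | .metadata.namespace'
}

# Usage against a live cluster (the name "metrics" is an assumption):
#   oc get servicemonitor --all-namespaces -o json | servicemonitor_namespaces metrics
```

If the operator runs in `compliance-test` but no match is reported for that namespace, that would be consistent with a namespace being hard-coded somewhere in the creation path.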

Comment 2 Jakub Hrozek 2022-03-10 13:48:20 UTC
Lowering severity and unsetting blocker because this doesn't seem to be a super common use-case.

Comment 10 xiyuan 2022-09-23 06:10:42 UTC
Verification passed with 4.12.0-0.nightly-2022-09-22-153054 + compliance-operator.v0.1.55

1. Install the operator in a non-control namespace:
$ oc apply -f -<<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: co
  labels:
    openshift.io/cluster-monitoring: "true"
    security.openshift.io/scc.podSecurityLabelSync: "false"
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
   name: openshift-compliance-abcd
   namespace: co
spec:
   targetNamespaces:
   - co
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
   name: openshift-compliance-operator
   namespace: co
spec:
   channel: "release-0.1"
   installPlanApproval: Automatic
   name: compliance-operator
   source: qe-app-registry
   sourceNamespace: openshift-marketplace
EOF
namespace/co created
operatorgroup.operators.coreos.com/openshift-compliance-abcd created
subscription.operators.coreos.com/openshift-compliance-operator created
$ oc project co
Now using project "co" on server "https://api.xiyuan23-1.qe.azure.devcluster.openshift.com:6443".
$ oc get pod
NAME                                   READY   STATUS    RESTARTS      AGE
compliance-operator-75c4687f47-thjdr   1/1     Running   1 (22m ago)   3m
ocp4-co-pp-746bfb6c5c-d4c5h            1/1     Running   0             3m
rhcos4-co-pp-7c5946fdb9-d5bdb          1/1     Running   0             3m
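
Before looking for alerts, it may be worth confirming that the namespace carries the `openshift.io/cluster-monitoring: "true"` label set in the manifest above, on the assumption that the platform Prometheus (`openshift-monitoring/k8s` in the alert labels) only picks up namespaces that have it. A minimal sketch, with the hypothetical helper name `has_cluster_monitoring_label`:

```shell
# Hypothetical helper: read `oc get namespace <ns> -o json` output on stdin and
# succeed (exit 0) only if openshift.io/cluster-monitoring is set to "true".
has_cluster_monitoring_label() {
  jq -e '.metadata.labels["openshift.io/cluster-monitoring"] == "true"' >/dev/null
}

# Usage against a live cluster:
#   oc get namespace co -o json | has_cluster_monitoring_label && echo "label present"
```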

2. Create the ScanSettingBinding (ssb):
$ oc apply -f -<<EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: my-ssb-r
profiles:
  - name: ocp4-moderate
    kind: Profile
    apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
EOF
$ oc get suite
NAME       PHASE   RESULT
my-ssb-r   DONE    NON-COMPLIANT

3. Check the alert:
$ oc get route alertmanager-main -n openshift-monitoring
NAME                HOST/PORT                                                                                  PATH   SERVICES            PORT   TERMINATION          WILDCARD
alertmanager-main   alertmanager-main-openshift-monitoring.apps.xiyuan23-1.qe.azure.devcluster.openshift.com   /api   alertmanager-main   web    reencrypt/Redirect   None
$ ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')
$  curl -k -H "Authorization: Bearer $(oc create token prometheus-k8s -n openshift-monitoring)"  https://$ALERT_MANAGER/api/v1/alerts |jq '.data[] | select(.labels.alertname | contains("NonCompliant"))'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5490    0  5490    0     0   3188      0 --:--:--  0:00:01 --:--:--  3188
{
  "labels": {
    "alertname": "NonCompliant",
    "endpoint": "metrics-co",
    "instance": "10.130.0.75:8585",
    "job": "metrics",
    "name": "my-ssb-r",
    "namespace": "co",
    "openshift_io_alert_source": "platform",
    "pod": "compliance-operator-75c4687f47-thjdr",
    "prometheus": "openshift-monitoring/k8s",
    "service": "metrics",
    "severity": "warning"
  },
  "annotations": {
    "description": "The compliance suite my-ssb-r returned as NON-COMPLIANT, ERROR, or INCONSISTENT",
    "summary": "The cluster is out-of-compliance"
  },
  "startsAt": "2022-09-23T05:52:22.939Z",
  "endsAt": "2022-09-23T05:57:52.939Z",
  "generatorURL": "https:///console-openshift-console.apps.xiyuan23-1.qe.azure.devcluster.openshift.com/monitoring/graph?g0.expr=compliance_operator_compliance_state%7Bname%3D~%22.%2B%22%7D+%3E+0&g0.tab=1",
  "status": {
    "state": "active",
    "silencedBy": null,
    "inhibitedBy": null
  },
  "receivers": [
    "Default"
  ],
  "fingerprint": "0e7e6f43de393147"
}

Comment 12 errata-xmlrpc 2022-11-02 16:00:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Compliance Operator bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6657

