apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    role: alert-rules
  name: controller-manager-alerts-monitor
spec:
  groups:
  - name: node-feature-discovery-operator.rules
    rules:
    - alert: NFDDegraded
      annotations:
        message: |
          The Node Feature Discovery Operator is degraded. Review the "NodeFeatureDiscovery" CustomResource object for further details.
      expr: nfd_degraded_info == 1
      for: 1h
      labels:
        severity: warning

https://github.com/openshift/cluster-nfd-operator/blob/6da1f491fcf72e710ef36ef1686[…]ger-alerts-monitor_monitoring.coreos.com_v1_prometheusrule.yaml

Greenwave CVP test is complaining about it, and I have no idea how to make it happy:

Error: Value controller-manager-alerts-monitor: error validating object: metadata.namespace: Forbidden: not allowed on this type.

http://external-ci-coldstorage.datahub.redhat.com/cvp/cvp-redhat-operator-bundle-image-validation[…]erator-metadata-linting-bundle-image-output.txt
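The error above points at a metadata.namespace field on the object being validated. As a rough sketch only, assuming the manifest has already been parsed into a Python dict (for example by a YAML loader), a check like the following could flag or drop that field before the bundle is built. The helper names `has_namespace` and `strip_namespace` and the namespace value are hypothetical, and this is not the CVP validator's own logic.

```python
import copy


def has_namespace(manifest: dict) -> bool:
    """Return True if the manifest sets metadata.namespace."""
    metadata = manifest.get("metadata") or {}
    return "namespace" in metadata


def strip_namespace(manifest: dict) -> dict:
    """Return a deep copy of the manifest without metadata.namespace."""
    cleaned = copy.deepcopy(manifest)
    metadata = cleaned.get("metadata") or {}
    metadata.pop("namespace", None)
    return cleaned


# Hypothetical input: the rule from this report with a namespace added
# to reproduce the kind of object the validator rejects.
rule = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "PrometheusRule",
    "metadata": {
        "labels": {"role": "alert-rules"},
        "name": "controller-manager-alerts-monitor",
        "namespace": "example-namespace",  # hypothetical value
    },
}

cleaned = strip_namespace(rule)
```

The copy is deliberate: the original dict is left untouched so the before and after manifests can be compared.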
Build tested: 4.10.0-0.nightly-2021-10-04-213416

Verified with the following tests:
1. Create/deploy the bundle, install, and create an NFD instance.
2. Check the worker node labels.
3. Add a worker node and check the labels of the new worker node.
4. Deploy GPU operator (v1.8.2) and run GPU burn test.
5. Uninstall GPU operator and check the worker node labels.
6. Remove the node and check the worker node labels.
7. Log creation and stability (log file size did not increase over time).
8. Check pods in the openshift-etcd namespace for restarts (there were no restarts).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.10.3 extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0057