Bug 1993261
| Summary: | OCP 4.9: Node Feature Discovery (NFD) Operator - Creating, deleting, then recreating an NFD instance causes the controller manager to crash | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Courtney Pacheco <cpacheco> |
| Component: | Node Feature Discovery Operator | Assignee: | Carlos Eduardo Arango Gutierrez <carangog> |
| Status: | CLOSED ERRATA | QA Contact: | liqcui |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 4.9 | CC: | carangog, sejug |
| Target Milestone: | --- | | |
| Target Release: | 4.9.z | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-02-14 12:39:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Verified result: the NFD instance was re-created several times and no crash occurred.

[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc delete -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml
nodefeaturediscovery.nfd.openshift.io "nfd-instance" deleted
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc create -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml
nodefeaturediscovery.nfd.openshift.io/nfd-instance created
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ watch -d oc get pods -n openshift-nfd
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc logs nfd-controller-manager-6db8fbd7dd-jphpt -n openshift-nfd -c manager |tail -15
I0207 07:04:24.695107 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.701477 1 nodefeaturediscovery_controller.go:51] [Looking for Service ' nfd-master ' in Namespace ' openshift-nfd ']
I0207 07:04:24.701543 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.709543 1 nodefeaturediscovery_controller.go:51] [Looking for ServiceAccount ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:04:24.709594 1 nodefeaturediscovery_controller.go:51] [Found, skipping update]
I0207 07:04:24.709605 1 nodefeaturediscovery_controller.go:51] [Looking for Role ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:04:24.709622 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.714317 1 nodefeaturediscovery_controller.go:51] [Looking for RoleBinding nfd-worker in Namespace openshift-nfd]
I0207 07:04:24.714371 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.718740 1 nodefeaturediscovery_controller.go:51] [Looking for ConfigMap ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:04:24.718788 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.723058 1 nodefeaturediscovery_controller.go:51] [Looking for DaemonSet ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:04:24.723137 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:04:24.729994 1 nodefeaturediscovery_controller.go:51] [Looking for SecurityContextConstraints ' nfd-worker ' in Namespace 'default']
I0207 07:04:24.730078 1 nodefeaturediscovery_controller.go:51] [Found, updating]

[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc delete -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml
nodefeaturediscovery.nfd.openshift.io "nfd-instance" deleted
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc create -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml
nodefeaturediscovery.nfd.openshift.io/nfd-instance created
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ watch -d oc get pods -n openshift-nfd
[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc logs nfd-controller-manager-6db8fbd7dd-jphpt -n openshift-nfd -c manager |tail -15
I0207 07:07:50.353429 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.360650 1 nodefeaturediscovery_controller.go:51] [Looking for Service ' nfd-master ' in Namespace ' openshift-nfd ']
I0207 07:07:50.360713 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.368283 1 nodefeaturediscovery_controller.go:51] [Looking for ServiceAccount ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:07:50.368345 1 nodefeaturediscovery_controller.go:51] [Found, skipping update]
I0207 07:07:50.368355 1 nodefeaturediscovery_controller.go:51] [Looking for Role ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:07:50.368375 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.372657 1 nodefeaturediscovery_controller.go:51] [Looking for RoleBinding nfd-worker in Namespace openshift-nfd]
I0207 07:07:50.372703 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.376913 1 nodefeaturediscovery_controller.go:51] [Looking for ConfigMap ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:07:50.376955 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.382458 1 nodefeaturediscovery_controller.go:51] [Looking for DaemonSet ' nfd-worker ' in Namespace ' openshift-nfd ']
I0207 07:07:50.382550 1 nodefeaturediscovery_controller.go:51] [Found, updating]
I0207 07:07:50.390484 1 nodefeaturediscovery_controller.go:51] [Looking for SecurityContextConstraints ' nfd-worker ' in Namespace 'default']
I0207 07:07:50.390530 1 nodefeaturediscovery_controller.go:51] [Found, updating]

[ocpadmin@ec2-18-217-45-133 cluster-nfd-operator]$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.nightly-2022-02-05-025954 True False 78m Cluster version is 4.9.0-0.nightly-2022-02-05-025954

$ oc get pods -n openshift-nfd
NAME READY STATUS RESTARTS AGE
nfd-controller-manager-6db8fbd7dd-jphpt 2/2 Running 0 20m
nfd-master-dn5l9 1/1 Running 0 5m28s
nfd-master-mst5l 1/1 Running 0 5m28s
nfd-master-v9hfz 1/1 Running 0 5m28s
nfd-worker-qcsw6 1/1 Running 0 5m28s
nfd-worker-rblm7 1/1 Running 0 5m28s
nfd-worker-vp2zt 1/1 Running 0 5m28s

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.9.21 extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0489
Description of problem:
If you create, then delete, then recreate an NFD instance, the controller manager will crash.

Version-Release number of selected component (if applicable):
4.9

How reproducible:
Always

Steps to Reproduce:
1. `oc create -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml`
2. `oc delete -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml`
3. `oc create -f config/samples/nfd.openshift.io_v1_nodefeaturediscovery.yaml`

Actual results:
```
2021-08-12T15:29:22.053Z ERROR controller-runtime.manager.controller.nodefeaturediscovery Reconciler error {"reconciler group": "nfd.openshift.io", "reconciler kind": "NodeFeatureDiscovery", "name": "nfd-instance", "namespace": "openshift-nfd", "error": "Operation cannot be fulfilled on nodefeaturediscoveries.nfd.openshift.io \"nfd-instance\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/go-logr/zapr.(*zapLogger).Error
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/github.com/go-logr/zapr/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:267
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:198
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185
k8s.io/apimachinery/pkg/util/wait.UntilWithContext
    /go/src/github.com/openshift/cluster-nfd-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99
```

Expected results:
No crashing

Additional info:
GitHub issue here: https://github.com/openshift/cluster-nfd-operator/issues/200
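For context, the "the object has been modified" text in the reconciler error is the Kubernetes API server's standard response when an update carries a stale resourceVersion (an optimistic-concurrency conflict). This report does not say how the operator was changed, so the following is only a minimal, standard-library sketch of that mechanism and of the common "re-read and retry on conflict" pattern. The `store`, `object`, and `updateWithRetry` names are hypothetical stand-ins and are not taken from cluster-nfd-operator or client-go.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// object is a hypothetical stand-in for a stored custom resource.
type object struct {
	ResourceVersion int
	Status          string
}

// store is a toy API server that holds a single object.
type store struct {
	current object
}

// Get returns a copy of the stored object.
func (s *store) Get() object { return s.current }

// Update accepts a write only if the caller's ResourceVersion matches
// the stored one, then bumps the version.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

// updateWithRetry re-reads the latest object and reapplies mutate each
// time the write is rejected with a conflict, up to attempts times.
func updateWithRetry(s *store, attempts int, mutate func(*object)) error {
	err := errors.New("updateWithRetry: attempts must be at least 1")
	for i := 0; i < attempts; i++ {
		latest := s.Get()
		mutate(&latest)
		if err = s.Update(latest); err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	s := &store{current: object{ResourceVersion: 1}}

	// A reconciler reads the object...
	stale := s.Get()

	// ...and another writer updates it before the reconciler writes back.
	other := s.Get()
	other.Status = "recreated"
	_ = s.Update(other)

	// Writing the stale copy is rejected with a conflict.
	stale.Status = "available"
	fmt.Println("stale write:", s.Update(stale))

	// Re-reading before each write lets the update go through.
	err := updateWithRetry(s, 3, func(o *object) { o.Status = "available" })
	fmt.Println("retried write:", err, "status:", s.Get().Status)
}
```

In a real controller the same idea is usually expressed by fetching the object again inside the reconcile loop, or by requeueing the request, rather than by writing a cached copy back to the API server.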