I noticed this while testing changes for bug 2019532, so the log messages are different from master, but the basic problem appears to be applicable to master. After enabling local volume discovery through the UI (Installed Operators -> Local Storage -> Local Volume Discovery -> Create Local Volume Discovery), I see the following log entries repeated in LSO (more than 10 times in this case): I1202 22:38:37.116686 1 localvolumediscovery_controller.go:69] Reconciling LocalVolumeDiscovery, namespace = openshift-local-storage, name = auto-discover-devices I1202 22:38:37.117090 1 exporter.go:99] Reconciling metrics service, namespace = openshift-local-storage, name = local-storage-discovery-metrics I1202 22:38:37.312589 1 exporter.go:126] Reconciling metrics service monitor, namespace = openshift-local-storage, name = local-storage-discovery-metrics I1202 22:38:37.784872 1 localvolumediscovery_controller.go:109] daemonset diskmaker-discovery changed with result updated 2021-12-02T22:38:37.794Z ERROR controller-runtime.manager.controller.localvolumediscovery controller/controller.go:253 Reconciler error {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolumeDiscovery", "name": "auto-discover-devices", "namespace": "openshift-local-storage", "error": "running 0 out of 1 discovery daemons"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 I1202 22:38:40.425801 1 localvolumediscovery_controller.go:69] Reconciling LocalVolumeDiscovery, namespace = openshift-local-storage, name = auto-discover-devices I1202 22:38:40.426271 1 exporter.go:99] Reconciling metrics service, namespace = openshift-local-storage, name = local-storage-discovery-metrics I1202 22:38:40.464581 1 exporter.go:126] Reconciling metrics service monitor, namespace = openshift-local-storage, name = local-storage-discovery-metrics I1202 22:38:40.487867 1 localvolumediscovery_controller.go:109] daemonset diskmaker-discovery changed with result updated 2021-12-02T22:38:40.487Z ERROR controller-runtime.manager.controller.localvolumediscovery controller/controller.go:253 Reconciler error {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolumeDiscovery", "name": "auto-discover-devices", "namespace": "openshift-local-storage", "error": "running 0 out of 1 discovery daemons"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 These log messages are coming from this chunk of code, where we report an error in the reconciler if !(desiredDaemons == readyDaemons) https://github.com/openshift/local-storage-operator/blob/8fc42cc8b990907c88a6da551dc85b55c2dc4417/controllers/localvolumediscovery/localvolumediscovery_controller.go#L127-L135 The diskmaker-discovery pod actually did start up: diskmaker-discovery-qjh4p 2/2 Running 0 24m The problem is that in the code above, each time we enter the reconcile loop, it reports an error back to the controller runtime framework. So until the diskmaker-discovery pod is running, it will continue to add those errors to the log in a loop. This may be more of a "noisy log" problem than anything else, but LSO should not continue trying to update the daemonset every time and report an error while diskmaker-discovery is starting up.
Log level is now changed from error to info. To verify this the operator can be run locally, I've documented it here if needed: https://github.com/openshift/local-storage-operator/blob/master/HACKING.md BEFORE FIX: 2021-12-07T12:22:05.555+0100 INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-07T12:22:05.664+0100 INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-07T12:22:05.780+0100 ERROR controller-runtime.manager.controller.localvolumediscovery controller/controller.go:253 Reconciler error {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolumeDiscovery", "name": "auto-discover-devices", "namespace": "openshift-local-storage", "error": "running 0 out of 3 discovery daemons"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /home/rbednar/git/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /home/rbednar/git/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 2021-12-07T12:22:16.020+0100 INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "openshift-local-storage", "Request.Name": "auto-discover-devices"} 2021-12-07T12:22:16.021+0100 INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-07T12:22:16.134+0100 INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-07T12:22:16.247+0100 ERROR controller-runtime.manager.controller.localvolumediscovery controller/controller.go:253 Reconciler error {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolumeDiscovery", "name": "auto-discover-devices", "namespace": "openshift-local-storage", "error": "running 0 out of 3 discovery daemons"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /home/rbednar/git/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /home/rbednar/git/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 2021-12-07T12:22:36.727+0100 INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "openshift-local-storage", "Request.Name": "auto-discover-devices"} 2021-12-07T12:22:36.728+0100 INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-07T12:22:36.840+0100 INFO service-monitor localmetrics/exporter.go:89 creating service monitor AFTER FIX: 2021-12-07T15:00:34.499+0100 INFO controller-runtime.manager.controller.localvolume controller/controller.go:221 Starting workers {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolume", "worker count": 1} 2021-12-07T15:00:34.502+0100 INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-07T15:00:34.817+0100 INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-07T15:00:35.157+0100 INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 3 discovery daemons {"Request.Namespace": "openshift-local-storage", "Request.Name": "auto-discover-devices"} 2021-12-07T15:00:45.158+0100 INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "openshift-local-storage", "Request.Name": "auto-discover-devices"} 2021-12-07T15:00:45.159+0100 INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-07T15:00:45.266+0100 INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-07T15:00:45.381+0100 INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 3 discovery daemons {"Request.Namespace": "openshift-local-storage", "Request.Name": "auto-discover-devices"}
2021-12-13T05:39:45.254Z INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-13T05:39:45.267Z ERROR controller-runtime.manager.controller.localvolumediscovery controller/controller.go:253 Reconciler error {"reconciler group": "local.storage.openshift.io", "reconciler kind": "LocalVolumeDiscovery", "name": "auto-discover-devices", "namespace": "local-storage", "error": "no discovery daemons are scheduled for running"} sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2 /go/src/github.com/openshift/local-storage-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214 2021-12-13T05:39:45.267Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"} 2021-12-13T05:39:45.267Z INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-13T05:39:45.289Z INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-13T05:39:45.332Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 1 discovery daemons {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"} 2021-12-13T05:39:45.332Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"} 2021-12-13T05:39:45.332Z INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-13T05:39:45.350Z INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-13T05:39:45.361Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 1 discovery daemons {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"} 2021-12-13T05:39:55.332Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 Reconciling LocalVolumeDiscovery {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"} 2021-12-13T05:39:55.333Z INFO metrics-exporter localmetrics/exporter.go:68 Reconciling metrics exporter service 2021-12-13T05:39:55.357Z INFO service-monitor localmetrics/exporter.go:89 creating service monitor 2021-12-13T05:39:55.370Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 1 discovery daemons {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"}
"2021-12-13T05:39:45.332Z INFO controllers.LocalVolumeDiscovery controller/controller.go:298 running 0 out of 1 discovery daemons {"Request.Namespace": "local-storage", "Request.Name": "auto-discover-devices"}" Now it is changed from error to info. But this info log will pop every time. Is it ok? @jdobson@jdobson
Improving the behavior so that the operator can stop trying to update the daemonset under some condition (timeout?) would be non-trivial to implement. Considering the effort needed vs relatively small gain (reduced log noise) I think it's sufficient to just change the error to info and keep logging the message.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056