Bug 2132892 - logs of csi-addons-controller-manager pod are flooded with error msg
Summary: logs of csi-addons-controller-manager pod are flooded with error msg
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-addons
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.12.0
Assignee: Madhu Rajanna
QA Contact: Yuli Persky
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-07 07:18 UTC by Pratik Surve
Modified: 2023-08-09 16:37 UTC
CC: 5 users

Fixed In Version: 4.12.0-74
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-02-08 14:06:28 UTC
Embargoed:




Links
Github csi-addons/kubernetes-csi-addons pull 247 (open): controller: fix csiaddonsnodes object deletion (last updated 2022-10-07 08:13:49 UTC)
Github red-hat-storage/kubernetes-csi-addons pull 64 (open): BUG 2132892: controller: fix csiaddonsnodes object deletion (last updated 2022-10-07 10:00:44 UTC)

Description Pratik Surve 2022-10-07 07:18:14 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

Logs of the csi-addons-controller-manager pod are flooded with error messages.

Version of all relevant components (if applicable):

OCP version:- 4.12.0-0.nightly-2022-09-28-204419
ODF version:- 4.12.0-70
CEPH version:- ceph version 16.2.10-41.el8cp (26bc3d938546adfb098168b7b565d4f9fa377775) pacific (stable)


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy a 4.12 cluster.
2. Check the logs of the csi-addons-controller-manager pod (see the command sketch below).
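
A minimal sketch of step 2, assuming the default openshift-storage namespace and that the controller runs as a Deployment named csi-addons-controller-manager (adjust the namespace, Deployment, or container name if your install differs):

  # Tail the controller-manager logs
  kubectl -n openshift-storage logs deploy/csi-addons-controller-manager --tail=200

  # Rough count of how often the repeated error shows up
  kubectl -n openshift-storage logs deploy/csi-addons-controller-manager | grep -c "Failed to resolve endpoint"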


Actual results:
2022-10-07T07:11:17.817Z	ERROR	Failed to resolve endpoint	{"controller": "csiaddonsnode", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "CSIAddonsNode", "CSIAddonsNode": {"name":"csi-rbdplugin-provisioner-855bb9d67b-xfqtp","namespace":"openshift-storage"}, "namespace": "openshift-storage", "name": "csi-rbdplugin-provisioner-855bb9d67b-xfqtp", "reconcileID": "6084bd72-2fb2-4da9-a503-38b6a0cf8489", "error": "failed to get pod openshift-storage/csi-rbdplugin-provisioner-855bb9d67b-xfqtp: Pod \"csi-rbdplugin-provisioner-855bb9d67b-xfqtp\" not found"}
github.com/csi-addons/kubernetes-csi-addons/controllers/csiaddons.(*CSIAddonsNodeReconciler).Reconcile
	/remote-source/app/controllers/csiaddons/csiaddonsnode_controller.go:98
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
2022-10-07T07:11:17.817Z	ERROR	Reconciler error	{"controller": "csiaddonsnode", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "CSIAddonsNode", "CSIAddonsNode": {"name":"csi-rbdplugin-provisioner-855bb9d67b-xfqtp","namespace":"openshift-storage"}, "namespace": "openshift-storage", "name": "csi-rbdplugin-provisioner-855bb9d67b-xfqtp", "reconcileID": "6084bd72-2fb2-4da9-a503-38b6a0cf8489", "error": "Failed to resolve endpoint \"pod://csi-rbdplugin-provisioner-855bb9d67b-xfqtp.openshift-storage:9070\": failed to get pod openshift-storage/csi-rbdplugin-provisioner-855bb9d67b-xfqtp: Pod \"csi-rbdplugin-provisioner-855bb9d67b-xfqtp\" not found"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234


Expected results:


Additional info:

Must-gather:- https://url.corp.redhat.com/2132566-c1

Comment 3 Niels de Vos 2022-10-07 07:39:38 UTC
This suggests that there is a CSIAddonsNode object with name `csi-rbdplugin-provisioner-855bb9d67b-xfqtp`. There is no Pod in the must-gather with that name, so the logging is correct.

The ownerRef of a CSIAddonsNode object should point to its parent Pod. When the Pod is deleted, Kubernetes garbage collection is expected to delete the CSIAddonsNode object automatically.
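
As an illustration only (the apiVersion and the uid placeholder below are assumptions, not values taken from this cluster), a correctly owned CSIAddonsNode would carry something like:

  apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    name: csi-rbdplugin-provisioner-855bb9d67b-xfqtp
    namespace: openshift-storage
    ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: csi-rbdplugin-provisioner-855bb9d67b-xfqtp
      uid: <uid of the parent Pod>   # must match the Pod's metadata.uid, otherwise garbage collection never fires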

Can you provide the YAML output of CSIAddonsNode/csi-rbdplugin-provisioner-855bb9d67b-xfqtp so we can check whether something is wrong there?
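
A quick sketch of how to capture that, assuming kubectl access to the affected cluster (the object name is taken from the error above; the csiaddonsnode resource name is assumed to resolve):

  kubectl -n openshift-storage get csiaddonsnode csi-rbdplugin-provisioner-855bb9d67b-xfqtp -o yaml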

Comment 8 Yuli Persky 2022-11-22 10:47:12 UTC
Verified in 4.12.0-114; the "failed to get pod" message no longer appears in the log.

There is another, temporary error message in the log:
2022-11-20T12:19:58.445Z	ERROR	Reconciler error	{"controller": "csiaddonsnode", "controllerGroup": "csiaddons.openshift.io", "controllerKind": "CSIAddonsNode", "CSIAddonsNode": {"name":"csi-rbdplugin-provisioner-6c4cfdc667-fw6qn","namespace":"openshift-storage"}, "namespace": "openshift-storage", "name": "csi-rbdplugin-provisioner-6c4cfdc667-fw6qn", "reconcileID": "04b2005f-de55-4b57-9b68-ec631901536f", "error": "failed to resolve endpoint \"pod://csi-rbdplugin-provisioner-6c4cfdc667-fw6qn.openshift-storage:9070\": pod openshift-storage/csi-rbdplugin-provisioner-6c4cfdc667-fw6qn does not have an IP-address"}


It disappears after a while, however, and it is a different error message from the one reported in this bug.
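
For anyone hitting the same transient message, a quick sketch to confirm that the pod has received an IP address (pod name taken from the log line above):

  kubectl -n openshift-storage get pod csi-rbdplugin-provisioner-6c4cfdc667-fw6qn -o wide
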
Moving this BZ to "Verified".

