Bug 2297285 - [cee/sd][rook-ceph-exporter] Prometheus pods report all 3 rook-ceph-exporter targets as down when deployed on IPv6
Summary: [cee/sd][rook-ceph-exporter] Prometheus pods report all 3 rook-ceph-exporter targets as down when deployed on IPv6
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.14
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.14.11
Assignee: Subham Rai
QA Contact: Oded
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-07-11 09:13 UTC by Lijo Stephen Thomas
Modified: 2024-10-18 02:31 UTC (History)
CC List: 7 users

Fixed In Version: 4.14.11-2
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage rook pull 711 0 None open Bug 2297285: exporter: bind to all interfaces if IPv6 is enabled 2024-09-06 05:51:28 UTC
Red Hat Issue Tracker OCSBZM-8677 0 None None None 2024-07-19 04:45:37 UTC

Description Lijo Stephen Thomas 2024-07-11 09:13:19 UTC
Description of problem (please be detailed as possible and provide log snippets):
----------------------------------------------------------------------------------

IHAC where the customer recently deployed an ODF 4.14 environment with IPv6 and is observing an alert in the OpenShift UI reporting the rook-ceph-exporter targets as down.

On further debugging, we observed that the rook-ceph-exporter targets are unreachable from the Prometheus pods.
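
For reference, one way to confirm this from the Prometheus side is a manual scrape of an exporter pod over its IPv6 pod IP (a hedged sketch: the label selector, the <POD_IPV6> placeholder, and the availability of curl inside the prometheus container are assumptions; TCP/9926 is the ceph-exporter metrics port):

$ oc -n openshift-storage get pods -l app=rook-ceph-exporter -o wide
$ # substitute an exporter pod IPv6 address from the output above for <POD_IPV6>
$ oc -n openshift-monitoring exec prometheus-k8s-0 -c prometheus -- \
    curl -g -sS --max-time 5 "http://[<POD_IPV6>]:9926/metrics" | head

On an affected cluster this is expected to time out or be refused, matching the targets shown as down in the UI.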

rook-ceph-exporter are generating the 


Version of all relevant components (if applicable):
---------------------------------------------------
ODF 4.14 | 17.2.6-209.el9cp | OCP 4.14 


Is there any workaround available to the best of your knowledge?
----------------------------------------------------------------
No


Can this issue be reproduced?
----------------------------
Probably every time, since the customer environment is a new deployment; I was unable to test it myself due to the lack of IPv6 test systems.


Steps to Reproduce:
1. Deploy ODF on ipv6
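
After deploying, the target state can be checked directly from Prometheus (a hedged sketch: it assumes curl inside the prometheus container, jq on the workstation, and that the scrape pool name contains "rook-ceph-exporter"):

$ oc -n openshift-monitoring exec prometheus-k8s-0 -c prometheus -- \
    curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[]
           | select(.scrapePool | test("rook-ceph-exporter"))
           | [.scrapeUrl, .health, (.lastError // "")] | @tsv'

On an IPv6-only cluster hitting this bug, the rook-ceph-exporter entries are expected to show health "down" together with a scrape error.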



Actual results:
---------------
The rook-ceph-exporter target shows as down.


Expected results:
-----------------
The rook-ceph-exporter target should be up and listening on IPv6.

Comment 11 Sunil Kumar Acharya 2024-09-05 19:24:31 UTC
Please backport the fix to ODF 4.14 and update the RDT flag/text appropriately.

Comment 20 Oded 2024-10-14 13:09:15 UTC
Added a new step:

8. Verify that rook-ceph-exporter listens on TCP/9926 over IPv6

$ oc get pods -o wide rook-ceph-exporter-0938a6655a2810403053e52537379d8d-7f4fc8dkb54
NAME                                                              READY   STATUS    RESTARTS   AGE    IP               NODE                                                    NOMINATED NODE   READINESS GATES
rook-ceph-exporter-0938a6655a2810403053e52537379d8d-7f4fc8dkb54   1/1     Running   0          117m   fd01:0:0:5::21   worker-02.pamoedo-odfqe10.qe.devcluster.openshift.com   <none>           <none>

$ oc debug node/worker-02.pamoedo-odfqe10.qe.devcluster.openshift.com

sh-4.4# NAME=rook-ceph-exporter-0938a6655a2810403053e52537379d8d-7f4fc8dkb54 
sh-4.4# NAMESPACE=openshift-storage
sh-4.4# pod_id=$(chroot /host crictl pods --namespace ${NAMESPACE} --name ${NAME} -q)
sh-4.4# echo $pod_id 
7d30eaefb2f6977a728d7d9afd3b364e8a9309ed0fa5811915311b90186ba4a4
sh-4.4# ns_path="/host$(chroot /host bash -c "crictl inspectp $pod_id | jq '.info.runtimeSpec.linux.namespaces[]|select(.type==\"network\").path' -r")"
sh-4.4# echo $ns_path 
/host/var/run/netns/1a0da4c2-5d58-4c32-8a5e-2eb33bb38672
sh-4.4#  nsenter --net=${ns_path} -- netstat -plunt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp6       0      0 :::9926                 :::*                    LISTEN      138363/ceph-exporte 
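
As an optional follow-up to the netstat output above (a hedged sketch: it assumes curl is available in the debug pod image), the metrics endpoint can also be scraped over IPv6 from inside the same network namespace:

sh-4.4# nsenter --net=${ns_path} -- curl -g -sS --max-time 5 'http://[::1]:9926/metrics' | head -n 3

A non-empty response over the IPv6 loopback address confirms that ceph-exporter is actually serving on the tcp6 socket listed above.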


For more info: https://docs.google.com/document/d/1Mv2uMM4VikItNTpbdqSg9aPCS1oWbZtVDTx4udGzspM/edit

