Bug 1982721 - [Multus] rbd command hung in toolbox pod on Multus enabled OCS cluster
Summary: [Multus] rbd command hung in toolbox pod on Multus enabled OCS cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.14.0
Assignee: Nikhil Ladha
QA Contact: Oded
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-07-15 14:41 UTC by Sidhant Agrawal
Modified: 2024-03-08 04:25 UTC
CC List: 12 users

Fixed In Version: 4.14.0-126
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-08 18:49:50 UTC
Embargoed:


Attachments
default_pod_created_without_annotations (8.40 KB, text/plain)
2021-07-15 14:41 UTC, Sidhant Agrawal


Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage ocs-operator pull 2128 0 None Merged Add multus related annotations to ceph toolbox 2023-08-14 11:00:02 UTC
Github red-hat-storage ocs-operator pull 2135 0 None Merged Bug 1982721:[release-4.14] Add multus related annotations to ceph toolbox 2023-08-28 06:30:47 UTC
Github red-hat-storage ocs-operator pull 2168 0 None Merged Add annotation for multus in toolbox after checking the multus network name for namespace 2023-09-04 13:11:41 UTC
Github red-hat-storage ocs-operator pull 2175 0 None open Bug 1982721: [release-4.14] Add annotation for multus in toolbox after checking the multus network name for namespace 2023-09-04 13:12:12 UTC
Red Hat Issue Tracker RHSTOR-2067 0 None None None 2022-01-20 16:11:36 UTC
Red Hat Product Errata RHSA-2023:6832 0 None None None 2023-11-08 18:50:34 UTC

Description Sidhant Agrawal 2021-07-15 14:41:48 UTC
Created attachment 1801897 [details]
default_pod_created_without_annotations

Description of problem (please be as detailed as possible and provide log
snippets):
In a Multus-enabled OCS internal mode cluster, not all commands work as expected in the toolbox pod.
Simple ceph commands work, but rbd commands hang.

```
sh-4.4# ceph -s
  cluster:
    id:     465f536e-3fed-41ec-88c5-00584ad4f069
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum a,b,c (age 6h)
    mgr: a(active, since 6h)
    mds: ocs-storagecluster-cephfilesystem:1 {0=ocs-storagecluster-cephfilesystem-a=up:active} 1 up:standby-replay
    osd: 3 osds: 3 up (since 6h), 3 in (since 6h)
    rgw: 1 daemon active (ocs.storagecluster.cephobjectstore.a)
 
  data:
    pools:   10 pools, 176 pgs
    objects: 351 objects, 204 MiB
    usage:   3.5 GiB used, 1.5 TiB / 1.5 TiB avail
    pgs:     176 active+clean
 
  io:
    client:   852 B/s rd, 11 KiB/s wr, 1 op/s rd, 1 op/s wr
 
sh-4.4# rbd ls -p ocs-storagecluster-cephblockpool
>> hangs & no output after several minutes

```

It was observed that the toolbox pod does not get the Multus-related annotations/interfaces by default.
The operator should take care of this automatically and apply the proper Multus annotations to the toolbox.

Version of all relevant components (if applicable):
OCP: 4.8.0-0.nightly-2021-07-14-153019
OCS: ocs-operator.v4.8.0-452.ci

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Users won't be able to execute all commands via the toolbox.

Is there any workaround available to the best of your knowledge?
Yes: add an annotation to the "rook-ceph-tools" deployment similar to the one
the mon pods have, and set hostNetwork to false.
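
A rough sketch of that workaround (hypothetical values: it assumes the public network selector is default/public-net, as on the mon pods, and that the operator does not immediately revert manual edits on reconcile):

# Workaround sketch (values assumed, not from the operator): attach the
# toolbox to the Multus public network and drop host networking.
$ oc -n openshift-storage patch deployment rook-ceph-tools --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"k8s.v1.cni.cncf.io/networks":"default/public-net"}},"spec":{"hostNetwork":false}}}}'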

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue reproduce from the UI?
Yes

If this is a regression, please provide more details to justify this:
-

Steps to Reproduce:
1. Install OCS operator
2. Create Storage Cluster with Multus
3. Start the toolbox pod with this command:
$ oc patch ocsinitialization ocsinit -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'

4. Run a simple rbd command
$ rbd ls -p ocs-storagecluster-cephblockpool

Actual results:
rbd commands hang in the toolbox pod

Expected results:
All ceph and rbd commands should work the same with and without Multus.

Additional info:

Comment 2 Sébastien Han 2021-07-19 07:42:32 UTC
Rook does not create the toolbox; ocs-operator does, on demand.
So this request will have to be handled in ocs-operator.

Comment 3 Mudit Agarwal 2021-07-21 06:29:12 UTC
Not a 4.8 blocker, can be fixed in 4.8.1

Comment 5 Jose A. Rivera 2021-10-11 16:22:34 UTC
This is definitely something we should fix. But since Multus is only going GA in ODF 4.10, moving this accordingly, as we have time to fix it there. Also giving devel_ack+.

Rohan, if you could take a quick look, see if you can take care of this and set it to ASSIGNED. Otherwise let me know and I'll see if I can find anyone else.

Comment 6 Jose A. Rivera 2022-01-20 16:11:37 UTC
This BZ is directly related to the Multus Epic, which is not targeted for ODF 4.10: https://issues.redhat.com/browse/RHSTOR-2067

As we have hit feature freeze for 4.10, moving this to ODF 4.11.

Comment 7 Sébastien Han 2022-01-21 09:54:17 UTC
Rohan, any plan to work on this? Thanks

Comment 9 Sébastien Han 2022-05-09 15:44:07 UTC
Ideally yes, if someone from ocs-op can pick it up; it looks small. It's only about adding the network annotations to the toolbox when requested.
José, do we have someone other than Rohan? He moved to a different team, so I'm not sure we can count on him :)

Thanks!

Comment 10 Martin Bukatovic 2022-05-10 13:22:29 UTC
Reproducer looks clear.

Comment 28 Blaine Gardner 2023-05-30 16:53:09 UTC
Eran and I talked about targeting 4.13.z for this. @uchapaga could you help me be sure this is in the work queue for a 4.13 z-stream?

Comment 29 umanga 2023-05-31 05:13:09 UTC
Since we have all the acks on this BZ, we will take this up for 4.14.
I will create a 4.13.z clone of this BZ when it's available.

Comment 34 Oded 2023-08-24 12:00:06 UTC
There is no option to create a Ceph tools pod with ocsinitialization; the pod is stuck in the ContainerCreating state:
[
error adding pod openshift-storage_rook-ceph-tools-5b4ff9bd75-p4qrv to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400:
]

Moving the BZ to the ASSIGNED state.

SetUp:
ODF Version: odf-operator.v4.14.0-115.stable
OCP Version: 4.14.0-0.nightly-2023-08-11-055332
Platform: vSphere

Test Process: 
1. Enable Multus:
$ oc get storageclusters.ocs.openshift.io -o yaml
    network:
      connections:
        encryption: {}
      multiClusterService: {}
      provider: multus
      selectors:
        cluster: default/cluster-net
        public: default/public-net
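
The selectors above point at NetworkAttachmentDefinitions in the default namespace; as a quick sanity check (names taken from the selectors above, so adjust if yours differ), they can be listed with:

$ oc -n default get network-attachment-definitions.k8s.cni.cncf.io cluster-net public-net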

2. Verify the storagecluster is in the Ready state:
$ oc get storageclusters.ocs.openshift.io 
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   24m   Ready              2023-08-24T11:29:53Z   4.14.0


3. Create the rook-ceph-tools pod:
$ oc patch ocsinitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'


4. Check the rook-ceph-tools pod [stuck in the ContainerCreating state]:
$ oc get pods rook-ceph-tools-5b4ff9bd75-p4qrv
NAME                               READY   STATUS              RESTARTS   AGE
rook-ceph-tools-5b4ff9bd75-p4qrv   0/1     ContainerCreating   0          2m29s


Events:
  Type     Reason                  Age                From               Message
  ----     ------                  ----               ----               -------
  Normal   Scheduled               76s                default-scheduler  Successfully assigned openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv to compute-2
  Warning  ErrorUpdatingResource   77s (x2 over 77s)  controlplane       addLogicalPort failed for openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv: error while getting network attachment definition for [openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv]: failed to get all NetworkSelectionElements for pod openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv: parsePodNetworkAnnotation: Invalid network object (failed at '/')
  Warning  FailedCreatePodSandBox  76s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_rook-ceph-tools-5b4ff9bd75-p4qrv_openshift-storage_84b86089-bc20-4c18-b106-b68b438ee025_0(fc44e52cc3b98612627fcbe1f68c3761ed7743aee01d4fae1ef11eb3a25ecc71): error adding pod openshift-storage_rook-ceph-tools-5b4ff9bd75-p4qrv to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: '... ERRORED: error configuring pod [openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv] networking: Multus: [openshift-storage/rook-ceph-tools-5b4ff9bd75-p4qrv/84b86089-bc20-4c18-b106-b68b438ee025]: error loading k8s delegates k8s args: parsePodNetworkAnnotation: parsePodNetworkObjectName: Invalid network object (failed at '/')
'
  (The same FailedCreatePodSandBox event repeated five more times at roughly 15-second intervals, differing only in sandbox and netns IDs.)
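
Since Multus fails while parsing the pod's networks annotation, one useful check (a sketch; the pod name is copied from above, and the jsonpath escaping follows the standard kubectl convention for dotted keys) is to inspect the value the operator actually set:

$ oc -n openshift-storage get pod rook-ceph-tools-5b4ff9bd75-p4qrv \
  -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/networks}'

A malformed value, e.g. a bare "/" with an empty namespace and name, would fail exactly at the '/' separator, which is one plausible way to hit "Invalid network object (failed at '/')".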

Comment 35 Blaine Gardner 2023-08-25 15:15:25 UTC
I'm not sure what's going wrong here, but I hope this information is helpful:

The normal ODF toolbox pod should be modified as follows to allow it to communicate correctly with the cluster:


Let's start with Oded's example cluster here:

$ oc get storageclusters.ocs.openshift.io -o yaml
    network:
      connections:
        encryption: {}
      multiClusterService: {}
      provider: multus
      selectors:
        cluster: default/cluster-net
        public: default/public-net


This cluster has both cluster and public networks specified. The toolbox pod should have the "k8s.v1.cni.cncf.io/networks" annotation key added with a value that is equal to the "public" network selector.


This needs to be added to the toolbox pod for Oded's example.

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: default/public-net


Keep in mind that if only the "cluster" net is specified, the annotation should *not* be added because RBD communication will happen over OpenShift's default pod network.
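
A minimal shell sketch of that rule (assumptions: resource names as in Oded's cluster above, and the spec path .spec.network.selectors.public; the real fix lives in ocs-operator's Go code, not a script):

# Annotate the toolbox only when a Multus public network is selected;
# if only the cluster network is set, leave the default pod network alone.
PUBLIC_NET=$(oc -n openshift-storage get storagecluster ocs-storagecluster \
  -o jsonpath='{.spec.network.selectors.public}')
if [ -n "$PUBLIC_NET" ]; then
  oc -n openshift-storage patch deployment rook-ceph-tools --type merge \
    -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"k8s.v1.cni.cncf.io/networks\":\"$PUBLIC_NET\"}}}}}"
fi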

Comment 38 Oded 2023-09-07 09:10:30 UTC
Bug fixed.
We can run "rbd ls -p ocs-storagecluster-cephblockpool" in the toolbox pod.

SetUp:
ODF Version: odf-operator.v4.14.0-126.stable
OCP Version: 4.14.0-0.nightly-2023-09-02-132842
Platform: BM
Multus Enabled:
```   
$ oc get storagecluster -A -oyaml
    network:
      multiClusterService: {}
      provider: multus
      selectors:
        cluster: default/private-net
        public: default/public-net
```

Test Process:
1. Create the toolbox pod with the ocsinitialization command:
$ oc patch ocsinitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
ocsinitialization.ocs.openshift.io/ocsinit patched

$ oc get pods rook-ceph-tools-6b974c5c4c-t6scj
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-tools-6b974c5c4c-t6scj   1/1     Running   0          81s

2. Check rbd list:
oviner:odf-must-gather$ oc rsh rook-ceph-tools-6b974c5c4c-t6scj
sh-5.1$ rbd ls -p ocs-storagecluster-cephblockpool
csi-vol-04f8bec1-a09a-4c2a-90fb-046169c27496
csi-vol-119194f2-6165-4476-b647-070c2a4e0895
csi-vol-81b26ecb-f4ed-4217-8ccd-69634050926f
csi-vol-af450b1a-5866-46d0-85f8-12d76eca4ab3
csi-vol-d1c91c0e-c7c3-442f-8343-081c1ca4ae5c
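
To double-check that Multus actually attached the public interface (a sketch; Multus records successful attachments in the pod's network-status annotation):

$ oc -n openshift-storage get pod rook-ceph-tools-6b974c5c4c-t6scj \
  -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'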

$ oc get deployment rook-ceph-tools -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2023-09-07T08:54:54Z"
  generation: 1
  name: rook-ceph-tools
  namespace: openshift-storage
  ownerReferences:
  - apiVersion: ocs.openshift.io/v1
    kind: StorageCluster
    name: ocs-storagecluster
    uid: 3f34deae-72d2-4441-a4f6-7ce74aa54b56
  resourceVersion: "1048234"
  uid: 7e219ffe-0e68-48d8-bb1e-ee20435a699c
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: rook-ceph-tools
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: default/public-net
      creationTimestamp: null
      labels:
        app: rook-ceph-tools
    spec:
      containers:
      - args:
        - -m
        - -c
        - /usr/local/bin/toolbox.sh
        command:
        - /bin/bash
        env:
        - name: ROOK_CEPH_USERNAME
          valueFrom:
            secretKeyRef:
              key: ceph-username
              name: rook-ceph-mon
        - name: ROOK_CEPH_SECRET
          valueFrom:
            secretKeyRef:
              key: ceph-secret
              name: rook-ceph-mon
        image: registry.redhat.io/odf4/rook-ceph-rhel9-operator@sha256:87c5fdcf898028ae9ab1d1733bac5a077102ec054f5661221a047282e1882a16
        imagePullPolicy: IfNotPresent
        name: rook-ceph-tools
        resources: {}
        securityContext:
          runAsGroup: 2016
          runAsNonRoot: true
          runAsUser: 2016
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        tty: true
        volumeMounts:
        - mountPath: /etc/ceph
          name: ceph-config
        - mountPath: /etc/rook
          name: mon-endpoint-volume
      dnsPolicy: ClusterFirstWithHostNet
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node.ocs.openshift.io/storage
        operator: Equal
        value: "true"
      volumes:
      - emptyDir: {}
        name: ceph-config
      - configMap:
          defaultMode: 420
          items:
          - key: data
            path: mon-endpoints
          name: rook-ceph-mon-endpoints
        name: mon-endpoint-volume
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2023-09-07T08:54:58Z"
    lastUpdateTime: "2023-09-07T08:54:58Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2023-09-07T08:54:54Z"
    lastUpdateTime: "2023-09-07T08:54:58Z"
    message: ReplicaSet "rook-ceph-tools-6b974c5c4c" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

Comment 39 Oded 2023-09-07 09:11:06 UTC
(Verbatim duplicate of comment 38.)
Comment 41 errata-xmlrpc 2023-11-08 18:49:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.14.0 security, enhancement & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6832

Comment 42 Red Hat Bugzilla 2024-03-08 04:25:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

