Bug 2245975

Summary: Missing ip tool in RHCS 6.1 (17.2.6-148) image
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Petr Balogh <pbalogh>
Component: Container
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED NOTABUG
QA Contact: Pranav Prakash <prprakas>
Severity: high
Priority: unspecified
Version: 6.1
CC: brgardne, ceph-eng-bugs, cephqe-warriors, ebenahar, kdreyer, madam, owasserm, vdas
Keywords: Automation
Target Release: 6.1z3
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2023-11-03 14:44:51 UTC
Bug Blocks: 2247363

Description Petr Balogh 2023-10-24 20:14:13 UTC
Description of problem:
In ODF 4.14 we have a problem with deployments that use multus.
Blaine Gardner pointed out the issue:

```
I think resolving this should be fairly straightforward. The `quay.io/ceph/ceph` images have the CLI tool `ip` installed, but it looks like RHCS must not have it. Rook expects that package in order to find the network information for the pod
Please create a BZ against RHCS build component to add the `ip` tool. I think this is the `iproute` package, but I could be wrong where RHEL is concerned.
```

So I am opening this bug.
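
For reference, a quick way to check whether the tool is present in the image and which package would provide it (a rough sketch; the image reference below is an example and may not match the exact 17.2.6-148 build):

```
# Hypothetical image reference -- substitute the actual RHCS 6.1 (17.2.6-148) image.
IMAGE=registry.redhat.io/rhceph/rhceph-6-rhel9:latest

# Does the image ship the `ip` binary at all?
podman run --rm --entrypoint /bin/bash "$IMAGE" -c 'command -v ip || echo "ip not found"'

# If present, which RPM owns it (expected to be iproute on RHEL)?
podman run --rm --entrypoint /bin/bash "$IMAGE" \
    -c 'rpm -qf "$(command -v ip)" 2>/dev/null || echo "no package provides ip"'
```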

Version-Release number of selected component (if applicable):
17.2.6-148.el9cp (badc1d27cb07762bea48f6554ad4f92b9d3fbb6b) quincy (stable)
as part of ODF build 4.14.0-155

How reproducible:
Deploy ODF 4.14 with multus.

Steps to Reproduce:
1. Install ODF 4.14 with multus enabled.
2. Check the CephCluster status (see Actual results below).

Actual results:
```
oc describe cephclusters.ceph.rook.io -A
    Message:               failed to create cluster: failed to start ceph monitors: failed to apply ceph network settings: failed to discover network CIDRs for multus, please correct any possible errors in the CephCluster spec.network.selectors, or use CephCluster spec.network.addressRanges to manually specify which network ranges to use for public/cluster networks: ceph "public" network canary ceph CSI version job returned failure code 127: stdout: "[{\n    \"name\": \"openshift-sdn\",\n    \"interface\": \"eth0\",\n    \"ips\": [\n        \"10.128.3.143\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n},{\n    \"name\": \"openshift-storage/public-net\",\n    \"interface\": \"public\",\n    \"ips\": [\n        \"192.168.20.21\"\n    ],\n    \"mac\": \"c2:87:eb:81:7a:80\",\n    \"dns\": {}\n}]\n===== k8s.v1.cni.cncf.io/network-status above ===== ip address below =====\n": stderr: "bash: line 5: ip: command not found\n"
```
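
Judging from the output above, the canary job prints the pod's `k8s.v1.cni.cncf.io/network-status` annotation and then runs `ip address`, roughly along these lines (a reconstruction from the log, not the actual Rook-generated script; the annotation mount path is an assumption):

```
#!/bin/bash
# Reconstruction of what the canary job appears to run, based on the stdout/stderr above.
# The annotation is typically exposed to the pod via the Downward API; the exact path
# used by Rook is an assumption here.
cat /etc/podinfo/network-status      # hypothetical mount path for the annotation
echo "===== k8s.v1.cni.cncf.io/network-status above ===== ip address below ====="
ip address                           # exits 127 when the ip tool is missing from the image
```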

Expected results:
The `ip` tool is available in the RHCS container image and the multus network canary job succeeds.

Additional info:
Logs:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-009bu1cmni33-t1/j-009bu1cmni33-t1_20231019T124823/logs/failed_testcase_ocs_logs_1697722657/deployment_ocs_logs/j-009bu1cmni33-t1/
ODF QE jenkins job:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/9513/

Comment 2 RHEL Program Management 2023-10-24 20:14:23 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 3 Elad 2023-10-25 07:52:30 UTC
ODF deployment with Multus last passed with RHCS 6.1z1:
ODF build 4.14.0-108, with ceph 17.2.6-107.el9cp (4079b48a400e4d23864de0da6d093e200038d7fb) quincy (stable).

Comment 5 Blaine Gardner 2023-10-30 16:08:21 UTC
This is not a regression. Rather, the `ip` tool is needed to enable the work that was done here: https://bugzilla.redhat.com/show_bug.cgi?id=2218952

Development is usually done with upstream quay.io/ceph/ceph images that are generally equivalent to downstream RHCS images, but this seems to be a case where that assumption wasn't correct on my/Rook's part. The upstream images have `ip` where RHCS currently does not. 
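
For illustration, the divergence can be spot-checked against both images (a sketch; the downstream image reference below is a guess and may not match the actual RHCS 6.1 build):

```
# Upstream image used in Rook development -- expected to include `ip`:
podman run --rm --entrypoint /bin/bash quay.io/ceph/ceph:v17.2.6 \
    -c 'command -v ip && ip -V'

# Downstream RHCS 6.1 image (hypothetical reference) -- reportedly missing `ip`:
podman run --rm --entrypoint /bin/bash registry.redhat.io/rhceph/rhceph-6-rhel9:latest \
    -c 'command -v ip || echo "ip not found"'
```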

I'm sorry this is coming up so late in the release process. This was my mistake, and I'll discuss this with the Rook bug triage tomorrow to make sure we do our best not to miss dependencies like this in future development.

Comment 7 Ken Dreyer (Red Hat) 2023-11-03 14:44:51 UTC
Blaine, Boris, Teoman and I discussed this. https://github.com/ceph/ceph-container/pull/2164 can be closed, and Rook will take this fix instead: https://github.com/red-hat-storage/rook/pull/533

Comment 8 Michael Adam 2023-11-07 09:12:58 UTC
@kdreyer why was this closed NOTABUG? Shouldn't it rather be MODIFIED, since the downstream rook fix was merged?
Or ON_QA if a build with the fix has been given to QE ...

Comment 9 Michael Adam 2023-11-07 09:16:59 UTC
(In reply to Michael Adam from comment #8)
> @kdreyer why was this closed NOTABUG? Shouldn't it rather be
> MODIFIED, since the downstream rook fix was merged?
> Or ON_QA if a build with the fix has been given to QE ...

Answering my own question: it is not a bug in ceph, but in ODF.

The corresponding ODF bug https://bugzilla.redhat.com/show_bug.cgi?id=2245978 is verified thanks to the fix in Rook.

Comment 10 Red Hat Bugzilla 2024-03-07 04:26:10 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.