Description of problem (please be as detailed as possible and provide log snippets):
In a cluster deployed with MSGRv2, both ports 3300 and 6789 are open.

Version of all relevant components (if applicable):
4.13, 4.14

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy a cluster and scan for open ports.
2. Alternatively, check the mon logs.

Actual results:
debug 2024-01-31T18:24:36.763+0000 ffffb1ea4040 0 starting mon.a rank 0 at public addrs v2:10.111.162.89:3300/0 at bind addrs [v2:10.244.0.10:3300/0,v1:10.244.0.10:6789/0] mon_data /var/lib/ceph/mon/ceph-a fsid 6e5077ea-85bf-4afc-8869-852fc4f5e046

Expected results:
Only port 3300 should be available.

Additional info:
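For anyone who wants to check the mon startup line mechanically rather than by eye, here is a minimal sketch in Python. It is not part of Rook or Ceph tooling. The helper names (`bind_addrs`, `only_msgr2`) are hypothetical, the regular expressions assume IPv4 addresses, and they assume the `at bind addrs` token looks like the log lines quoted in this report.

```python
import re

# Assumption: the bind addresses appear as one whitespace-free token
# right after the literal text "at bind addrs", as in the quoted logs.
BIND_RE = re.compile(r"at bind addrs (\S+)")
# Assumption: IPv4 only, in the form "<proto>:<ip>:<port>/<nonce>".
ADDR_RE = re.compile(r"(v\d):(\d{1,3}(?:\.\d{1,3}){3}):(\d+)/\d+")


def bind_addrs(line):
    """Return a list of (protocol, ip, port) tuples the mon bound to."""
    match = BIND_RE.search(line)
    if match is None:
        return []
    return [(proto, ip, int(port))
            for proto, ip, port in ADDR_RE.findall(match.group(1))]


def only_msgr2(line):
    """True if the line lists at least one bind address and all are v2 on 3300."""
    addrs = bind_addrs(line)
    return bool(addrs) and all(
        proto == "v2" and port == 3300 for proto, _, port in addrs
    )


# Startup line from the "Actual results" field of this report.
BUGGY = ("starting mon.a rank 0 at public addrs v2:10.111.162.89:3300/0 "
         "at bind addrs [v2:10.244.0.10:3300/0,v1:10.244.0.10:6789/0] "
         "mon_data /var/lib/ceph/mon/ceph-a")
# Startup line from the 4.16 verification logs in this report.
FIXED = ("starting mon.a rank 0 at public addrs v2:172.30.207.86:3300/0 "
         "at bind addrs v2:10.128.2.35:3300/0 "
         "mon_data /var/lib/ceph/mon/ceph-a")

if __name__ == "__main__":
    for name, line in (("buggy", BUGGY), ("fixed", FIXED)):
        print(name, bind_addrs(line), "msgr2 only:", only_msgr2(line))
```

The sketch only interprets what the mon reports about itself. It does not prove that nothing else is listening on 6789.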
I didn't succeed in running the commands in https://bugzilla.redhat.com/show_bug.cgi?id=2262134#c10. I got this output:

$ oc rsh rook-ceph-mon-a-77cf76f8f8-4sstf
Defaulted container "mon" out of: mon, log-collector, chown-container-data-dir (init), init-mon-fs (init)
sh-5.1# yum install net-tools -y
sh: yum: command not found
sh-5.1# dnf install net-tools -y
sh: dnf: command not found
sh-5.1#

Anyway, I checked the mon logs, and I saw that the port was 3300, as expected.

$ oc logs rook-ceph-mon-a-77cf76f8f8-4sstf -c mon | grep "at bind addrs"
debug 2024-06-04T10:23:30.922+0000 7f9adf119b00 0 starting mon.a rank 0 at public addrs v2:172.30.207.86:3300/0 at bind addrs v2:10.128.2.35:3300/0 mon_data /var/lib/ceph/mon/ceph-a fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

$ oc logs rook-ceph-mon-b-5676d6b8b6-5r9jn -c mon | grep "at bind addrs"
debug 2024-06-04T10:24:03.082+0000 7f2850268b00 0 starting mon.b rank 0 at public addrs v2:172.30.155.101:3300/0 at bind addrs v2:10.129.2.25:3300/0 mon_data /var/lib/ceph/mon/ceph-b fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

$ oc logs rook-ceph-mon-c-d86f4d766-qftbv -c mon | grep "at bind addrs"
debug 2024-06-04T10:24:23.564+0000 7fa150c01b00 0 starting mon.c rank 1 at public addrs v2:172.30.189.25:3300/0 at bind addrs v2:10.131.0.24:3300/0 mon_data /var/lib/ceph/mon/ceph-c fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

Let me know if this suffices.
The tools can be tricky to install in different container environments (even upstream vs downstream). The mon logs do show the expected binding, so I do see that as sufficient, thanks!
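If a direct port check is still wanted without installing net-tools in the mon container, one possible approach is a plain TCP connect test using only the Python standard library. This is a rough sketch, not a tested procedure for this product. It assumes it is run from somewhere that has Python and can route to the mon pod IP (for example another pod on the cluster network). The helper names `port_open` and `scan_mon_ports` are hypothetical.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan_mon_ports(host, ports=(3300, 6789)):
    """Probe the msgr2 (3300) and legacy msgr1 (6789) ports on one host."""
    return {port: port_open(host, port) for port in ports}


if __name__ == "__main__":
    # 10.128.2.35 is the mon.a bind address from the logs in this report;
    # substitute the pod IP of the mon being checked.
    print(scan_mon_ports("10.128.2.35"))
```

With the fix in place, the expectation would be that 3300 reports True and 6789 reports False. A False result can also mean the probe was blocked by network policy or could not reach the pod at all, so the mon logs remain the more direct evidence.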
Okay, thanks for the clarification.

The steps I did to test the BZ:
1. Deploy an IBMCloud 4.16 cluster.
2. Check the logs in the rook-ceph-mon pods and verify that we see only the 3300 port:

$ oc logs rook-ceph-mon-a-77cf76f8f8-4sstf -c mon | grep "at bind addrs"
debug 2024-06-04T10:23:30.922+0000 7f9adf119b00 0 starting mon.a rank 0 at public addrs v2:172.30.207.86:3300/0 at bind addrs v2:10.128.2.35:3300/0 mon_data /var/lib/ceph/mon/ceph-a fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

$ oc logs rook-ceph-mon-b-5676d6b8b6-5r9jn -c mon | grep "at bind addrs"
debug 2024-06-04T10:24:03.082+0000 7f2850268b00 0 starting mon.b rank 0 at public addrs v2:172.30.155.101:3300/0 at bind addrs v2:10.129.2.25:3300/0 mon_data /var/lib/ceph/mon/ceph-b fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

$ oc logs rook-ceph-mon-c-d86f4d766-qftbv -c mon | grep "at bind addrs"
debug 2024-06-04T10:24:23.564+0000 7fa150c01b00 0 starting mon.c rank 1 at public addrs v2:172.30.189.25:3300/0 at bind addrs v2:10.131.0.24:3300/0 mon_data /var/lib/ceph/mon/ceph-c fsid 96146cd1-acfa-47e9-bb80-10087b75c44b

Link to the Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/38285/.

Versions:

OC version:
Client Version: 4.10.24
Server Version: 4.16.0-0.nightly-2024-06-03-060250
Kubernetes Version: v1.29.5+87992f4

OCS version:
ocs-operator.v4.16.0-118.stable   OpenShift Container Storage   4.16.0-118.stable   Succeeded

Cluster version:
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.0-0.nightly-2024-06-03-060250   True        False         5h11m   Cluster version is 4.16.0-0.nightly-2024-06-03-060250

Rook version:
2024/06/04 15:21:14 maxprocs: Leaving GOMAXPROCS=16: CPU quota undefined
rook: v4.16.0-0.a2396a5186cc038b22154e857e0f7865e709d06a
go: go1.21.9 (Red Hat 1.21.9-1.el9_4)

Ceph version:
ceph version 18.2.1-188.el9cp (b1ae9c989e2f41dcfec0e680c11d1d9465b1db0e) reef (stable)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.16.0 security, enhancement & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:4591