Description of problem:

1. Install 2.5 from CDN and use the same all.yaml file for the upgrade playbook, e.g.:

   global:
     mon_allow_pool_delete: true
     mon_max_pg_per_osd: 2048
     osd_default_pool_size: 2
     osd_pool_default_pg_num: 128
     osd_pool_default_pgp_num: 128
   ceph_origin: distro
   ceph_repository: rhcs
   ceph_rhcs_version: 3
   ceph_stable_release: luminous
   ceph_test: true
   copy_admin_key: true
   fetch_directory: /home/cephuser/fetch
   osd_auto_discovery: false
   osd_scenario: collocated
   public_network: 172.16.0.0/12
   radosgw_interface: eth0
   upgrade_ceph_packages: true

2. All PGs are in a clean state after the 2.5 install. Use the upgrade playbook to run the upgrade from 2.5 to 3.z2:

   ansible-playbook -e ireallymeanit=yes -vv -i hosts rolling_update.yml

3. Notice that the upgrade playbook tries to activate the OSDs again, which leaves PGs in an unclean state at the end of the upgrade:

   TASK [ceph-osd : manually prepare ceph "filestore" non-containerized osd disk(s) with collocated osd data and journal] ***
   task path: /usr/share/ceph-ansible/roles/ceph-osd/tasks/scenarios/collocated.yml:54
   2018-04-13 19:06:31,744 - ceph.ceph - INFO - skipping: [ceph-clacroix-run712-node5-osd] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", u'end': u'2018-04-13 15:06:28.168932', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, 'item': u'/dev/vdb', u'delta': u'0:00:00.019154', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2018-04-13 15:06:28.149778', '_ansible_ignore_errors': None, 'failed': False}, u'/dev/vdb']) => {"changed": false, "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.019154", "end": 2018-04-13 19:06:31,744 - ceph.ceph - INFO - "2018-04-13 15:06:28.168932", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/vdb", "rc": 0, "start": "2018-04-13 15:06:28.149778", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/vdb"], "skip_reason": "Conditional result was False"}

Full logs: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1523642631/ceph_ansible_upgrade_to_rhcs_3_nightly_0.log

Expected results:
The upgrade playbook should not try to activate the OSDs again; it should mainly perform upgrades on the roles defined in the hosts file.
No, it does not block z2; we'll just release-note that use of this flag doesn't work with upgrade.
I have removed the OSD-specific config and some other items that shouldn't have been necessary for upgrade, to try to get around this issue. The config I ran with was this:

   ceph_test: True
   ceph_origin: distro
   ceph_repository: rhcs
   ceph_rhcs_version: 3
   ceph_stable_release: luminous
   upgrade_ceph_packages: True
   fetch_directory: ~/fetch
   copy_admin_key: True

The first failure was during execution of check_mandatory_vars.yml, which complained that my public_network was not configured. I configured that and reran the upgrade, hitting a similar error. This time the check returned the following:

   TASK [ceph-osd : make sure an osd scenario was chosen] *************************
   task path: /usr/share/ceph-ansible/roles/ceph-osd/tasks/check_mandatory_vars.yml:23
   fatal: [ceph-clacroix-run113-node5-osd]: FAILED! => {"changed": false, "msg": "please choose an osd scenario"}

Are these config items truly necessary for upgrade?
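In case it helps anyone hitting the same checks: a minimal sketch of what check_mandatory_vars.yml appears to want for this setup, assuming the same collocated filestore layout as the original 2.5 install (the network and device names below are illustrative, not taken from this environment):

   # group_vars/all.yml (sketch)
   ceph_origin: distro
   ceph_repository: rhcs
   ceph_rhcs_version: 3
   ceph_stable_release: luminous
   upgrade_ceph_packages: True
   fetch_directory: ~/fetch
   copy_admin_key: True
   public_network: 172.16.0.0/12      # must match the running cluster's public network

   # group_vars/osds.yml (sketch)
   osd_scenario: collocated           # required even for upgrades; already-prepared devices are skipped
   devices:                           # illustrative device list
     - /dev/vdb
     - /dev/vdc
     - /dev/vdd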
Created attachment 1421631 [details] Logs for failed upgrade scenario
Vasu, after reading the BZ I'm a bit confused: initially you said the OSDs were prepared again during the upgrade. However, the task output you quoted says "INFO - skipping", so the task was skipped. Looking at your log, this same task got skipped there as well. Let's assume for a second that the OSDs did get prepared again: what's the state of these OSDs? Can you share more info on the state of the drives? A "ceph -s" would also be useful. Now, yes, specifying an osd scenario is mandatory; even during an upgrade the whole playbook runs, and devices that have already been prepared are skipped. Only new devices that might appear during the upgrade can be prepared. Please clarify. Thanks.
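If it helps, a quick way to confirm by hand that a device was already prepared is to run the same parted check that shows up in the log above (a sketch; device names will differ per node):

   # On an OSD node; exit status 0 means a ceph partition already exists on the device,
   # which is why the prepare task skips it instead of touching it again
   $ sudo parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'; echo $?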
Vasu, I read the logs again; I suspect a firewall issue with the mgr. Can you check the mgr logs on ceph-clacroix-run712-node1-mon and see if they say anything about not being able to contact the OSDs? Thanks!
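A sketch of how to pull those logs, assuming the default log locations and systemd unit naming (the daemon id usually matches the node's short hostname):

   # On the active mgr node
   $ sudo journalctl -u ceph-mgr@ceph-clacroix-run712-node1-mon --since "2 hours ago"
   # or check the log file directly; lines like "waiting for OSDs" or
   # "Cannot get stat of OSD" would point at mgr <-> OSD connectivity
   $ sudo grep -E 'waiting for OSDs|Cannot get stat' /var/log/ceph/ceph-mgr.*.log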
Vasu, the tasks: 2018-04-13 19:06:31,712 - ceph.ceph - INFO - TASK [ceph-osd : manually prepare ceph "filestore" non-containerized osd disk(s) with collocated osd data and journal] *** task path: /usr/share/ceph-ansible/roles/ceph-osd/tasks/scenarios/collocated.yml:54 2018-04-13 19:06:31,744 - ceph.ceph - INFO - skipping: [ceph-clacroix-run712-node5-osd] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", u'end': u'2018-04-13 15:06:28.168932', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, 'item': u'/dev/vdb', u'delta': u'0:00:00.019154', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2018-04-13 15:06:28.149778', '_ansible_ignore_errors': None, 'failed': False}, u'/dev/vdb']) => {"changed": false, "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.019154", "end": 2018-04-13 19:06:31,744 - ceph.ceph - INFO - "2018-04-13 15:06:28.168932", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/vdb print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/vdb", "rc": 0, "start": "2018-04-13 15:06:28.149778", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/vdb"], "skip_reason": "Conditional result was False"} 2018-04-13 19:06:31,755 - ceph.ceph - INFO - skipping: [ceph-clacroix-run712-node5-osd] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/vdc print | egrep -sq '^ 1.*ceph'", u'end': u'2018-04-13 15:06:28.498038', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, 'item': u'/dev/vdc', u'delta': u'0:00:00.019548', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/vdc print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2018-04-13 15:06:28.478490', '_ansible_ignore_errors': None, 'failed': False}, u'/dev/vdc']) => {"changed": false, "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/vdc print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.019548", "end": 2018-04-13 19:06:31,755 - ceph.ceph - INFO - "2018-04-13 15:06:28.498038", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/vdc print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/vdc", "rc": 0, "start": "2018-04-13 15:06:28.478490", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/vdc"], "skip_reason": "Conditional result was False"} 2018-04-13 19:06:31,769 - ceph.ceph - INFO - skipping: 
[ceph-clacroix-run712-node5-osd] => (item=[{'_ansible_parsed': True, 'stderr_lines': [], u'cmd': u"parted --script /dev/vdd print | egrep -sq '^ 1.*ceph'", u'end': u'2018-04-13 15:06:28.837381', '_ansible_no_log': False, u'stdout': u'', '_ansible_item_result': True, u'changed': False, 'item': u'/dev/vdd', u'delta': u'0:00:00.018885', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': True, u'_raw_params': u"parted --script /dev/vdd print | egrep -sq '^ 1.*ceph'", u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, 'stdout_lines': [], 'failed_when_result': False, u'start': u'2018-04-13 15:06:28.818496', '_ansible_ignore_errors': None, 'failed': False}, u'/dev/vdd']) => {"changed": false, "item": [{"_ansible_ignore_errors": null, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": false, "cmd": "parted --script /dev/vdd print | egrep -sq '^ 1.*ceph'", "delta": "0:00:00.018885", "end": 2018-04-13 19:06:31,769 - ceph.ceph - INFO - "2018-04-13 15:06:28.837381", "failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "parted --script /dev/vdd print | egrep -sq '^ 1.*ceph'", "_uses_shell": true, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": "/dev/vdd", "rc": 0, "start": "2018-04-13 15:06:28.818496", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}, "/dev/vdd"], "skip_reason": "Conditional result was False"}

They are all "skipping". If you can get into the same state again, please check the firewall rules and make sure the manager can connect to the OSDs. I can also get into the env and debug. Thanks.
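While you're in there, a quick way to eyeball the firewall side (a sketch; adjust the zone if it isn't "public"):

   # On the mon/mgr and OSD nodes
   $ sudo firewall-cmd --list-all                 # runtime rules
   $ sudo firewall-cmd --permanent --list-all     # rules that survive a reload/reboot
   $ sudo iptables -L INPUT -nv                   # what is actually enforced right now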
Tejas, we still need to find the root cause so, for now, there is no fix to be submitted.
I have recreated this issue and have the nodes available. I will leave these up for now just to be safe. If we need any additional node IPs for troubleshooting I can get those as well.

Installer: 10.8.248.47
Mon/Mgr: 10.8.243.153

I ran the health check from the mon and under services it says 9 OSDs are up, so I assume this is not a firewall issue. I also don't think we would have been able to successfully install 2.5 if a firewall issue was present.

From the mon node:

sudo ceph -s
  cluster:
    id:     8f1fe5e1-999a-49bb-8892-63c70538b463
    health: HEALTH_WARN
            Reduced data availability: 832 pgs inactive
            clock skew detected on mon.ceph-clacroix-run224-node3-mon, mon.ceph-clacroix-run224-node2-mon

  services:
    mon: 3 daemons, quorum ceph-clacroix-run224-node1-mon,ceph-clacroix-run224-node3-mon,ceph-clacroix-run224-node2-mon
    mgr: ceph-clacroix-run224-node1-mon(active), standbys: ceph-clacroix-run224-node3-mon, ceph-clacroix-run224-node2-mon
    osd: 9 osds: 9 up, 9 in

  data:
    pools:   7 pools, 832 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     100.000% pgs unknown
             832 unknown
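For completeness, a couple of read-only checks that show the same thing from the mon side (a sketch; run anywhere an admin keyring is present):

   $ sudo ceph health detail    # expands the inactive/unknown PG warning with affected PG details
   $ sudo ceph pg stat          # one-line summary of PG states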
The ceph-mgr owns the stats for all the PGs. Typically, if you see 100.000% pgs unknown, there is a high chance that the mgr cannot reach the OSDs. Seeing the OSDs up and in is a completely different thing; the monitors own those stats. The deploy can still succeed even with a firewall issue blocking mgr communication with the OSDs; I've seen this twice on QE machines already. So again, please check the mgr node logs and paste them here.
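One way to test that specific path, as a sketch (the address comes out of the cluster itself; nothing here is specific to this environment):

   # On a mon node: where is the active mgr listening?
   $ sudo ceph mgr dump | grep active_addr
   # On an OSD node: can the OSDs actually reach that address/port?
   $ telnet <active_mgr_ip> <active_mgr_port>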
Can you tell me what ports you think should be open here for the mgr <-> osd daemons? I don't see that information in either the downstream docs or upstream. Here is the log from the active mgr node:

2018-04-16 21:43:56.337892 7fb4bcb84680 0 set uid:gid to 167:167 (ceph:ceph)
2018-04-16 21:43:56.337915 7fb4bcb84680 0 ceph version 12.2.4-6.el7cp (78f60b924802e34d44f7078029a40dbe6c0c922f) luminous (stable), process (unknown), pid 13681
2018-04-16 21:43:56.339294 7fb4bcb84680 0 pidfile_write: ignore empty --pid-file
2018-04-16 21:43:56.344648 7fb4bcb84680 1 mgr send_beacon standby
2018-04-16 21:43:56.353116 7fb4b3ce7700 1 mgr init Loading python module 'balancer'
2018-04-16 21:43:56.369880 7fb4b3ce7700 1 mgr init Loading python module 'restful'
2018-04-16 21:43:56.492836 7fb4b3ce7700 1 mgr init Loading python module 'status'
2018-04-16 21:43:56.891651 7fb4b3ce7700 1 mgr handle_mgr_map Activating!
2018-04-16 21:43:56.891827 7fb4b3ce7700 1 mgr handle_mgr_map I am now activating
2018-04-16 21:43:56.902512 7fb4a2e8c700 1 mgr load Constructed class from module: balancer
2018-04-16 21:43:56.902697 7fb4a2e8c700 1 mgr load Constructed class from module: restful
2018-04-16 21:43:56.902794 7fb4a2e8c700 1 mgr load Constructed class from module: status
2018-04-16 21:43:56.902817 7fb4a2e8c700 1 mgr send_beacon active
2018-04-16 21:43:56.903994 7fb4a0e88700 1 mgr[restful] server not running: no certificate configured
2018-04-16 21:43:58.345073 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:43:58.345329 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:00.345440 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:00.345786 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:02.345896 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:02.346239 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:04.346348 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:04.346739 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:06.346815 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:06.347152 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:08.347240 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:08.347572 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:10.347668 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:10.347964 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:12.348076 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:12.348400 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:14.348485 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:14.348845 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:16.348957 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:16.349288 7fb4b0ce1700 1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
2018-04-16 21:44:18.349389 7fb4b0ce1700 1 mgr send_beacon active
2018-04-16 21:44:18.349711 7fb4b0ce1700 1 mgr.server send_report Giving up on OSDs that haven't reported yet, sending potentially incomplete PG state to mon
2018-04-16 21:44:18.349745 7fb4b0ce1700 0 Cannot get stat of OSD 0
2018-04-16 21:44:18.349747 7fb4b0ce1700 0 Cannot get stat of OSD 1
2018-04-16 21:44:18.349754 7fb4b0ce1700 0 Cannot get stat of OSD 2
2018-04-16 21:44:18.349754 7fb4b0ce1700 0 Cannot get stat of OSD 3
2018-04-16 21:44:18.349755 7fb4b0ce1700 0 Cannot get stat of OSD 4
2018-04-16 21:44:18.349755 7fb4b0ce1700 0 Cannot get stat of OSD 5
2018-04-16 21:44:18.349756 7fb4b0ce1700 0 Cannot get stat of OSD 6
2018-04-16 21:44:18.349756 7fb4b0ce1700 0 Cannot get stat of OSD 7
2018-04-16 21:44:18.349756 7fb4b0ce1700 0 Cannot get stat of OSD 8
On all the nodes the following ports are open: 6789 and 6800-7300. And I see the active mgr is listening on 6800:

tcp 0 0 172.16.115.18:6800 0.0.0.0:* LISTEN 13681/ceph-mgr
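Worth noting (a general point, not a claim about this particular setup): netstat/ss only shows what the daemon is bound to locally; it says nothing about whether the firewall lets remote nodes in, and the two can disagree. A sketch of checking both views on the mgr node:

   $ sudo ss -tlnp | grep ceph-mgr                    # local listening sockets
   $ sudo firewall-cmd --zone=public --list-ports     # ports firewalld actually allows in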
John, can you tell whether this is a port communication issue?
Hi Vasu, can you check the communication:

- From the node where osd.0 is running (<interface> is the name of the public_network interface on that node, <value> is the MTU on that network, <target_ip> is the public_network interface IP on the active mgr node):

  # ping -W 2 -I <interface> -M do -s <value> <target_ip>

- From the active mgr node (<interface> is the name of the public_network interface on that node, <value> is the MTU on that network, <target_ip> is the public_network interface IP of the node where osd.0 is running):

  # ping -W 2 -I <interface> -M do -s <value> <target_ip>

- With telnet from the node where osd.0 is running (<target_ip> is the public_network interface IP on the active mgr node; the port from comment #19 above should be 6800):

  # telnet <target_ip> <port>

- From the active mgr node (<target_ip> is the public_network interface IP of the node where osd.0 is running, <port> is the port of osd.0 on the public_network, which you can get from the monitor node):

  # ceph osd metadata 0 | grep front_addr
  # telnet <target_ip> <port>
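A small aside on the ping step (an assumption about intent, not a change to the procedure): with -M do the payload size usually has to be the MTU minus 28 bytes of IP/ICMP headers, otherwise the ping fails with "message too long" even on a healthy link. For a standard 1500-byte MTU it would look something like this (interface name and target IP are placeholders):

  # 1472 = 1500 - 28 header bytes; eth0 and 172.16.115.71 are illustrative
  $ ping -W 2 -I eth0 -M do -s 1472 -c 3 172.16.115.71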
I am going to try that, but looks like we have one more here with same symptom https://bugzilla.redhat.com/show_bug.cgi?id=1557063
Hi Tomas, I have executed the commands you mentioned. This is a newly generated stack, however, so the IPs will be different than before. I reproduced the original issue before executing these commands.

Both pings execute successfully with no packet loss.

# telnet <target_ip> <port> (osd.0 -> mgr)

There is no route to host when trying to connect from the osd node to the mgr on port 6800.

[cephuser@ceph-clacroix-run517-node4-osd ~]$ telnet 10.8.246.122 6800
Trying 10.8.246.122...
telnet: connect to address 10.8.246.122: No route to host

# telnet <target_ip> <port> (mgr -> osd.0)

From the other side (mgr -> osd) it looks like I can establish a connection on that port.

[cephuser@ceph-clacroix-run517-node2-mon ~]$ telnet 10.8.246.100 6800
Trying 10.8.246.100...
Connected to 10.8.246.100.
Escape character is '^]'.
ceph v027 ���s0ɺ ��quit
^]
telnet> quit
Connection closed.

# ceph osd metadata 0 | grep front_addr
    "front_addr": "172.16.115.48:6800/44555",
    "hb_front_addr": "172.16.115.48:6803/44555",
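For what it's worth (an observation, not a confirmed diagnosis of this environment): "No route to host" on a network where plain pings succeed usually means the connection was rejected with icmp-host-prohibited by a firewall rule rather than an actual routing problem. A sketch of how to spot that on the mgr node:

  # firewalld's default reject rule answers blocked connections with icmp-host-prohibited,
  # which telnet then reports as "No route to host"
  $ sudo iptables -L -n | grep -i 'icmp-host-prohibited'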
Adding to the RADOS group as per Brett. Josh, we have the setup in the same state; can you look into this one when you get time? Thanks
The mgr is running on 172.16.115.71:6800, as shown by 'ceph mgr dump'. Attempting to telnet to this address and port from the osd nodes results in 'no route to host'. As leseb said, this is caused by the firewall blocking the osd -> mgr communication. It's set up on the mon/mgr nodes to only allow port 6789:

[cephuser@ceph-clacroix-run517-node1-mon ~]$ sudo firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources:
  services: ssh dhcpv6-client
  ports: 6789/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
Thanks for the confirmation. We open the required ports during the pre-configure scripts and have sanity running across luminous builds in Jenkins; only the upgrade was failing, so I will dig more into what is causing those firewall ports to go away during the upgrade.

[cephuser@ceph-clacroix-run517-node1-mon ~]$ sudo firewall-cmd --zone=public --add-port=6800-7300/tcp
success
[cephuser@ceph-clacroix-run517-node1-mon ~]$ sudo firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources:
  services: ssh dhcpv6-client
  ports: 6789/tcp 6800-7300/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
For the record, this is where we opened the ports when we installed 2.5:
http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-1523642631/ceph_ansible_install_rhcs_2_stable_0.log

2018-04-13 18:15:34,253 - ceph.ceph - INFO - Running command firewall-cmd --zone=public --add-port=6800-7300/tcp on 10.8.246.61
2018-04-13 18:15:34,652 - ceph.ceph - INFO - Command completed successfully
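One thing that might be worth ruling out (an assumption on my side, not a confirmed root cause): firewall-cmd --add-port without --permanent only changes the runtime configuration, so the rule is lost if firewalld is reloaded or the node is rebooted at any point between the 2.5 install and the upgrade. A persistent variant would look like this:

  # Persist the rule and apply it to the running firewall as well
  $ sudo firewall-cmd --permanent --zone=public --add-port=6800-7300/tcp
  $ sudo firewall-cmd --reload
  # Verify it shows up in both runtime and permanent configuration
  $ sudo firewall-cmd --zone=public --list-ports
  $ sudo firewall-cmd --permanent --zone=public --list-ports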