Description of problem:

When deploying MON and OSD colocated, the MON ports aren't being opened in the firewall, causing the deploy to get stuck.

[admin@ceph-osd01 ~]$ sudo firewall-cmd --list-services
ceph cockpit dhcpv6-client ssh

Adding the "ceph-mon" service to the node allows the installation to continue.

[admin@ceph-osd01 ~]$ sudo firewall-cmd --add-service ceph-mon
success
[admin@ceph-osd01 ~]$ sudo firewall-cmd --add-service ceph-mon --permanent
success
[admin@ceph-osd01 ~]$ sudo firewall-cmd --list-services
ceph ceph-mon cockpit dhcpv6-client ssh

Version-Release number of selected component (if applicable):
ceph-ansible-4.0.0-0.1.rc9.el8cp.noarch
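For reference, the manual workaround above is what the firewall role would be expected to do through Ansible's firewalld module. This is only a minimal sketch, not ceph-ansible's actual task: the play/host names are illustrative, it relies on the predefined ceph-mon firewalld service used in the commands above, and it targets the default zone.

# sketch: apply the manual workaround via Ansible's firewalld module
# (hosts group and task name are illustrative; ceph-ansible's real role is more elaborate)
- hosts: mons
  become: true
  tasks:
    - name: open monitor ports via the predefined ceph-mon firewalld service
      firewalld:
        service: ceph-mon
        permanent: true
        immediate: true
        state: enabled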
I'm changing the bz title because ceph-ansible actually isn't configuring the firewall at all when deploying in containers. My initial report mentioned only MON because I was deploying Ceph on the same nodes where I had previously deployed OSDs using RPM packages, so the OSD ports were already open. I did a full node reinstall and noticed that the firewall configuration was being skipped entirely.

The following small change allowed me to finish the installation:

[admin@ceph-ansible ceph-ansible]$ diff -uNr roles/ceph-infra/tasks/configure_firewall.yml.orig roles/ceph-infra/tasks/configure_firewall.yml
--- roles/ceph-infra/tasks/configure_firewall.yml.orig 2019-07-25 15:19:53.533504113 -0400
+++ roles/ceph-infra/tasks/configure_firewall.yml      2019-07-25 15:17:35.935218088 -0400
@@ -8,7 +8,7 @@
   check_mode: no
   changed_when: false
   tags: firewall
-  when: not containerized_deployment | bool
+  #when: not containerized_deployment | bool
 
 - when: (firewalld_pkg_query.get('rc', 1) == 0
         or is_atomic | bool)
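To make the intent of the patch explicit, this is how the check task reads with the condition commented out; it now also runs on containerized deployments. This is a sketch reconstructed from the role snippet quoted below in [1], so the attribute names follow that snippet.

- name: check firewalld installation on redhat or suse
  command: rpm -q firewalld
  args:
    warn: no
  register: firewalld_pkg_query
  ignore_errors: true
  check_mode: no
  changed_when: false
  tags: firewall
  # when: not containerized_deployment | bool   <- dropped so the check also runs for containers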
Indeed, this is a valid bug. It looks easy to address at first glance, but we need to think about backward compatibility (especially for OSP).
I don't recall if OSP uses firewalld or iptables, but I guess it's the latter. If so, one idea could be to add a new variable such as firewall_type, defaulting to iptables. When set to iptables, behaviour wouldn't change. The problem is that an upgrade to RHEL 8 based hosts would then require an additional workflow step to change that setting. Indeed a tricky one; see the sketch after this comment for what such a variable could look like.

In fact, I guess this part of the playbook [1] should be changed to avoid firewalld being started and enabled only because the package is installed. I guess there's another bz or github issue about this, as there's a note in the documentation stating that firewalld is enabled even if configure_firewall is set to False.

[1]
- name: check firewalld installation on redhat or suse
  command: rpm -q firewalld
  args:
    warn: no
  register: firewalld_pkg_query
  ignore_errors: true
  check_mode: no
  changed_when: false
  tags: firewall
  when: not containerized_deployment | bool

- when: (firewalld_pkg_query.get('rc', 1) == 0
        or is_atomic | bool)
  block:
    - name: start firewalld
      service:
        name: firewalld
        state: started
        enabled: yes
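A minimal sketch of the firewall_type idea, gating the existing block quoted in [1] on the new variable and on configure_firewall. The firewall_type variable, its default, and the defaults file path are hypothetical; only the block itself comes from the role snippet above.

# roles/ceph-defaults/defaults/main.yml (hypothetical new variable)
firewall_type: iptables   # keeps today's behaviour for OSP; set to 'firewalld' to opt in

# roles/ceph-infra/tasks/configure_firewall.yml (sketch)
- when:
    - configure_firewall | bool                 # so firewalld isn't enabled when the user opted out
    - firewall_type == 'firewalld'              # hypothetical opt-in switch
    - firewalld_pkg_query.get('rc', 1) == 0 or is_atomic | bool
  block:
    - name: start firewalld
      service:
        name: firewalld
        state: started
        enabled: yes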
Updating the QA Contact to Hemant. Hemant will reroute this to the appropriate QE Associate. Regards, Giri
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0312