Bug 1740964

Summary: [Backport 4.1]Unless RHEL worker disable firewalld.service, "oc rsh/exec/logs" does not work
Product: OpenShift Container Platform Reporter: Weihua Meng <wmeng>
Component: Installer Assignee: Daein Park <dapark>
Installer sub component: openshift-ansible QA Contact: Weihua Meng <wmeng>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: dapark, rteague
Version: 4.1.0   
Target Milestone: ---   
Target Release: 4.1.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: A default minimal install of RHEL installs firewalld.service.
Consequence: The firewall blocks ports, which prevents "oc rsh/exec/logs" from working.
Fix: firewalld.service is disabled if it is installed, so that RHEL hosts behave the same as RHCOS hosts.
Result: Remote 'oc' commands work as expected.
Story Points: ---
Clone Of: 1740439 Environment:
Last Closed: 2019-08-28 19:55:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1740439    
Bug Blocks:    

Description Weihua Meng 2019-08-14 02:31:43 UTC
Backport to OCP 4.1.z expected.

+++ This bug was initially created as a clone of Bug #1740439 +++

Description of problem:

If you add a RHEL worker by following the steps in [0], "oc rsh/exec/logs" fail with a "connect: no route to host" error message, as follows.

~~~
# oc get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
...
sdn-vfjxw              1/1     Running   6          43h   10.0.1.10   worker-1.ocp41.rhel.worker   <none>           <none>

# oc rsh sdn-vfjxw
Error from server: error dialing backend: dial tcp 10.0.1.10:10250: connect: no route to host

# oc exec sdn-vfjxw -- date
Error from server: error dialing backend: dial tcp 10.0.1.10:10250: connect: no route to host

# oc logs sdn-vfjxw
Error from server: Get https://worker-1.ocp41.rhel.worker:10250/containerLogs/openshift-sdn/sdn-vfjxw/sdn: dial tcp 10.0.1.10:10250: connect: no route to host
~~~

[0] Adding RHEL compute machines to an OpenShift Container Platform cluster
    [ https://docs.openshift.com/container-platform/4.1/machine_management/adding-rhel-compute.html ]
~~~
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i /<path>/inventory/hosts playbooks/scaleup.yml 
~~~


Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-4.1.9-201907280809.git.160.39bd430.el7.noarch

rpm -q ansible
ansible-2.7.12-1.el7ae.noarch

ansible --version
ansible 2.7.12
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 11 2019, 12:19:05) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

How reproducible:

You can always reproduce this issue by adding a RHEL worker through the steps in [0].

Steps to Reproduce:
1.
2.
3.

Actual results:

"oc rsh/exec/logs" fail with "connect: no route to host".

Expected results:

"oc rsh/exec/logs" work without any errors.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

--- Additional comment from Daein Park on 2019-08-13 02:17:34 UTC ---

I've opened PR here: https://github.com/openshift/openshift-ansible/pull/11824

As a workaround, this issue can be resolved by manually stopping the firewalld service on the RHEL worker node.
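
The manual workaround could look like the following sketch, run as root on the affected RHEL worker. It assumes the service unit is named firewalld.service (as in the Doc Text) and uses standard systemctl subcommands. The helper names `run` and `stop_firewalld` are hypothetical, chosen only for illustration, and are not part of openshift-ansible or the linked PR.

```shell
#!/bin/sh
# Sketch of the manual workaround: stop and disable firewalld on a RHEL worker.
# `run` and `stop_firewalld` are hypothetical helper names for illustration.

# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stop_firewalld() {
    # Stop the running service so that port 10250 is no longer blocked.
    run systemctl stop firewalld.service
    # Disable the unit so it does not start again on the next boot.
    run systemctl disable firewalld.service
}
```

Setting DRY_RUN=1 before calling stop_firewalld prints the two systemctl commands instead of running them, which allows a check before the worker is changed. Calling stop_firewalld without DRY_RUN set runs them.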

Comment 2 Weihua Meng 2019-08-20 23:14:31 UTC
Fixed.

openshift-ansible-4.1.13-201908201227.git.162.4ce8a66.el7

Comment 4 errata-xmlrpc 2019-08-28 19:55:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2547