Bug 1740964 - [Backport 4.1]Unless RHEL worker disable firewalld.service, "oc rsh/exec/logs" does not work
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.1.z
Assignee: Daein Park
QA Contact: Weihua Meng
URL:
Whiteboard:
Depends On: 1740439
Blocks:
Reported: 2019-08-14 02:31 UTC by Weihua Meng
Modified: 2019-08-28 19:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The default minimal install of RHEL installs firewalld.service.
Consequence: The firewall blocks ports, which prevents "oc rsh/exec/logs" from working.
Fix: firewalld.service is now disabled if it is installed, so that RHEL hosts behave the same as RHCOS hosts.
Result: Remote 'oc' commands work as expected.
Clone Of: 1740439
Environment:
Last Closed: 2019-08-28 19:55:01 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:2547 None None None 2019-08-28 19:55:07 UTC
Github openshift openshift-ansible pull 11842 None None None 2019-08-20 03:06:10 UTC

Description Weihua Meng 2019-08-14 02:31:43 UTC
Backport to OCP 4.1.z expected.

+++ This bug was initially created as a clone of Bug #1740439 +++

Description of problem:

If you add a RHEL worker by following the steps in [0], "oc rsh/exec/logs" fail with a "connect: no route to host" error message, as follows.

~~~
# oc get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP             NODE                         NOMINATED NODE   READINESS GATES
...
sdn-vfjxw              1/1     Running   6          43h   10.0.1.10   worker-1.ocp41.rhel.worker   <none>           <none>

# oc rsh sdn-vfjxw
Error from server: error dialing backend: dial tcp 10.0.1.10:10250: connect: no route to host

# oc exec sdn-vfjxw -- date
Error from server: error dialing backend: dial tcp 10.0.1.10:10250: connect: no route to host

# oc logs sdn-vfjxw
Error from server: Get https://worker-1.ocp41.rhel.worker:10250/containerLogs/openshift-sdn/sdn-vfjxw/sdn: dial tcp 10.0.1.10:10250: connect: no route to host
~~~

[0] Adding RHEL compute machines to an OpenShift Container Platform cluster
    [ https://docs.openshift.com/container-platform/4.1/machine_management/adding-rhel-compute.html ]
~~~
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i /<path>/inventory/hosts playbooks/scaleup.yml 
~~~
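
The errors above all point at TCP port 10250 on the worker. As a rough diagnostic, a reachability check along the following lines could be run from a master node. This is only a sketch: `check_kubelet_port` is a hypothetical helper name, and it assumes `bash` and `timeout` are available on the host where it runs.

```shell
# Hypothetical helper: test whether a TCP port on a node accepts connections.
# Defaults to 10250, the port shown in the error messages above.
check_kubelet_port() {
    local host="$1"
    local port="${2:-10250}"
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "reachable: ${host}:${port}"
    else
        echo "unreachable: ${host}:${port}"
        return 1
    fi
}
```

For example, `check_kubelet_port 10.0.1.10` would probe the worker from the output above. A failure here only shows that the port cannot be reached; it does not by itself prove that firewalld is the cause.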


Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-4.1.9-201907280809.git.160.39bd430.el7.noarch

rpm -q ansible
ansible-2.7.12-1.el7ae.noarch

ansible --version
ansible 2.7.12
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 11 2019, 12:19:05) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

How reproducible:

This issue always reproduces when you add a RHEL worker by following the steps in [0].


Steps to Reproduce:
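1. Add a RHEL worker to the cluster by following the steps in [0].
2. Run "oc rsh", "oc exec", or "oc logs" against a pod running on that RHEL worker.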

Actual results:

"oc rsh/exec/logs" fail with "connect: no route to host".

Expected results:

"oc rsh/exec/logs" work without any errors.

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

--- Additional comment from Daein Park on 2019-08-13 02:17:34 UTC ---

I've opened PR here: https://github.com/openshift/openshift-ansible/pull/11824

This issue can be resolved by manually stopping the firewalld service on the RHEL worker node.
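
A minimal sketch of that manual step, to be run as root on the RHEL worker. The function name `disable_firewalld` and the `DRY_RUN` variable are hypothetical and are not part of openshift-ansible; the sketch also disables the unit so that it stays off after a reboot, which goes slightly beyond only stopping it.

```shell
# Sketch: stop and disable firewalld on a RHEL worker, only if it is installed.
# disable_firewalld and DRY_RUN are hypothetical names used for illustration.
disable_firewalld() {
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the commands and skip the package check.
        run="echo"
    elif ! rpm -q firewalld >/dev/null 2>&1; then
        echo "firewalld is not installed; nothing to do"
        return 0
    fi
    $run systemctl stop firewalld.service
    $run systemctl disable firewalld.service
}
```

Running `DRY_RUN=1 disable_firewalld` first prints the two `systemctl` commands without changing anything; calling `disable_firewalld` without `DRY_RUN` then applies them.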

Comment 2 Weihua Meng 2019-08-20 23:14:31 UTC
Fixed.

openshift-ansible-4.1.13-201908201227.git.162.4ce8a66.el7

Comment 4 errata-xmlrpc 2019-08-28 19:55:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2547

