Bug 1293578 - Ansible should open port '1936' in iptables
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Priority: high, Severity: high
Assigned To: Scott Dodson
QA Contact: Ma xiaoqiang
Keywords: Regression
Depends On: 1301654
Blocks: 1267746
Reported: 2015-12-22 04:23 EST by Ma xiaoqiang
Modified: 2016-07-03 20:47 EDT

See Also:
Fixed In Version: atomic-openshift-
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1301654 1316615
Last Closed: 2016-02-23 15:31:41 EST
Type: Bug

Description Ma xiaoqiang 2015-12-22 04:23:18 EST
Description of problem:
The router pod cannot start after installation. Ansible should open port '1936' in iptables.

Version-Release number of selected component (if applicable):
https://github.com/openshift/openshift-ansible master

How reproducible:

Steps to Reproduce:

1. Create a router after installation.
2. Check the router pod:
# oc describe pod router-1-71w5n

Actual results:
  24s   4s    3 {kubelet openshift-159.lab.eng.nay.redhat.com}  spec.containers{router}     Unhealthy Readiness probe failed: Get dial tcp no route to host

Port '1936' is not open on the system.

Expected results:
The router is created successfully.

Additional info:
Run the following command on the nodes:
iptables -I OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 1936 -j ACCEPT
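
If the rule has to be applied more than once (for example from a script), it can be guarded so it is only inserted when missing. This is a hypothetical sketch, not something the installer provides: the function name `ensure_port_open` is made up for illustration, it assumes the `OS_FIREWALL_ALLOW` chain already exists, and it must be run as root. It uses `iptables -C` to check for the rule before inserting it.

```shell
# Hypothetical helper (not part of openshift-ansible): insert the ACCEPT rule
# for a TCP port only if an identical rule is not already in the chain.
ensure_port_open() {
    port="$1"
    chain="${2:-OS_FIREWALL_ALLOW}"   # chain name taken from the command above
    # -C exits non-zero when the rule is absent, so insert it in that case.
    if ! iptables -C "$chain" -p tcp -m state --state NEW -m tcp --dport "$port" -j ACCEPT 2>/dev/null; then
        iptables -I "$chain" -p tcp -m state --state NEW -m tcp --dport "$port" -j ACCEPT
    fi
}
```

Calling `ensure_port_open 1936` on a node would then have the same effect as the command above, without adding a duplicate rule on repeated runs.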
Comment 1 Ma xiaoqiang 2016-01-12 02:04:57 EST
This issue prevents the router pod from starting.
Comment 19 Scott Dodson 2016-01-28 10:57:42 EST
*** Bug 1231127 has been marked as a duplicate of this bug. ***
Comment 25 openshift-github-bot 2016-02-02 07:20:37 EST
Commit pushed to master at https://github.com/openshift/origin

Bug 1293578 - The Router liveness/readiness probes should always use localhost

Pods using the hostNetwork are getting the default IP from the Node entry for
their liveness probe today.  In some common misconfigurations this IP will not
actually be physically present on the Node running the probes and therefore
will not be short-circuited to use the loopback interface.  In those cases the
probes will fail unless an admin manually opens up a port that allows the probe
to pass.

We're putting checks in place for this situation but this seems like a
reasonable safeguard to make sure a critical piece of infrastructure comes up
the first time.
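
To see which case a node is in, the probe target can be tried by hand against both localhost and the Node IP. A minimal sketch, assuming the router answers HTTP on port 1936 at `/healthz` (the path is an assumption, it is not stated in this bug) and that `curl` is installed on the node. `probe_router` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if an HTTP GET to port 1936 on the given host
# returns a non-error status. Defaults to localhost.
probe_router() {
    host="${1:-localhost}"
    # /healthz is an assumed path; -f makes curl exit non-zero on HTTP errors (>= 400),
    # --max-time bounds the wait when the host is unreachable.
    curl -sf --max-time 5 -o /dev/null "http://${host}:1936/healthz"
}
```

If `probe_router` succeeds while `probe_router <node-ip>` fails, that would be consistent with the misconfiguration described above, where the Node IP is not short-circuited to the loopback interface.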
Comment 26 Scott Dodson 2016-02-03 13:23:33 EST
This has been fixed via a two-pronged approach.

1) The new build should use localhost for the router liveness probes

2) openshift-ansible will now ensure openshift_hostname resolves to an IP address on the host in question. If it detects that the hostname does not, it will pause the install and wait for the user to abort or continue. This behavior can be overridden by setting `openshift_override_hostname_check=true`, which will simply pause the install for 10 seconds and then move on.

See https://github.com/openshift/openshift-ansible/pull/1291 for details on the installer change
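
A similar check can be run by hand before installing. This is only a sketch under stated assumptions and is not the code from the PR above: `hostname_is_local` is a hypothetical helper, and it assumes `getent`, `awk` and `hostname -I` are available on the host.

```shell
# Hypothetical helper: succeeds if the name resolves to an address that is
# in the list of local addresses (second argument, space-separated).
# When no list is given, the addresses reported by `hostname -I` are used.
hostname_is_local() {
    name="$1"
    local_ips="${2:-$(hostname -I)}"
    # getent prints "ADDRESS NAME..." for each match; keep only the address column.
    for resolved in $(getent hosts "$name" | awk '{print $1}'); do
        for ip in $local_ips; do
            [ "$resolved" = "$ip" ] && return 0
        done
    done
    return 1
}
```

For example, `hostname_is_local "$(hostname -f)" || echo "hostname does not resolve to a local address"` would flag the kind of host on which the router probes failed.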
Comment 33 Ma xiaoqiang 2016-02-04 19:40:19 EST
Check on the Errata puddle.

The router pod is running, move this issue to VERIFIED.
Comment 35 errata-xmlrpc 2016-02-23 15:31:41 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

Comment 36 Kenjiro Nakayama 2016-03-09 22:52:54 EST
Scott, Brenton

Because of this fix, we no longer need to open port 1936, since the livenessProbe now accesses localhost, right?

As this bz's subject is "Ansible should open port '1936' in iptables", the fix looks as if the Ansible installer now opens port 1936, but I believe that is wrong.

If my understanding is correct, I hope one of you can update the doc https://docs.openshift.com/enterprise/3.1/release_notes/ose_3_1_release_notes.html#ose-3-1-1-known-issues
Comment 37 Scott Dodson 2016-03-10 09:20:44 EST

That's correct, we no longer need to open port 1936. I'm not sure if we should retroactively change the title of this bug or not.

PR to clarify those docs
