Bug 1609138 - Node unreachable when deploy logging
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.11.0
Assigned To: ewolinet
QA Contact: Anping Li
Duplicates: 1609131
Depends On:
Blocks:
Reported: 2018-07-27 02:07 EDT by Qiaoling Tang
Modified: 2018-10-11 03:22 EDT (History)

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
As part of installing the ES5 stack, we need to create a sysctl file on the nodes that ES runs on. This change fixes how we evaluate which nodes/Ansible hosts to run those tasks against.
Last Closed: 2018-10-11 03:22:24 EDT
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 None None None 2018-10-11 03:22 EDT

Comment 1 Qiaoling Tang 2018-07-27 02:21:24 EDT
Ansible version:
 ansible-2.6.1-1.el7ae.noarch
Comment 3 Anping Li 2018-07-27 06:11:42 EDT
*** Bug 1609131 has been marked as a duplicate of this bug. ***
Comment 5 Rich Megginson 2018-07-27 12:02:28 EDT
Please provide the entire Ansible inventory file, any -e parameters you pass on the ansible-playbook command line, and any vars.yaml files you pass in with -e@vars.yaml.
Comment 7 Rich Megginson 2018-07-30 19:08:40 EDT
Does this work?

oc project default
oc get pods
# look for the router pod e.g.
# router-1-njpz6              1/1       Running   0          12m
oc exec router-1-xxxx -- ls

What happens?
Comment 8 Rich Megginson 2018-07-30 19:12:44 EDT
I get this error:

$ oc exec router-1-njpz6 -- ls
Error from server: error dialing backend: dial tcp: lookup infra-node-0.ocp311.rmeggins.test on 192.168.99.15:53: no such host

This is with an openshift on openstack deployment, without logging.

It seems that the oc exec command is attempting to connect to the node using its _external_ FQDN rather than its _internal_ cluster IP address:

$ oc get nodes
NAME                                STATUS    ROLES     AGE       VERSION
app-node-0.ocp311.rmeggins.test     Ready     compute   18m       v1.11.0+d4cacc0
app-node-1.ocp311.rmeggins.test     Ready     compute   18m       v1.11.0+d4cacc0
infra-node-0.ocp311.rmeggins.test   Ready     infra     18m       v1.11.0+d4cacc0
master-0.ocp311.rmeggins.test       Ready     master    21m       v1.11.0+d4cacc0

$ oc get node master-0.ocp311.rmeggins.test
status:
  addresses:
  - address: 192.168.99.15
    type: InternalIP
  - address: master-0.ocp311.rmeggins.test
    type: Hostname

Did something change in OCP 3.11 that makes it use the external FQDN for the node names instead of the internal names/IP addresses? Also, in 3.9, both node addresses were the internal IP address, e.g.:

status:
  addresses:
  - address: 192.168.99.15
    type: InternalIP
  - address: 192.168.99.15
    type: Hostname

So I don't know if this is a logging problem.
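
The difference between the two address lists matters if whatever dials the node prefers the Hostname entry. Below is a minimal Python sketch of that selection. The function name pick_address and the Hostname-first preference order are illustrative assumptions, not taken from this cluster's configuration.

```python
# Hypothetical sketch: choose which node address a client would dial,
# given a preference order over address types. The preference order used
# here is an assumption for illustration only.
from typing import Dict, List, Optional, Sequence


def pick_address(addresses: List[Dict[str, str]],
                 preferred_types: Sequence[str]) -> Optional[str]:
    """Return the address whose type comes earliest in preferred_types."""
    for wanted in preferred_types:
        for entry in addresses:
            if entry["type"] == wanted:
                return entry["address"]
    return None


# Addresses as reported for the 3.11 node above.
node_311 = [
    {"address": "192.168.99.15", "type": "InternalIP"},
    {"address": "master-0.ocp311.rmeggins.test", "type": "Hostname"},
]

# Addresses as reported on 3.9, where both entries were the internal IP.
node_39 = [
    {"address": "192.168.99.15", "type": "InternalIP"},
    {"address": "192.168.99.15", "type": "Hostname"},
]

hostname_first = ["Hostname", "InternalIP"]  # assumed order
print(pick_address(node_311, hostname_first))
print(pick_address(node_39, hostname_first))
```

Under that assumed order, the 3.11 node yields an FQDN that must be resolvable by the caller's DNS, while the 3.9 node yields an IP address that needs no lookup. That would be consistent with the "no such host" error above.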
Comment 11 ewolinet 2018-07-31 10:22:00 EDT
Can you try this again with the latest Ansible changes for the logging playbook?
We did just that: map node names back to the inventory names. This was originally done as a fix for oc cluster up --logging.

It merged less than 24 hours ago.

https://github.com/openshift/openshift-ansible/pull/9267
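
As a rough illustration of mapping node names back to inventory names, here is a minimal Python sketch. It is not the code from the pull request. The function name, the shape of the inventory dictionary, and the inventory hostnames (host-a.example.com, host-b.example.com) are all hypothetical.

```python
# Hypothetical sketch: map cluster node names back to Ansible inventory
# hostnames. This is not the logic from the pull request; the data shapes
# and the inventory hostnames are invented for illustration.
from typing import Dict, Iterable, List


def map_nodes_to_inventory(node_names: Iterable[str],
                           inventory: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each node name to the inventory host that is known by that name.

    inventory maps an inventory hostname to the other names (FQDN, IP
    address) that the same machine is known by. Nodes with no match are
    left out of the result.
    """
    known: Dict[str, str] = {}
    for inv_host, aliases in inventory.items():
        known[inv_host.lower()] = inv_host
        for alias in aliases:
            known[alias.lower()] = inv_host

    mapping: Dict[str, str] = {}
    for node in node_names:
        inv_host = known.get(node.lower())
        if inv_host is not None:
            mapping[node] = inv_host
    return mapping


# Node names taken from the oc get nodes output above; the inventory
# hostnames on the left are made up.
inventory = {
    "host-a.example.com": ["infra-node-0.ocp311.rmeggins.test"],
    "host-b.example.com": ["master-0.ocp311.rmeggins.test", "192.168.99.15"],
}
nodes = ["infra-node-0.ocp311.rmeggins.test", "app-node-0.ocp311.rmeggins.test"]
print(map_nodes_to_inventory(nodes, inventory))
```

With a mapping like this, tasks such as creating the sysctl file can be delegated to the inventory host rather than to a node name that Ansible may not be able to reach.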
Comment 12 Qiaoling Tang 2018-08-01 02:06:34 EDT
I used the latest playbooks/openshift-logging/private/config.yml mentioned by ewolinet to deploy logging. The playbook ran successfully without any errors, and all pods are running and ready.
Comment 13 Qiaoling Tang 2018-08-01 02:07:39 EDT
According to comment 12, I removed the keyword "TestBlocker".
Comment 14 Jeff Cantrill 2018-08-01 11:00:13 EDT
Please close if this is no longer an issue.
Comment 15 Qiaoling Tang 2018-08-01 20:27:06 EDT
Waiting for a new official 3.11 puddle.
Comment 17 Qiaoling Tang 2018-08-23 02:34:47 EDT
Verified on 

openshift-ansible-docs-3.11.0-0.19.0.git.0.ebd1bf9None.noarch
openshift-ansible-roles-3.11.0-0.19.0.git.0.ebd1bf9None.noarch
openshift-ansible-3.11.0-0.19.0.git.0.ebd1bf9None.noarch
openshift-ansible-playbooks-3.11.0-0.19.0.git.0.ebd1bf9None.noarch
Comment 19 errata-xmlrpc 2018-10-11 03:22:24 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
