Bug 1648832 - Installation failed due to invalid local-ipv4 value configured as hostname in OpenStack
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Tzu-Mainn Chen
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-12 09:00 UTC by Daein Park
Modified: 2022-03-13 16:02 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-10 09:04:12 UTC
Target Upstream Version:
Embargoed:




Links
- Github: openshift/openshift-ansible pull 10670 (last updated 2018-12-03 03:58:51 UTC)
- Red Hat Product Errata: RHBA-2019:0024 (last updated 2019-01-10 09:05:49 UTC)

Description Daein Park 2018-11-12 09:00:56 UTC
Description of problem:

prerequisites.yml fails with the following error message on the OpenStack provider, because the hostname is set to a value in the "xxx.xxx.xxx.xxx,yyy.yyy.yyy.yyy" format.


~~~
TASK [Query DNS for IP address of 10.0.1.1,10.0.2.2] ****************************************
ok: [master1.example.com] => {"changed": false, "cmd": "getent ahostsv4 10.0.1.1,10.0.2.2 | head -n 1 | awk '{ print $1 }'", "delta": "0:00:04.234768", "end": "2018-11-07 06:28:45.514534", "failed": false, "failed_when_result": false, "rc": 0, "start": "2018-11-07 06:28:41.279799", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
~~~
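The empty stdout above is expected: `getent ahostsv4` receives the comma-joined value as a single hostname, which cannot resolve. The same failure can be illustrated with Python's resolver (a minimal sketch using the placeholder addresses from this report):

```python
import socket

# A plain IPv4 literal "resolves" to itself without touching DNS.
print(socket.gethostbyname("10.0.1.1"))

# The comma-joined metadata value is neither a valid IP literal nor a
# legal hostname, so resolution fails with socket.gaierror -- which is
# why the getent pipeline above prints nothing.
try:
    socket.gethostbyname("10.0.1.1,10.0.2.2")
except socket.gaierror as err:
    print("resolution failed:", err)
```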

Version-Release number of selected component (if applicable):

I verified this issue on v3.9, but it can also affect v3.10 and v3.11.

openshift v3.9.43
kubernetes v1.9.1+a0ce1bc657

rpm -q openshift-ansible
openshift-ansible-3.9.41-1.git.0.4c55974.el7.noarch

rpm -q ansible
ansible-2.4.6.0-1.el7ae.noarch

ansible --version
ansible 2.4.6.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]


How reproducible:

When installing OCP on OpenStack, you can query the metadata service with 'curl'; the output is as follows.

# curl http://169.254.169.254/latest/meta-data/local-ipv4
10.0.1.1,10.0.2.2

You can then reproduce the failure by running prerequisites.yml.



Actual results:

A wrong value is set as the hostname, so the run fails during prerequisites.yml.

Expected results:

The DNS query should resolve to the correct hostname.

Additional info:

The root cause is here in openshift_facts.py:

https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_facts/library/openshift_facts.py#L307-L315

~~~
    for f_var, h_var, ip_var in [('hostname', 'hostname', 'local-ipv4'),
                                 ('public_hostname', 'public-hostname', 'public-ipv4')]:
        try:
            if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var]:
                facts['network'][f_var] = metadata['ec2_compat'][h_var]
            else:
                facts['network'][f_var] = metadata['ec2_compat'][ip_var]
        except socket.gaierror:
            facts['network'][f_var] = metadata['ec2_compat'][ip_var]
~~~
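Because `gethostbyname()` on the metadata hostname returns a single address that can never equal the comma-joined `local-ipv4` string, the `else` branch assigns the raw `local-ipv4` value as the hostname. One possible guard (a sketch only, not necessarily what the merged PR implements; `first_ipv4` is a hypothetical helper, not part of openshift_facts.py) is to keep just the first address when the metadata value is a comma-separated list:

```python
def first_ipv4(raw_value):
    """Return the first address from a possibly comma-separated
    metadata value, e.g. "10.0.1.1,10.0.2.2" -> "10.0.1.1".

    Hypothetical helper for illustration; not part of openshift_facts.py.
    """
    if raw_value and ',' in raw_value:
        return raw_value.split(',')[0].strip()
    return raw_value

print(first_ipv4("10.0.1.1,10.0.2.2"))  # -> 10.0.1.1
print(first_ipv4("172.16.122.41"))      # -> 172.16.122.41
```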

The partial result of openshift_facts role:
~~~
               "provider": {
                    "metadata": {                        
                        "ec2_compat": {
                            ...
                            },
                            "hostname": "master1.example.com",
                            ...                            
                            "local-hostname": "master1.example.com",
                            "local-ipv4": "10.0.1.1,10.0.2.2",
                            ...
                            "public-hostname": "master1.example.com",
                            "public-ipv4": [],
                   ...
                   "name": "openstack",
                    "network": {
                        "hostname": "10.0.1.1,10.0.2.2",
                        "ip": "10.0.1.1",
                        "public_hostname": [],
                        "public_ip": []
                    },
~~~

Comment 3 Scott Dodson 2018-11-12 13:02:44 UTC
A few questions right now: are you configuring the OSP provider? We disabled this metadata inspection when the provider isn't defined in 3.10+.

Comment 4 Daein Park 2018-11-12 23:10:24 UTC
@Scott,

Yes, I am. "openshift_cloudprovider_kind=openstack" is configured in the inventory file. The customer is also concerned about how to configure the correct hostname on OpenStack for OCP versions newer than v3.9, such as v3.10 and v3.11. As of v3.10, "openshift_hostname" is disabled, so the workaround cannot be used on those versions.

Comment 11 Gaoyun Pei 2018-12-24 08:34:40 UTC
QE doesn't have an available OpenStack environment that still has this issue[1], so we could not reproduce this OCP installation bug directly.

Instead, I verified that the latest 3.11 installation on OSP10 works well after this PR merged.

[root@qe-gpei-311node-registry-router-1 ~]# curl http://169.254.169.254/latest/meta-data/local-ipv4
172.16.122.41


ansible-playbook playbooks/prerequisites.yml -v

TASK [Gather Cluster facts] ****************************************************

~~~
          hostname: qe-gpei-311node-registry-router-1
          instance-action: none
          instance-id: i-00d38302
          instance-type: m1.medium
          local-hostname: qe-gpei-311node-registry-router-1
          local-ipv4: 172.16.122.41
~~~

Verified with openshift-ansible-3.11.59-1.git.0.ba8e948.el7


[1]https://bugs.launchpad.net/nova/+bug/1334857

Comment 13 errata-xmlrpc 2019-01-10 09:04:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0024

