
Bug 1648832

Summary: Installation failed due to invalid local-ipv4 value configured as hostname in OpenStack
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-ansible
Version: 3.9.0
Target Release: 3.11.z
Severity: medium
Priority: medium
Status: CLOSED ERRATA
Reporter: Daein Park <dapark>
Assignee: Tzu-Mainn Chen <tzumainn>
QA Contact: Gaoyun Pei <gpei>
CC: gferrazs, gpei, tkimura
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2019-01-10 09:04:12 UTC

Description Daein Park 2018-11-12 09:00:56 UTC
Description of problem:

prerequisites.yml fails with the following error message on the OpenStack provider, because the hostname is set to a value in "xxx.xxx.xxx.xxx,yyy.yyy.yyy.yyy" format.


~~~
TASK [Query DNS for IP address of 10.0.1.1,10.0.2.2] ****************************************
ok: [master1.example.com] => {"changed": false, "cmd": "getent ahostsv4 10.0.1.1,10.0.2.2 | head -n 1 | awk '{ print $1 }'", "delta": "0:00:04.234768", "end": "2018-11-07 06:28:45.514534", "failed": false, "failed_when_result": false, "rc": 0, "start": "2018-11-07 06:28:41.279799", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
~~~
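For context: the comma-joined value is neither a hostname nor a valid address literal, so any resolver lookup on it comes back empty, which is why the getent pipeline above prints nothing. A minimal Python sketch of the same lookup (illustrative only, not part of the installer):

~~~
import socket

bad_value = "10.0.1.1,10.0.2.2"  # the value from the failing task above
try:
    print(socket.gethostbyname(bad_value))
except socket.gaierror as err:
    # Fails with "Name or service not known" (or similar) -- the same
    # reason `getent ahostsv4 10.0.1.1,10.0.2.2` produces empty stdout.
    print("lookup failed:", err)
~~~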

Version-Release number of selected component (if applicable):

I verified this issue on v3.9, but it can be problematic in v3.10 and v3.11 as well.

openshift v3.9.43
kubernetes v1.9.1+a0ce1bc657

rpm -q openshift-ansible
openshift-ansible-3.9.41-1.git.0.4c55974.el7.noarch

rpm -q ansible
ansible-2.4.6.0-1.el7ae.noarch

ansible --version
ansible 2.4.6.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May 31 2018, 09:41:32) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]


How reproducible:

If you install OCP on OpenStack, you can query the metadata service with 'curl'; the output is as follows.

# curl http://169.254.169.254/latest/meta-data/local-ipv4
10.0.1.1,10.0.2.2

Then you can reproduce the issue by running prerequisites.yml.
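As a quick pre-flight check (hypothetical, not something openshift-ansible does), you could fetch the same endpoint and verify that the value is a single well-formed IPv4 address before running the installer. A Python 3 sketch, assuming network access to the metadata service (the hosts in this report run Python 2.7, where urllib2 would be the equivalent):

~~~
import ipaddress
import urllib.request

URL = "http://169.254.169.254/latest/meta-data/local-ipv4"
value = urllib.request.urlopen(URL, timeout=5).read().decode().strip()

try:
    ipaddress.IPv4Address(value)  # raises ValueError on "10.0.1.1,10.0.2.2"
    print("local-ipv4 looks sane:", value)
except ValueError:
    print("broken local-ipv4 from metadata service:", value)
~~~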


Steps to Reproduce:
1.
2.
3.

Actual results:

The wrong value is set as the hostname, so the run fails during prerequisites.yml.

Expected results:

The DNS query should resolve to the correct hostname.

Additional info:

The root cause is here in openshift_facts.py:

https://github.com/openshift/openshift-ansible/blob/release-3.11/roles/openshift_facts/library/openshift_facts.py#L307-L315

~~~
    for f_var, h_var, ip_var in [('hostname', 'hostname', 'local-ipv4'),
                                 ('public_hostname', 'public-hostname', 'public-ipv4')]:
        try:
            if socket.gethostbyname(metadata['ec2_compat'][h_var]) == metadata['ec2_compat'][ip_var]:
                facts['network'][f_var] = metadata['ec2_compat'][h_var]
            else:
                facts['network'][f_var] = metadata['ec2_compat'][ip_var]
        except socket.gaierror:
            facts['network'][f_var] = metadata['ec2_compat'][ip_var]
~~~
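With metadata like the facts output below, the comparison can never succeed: socket.gethostbyname() either resolves the hostname to a single IP (which cannot equal the comma-joined string) or raises gaierror, and both the else and except branches then assign the broken local-ipv4 value. A condensed sketch of that fault path (values taken from this report):

~~~
import socket

ec2_compat = {
    "hostname": "master1.example.com",
    "local-ipv4": "10.0.1.1,10.0.2.2",  # broken comma-joined value
}
facts = {"network": {}}

h, ip = ec2_compat["hostname"], ec2_compat["local-ipv4"]
try:
    # Even if master1.example.com resolves, a single IP can never
    # equal "10.0.1.1,10.0.2.2", so the else branch runs ...
    if socket.gethostbyname(h) == ip:
        facts["network"]["hostname"] = h
    else:
        facts["network"]["hostname"] = ip
except socket.gaierror:
    # ... and if resolution fails, this branch runs instead. Either
    # way the bogus value becomes the hostname fact.
    facts["network"]["hostname"] = ip

print(facts["network"]["hostname"])  # -> 10.0.1.1,10.0.2.2
~~~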

The partial result of the openshift_facts role:
~~~
"provider": {
    "metadata": {
        "ec2_compat": {
            ...
            "hostname": "master1.example.com",
            ...
            "local-hostname": "master1.example.com",
            "local-ipv4": "10.0.1.1,10.0.2.2",
            ...
            "public-hostname": "master1.example.com",
            "public-ipv4": [],
            ...
        },
        ...
    },
    "name": "openstack",
    "network": {
        "hostname": "10.0.1.1,10.0.2.2",
        "ip": "10.0.1.1",
        "public_hostname": [],
        "public_ip": []
    },
~~~
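One possible mitigation (a sketch only, not necessarily what the PR later referenced in comment 11 does) would be to validate local-ipv4 before trusting it, falling back to the hostname key when the value is not a single valid IPv4 address:

~~~
import ipaddress

def pick_hostname(ec2_compat):
    """Prefer local-ipv4, but only when it is a single valid IPv4 address."""
    ip = ec2_compat.get("local-ipv4", "")
    try:
        ipaddress.IPv4Address(ip)          # rejects "10.0.1.1,10.0.2.2"
    except ValueError:
        return ec2_compat.get("hostname")  # fall back to the hostname key
    return ip

print(pick_hostname({"hostname": "master1.example.com",
                     "local-ipv4": "10.0.1.1,10.0.2.2"}))
# -> master1.example.com
~~~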

Comment 3 Scott Dodson 2018-11-12 13:02:44 UTC
A few questions right now: are you configuring the OSP provider? We disabled this metadata inspection when the provider isn't defined in 3.10+.

Comment 4 Daein Park 2018-11-12 23:10:24 UTC
@Scott,

Yes, I am. "openshift_cloudprovider_kind=openstack" is configured in the inventory file. The customer is also concerned about how to configure the correct hostname on OpenStack when the OCP version is newer than v3.9, such as v3.10 and v3.11. As of v3.10, "openshift_hostname" has been disabled, so that workaround cannot be used on those versions.

Comment 11 Gaoyun Pei 2018-12-24 08:34:40 UTC
QE doesn't have an OpenStack environment that still has issue [1] available, so we could not reproduce this OCP installation bug directly.

Instead, we verified that the latest 3.11 installation on OSP10 works well after this PR merged.

[root@qe-gpei-311node-registry-router-1 ~]# curl http://169.254.169.254/latest/meta-data/local-ipv4
172.16.122.41


ansible-playbook playbooks/prerequisites.yml -v

TASK [Gather Cluster facts] ****************************************************

~~~
          hostname: qe-gpei-311node-registry-router-1
          instance-action: none
          instance-id: i-00d38302
          instance-type: m1.medium
          local-hostname: qe-gpei-311node-registry-router-1
          local-ipv4: 172.16.122.41
~~~

Verified with openshift-ansible-3.11.59-1.git.0.ba8e948.el7


[1] https://bugs.launchpad.net/nova/+bug/1334857

Comment 13 errata-xmlrpc 2019-01-10 09:04:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0024