Bug 1251968 - [Tracker] - Hosted engine setup fails with localhost.localdomain could not be used as a valid FQDN
Status: CLOSED CURRENTRELEASE
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: Network
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ovirt-3.6.7
Target Release: 1.3.7.0
Assigned To: Simone Tiraboschi
QA Contact: Nikolai Sednev
Keywords: Triaged
Duplicates: 1188675 1347663
Depends On:
Blocks: 1339216
Reported: 2015-08-10 08:31 EDT by Martin Sivák
Modified: 2016-07-04 08:30 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
hosted-engine setup used to fail with an unclear error when the host address was localhost.localdomain. Now hosted-engine-setup on additional hosts lets the user review the host address and refuses to deploy with localhost.localdomain.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-07-04 08:30:31 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt-3.6.z+
rule-engine: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+


Attachments
Displayed error (35.42 KB, image/png)
2015-08-10 08:31 EDT, Martin Sivák
Log file (260.43 KB, text/plain)
2015-08-10 08:32 EDT, Martin Sivák


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 49310 master MERGED setup: ask host address on additional hosts 2016-05-25 08:50 EDT
oVirt gerrit 56856 ovirt-hosted-engine-setup-1.3 MERGED setup: ask host address on additional hosts 2016-05-24 07:45 EDT

Description Martin Sivák 2015-08-10 08:31:59 EDT
Created attachment 1061022 [details]
Displayed error

Description of problem:

I am trying to install a hosted engine host using ovirt-hosted-engine-setup-1.3.0-0.0.master.20150729070044 (git 26149d7) and ovirt-engine-appliance-20150802.0-1 on a RHEL 7.1 host.

The setup fails after I fill in all the details, complaining about localhost.localdomain even though I entered the real FQDN as you can see in the attached screenshot.

How reproducible:

Always


Steps to Reproduce:
1. Start the installation using ovirt-hosted-engine-setup
2. Enter the usual values (storage, disk boot, OVA image)
3. Enter the FQDN (he-vm04.rhev.lab.eng.brq.redhat.com in my case)
4. Answer all the other questions

Actual results:

localhost.localdomain related error

Expected results:

the setup continues and installs the hosted engine

Additional info:
Comment 1 Martin Sivák 2015-08-10 08:32:38 EDT
Created attachment 1061024 [details]
Log file
Comment 2 Simone Tiraboschi 2015-08-10 09:09:34 EDT
The error message is probably not very clear, but the issue is related to the host's hostname, which is localhost.localdomain and so cannot be uniquely resolved by the engine VM.
Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=1178535#c10

As Martin pointed out, the user can now also add an additional host from the web UI using just its IP address, so we have to review that decision.
Comment 3 Martin Sivák 2015-08-10 09:29:41 EDT
Thanks for the clarification. The host's hostname is indeed localhost.localdomain.

I see a couple of things that we should do here:

1) Improve the error reporting - I had no idea the setup tries to resolve the host's name before Simone told me (we did not do that in the past and it is not obvious from the log file)

2) Use socket.getfqdn() in the code that does the resolving, as the Python documentation states: Note: gethostname() doesn’t always return the fully qualified domain name; use getfqdn()

3) Give the user a chance to review and change the hostname (/etc/hostname does not have to be the name for the host in the DNS system)

4) Allow using the IP directly (we still support that in the ovirt-engine)

5) Warn the user that localhost will cause trouble with migrations, as libvirt specifically checks for that name (it should not, as vdsm provides all the extra information needed, but it does and we have to live with it atm). Libvirt does not require that the hostname is resolvable; it just has to be different from localhost.
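
A minimal sketch of points 2 and 5, not the actual hosted-engine-setup code: it reads both socket.gethostname() and socket.getfqdn() and flags a name that is a localhost alias. The helper names (is_localhost_name, describe_local_hostname) and the set of rejected names are assumptions made for illustration only.

```python
import socket

# Assumed list of names treated as "localhost"; not taken from the setup code.
LOCALHOST_NAMES = {"localhost", "localhost.localdomain"}


def is_localhost_name(name):
    """Return True if name is a localhost alias (case-insensitive, one trailing dot ignored)."""
    if name.endswith("."):
        name = name[:-1]
    return name.lower() in LOCALHOST_NAMES


def describe_local_hostname():
    """Collect what the two stdlib calls report for this machine."""
    short_name = socket.gethostname()  # may be a short, unqualified name
    fqdn = socket.getfqdn()            # tries to return a fully qualified name
    return {
        "gethostname": short_name,
        "getfqdn": fqdn,
        "is_localhost": is_localhost_name(fqdn),
    }
```

Calling describe_local_hostname() on the host from this report would be expected to set is_localhost to True, which is the case where the user should be warned and asked for another name.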
Comment 4 Yaniv Lavi 2015-08-20 07:33:38 EDT
Can you please open a separate RFE for each of the above requests and block this one as a tracker?
Comment 5 Yaniv Lavi 2015-08-20 07:35:38 EDT
Can you please open a bug on engine and the way we determine the host name?
Comment 6 Martin Sivák 2015-08-24 10:16:03 EDT
Yaniv: This is all related to a single code block in the hosted engine setup. We do not even have UI for that part (3rd point). It is up to Simone if he wants to track the ideas separately, but I guess all will end up in the same patchset anyway..
Comment 7 Sandro Bonazzola 2015-09-29 05:09:53 EDT
This is an automated message.
oVirt 3.6.0 RC1 has been released. This bug has no target release and still has its target milestone set to 3.6.0-rc.
Please review this bug and set target milestone and release to one of the next releases.
Comment 8 Sandro Bonazzola 2015-10-01 04:09:21 EDT
Postponing since it's not a 3.6.0 blocker.
Comment 9 Red Hat Bugzilla Rules Engine 2015-10-19 06:49:14 EDT
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 10 Sandro Bonazzola 2015-10-28 10:10:05 EDT
(In reply to Yaniv Dary from comment #5)
> Can you please open a bug on engine and the way we determine the host name?

Yaniv, needinfo on me was cleared by Martin on comment #6.
I'm also not sure I understand what I should ask in that bug beyond what is already mentioned in this bug itself.
Comment 11 Yedidyah Bar David 2015-10-28 10:27:43 EDT
Wouldn't this bug be solved by fixing bug 1188675 ?
Comment 12 Simone Tiraboschi 2015-10-28 10:55:53 EDT
*** Bug 1188675 has been marked as a duplicate of this bug. ***
Comment 13 Simone Tiraboschi 2015-10-28 12:07:27 EDT
It's probably better to focus on the requirements before taking any action: if we allow the user to enter custom values, we have to validate them somehow.

hosted-engine setup calls host.add on the REST API to add the host it is running on to the engine.
The host address is one of the parameters of that call, so the whole point is how to validate it to prevent further issues.

In 3.4 we added the host using, as the host address, the IP address of the interface on which hosted-engine-setup created the management bridge.

We changed it for two reasons:
- showing just that in hosted-engine --status is probably less usable than showing the hostname
- we found that we had an issue with live migrations because hosted-engine-setup was temporarily generating vdsm certs with the hostname and then host-deploy was overwriting them with the address we passed to host.add: https://bugzilla.redhat.com/show_bug.cgi?id=1178535#c10 so if we allow the user to customize it, we also have to fix that to avoid hitting it again.

so:
- localhost.localdomain is of course not valid, and the address should be well-formed
- is an IP address acceptable? should it match the management bridge IP? what if the host has a different network for migration?
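
To make the open questions concrete, here is a rough sketch of how a candidate host address could be sorted into the categories discussed: IP address, localhost alias, malformed name, or well-formed FQDN. It is an illustration under assumptions, not the validation that hosted-engine-setup performs; the function names, the returned labels, and the rule that an FQDN needs at least two labels are all hypothetical choices.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

# Assumed localhost aliases, for illustration only.
_LOCALHOST_NAMES = {"localhost", "localhost.localdomain"}


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_well_formed_fqdn(value):
    """Syntactic check only: at least two valid labels, at most 253 characters."""
    name = value[:-1] if value.endswith(".") else value
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False
    return all(_LABEL.fullmatch(label) for label in labels)


def classify_host_address(value):
    """Return 'ip', 'localhost', 'malformed' or 'fqdn' for a candidate address."""
    if is_ip_address(value):
        return "ip"
    name = value[:-1] if value.endswith(".") else value
    if name.lower() in _LOCALHOST_NAMES:
        return "localhost"
    if not is_well_formed_fqdn(value):
        return "malformed"
    return "fqdn"
```

A syntactic check like this says nothing about whether the name is resolvable by the engine VM or matches the management bridge; those questions need a separate check.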
Comment 14 Yaniv Lavi 2015-10-28 12:34:28 EDT
(In reply to Simone Tiraboschi from comment #13)
> It's probably better to focus on the requirements before taking any action:
> if we allow the user to enter custom values, we have to validate them somehow.
> 
> hosted-engine setup calls host.add on the REST API to add the host it is
> running on to the engine.
> The host address is one of the parameters of that call, so the whole point is
> how to validate it to prevent further issues.
> 
> In 3.4 we added the host using, as the host address, the IP address of the
> interface on which hosted-engine-setup created the management bridge.
> 
> We changed it for two reasons:
> - showing just that in hosted-engine --status is probably less usable than
> showing the hostname
> - we found that we had an issue with live migrations because
> hosted-engine-setup was temporarily generating vdsm certs with the
> hostname and then host-deploy was overwriting them with the address we passed
> to host.add: https://bugzilla.redhat.com/show_bug.cgi?id=1178535#c10 so if we
> allow the user to customize it, we also have to fix that to avoid hitting it
> again.
> 
> so:
> - localhost.localdomain is of course not valid, and the address should be
> well-formed
> - is an IP address acceptable? should it match the management bridge IP?
> what if the host has a different network for migration?


We should require DNS resolvable FQDN. IP should not be supported.
Comment 15 Simone Tiraboschi 2015-10-28 13:03:09 EDT
libvirt seems quite strict about TLS CN verification:

[root@c71het20151028 ~]# virsh -c qemu+tls://c71het20151028/system 
2015-10-28 16:36:02.248+0000: 13049: info : libvirt version: 1.2.8, package: 16.el7_1.4 (CentOS BuildSystem <http://bugs.centos.org>, 2015-09-15-14:00:05, worker1.bsys.centos.org)
2015-10-28 16:36:02.248+0000: 13049: warning : virNetTLSContextCheckCertificate:1145 : Certificate check failed Certificate [session] owner does not match the hostname c71het20151028
error: failed to connect to the hypervisor
error: authentication failed: Failed to verify peer's certificate

[root@c71het20151028 ~]# virsh -c qemu+tls://c71het20151028.localdomain/system 
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
       'quit' to quit

virsh # quit


We need to understand https://bugzilla.redhat.com/show_bug.cgi?id=1178535 better; otherwise I fear it will happen again if we allow the user to enter custom values.
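
The two virsh runs above differ only in the name used in the connection URI. As a simplified illustration (not libvirt's code, which leaves the check to its TLS library), a name check of this kind compares the name the client connected to with the names recorded in the certificate, so a short hostname does not match a certificate issued for the FQDN. The function name and the plain equality rule are assumptions, as is the guess that the certificate in the example carries c71het20151028.localdomain.

```python
def name_matches_certificate(connect_name, cert_names):
    """Return True if connect_name equals one of the certificate names.

    Comparison is case-insensitive and ignores a single trailing dot.
    Wildcards and subjectAltName handling are deliberately left out.
    """
    def normalize(name):
        if name.endswith("."):
            name = name[:-1]
        return name.lower()

    wanted = normalize(connect_name)
    return any(wanted == normalize(name) for name in cert_names)


# Assumed certificate content, inferred from which URI succeeded above.
cert_names = ["c71het20151028.localdomain"]
short_ok = name_matches_certificate("c71het20151028", cert_names)
fqdn_ok = name_matches_certificate("c71het20151028.localdomain", cert_names)
```

Under these assumptions short_ok is False and fqdn_ok is True, mirroring the failed and the successful connection.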
Comment 16 Martin Sivák 2015-10-29 06:53:54 EDT
> We should require DNS resolvable FQDN. IP should not be supported.

Why? It is still supported by the engine.
Comment 17 Yaniv Lavi 2015-10-29 08:41:55 EDT
(In reply to Martin Sivák from comment #16)
> > We should require DNS resolvable FQDN. IP should not be supported.
> 
> Why? It is still supported by the engine.

The fact that it might happen to work doesn't mean it is by design.
A DNS-resolvable FQDN is what we make sure works.
Comment 18 Mike McCune 2016-03-28 19:37:25 EDT
This bug was accidentally moved from POST to MODIFIED via an error in automation, please see mmccune@redhat.com with any questions
Comment 19 Sahina Bose 2016-04-21 02:45:03 EDT
This RFE is also required when we have multiple FQDNs for a host and need to specify which FQDN to use during additional host deployment; otherwise the bridge ends up being created on the wrong interface. See bug 1326709
Comment 20 Sandro Bonazzola 2016-05-02 05:48:39 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.
Comment 21 Sandro Bonazzola 2016-05-02 06:02:06 EDT
Re-targeting to 3.6.7 for Gluster's sake.
Comment 22 Sahina Bose 2016-05-05 07:56:33 EDT
We have a workaround: ensure that the hostname correctly resolves to the required FQDN before deploying HE. So we can retarget back to 4.0 if there's a bandwidth constraint.
Comment 23 Simone Tiraboschi 2016-05-05 08:04:18 EDT
It's basically ready.
Comment 24 Nikolai Sednev 2016-05-29 13:40:32 EDT
1) Can you please provide the desired reproduction steps for this bug?
2) Current status is as follows:
HE deployment using rhevm-appliance-20160515.0-1.el7ev.noarch on NFS succeeded on a clean host booted with a properly assigned FQDN, so I'm not sure this reproduction is sufficient.
Components on host:
ovirt-vmconsole-1.0.2-2.el7ev.noarch
ovirt-hosted-engine-setup-1.3.7.0-1.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
mom-0.5.3-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
vdsm-4.17.29-0.el7ev.noarch
rhevm-sdk-python-3.6.5.1-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.5.x86_64
ovirt-hosted-engine-ha-1.3.5.6-1.el7ev.noarch
ovirt-vmconsole-host-1.0.2-2.el7ev.noarch
rhevm-appliance-20160515.0-1.el7ev.noarch
Linux version 3.10.0-327.22.1.el7.x86_64 (mockbuild@x86-034.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon May 16 13:31:48 EDT 2016
Red Hat Enterprise Linux Server release 7.2 (Maipo)
Linux 3.10.0-327.22.1.el7.x86_64 #1 SMP Mon May 16 13:31:48 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

Engine:
rhevm-cli-3.6.2.1-1.el6ev.noarch
rhevm-dwh-setup-3.6.6-1.el6ev.noarch
rhevm-userportal-3.6.7-0.1.el6.noarch
rhevm-spice-client-x64-cab-3.6-7.el6.noarch
rhevm-setup-plugins-3.6.5-1.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-3.6.7-0.1.el6.noarch
rhevm-extensions-api-impl-3.6.7-0.1.el6.noarch
rhevm-tools-backup-3.6.7-0.1.el6.noarch
rhevm-dbscripts-3.6.7-0.1.el6.noarch
rhevm-backend-3.6.7-0.1.el6.noarch
rhevm-dependencies-3.6.0-1.el6ev.noarch
rhevm-spice-client-x86-cab-3.6-7.el6.noarch
rhevm-guest-agent-common-1.0.11-6.el6ev.noarch
rhevm-image-uploader-3.6.0-1.el6ev.noarch
rhevm-lib-3.6.7-0.1.el6.noarch
rhevm-setup-base-3.6.7-0.1.el6.noarch
rhevm-setup-plugin-websocket-proxy-3.6.7-0.1.el6.noarch
rhevm-vmconsole-proxy-helper-3.6.7-0.1.el6.noarch
rhevm-branding-rhev-3.6.0-10.el6ev.noarch
rhevm-reports-setup-3.6.5.1-1.el6ev.noarch
rhevm-webadmin-portal-3.6.7-0.1.el6.noarch
rhevm-3.6.7-0.1.el6.noarch
rhevm-log-collector-3.6.1-1.el6ev.noarch
rhevm-spice-client-x86-msi-3.6-7.el6.noarch
rhevm-setup-plugin-vmconsole-proxy-helper-3.6.7-0.1.el6.noarch
rhevm-setup-3.6.7-0.1.el6.noarch
rhevm-doc-3.6.7-1.el6eng.noarch
rhevm-reports-3.6.5.1-1.el6ev.noarch
rhevm-tools-3.6.7-0.1.el6.noarch
rhevm-websocket-proxy-3.6.7-0.1.el6.noarch
rhevm-dwh-3.6.6-1.el6ev.noarch
rhevm-restapi-3.6.7-0.1.el6.noarch
rhevm-spice-client-x64-msi-3.6-7.el6.noarch
rhevm-iso-uploader-3.6.0-1.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.7-0.1.el6.noarch
rhevm-sdk-python-3.6.5.1-1.el6ev.noarch
Red Hat Enterprise Linux Server release 6.8 (Santiago)
Linux version 2.6.32-642.el6.x86_64 (mockbuild@x86-033.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Wed Apr 13 00:51:26 EDT 2016
Linux 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux


The rhevm-appliance-20160515.0-1.el7ev.noarch ships with Red Hat Enterprise Virtualization Manager version 3.6.6.2-0.1.el6, so I had to update the engine right after its deployment finished. I also added reports and dwh (rhevm-dwh.noarch 0:3.6.6-1.el6ev) and ovirt-vmconsole-proxy-1.0.2-2.el6ev.noarch. The host was set to global maintenance during the upgrade, and after the engine was upgraded I reactivated the host.
Comment 25 Simone Tiraboschi 2016-05-30 03:39:28 EDT
(In reply to Nikolai Sednev from comment #24)
> 1) Can you please provide the desired reproduction steps for this bug?

Deploy the first host and the engine using the appliance, then try adding an additional host with 'hosted-engine --deploy': the script will now let you validate the host address, so:
1. ensure that the proposed value is correct
2. try replacing it with 'localhost.localdomain'
Comment 26 Nikolai Sednev 2016-05-30 06:15:19 EDT
(In reply to Simone Tiraboschi from comment #25)
> (In reply to Nikolai Sednev from comment #24)
> > 1) Can you please provide the desired reproduction steps for this bug?
> 
> Deploy the first host and the engine using the appliance, then try adding an
> additional host with 'hosted-engine --deploy': the script will now let you
> validate the host address, so:
> 1. ensure that the proposed value is correct
> 2. try replacing it with 'localhost.localdomain'

The initial deployment on the first host using the appliance was done as described in comment #24 and was successful. I then added an additional host after changing /etc/hostname to localhost.localdomain and running hostnamectl set-hostname localhost.localdomain. The deployment now warns the user about the problematic FQDN and its resolution issue. Entering an IP address is rejected as well; only when I provided the proper FQDN of the host did the deployment succeed:

[root@alma03 ~]# cat /etc/hostname
localhost.localdomain
[root@alma03 ~]# hostnamectl set-hostname localhost.localdomain
[root@alma03 ~]# hostname
localhost.localdomain
[root@alma03 ~]# hosted-engine --deploy
[ INFO  ] Stage: Initializing
[ INFO  ] Generating a temporary VNC password.
[ INFO  ] Stage: Environment setup
          Continuing will configure this host for serving as hypervisor and create a VM where you have to install the engine afterwards.
          Are you sure you want to continue? (Yes, No)[Yes]: 
          Configuration files: []
          Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160530120807-yyobn3.log
          Version: otopi-1.4.1 (otopi-1.4.1-1.el7ev)
          It has been detected that this program is executed through an SSH connection without using screen.
          Continuing with the installation may lead to broken installation if the network connection fails.
          It is highly recommended to abort the installation and run it inside a screen session using command "screen".
          Do you want to continue anyway? (Yes, No)[No]: yes
[ INFO  ] Hardware supports virtualization
[ INFO  ] Stage: Environment packages setup
[ INFO  ] Stage: Programs detection
[ INFO  ] Stage: Environment setup
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Generating libvirt-spice certificates
[ INFO  ] Stage: Environment customization
         
          --== STORAGE CONFIGURATION ==--
         
          During customization use CTRL-D to abort.
          Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs3, nfs4)[nfs3]: 
          Please specify the full shared storage connection path to use (example: host:/path): 10.35.64.11:/vol/RHEV/Virt/nsednev_3_6_HE_2
          The specified storage location already contains a data domain. Is this an additional host setup (Yes, No)[Yes]? 
[ INFO  ] Installing on additional host
          Please specify the Host ID [Must be integer, default: 2]: 
         
          --== SYSTEM CONFIGURATION ==--
         
[WARNING] A configuration file must be supplied to deploy Hosted Engine on an additional host.
[ INFO  ] Answer file successfully loaded
         
          --== NETWORK CONFIGURATION ==--
         
[ INFO  ] Additional host deployment, firewall manager is 'iptables'
          The following CPU types are supported by this host:
                 - model_SandyBridge: Intel SandyBridge Family
                 - model_Westmere: Intel Westmere Family
                 - model_Nehalem: Intel Nehalem Family
                 - model_Penryn: Intel Penryn Family
                 - model_Conroe: Intel Conroe Family
         
          --== HOSTED ENGINE CONFIGURATION ==--
         
          Enter the name which will be used to identify this host inside the Administrator Portal [hosted_engine_2]: 
          Enter 'admin@internal' user password that will be used for accessing the Administrator Portal: 
          Confirm 'admin@internal' user password: 
[ INFO  ] Stage: Setup validation
[WARNING] Cannot validate host name settings, reason: resolved host does not match any of the local addresses
          Please provide the address of this host.
          Note: The engine VM and all the other hosts should be able to correctly resolve it.
          Host address:  [localhost.localdomain]: 
[WARNING] Failed to resolve localhost.localdomain using DNS, it can be resolved only locally
[ ERROR ] Host name is not valid: localhost.localdomain resolves to 127.0.0.1 and not all of them can be mapped to non loopback devices on this host
          Please provide the address of this host.
          Note: The engine VM and all the other hosts should be able to correctly resolve it.
          Host address:  [localhost.localdomain]: 10.35.117.24
[ ERROR ] Host name is not valid: 10.35.117.24 is an IP address and not a FQDN. A FQDN is needed to be able to generate certificates correctly.
          Please provide the address of this host.
          Note: The engine VM and all the other hosts should be able to correctly resolve it.
          Host address:  [localhost.localdomain]: alma03.qa.lab.tlv.redhat.com
         
          --== CONFIGURATION PREVIEW ==--
         
          Engine FQDN                        : nsednev-he-2.qa.lab.tlv.redhat.com
          Bridge name                        : ovirtmgmt
          Host address                       : alma03.qa.lab.tlv.redhat.com
          SSH daemon port                    : 22
          Firewall manager                   : iptables
          Gateway address                    : 10.35.117.254
          Host name for web application      : hosted_engine_2
          Storage Domain type                : nfs3
          Host ID                            : 2
          Image size GB                      : 50
          GlusterFS Share Name               : hosted_engine_glusterfs
          GlusterFS Brick Provisioning       : False
          Storage connection                 : 10.35.64.11:/vol/RHEV/Virt/nsednev_3_6_HE_2
          Console type                       : vnc
          Memory size MB                     : 4096
          MAC address                        : 00:16:3E:7B:BB:BB
          Boot type                          : disk
          Number of CPUs                     : 4
          Restart engine VM after engine-setup: True
          CPU Type                           : model_SandyBridge
[ INFO  ] Stage: Transaction setup
[ INFO  ] Stage: Misc configuration
[ INFO  ] Stage: Package installation
[ INFO  ] Stage: Misc configuration
[ INFO  ] Configuring libvirt
[ INFO  ] Configuring VDSM
[ INFO  ] Starting vdsmd
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Waiting for VDSM hardware info
[ INFO  ] Configuring VM
[ INFO  ] Updating hosted-engine configuration
[ INFO  ] Stage: Transaction commit
[ INFO  ] Stage: Closing up
[ INFO  ] Acquiring internal CA cert from the engine
[ INFO  ] The following CA certificate is going to be used, please immediately interrupt if not correct:
[ INFO  ] Issuer: C=US, O=qa.lab.tlv.redhat.com, CN=nsednev-he-2.qa.lab.tlv.redhat.com.25977, Subject: C=US, O=qa.lab.tlv.redhat.com, CN=nsednev-he-2.qa.lab.tlv.redhat.com.25977, Fingerprint (SHA-1): 2EA33E00CF9BCA3774DA08D708110F570F655192
[ INFO  ] Connecting to the Engine
[ INFO  ] Waiting for the host to become operational in the engine. This may take several minutes...
[ INFO  ] Still waiting for VDSM host to become operational...
[ INFO  ] The VDSM Host is now operational
[ INFO  ] Enabling and starting HA services
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160530131113.conf'
[ INFO  ] Generating answer file '/etc/ovirt-hosted-engine/answers.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination
[ INFO  ] Hosted Engine successfully set up
[root@alma03 ~]# 



When /etc/hostname contains a proper, resolvable FQDN, the addition succeeds.
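
The rejection of localhost.localdomain in the log above comes down to the name resolving to 127.0.0.1. A minimal sketch of that one check, assuming nothing about how hosted-engine-setup implements it: resolve the name and refuse it if any resulting address is a loopback address. The function names and messages are hypothetical, and the further checks visible in the log (DNS versus local resolution, mapping addresses to local devices) are not covered.

```python
import ipaddress
import socket


def resolve_addresses(name):
    """Resolve name with the system resolver and return a set of IP strings."""
    infos = socket.getaddrinfo(name, None)
    return {info[4][0] for info in infos}


def loopback_addresses(addresses):
    """Return the sorted subset of addresses that are loopback addresses."""
    found = []
    for address in addresses:
        # Drop an IPv6 scope id such as '%eth0' before parsing.
        bare = address.split("%", 1)[0]
        if ipaddress.ip_address(bare).is_loopback:
            found.append(address)
    return sorted(found)


def check_host_address(name, resolver=resolve_addresses):
    """Return (ok, message); resolver is injectable so the check can be tested offline."""
    try:
        addresses = resolver(name)
    except socket.gaierror as err:
        return False, "cannot resolve %s: %s" % (name, err)
    bad = loopback_addresses(addresses)
    if bad:
        return False, "%s resolves to loopback address(es): %s" % (name, ", ".join(bad))
    return True, "ok"
```

For example, check_host_address("localhost.localdomain", resolver=lambda name: {"127.0.0.1"}) reports a failure that names 127.0.0.1, while a resolver returning only 10.35.117.24 passes.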
Comment 27 Simone Tiraboschi 2016-06-17 09:51:13 EDT
*** Bug 1347663 has been marked as a duplicate of this bug. ***
