Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1145678

Summary: If host is not accessible by its DNS name, a misleading error is displayed
Product: [Retired] oVirt Reporter: Nir Soffer <nsoffer>
Component: ovirt-engine-core Assignee: Marcin Mirecki <mmirecki>
Status: CLOSED CURRENTRELEASE QA Contact: Pavel Stehlik <pstehlik>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.5 CC: bazulay, bugs, ecohen, gklein, lsurette, mburman, mmirecki, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: network
Fixed In Version: 3.6.0-4_alpha3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-04 11:35:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description: engine log Flags: none

Description Nir Soffer 2014-09-23 13:33:59 UTC
Created attachment 940438
engine log

Description of problem:

When a host is not accessible by its DNS name
(e.g. voodoo1.example.com), the engine fails to connect to the host
with java.net.UnknownHostException.

In the event log, we see:

Host voodoo1 is non responsive.

And later:

Host voodoo1 is not responding. It will stay in Connecting state for a grace period of 120 seconds and after that an attempt to fence the host will be issued.

Since the host cannot be accessed, fencing the host fails and the host
stays in the non-operational state.

Version-Release number of selected component (if applicable):
3.5.0-0.0.master.20140911091402.gite1c5ffd.fc20

How reproducible:
100%

Steps to Reproduce:
1. Cause DNS resolution of the host name to fail (somehow)
2. Activate the host

Actual results:
Host stays in non-responsive mode forever

Expected results:
Engine complains that the host is inaccessible

Additional info:

I don't know why the host was not accessible using the DNS name.
While the engine was failing pathetically with java.net.UnknownHostException,
I could ping and access this host using ssh from the same machine
the engine was running on. Understanding the root cause of this exception
is another issue; I'm not sure there is enough information here
to resolve it.
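
One way to narrow down a case like this is to run the same lookup from a JVM on the engine machine and compare it with what ping and ssh see. The following is only a minimal sketch and is not taken from the engine code; the class name ResolveCheck and the method check are hypothetical names chosen for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.StringJoiner;

public class ResolveCheck {

    // Hypothetical helper: resolves hostName the way a JVM would and
    // returns a one-line description of the outcome.
    static String check(String hostName) {
        try {
            InetAddress[] addresses = InetAddress.getAllByName(hostName);
            StringJoiner joined = new StringJoiner(", ");
            for (InetAddress address : addresses) {
                joined.add(address.getHostAddress());
            }
            return hostName + " resolves to " + joined;
        } catch (UnknownHostException e) {
            // Same exception type that the engine log shows in this report.
            return hostName + " cannot be resolved from this JVM: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The default host name is the example from this report.
        String hostName = args.length > 0 ? args[0] : "voodoo1.example.com";
        System.out.println(check(hostName));
    }
}
```

Running it with the host's name as the only argument, as the same user that runs the engine, shows whether a JVM on that machine can resolve the name at that moment.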

Comment 1 Michael Burman 2015-08-02 13:28:41 UTC
Can't reproduce step 1 above. I edited /etc/resolv.conf to contain only 127.0.0.1 and couldn't ping the hostname, but managed to activate the host successfully in RHEV-M.

Comment 2 Marcin Mirecki 2015-08-04 10:08:35 UTC
Try executing a new action on the host.
For example opening the setup networks dialog and changing something seems to work.

Comment 3 Michael Burman 2015-08-04 11:44:46 UTC
Hi Marcin,

I had no issues executing actions on the host. Network changes were saved.

Comment 4 Marcin Mirecki 2015-08-04 11:53:04 UTC
Did you change any networks? Try adding a new network to a nic, or change its properties. When I just click 'Ok', the connection is ok.

Comment 5 Michael Burman 2015-08-04 13:23:43 UTC
Yes, I removed a network from a NIC successfully.

Comment 6 Marcin Mirecki 2015-08-12 08:21:00 UTC
This behaves differently than on my setup. Restarting the engine should always work.

Comment 7 Michael Burman 2015-08-18 05:51:59 UTC
Hi again Marcin,

So I performed the following steps:

1) RHEV-M 3.6.0-0.11.master.el6 with 5 servers installed (vdsm-4.17.2-1.el7ev.noarch)

2) Set 1 server to maintenance
3) Edited /etc/resolv.conf to contain only:
nameserver 127.0.0.1
and saved.
4) As long as the ovirt-engine service wasn't restarted, all servers in the engine stayed UP, and I managed to bring the server UP from maintenance and perform some network changes via Setup Networks successfully.

5) Once ovirt-engine was restarted, all servers changed to the Non-responsive state. I couldn't bring the server UP from maintenance; it failed with the error:

"The address of host silver-vdsa.qa.lab.tlv.redhat.com could not be determined"

This error message was displayed in the event log for all the servers in the setup.

- So Marcin, is this the fix for this bug?
That is, the error message for cases in which the host's hostname can't be resolved:
"The address of host 'name-server' could not be determined"?

Please ACK, so i can move this bug to verified, thanks.

Comment 8 Marcin Mirecki 2015-08-18 07:15:18 UTC
Yes, this is the added error message which should be appearing in this scenario.
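
As an illustration only, the idea of reporting an unresolvable name separately from a generic non-responsive host could be sketched as below. This is not the actual ovirt-engine change: the class HostAddressMessage, the Resolver interface and the method describeConnectFailure are hypothetical names. Only the two message texts are modeled on the ones quoted in this bug.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostAddressMessage {

    // Hypothetical abstraction over the name lookup, so that the lookup
    // can be replaced with a stub when trying the logic out.
    interface Resolver {
        InetAddress resolve(String hostName) throws UnknownHostException;
    }

    // Picks an event message for a host that could not be contacted.
    // An unresolvable name gets its own message instead of the generic one.
    static String describeConnectFailure(String hostName, Resolver resolver) {
        try {
            resolver.resolve(hostName);
            // The name resolves, so the failure has some other cause.
            return "Host " + hostName + " is not responding.";
        } catch (UnknownHostException e) {
            return "The address of host " + hostName + " could not be determined";
        }
    }

    public static void main(String[] args) {
        // The default host name is the example from this report.
        String hostName = args.length > 0 ? args[0] : "voodoo1.example.com";
        System.out.println(describeConnectFailure(hostName, InetAddress::getByName));
    }
}
```

Passing the lookup in as a parameter is just a convenience for this sketch, so the two branches can be exercised without changing the machine's DNS configuration.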

Comment 9 Michael Burman 2015-08-18 07:55:41 UTC
Verified on - 3.6.0-0.11.master.el6

Comment 10 Sandro Bonazzola 2015-11-04 11:35:44 UTC
oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.