Bug 1145678 - If host is not accessible by its DNS name, a misleading error is displayed
Summary: If host is not accessible by its DNS name, a misleading error is displayed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.6.0
Assignee: Marcin Mirecki
QA Contact: Pavel Stehlik
URL:
Whiteboard: network
Depends On:
Blocks:
Reported: 2014-09-23 13:33 UTC by Nir Soffer
Modified: 2016-02-10 19:36 UTC (History)

Fixed In Version: 3.6.0-4_alpha3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-04 11:35:44 UTC
oVirt Team: Network
Embargoed:


Attachments
engine log (41.12 KB, application/x-xz)
2014-09-23 13:33 UTC, Nir Soffer


Links
System: oVirt gerrit
ID: 43198
Private: 0
Priority: master
Status: MERGED
Summary: engine - changed error msg when host hostname can not be resolved
Last Updated: Never

Description Nir Soffer 2014-09-23 13:33:59 UTC
Created attachment 940438
engine log

Description of problem:

When a host is not accessible using its DNS name
(e.g. voodoo1.example.com), the engine fails to connect to the host
with java.net.UnknownHostException.

In the event log, we see:

Host voodoo1 is non responsive.

And later:

Host voodoo1 is not responding. It will stay in Connecting state for a grace period of 120 seconds and after that an attempt to fence the host will be issued.

Since the host cannot be accessed, fencing the host fails and the host
stays in a non-operational state.

Version-Release number of selected component (if applicable):
3.5.0-0.0.master.20140911091402.gite1c5ffd.fc20

How reproducible:
100%

Steps to Reproduce:
1. Cause DNS resolution of the host's name to fail (somehow)
2. Activate the host

Actual results:
The host stays in non-responsive state indefinitely

Expected results:
The engine complains that the host is inaccessible

Additional info:

I don't know why the host was not accessible using the DNS name.
While the engine was failing pathetically with java.net.UnknownHostException,
I could ping and access this host using ssh from the same machine
the engine was running on. Understanding the root cause of this exception
is another issue; I'm not sure there is enough information here
to resolve it.
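
One way to narrow down the discrepancy described above (ping and ssh work, yet the engine sees java.net.UnknownHostException) might be to run the lookup from inside a JVM on the engine machine. The following is only a minimal diagnostic sketch, not engine code: the class name ResolveCheck and its resolve helper are hypothetical, and it relies only on the standard java.net.InetAddress API. A standalone JVM does not necessarily share the engine JVM's settings or cached lookups, so a successful result here does not rule out a problem inside the engine.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ovirt-engine.
class ResolveCheck {

    /**
     * Returns the textual addresses the JVM resolves for the given host,
     * or an empty list if the lookup fails with UnknownHostException.
     */
    public static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Resolution failed: report it as "no addresses".
        }
        return addresses;
    }

    public static void main(String[] args) {
        // The default name is the example host from this report.
        String host = args.length > 0 ? args[0] : "voodoo1.example.com";
        List<String> addresses = resolve(host);
        if (addresses.isEmpty()) {
            System.out.println("JVM could not resolve " + host);
        } else {
            System.out.println(host + " resolves to " + addresses);
        }
    }
}
```

Running it with the host's name as the only argument, from the engine machine, shows whether the Java resolver agrees with what ping reports.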

Comment 1 Michael Burman 2015-08-02 13:28:41 UTC
Can't reproduce step 1 ^^. I edited /etc/resolv.conf to contain only 127.0.0.1 and could no longer ping the hostname, but still managed to activate the host successfully in RHEV-M.

Comment 2 Marcin Mirecki 2015-08-04 10:08:35 UTC
Try executing a new action on the host.
For example opening the setup networks dialog and changing something seems to work.

Comment 3 Michael Burman 2015-08-04 11:44:46 UTC
Hi Marcin,

I had no issues executing actions on the host. Network changes were saved.

Comment 4 Marcin Mirecki 2015-08-04 11:53:04 UTC
Did you change any networks? Try adding a new network to a NIC, or changing its properties. When I just click 'Ok', the connection is ok.

Comment 5 Michael Burman 2015-08-04 13:23:43 UTC
Yes, I removed a network from a NIC successfully.

Comment 6 Marcin Mirecki 2015-08-12 08:21:00 UTC
This behaves differently than on my setup. Restarting the engine should always reproduce it.

Comment 7 Michael Burman 2015-08-18 05:51:59 UTC
Hi again Marcin,

So I performed the following steps:

1) RHEV-M 3.6.0-0.11.master.el6 with 5 servers installed (vdsm-4.17.2-1.el7ev.noarch)

2) Set 1 server to maintenance
3) Edited /etc/resolv.conf so that it contains only:
nameserver 127.0.0.1
and saved.
4) As long as the ovirt-engine service wasn't restarted, all servers in the engine stayed UP, and I managed to bring the server UP from maintenance and perform some network changes via Setup Networks successfully.

5) Once ovirt-engine was restarted, all servers changed to Non-responsive state. I couldn't bring the server UP from maintenance; it failed with the error:

"The address of host silver-vdsa.qa.lab.tlv.redhat.com could not be determined"

This error message was displayed in the event log for all the servers in the setup.

So Marcin, is this the fix for this bug?
That is, when a host's hostname can't be resolved, the error message is:
"The address of host 'name-server' could not be determined"?

Please ACK so I can move this bug to verified, thanks.

Comment 8 Marcin Mirecki 2015-08-18 07:15:18 UTC
Yes, this is the added error message which should be appearing in this scenario.
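
The distinction confirmed above, a dedicated message when the hostname cannot be resolved instead of the generic "non responsive" event, can be sketched roughly as follows. This is an illustrative sketch only and not the change merged in gerrit 43198: the class HostConnectReporter, the Resolver interface and the describeFailure method are hypothetical names, and the lookup is injected so the logic can be exercised without a real DNS server. The two message texts are the ones quoted in this report.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration, not the actual ovirt-engine implementation.
class HostConnectReporter {

    /** Hypothetical abstraction over the name lookup, injectable for testing. */
    interface Resolver {
        InetAddress resolve(String hostname) throws UnknownHostException;
    }

    /**
     * Chooses the event text after a failed connection attempt:
     * a resolution failure gets its own message, anything else
     * falls back to the generic "non responsive" text.
     */
    public static String describeFailure(String hostname, Resolver resolver) {
        try {
            resolver.resolve(hostname);
        } catch (UnknownHostException e) {
            return "The address of host " + hostname + " could not be determined";
        }
        return "Host " + hostname + " is non responsive.";
    }

    public static void main(String[] args) {
        // The default name is the example host from this report.
        String host = args.length > 0 ? args[0] : "voodoo1.example.com";
        // InetAddress.getByName is the standard JDK lookup.
        System.out.println(describeFailure(host, InetAddress::getByName));
    }
}
```

Keeping the lookup separate from the connection attempt is just one possible design; catching UnknownHostException at the point where the connection is opened would serve the same purpose.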

Comment 9 Michael Burman 2015-08-18 07:55:41 UTC
Verified on - 3.6.0-0.11.master.el6

Comment 10 Sandro Bonazzola 2015-11-04 11:35:44 UTC
oVirt 3.6.0 has been released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.

