1358530 – jsonrpcclient fails connecting with the default parameters if the hostname is not resolvable

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	vdsm
Classification:	oVirt
Component:	Bindings-API
Sub Component:
Version:	4.18.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	ovirt-4.0.4
Target Release:	4.18.13
Assignee:	Piotr Kliczewski
QA Contact:	Petr Kubica
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	1367458 1373018 (view as bug list)
Depends On:
Blocks:	1160423 1377161

Reported:	2016-07-20 22:38 UTC by Ryan Barry
Modified:	2022-08-15 13:15 UTC (History)
CC List:	20 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-09-26 12:36:49 UTC
oVirt Team:	Infra
Embargoed:
Dependent Products:
Flags:	rule-engine: ovirt-4.0.z+ rule-engine: exception+ ylavi: planning_ack+ mperina: devel_ack+ pstehlik: testing_ack+

Attachments	(Terms of Use)
setup_and_vdsm_logs (18.75 KB, application/x-gzip) 2016-07-20 22:38 UTC, Ryan Barry	no flags	Details

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1160423	high	CLOSED	hosted-engine --deploy doesn't copy DNS config to ovirtmgmt	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1329943	urgent	CLOSED	myhostname is missing from the hosts line in nsswitch.conf	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1350883	high	CLOSED	vdscli.connect's heuristic ends up reading the local server address from vdsm config, where it finds the default ipv6-lo...	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1373018	unspecified	CLOSED	Hosted Engine fails to deploy with "Couldnt connect to VDSM within 240 seconds"	2021-02-22 00:41:40 UTC
Red Hat Issue Tracker	RHV-47828	None	None	None	2022-08-15 13:15:08 UTC
oVirt gerrit	61782	'None'	'MERGED'	'jsonvdscli: change hostname default'	2019-11-25 09:00:08 UTC
oVirt gerrit	63308	'None'	'MERGED'	'jsonvdscli: change hostname default'	2019-11-25 09:00:08 UTC

Internal Links: 1160423 1329943 1350883 1373018

Description Ryan Barry 2016-07-20 22:38:54 UTC

Created attachment 1182277 [details]
setup_and_vdsm_logs

Description of problem:
If the system hostname is changed with hostnamectl, hosted-engine --deploy fails: ovirt-hosted-engine-setup times out trying to connect to vdsm.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.0.1-1
vdsm-4.18.6-1

How reproducible:
100%

Steps to Reproduce:
1. Install a system
2. hostnamectl set-hostname foobar && hosted-engine --deploy

Actual results:
ovirt-hosted-engine-setup times out trying to connect to vdsm

Expected results:
This works

Additional info:

Comment 3 Yaniv Lavi 2016-07-21 08:18:27 UTC

We will not be fixing this type of issue. You can do a lot of ugly things during setup like change networks, storage, host name and much more. We will not support this. You can either change name before or after the setup.

Comment 4 Moran Goldboim 2016-07-21 11:55:11 UTC

(In reply to Yaniv Dary from comment #3)
> We will not be fixing this type of issue. You can do a lot of ugly things
> during setup like change networks, storage, host name and much more. We will
> not support this. You can either change name before or after the setup.

Yaniv, this is a very basic flow and is probably expected to be done by every user:
they log in to Cockpit, change the hostname and go on to install hosted-engine.
It happened to me last night with beta2 bits.

I think it needs immediate attention to understand its scope (it wasn't reproduced on the virt-qe env).

Comment 5 Ryan Barry 2016-07-21 12:07:58 UTC

(In reply to Yaniv Dary from comment #3)
> We will not be fixing this type of issue. You can do a lot of ugly things
> during setup like change networks, storage, host name and much more. We will
> not support this. You can either change name before or after the setup.

Note that this is about changing the hostname BEFORE setup, with hostnamectl.

Comment 6 Yaniv Lavi 2016-07-21 14:09:48 UTC

OK, in bash:
 hostnamectl set-hostname foobar && hosted-engine --deploy
means to run at the same time.

Comment 7 Yaniv Lavi 2016-07-21 14:11:14 UTC

Sandro, can you have a look?

Comment 8 Simone Tiraboschi 2016-07-21 14:43:26 UTC

I think that this could be somehow related to:
http://lists.ovirt.org/pipermail/users/2016-June/040578.html
and we have other complaints about similar issues.

In hosted-engine-setup we are using something like:
 requestQueues = vdsmconfig.get('addresses', 'request_queues')
 requestQueue = requestQueues.split(",")[0]
 cli = jsonrpcvdscli.connect(
     requestQueue=requestQueue,
 )


Under lib/vdsm/jsonrpcvdscli.py in vdsm we have:

def _create(requestQueue,
            host=None, port=None,
            useSSL=None,
            responseQueue=None):
    if host is None:
        host = socket.gethostname()
    if port is None:
        port = int(config.getint('addresses', 'management_port'))

so, since we are not passing any value for host parameter, the behavior simply depends on the result of socket.gethostname()
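
A quick way to see whether this default can work on a given host is to check whether the name returned by socket.gethostname() resolves at all. The following is a minimal diagnostic sketch in Python 3, not vdsm code; the helper name `hostname_resolves` is hypothetical.

```python
import socket


def hostname_resolves(name):
    """Return True if `name` resolves to at least one address."""
    try:
        # getaddrinfo consults the system resolver (hosts file, DNS, ...).
        return bool(socket.getaddrinfo(name, None))
    except socket.gaierror:
        # Raised when the resolver cannot map the name to an address.
        return False


if __name__ == "__main__":
    name = socket.gethostname()
    state = "resolves" if hostname_resolves(name) else "does NOT resolve"
    print(name, state)
```

If the check reports that the name does not resolve, a client that relies on the socket.gethostname() default cannot be expected to reach vdsm.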

Comment 9 Simone Tiraboschi 2016-07-21 15:14:54 UTC

While from VDSM logs we see:
MainThread::INFO::2016-07-20 18:32:57,812::protocoldetector::179::vds.MultiProtocolAcceptor::(__init__) Listening at :::54321

so it seems related to the IPv4/IPv6 topic

Comment 10 Robert Story 2016-07-22 12:26:09 UTC

(In reply to Yaniv Dary from comment #6)
> OK, in bash:
>  hostnamectl set-hostname foobar && hosted-engine --deploy
> means to run at the same time.

Actually, '&' means run in the background. '&&' is the 'and' operator, which means only run the second command if the first succeeds.

Comment 11 GervaisdeM 2016-07-22 12:55:23 UTC

I experienced a similar issue when setting up one of my hosts on 3.6. I set the hostname with nmtui before disabling NetworkManager and the vdsm would not start during hosted-engine --deploy.

I can't say for certain how I got it working (sorry) as I tried many things. I did open a second session to the server and try starting (or restarting) the vdsmd service before hosted-engine --deploy timed out.

Now I am on 4.0.1 and having errors. It has been suggested that the errors I am having are related to this one.

Comment 12 Simone Tiraboschi 2016-07-22 14:21:12 UTC

Dan, Oved, can you please have somebody take a look?
It seems quite relevant since we already have several complaints from upstream users.

Comment 13 Fabian Deutsch 2016-07-22 17:41:32 UTC

Simone, Ryan, can you please check whether the myhostname module is used in the hosts line in /etc/nsswitch.conf?

I suspect that this is bug 1329943.

Comment 14 GervaisdeM 2016-07-22 17:57:42 UTC

I don't think it is the same. The things that are failing in https://bugzilla.redhat.com/show_bug.cgi?id=1329943 all seem to work fine for me:

```
$ hostname -f
cultivar3.grove.silverorange.com

$ cat /etc/nsswitch.conf | grep host
#hosts:     db files nisplus nis dns
hosts:      files dns myhostname

$ cat /etc/hostname
cultivar3.grove.silverorange.com

$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
###       oVirt role entries       ###
192.168.0.203       cultivar.grove.silverorange.com cultivar
192.168.0.204       cultivar0.grove.silverorange.com cultivar0
192.168.0.205       cultivar1.grove.silverorange.com cultivar1
192.168.0.206       cultivar2.grove.silverorange.com cultivar2
192.168.0.207       cultivar3.grove.silverorange.com cultivar3

$ ping cultivar3.grove.silverorange.com
PING cultivar3.grove.silverorange.com (192.168.0.207) 56(84) bytes of data.
64 bytes from cultivar3.grove.silverorange.com (192.168.0.207): icmp_seq=1 ttl=64 time=0.028 ms
...
```

(In reply to Fabian Deutsch from comment #13)
> Simone, Ryan, can you please check if the myhostname module is used in the
> hosts line in /etc/nsswitch.conf.
> 
> I suspect that this is bug 1329943.

Comment 15 Yaniv Bronhaim 2016-07-25 08:11:08 UTC

Can you share the /var/log/messages log as well? I can't see whether vdsm failed to start; judging by vdsm.log, vdsm is up.
Can you check whether vdsClient getVdsCaps works? It looks like the requests from HE don't get to vdsm.

Comment 16 Simone Tiraboschi 2016-07-25 08:20:53 UTC

AFAIK Yaniv, VDSM is up but HE could not connect to it using jsonrpcvdscli with the default values.

Comment 17 Piotr Kliczewski 2016-07-25 09:44:29 UTC

As Simone pointed out in comment #8, we use socket.gethostname() if the host name was not provided. We assume that we want to connect to localhost. Whatever value this call returns, we use it to connect to vdsm. There are 2 ways to overcome the issue:

1. Make sure that socket.gethostname() returns a pingable hostname
2. Update the code to provide a proper hostname
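
The second option can be sketched as follows. This is only an illustration in Python 3 and not the change that was merged; `pick_vdsm_host` and its `fallback` parameter are hypothetical names. It prefers an explicitly supplied host, then the system hostname if it resolves, and otherwise falls back to a fixed name.

```python
import socket


def pick_vdsm_host(host=None, fallback="localhost"):
    """Choose the host name a local client should connect to.

    Hypothetical helper: an explicit `host` always wins; otherwise the
    system hostname is used if it resolves, else `fallback`.
    """
    if host is not None:
        return host
    candidate = socket.gethostname()
    try:
        socket.getaddrinfo(candidate, None)
    except socket.gaierror:
        # The hostname is not resolvable, so do not hand it to the client.
        return fallback
    return candidate
```

A caller would pass the result as the host argument of the client instead of relying on the client's own default.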

Comment 18 Simone Tiraboschi 2016-07-25 11:27:54 UTC

So in general, name resolution has to work properly even just to connect to vdsm on the local host.
Under this assumption, rhbz#1160423 becomes more relevant.

Comment 19 Simone Tiraboschi 2016-07-25 13:04:26 UTC

See also rhbz#1350883: vdsm now binds on :: by default, so we can also experience a connection issue if IPv6 is disabled on the host.

Comment 20 Yaniv Bronhaim 2016-07-27 15:45:10 UTC

The description of the bug is not reasonable:
1. Install a system
2. hostnamectl set-hostname foobar && hosted-engine --deploy

The new hostname must be resolvable, of course. As mentioned in comment #14, the issue is probably related to some other configuration and not to the connectivity to vdsm. Please update if that's the case. From my check, after setting the hostname to something that is mapped to localhost in /etc/hosts, jsonrpcvdscli works well.

Comment 21 Moran Goldboim 2016-07-28 13:07:19 UTC

(In reply to Yaniv Bronhaim from comment #20)
> The description of the bug is not reasonable:
> 1. Install a system
> 2. hostnamectl set-hostname foobar && hosted-engine --deploy
> 
> The new hostname must be resolvable of course. As mentioned in comment #14
> the issue is probably related to some other configuration and not the
> connectivity to vdsm. Please update if that's the case - from my check,
> after setting hostname to something that defined to localhost in /etc/hosts
> , jsonrpcvdscli works well.

this is true (just validated it), nevertheless it's a bad user experience.
is it a hostnamectl bug from your perspective?

Comment 22 Simone Tiraboschi 2016-07-28 13:10:04 UTC

(In reply to Moran Goldboim from comment #21)
> this is true (just validated it), nevertheless it's a bad user experience.
> is it a hostnamectl bug from your perspective?

We just found another reproducer: we have the patch for 1350883, we didn't change the hostname with hostnamectl, and the hostname is resolvable, but the issue is still there.

Comment 25 Edward Haas 2016-07-28 17:22:04 UTC

(In reply to Simone Tiraboschi from comment #22)
> (In reply to Moran Goldboim from comment #21)
> > this is true (just validated it), nevertheless it's a bad user experience.
> > is it a hostnamectl bug from your perspective?
> 
> Now we just found another reproducer, we have the patch for 1350883, we
> didn't change the hostname with hostnamectl, the hostname is resolvable but
> the issue is still here.

What issue exactly?
We started with the hostname unresolved and mentioned the IPv6 issue.

Please include the logs with the problem.

Comment 26 Simone Tiraboschi 2016-07-29 10:01:21 UTC

(In reply to Edward Haas from comment #25)
> What issue exactly?
> We started with the hostname unresolved and mentioned the IPv6 issue.

The IPv6 issue has been correctly addressed here:
https://gerrit.ovirt.org/#/c/61363/

hosted-engine-setup uses jsonrpcvdscli to connect to VDSM over loopback on the same host.
It just uses the default address, which ends up being socket.gethostname().

The issue is what happens if it's not correctly/easily resolvable.
In the reproducer we saw yesterday, due to other reasons, we missed the default route so it wasn't able to reach the DNS.

The hostname was correctly configured under /etc/hostname but there wasn't any specific entry under /etc/hosts

ping <hostname> was working, but hosted-engine-setup was still not able to connect to vdsm using the same hostname, so it seems that Python name resolution works a bit differently from the OS one.

> Please include the logs with the problem.

in hosted-engine-setup logs, 
2016-07-28 12:03:43 INFO otopi.plugins.gr_he_setup.system.vdsmenv util.connect_vdsm_json_rpc:194 Waiting for VDSM to reply

in loop till

2016-07-28 12:03:43 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-setup/system/vdsmenv.py", line 94, in _late_setup
    timeout=ohostedcons.Const.VDSCLI_SSL_TIMEOUT,
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 198, in connect_vdsm_json_rpc
    timeout=MAX_RETRY * DELAY
RuntimeError: Couldnt  connect to VDSM within 240 seconds


jsonrpcclient doesn't log anywhere, and on the VDSM side we didn't find any sign of an attempted connection.
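
Because the client logs nothing, the timeout above hides the underlying error. A retry loop that remembers the last failure makes the final message more useful. The sketch below is a generic Python 3 illustration, not the ovirt_hosted_engine_ha implementation; `wait_for_connection` and its parameters are hypothetical, and only the idea of a retry count multiplied by a delay is taken from the traceback.

```python
import time


def wait_for_connection(connect, max_retry, delay, sleep=time.sleep):
    """Call `connect()` until it succeeds or the retry budget is used up."""
    last_error = None
    for _ in range(max_retry):
        try:
            return connect()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            sleep(delay)
    # Report why the last attempt failed, not only that time ran out.
    raise RuntimeError(
        "Couldn't connect within %d seconds (last error: %r)"
        % (max_retry * delay, last_error)
    )
```

With this shape, an unresolvable hostname would surface in the final error as a name resolution failure instead of a bare timeout.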

Comment 27 Sandro Bonazzola 2016-07-29 13:24:11 UTC

Reducing priority to high according to comment #26

Comment 28 Edward Haas 2016-07-31 10:48:22 UTC

The client heuristic is to use localhost if no target is provided, it seems reasonable to me to keep this logic.

If the caller prefers, the client can be called with a specific target, overriding the default.

What is the output of:
python -c 'import socket ; print socket.gethostname()'
Compare to hostname?
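
The one-liner above uses the Python 2 print statement, which matches the python2.7 paths in the traceback. On Python 3 the same comparison could be sketched as below; `hostname_views` is a hypothetical helper name, and os.uname() is assumed to be available (it is on Linux).

```python
import os
import socket


def hostname_views():
    """Collect the host name as reported through two different APIs."""
    return {
        "socket.gethostname": socket.gethostname(),
        "os.uname.nodename": os.uname().nodename,
    }


if __name__ == "__main__":
    for source, value in hostname_views().items():
        print("%-20s %s" % (source, value))
```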

Comment 29 Yaniv Lavi 2016-08-01 08:13:35 UTC

Also, whoever replies: please move to ON_QA and add TestOnly if this should already be fixed.

Comment 30 Simone Tiraboschi 2016-08-01 08:37:42 UTC

(In reply to Edward Haas from comment #28)
> The client heuristic is to use localhost if no target is provided, it seems
> reasonable to me to keep this logic.

This is not really true,
the default is socket.gethostname()

https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=lib/vdsm/jsonrpcvdscli.py;h=98f6f9ca02ae0590c8ad8eeb70ebcfcd8a457b97;hb=refs/heads/ovirt-4.0#l201

and this requires the hostname to be correctly resolvable, hence this issue whenever we have DNS or network problems.
Using 'localhost' as the default would probably make it more robust against this kind of issue.

Comment 31 Simone Tiraboschi 2016-08-01 12:45:58 UTC

Moving to vdsm - jsonrpcclient as per comment 30.

Comment 33 Piotr Kliczewski 2016-08-01 12:59:45 UTC

We could fix it either by changing jsonrpcvdscli or by changing the calling code to provide the correct host name when calling the client.

Comment 34 Edward Haas 2016-08-01 14:52:34 UTC

(In reply to Simone Tiraboschi from comment #30)
> (In reply to Edward Haas from comment #28)
> > The client heuristic is to use localhost if no target is provided, it seems
> > reasonable to me to keep this logic.
> 
> This is not really true,
> the default is socket.gethostname()

Sorry, I meant hostname.

> 
> https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=lib/vdsm/jsonrpcvdscli.
> py;h=98f6f9ca02ae0590c8ad8eeb70ebcfcd8a457b97;hb=refs/heads/ovirt-4.0#l201
> 
> and this requires the hostname to be correctly resolvable and so this issue
> when we have DNS or network issues.

hostname is usually resolvable from the hosts file, it does not require dns or network access.
I guess it depends on /etc/nsswitch.conf settings.

> Probably using 'localhost' as the default can make it more robust on this
> kind of issues.

hostname was used with the hope to avoid the question of which IP version to use. I am not sure how localhost will behave in this case.
We can try it, but it must be checked with IPv6 enabled and disabled (at host level).
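
To check this on a host with IPv6 enabled and then disabled, it helps to list which address families a name resolves to and in which order. The following Python 3 sketch is only a diagnostic aid; `address_families` is a hypothetical helper, and the default port 54321 is the one shown in the VDSM log line in comment 9.

```python
import socket


def address_families(host, port=54321):
    """Return (family name, address) pairs that `host` resolves to, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [
        (socket.AddressFamily(family).name, sockaddr[0])
        for family, _type, _proto, _canon, sockaddr in infos
    ]


if __name__ == "__main__":
    for name in ("localhost", socket.gethostname()):
        try:
            print(name, address_families(name))
        except socket.gaierror as exc:
            print(name, "does not resolve:", exc)
```

Running it for both 'localhost' and the system hostname shows whether a client would be offered an IPv6 address, an IPv4 address, or both.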

Comment 35 Simone Tiraboschi 2016-08-24 12:22:54 UTC

*** Bug 1367458 has been marked as a duplicate of this bug. ***

Comment 36 Simone Tiraboschi 2016-09-08 09:44:27 UTC

*** Bug 1373018 has been marked as a duplicate of this bug. ***

Comment 37 Petr Kubica 2016-09-19 09:07:42 UTC

Verified in 4.0.4-4
vdsm-4.18.13-1.el7ev.x86_64

alain
bugs
danken
edwardh
fdeutsch
fweimer
gervais
lsurette
mgoldboi
mperina
oourfali
pkliczew
rbarry
rheslop
rs
sbonazzo
stirabos
ybronhei
ykaul
ylavi