Description of problem:

Trying to install a SHE 4.2.3 on a RHEL 7.5 hypervisor using "hosted-engine --deploy", the install fails with "Failed connect to <ManagerFQDN>:443; No route to host".

This issue happens because the Hosted Engine VM took a DHCP address from the default network created by libvirt. This overrides the ovirtmgmt network IP address that was configured in the setup process.

Also, an entry with this Manager DHCP IP address is created in the hypervisor's /etc/hosts, and the original IP address selected in the setup questions is no longer reachable.

Version-Release number of selected component (if applicable):
- rhvm-appliance-4.2-20180504.0.el7.noarch
- rhvm-4.2.3.4-0.1.el7.noarch
- RHEL 7.5 (3.10.0-862.3.2.el7.x86_64)

How reproducible:
100%

Steps to Reproduce:
- Install a fresh RHEL 7.5
- Enable the following repos:
~~~
# subscription-manager repos --enable=rhel-7-server-rpms
# subscription-manager repos --enable=rhel-7-server-rhv-4-mgmt-agent-rpms
# subscription-manager repos --enable=rhel-7-server-ansible-2-rpms
# yum update
~~~
- Proceed with the SHE setup:
~~~
# yum install ovirt-hosted-engine-setup
# yum install screen
# screen
# hosted-engine --deploy
~~~
- Answer the questions accordingly

Actual results:
The setup fails with:
~~~
2018-06-06 20:40:24,434-0400 ERROR otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:98 fatal: [localhost]: FAILED! => {"attempts": 30, "changed": false, "content": "", "msg": "Status code was -1 and not [200]: Connection failure: timed out", "redirected": false, "status": -1, "url": "http://manager.example.com/ovirt-engine/services/health"}
~~~
* In the /etc/hosts of the hypervisor you will find the DHCP IP entry from the libvirt default network (192.168.124.xxx in this reproduction).
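For diagnosis, the DHCP range of libvirt's default network, its active leases, and the /etc/hosts entry written during setup can all be inspected from the hypervisor. A minimal sketch; the network name "default" and the grep pattern are assumptions based on this report:
~~~
# Show the definition of libvirt's default network, including its DHCP range
virsh net-dumpxml default

# List active DHCP leases; the engine VM's temporary address appears here
virsh net-dhcp-leases default

# Check which address was written for the engine FQDN during setup
grep manager /etc/hosts
~~~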
Created attachment 1448813 [details] Hosted engine deploy logs
* Before the deploy, on the hypervisor:
~~~
cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.122.87 rhel75.example.com rhel75
~~~
~~~
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:5a:c4:2a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.87/24 brd 192.168.122.255 scope global noprefixroute dynamic ens3
       valid_lft 3573sec preferred_lft 3573sec
    inet6 fe80::385b:c199:cf90:cc9d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
~~~

* After the deploy, on the hypervisor:
~~~
cat /etc/hosts
192.168.124.215 manager41.example.com
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
~~~
~~~
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:5a:c4:2a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.87/24 brd 192.168.122.255 scope global noprefixroute dynamic ens3
       valid_lft 3375sec preferred_lft 3375sec
    inet6 fe80::385b:c199:cf90:cc9d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:3d:9d:b1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.124.1/24 brd 192.168.124.255 scope global virbr0
       valid_lft forever preferred_lft forever
4: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:3d:9d:b1 brd ff:ff:ff:ff:ff:ff
5: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:16:3e:7d:fe:41 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc16:3eff:fe7d:fe41/64 scope link
       valid_lft forever preferred_lft forever
~~~

The Manager VM took the IP address 192.168.124.215 instead of 192.168.122.100:
~~~
[root@manager41 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:16:3e:7d:fe:41 brd ff:ff:ff:ff:ff:ff
    inet 192.168.124.215/24 brd 192.168.124.255 scope global noprefixroute dynamic eth0
       valid_lft 2308sec preferred_lft 2308sec
    inet6 fe80::216:3eff:fe7d:fe41/64 scope link
       valid_lft forever preferred_lft forever
~~~

NOTE: Disregard the FQDN manager41; this is a 4.2 installation.
(In reply to Santiago Maudet from comment #0)
> This issue happens because the Hosted Engine VM took a DHCP address from
> the default network created by libvirt. This overrides the ovirtmgmt network
> IP address that was configured in the setup process.
>
> Also, an entry with this Manager DHCP IP address is created in the
> hypervisor's /etc/hosts, and the original IP address selected in the setup
> questions is no longer reachable.

This is exactly the expected behavior for the bootstrap process: it launches a temporary bootstrap VM on the libvirt default network, starting from a local disk. That engine is used to configure the remaining pieces (deploy the host, configure its networking, create a storage domain, create a VM...), and only at the end is the disk of the bootstrap VM transferred over the disk of the VM created by the engine on the shared storage.

From the logs I see that engine-setup was correctly executed via ansible on the bootstrap local VM, so it was reachable over ssh on its temporary address on the libvirt default network, but then the host failed to reach the engine on that same address.

Did you manually edit /etc/hosts in the middle of the deployment?

Do you also have logs from the engine VM?
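While the deployment is running, the temporary bootstrap VM and the lease it obtained can be observed from the hypervisor. A minimal sketch; the domain name HostedEngineLocal is an assumption (verify with the list command first):
~~~
# List all libvirt domains; the temporary bootstrap VM should be running
virsh -r list --all

# Show the address the bootstrap VM obtained on the default network
# (HostedEngineLocal is assumed; use the name shown by the list above)
virsh -r domifaddr HostedEngineLocal --source lease
~~~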
Lowering to high since adding the engine FQDN to the no_proxy list is enough to avoid this.
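For reference, a minimal workaround sketch for a proxied host; proxy.example.com and manager.example.com are placeholders, substitute your own values:
~~~
# Hypothetical proxy configuration for illustration only
export http_proxy="http://proxy.example.com:3128"
export https_proxy="http://proxy.example.com:3128"

# Exclude the engine FQDN from proxying so the host can reach
# the engine's health page directly
export no_proxy="manager.example.com,localhost,127.0.0.1"

hosted-engine --deploy
~~~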
Verified on:
ovirt-hosted-engine-setup-2.2.23-1.el7ev.noarch

Verification steps:
1. Set a proxy on the host:
   # export {http,https,ftp}_proxy="http://PROXY_SERVER:PORT"
   (Optional check: run # env | grep -i proxy to see the proxy settings, and verify the internet connection.)
2. Run:
   # hosted-engine --deploy

Results:
After a while in the installation:
~~~
[ INFO ] TASK [Check http/https proxy]
[ ERROR ] {u'changed': False, u'_ansible_no_log': False, u'url': u'http://lrotenbe-he-1.scl.lab.tlv.redhat.com/', u'_ansible_item_result': True, u'failed': True, u'msg': u'Your system is configured to use a proxy, please add an exception for http://lrotenbe-he-1.scl.lab.tlv.redhat.com/ with no_proxy directive.\n', u'_ansible_item_label': u'http://lrotenbe-he-1.scl.lab.tlv.redhat.com/'}
[ ERROR ] {u'changed': False, u'_ansible_no_log': False, u'url': u'https://lrotenbe-he-1.scl.lab.tlv.redhat.com/', u'_ansible_item_result': True, u'failed': True, u'msg': u'Your system is configured to use a proxy, please add an exception for https://lrotenbe-he-1.scl.lab.tlv.redhat.com/ with no_proxy directive.\n', u'_ansible_item_label': u'https://lrotenbe-he-1.scl.lab.tlv.redhat.com/'}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook
~~~
The hosted engine deployment failed as expected.
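The condition the setup validates above can also be checked by hand before deploying. A minimal shell sketch, an approximation of the check rather than the setup's actual validation code; the FQDN is a placeholder:
~~~
#!/bin/bash
# Hypothetical pre-flight check: fail if a proxy is configured but the
# engine FQDN is not covered by no_proxy. manager.example.com is a placeholder.
FQDN="manager.example.com"

if [ -n "${http_proxy}${https_proxy}" ]; then
    case ",${no_proxy}," in
        *",${FQDN},"*)
            echo "OK: ${FQDN} is excluded from proxying" ;;
        *)
            echo "ERROR: add ${FQDN} to no_proxy before deploying" >&2
            exit 1 ;;
    esac
fi
~~~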