Bug 1616158 - Check that the DHCP-assigned IP of the hosted-engine belongs to the same subnet that the ha-hosts belong to.
Summary: Check that the DHCP-assigned IP of the hosted-engine belongs to the same subnet, ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: Network
Version: 2.2.24
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: ovirt-4.5.0
Target Release: 2.6.1
Assignee: Asaf Rachmani
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: 2044362 2050512
 
Reported: 2018-08-15 06:54 UTC by Nikolai Sednev
Modified: 2022-04-20 06:33 UTC
CC List: 5 users

Fixed In Version: ovirt-setup-lib-1.3.3, ovirt-hosted-engine-setup-2.6.1
Doc Type: Enhancement
Doc Text:
A check has been added to Self Hosted Engine Setup to ensure that the IP address resolved from the oVirt Engine FQDN belongs to the same subnet as the host that will run the Self Hosted Engine Agent.
Clone Of:
Environment:
Last Closed: 2022-04-20 06:33:59 UTC
oVirt Team: Integration
Embargoed:
sbonazzo: ovirt-4.5+
rule-engine: planning_ack?
pm-rhel: devel_ack+
rule-engine: testing_ack?




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 117407 0 master MERGED plugins: cloud_init: Check that host and engine are in the same subnet 2021-12-16 11:24:56 UTC
oVirt gerrit 117832 0 master MERGED hostname: Add an option for extra tests 2021-12-10 09:05:23 UTC

Description Nikolai Sednev 2018-08-15 06:54:50 UTC
Description of problem:
During deployment of HE, there should be a check that the IP address assigned to the engine belongs to the same subnet as the ha-host.
For example, if the ha-host belongs to subnet 10.1.1.0/24 and the engine has been assigned an IP address by DHCP from a different subnet, such as 20.7.7.1/24, then the deployment should fail with an appropriate error.
Per the official RH documentation, the engine VM and the ha-hosts have to lie within the same VLAN and the same subnet.
Allowing the engine to be configured with an IP address from a different subnet than the ha-hosts will lead to routing issues: migration will depend on router availability and performance will degrade.
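
For illustration, the requested check is just a subnet-membership test; a minimal sketch (not the actual setup code), assuming Python's ipaddress module:

import ipaddress

def same_subnet(host_cidr: str, engine_ip: str) -> bool:
    """Return True if engine_ip falls inside the host's subnet."""
    # e.g. "10.1.1.5/24" -> network 10.1.1.0/24
    host_net = ipaddress.ip_interface(host_cidr).network
    return ipaddress.ip_address(engine_ip) in host_net

print(same_subnet("10.1.1.5/24", "10.1.1.20"))  # True  - same subnet as the ha-host
print(same_subnet("10.1.1.5/24", "20.7.7.1"))   # False - leased from a different subnet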

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.25-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.16-1.el7ev.noarch
ovirt-hosted-engine-setup.noarch
Linux 3.10.0-862.11.6.el7.x86_64 #1 SMP Fri Aug 10 16:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)


How reproducible:
100%

Steps to Reproduce:
1. Have a host within subnet "A".
2. Deploy HE over any type of storage on the host, but have DHCP assign the engine an IP address from a different subnet, subnet "B".

Actual results:
Deployment continues without any warning.

Expected results:
Deployment should be stopped with an appropriate error.

Additional info:

Comment 1 Yedidyah Bar David 2018-08-15 07:05:43 UTC
(In reply to Nikolai Sednev from comment #0)
> Description of problem:
> During deployment of HE, there should be a check, that compares the assigned
> IP address for the engine belongs to the same subnet as ha-host belongs to.

Makes sense, but:

> For example, if ha-host belongs to subnet 10.1.1.0/24, and engine had been
> assigned IP address by DHCP from different subnet, like 20.7.7.1/24, then
> the deployment should fail with appropriate error.

Why fail?

> By official RH documentation, engine-VM and ha-hosts have to lay within the
> same VLAN and the same subnet. 

Fine, then perhaps warn, or also prompt.

The code does not try to restrict users to doing only what's in the official documentation. Documentation says how to use the product, and also what's in the scope of supported flows/setups/etc. So we might decide not to support having them in different networks, but I'm not sure we must prevent that, if it has any reasonable chance to work.

> Allowing engine to be configured with IP address from different subnet than
> ha-hosts, will lead in to routing issues,

Which?

> the migration will be dependent on
> router availability and performance will degrade.

I do not think migration has much to do with the engine's IP addresses. It's about whether the hosts use the same network for this.

Also:

The engine vm (the final one, in node-zero aka ansible deployment) starts up rather late in the game, and only then can we know which IP address it got. Failing at that point is very annoying, IMO.

So your next request will be to look up the provided name and check the resultant address. Then people will complain that this breaks their flow if they do not have a static assignment for the engine vm (quite common for dev/test envs) and rely on adding the name with the eventually-leased address to /etc/hosts...
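
For illustration only, a rough sketch of such a lookup-and-warn check (warning rather than failing), assuming Python's socket and ipaddress modules; the FQDN and prefix below are placeholders, not the actual setup code:

import ipaddress
import socket

def warn_if_engine_outside_host_subnet(engine_fqdn, host_cidr):
    """Resolve the engine FQDN and warn if none of the resolved addresses is in the host's subnet."""
    host_net = ipaddress.ip_interface(host_cidr).network
    resolved = {info[4][0] for info in socket.getaddrinfo(engine_fqdn, None)}
    # Strip a possible IPv6 zone index ("%eth0") before parsing.
    outside = [ip for ip in resolved
               if ipaddress.ip_address(ip.split('%')[0]) not in host_net]
    if resolved and len(outside) == len(resolved):
        print("WARNING: %s resolves to %s, outside the host subnet %s"
              % (engine_fqdn, sorted(outside), host_net))

try:
    # Placeholder values; resolution depends on local DNS or /etc/hosts entries.
    warn_if_engine_outside_host_subnet("engine.example.com", "10.1.1.5/24")
except socket.gaierror as err:
    print("Could not resolve the engine FQDN:", err)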

Comment 2 Nikolai Sednev 2018-08-15 07:31:09 UTC
(In reply to Yedidyah Bar David from comment #1)
> (In reply to Nikolai Sednev from comment #0)
> > Description of problem:
> > During deployment of HE, there should be a check, that compares the assigned
> > IP address for the engine belongs to the same subnet as ha-host belongs to.
> 
> Makes sense, but:
> 
> > For example, if ha-host belongs to subnet 10.1.1.0/24, and engine had been
> > assigned IP address by DHCP from different subnet, like 20.7.7.1/24, then
> > the deployment should fail with appropriate error.
> 
> Why fail?

You may warn instead of failing, just like it was decided in https://bugzilla.redhat.com/show_bug.cgi?id=1506240.
> 
> > By official RH documentation, engine-VM and ha-hosts have to lay within the
> > same VLAN and the same subnet. 
> 
> Fine, then perhaps warn, or also prompt.
I agree.
> 
> The code does not try to restrict users to do only what's in the official
> documentation. Documentation says how to use the product, and also what's in
> the scope of supported flows/setups/etc. So we might decide to not support
> having them in different networks, but not sure we must prevent that, if it
> has any reasonable chance to work.
> 
> > Allowing engine to be configured with IP address from different subnet than
> > ha-hosts, will lead in to routing issues,
> 
> Which?
VDSM is managing the network, and any static routes will be wiped out on host reboots.
> 
> > the migration will be dependent on
> > router availability and performance will degrade.
> 
> I do not think migration has much to do with the engine's IP addresses. It's
> about whether the hosts use the same network for this.
The engine's migration will be influenced by router availability, as the hosts will have to reach the engine through it, which is an unnecessary point of failure.
> 
> Also:
> 
> The engine vm (final one, in node-zero aka ansible deployment) starts up
> rather late in the game, and only then we can know which IP address it got.
> Failing then is very annoying, IMO.
I agree, but this is due to architecture and implementation, which of course can be improved.
> 
> So your next request will be to lookup the provided name and check the
> resultant address. Then people will complain that this breaks their flow if
> they do not have a static assignment for the engine vm (quite common for
> dev/test envs) and rely on adding the name with the eventually-leased
> address to /etc/hosts...
In production, both DHCP and static IP configurations have to be supported for both IPv4 and IPv6.

Comment 3 Sandro Bonazzola 2019-11-20 11:30:39 UTC
Missed 4.4 feature freeze, low severity. Closing as deferred due to capacity. We'll consider re-opening if we have capacity.

Comment 4 Martin Tessun 2020-02-05 08:46:54 UTC
Hi Nikolai,

(In reply to Nikolai Sednev from comment #0)
> Description of problem:
> During deployment of HE, there should be a check, that compares the assigned
> IP address for the engine belongs to the same subnet as ha-host belongs to.

Why? I know lots of scenarios and deployments where this isn't the case (e.g. having a DC host/customer network but also having an Admin Network where the HE would "live"). So this is a completely valid scenario from a network-separation perspective.
The only thing that needs to be ensured is that the HE can reach the host.

> For example, if ha-host belongs to subnet 10.1.1.0/24, and engine had been
> assigned IP address by DHCP from different subnet, like 20.7.7.1/24, then
> the deployment should fail with appropriate error.

Why? As long as my DHCP setup and the routing information on the host can reach each subnet, it is still fine. AFAIK the connection between the HE and the host is checked during deployment.

> By official RH documentation, engine-VM and ha-hosts have to lay within the
> same VLAN and the same subnet. 

Ok. So we need to change that, as the reachability is the key here.

> Allowing engine to be configured with IP address from different subnet than
> ha-hosts, will lead in to routing issues, the migration will be dependent on
> router availability and performance will degrade.

Migration won't be dependent on routers, as the hosts themselves are in the same network. Also, there won't be performance degradation, as the RHV-M<->Host connection isn't needed for migration; Host<->Host is.

So I would either move this to documentation, to clarify the above-mentioned network documentation, or otherwise close it.

> 
> Version-Release number of selected component (if applicable):
> ovirt-hosted-engine-setup-2.2.25-1.el7ev.noarch
> ovirt-hosted-engine-ha-2.2.16-1.el7ev.noarch
> ovirt-hosted-engine-setup.noarch
> Linux 3.10.0-862.11.6.el7.x86_64 #1 SMP Fri Aug 10 16:55:11 UTC 2018 x86_64
> x86_64 x86_64 GNU/Linux
> Red Hat Enterprise Linux Server release 7.5 (Maipo)
> 
> 
> How reproducible:
> 100%
> 
> Steps to Reproduce:
> 1.Have a host within subnet "A".
> 2.Deploy HE over any type of storage on host, but assign an IP address by
> DHCP from different subnet to engine, from subnet "B".
> 
> Actual results:
> Deployment continues without any warning.
> 
> Expected results:
> Deployment should be stopped with appropriate error.
> 
> Additional info:

Comment 5 Nikolai Sednev 2020-02-05 17:33:11 UTC
(In reply to Martin Tessun from comment #4)
> Hi Nikolai,
> 
> (In reply to Nikolai Sednev from comment #0)
> > Description of problem:
> > During deployment of HE, there should be a check, that compares the assigned
> > IP address for the engine belongs to the same subnet as ha-host belongs to.
> 
> Why? I know lots of scenarios and deployments where this isn't the case.
> (E.g. having a DC host/customer network but also having an Admin Network
> where the HE would "live". So this is a complete valid scenario from network
> separation perspective.
> Only thing that needs to be ensured is that the HE can reach the host.
> 
This is because HE was designed to run within the same subnet, and the documentation also advises so.
Running the engine with an IP from a different subnet than the one its ha-host is in should be tested and supported via a separate RFE, and also covered in appropriate documentation. HE ha-hosts should also be in the same host cluster by design.
> > For example, if ha-host belongs to subnet 10.1.1.0/24, and engine had been
> > assigned IP address by DHCP from different subnet, like 20.7.7.1/24, then
> > the deployment should fail with appropriate error.
> 
> Why? As long as my DHCP setup and routing information on the host can reach
> each subnet, it is still fine. Afaik the connection between HE and Host is
> checked during deployment.
It's again a matter of an RFE; this was not tested before, or meant to be tested/supported. In the new deployment workflow (node-0, aka ansible), during deployment the ha-host reaches the engine over a reserved IP on the local NAT, provided by libvirt's local DHCP at the initial stage; then the engine's disk is copied to the storage, the engine gets a new IP from the DHCP server, and finally the ha-host checks the engine's availability (liveliness).
> 
> > By official RH documentation, engine-VM and ha-hosts have to lay within the
> > same VLAN and the same subnet. 
> 
> Ok. So we need to change that, as the reachability is the key here.
Agreed.
> 
> > Allowing engine to be configured with IP address from different subnet than
> > ha-hosts, will lead in to routing issues, the migration will be dependent on
> > router availability and performance will degrade.
> 
> Migration won't be dependent on routers, as the hosts themselves are in the
> same network. Also there won't be performance degradation as the connection
> RHV-M<->Host isn't needed for migration, but Host<->Host instead.
> 
> So I would either move this to documentation to clarify on the above
> mentioned network documentation - otherwise close it.
Again, I haven't tested such scenarios before. I totally agree that the HE might be part of a different subnet, but such a scenario was never tested by me and was never advised to customers in the official documentation. Regarding the network used during migration, in case the engine lies in a different subnet, I think the hosts will have to use the HE's subnet connectivity, hence delays, latency and router reachability are expected issues.
> 
> > 
> > Version-Release number of selected component (if applicable):
> > ovirt-hosted-engine-setup-2.2.25-1.el7ev.noarch
> > ovirt-hosted-engine-ha-2.2.16-1.el7ev.noarch
> > ovirt-hosted-engine-setup.noarch
> > Linux 3.10.0-862.11.6.el7.x86_64 #1 SMP Fri Aug 10 16:55:11 UTC 2018 x86_64
> > x86_64 x86_64 GNU/Linux
> > Red Hat Enterprise Linux Server release 7.5 (Maipo)
> > 
> > 
> > How reproducible:
> > 100%
> > 
> > Steps to Reproduce:
> > 1.Have a host within subnet "A".
> > 2.Deploy HE over any type of storage on host, but assign an IP address by
> > DHCP from different subnet to engine, from subnet "B".
> > 
> > Actual results:
> > Deployment continues without any warning.
> > 
> > Expected results:
> > Deployment should be stopped with appropriate error.
> > 
> > Additional info:

Comment 6 Yedidyah Bar David 2020-02-06 07:03:43 UTC
Thinking about this again, I think I agree with Nikolai. With a hosted-engine deployment, the engine VM runs on one of the hosts. Its first vNIC is attached during deploy to the bridge of the 'ovirtmgmt' network (and I do not think the user can change that later), and one of the host's physical NICs (or bonds) also has to be attached to this network - deploy asks about it and does that. So for the engine and host to communicate over a router, they should either:

1. Have two logical networks (subnets) on a single physical network (ovirtmgmt) and have a router bounce packets back to the same leg for "routing". I do not think this makes much sense (although in theory it might work).

2. Connect through a different network. Say, have a net2 bridge network, with the host having a NIC on it and the engine a vNIC on it, and communicate between them through these two networks (ovirtmgmt and net2) via a router. That does not make much sense to me either. Even if both of them have net2, why wouldn't they communicate over ovirtmgmt?

Again, the whole point here is that it's a hosted-engine. For a standalone one, there is definitely no problem having the engine and (some of) its hosts on two different networks with a router between them.

That said, I am not sure, as I wrote 1.5 years ago in comment 1, that we should _prevent_ that. I do guess that this flow (having a test find out that they are in different subnets) is quite rare, and in almost all cases is unintended (meaning, a misconfiguration in the DHCP server or a mistake during input to deploy). It's also somewhat complex to do (see the end of my comment 1).

Comment 7 Sandro Bonazzola 2020-03-18 09:29:37 UTC
This may possibly be covered by documentation, but we need to provide instructions to follow for avoiding this.

Comment 8 Sandro Bonazzola 2020-11-06 16:02:54 UTC
Moved out to 4.5 due to capacity

Comment 9 Asaf Rachmani 2021-10-21 14:03:15 UTC
I'm not sure I understand what the environment looks like in order to reproduce this bug.
If the host is in subnet "A" and a DHCP IP can be assigned from subnet "B" (I think it's possible only if there is routing between the subnets), it means that the engine VM cannot reach the default gateway.
Nikolai, can you please provide more details about the environment design, and also the ovirt-hosted-engine-setup logs?

Comment 10 Nikolai Sednev 2021-10-21 21:41:51 UTC
DHCP assigns the IP by the MAC address of the engine VM, which is configured manually by IT (IP-to-MAC reservation).
DHCP may assign the engine an IP during deployment that will not necessarily be from the same subnet the host belongs to.
A verification for such a case should exist during deployment.
The verification should be similar to the one we already have for the default gateway and the host's IP address, which have to be from the same subnet.
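
Conceptually, the same kind of prefix test covers the default gateway and the engine address alike, for both IPv4 and IPv6; a rough sketch (not the existing verification code), assuming Python's ipaddress module and placeholder addresses:

import ipaddress

def outside_host_subnet(host_cidr, *addresses):
    """Return those addresses (gateway, engine IP, ...) that are not in the host's prefix."""
    net = ipaddress.ip_interface(host_cidr).network
    return [a for a in addresses if ipaddress.ip_address(a) not in net]

# IPv4: the gateway is in the host's /24, the engine's DHCP lease is not.
print(outside_host_subnet("10.1.1.5/24", "10.1.1.254", "20.7.7.1"))    # ['20.7.7.1']
# IPv6: a statically configured gateway outside the host's /64 (documentation prefix).
print(outside_host_subnet("2001:db8:0:1::10/64", "2001:db8:0:2::1"))   # ['2001:db8:0:2::1']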

Comment 11 Asaf Rachmani 2021-10-25 07:24:57 UTC
Do you have an environment with this issue that I can have a look at?

Comment 12 Nikolai Sednev 2021-10-25 10:24:26 UTC
I have such a case, for example, when using IPv6. We're using static configuration for IPv6, and it's entirely possible to configure a default gateway from a different subnet, while the IPv6 address of the host will be in another subnet. This bug was submitted earlier for such cases with IPv4, but it can be reproduced with IPv6 too. It can be reproduced if the configuration is static or if the DHCP server was misconfigured.

Comment 13 Nikolai Sednev 2021-10-25 10:27:52 UTC
This very bug was opened with reproduction steps, as appears in comment #1.
1. Have a host within subnet "A".
2. Deploy HE over any type of storage on the host, but have DHCP assign the engine an IP address from a different subnet, subnet "B".

The issue here is to prevent the engine's IP address from being from a different subnet; it should only be from the same subnet the host is in.

Comment 14 Nikolai Sednev 2021-10-26 09:27:35 UTC
Do we need to support, during the deployment of HE, the engine's domain not being the same as the domain of the host on which the deployment takes place? I can think of no such real use case.

Comment 15 Nikolai Sednev 2022-02-21 10:17:08 UTC
[ INFO  ] The Engine VM FQDN was resolved into: 'x.y.y.y'.
[WARNING] The Engine VM ('z.z.z.z') and this host (2620:52:0:235c:a236:9fff:fe3a:c4f0/64) will not be in the same IP subnet.
         Static routing configuration are not supported on automatic VM configuration.
         
          OK?  (Yes, No, Abort) [No]: 

ansible-2.9.27-1.el8ae.noarch
python3-ovirt-setup-lib-1.3.3-1.el8ev.noarch
ovirt-hosted-engine-setup-2.6.1-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.10-1.el8ev.noarch
Red Hat Enterprise Linux release 8.6 Beta (Ootpa)
Linux 4.18.0-367.el8.x86_64 #1 SMP Thu Feb 10 14:56:38 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

Comment 16 Sandro Bonazzola 2022-04-20 06:33:59 UTC
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

