Bug 1540463 - New IPv4 validator handles interfaces with no gateway incorrectly
Summary: New IPv4 validator handles interfaces with no gateway incorrectly
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: Network
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ovirt-4.2.2
: ---
Assignee: Ido Rosenzwig
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: 1458709
Reported: 2018-01-31 06:27 UTC by brian
Modified: 2023-09-14 04:16 UTC (History)

Fixed In Version: ovirt-hosted-engine-setup-2.2.11
Clone Of:
Environment:
Last Closed: 2018-03-29 11:11:04 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.2+


Attachments
vdsm_ip_validator_gateway_fix patch (585 bytes, text/plain)
2018-01-31 06:27 UTC, brian


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 87812 0 master MERGED src: packaging: Handle interfaces with no default gateway 2018-09-03 09:35:40 UTC
oVirt gerrit 87824 0 ovirt-hosted-engine-setup-2.2 MERGED src: packaging: Handle interfaces with no default gateway 2018-02-19 08:24:47 UTC

Description brian 2018-01-31 06:27:17 UTC
Created attachment 1388746
vdsm_ip_validator_gateway_fix patch

Description of problem:

Running hosted-engine --deploy for oVirt 4.2 results in failure with "(code=22, message=u'' does not appear to be an IPv4 or IPv6 interface)" when the chosen ovirt management interface does not have a configured gateway.

Version-Release number of selected component (if applicable): 

4.20.9.3-1.el7.centos

How reproducible:

100%

Steps to Reproduce:
1. Configure a second interface on host with no gateway
2. Attempt to run hosted-engine --deploy specifying the second interface as the ovirt management interface

Actual results:

hosted-engine --deploy fails with the following message: 
[ ERROR ] Failed to execute stage 'Misc configuration': Failed to setup networks {'ovirtmgmt': {'nic': 'eth1', 'netmask': u'255.255.255.0', 'ipaddr': u'192.168.0.1', 'gateway': u'', 'defaultRoute': False}}. Error: "Command Host.setupNetworks with args {'bondings': {}, 'options': {'connectivityCheck': False}, 'networks': {'ovirtmgmt': {'nic': 'eth1', 'netmask': u'255.255.255.0', 'ipaddr': u'192.168.0.1', 'gateway': u'', 'defaultRoute': False}}} failed:
         (code=22, message=u'' does not appear to be an IPv4 or IPv6 interface)"

Expected results:

Hosted engine is deployed


Additional info:

I realize this may be an out-of-norm situation, but we've got an out of norm kinda setup. No reason things shouldn't work without a gateway anyway. 

Bug looks to be introduced in commit 90325486e3039c2f4a620a9b1a68dd8f086d3ac2 (https://gerrit.ovirt.org/#/c/79951/) of vdsm, specifically the addition of this check "if 'gateway' in net_attrs:". vdsm getCapabilities seems to always return a 'gateway' key and leaves it as an empty string if no gateway is configured. Temporarily changing the check to "if 'gateway' in net_attrs and net_attrs['gateway']" allowed the engine to deploy. See the attached patch.
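
The difference between the two checks can be sketched as follows. This is an illustration only, not the vdsm code: the function names are hypothetical, and `ipaddress.ip_interface` is used as a stand-in validator because its error text resembles the one in the report; whether vdsm calls it this way is an assumption.

```python
import ipaddress


def validate_gateway_strict(net_attrs):
    # Mirrors the reported check: the key is present, so the empty
    # string is handed to the validator and rejected.
    if 'gateway' in net_attrs:
        ipaddress.ip_interface(net_attrs['gateway'])


def validate_gateway_lenient(net_attrs):
    # Mirrors the temporary workaround: an empty or missing gateway
    # is skipped; a non-empty one is still validated.
    if 'gateway' in net_attrs and net_attrs['gateway']:
        ipaddress.ip_interface(net_attrs['gateway'])


# Attributes modelled on the failing setupNetworks call above.
attrs = {
    'nic': 'eth1',
    'ipaddr': '192.168.0.1',
    'netmask': '255.255.255.0',
    'gateway': '',
    'defaultRoute': False,
}

try:
    validate_gateway_strict(attrs)
    strict_ok = True
except ValueError:
    strict_ok = False

# The lenient variant raises nothing for the same input.
validate_gateway_lenient(attrs)
```

With the strict variant the empty string raises `ValueError`; with the lenient variant the same attributes pass, while a malformed non-empty gateway would still be rejected.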

Comment 1 brian 2018-01-31 07:07:22 UTC
This issue continues in the Setup Host Networks configuration wizard, where the following error is given:

"    must match "^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)"
    Attribute: ipConfiguration.iPv4Addresses[0].gateway "
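
A quick check of the quoted pattern (written with `\.` between octets, as it appears in the engine.log line in comment 9) shows why an empty gateway is rejected. Python's `re` module is used here as a stand-in for the Java validator; the assumption is that both engines treat this pattern alike, and `gateway_matches` is a hypothetical helper.

```python
import re

# Pattern copied from the validation message; only the Python raw-string
# wrapping is new.
GATEWAY_PATTERN = re.compile(
    r'^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}'
    r'(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)'
)


def gateway_matches(value):
    # True when the value starts with a dotted-quad IPv4 address.
    return GATEWAY_PATTERN.match(value) is not None


empty_result = gateway_matches('')
address_result = gateway_matches('172.16.0.1')
```

An empty string cannot satisfy a pattern that requires four groups of digits, so any interface submitted with an empty gateway value fails this validation.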

Comment 2 Dan Kenigsberg 2018-01-31 08:45:45 UTC
I believe that this is a dup of bug 1528906, please retest

Comment 3 brian 2018-01-31 14:32:19 UTC
The second issue, which I thought might be related (comment 1), is a duplicate of bug 1528906; the original issue in vdsm is separate. Thanks

Comment 4 brett.meier 2018-02-03 22:23:29 UTC
I ran into this issue when adding additional VM networks, after having added an interface on a "storage only" network (no gateway; local /24 only). Our NFS service for oVirt/VDSM runs on a dedicated VLAN with no outside routing (10G switching fabric; layer 3 uplinks are GigE). The primary purpose was to contain all NFS traffic on the faster switching fabric.

172.26.36.0/24 - VLAN 243 oVirt/VDSM management network
172.16.0.0/24 - VLAN 1 (isolated switching fabric) NFS service
No IP - VLAN 230 - Infrastructure Services
No IP - VLAN 232 - VPN DMZ
No IP - VLAN 235 - Public Services DMZ

The validation induced a commit failure after having added the first two networks, even without the gateway for 172.16.0.0/24. Restated, it only threw the validation when adding the unnumbered VLAN interfaces, which took a while to track down.

The work-around I've implemented, for now, was to add the VDSM host's 172.16.0.0/24 address as its gateway (this should fail, or at least not be installed as a route on the host). This allows validation to succeed without negatively impacting the routing table on the host.

I'm trying to track down where the validation occurs to fix it. I'm working under the assumption that the validation occurs on the Engine host, rather than the VDSM nodes. I've been using oVirt since version 3.x, but haven't spent a lot of time digging into the orchestration layer.

Comment 5 Dan Kenigsberg 2018-02-05 11:23:43 UTC
brian, please excuse me but I am confused.

Does `hosted-engine --deploy` of ovirt-4.2.2 pass when your management network is added to a NIC without a gateway? If yes, please mark this bug as verified. If not, please reopen this bug and provide as much reproduction info as possible.

If there is a second issue you would like to discuss, such as storage-only networks, please file a separate bug for it.

Comment 6 Nikolai Sednev 2018-02-13 14:55:31 UTC
I don't see how such a deployment is possible: during deployment the customer is asked to provide a pingable default gateway, and the process then discovers that the default gateway does not lie within the same subnet as the management interface and gets into a loop...


         
          --== STORAGE CONFIGURATION ==--
         
         
          --== HOST NETWORK CONFIGURATION ==--
         
          Please indicate a pingable gateway IP address [10.35.163.254]: 
[ INFO  ] TASK [Gathering Facts]
[ INFO  ] ok: [localhost]
[ INFO  ] TASK [Get all active network interfaces]
[ INFO  ] TASK [filter bonds with bad naming]
          Please indicate a nic to set ovirtmgmt bridge on: (enp4s0f0, enp4s0f1) [enp4s0f0]: enp4s0f1
         
          --== VM CONFIGURATION ==--
         
          If you want to deploy with a custom engine appliance image,
          please specify the path to the OVA archive you would like to use
          (leave it empty to skip, the setup will use rhvm-appliance rpm installing it if missing): 
[ INFO  ] Detecting host timezone.
          Please provide the FQDN you would like to use for the engine appliance.
          Note: This will be the FQDN of the engine VM you are now going to launch,
          it should not point to the base host or to any other existing machine.
          Engine VM FQDN: (leave it empty to skip):  []: nsednev-he-4.scl.lab.tlv.redhat.com
          Please provide the domain name you would like to use for the engine appliance.
          Engine VM domain: [scl.lab.tlv.redhat.com]
          Automatically execute engine-setup on the engine appliance on first boot (Yes, No)[Yes]? 
          Automatically restart the engine VM as a monitored service after engine-setup (Yes, No)[Yes]? 
          Enter root password that will be used for the engine appliance: 
          Confirm appliance root password: 
          Enter ssh public key for the root user that will be used for the engine appliance (leave it empty to skip): 
[WARNING] Skipping appliance root ssh public key
          Do you want to enable ssh access for the root user (yes, no, without-password) [yes]: 
          Please specify the number of virtual CPUs for the VM (Defaults to appliance OVF value): [4]: 
          Please specify the memory size of the VM in MB (Defaults to appliance OVF value): [16384]: 
          You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:24:ce:60]: 00:16:3E:7D:DD:DD
          How should the engine VM network be configured (DHCP, Static)[DHCP]? Static
          Please enter the IP address to be used for the engine VM [1.1.1.2]: 
[ ERROR ] The Engine VM (1.1.1.2/24) and the default gateway (10.35.163.254) will not be in the same IP subnet.
         Static routing configuration are not supported on automatic VM configuration.
         
          How should the engine VM network be configured (DHCP, Static)[DHCP]?
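
The subnet comparison behind the error above can be approximated with Python's `ipaddress` module. This is a sketch under stated assumptions, not the hosted-engine-setup code; `same_subnet` is a hypothetical helper.

```python
import ipaddress


def same_subnet(vm_cidr, gateway):
    # Derive the network from the VM address and prefix, then test
    # whether the gateway address falls inside it.
    network = ipaddress.ip_interface(vm_cidr).network
    return ipaddress.ip_address(gateway) in network


# Values taken from the transcript above.
transcript_case = same_subnet('1.1.1.2/24', '10.35.163.254')
```

For the values in the transcript the helper returns False, which corresponds to the situation the setup reports before asking the network question again.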

Comment 7 Edward Haas 2018-02-18 08:54:55 UTC
(In reply to Nikolai Sednev from comment #6)
> I don't see how such deployment might be possible, if during deployment
> customer being asked to provide pingable default gateway and then during
> deployment the process discovers that default gateway does not lay within
> the same subnet with management interface and getting in to the loop...
> 

This is unrelated to what this bug was opened for, please open a separate BZ and assign it to hosted-engine team.
The user is not interested in setting the gateway at all (and he did manage to set it somehow based on the command sent to VDSM).

Comment 8 Edward Haas 2018-02-18 09:05:42 UTC
Several issues exist here:

- hosted-engine setup is sending a command to VDSM with gateway='' , which is not something that VDSM accepts. If there is no gateway to be set, it should not be sent.

- In continuation to the previous point, VDSM can be less pedantic and normalize the input (converting the key with an empty value to no key at all).

- A setup where the host gateway and the Engine VM gateway are not the same, or where there is no gateway at all, is a valid configuration. Different gateways also include having one in the VM and none at the host, and vice versa.
This on its own should be explored in a separate BZ.
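
The normalization mentioned in the first two points (dropping a `gateway` key whose value is empty) could look roughly like the sketch below. It is illustrative only: `drop_empty_gateway` is a hypothetical helper, not a function from vdsm or hosted-engine-setup.

```python
def drop_empty_gateway(networks):
    # Return a copy of the networks mapping in which a 'gateway' key
    # with an empty value is omitted; all other attributes are kept.
    cleaned = {}
    for name, attrs in networks.items():
        cleaned[name] = {
            key: value
            for key, value in attrs.items()
            if not (key == 'gateway' and not value)
        }
    return cleaned


# Arguments modelled on the failing command in the description.
networks = {
    'ovirtmgmt': {
        'nic': 'eth1',
        'netmask': '255.255.255.0',
        'ipaddr': '192.168.0.1',
        'gateway': '',
        'defaultRoute': False,
    }
}
cleaned = drop_empty_gateway(networks)
```

Either side could apply such a step: the caller before sending the command, or the receiver before validating it. The input mapping is left unmodified.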

Comment 9 Gabriel 2018-02-22 09:03:46 UTC
Just adding more info.

I updated all hosts and the hosted engine to 4.2.1 and, strangely, it worked for two hosts but not for the other two. The networks are the same; I just added them as VLANs, without an IP or gateway (I do that because I want my VMs to use some VLANs). Yes, I rebooted all hosts after the updates.

Same as Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1528906

Some details:

As usual, engine.log:

2018-02-22 09:56:24,605+01 WARN  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (default task-52) [74bf6519-4c86-4cd2-b157-ec8a854c16de] Validation of action 'HostSetupNetworks' failed for user admin@internal-authz. Reasons: VAR__ACTION__SETUP,VAR__TYPE__NETWORKS,must match "^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)",$flags [Ljavax.validation.constraints.Pattern$Flag;@3a8f61fe,$regexp ^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d),$groups [Ljava.lang.Class;@487c0f2b,$message {javax.validation.constraints.Pattern.message},$payload [Ljava.lang.Class;@47f3434,ACTION_TYPE_FAILED_ATTRIBUTE_PATH,$path ipConfiguration.iPv4Addresses[0].gateway,$validatedValue

Packages on Hosted-Engine (grep for ovirt):

ovirt-engine-setup-base-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-vm-infra-1.1.4-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.1.2-1.el7.centos.noarch
ovirt-engine-setup-4.2.1.7-1.el7.centos.noarch
ovirt-engine-restapi-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-image-template-1.1.5-1.el7.centos.noarch
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-imageio-proxy-1.2.1-0.el7.centos.noarch
ovirt-engine-extensions-api-impl-4.2.1.7-1.el7.centos.noarch
ovirt-engine-webadmin-portal-4.2.1.7-1.el7.centos.noarch
ovirt-engine-wildfly-11.0.0-1.el7.centos.x86_64
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-ansible-infra-1.1.3-1.el7.centos.noarch
ovirt-imageio-proxy-setup-1.2.1-0.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.2.1.7-1.el7.centos.noarch
ovirt-engine-metrics-1.1.2.2-1.el7.centos.noarch
ovirt-engine-tools-backup-4.2.1.7-1.el7.centos.noarch
ovirt-engine-dwh-4.2.1.2-1.el7.centos.noarch
ovirt-ansible-engine-setup-1.1.0-1.el7.centos.noarch
ovirt-ansible-disaster-recovery-0.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.2.1.7-1.el7.centos.noarch
ovirt-provider-ovn-1.2.5-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-roles-1.1.3-1.el7.centos.noarch
ovirt-web-ui-1.3.5-1.el7.centos.noarch
ovirt-engine-dbscripts-4.2.1.7-1.el7.centos.noarch
ovirt-engine-4.2.1.7-1.el7.centos.noarch
ovirt-iso-uploader-4.1.0-1.el7.centos.noarch
ovirt-ansible-cluster-upgrade-1.1.4-1.el7.centos.noarch
ovirt-engine-websocket-proxy-4.2.1.7-1.el7.centos.noarch
ovirt-engine-tools-4.2.1.7-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-11.0.1-1.el7.centos.noarch
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-engine-extension-aaa-jdbc-1.1.6-1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
ovirt-engine-lib-4.2.1.7-1.el7.centos.noarch
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
ovirt-ansible-manageiq-1.1.5-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.2.1.7-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.2.1.7-1.el7.centos.noarch
ovirt-host-deploy-java-1.7.2-1.el7.centos.noarch
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-engine-backend-4.2.1.7-1.el7.centos.noarch
ovirt-js-dependencies-1.2.0-3.1.el7.centos.noarch
ovirt-engine-api-explorer-0.0.2-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.4-1.el7.centos.noarch
ovirt-guest-tools-iso-4.2-1.el7.centos.noarch
ovirt-ansible-repositories-1.1.0-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.14-1.el7.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-engine-dashboard-1.2.1-1.el7.centos.noarch


Packages on Host with Problems:

ovirt-hosted-engine-ha-2.2.4-1.el7.centos.noarch
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7.centos.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-engine-extensions-api-impl-4.2.1.7-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.centos.noarch
ovirt-host-dependencies-4.2.1-1.el7.centos.x86_64
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch
ovirt-host-4.2.1-1.el7.centos.x86_64
ovirt-engine-appliance-4.2-20180214.1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-imageio-daemon-1.2.1-0.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.5-1.el7.centos.noarch


Packages on Host without problems:

ovirt-host-dependencies-4.2.1-1.el7.centos.x86_64
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7.centos.noarch
ovirt-hosted-engine-ha-2.2.4-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.5-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-imageio-daemon-1.2.1-0.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.noarch
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-host-4.2.1-1.el7.centos.x86_64
cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch

If you need any more info, I will send it right away.

Thanks!

Comment 10 Nikolai Sednev 2018-02-22 09:13:16 UTC
Hello Gabriel,
Can you please supply the kernel versions of the hosts that worked fine vs. the hosts with the problem?
Which OS exactly was installed on the hosts: RHVH or RHEL?

Comment 11 Gabriel 2018-02-22 09:15:53 UTC
All are CentOS 7.4 running oVirt.

# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 

Kernel Versions(same):

Fine: 3.10.0-693.17.1.el7.x86_64
Problem: 3.10.0-693.17.1.el7.x86_64

Comment 12 Ido Rosenzwig 2018-02-22 09:58:32 UTC
Edward, can you please take a look at the error message on https://bugzilla.redhat.com/show_bug.cgi?id=1540463#c9

Comment 13 Edward Haas 2018-02-22 11:03:23 UTC
I do not understand whether this is related to the original problem or is a new issue.
If it is a different error, please open a separate BZ.
Make sure all logs from both Engine and VDSM are provided.

Comment 14 Gabriel 2018-02-22 11:05:03 UTC
Look at this: https://bugzilla.redhat.com/show_bug.cgi?id=1540463#c1

It's the same error.

Comment 15 Nikolai Sednev 2018-02-22 11:53:32 UTC
Hello Brian,
Would you please describe your environment in more detail?

I understand from your description that you want to deploy an SHE environment within a single subnet that has no gateway.
Let's say all your hosts should be running within 1.1.1.0/24.
Network 1.1.1.0/24 is not routable.

On the other hand, you say that the first interface exists; is it routable, and does it have a default gateway?

Please provide more data about your setup.

Comment 16 Nikolai Sednev 2018-02-22 17:44:42 UTC
Further to our discussion with Ido, the reproduction for the fix in this bug was as follows:
1. Configured the first interface, enp3s0f0, as static with 1.1.1.1/24 and default gateway 1.1.1.254.
2. Configured enp3s0f1 as the second interface with static IPv4 1.1.1.2/24 and no gateway.
3. During deployment, used enp3s0f1 as the bridged interface.
4. Used a local disk as NFS storage for the deployment.

I was not able to reproduce the original error and hence moved this bug to verified.
VDSM now properly consumes the empty default gateway value from enp3s0f1.

Components on host:
rhvm-appliance-4.2-20180202.0.el7.noarch
ovirt-hosted-engine-ha-2.2.6-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.11-1.el7ev.noarch
Linux alma03.qa.lab.tlv.redhat.com 3.10.0-855.el7.x86_64 #1 SMP Tue Feb 20 06:46:45 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)

Comment 17 Sandro Bonazzola 2018-03-29 11:11:04 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 18 Red Hat Bugzilla 2023-09-14 04:16:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

