Created attachment 1388746 [details]
vdsm_ip_validator_gateway_fix patch

Description of problem:
Running hosted-engine --deploy for oVirt 4.2 fails with "(code=22, message=u'' does not appear to be an IPv4 or IPv6 interface)" when the chosen ovirt management interface does not have a configured gateway.

Version-Release number of selected component (if applicable):
4.20.9.3-1.el7.centos

How reproducible:
100%

Steps to Reproduce:
1. Configure a second interface on the host with no gateway
2. Attempt to run hosted-engine --deploy, specifying the second interface as the ovirt management interface

Actual results:
hosted-engine --deploy fails with the following message:

[ ERROR ] Failed to execute stage 'Misc configuration': Failed to setup networks {'ovirtmgmt': {'nic': 'eth1', 'netmask': u'255.255.255.0', 'ipaddr': u'192.168.0.1', 'gateway': u'', 'defaultRoute': False}}. Error: "Command Host.setupNetworks with args {'bondings': {}, 'options': {'connectivityCheck': False}, 'networks': {'ovirtmgmt': {'nic': 'eth1', 'netmask': u'255.255.255.0', 'ipaddr': u'192.168.0.1', 'gateway': u'', 'defaultRoute': False}}} failed: (code=22, message=u'' does not appear to be an IPv4 or IPv6 interface)"

Expected results:
The hosted engine is deployed.

Additional info:
I realize this may be an out-of-norm situation, but we've got an out-of-norm kind of setup. There is no reason things shouldn't work without a gateway anyway.

The bug looks to have been introduced in vdsm commit 90325486e3039c2f4a620a9b1a68dd8f086d3ac2 (https://gerrit.ovirt.org/#/c/79951/), specifically the addition of the check "if 'gateway' in net_attrs:". vdsm getCapabilities seems to always return a 'gateway' key and leaves it as an empty string if no gateway is configured. Temporarily changing the check to "if 'gateway' in net_attrs and net_attrs['gateway']" allowed the engine to deploy. See the attached patch.
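For illustration, here is a minimal standalone sketch of the behavioral difference the attached patch makes (this is not vdsm's actual module layout; the helper names are hypothetical). Because getCapabilities reports 'gateway' as an empty string when no gateway is configured, a presence-only check ends up validating '' as an address:

```python
# Hypothetical helpers sketching the check described above; not vdsm code.

def should_validate_gateway_broken(net_attrs):
    # Presence-only check (as introduced in commit 90325486): the empty
    # string vdsm itself reports still triggers gateway validation.
    return 'gateway' in net_attrs

def should_validate_gateway_fixed(net_attrs):
    # The attached patch: also require a non-empty gateway value.
    return 'gateway' in net_attrs and net_attrs['gateway']

attrs = {'nic': 'eth1', 'ipaddr': '192.168.0.1',
         'netmask': '255.255.255.0', 'gateway': ''}
print(should_validate_gateway_broken(attrs))       # True: '' gets validated and fails
print(bool(should_validate_gateway_fixed(attrs)))  # False: validation is skipped
```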
This issue continues in the Setup Host Networks configuration wizard, where the following error is given:

" must match "^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)" Attribute: ipConfiguration.iPv4Addresses[0].gateway "
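As a sketch of why the wizard rejects the value: the dotted-quad pattern quoted in the error message does not match the empty string, so gateway='' fails validation rather than being treated as "no gateway". This can be reproduced with Python's re module (the constant names below are illustrative, not from the engine source):

```python
import re

# Dotted-quad pattern taken from the validation error message above.
OCTET = r'(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)'
IPV4_PATTERN = r'^\b(' + OCTET + r'\.){3}' + OCTET

print(bool(re.match(IPV4_PATTERN, '192.168.0.254')))  # True: a real gateway matches
print(bool(re.match(IPV4_PATTERN, '')))               # False: '' fails the pattern
```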
I believe this is a duplicate of bug 1528906; please retest.
The second issue I thought might be related, in comment 1, is a duplicate of bug 1528906; the original issue in vdsm is separate. Thanks.
I ran into this issue when adding additional VM networks, after having added an interface on a "storage only" network (no gateway; local /24 only). Our NFS service for oVirt/VDSM runs on a dedicated VLAN with no outside routing (10G switching fabric; layer 3 uplinks are GigE). The primary purpose was to contain all NFS traffic on the faster switching fabric.

172.26.36.0/24 - VLAN 243 - oVirt/VDSM management network
172.16.0.0/24 - VLAN 1 (isolated switching fabric) - NFS service
No IP - VLAN 230 - Infrastructure Services
No IP - VLAN 232 - VPN DMZ
No IP - VLAN 235 - Public Services DMZ

The validation induced a commit failure after the first two networks had been added, even without a gateway for 172.16.0.0/24. Restated, it only threw the validation error when adding the unnumbered VLAN interfaces, which took a while to track down.

The work-around I've implemented for now was to add the VDSM host's own 172.16.0.0/24 address as its gateway (this should fail, or at least not be installed as a route on the host). This allows validation to succeed without negatively impacting the routing table on the host.

I'm trying to track down where the validation occurs in order to fix it. I'm working under the assumption that the validation occurs on the Engine host rather than on the VDSM nodes. I've been using oVirt since version 3.x, but haven't spent a lot of time digging into the orchestration layer.
Brian, please excuse me, but I am confused. Does `hosted-engine --deploy` of ovirt-4.2.2 pass when your management network is added to a NIC without a gateway?

If yes, please mark this bug as verified. If not, please reopen this bug and provide as much reproduction info as possible.

If there is a second issue you would like to discuss, such as storage-only networks, please file a separate bug for it.
I don't see how such a deployment could be possible: during deployment the customer is asked to provide a pingable default gateway, and then the process discovers that the default gateway does not lie within the same subnet as the management interface and gets into a loop...

--== STORAGE CONFIGURATION ==--

--== HOST NETWORK CONFIGURATION ==--

Please indicate a pingable gateway IP address [10.35.163.254]:
[ INFO ] TASK [Gathering Facts]
[ INFO ] ok: [localhost]
[ INFO ] TASK [Get all active network interfaces]
[ INFO ] TASK [filter bonds with bad naming]
Please indicate a nic to set ovirtmgmt bridge on: (enp4s0f0, enp4s0f1) [enp4s0f0]: enp4s0f1

--== VM CONFIGURATION ==--

If you want to deploy with a custom engine appliance image, please specify the path to the OVA archive you would like to use (leave it empty to skip, the setup will use rhvm-appliance rpm installing it if missing):
[ INFO ] Detecting host timezone.
Please provide the FQDN you would like to use for the engine appliance.
Note: This will be the FQDN of the engine VM you are now going to launch, it should not point to the base host or to any other existing machine.
Engine VM FQDN: (leave it empty to skip): []: nsednev-he-4.scl.lab.tlv.redhat.com
Please provide the domain name you would like to use for the engine appliance.
Engine VM domain: [scl.lab.tlv.redhat.com]
Automatically execute engine-setup on the engine appliance on first boot (Yes, No)[Yes]?
Automatically restart the engine VM as a monitored service after engine-setup (Yes, No)[Yes]?
Enter root password that will be used for the engine appliance:
Confirm appliance root password:
Enter ssh public key for the root user that will be used for the engine appliance (leave it empty to skip):
[WARNING] Skipping appliance root ssh public key
Do you want to enable ssh access for the root user (yes, no, without-password) [yes]:
Please specify the number of virtual CPUs for the VM (Defaults to appliance OVF value): [4]:
Please specify the memory size of the VM in MB (Defaults to appliance OVF value): [16384]:
You may specify a unicast MAC address for the VM or accept a randomly generated default [00:16:3e:24:ce:60]: 00:16:3E:7D:DD:DD
How should the engine VM network be configured (DHCP, Static)[DHCP]? Static
Please enter the IP address to be used for the engine VM [1.1.1.2]:
[ ERROR ] The Engine VM (1.1.1.2/24) and the default gateway (10.35.163.254) will not be in the same IP subnet. Static routing configuration are not supported on automatic VM configuration.
How should the engine VM network be configured (DHCP, Static)[DHCP]?
(In reply to Nikolai Sednev from comment #6)
> I don't see how such deployment might be possible, if during deployment
> customer being asked to provide pingable default gateway and then during
> deployment the process discovers that default gateway does not lay within
> the same subnet with management interface and getting in to the loop...

This is unrelated to what this bug was opened for; please open a separate BZ and assign it to the hosted-engine team. The user is not interested in setting the gateway at all (and he did manage to set it somehow, based on the command sent to VDSM).
Several issues exist here:

- hosted-engine setup is sending a command to VDSM with gateway='', which is not something VDSM accepts. If there is no gateway to be set, the key should not be sent.
- In continuation of the previous point, VDSM could be less pedantic and normalize the input (converting a key with an empty value to no key at all).
- A setup where the host gateway and the Engine VM gateway are not the same, or where there is no gateway at all, is a valid configuration. Different gateways also include having one in the VM and none at the host, and vice versa. This on its own should be explored through a separate BZ.
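The normalization suggested in the second point could be sketched as follows (a hypothetical helper, not vdsm's actual code): drop keys whose value is an empty string before validating, so a client that sends gateway='' is treated the same as one that omits the key entirely.

```python
# Hypothetical normalization sketch; the function name is illustrative.

def normalize_net_attrs(net_attrs):
    """Return a copy of the setupNetworks attrs without empty-string values."""
    return {key: value for key, value in net_attrs.items() if value != ''}

attrs = {'nic': 'eth1', 'ipaddr': '192.168.0.1',
         'netmask': '255.255.255.0', 'gateway': '',
         'defaultRoute': False}
# 'gateway' is dropped; non-string values such as defaultRoute are kept.
print(normalize_net_attrs(attrs))
```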
Just adding more info. I updated all hosts and the hosted engine to 4.2.1 and, really strangely, it worked for 2 hosts but not for another 2. Same networks, just added as VLANs, without IP or gateway (I do that because I want my VMs to use some VLANs). Yes, I rebooted all hosts after the updates.

Same as bug https://bugzilla.redhat.com/show_bug.cgi?id=1528906

Some details. As usual, engine.log:

2018-02-22 09:56:24,605+01 WARN [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (default task-52) [74bf6519-4c86-4cd2-b157-ec8a854c16de] Validation of action 'HostSetupNetworks' failed for user admin@internal-authz. Reasons: VAR__ACTION__SETUP,VAR__TYPE__NETWORKS,must match "^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)",$flags [Ljavax.validation.constraints.Pattern$Flag;@3a8f61fe,$regexp ^\b((25[0-5]|2[0-4]\d|[01]\d\d|\d?\d)\.){3}(25[0-5]|2[0-4]\d|[01]\d\d|\d?\d),$groups [Ljava.lang.Class;@487c0f2b,$message {javax.validation.constraints.Pattern.message},$payload [Ljava.lang.Class;@47f3434,ACTION_TYPE_FAILED_ATTRIBUTE_PATH,$path ipConfiguration.iPv4Addresses[0].gateway,$validatedValue

Packages on Hosted-Engine (grep for ovirt):
ovirt-engine-setup-base-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-vm-infra-1.1.4-1.el7.centos.noarch
ovirt-engine-dwh-setup-4.2.1.2-1.el7.centos.noarch
ovirt-engine-setup-4.2.1.7-1.el7.centos.noarch
ovirt-engine-restapi-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-image-template-1.1.5-1.el7.centos.noarch
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-imageio-proxy-1.2.1-0.el7.centos.noarch
ovirt-engine-extensions-api-impl-4.2.1.7-1.el7.centos.noarch
ovirt-engine-webadmin-portal-4.2.1.7-1.el7.centos.noarch
ovirt-engine-wildfly-11.0.0-1.el7.centos.x86_64
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-ansible-infra-1.1.3-1.el7.centos.noarch
ovirt-imageio-proxy-setup-1.2.1-0.el7.centos.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.2.1.7-1.el7.centos.noarch
ovirt-engine-metrics-1.1.2.2-1.el7.centos.noarch
ovirt-engine-tools-backup-4.2.1.7-1.el7.centos.noarch
ovirt-engine-dwh-4.2.1.2-1.el7.centos.noarch
ovirt-ansible-engine-setup-1.1.0-1.el7.centos.noarch
ovirt-ansible-disaster-recovery-0.1-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-4.2.1.7-1.el7.centos.noarch
ovirt-provider-ovn-1.2.5-1.el7.centos.noarch
ovirt-engine-vmconsole-proxy-helper-4.2.1.7-1.el7.centos.noarch
ovirt-ansible-roles-1.1.3-1.el7.centos.noarch
ovirt-web-ui-1.3.5-1.el7.centos.noarch
ovirt-engine-dbscripts-4.2.1.7-1.el7.centos.noarch
ovirt-engine-4.2.1.7-1.el7.centos.noarch
ovirt-iso-uploader-4.1.0-1.el7.centos.noarch
ovirt-ansible-cluster-upgrade-1.1.4-1.el7.centos.noarch
ovirt-engine-websocket-proxy-4.2.1.7-1.el7.centos.noarch
ovirt-engine-tools-4.2.1.7-1.el7.centos.noarch
ovirt-engine-wildfly-overlay-11.0.1-1.el7.centos.noarch
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-engine-extension-aaa-jdbc-1.1.6-1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-engine-cli-3.6.9.2-1.el7.centos.noarch
ovirt-engine-lib-4.2.1.7-1.el7.centos.noarch
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
ovirt-ansible-manageiq-1.1.5-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.2.1.7-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.2.1.7-1.el7.centos.noarch
ovirt-host-deploy-java-1.7.2-1.el7.centos.noarch
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-engine-backend-4.2.1.7-1.el7.centos.noarch
ovirt-js-dependencies-1.2.0-3.1.el7.centos.noarch
ovirt-engine-api-explorer-0.0.2-1.el7.centos.noarch
ovirt-vmconsole-proxy-1.0.4-1.el7.centos.noarch
ovirt-guest-tools-iso-4.2-1.el7.centos.noarch
ovirt-ansible-repositories-1.1.0-1.el7.centos.noarch
ovirt-guest-agent-common-1.0.14-1.el7.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-engine-dashboard-1.2.1-1.el7.centos.noarch

Packages on Host with problems:
ovirt-hosted-engine-ha-2.2.4-1.el7.centos.noarch
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.centos.noarch
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7.centos.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-engine-extensions-api-impl-4.2.1.7-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.centos.noarch
ovirt-host-dependencies-4.2.1-1.el7.centos.x86_64
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch
ovirt-host-4.2.1-1.el7.centos.x86_64
ovirt-engine-appliance-4.2-20180214.1.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.centos.noarch
ovirt-imageio-daemon-1.2.1-0.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.5-1.el7.centos.noarch

Packages on Host without problems:
ovirt-host-dependencies-4.2.1-1.el7.centos.x86_64
ovirt-imageio-common-1.2.1-0.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.9-1.el7.centos.noarch
ovirt-hosted-engine-ha-2.2.4-1.el7.centos.noarch
ovirt-vmconsole-host-1.0.4-1.el7.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-host-deploy-1.7.2-1.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.5-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
python-ovirt-engine-sdk4-4.2.4-2.el7.centos.x86_64
ovirt-release42-4.2.1.1-1.el7.centos.noarch
ovirt-imageio-daemon-1.2.1-0.el7.centos.noarch
ovirt-vmconsole-1.0.4-1.el7.noarch
ovirt-cockpit-sso-0.0.4-1.el7.noarch
ovirt-host-4.2.1-1.el7.centos.x86_64
cockpit-ovirt-dashboard-0.11.11-0.1.el7.centos.noarch

If you need any more info, I will send it right away. Thanks!
Hello Gabriel,

Can you please supply the kernel versions of the hosts that worked fine vs. the hosts with the problem? What OS exactly was installed on the hosts? Were they RHVHs or RHELs?
All are CentOS 7.4 running oVirt.

# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)

Kernel versions (same):
Fine: 3.10.0-693.17.1.el7.x86_64
Problem: 3.10.0-693.17.1.el7.x86_64
Edward, can you please take a look at the error message in https://bugzilla.redhat.com/show_bug.cgi?id=1540463#c9
I do not understand whether this is related to the original problem or is a new issue. If it is a different error, please open a separate BZ and make sure all logs from both the Engine and VDSM are provided.
Look at this: https://bugzilla.redhat.com/show_bug.cgi?id=1540463#c1 It's the same error.
Hello Brian,

Would you please explain your environment in more detail? I understand from your description that you want to deploy an SHE environment within a subnet that has no gateway. Let's say all your hosts run within 1.1.1.0/24, and network 1.1.1.0/24 is not routable. On the other hand, you say that a first interface exists, but is it routable and does it have a default gateway? Please provide more data about your setup.
Further to our discussion with Ido, reproduction for the fix in this bug was as follows:
1. Configured the first interface, enp3s0f0, as static with 1.1.1.1/24 and default gateway 1.1.1.254.
2. Configured enp3s0f1 as a second interface with static IPv4 1.1.1.2/24 and no gateway.
3. During deployment used enp3s0f1 as the bridged interface.
4. Used a local disk as NFS storage for the deployment.

I was not able to reproduce the original error and hence moved this bug to verified. VDSM now properly consumes an empty default gateway value from enp3s0f1.

Components on host:
rhvm-appliance-4.2-20180202.0.el7.noarch
ovirt-hosted-engine-ha-2.2.6-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.11-1.el7ev.noarch
Linux alma03.qa.lab.tlv.redhat.com 3.10.0-855.el7.x86_64 #1 SMP Tue Feb 20 06:46:45 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days