Bug 1806346

Summary: Engine sends setupNetworks for ovirtmgmt without any IP during installation of an IPv6-only host.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Germano Veit Michel <gveitmic>
Component: ovirt-engine
Assignee: Nobody <nobody>
Status: CLOSED DUPLICATE
QA Contact: Michael Burman <mburman>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.3.8
CC: dholler, Rhev-m-bugs
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-02-25 22:14:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Germano Veit Michel 2020-02-24 00:19:54 UTC
Description of problem:

Adding an IPv6-only host to the engine fails. At the end of the installation, the engine sends a setupNetworks() with neither an IPv4 nor an IPv6 address to create the ovirtmgmt bridge, leaving the host without any IP.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.8.2-0.4.el7.noarch
vdsm-4.30.40-1.el7ev.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL 7.7 host, configure IPv6 only on eth0. Other interfaces (eth1 and eth2) have no config.

PROXY_METHOD=none
BROWSER_ONLY=no
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ethernet-eth0
UUID=f13e2615-16f4-47f5-a71b-d320dbe282b6
DEVICE=eth0
ONBOOT=yes
IPV6ADDR=2::2/64
IPV6_DEFAULTGW=2::128

2. Add the host to RHV. After ansible host-deploy finishes, the host goes Non-Operational due to missing networks, and the engine sends the following to create the ovirtmgmt bridge:

2020-02-24 09:49:34,586+10 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1219) [7d6ab528] EVENT_ID: VDS_ANSIBLE_INSTALL_FINISHED(561), Ansible host-deploy playbook execution has successfully finished on host host2.kvm.

2020-02-24 09:49:34,689+10 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to host2.kvm/2:0:0:0:0:0:0:2

2020-02-24 09:49:34,839+10 INFO  [org.ovirt.engine.core.bll.host.HostConnectivityChecker] (EE-ManagedThreadFactory-engine-Thread-1219) [7d6ab528] Engine managed to communicate with VDSM agent on host 'host2.kvm' with address 'host2.kvm' ('75e966fa-f19c-4e0e-981e-683eebaebda4')

2020-02-24 09:49:38,081+10 ERROR [org.ovirt.engine.core.bll.SetNonOperationalVdsCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [35f30b49] Host 'host2.kvm' is set to Non-Operational, it is missing the following networks: 'ovirtmgmt,san-1,san-2'

2020-02-24 09:49:38,243+10 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1219) [35e50694] START, HostSetupNetworksVDSCommand(
  HostName = host2.kvm, 
  HostSetupNetworksVdsCommandParameters:{
    hostId='75e966fa-f19c-4e0e-981e-683eebaebda4', 
    vds='Host[host2.kvm,75e966fa-f19c-4e0e-981e-683eebaebda4]', 
    rollbackOnFailure='true', 
    commitOnSuccess='false', 
    connectivityTimeout='120', 
    networks='[
      HostNetwork:{
        defaultRoute='true', 
        bonding='false', 
        networkName='ovirtmgmt', 
        vdsmName='ovirtmgmt', 
        nicName='eth0', 
        vlan='null', 
        vmNetwork='true', 
        stp='false', 
        properties='null', 
        ipv4BootProtocol='NONE', 
        ipv4Address='null', 
        ipv4Netmask='null', 
        ipv4Gateway='null', 
        ipv6BootProtocol='AUTOCONF', 
        ipv6Address='null',             <--- ??
        ipv6Prefix='null',              <--- ??
        ipv6Gateway='null', 
        nameServers='null'}
    ]', 
    removedNetworks='[]', 
    bonds='[]', 
    removedBonds='[]', 
    clusterSwitchType='LEGACY', 
    managementNetworkChanged='true'}
), log id: 24d7b4a1

Which results in this:

MainProcess|jsonrpc/7::DEBUG::2020-02-24 09:49:38,960::ifcfg::578::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt configuration:
# Generated by VDSM version 4.30.40.1
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes

MainProcess|jsonrpc/7::DEBUG::2020-02-24 09:49:38,980::ifcfg::578::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-eth0 configuration:
# Generated by VDSM version 4.30.40.1
DEVICE=eth0
BRIDGE=ovirtmgmt
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
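
The ifcfg files above follow directly from the boot protocols in the HostSetupNetworksVDSCommand: with ipv4BootProtocol=NONE no IPv4 keys are written at all, and with ipv6BootProtocol=AUTOCONF only IPV6INIT/IPV6_AUTOCONF are emitted, so the static 2::2/64 address never makes it onto the bridge. A simplified sketch of that mapping, for illustration only (this is not VDSM's actual ifcfg writer, and the protocol names are taken loosely from the engine log above):

    def ifcfg_ip_lines(ipv4_boot, ipv6_boot, ipv6_addr=None, ipv6_gw=None):
        """Return the IP-related ifcfg lines for the requested boot protocols (simplified)."""
        lines = []
        if ipv4_boot == 'DHCP':
            lines.append('BOOTPROTO=dhcp')
        # ipv4_boot == 'NONE': no IPv4 keys are written at all.
        if ipv6_boot == 'AUTOCONF':
            lines += ['IPV6INIT=yes', 'IPV6_AUTOCONF=yes']
        elif ipv6_boot == 'STATIC' and ipv6_addr:
            lines += ['IPV6INIT=yes', 'IPV6_AUTOCONF=no', 'IPV6ADDR=%s' % ipv6_addr]
            if ipv6_gw:
                lines.append('IPV6_DEFAULTGW=%s' % ipv6_gw)
        return lines

    # What the engine requested here (NONE + AUTOCONF, no address):
    print(ifcfg_ip_lines('NONE', 'AUTOCONF'))
    # -> ['IPV6INIT=yes', 'IPV6_AUTOCONF=yes']

    # What this IPv6-only host actually needed (static address carried over):
    print(ifcfg_ip_lines('NONE', 'STATIC', '2::2/64', '2::128'))
    # -> ['IPV6INIT=yes', 'IPV6_AUTOCONF=no', 'IPV6ADDR=2::2/64', 'IPV6_DEFAULTGW=2::128']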

3. The connectivity check of course fails, and VDSM rolls back the configuration:

MainProcess|jsonrpc/7::INFO::2020-02-24 09:52:40,825::connectivity::48::root::(check) Connectivity check failed, rolling back

4. Even though ifcfg-eth0 was restored, the rollback did not bring the interface back up, so the host is left completely without network (a sketch of the missing step follows the listing below):

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:19:c1:02 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:86:31:ae brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:5f:79:52 brd ff:ff:ff:ff:ff:ff
5: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3e:d0:db:db:a5:85 brd ff:ff:ff:ff:ff:ff
6: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 26:86:e6:f0:44:47 brd ff:ff:ff:ff:ff:ff
22: ;vdsmdummy;: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 96:9a:d7:92:e8:53 brd ff:ff:ff:ff:ff:ff
24: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
    link/ether f2:94:10:58:80:ac brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f094:10ff:fe58:80ac/64 scope link 
       valid_lft forever preferred_lft forever
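
For completeness, a minimal sketch of the step the rollback appears to be missing here: after the ifcfg files are restored, the restored device still has to be re-activated for its static IPv6 address to come back. This is illustrative only, not VDSM rollback code; it simply shells out to initscripts' ifup, which re-reads ifcfg-eth0:

    import subprocess

    def reactivate(devices):
        """Bring restored devices back up so their ifcfg addresses are reapplied."""
        for dev in devices:
            # ifup re-reads /etc/sysconfig/network-scripts/ifcfg-<dev>
            subprocess.check_call(['ifup', dev])

    # In this reproduction, the equivalent of the following was still needed
    # manually after the rollback:
    #   reactivate(['eth0'])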

Comment 2 Germano Veit Michel 2020-02-24 00:25:10 UTC
The network_caps returned to the engine prior to the setupNetworks contained the info about IPv6 on eth0:

MainProcess|jsonrpc/4::DEBUG::2020-02-24 09:49:36,342::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper) return network_caps with {'bridges': {}, 'bondings': {}, 'nameservers': [], 'nics': {'eth2': {'ipv6autoconf': True, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:5f:79:52', 'ipv6gateway': '::', 'gateway': ''}, 'eth1': {'ipv6autoconf': True, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': [], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:86:31:ae', 'ipv6gateway': '::', 'gateway': ''}, 'eth0': {'ipv6autoconf': True, 'addr': '', 'speed': 0, 'dhcpv6': False, 'ipv6addrs': ['2::2/64'], 'mtu': '1500', 'dhcpv4': False, 'netmask': '', 'ipv4defaultroute': False, 'ipv4addrs': [], 'hwaddr': '52:54:00:19:c1:02', 'ipv6gateway': '2::128', 'gateway': ''}}, 'supportsIPv6': True, 'vlans': {}, 'networks': {}}
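
For reference, that payload already contains everything needed to build a static IPv6 configuration for ovirtmgmt. A minimal sketch of deriving it (illustrative only, not engine code; 'caps' below is a trimmed copy of the eth0 entry from the log above):

    caps = {
        'nics': {
            'eth0': {
                'ipv6addrs': ['2::2/64'],
                'ipv6gateway': '2::128',
                'ipv6autoconf': True,
                'dhcpv6': False,
            },
        },
    }

    def static_ipv6_from_caps(caps, nic):
        """Return (address, prefix, gateway) for the NIC, or None if it has no IPv6 address."""
        info = caps['nics'][nic]
        if not info['ipv6addrs']:
            return None
        address, prefix = info['ipv6addrs'][0].split('/')
        return address, int(prefix), info['ipv6gateway']

    print(static_ipv6_from_caps(caps, 'eth0'))
    # -> ('2::2', 64, '2::128') -- the values the setupNetworks call above left as null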

Comment 5 Germano Veit Michel 2020-02-24 00:50:13 UTC
Installing the host on a new empty cluster apparently makes the problem go away.

My impression is that this may be caused by another host in the same cluster having IPv4 on ovirtmgmt, which also matches the customer's setup.
Is it possible?

Comment 6 Dominik Holler 2020-02-25 08:25:48 UTC
Is this a duplicate of bug 1680970 ?

Comment 7 Germano Veit Michel 2020-02-25 22:14:59 UTC
(In reply to Dominik Holler from comment #6)
> Is this a duplicate of bug 1680970 ?

Yes it is. Sorry for the dup.

*** This bug has been marked as a duplicate of bug 1680970 ***