Description of problem:
Cluster member loses network connection after reboot. The network is set up using bonding and VLANs.

Version-Release number of selected component (if applicable):
ovirt-node-ng-installer-master-2017120709.iso
4.2.1-0.0.master.20171206161426.git88e9120.el7.centos
Might also apply to the 4.2.0 branch.

How reproducible:

Steps to Reproduce:
1. Set up oVirt Node with bonding and one VLAN on top for management on three nodes
2. Set up a self-hosted engine with Gluster
3. Complete the cluster setup by adding hosts and storage domains
4. Put one of the hosts into maintenance mode
5. Reboot this host

Actual results:
The reboot completes and the host is not accessible via the network.

Expected results:
The reboot completes and the host is available again.

Additional info:
Node network: em3 + em4 => bond0 => bond0.78 => ovirtmgmt
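For reference, an initscripts-style configuration matching that topology typically looks like the following. This is an illustrative sketch only: the device names come from this report, but the bonding options and the exact file contents on the affected node are assumptions, not taken from the attached logs.

```ini
# /etc/sysconfig/network-scripts/ifcfg-em3  (em4 is analogous)
DEVICE=em3
MASTER=bond0
SLAVE=yes
ONBOOT=yes
NM_CONTROLLED=no

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS='mode=1 miimon=100'   # illustrative; the actual mode is not stated in the report
ONBOOT=yes
NM_CONTROLLED=no

# /etc/sysconfig/network-scripts/ifcfg-bond0.78
DEVICE=bond0.78
VLAN=yes
BRIDGE=ovirtmgmt
ONBOOT=yes
NM_CONTROLLED=no
```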
Please attach supervdsm and vdsm logs so we can look at the problem.
Created attachment 1364959 [details] Logfiles
It seems that we are not correctly acquiring an external bond when it has a VLAN on top of it. I am not exactly sure why this causes a disconnection in the presented scenario, but fixing the acquisition should result in a more predictable setup after reboot.
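One way to picture the intended acquisition rule (this is an illustrative sketch, not VDSM's actual code): when taking over externally configured devices, a bond should also be considered in use, and therefore acquired, whenever any VLAN device sits on top of it. The hypothetical helper below derives that set from a simplified device list:

```python
def bonds_with_vlans(devices):
    """Return the names of bonds that have at least one VLAN on top.

    `devices` is a list of dicts such as
    {"name": "bond0.78", "type": "vlan", "base": "bond0"} --
    a simplified stand-in for real kernel device information,
    not VDSM's internal data model.
    """
    bonds = {d["name"] for d in devices if d["type"] == "bond"}
    # A bond carrying a VLAN must be acquired together with the VLAN,
    # otherwise it may be left unmanaged (and down) after reboot.
    return {d["base"] for d in devices
            if d["type"] == "vlan" and d.get("base") in bonds}


devices = [
    {"name": "em3", "type": "nic"},
    {"name": "em4", "type": "nic"},
    {"name": "bond0", "type": "bond"},
    {"name": "bond0.78", "type": "vlan", "base": "bond0"},
]
print(bonds_with_vlans(devices))  # -> {'bond0'}
```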
I just investigated a bit further. It seems that bond0 came up in the down state after the reboot. After executing the following command the connection is working again:

# ip link set dev bond0 up
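To spot this symptom quickly on an affected host, one can scan `ip -o link show` output for interfaces whose state is DOWN. A minimal sketch, assuming the usual one-line-per-device `ip -o` format (the sample output below is fabricated for illustration, not taken from the attached logs):

```python
import re


def down_interfaces(ip_link_output):
    """Parse `ip -o link show` output and return the names of
    interfaces whose reported state is DOWN."""
    down = []
    for line in ip_link_output.splitlines():
        # "<idx>: <name>[@<base>]: <FLAGS> ... state <STATE> ..."
        m = re.match(r"\d+:\s+([^:@]+)(?:@\S+)?:\s+<[^>]*>.*\bstate (\S+)", line)
        if m and m.group(2) == "DOWN":
            down.append(m.group(1))
    return down


sample = (
    "2: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT\n"
    "6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noqueue state DOWN mode DEFAULT\n"
    "7: bond0.78@bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT\n"
)
print(down_interfaces(sample))  # -> ['bond0']
```

Note the VLAN shows LOWERLAYERDOWN rather than DOWN, which matches the observation that bringing bond0 up is enough to restore connectivity.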
Just tested.
Version used: ovirt-node-ng-installer-master-2018010109
Result: works, no errors.
Verified on:
vdsm-4.20.13-1.el7ev.x86_64 and 4.2.1.1-0.1.el7
cockpit-155-1.el7.x86_64

The scenario is:

1) Create bond1 with an IP and bond1.162 on top of it with an IP as well (created via cockpit). Both hosts have a default route before adding the host:

[root@silver-vdsb yum.repos.d]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.35.1x8.254   0.0.0.0         UG    300    0        0 bond1
0.0.0.0         10.35.1x9.254   0.0.0.0         UG    400    0        0 bond1.162
10.35.1x8.0     0.0.0.0         255.255.255.0   U     300    0        0 bond1
10.35.1x9.0     0.0.0.0         255.255.255.0   U     400    0        0 bond1.162

[root@silver-vdsb yum.repos.d]# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
6: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    inet 10.35.1x8.x/24 brd 10.35.1x8.255 scope global dynamic bond1
       valid_lft 42379sec preferred_lft 42379sec
7: bond1.162@bond1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    inet 10.35.1x9.x/24 brd 10.35.1x9.255 scope global dynamic bond1.162
       valid_lft 42409sec preferred_lft 42409sec

[root@silver-vdsb yum.repos.d]# ping -I bond1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 10.35.1x8.x bond1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=48 time=62.5 ms

[root@silver-vdsb yum.repos.d]# ping -I bond1.162 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from 10.35.1x9.x bond1.162: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=48 time=128 ms

2) Add the host to RHV on top of the VLAN bond bond1.162.

3) The ovirtmgmt network is configured on top of bond1.162:

[root@silver-vdsb ~]# brctl show
bridge name     bridge id               STP enabled     interfaces
;vdsmdummy;     8000.000000000000       no
ovirtmgmt       8000.001d096871c1       no              bond1.162

4) VDSM takes ownership of bond1:

[root@silver-vdsb ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1
# Generated by VDSM version 4.20.13-1.el7ev
DEVICE=bond1
BONDING_OPTS='mode=1 miimon=100 primary=eno1'
MACADDR=00:1d:09:68:71:c1
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes

[root@silver-vdsb ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1.162
# Generated by VDSM version 4.20.13-1.el7ev
DEVICE=bond1.162
VLAN=yes
BRIDGE=new-default
ONBOOT=yes
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
IPV6INIT=no

5) and persists MACADDR in /var/lib/vdsm/persistence/netconf/bonds/bond1:

{
    "hwaddr": "00:1d:09:68:71:c1",
    "nics": [
        "eno1",
        "eno2"
    ],
    "switch": "legacy",
    "options": "mode=1 miimon=100 primary=eno1"
}

6) After adding the host, the default route is only via bond1.162, on which the management network is configured. bond1 keeps its IP, but there is no default route via that interface after the host is added (the default route is set for the management network).

7) Survives reboot.
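The verification above hinges on VDSM persisting the bond's hwaddr alongside its slaves and options, so the MAC can be restored after reboot. A small hedged sketch of checking such a persisted entry (the validation rules here are illustrative, not VDSM's own):

```python
import json


def check_persisted_bond(raw):
    """Validate a persisted bond entry such as the one stored at
    /var/lib/vdsm/persistence/netconf/bonds/bond1.

    Returns a list of problems; an empty list means the entry looks
    complete. These checks are illustrative only.
    """
    cfg = json.loads(raw)
    problems = []
    if not cfg.get("hwaddr"):
        problems.append("missing hwaddr: MAC may change after reboot")
    if not cfg.get("nics"):
        problems.append("no slave NICs listed")
    if "options" not in cfg:
        problems.append("no bonding options recorded")
    return problems


# Sample content mirroring the persisted file shown above.
raw = """{
    "hwaddr": "00:1d:09:68:71:c1",
    "nics": ["eno1", "eno2"],
    "switch": "legacy",
    "options": "mode=1 miimon=100 primary=eno1"
}"""
print(check_persisted_bond(raw))  # -> []
```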
This bugzilla is included in the oVirt 4.2.1 release, published on Feb 12th 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.