Bug 1379115 - [OVS] Use Linux bonds with OVS networks (instead of OVS Bonds)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.18.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.1.0-alpha
Target Release: 4.19.2
Assignee: Petr Horáček
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: OpenVswitch_Support
Reported: 2016-09-25 06:19 UTC by Edward Haas
Modified: 2017-02-01 14:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-01 14:49:23 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.1+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 64385 | 0 | ovirt-4.0 | MERGED | net test: Rename link_test module to netlink_test | 2021-01-15 19:24:34 UTC
oVirt gerrit | 64386 | 0 | ovirt-4.0 | MERGED | net test: sourceroute thread crashes due to a test race | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64387 | 0 | ovirt-4.0 | MERGED | net: Define netlink netdev flags constants under netlink.link | 2021-01-15 19:24:34 UTC
oVirt gerrit | 64388 | 0 | ovirt-4.0 | MERGED | net: dhclient command - iface name should appear at the tail | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64389 | 0 | ovirt-4.0 | MERGED | net: Adding the 'link' package with an iface module | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64390 | 0 | ovirt-4.0 | MERGED | net: Adding bond api with sysfs driver under link | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64391 | 0 | ovirt-4.0 | MERGED | net func tests: Cleanup ifcfg files after each func test. | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64392 | 0 | ovirt-4.0 | MERGED | net tests: Set dummy ifaces up by default. | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64393 | 0 | ovirt-4.0 | MERGED | net: Relocating wait-for-event under its own module. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64394 | 0 | ovirt-4.0 | MERGED | net: For IP-less networks, wait for link-up on ifup execution | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64395 | 0 | ovirt-4.0 | MERGED | net: Use is_link_up instead of operstate in netfunctestlib | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64396 | 0 | ovirt-4.0 | MERGED | net: Expose disable IPv6 through ip.address module | 2021-01-15 19:23:55 UTC
oVirt gerrit | 64397 | 0 | ovirt-4.0 | MERGED | net: ifcfg - dhclient should always be stopped | 2021-01-15 19:24:35 UTC
oVirt gerrit | 64398 | 0 | ovirt-4.0 | MERGED | net: Introduce iface.exists and start using it in ip.dhclient | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64399 | 0 | ovirt-4.0 | MERGED | net: dhclient kill - early exit if iface does not exists | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64400 | 0 | ovirt-4.0 | MERGED | net: dhclient - address flush before starting and on shutdown | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64401 | 0 | ovirt-4.0 | MERGED | net: Bond - Expose the (kernel) bond list. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64402 | 0 | ovirt-4.0 | MERGED | net: Bond - adding a transaction context. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64403 | 0 | ovirt-4.0 | MERGED | net: Bond - Add logging to the bond driver. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64404 | 0 | ovirt-4.0 | MERGED | net: Bond - add refresh method to update bond config | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64405 | 0 | ovirt-4.0 | MERGED | net: Bond - preserve original slaves link state. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64406 | 0 | ovirt-4.0 | MERGED | net: Link setup module - includes bond setup logic. | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64407 | 0 | ovirt-4.0 | MERGED | net: Disable IPv6 on OVS southbound iface (nic or bonding) | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64408 | 0 | ovirt-4.0 | MERGED | net: test ovs info with southbound nic | 2021-01-15 19:23:56 UTC
oVirt gerrit | 64409 | 0 | ovirt-4.0 | MERGED | net: Remove OVS bond implementation. | 2021-01-15 19:23:57 UTC
oVirt gerrit | 64410 | 0 | ovirt-4.0 | MERGED | net: Relocate sysfs_bond_permission to nettestlib | 2021-01-15 19:23:57 UTC
oVirt gerrit | 64411 | 0 | ovirt-4.0 | MERGED | net: Use Linux bonds with OVS networks | 2021-01-15 19:24:36 UTC
oVirt gerrit | 64412 | 0 | ovirt-4.0 | MERGED | net: Split OVS setup transaction to adjust for bond setup | 2021-01-15 19:23:57 UTC
oVirt gerrit | 64413 | 0 | ovirt-4.0 | MERGED | net: Delete an OVS bridge when the last SB is detached. | 2021-01-15 19:24:36 UTC
oVirt gerrit | 64414 | 0 | ovirt-4.0 | MERGED | net: Setup validation for OVS - Check nics usage | 2021-01-15 19:23:57 UTC
oVirt gerrit | 64415 | 0 | ovirt-4.0 | MERGED | net: Log a setup networks transaction failure when using ovs switch | 2021-01-15 19:23:57 UTC
oVirt gerrit | 65058 | 0 | master | MERGED | net test: fix ovs_test:test_dry_run | 2021-01-15 19:23:57 UTC
oVirt gerrit | 66200 | 0 | master | ABANDONED | net: canonicalize bond options | 2021-01-15 19:24:36 UTC
oVirt gerrit | 66201 | 0 | master | MERGED | net: enable link.bond to handle options | 2021-01-15 19:23:57 UTC
oVirt gerrit | 66202 | 0 | master | MERGED | net: support bond options in link.setup | 2021-01-15 19:23:57 UTC
oVirt gerrit | 66203 | 0 | master | MERGED | net: save switch type and options while changing bond | 2021-01-15 19:23:57 UTC
oVirt gerrit | 66281 | 0 | master | MERGED | net: only set values once with ifacquire | 2021-01-15 19:23:58 UTC
oVirt gerrit | 66335 | 0 | master | MERGED | net: restore ovs switch bonds early | 2021-01-15 19:24:37 UTC
oVirt gerrit | 66566 | 0 | ovirt-4.0 | MERGED | net: only set values once with ifacquire | 2021-01-15 19:23:58 UTC

Description Edward Haas 2016-09-25 06:19:51 UTC
The integrated OVS implementation needs to use the Linux Bond instead of the OVS Bond.

OVS bonds have several major limitations, which lead us to use Linux
bonds instead.

Some known limitations of OVS bonds:
- Unable to apply QoS rules.
- Does not support all bond mode options (compared to the Linux bond).
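
As an illustration of the approach requested here (a kernel bond plugged into an OVS bridge as an ordinary port), the following minimal sketch only builds the relevant `ip` and `ovs-vsctl` command lines. It is not vdsm's implementation (the linked patches mention a sysfs bond driver instead); the function name and the bridge, bond, and NIC names are hypothetical, and nothing is executed.

```python
def linux_bond_ovs_commands(bridge, bond, slaves, mode="active-backup"):
    """Return the commands that create a kernel bond and plug it into OVS.

    Hypothetical helper for illustration only: it builds argument lists
    and does not run them.
    """
    # Create the bond in the kernel, not inside OVS.
    cmds = [["ip", "link", "add", bond, "type", "bond", "mode", mode]]
    for nic in slaves:
        # A NIC is taken down before it is enslaved, then brought back up.
        cmds.append(["ip", "link", "set", nic, "down"])
        cmds.append(["ip", "link", "set", nic, "master", bond])
        cmds.append(["ip", "link", "set", nic, "up"])
    cmds.append(["ip", "link", "set", bond, "up"])
    # OVS sees the kernel bond as a single regular port.
    cmds.append(["ovs-vsctl", "--may-exist", "add-br", bridge])
    cmds.append(["ovs-vsctl", "--may-exist", "add-port", bridge, bond])
    return cmds


if __name__ == "__main__":
    for cmd in linux_bond_ovs_commands("ovsbr0", "bond0", ["eth0", "eth1"]):
        print(" ".join(cmd))
```

Because the bond lives in the kernel, its mode and options are configured through the regular Linux bonding interfaces rather than through OVS.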

Comment 1 Michael Burman 2016-10-05 14:43:13 UTC
We are not there yet.

- Add host to ovs cluster over bond - failed.
- Create bond and attach network to the bond on ovs cluster - failed.
The bond is broken and the network was not attached to the host.

Please contact me and I will provide the environment for further investigation.

Comment 2 Red Hat Bugzilla Rules Engine 2016-10-05 14:43:17 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 3 Michael Burman 2016-11-09 15:27:36 UTC
Some more critical bond+ovs scenarios should be fixed:

[1] - Bond mode options aren't implemented yet. A newly created bond always ends up with mode=0, and VM networks can't be attached to it.

[2] - vdsm can't start after reboot; it fails to restore the bond.

Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/share/vdsm/vdsm-restore-net-config", line 479, in <module>
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: restore(args)
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/share/vdsm/vdsm-restore-net-config", line 442, in restore
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: unified_restoration()
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/share/vdsm/vdsm-restore-net-config", line 134, in unified_restoration
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: changed_config = _filter_changed_nets_bonds(available_config)
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/share/vdsm/vdsm-restore-net-config", line 261, in _filter_changed_nets_bonds
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: kernel_config = kernelconfig.KernelConfig(NetInfo(netswitch.netinfo()))
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/lib/python2.7/site-packages/vdsm/network/netswitch.py", line 308, in netinfo
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: ovs_netinfo, _netinfo, bridgeless_ovs_nets)
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/lib/python2.7/site-packages/vdsm/network/ovs/info.py", line 298, in fake_bridgeless
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: devtype_netinfo[iface_name].update(_shared_net_attrs(net_attrs))
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: KeyError: u'bond0'
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: Traceback (most recent call last):
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/bin/vdsm-tool", line 219, in main
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: return tool_command[cmd]["command"](*args)
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 41, in restore_command
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: exec_restore(cmd)
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: File "/usr/lib/python2.7/site-packages/vdsm/tool/restore_nets.py", line 54, in exec_restore
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: raise EnvironmentError('Failed to restore the persisted networks')
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com vdsm-tool[13979]: EnvironmentError: Failed to restore the persisted networks
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com systemd[1]: vdsm-network.service: main process exited, code=exited, status=1/FAILURE
Nov 03 12:43:43 camel-vdsa.qa.lab.tlv.redhat.com systemd[1]: Failed to start Virtual Desktop Server Manager network restoration.

Port "bond0"
            Interface "bond0"
                error: "could not open network device bond0 (No such device)"


[3] - vdsm writes the NM_CONTROLLED=no and ONBOOT=no lines (with the "# Set by VDSM" comment) to ifcfg-* multiple times. It also writes ONBOOT=no even though that setting is already there.

[root@zeus-vds1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp4s0
# This device is now owned by VDSM.
# Please do not do any changes here while the device is used by VDSM.
# Once it is detached from VDSM, remove this prefix before applying
# any changes.
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp4s0
UUID=2d0f6519-51c8-4421-927f-0832f68074a9
DEVICE=enp4s0
ONBOOT=no
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
NM_CONTROLLED=no  # Set by VDSM
ONBOOT=no  # Set by VDSM
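
The duplicated lines above suggest that the values are appended on every acquisition of the device. A minimal sketch of an idempotent update follows; it is only an illustration of the idea, not the actual vdsm fix, and the function name and its parameters are hypothetical.

```python
def set_ifcfg_values(lines, values, marker="  # Set by VDSM"):
    """Return ifcfg lines in which each key of `values` appears exactly once.

    Hypothetical helper: `lines` is the file content split into lines,
    `values` maps a key such as "ONBOOT" to its desired value.
    """
    kept = []
    for line in lines:
        is_comment = line.lstrip().startswith("#")
        key = line.split("=", 1)[0].strip()
        if "=" in line and not is_comment and key in values:
            # Drop every existing assignment of a managed key.
            continue
        kept.append(line)
    # Append one assignment per managed key, in a stable order.
    for key, value in sorted(values.items()):
        kept.append("{}={}{}".format(key, value, marker))
    return kept


if __name__ == "__main__":
    original = [
        "DEVICE=enp4s0",
        "ONBOOT=no",
        "NM_CONTROLLED=no  # Set by VDSM",
        "ONBOOT=no  # Set by VDSM",
    ]
    wanted = {"NM_CONTROLLED": "no", "ONBOOT": "no"}
    for line in set_ifcfg_values(original, wanted):
        print(line)
```

Applying the function to its own output leaves the lines unchanged, which is the property the file above is missing.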

Comment 4 Dan Kenigsberg 2016-12-05 09:08:21 UTC
Can be tested on 4.1.alpha builds.

Comment 5 Michael Burman 2016-12-11 16:13:59 UTC
Tested on - 4.1.0-0.2.master.20161210231201.git26a385e.el7.centos and vdsm-4.18.999-1128.git6b50e40.el7.centos

Scenarios that PASS:
[1] - Create ovs bond
[2] - Attach network to bond
[3] - Set static ip + prefix/netmask 
[4] - Change bond mode
[5] - Host survives reboot

Scenario that FAILED:
[1] - Adding a host over bond to ovs cluster fails

Dan, how would you like to proceed with this? Do you want a separate bug for the scenario that failed, or should we keep it here?

Comment 6 Michael Burman 2016-12-11 16:21:50 UTC
Another relevant issue:

I get this error when trying to move a host with a bond from ovs to legacy:

2016-12-11 18:16:46,765 ERROR (jsonrpc/6) [vds] All bondings must be reconfigured on switch type change (API:1526)
Traceback (most recent call last):
  File "/usr/share/vdsm/API.py", line 1523, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
ConfigNetworkError: (21, 'All bondings must be reconfigured on switch type change')
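
The error message implies a validation rule: when the switch type changes, every existing bond has to be part of the same setup request. The sketch below shows one way such a check could look. It is an assumption based only on the message text, not vdsm's code; the exception class is a local stand-in, and the function and parameter names are hypothetical.

```python
class ConfigNetworkError(Exception):
    """Local stand-in for the exception seen in the traceback above."""


ERR_CODE = 21  # error code copied from the log above


def validate_switch_change(running_bonds, requested_bonds, switch_changed):
    """Fail if a switch type change leaves some running bond untouched.

    Hypothetical check: `running_bonds` and `requested_bonds` are
    iterables of bond names, `switch_changed` says whether the request
    moves the host between switch types.
    """
    if not switch_changed:
        return
    untouched = set(running_bonds) - set(requested_bonds)
    if untouched:
        raise ConfigNetworkError(
            ERR_CODE,
            'All bondings must be reconfigured on switch type change')


if __name__ == "__main__":
    # bond1 is running but missing from the request, so the check fails.
    try:
        validate_switch_change(["bond0", "bond1"], ["bond0"], True)
    except ConfigNetworkError as err:
        print(err.args)
```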

Comment 7 Michael Burman 2016-12-12 06:46:35 UTC
This applies in both directions:
ovs >> legacy
legacy >> ovs

Error while executing action SyncAllHostNetworks: Illegal Network parameters

I believe we should fail it at the moment.

Comment 8 Michael Burman 2016-12-12 09:49:42 UTC
As agreed with Dan, this bug can be considered verified.

- The add host over bond scenario will be covered in a new bug.

- ConfigNetworkError: (21, 'All bondings must be reconfigured on switch type change') issue will be handled by BZ 1362399

Verified on 4.1.0-0.2.master.20161210231201.git26a385e.el7.centos

Comment 9 Sandro Bonazzola 2016-12-12 11:04:31 UTC
This bug is targeted 4.1 but it appears that the fix has been included in 4.0.6. Please crosscheck and re-target if it's fixed in 4.0.6.

Comment 10 Michael Burman 2016-12-12 11:13:17 UTC
Sandro,

We are not testing ovs on 4.0.6.
This bug will be tested only on 4.1. Thanks

Comment 11 Dan Kenigsberg 2016-12-26 15:03:02 UTC
The native OvS feature has failed to reach 4.1 (let alone 4.0.6), even though this specific bug is fixed and verified.

