Bug 1856279 - Failed to assign network to Infiniband Bond
Summary: Failed to assign network to Infiniband Bond
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.22
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.6
Target Release: ---
Assignee: Ales Musil
QA Contact: Michael Burman
URL:
Whiteboard:
Duplicates: 1865855
Depends On: 1841017
Blocks: 1882542
Reported: 2020-07-13 09:44 UTC by Andrei
Modified: 2021-05-05 05:36 UTC
CC List: 12 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1858726 1858731
Environment:
Last Closed: 2021-04-22 06:43:30 UTC
oVirt Team: Network
Embargoed:
pm-rhel: ovirt-4.4+


Attachments
supervdsm IB add (9.76 KB, text/plain)
2020-08-14 08:04 UTC, Andrei
VDSM IB test (19.35 KB, text/plain)
2020-08-14 08:05 UTC, Andrei


Links
System            ID       Private  Priority  Status  Summary                                              Last Updated
Red Hat Bugzilla  1858731  0        high      CLOSED  [Docs] Failed to assign network to Infiniband Bond  2021-02-22 00:41:40 UTC

Internal Links: 1858726 1858731

Description Andrei 2020-07-13 09:44:46 UTC
Description of problem:

Version-Release number of selected component (if applicable):
oVirt 4.4.1.8-1.el8

How reproducible:

Steps to Reproduce:
1. Create an Infiniband bond
2. Create a new network with MTU 65520 (uncheck VM Network)
3. Assign the created network to the IB bond (go to Hosts --> Network Interfaces, drag the network to the IB bond, press OK)

Actual results:
Error -  "Error while executing action HostSetupNetworks: Unexpected exception"

Expected results:
New network assigned to IB Bond. 

Additional info:

VDSM log 
2020-07-11 15:33:39,147+0300 INFO (jsonrpc/1) [api.network] START setupNetworks(networks={'IB': {'netmask': '255.255.255.0', 'bonding': 'bond1', 'ipv6autoconf': False, 'bridged': 'false', 'ipaddr': '172.17.***.***', 'dhcpv6': False, 'mtu': 65520, 'switch': 'legacy'}}, bondings={}, options={'connectivityTimeout': 120, 'commitOnSuccess': True, 'connectivityCheck': 'true'}) from=::ffff:172.16.***.***,33614, flow_id=fe589281-3171-41b4-b7aa-e28a4bbebe55 (api:48)
2020-07-11 15:33:40,088+0300 INFO (jsonrpc/1) [api.network] FINISH setupNetworks error=MAC address cannot be specified in bond interface along with specified bond options from=::ffff:172.16.***.***,33614, flow_id=fe589281-3171-41b4-b7aa-e28a4bbebe55 (api:52)
2020-07-11 15:33:40,088+0300 ERROR (jsonrpc/1) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
libnmstate.error.NmstateValueError: MAC address cannot be specified in bond interface along with specified bond options

Comment 1 RHEL Program Management 2020-07-14 08:35:18 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 2 Andrei 2020-07-20 12:11:51 UTC
Hi,

There are 2 more issues with IB.
- IB bonds are down after a host reboot. Temporary solution: "service network reboot" (no such problem on 4.3.X)
- The default IB mode is datagram, and setting the MTU to 65520 didn't work, as datagram mode supports an MTU of up to 2044 (same problem on 4.3.X)
Is it possible to add mode selection to the oVirt GUI or ...

Please let me know if I need to create a new request.
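
The MTU mismatch described above can be checked before a network is assigned. The following is a minimal sketch, not part of VDSM or nmstate: the function name and the table are hypothetical, the datagram limit of 2044 is the value quoted in this comment (it assumes a 2048-byte InfiniBand link MTU), and 65520 is the usual connected-mode maximum.

```python
# Upper MTU bounds per IPoIB transport mode (assumptions, see lead-in):
# 2044 for datagram mode with a 2048-byte InfiniBand link MTU,
# 65520 for connected mode.
IPOIB_MAX_MTU = {"datagram": 2044, "connected": 65520}


def check_ipoib_mtu(mode: str, mtu: int) -> bool:
    """Return True if the requested MTU fits the given IPoIB transport mode."""
    if mode not in IPOIB_MAX_MTU:
        raise ValueError(f"unknown IPoIB mode: {mode!r}")
    return 0 < mtu <= IPOIB_MAX_MTU[mode]


# The network from this report asks for MTU 65520 on a datagram-mode interface.
print(check_ipoib_mtu("datagram", 65520))
print(check_ipoib_mtu("connected", 65520))
```

A check like this only explains why MTU 65520 is rejected in datagram mode; it does not change the transport mode of the interface.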

Comment 3 Dominik Holler 2020-08-10 06:06:38 UTC
*** Bug 1865855 has been marked as a duplicate of this bug. ***

Comment 4 Dominik Holler 2020-08-10 13:54:00 UTC
Does the following procedure work as a workaround?

On every infiniband host:
nmcli connection add type infiniband con-name ib0 ifname ib0 transport-mode datagram mtu 2044
nmcli connection modify ib0 ipv4.addresses '192.0.2.1/24'
nmcli connection modify ib0 ipv4.method manual
nmcli connection modify ib0 ipv4.never-default true
nmcli connection modify ib0 ipv6.method disabled
nmcli connection up ib0

On oVirt Engine Administration Portal:
1. Create a new logical network, no VM network, MTU 2044
2. Use this network as migration network in the cluster
3. In "Setup Host Networks" dialog, assign this network to the infiniband interface, with
  the same settings as applied on the host, e.g.
  IPv4: Boot Protocol Static, IP 192.0.2.1, Netmask/Routing Prefix 24, no Gateway
  IPv6: Boot Protocol None
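
When the host-side part of this workaround has to be repeated on several hosts, the nmcli calls can be generated instead of typed. The sketch below only rebuilds the six commands listed above as argument lists; the helper name ipoib_nmcli_commands and its parameters are hypothetical, and the script prints the commands rather than running them.

```python
import shlex


def ipoib_nmcli_commands(ifname, address, mode="datagram", mtu=2044):
    """Build the nmcli invocations from the workaround as argument lists."""
    modify = ["nmcli", "connection", "modify", ifname]
    return [
        ["nmcli", "connection", "add", "type", "infiniband",
         "con-name", ifname, "ifname", ifname,
         "transport-mode", mode, "mtu", str(mtu)],
        modify + ["ipv4.addresses", address],
        modify + ["ipv4.method", "manual"],
        modify + ["ipv4.never-default", "true"],
        modify + ["ipv6.method", "disabled"],
        ["nmcli", "connection", "up", ifname],
    ]


for argv in ipoib_nmcli_commands("ib0", "192.0.2.1/24"):
    # Print only; to apply, pass argv to subprocess.run(argv, check=True).
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each host needs its own address, so the address is a required argument while mode and MTU default to the values used in this comment.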

Comment 5 Andrei 2020-08-10 19:15:27 UTC
Hi,

So basically it is the same config as we are using on 4.3 (the difference is connected mode and bond).

But it is not working on 4.4 (all names are set according to config and log files).

nmcli - all commands are OK.
oVirt
1 - OK (new network name - IB)
2 - I think it is Step 3
3 - Didn't work as well
 * When I drag and drop the IB network to ib1, all network settings are automatically filled in from the nmcli settings.

Comment 6 Dominik Holler 2020-08-10 19:19:05 UTC
(In reply to Andrei from comment #5)
> Hi,
> 
> So basically it is the same config as we are using on 4.3 (the difference is
> connected mode and bond).
> 
> But it is not working on 4.4 (all names are set according to config and log
> files).
> 
> nmcli - all commands are OK.
> oVirt
> 1 - OK (new network name - IB)
> 2 - I think it is Step 3
> 3 - Didn't work as well

Which problem occurs in step 3?

>  * When I drag and drop the IB network to ib1, all network settings are
> automatically filled in from the nmcli settings.

Comment 7 Andrei 2020-08-14 08:03:02 UTC
Hi,

I was able to set up a fresh host.

And I got "Error while executing action HostSetupNetworks: Unexpected exception"
when pressing OK in the Network config window.

You can find the full log in the attachments.



2020-08-14 10:55:12,870+0300 INFO  (jsonrpc/2) [api.network] FINISH setupNetworks error=Unexpected failure of libnm when running the mainloop: run execution from=::ffff:172.16.**.***,54926, flow_id=2e38b2b1-a615-468f-8909-1a15eff85ac6 (api:52)
2020-08-14 10:55:12,870+0300 ERROR (jsonrpc/2) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python3.6/site-packages/vdsm/rpc/Bridge.py", line 198, in _dynamicMethod
    result = fn(*methodArgs)
  File "<decorator-gen-480>", line 2, in setupNetworks
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/API.py", line 1548, in setupNetworks
    supervdsm.getProxy().setupNetworks(networks, bondings, options)
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python3.6/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in setupNetworks
  File "/usr/lib64/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
libnmstate.error.NmstateLibnmError: Unexpected failure of libnm when running the mainloop: run execution
2020-08-14 10:55:12,871+0300 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.setupNetworks failed (error -32603) in 5.49 seconds (__init__:312)
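
To find failures like this one in a longer vdsm.log, the FINISH lines can be filtered by pattern. This is a rough sketch that assumes the line layout shown in the excerpt above; the regular expression and the helper name setup_networks_errors are illustrative and not part of VDSM.

```python
import re

# Matches "[api.network] FINISH setupNetworks error=..." lines in the
# layout shown in this report (an assumption about the log format).
FINISH_ERROR = re.compile(
    r"^(?P<timestamp>\S+ \S+) \w+\s+\((?P<thread>[^)]+)\) \[api\.network\] "
    r"FINISH setupNetworks error=(?P<error>.*?) from=(?P<source>\S+), "
    r"flow_id=(?P<flow_id>[0-9a-f-]+)"
)


def setup_networks_errors(lines):
    """Yield one dict per failed setupNetworks call found in the log lines."""
    for line in lines:
        match = FINISH_ERROR.match(line)
        if match:
            yield match.groupdict()


SAMPLE_LOG = [
    "2020-08-14 10:55:12,870+0300 INFO  (jsonrpc/2) [api.network] FINISH "
    "setupNetworks error=Unexpected failure of libnm when running the "
    "mainloop: run execution from=::ffff:172.16.**.***,54926, "
    "flow_id=2e38b2b1-a615-468f-8909-1a15eff85ac6 (api:52)",
    "2020-08-14 10:55:12,870+0300 ERROR (jsonrpc/2) [jsonrpc.JsonRpcServer] "
    "Internal server error (__init__:350)",
]

for entry in setup_networks_errors(SAMPLE_LOG):
    print(entry["flow_id"], entry["error"])
```

The flow_id captured here is the same one the engine logs for the HostSetupNetworks action, so it can be used to correlate the two sides.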

Comment 8 Andrei 2020-08-14 08:04:59 UTC
Created attachment 1711427 [details]
supervdsm IB add

Comment 9 Andrei 2020-08-14 08:05:20 UTC
Created attachment 1711428 [details]
VDSM IB test

Comment 10 Dominik Holler 2020-08-17 07:37:34 UTC
(In reply to Andrei from comment #7)
> Hi,
> 
> I was able to set up a fresh host.
> 
> And I got "Error while executing action HostSetupNetworks: Unexpected
> exception"
> when pressing OK in the Network config window.
> 

Thanks for checking this.
Did you configure ib0 manually with nmcli beforehand, as described in comment 4?

Comment 11 Andrei 2020-08-19 10:51:19 UTC
Hi,

Yes, sure.

Comment 12 Trey Prinz 2020-08-25 14:42:37 UTC
Per a request on rhev-tech (thread with subject "Couple of questions regarding RHV"), I wanted to add the following information about a use case we are pursuing.  The end goal is twofold:

- Use NFS over IB as a storage domain for the actual VMs
- Mount an NFS share over IB inside of the VMs

Customer environment:

- RHEL-based HPC cluster
- RHEL-based NFS storage server
- RHEL-based physical application servers
- All systems are connected via IB

The physical application servers are old, so the customer is replacing them and wants a virtualized solution (they run into app conflicts and want to isolate the apps inside of VMs).  They will be replacing the current application servers with (3) new servers.

Other details:

- IB adapter:  Mellanox ConnectX-3
- RHEL version:  7.7
- They are not using IB bonding in the environment

Take care.

Comment 13 Dominik Holler 2020-09-16 09:25:58 UTC
(In reply to Andrei from comment #11)
> Hi,
> 
> Yes, sure.

In this case I can only recommend managing the Infiniband network connection manually with NetworkManager until nmstate supports Infiniband (bz 1841017).

Comment 14 Dominik Holler 2020-10-06 06:32:56 UTC
The development version of nmstate already contains support for Infiniband.
This nmstate version can be installed from
https://copr.fedorainfracloud.org/coprs/nmstate/nmstate-git/
If nispor is required as a dependency, it can be installed from
https://copr.fedorainfracloud.org/coprs/nmstate/nispor/

Feedback is welcome!

Comment 15 Ed Berger 2020-10-08 15:50:24 UTC
Hi Dominik,

I'm having difficulty installing the newer nmstate from the above repos on an oVirt node 4.4.2.

Is there a procedure to install the updated nmstate, python-libnmstate, NetworkManager-libnm 
with the imgbased ovirt 4.4.2 node-ng?   

After disabling versionlock.conf I get:

# dnf upgrade nmstate
Copr repo for nispor owned by nmstate            25 kB/s | 6.0 kB     00:00
Copr repo for nmstate-git owned by nmstate       25 kB/s | 4.6 kB     00:00
Extra Packages for Enterprise Linux 8 - x86_64  837 kB/s | 8.1 MB     00:09
CentOS-8 - Gluster 7                             36 kB/s |  37 kB     00:01
virtio-win builds roughly matching what will be 191 kB/s |  66 kB     00:00
Copr repo for EL8_collection owned by sbonazzo  1.4 MB/s | 401 kB     00:00
Copr repo for gluster-ansible owned by sac       55 kB/s | 8.4 kB     00:00
Copr repo for ovsdbapp owned by mdbarroso        19 kB/s | 2.0 kB     00:00
Copr repo for nmstate-stable owned by nmstate    21 kB/s | 2.7 kB     00:00
Copr repo for NetworkManager-1.22 owned by netw 264 kB/s |  41 kB     00:00
Advanced Virtualization packages for x86_64      96 kB/s | 127 kB     00:01
CentOS-8 - oVirt 4.4                            913 kB/s | 1.1 MB     00:01
CentOS-8 - OpsTools - collectd                  440 kB/s | 123 kB     00:00
Latest oVirt 4.4 Release                        505 kB/s | 874 kB     00:01
Error:
 Problem: package nmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch requires python3-libnmstate = 0.4.1-0.20201006.1232git963a04d.el8, but none of the providers can be installed
  - cannot install the best update candidate for package nmstate-0.2.10-1.el8.noarch
  - nothing provides python3.6dist(varlink) needed by python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
  - nothing provides NetworkManager-libnm >= 1:1.26.0 needed by python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

Maybe a 4.4-pre node-ng image update will have these newer versions?
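
The dnf error above names the capabilities that none of the enabled repos provide. A small sketch for pulling them out of such output follows; the regular expression assumes the "nothing provides X needed by Y" wording shown above, and the helper name missing_capabilities is hypothetical.

```python
import re

# Assumes the "nothing provides <capability> needed by <package>" wording
# printed by dnf in this report.
NOTHING_PROVIDES = re.compile(
    r"nothing provides (?P<capability>.+?) needed by (?P<package>\S+)"
)


def missing_capabilities(dnf_output):
    """Return (capability, requiring package) pairs found in dnf error text."""
    return [
        (match.group("capability"), match.group("package"))
        for match in NOTHING_PROVIDES.finditer(dnf_output)
    ]


DNF_OUTPUT = """\
Error:
 Problem: package nmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch requires python3-libnmstate = 0.4.1-0.20201006.1232git963a04d.el8, but none of the providers can be installed
  - cannot install the best update candidate for package nmstate-0.2.10-1.el8.noarch
  - nothing provides python3.6dist(varlink) needed by python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
  - nothing provides NetworkManager-libnm >= 1:1.26.0 needed by python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
"""

for capability, package in missing_capabilities(DNF_OUTPUT):
    print(f"{package} needs {capability}")
```

For this report the list shows two unmet requirements of python3-libnmstate: a python3.6 varlink module and NetworkManager-libnm 1.26.0 or newer.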

Comment 16 Dominik Holler 2020-10-09 13:31:55 UTC
(In reply to Ed Berger from comment #15)
> Hi Dominik,
> 
> I'm having difficulty installing the newer nmstate from the above repos on
> an oVirt node 4.4.2.
> 

Happy that you gave the upcoming version a try!

> Is there a procedure to install the updated nmstate, python-libnmstate,
> NetworkManager-libnm 
> with the imgbased ovirt 4.4.2 node-ng?   
> 


I have only tried CentOS-based hosts so far.


> disabling the versionlock.conf I get
> 

This should not be required.

> ]# dnf upgrade nmstate
> Copr repo for nispor owned by nmstate            25 kB/s | 6.0 kB     00:00
> Copr repo for nmstate-git owned by nmstate       25 kB/s | 4.6 kB     00:00
> Extra Packages for Enterprise Linux 8 - x86_64  837 kB/s | 8.1 MB     00:09
> CentOS-8 - Gluster 7                             36 kB/s |  37 kB     00:01
> virtio-win builds roughly matching what will be 191 kB/s |  66 kB     00:00
> Copr repo for EL8_collection owned by sbonazzo  1.4 MB/s | 401 kB     00:00
> Copr repo for gluster-ansible owned by sac       55 kB/s | 8.4 kB     00:00
> Copr repo for ovsdbapp owned by mdbarroso        19 kB/s | 2.0 kB     00:00
> Copr repo for nmstate-stable owned by nmstate    21 kB/s | 2.7 kB     00:00
> Copr repo for NetworkManager-1.22 owned by netw 264 kB/s |  41 kB     00:00
> Advanced Virtualization packages for x86_64      96 kB/s | 127 kB     00:01
> CentOS-8 - oVirt 4.4                            913 kB/s | 1.1 MB     00:01
> CentOS-8 - OpsTools - collectd                  440 kB/s | 123 kB     00:00
> Latest oVirt 4.4 Release                        505 kB/s | 874 kB     00:01
> Error:
>  Problem: package nmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
> requires python3-libnmstate = 0.4.1-0.20201006.1232git963a04d.el8, but none
> of the providers can be installed
>   - cannot install the best update candidate for package
> nmstate-0.2.10-1.el8.noarch
>   - nothing provides python3.6dist(varlink) needed by
> python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
>   - nothing provides NetworkManager-libnm >= 1:1.26.0 needed by
> python3-libnmstate-0.4.1-0.20201006.1232git963a04d.el8.noarch
> (try to add '--skip-broken' to skip uninstallable packages or '--nobest' to
> use not only best candidate packages)
> 

Is this solved if the NetworkManager 1.26 repo from

https://copr.fedorainfracloud.org/coprs/networkmanager/NetworkManager-1.26/

is also enabled?


> Maybe a 4.4-pre node-ng image update will have these newer versions?

Unfortunately not, because the build of nmstate is currently on copr, which does not yet support all platforms supported by oVirt.

Comment 17 Dominik Holler 2020-10-09 13:40:16 UTC
> 
> > disabling the versionlock.conf I get
> > 
> 
> This should not be required.
> 

I was probably wrong in this regard.
Disabling versionlock.conf may be needed, because oVirt Node locks RPMs there to prevent accidental upgrades.

Comment 18 Michael Burman 2021-04-22 06:41:59 UTC
QE doesn't have Infiniband hardware to test this. This is fixed with nmstate 1.0; no work was done on the RHV side.

Comment 19 Sandro Bonazzola 2021-05-05 05:36:36 UTC
This bugzilla is included in oVirt 4.4.6 release, published on May 4th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.6 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

