Bug 1230813 - Unable to add label to bond0 when trunk contains additional VLANs which are used to register the host to RHEV-M
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ovirt-3.6.0-rc3
Target Release: 3.6.0
Assignee: Marcin Mirecki
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On: 1136329
Blocks: 1241055
Reported: 2015-06-11 14:49 UTC by Simon Reber
Modified: 2019-09-12 08:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1241055
Environment:
Last Closed: 2016-03-09 21:07:31 UTC
oVirt Team: Network
Target Upstream Version:
Embargoed:
ylavi: Triaged+



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2016:0376 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise Virtualization Manager 3.6.0 | 2016-03-10 01:20:52 UTC

Description Simon Reber 2015-06-11 14:49:11 UTC
Description of problem:
The customer is installing and configuring a RHEV environment in a fully automated way using the REST API.

Before joining the RHEV hypervisors to RHEV-M, they configure each of them as follows:

 - Set up bond0 in mode 4 (802.3ad), consisting of eth1 and eth3
 - Set up bond0.1710 and bond0.1711 on top of this bond0
   - Network with VLANID 1710: Not used as VM network
   - Network with VLANID 1711: Used as VM network
 - Install `vdsm`

Once this is done, they register the host using the IP address from bond0.1711.

After the host has been successfully registered to RHEV, they try to add a label called fiber to bond0.

This fails with the following error:

[Cannot add Label. The following Network Interfaces were specified more than once: bond0.1710.]


When the label is added via the Web UI, the operation succeeds without any error.

Version-Release number of selected component (if applicable):
rhevm-3.5.1.1-0.1.el6ev.noarch
vdsm-4.16.13.1-1.el6ev.x86_64

How reproducible:
Always via REST API

Steps to Reproduce:

**No reproducer available yet, as we are missing a lab with a trunk port to build a bond with mode 4 and two VLANs on top**

However, the customer confirmed that the setup looks as follows:


RHEV-M 3.5.1

[root@hp-bl460cg8-1 ~]# rpm -qa | grep rhevm-3
rhevm-3.5.1.1-0.1.el6ev.noarch
[root@hp-bl460cg8-1 ~]# uname -a
Linux hp-bl460cg8-1.XXX.redhat.com 2.6.32-504.el6.x86_64 #1 SMP Tue Sep 16 01:56:35 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Hypervisor running Red Hat Enterprise Linux 6

[root@hp-dl560g8-1 network-scripts]# cat ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
MASTER=bond0
SLAVE=yes
MTU=1500
NM_CONTROLLED=no
IPV6INIT=no
IPV6_AUTOCONF=no
[root@hp-dl560g8-1 network-scripts]# cat ifcfg-eth3
DEVICE=eth3
ONBOOT=yes
MASTER=bond0
SLAVE=yes
MTU=1500
NM_CONTROLLED=no
IPV6INIT=no
IPV6_AUTOCONF=no
[root@hp-dl560g8-1 network-scripts]# cat ifcfg-bond0
DEVICE=bond0
ONBOOT=yes
BONDING_OPTS='mode=4 miimon=100 xmit_hash_policy=layer3+4'
MTU=1500
NM_CONTROLLED=no
HOTPLUG=no
IPV6INIT=no
IPV6_AUTOCONF=no
DEFROUTE=no
[root@hp-dl560g8-1 network-scripts]# cat ifcfg-bond0.1710
DEVICE=bond0.1710
ONBOOT=yes
VLAN=yes
IPADDR=172.17.121.12
NETMASK=255.255.255.224
MTU=1500
BOOTPROTO=none
DEFROUTE=no
NM_CONTROLLED=no
HOTPLUG=no
IPV6INIT=no
IPV6_AUTOCONF=no
[root@hp-dl560g8-1 network-scripts]# cat ifcfg-bond0.1711
DEVICE=bond0.1711
ONBOOT=yes
VLAN=yes
IPADDR=172.26.121.13
NETMASK=255.255.255.224
MTU=1500
BOOTPROTO=none
DEFROUTE=yes
NM_CONTROLLED=no
HOTPLUG=no
IPV6INIT=no
IPV6_AUTOCONF=no
[root@hp-dl560g8-1 network-scripts]#

[root@hp-dl560g8-1 ~]# uname -a
Linux hp-dl560g8-1.XXX.redhat.com 2.6.32-504.16.2.el6.x86_64 #1 SMP Tue Mar 10 17:01:00 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@hp-dl560g8-1 network-scripts]# rpm -qa | grep vdsm-4
vdsm-4.16.13.1-1.el6ev.x86_64

Obtain the NIC reference:

[root@hp-bl460cg8-1 ~]# curl -X GET -H "Accept: application/xml" -H "Content-Type: application/xml" -k -u admin@internal:xxxxxx https://hp-bl460cg8-1.XXX.redhat.com/api/hosts/b044a71c-1a7e-4bb7-86fd-e6c5fff05f0b/nics/


Once the NIC has been identified, set the label accordingly:

[root@hp-bl460cg8-1 ~]# curl -H "Content-Type: application/xml" -k -u 'admin@internal:xxxxxx' -X POST -d "<label id='fiber' />" https://hp-bl460cg8-1.XXX.redhat.com/api/hosts/b044a71c-1a7e-4bb7-86fd-e6c5fff05f0b/nics/b0d8b7a7-5a0b-4a7c-9e08-34e1cddd14c3/labels

At the customer site, this returns the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fault>
    <reason>Operation Failed</reason>
    <detail>[Cannot add Label. The following Network Interfaces were specified more than once: bond0.1710.]</detail>
</fault>


Actual results:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fault>
    <reason>Operation Failed</reason>
    <detail>[Cannot add Label. The following Network Interfaces were specified more than once: bond0.1710.]</detail>
</fault>


Expected results:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<label href="/api/hosts/b044a71c-1a7e-4bb7-86fd-e6c5fff05f0b/nics/b0d8b7a7-5a0b-4a7c-9e08-34e1cddd14c3/labels/foobar" id="foobar">
    <host_nic href="/api/hosts/b044a71c-1a7e-4bb7-86fd-e6c5fff05f0b/nics/b0d8b7a7-5a0b-4a7c-9e08-34e1cddd14c3" id="b0d8b7a7-5a0b-4a7c-9e08-34e1cddd14c3"/>
</label>


Additional info:

The customer noticed that applying the following on the hypervisor before joining it to RHEV works around the problem:

# define networks needed for automated rhev hosts add

cat << theEnd > /opt/vdsm-sds.xml
<network>
  <name>vdsm-sds</name>
  <forward dev='bond0.1710' mode='passthrough'>
    <interface dev='bond0.1710'/>
  </forward>
</network>
theEnd

cat << theEnd > /opt/vdsm-rhevm.xml
<network>
  <name>vdsm-rhevm</name>
  <forward mode='bridge'/>
  <bridge name='rhevm' />
</network>
theEnd



virsh net-define /opt/vdsm-sds.xml
virsh net-start vdsm-sds
virsh net-autostart vdsm-sds

virsh net-define /opt/vdsm-rhevm.xml
virsh net-start vdsm-rhevm
virsh net-autostart vdsm-rhevm


rm -f /opt/vdsm*.xml

Comment 1 Juan Hernández 2015-06-11 15:02:30 UTC
The RESTAPI operation to add a label is extremely simple: it just invokes the backend LabelNic command, passing the NIC identifier and the label. So chances are that the issue is in the backend. I'm changing the component accordingly.

Comment 2 Simon Reber 2015-06-11 15:07:25 UTC
I forgot to add that the following messages are logged when the operation fails:

2015-06-03 12:16:50,716 WARN  [org.ovirt.engine.core.bll.network.host.SetupNetworksCommand] (ajp-/127.0.0.1:8702-1) [5d063f8c] CanDoAction of action SetupNetworks failed for user admin@internal. Reasons: VAR__ACTION__SETUP,VAR__TYPE__NETWORKS,NETWORK_INTERFACES_ALREADY_SPECIFIED,$NETWORK_INTERFACES_ALREADY_SPECIFIED_LIST bond0.1710
2015-06-03 12:16:50,718 ERROR [org.ovirt.engine.core.bll.network.host.LabelNicCommand] (ajp-/127.0.0.1:8702-1) [5d063f8c] Transaction rolled-back for command: org.ovirt.engine.core.bll.network.host.LabelNicCommand.
2015-06-03 12:16:50,722 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-1) [5d063f8c] Correlation ID: 35e19111, Call Stack: null, Custom Event ID: -1, Message: Failed to label network interface card bond0 with label fiber on host hvm-lab10ch-3.mgmt.sccloudpoc.net.
2015-06-03 12:16:50,723 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (ajp-/127.0.0.1:8702-1) Operation Failed: [Cannot add Label. The following Network Interfaces were specified more than once: bond0.1710.]

Comment 3 Juan Hernández 2015-06-11 18:25:07 UTC
I reproduced this in my environment, following the instructions in the description.

Looking at the code I see that when adding the label the required networks are also added. If they are VLANs then the corresponding VLAN device is created on top of the bond without first checking if it already exists. From the NetworkParametersBuilder class:

  protected void configureNetwork(VdsNetworkInterface nic, List<VdsNetworkInterface> nics, Network network) {
    NetworkCluster networkCluster = getNetworkCluster(nic, network);
    if (NetworkUtils.isVlan(network)) {
      VdsNetworkInterface vlan = createVlanDevice(nic, network);

      // This ^ creates a new VLAN device on top of the bond without
      // taking into account that it may already exist. The result
      // is a duplicated device name that will later be rejected.

      addBootProtocolForRoleNetwork(networkCluster, vlan);
      nics.add(vlan);
    }
    ...      
  }

Note also that the GUI doesn't use the LabelNic command; it calls the SetupNetworks command directly. That is why this problem doesn't affect the GUI.
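
To illustrate the mechanism described above, here is a minimal, self-contained sketch. It is not the ovirt-engine code and not the fix that was eventually shipped: the class VlanLabelSketch, the Nic type and all method names are hypothetical stand-ins, and the host state is taken from the bug description (bond0 with bond0.1710 and bond0.1711 already present). It only shows how unconditionally appending a VLAN device yields a duplicated name, and how a lookup by device name before adding would avoid it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; these are NOT the real ovirt-engine classes.
public class VlanLabelSketch {

    /** Minimal stand-in for a host NIC: only the device name matters here. */
    static final class Nic {
        final String name;

        Nic(String name) {
            this.name = name;
        }
    }

    /** Device name of a VLAN on top of a base interface, e.g. bond0.1710. */
    static String vlanDeviceName(String baseNic, int vlanId) {
        return baseNic + "." + vlanId;
    }

    /** Behaviour described in comment 3: always append a new VLAN device. */
    static void configureUnguarded(String baseNic, int vlanId, List<Nic> nics) {
        nics.add(new Nic(vlanDeviceName(baseNic, vlanId)));
    }

    /** Guarded variant: append the VLAN device only if no NIC has that name yet. */
    static void configureGuarded(String baseNic, int vlanId, List<Nic> nics) {
        String name = vlanDeviceName(baseNic, vlanId);
        for (Nic nic : nics) {
            if (nic.name.equals(name)) {
                return; // device already exists on the host, nothing to add
            }
        }
        nics.add(new Nic(name));
    }

    /** Names that occur more than once in the list. */
    static Set<String> duplicateNames(List<Nic> nics) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (Nic nic : nics) {
            if (!seen.add(nic.name)) {
                duplicates.add(nic.name);
            }
        }
        return duplicates;
    }

    /** Host state from the description: bond0 with two pre-existing VLAN devices. */
    static List<Nic> hostNics() {
        List<Nic> nics = new ArrayList<>();
        nics.add(new Nic("bond0"));
        nics.add(new Nic("bond0.1710"));
        nics.add(new Nic("bond0.1711"));
        return nics;
    }

    public static void main(String[] args) {
        List<Nic> unguarded = hostNics();
        configureUnguarded("bond0", 1710, unguarded);
        System.out.println("unguarded duplicates: " + duplicateNames(unguarded));

        List<Nic> guarded = hostNics();
        configureGuarded("bond0", 1710, guarded);
        System.out.println("guarded duplicates: " + duplicateNames(guarded));
    }
}
```

With the unguarded variant the duplicate set contains bond0.1710, which matches the interface named in the error message; with the guarded variant it is empty. A VLAN that does not exist on the host yet (for example a hypothetical ID 1712) would still be added by both variants.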

Comment 4 Simon Reber 2015-06-22 12:14:32 UTC
Do we need to provide further information, or is the information from Juan Hernández (https://bugzilla.redhat.com/show_bug.cgi?id=1230813#c3) sufficient to identify and fix the issue?

Comment 10 Marcin Mirecki 2015-10-01 12:42:14 UTC
This looks like a duplicate of https://bugzilla.redhat.com/1241055, which was fixed for 3.5.
I do not see this error happening in master.
The error message "The following Network Interfaces were specified more than once" is reachable only from SetupNetworksCommand, which was replaced by HostSetupNetworksCommand in master/3.6, so it should not appear in master/3.6.

Comment 11 Alona Kaplan 2015-10-14 10:09:19 UTC
Should work using the new HostSetupNetworks command.

Comment 12 Yaniv Lavi 2015-10-14 13:02:48 UTC
Why did you reopen this?

Comment 13 Alona Kaplan 2015-10-18 06:06:33 UTC
(In reply to Yaniv Dary from comment #12)
> Why did you reopen this?

I moved it to closed by mistake. I want QE to verify that it works in 3.6.

Comment 14 Michael Burman 2015-10-18 13:04:29 UTC
Verified on - rhevm-3.6.0.1-0.1.el6.noarch and vdsm-4.17.9-1.el7ev.noarch

Comment 16 errata-xmlrpc 2016-03-09 21:07:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

