Bug 1667181 - [Bond] VDSM does not detect partner MAC when first creating the Bond
Keywords:
Status: CLOSED DUPLICATE of bug 999947
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.30.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.3.1
Target Release: ---
Assignee: Bell Levin
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-01-17 16:39 UTC by Roni
Modified: 2019-01-31 10:13 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-01-31 10:13:10 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.3+


Attachments
bond_check_hook (711 bytes, text/x-python)
2019-01-17 16:39 UTC, Roni
jq (3.77 MB, application/x-executable)
2019-01-17 16:41 UTC, Roni

Description Roni 2019-01-17 16:39:41 UTC
Created attachment 1521310 [details]
bond_check_hook

Description of problem:
When first creating a bond with misconfigured LACP, the exclamation icon tooltip reports a zero MAC, although the OS already detects a partner MAC (via /sys/class/net/bond0/bonding/ad_partner_mac).

After restarting vdsmd, or after waiting a few minutes, the zero-MAC message disappears.

Version-Release number of selected component (if applicable):
4.3.0-0.8.master.20190115203932.gitaafcce1.el7

How reproducible:
100%

Steps to Reproduce:
1. Create a bond with a misconfigured LACP aggregator ID

Actual results:
exclamation icon tooltip message:
"Bond is in link aggregation mode (mode 4)  but no partner mac has been reported for it. At least one slave has a different aggregator id."

Expected results:
"At least one slave has a different aggregator id."
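
The difference between the actual and expected tooltip comes down to two independent checks on the reported bond data. A minimal sketch of that decision, assuming a hypothetical helper `bond_warnings` (this is not the Engine's actual code, and the per-slave aggregator IDs below are illustrative values, not taken from the report):

```python
ZERO_MAC = "00:00:00:00:00:00"


def bond_warnings(mode, partner_mac, bond_agg_id, slave_agg_ids):
    """Return the warning fragments a UI could show for a bond.

    Hypothetical helper for illustration only.
    slave_agg_ids maps slave name -> aggregator id (as strings).
    """
    warnings = []
    if str(mode) != "4":
        # Only 802.3ad (mode 4) bonds have a LACP partner to report.
        return warnings
    if not partner_mac or partner_mac == ZERO_MAC:
        warnings.append("no partner mac has been reported")
    if any(agg != bond_agg_id for agg in slave_agg_ids.values()):
        warnings.append("at least one slave has a different aggregator id")
    return warnings


# Illustrative per-slave aggregator IDs (assumed, not from the report).
slaves = {"enp5s0f0": "1", "enp4s0f1": "2"}

# Stale caps (zero partner MAC) versus refreshed caps (real partner MAC).
print(bond_warnings("4", ZERO_MAC, "1", slaves))
print(bond_warnings("4", "78:fe:3d:30:bb:80", "2", slaves))
```

With stale caps both fragments are produced; once the partner MAC is reported, only the aggregator ID fragment remains, which matches the expected result above.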

Additional info:
1. On the host machine, install the attached hook file (bond_check_hook) in the following location:
/usr/libexec/vdsm/hooks/after_get_caps/
2. Copy the attached jq file to the /usr/bin directory
3. Run jq '.bondings' /tmp/hookFileJson to see the VDSM output
4. Run cat /sys/class/net/bond0/bonding/ad_partner_mac to see the partner MAC reported by the OS

Before restarting vdsmd:
------------------------
[root@puma22 bin]# jq '.bondings' /tmp/hookFileJson 
{
  "bond0": {
    "ipv6autoconf": false,
    "ad_partner_mac": "00:00:00:00:00:00",
    "speed": 11000,
    "dhcpv6": false,
    "ipv6addrs": [],
    "netmask": "",
    "active_slave": "",
    "ad_aggregator_id": "1",
    "dhcpv4": false,
    "switch": "legacy",
    "ipv4defaultroute": false,
    "ipv4addrs": [],
    "hwaddr": "00:9c:02:b0:9f:2c",
    "slaves": [
      "enp5s0f0",
      "enp4s0f1"
    ],
    "mtu": "1500",
    "ipv6gateway": "::",
    "gateway": "",
    "opts": {
      "mode": "4",
      "xmit_hash_policy": "2"
    },
    "addr": ""
  }
}
[root@puma22 bin]# 


After restarting vdsmd:
-----------------------
[root@puma22 bin]# jq '.bondings' /tmp/hookFileJson 
{
  "bond0": {
    "ipv6autoconf": false,
    "ad_partner_mac": "78:fe:3d:30:bb:80",
    "speed": 11000,
    "dhcpv6": false,
    "ipv6addrs": [],
    "netmask": "",
    "active_slave": "",
    "ad_aggregator_id": "2",
    "dhcpv4": false,
    "switch": "legacy",
    "ipv4defaultroute": false,
    "ipv4addrs": [],
    "hwaddr": "00:9c:02:b0:9f:2c",
    "slaves": [
      "enp5s0f0",
      "enp4s0f1"
    ],
    "mtu": "1500",
    "ipv6gateway": "::",
    "gateway": "",
    "opts": {
      "mode": "4",
      "xmit_hash_policy": "2"
    },
    "addr": ""
  }
}
[root@puma22 bin]# 

The OS partner MAC was '78:fe:3d:30:bb:80' both before and after restarting vdsmd:
[root@puma22 after_get_caps]# cat  /sys/class/net/bond0/bonding/ad_partner_mac 
78:fe:3d:30:bb:80
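
The manual comparison above (jq on the dumped caps versus cat on the sysfs key) could also be scripted. A rough sketch, assuming the hook dumps the full caps JSON with a top-level "bondings" key to /tmp/hookFileJson as in this report; the function names are hypothetical:

```python
import json
from pathlib import Path

ZERO_MAC = "00:00:00:00:00:00"


def vdsm_partner_mac(caps_dump, bond):
    """Partner MAC as found in the dumped caps JSON (path written by the hook)."""
    caps = json.loads(Path(caps_dump).read_text())
    return caps["bondings"][bond].get("ad_partner_mac", ZERO_MAC)


def os_partner_mac(bond, sysfs_root="/sys/class/net"):
    """Partner MAC as reported by the kernel bonding driver via sysfs."""
    path = Path(sysfs_root) / bond / "bonding" / "ad_partner_mac"
    return path.read_text().strip()


def partner_mac_mismatch(caps_dump, bond, sysfs_root="/sys/class/net"):
    """True when the dumped caps and the OS disagree on the partner MAC."""
    reported = vdsm_partner_mac(caps_dump, bond).lower()
    actual = os_partner_mac(bond, sysfs_root).lower()
    return reported != actual
```

On the affected host, `partner_mac_mismatch("/tmp/hookFileJson", "bond0")` would be expected to return True before the restart and False afterwards, given the outputs shown above.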

Comment 1 Roni 2019-01-17 16:41:17 UTC
Created attachment 1521313 [details]
jq

Comment 2 Dominik Holler 2019-01-22 11:26:29 UTC
Bell, can you please have a look at whether we can introduce a functional test for this?

Comment 3 Bell Levin 2019-01-23 14:03:08 UTC
Roni, it takes time for the bond to get the updated ad_partner_mac, since it has to learn it from the peer interface (usually a switch port).
How long it takes usually depends on the switch configuration (the hello message rate). When the engine performs the setup, it asks for caps; if the caps have not been updated by that time, it is the operator's responsibility to refresh the caps.

Additional info:
- The connection is set up by two nics and a mode 4 bond. There is also a bond on the switch side.
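
Since the partner MAC only appears once the peer has answered, a functional test or script may need to wait for it before asking for caps. A minimal polling sketch, assuming the standard sysfs key named in this report; `wait_for_partner_mac` and its defaults are hypothetical, and on some kernels reading this key may require root:

```python
import time
from pathlib import Path

ZERO_MAC = "00:00:00:00:00:00"


def wait_for_partner_mac(bond, timeout=30.0, interval=1.0,
                         sysfs_root="/sys/class/net"):
    """Poll the bond's ad_partner_mac until it is non-zero.

    Returns the partner MAC, or None if the timeout expires first.
    """
    path = Path(sysfs_root) / bond / "bonding" / "ad_partner_mac"
    deadline = time.monotonic() + timeout
    while True:
        try:
            mac = path.read_text().strip()
        except OSError:
            # Bond not created yet, or the key is not readable.
            mac = ZERO_MAC
        if mac and mac != ZERO_MAC:
            return mac
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A caller could then refresh the caps only after this returns a MAC, rather than immediately after creating the bond.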

Comment 4 Roni 2019-01-29 07:00:18 UTC
Hi Bell

I also mentioned in my description that the message changed from incorrect to correct after a long time
or after restarting vdsm. I tried refreshing the caps, but it didn't help.
What I don't understand is why it takes so long to display the correct message
while the partner MAC is already updated in the OS from the beginning
(cat /sys/class/net/bond0/bonding/ad_partner_mac)

Thx
Roni

Comment 5 Bell Levin 2019-01-29 14:01:00 UTC
Roni,

I was able to reproduce this scenario, but I was able to change the message with get caps 100% of the time.
I tried it from two seconds after the bond was created up to a minute after, and it always worked, which is the desired behavior.

If you can indeed reproduce "get caps" not working, can you tell me how to reproduce it (hitting get caps and the message not changing)?

Comment 6 Roni 2019-01-29 15:09:43 UTC
Bell,

Yes, I see it too now; the get caps issue does not reproduce.
From my perspective, get caps is a workaround, not the desired behavior.
The message is expected to update at the time the following sysfs key
is updated: /sys/class/net/bond0/bonding/ad_partner_mac
and that key is updated immediately when the bond is created.
Anyway, I think it's a very low-priority issue and it can be closed if it is difficult to fix.

Comment 7 Dominik Holler 2019-01-29 15:22:21 UTC
If a bug about adding a notification from VDSM to Engine already exists, we should reference it here.

Comment 8 Bell Levin 2019-01-30 21:58:21 UTC
bz #1240719
bz #999947

It looks like those were filed, at the time, with the same intention that Roni mentioned in his last comment.

Comment 9 Bell Levin 2019-01-31 10:13:10 UTC

*** This bug has been marked as a duplicate of bug 999947 ***

