Bug 1923999

Summary: When adding host to RHV-M with LACP bond, 'ad_actor_system=00:00:00:00:00:00' is seen in messages [rhel-8.3.0.z] again
Product: Red Hat Enterprise Linux 8
Reporter: nsurati
Component: NetworkManager
Assignee: Thomas Haller <thaller>
Status: CLOSED ERRATA
QA Contact: Vladimir Benes <vbenes>
Severity: urgent
Priority: urgent
Version: 8.3
CC: acardace, adevolder, atragler, bgalvani, ferferna, fge, jiji, jishi, jmaxwell, larnone, lrintel, mkalinin, network-qe, nsurati, rkhan, sukulkar, thaller, till, tpelka, tquinlan, vbenes
Target Milestone: rc
Keywords: Regression, Triaged
Target Release: 8.0
Hardware: Unspecified
OS: Linux
Fixed In Version: NetworkManager-1.30.0-2.el8
Clone Of:
: 1931881 1940435
Last Closed: 2021-05-18 13:32:37 UTC
Type: Bug
Bug Blocks: 1929262, 1940435
Attachments:
  System logs with NM trace enabled (flags: none)

Description nsurati 2021-02-02 12:12:58 UTC
Description of problem:

After upgrading RHV-H from 4.4.1 to 4.4.3 and reconfiguring the bond, the message "failed to set bonding attribute 'ad_actor_system' to '00:00:00:00:00:00'" still appears.

Version-Release number of selected component (if applicable):

4.4.3

How reproducible:


Steps to Reproduce:
1. Install RHV-H 4.4.1, upgrade to RHV-H 4.4.3, and configure bond0 as LACP.
2. 'ad_actor_system=00:00:00:00:00:00' is seen in messages.

Actual results:

The following messages are logged:

Feb  1 11:42:40 rhhipmuc04 kernel: bond0: Invalid ad_actor_system MAC address.
Feb  1 11:42:40 rhhipmuc04 kernel: bond0: option ad_actor_system: invalid value (00:00:00:00:00:00)
Feb  1 11:42:40 rhhipmuc04 NetworkManager[3149]: <error> [1612179760.6023] platform-linux: sysctl: failed to set 'bonding/ad_actor_system' to '00:00:00:00:00:00': (22) Invalid argument
Feb  1 11:42:40 rhhipmuc04 NetworkManager[3149]: <warn>  [1612179760.6024] device (bond0): failed to set bonding attribute 'ad_actor_system' to '00:00:00:00:00:00'
Feb  1 11:42:40 rhhipmuc04 kernel: bond0: option fail_over_mac: unable to set because the bond device has slaves

Expected results:

There should be no warning messages.

Additional info:

This bug was resolved by BZ 1890497; however, the messages still appear after applying that fix.

Comment 12 Gris Ge 2021-02-23 07:45:15 UTC
Even with the correct setting, NetworkManager still complains:

sysctl: failed to set 'bonding/ad_actor_system' to '00:00:00:00:00:00': (22) Invalid argument


Reproducer:

 * sudo nmcli c add type bond connection.id bond0 ifname bond0 ipv4.method disabled ipv6.method disabled
 * sudo journalctl  -t NetworkManager  --since -1m -p 3


Meanwhile, this is just a harmless error message. If possible, please advise the customer to ignore it until our fix is available.
Changing to the NetworkManager component.

Comment 13 Gris Ge 2021-02-23 07:52:51 UTC
The error message is harmless: the value (00:00:00:00:00:00) is already the default value in the kernel.
The kernel is in the state we requested; NetworkManager is just showing a misleading message.
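Until a fixed package is installed, one way to keep this message from hiding real problems is to filter it out when reviewing the journal. The following is a minimal sketch, not part of NetworkManager: the function name `filter_known_bond_noise` is hypothetical, and the matched strings are copied from the log lines quoted in this report.

```shell
# Hypothetical helper (not part of NetworkManager): read log lines on stdin,
# drop the two known-harmless ad_actor_system messages quoted in this report,
# and pass every other line through unchanged.
# Only the all-zero MAC is matched, so failures for any other value stay visible.
# Note: grep exits with status 1 when no lines remain after filtering.
filter_known_bond_noise() {
    grep -v -F \
        -e "failed to set 'bonding/ad_actor_system' to '00:00:00:00:00:00'" \
        -e "failed to set bonding attribute 'ad_actor_system' to '00:00:00:00:00:00'"
}
```

It could be combined with the journalctl call from comment 12, for example `sudo journalctl -t NetworkManager --since -1m -p 4 | filter_known_bond_noise` (priority 4 is used here so that the `<warn>` line is included as well).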

Comment 14 Gris Ge 2021-02-23 08:16:56 UTC
Created attachment 1758781 [details]
System logs with NM trace enabled

Comment 15 Thomas Haller 2021-02-23 13:01:39 UTC
(In reply to Gris Ge from comment #12)
>  * sudo nmcli c add type bond connection.id bond0 ifname bond0 ipv4.method
> disabled ipv6.method disabled

It needs:

   nmcli c add type bond connection.id bond0 ifname bond0 ipv4.method disabled ipv6.method disabled bond.options 'mode=802.3ad'



But this seems more of a kernel issue:


    ip link add name bond1 type bond

    cat /sys/class/net/bond1/bonding/ad_actor_system
    # no output

    echo 802.3ad > /sys/class/net/bond1/bonding/mode 
    cat /sys/class/net/bond1/bonding/ad_actor_system
    # output 00:00:00:00:00:00

    echo 00:00:00:00:00:00 > /sys/class/net/bond1/bonding/ad_actor_system
    # -bash: echo: write error: Invalid argument

    echo 00:00:00:00:00:01 > /sys/class/net/bond1/bonding/ad_actor_system
    cat /sys/class/net/bond1/bonding/ad_actor_system
    # output 00:00:00:00:00:01

    echo 00:00:00:00:00:00 > /sys/class/net/bond1/bonding/ad_actor_system
    # -bash: echo: write error: Invalid argument



Meaning: in 802.3ad mode, the default value is 00:00:00:00:00:00, and the kernel documentation even says: "If the value is not given then system defaults to using the masters' mac address as actors' system address."
Nevertheless, setting 00:00:00:00:00:00 via sysfs is always rejected with EINVAL.

That means,

  - you cannot set the current value (setting the value that is already set should not be an error)

  - after setting a different MAC address, you cannot reset the default value.
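Given the two constraints above, a caller that writes this attribute can avoid the spurious EINVAL by checking before it writes. The following is a minimal sketch under stated assumptions, not the upstream NetworkManager change: the helper name `set_ad_actor_system` and its return codes are hypothetical, and MAC addresses are compared as plain strings in the lowercase form that sysfs printed in the transcript above.

```shell
# Hypothetical helper: write ad_actor_system only when the value would
# actually change and the kernel is expected to accept it.
# Usage: set_ad_actor_system ATTR_FILE MAC
# Returns 0 on success or no-op, 2 when the all-zero MAC cannot be restored.
set_ad_actor_system() {
    attr_file=$1
    wanted=$2
    zero_mac="00:00:00:00:00:00"

    # Read the current value; the read is empty when the attribute has no
    # content (as seen above before the bond was switched to 802.3ad).
    current=$(cat "$attr_file" 2>/dev/null)

    # Nothing to do if the kernel already reports the requested value.
    if [ "$current" = "$wanted" ]; then
        return 0
    fi

    # The kernel rejects the all-zero MAC with EINVAL, so do not attempt the
    # write; report it with a distinct status instead of logging an error.
    if [ "$wanted" = "$zero_mac" ]; then
        return 2
    fi

    printf '%s\n' "$wanted" > "$attr_file"
}
```

As root, this could be called as `set_ad_actor_system /sys/class/net/bond1/bonding/ad_actor_system 00:00:00:00:00:01`. A return status of 2 corresponds to the second case above: a non-default value is set and the default cannot be written back through sysfs.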

Comment 16 Thomas Haller 2021-02-23 13:41:28 UTC
Fixed upstream by https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/9e7af314546d7912ee23b3850230008902aca4d3 .

Note that this *only* avoids a warning in the logfile. Otherwise, there is no change (as there was no bug in NetworkManager).


However, I think this is a kernel issue. I reported bug 1931881 for that.

It's not very severe, since it only happens if you want to change ad_actor_system back to "00:00:00:00:00:00", either via `nmcli device reapply` or by activating a profile on an existing bond interface.

Comment 20 Marina Kalinin 2021-02-23 17:57:20 UTC
Nirav,
Can you please ensure we have a KCS about this, stating that with 8.3 the bond issue is resolved and the message itself is harmless, as explained above?
I ask because we have started getting more cases on it.

Comment 23 Vladimir Benes 2021-02-24 08:32:56 UTC
A slight modification to catch this error was added to NMCI:
https://gitlab.freedesktop.org/NetworkManager/NetworkManager-ci/-/commit/eaecfae8b8963e91ac83b7a5000e6cd4b2b82957

Comment 28 errata-xmlrpc 2021-05-18 13:32:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: NetworkManager and libnma security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1574