Bug 1490741

Summary: Upping an already up bond connection will change the bond MAC [rhel-7.4.z]
Product: Red Hat Enterprise Linux 7
Reporter: Oneata Mircea Teodor <toneata>
Component: NetworkManager
Assignee: Beniamino Galvani <bgalvani>
Status: CLOSED ERRATA
QA Contact: Desktop QE <desktop-qa-list>
Severity: high
Docs Contact:
Priority: high
Version: 7.4
CC: atragler, bgalvani, danken, edwardh, fgiudici, igkioka, lrintel, mburman, rkhan, salmy, snagar, sukulkar, thaller, vbenes
Target Milestone: rc
Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: NetworkManager-1.8.0-10.el7_4
Doc Type: If docs needed, set a value
Doc Text:
Previously, the NetworkManager service by default set the MAC address of a master device (bond, bridge, team) to a random value generated by the kernel. As a consequence, the MAC address of the device changed every time the device was activated. The bug has been fixed, and NetworkManager no longer changes the MAC address of a master device by default. Instead, the interface inherits the MAC address from one of its slaves.
Story Points: ---
Clone Of: 1472965
Environment:
Last Closed: 2017-10-19 15:00:27 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1472965    
Bug Blocks:    
Attachments:
NM logs - rhv scenario failed (flags: none)

Description Oneata Mircea Teodor 2017-09-12 07:33:50 UTC
This bug has been copied from bug #1472965 and has been proposed to be backported to 7.4 z-stream (EUS).

Comment 3 Dan Kenigsberg 2017-09-27 09:57:07 UTC
Michael, would you be so kind as to assist in the verification (of the RHV use case)? (Sorry for my earlier glitch.)

Comment 4 Vladimir Benes 2017-09-27 10:43:54 UTC
We were able to reproduce the issue and verify the fix. An automated test is in place to prevent future regressions of this type, so I think there is no need to give it extra care. But of course one more round of verification cannot hurt.

Comment 5 Michael Burman 2017-09-27 11:02:06 UTC
(In reply to Dan Kenigsberg from comment #3)
> Michael, would you be so kind as to assist in the verification (of the RHV use
> case)? (Sorry for my earlier glitch.)

Dan, I verified this two months ago - https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=13723808
I will do it again once we have formal builds with:
1) NetworkManager available with this fix. QA is currently still using NetworkManager-1.8.0-9.el7.x86_64.

2) An RHV-H build with this NetworkManager.

Once we get the builds, I will run my tests again to make sure we have this covered.

Comment 6 Vladimir Benes 2017-09-27 12:16:25 UTC
There you are:
https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=596448

official 1.8.0-10 packages

Comment 7 Michael Burman 2017-09-27 12:44:52 UTC
(In reply to Vladimir Benes from comment #6)
> There you are:
> https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=596448
> 
> official 1.8.0-10 packages

This is not what I meant. I know I can find the official package in Brew.
I don't intend to install it without an official update for RHEL; until we (QA) get the official build, I'm not planning to test it. This is not the flow.
And it will certainly take some time until this NM version is included in RHV-H.
So for now the RHV scenarios are not going to be tested.
Only with official builds.
Thanks, Vladimir :)

Comment 8 Vladimir Benes 2017-09-29 08:13:02 UTC
(In reply to Michael Burman from comment #7)
> (In reply to Vladimir Benes from comment #6)
> > There you are:
> > https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=596448
> > 
> > official 1.8.0-10 packages
> 
> This is not what I meant. I know I can find the official package in Brew.
> I don't intend to install it without an official update for RHEL; until
> we (QA) get the official build, I'm not planning to test it. This is
> not the flow.

I always thought the QA flow was to test packages prior to release, not after. I was just wondering whether you wanted to test the fix that is landing in the z-stream for you before it is released into the wild. That's it.

Comment 9 Michael Burman 2017-10-01 07:55:56 UTC
I managed to reproduce this bug on NetworkManager-1.8.0-10.el7_4.x86_64.

The RHV use case has failed.
After creating the bond using nmcli, the bond got a random MAC address generated by the kernel.

[root@silver-vdsb ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
    link/ether b6:09:24:9d:31:04 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP qlen 1000
    link/ether b6:09:24:9d:31:04 brd ff:ff:ff:ff:ff:ff
4: enp12s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:17:3d:15:a0 brd ff:ff:ff:ff:ff:ff
5: enp12s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:17:3d:15:a1 brd ff:ff:ff:ff:ff:ff
6: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether b6:09:24:9d:31:04 brd ff:ff:ff:ff:ff:ff
    inet 10.35.128.x/24 brd 10.35.128.255 scope global dynamic bond1
       valid_lft 42631sec preferred_lft 42631sec
    inet6 fe80::b409:24ff:fe9d:3104/64 scope link 
       valid_lft forever preferred_lft forever

b6:09:24:9d:31:04 is a random MAC address; it does not belong to the eno1 slave or the eno2 slave.
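
(As a side note, one way to check where a bond's MAC comes from, assuming the standard bonding proc/sysfs interface, is to compare the bond's current address with the slaves' permanent hardware addresses, roughly like this:)

grep -E 'Slave Interface|Permanent HW addr' /proc/net/bonding/bond1   # permanent MAC of each slave as recorded by the bonding driver
cat /sys/class/net/bond1/address                                      # MAC the bond is currently using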

[root@silver-vdsb ~]# nmcli c s
NAME         UUID                                  TYPE            DEVICE 
System eno1  58c95e36-21d7-4342-962a-48fda5b0e554  802-3-ethernet  eno1   
eno2         ae01acc4-85fa-4808-bcdc-01e160afac7f  802-3-ethernet  --     
enp12s0f0    ce85fe72-edec-4f91-868f-04d1acf05ee5  802-3-ethernet  --     
enp12s0f1    d5491205-d146-4840-a295-ec29377effbb  802-3-ethernet  --     
[root@silver-vdsb ~]# nmcli con mod uuid 58c95e36-21d7-4342-962a-48fda5b0e554 ipv4.method disabled ipv6.method ignore; \
> nmcli connection modify uuid 58c95e36-21d7-4342-962a-48fda5b0e554 connection.slave-type bond connection.master bond1 connection.autoconnect yes; \
> nmcli connection modify id eno2 connection.slave-type bond connection.master bond1 connection.autoconnect yes; \
> nmcli connection add type bond con-name bond1 ifname bond1 mode active-backup primary eno1; \
> nmcli connection modify id bond1 ipv4.method auto ipv6.method ignore; \
> nmcli con down uuid 58c95e36-21d7-4342-962a-48fda5b0e554; \
> nmcli con up uuid 58c95e36-21d7-4342-962a-48fda5b0e554; \
> nmcli con down id eno2; \
> nmcli con up id eno2; \
> nmcli con up id bond1
Connection 'bond1' (982702f1-e8fe-4f9a-9d25-6f3275320ba7) successfully added.

Bottom line: the bug has been reproduced on NetworkManager-1.8.0-10.el7_4.x86_64.
Attaching the log.

Comment 10 Michael Burman 2017-10-01 07:56:30 UTC
Created attachment 1332878 [details]
NM logs - rhv scenario failed

Comment 11 Michael Burman 2017-10-01 13:54:56 UTC
Update: the bug is still reproduced after updating from 1.8.0-9.el7_4.x86_64 to 1.8.0-10.el7_4.x86_64 if the server is not rebooted and NetworkManager is not restarted.

I have to say it is very strange that NetworkManager is not restarted during the update process, so the fix is not effective unless it is restarted manually.
I created my bond right after the update and hit the bug.
If I restart NM after the update, or reboot the server, and then create my bond, NM no longer generates random MAC addresses.

I guess this is all expected behavior from the NetworkManager side, but honestly it feels quite wrong.
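
(For completeness, the kind of check-and-restart sequence I mean is roughly the following; the nmcli VERSION field name is from memory, so treat this as a sketch rather than an exact recipe:)

rpm -q NetworkManager              # version of the package installed on disk
systemctl restart NetworkManager   # restart so the running daemon actually picks up the updated code
nmcli -t -f VERSION general        # version reported by the running daemon; should now match the rpm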

Comment 12 Thomas Haller 2017-10-02 10:32:50 UTC
(In reply to Michael Burman from comment #11)
> Update: the bug is still reproduced after updating from 1.8.0-9.el7_4.x86_64 to
> 1.8.0-10.el7_4.x86_64 if the server is not rebooted and NetworkManager is not restarted.

The upgrade process doesn't know which running processes are affected, so it doesn't know what to restart.
For example, a relevant fix might not be in the NetworkManager rpm itself but in one of the several libraries it uses. When those libraries are upgraded, no services are restarted either.

Even if the upgrade process knew which processes are affected by the upgrade, automatically restarting programs would be a bad idea. For example, NM disconnects Wi-Fi when exiting, so if you run `yum upgrade` over SSH+Wi-Fi, you might cut yourself off. Getting this right is hard enough that it doesn't seem worth the effort.

Updating the package without rebooting is at your own risk. The admin has to ensure that all affected programs (at least the ones he cares about) are restarted, and getting that right can be a very complex problem.
So just reboot if you want to be sure and the system is important to you. On a toy system, feel free not to reboot.
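
(If you want a starting point, the needs-restarting tool from the yum-utils package gives a rough list of processes that are still running binaries or libraries older than the installed packages; treat it as a hint, not a guarantee:)

yum install -y yum-utils   # provides /usr/bin/needs-restarting, if it is not already installed
needs-restarting           # lists processes started before updates to the packages they use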


> I have to say it is very strange that NetworkManager is not restarted
> during the update process, so the fix is not effective unless it is
> restarted manually.
> I created my bond right after the update and hit the bug.
> If I restart NM after the update, or reboot the server, and then create my
> bond, NM no longer generates random MAC addresses.
> 
> I guess this is all expected behavior from the NetworkManager side, but
> honestly it feels quite wrong.

Comment 15 errata-xmlrpc 2017-10-19 15:00:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2925