Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1497333

Summary: Addition inactive connection gets created for team (runner:lacp) upon NetworkManager restart
Product: Red Hat Enterprise Linux 7 Reporter: Venkatesh Kavtikwar <vkavtikw>
Component: NetworkManagerAssignee: Beniamino Galvani <bgalvani>
Status: CLOSED ERRATA QA Contact: Desktop QE <desktop-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 7.4CC: atragler, bgalvani, fgiudici, lrintel, rkhan, salmy, sukulkar, thaller, vbenes
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 13:29:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
[PATCH] libnm-core: normalize "tx_hash" when comparing team config
none
[PATCH v2] libnm-core: normalize "tx_hash" when comparing team config none

Description Venkatesh Kavtikwar 2017-09-29 19:52:42 UTC
Description of problem:

Addition inactive connection gets created for team (runner:lacp) upon NetworkManager restart 


Version-Release number of selected component (if applicable):

NetworkManager-1.8.0-9.el7.x86_64
kernel-3.10.0-693.el7.x86_64

How reproducible:

Create a team interlace with "lacp" runner


Steps to Reproduce:

# nmcli connection add type team con-name tesm0 ifname team0 config '{"runner":{"name": "lacp"}}' ipv4.method manual ipv4.address 10.0.0.1/24

# nmcli connection add type team-slave ifname ens9 con-name team-slave1 master team0

# nmcli connection add type team-slave ifname ens10 con-name team-slave2 master team0


# nmcli connection show 
NAME         UUID                                  TYPE            DEVICE 
eth0         57bc792e-9a34-4d46-b523-20fcfdaccfed  802-3-ethernet  eth0   
eth1         e3d35e86-9f34-429d-8b14-359f589e8fb6  802-3-ethernet  eth1   
team-slave1  00e6cfa9-3a16-45c7-a813-648de5588891  802-3-ethernet  ens9   
team-slave2  50092145-6592-44f3-b8ed-84e4df61521d  802-3-ethernet  ens10  
team0        a00c11e9-2696-4c4e-b89d-a7680f626e4c  team            team0  

# systemctl restart NetworkManager

# nmcli connection show 
NAME         UUID                                  TYPE            DEVICE 
eth0         57bc792e-9a34-4d46-b523-20fcfdaccfed  802-3-ethernet  eth0   
eth1         e3d35e86-9f34-429d-8b14-359f589e8fb6  802-3-ethernet  eth1   
team-slave1  00e6cfa9-3a16-45c7-a813-648de5588891  802-3-ethernet  ens9   
team-slave2  50092145-6592-44f3-b8ed-84e4df61521d  802-3-ethernet  ens10  
team0        f6294832-165e-4811-887c-5b712c01f915  team            team0  --- additional connection where team0 gets new UUID
team0        a00c11e9-2696-4c4e-b89d-a7680f626e4c  team            --      --- old team0 connection becomes inactive

Please help us understand why is this behaviour?


Workaround :-

team0 connection restart remove the newly created UUID & restores the old one.

# nmcli connection down team0
# nmcli connection up team0

# nmcli connection show 
NAME         UUID                                  TYPE            DEVICE 
eth0         57bc792e-9a34-4d46-b523-20fcfdaccfed  802-3-ethernet  eth0   
eth1         e3d35e86-9f34-429d-8b14-359f589e8fb6  802-3-ethernet  eth1   
team-slave1  00e6cfa9-3a16-45c7-a813-648de5588891  802-3-ethernet  ens9   
team-slave2  50092145-6592-44f3-b8ed-84e4df61521d  802-3-ethernet  ens10  
team0        a00c11e9-2696-4c4e-b89d-a7680f626e4c  team            team0 

Expected results:

NetworkManager restart should not create additional interface and the interface should be persistent/constants across restarts.


Additional info:

We have enabled the NM debug mode and captured log is showing below message

NetworkManager[3098]: <debug> [1506709730.8493] Connection 'team0' differs from candidate 'team0' in team.config


Please help us understand and debug this further.

Comment 2 Beniamino Galvani 2017-10-09 19:15:40 UTC
Created attachment 1336543 [details]
[PATCH] libnm-core: normalize "tx_hash" when comparing team config

Or maybe, instead of adding such ugly code, we should declare team.config as not INFERRABLE so that it is not compared when matching connections. Now NM knows what connection was active using the state file and so probably we can relax most of the connection matching checks?

Comment 3 Thomas Haller 2017-10-18 16:55:08 UTC
(In reply to Beniamino Galvani from comment #2)
> Created attachment 1336543 [details]
> [PATCH] libnm-core: normalize "tx_hash" when comparing team config
> 

the patch looks good to me, but could you add a test that hits these code paths?
in test_nm_utils_team_config_equal().

I am especially curious about running this code with valgrind, to see if there are leaks.

> Or maybe, instead of adding such ugly code, we should declare team.config as
> not INFERRABLE so that it is not compared when matching connections. Now NM
> knows what connection was active using the state file and so probably we can
> relax most of the connection matching checks?

Maybe we want to relax our checks during a restart (because at that point, the state file already clearly indicates which connection to assume).

but note that we also call the matching during first-start. In that case, we probably want to do a stricter checking. See https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=c828277872997dc0a90f30ff5a70705eec0979c3

Comment 4 Beniamino Galvani 2017-10-19 08:37:04 UTC
Created attachment 1340595 [details]
[PATCH v2] libnm-core: normalize "tx_hash" when comparing team config

Comment 5 Beniamino Galvani 2017-10-19 08:41:28 UTC
(In reply to Thomas Haller from comment #3)
> the patch looks good to me, but could you add a test that hits these code
> paths?
> in test_nm_utils_team_config_equal().

Done

> 
> I am especially curious about running this code with valgrind, to see if
> there are leaks.

There are no leaks as the *_new() functions take ownership of the argument and the *_get() return a borrowed reference.

> > Or maybe, instead of adding such ugly code, we should declare team.config as
> > not INFERRABLE so that it is not compared when matching connections. Now NM
> > knows what connection was active using the state file and so probably we can
> > relax most of the connection matching checks?
> 
> Maybe we want to relax our checks during a restart (because at that point,
> the state file already clearly indicates which connection to assume).
> 
> but note that we also call the matching during first-start. In that case, we
> probably want to do a stricter checking. See
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/
> ?id=c828277872997dc0a90f30ff5a70705eec0979c3

I agree, we should do that in addition to the normalization patch (which is needed on first start).

Comment 6 Thomas Haller 2017-10-19 11:28:56 UTC
(In reply to Beniamino Galvani from comment #5)
> (In reply to Thomas Haller from comment #3)
> > I am especially curious about running this code with valgrind, to see if
> > there are leaks.
> 
> There are no leaks as the *_new() functions take ownership of the argument
> and the *_get() return a borrowed reference.

I didn't really doubt that, I just wanted valgrind to confirm that during unit-tests.

Patch lgtm

Comment 10 Vladimir Benes 2018-01-11 23:25:13 UTC
automated, tested, working well

Comment 13 errata-xmlrpc 2018-04-10 13:29:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0778