Bug 2076131 - Lose unmanaged port when rollback a linux bridge
Summary: Lose unmanaged port when rollback a linux bridge
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: NetworkManager
Version: 8.6
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Lubomir Rintel
QA Contact: Vladimir Benes
URL:
Whiteboard:
: 2076132 (view as bug list)
Depends On: 2035519
Blocks: 2076132
Reported: 2022-04-18 03:42 UTC by Gris Ge
Modified: 2022-12-05 11:31 UTC (History)
CC List: 9 users

Fixed In Version: NetworkManager-1.39.90-1.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2076132 (view as bug list)
Environment:
Last Closed: 2022-11-08 10:10:31 UTC
Type: Bug
Target Upstream Version:
Embargoed:
pm-rhel: mirror+


Attachments
Reproducer script (984 bytes, application/x-shellscript)
2022-04-18 03:42 UTC, Gris Ge
logfile1 (682.88 KB, text/plain)
2022-04-19 09:35 UTC, Thomas Haller


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHELPLAN-119093 | 0 | None | None | None | 2022-04-18 03:51:41 UTC
Red Hat Product Errata | RHBA-2022:7680 | 0 | None | None | None | 2022-11-08 10:10:55 UTC
freedesktop.org Gitlab | NetworkManager NetworkManager merge_requests 1208 | 0 | None | merged | bridge: ensure wired setting is always there so that wired.mtu changes can be reliably reapplied | 2022-05-19 12:13:49 UTC

Description Gris Ge 2022-04-18 03:42:12 UTC
Description of problem:

A linux bridge with an unmanaged veth port attached loses that unmanaged port on checkpoint rollback.


Version-Release number of selected component (if applicable):
NetworkManager-1.36.0-3.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. sudo ./bug.sh

Actual results:

vethtest1 detached from linux bridge

Expected results:

vethtest1 is still attached to the linux bridge

Additional info:

Comment 1 Gris Ge 2022-04-18 03:42:44 UTC
Created attachment 1873178 [details]
Reproducer script

Comment 2 Thomas Haller 2022-04-19 09:35:07 UTC
Created attachment 1873483 [details]
logfile1

Ran the script from comment 1 against NM from upstream main (a1ff31db3b473ccc35f754265395c6dee0e3926c).

Comment 3 Thomas Haller 2022-04-19 09:52:44 UTC
<info>  [1650359641.5058] audit: op="connection-activate" uuid="f187936a-ccac-4cc3-a1e4-78fd1bbdba71" name="brtest0" pid=2590 uid=0 result="success"
...
<debug> [1650359641.5263] platform: (vethtest1) link: releasing 17 from master 'brtest0' (18)
...
<info>  [1650359644.1261] audit: op="checkpoint-rollback" arg="/org/freedesktop/NetworkManager/Checkpoint/2" pid=2604 uid=0 result="success"


We see that the external port gets detached while reactivating the device. It does not happen during rollback.
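
To find where a port gets released in a larger log such as the attached logfile1, the "releasing ... from master" lines can be filtered out. This is a minimal sketch that assumes the debug log format shown in the excerpt above; the function name `list_port_releases` is hypothetical.

```shell
#!/bin/sh
# Read NetworkManager debug log lines on stdin and print "<port> <master>"
# for every line of the form:
#   platform: (<port>) link: releasing <ifindex> from master '<master>' (<ifindex>)
# Lines that do not match are dropped.
list_port_releases() {
    sed -n "s/.*platform: (\([^)]*\)) link: releasing [0-9]* from master '\([^']*\)'.*/\1 \2/p"
}
```

For example, `list_port_releases < logfile1` would print one line per released port. These messages are logged at debug level, so the log must have been captured with debug logging enabled.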


Via email, this issue was discussed in relation to bug 2035519. The error picture there was different, see the log at https://bugzilla.redhat.com/show_bug.cgi?id=2035519#c10. There, the port was attached during rollback.


It seems expected that `nmcli connection up` (which does a full (re)activation) detaches unknown ports.
It also seems expected that rollback does not restore ports that it knows nothing about. Note that the fix for bug 2035519 does not make `CheckpointCreate()` remember external ports and restore them. Instead, it makes `CheckpointRollback()` preserve currently attached, external ports.

If `CheckpointCreate()` remembered external ports (and later restored them), it would still not fix certain use cases:

1) in the reproducer, if you didn't do the rollback (because the configuration succeeded), then the external ports would already be lost during `nmcli connection up`.

2) in the reproducer, if any external ports are attached after the first CheckpointCreate(), they would not be restored during rollback, because they were not present when creating the checkpoint. If the original problem is that detaching external ports can break running containers, then this would still break containers that were started after CheckpointCreate.
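
For reference, the two D-Bus calls discussed here can be issued from a shell roughly as follows. This is a minimal sketch, assuming `busctl` is available and that an empty device array selects all devices. The helper names `run`, `checkpoint_create` and `checkpoint_rollback` and the `DRY_RUN` variable are hypothetical, and the default 60-second rollback timeout is an arbitrary choice. It only shows how a checkpoint is taken and rolled back; it does not change which ports are preserved.

```shell
#!/bin/sh
NM_BUS=org.freedesktop.NetworkManager
NM_PATH=/org/freedesktop/NetworkManager

# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# CheckpointCreate(ao devices, u rollback_timeout, u flags):
# empty device array, timeout in seconds from $1 (default 60), no flags.
checkpoint_create() {
    run busctl call "$NM_BUS" "$NM_PATH" "$NM_BUS" CheckpointCreate aouu 0 "${1:-60}" 0
}

# CheckpointRollback(o checkpoint): $1 is the checkpoint object path,
# e.g. /org/freedesktop/NetworkManager/Checkpoint/2 as seen in the log.
checkpoint_rollback() {
    run busctl call "$NM_BUS" "$NM_PATH" "$NM_BUS" CheckpointRollback o "$1"
}
```

Setting `DRY_RUN=1` prints the `busctl` command line instead of executing it, which is a convenient way to inspect the call before running it as root.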



Maybe `nmcli connection up` of a bridge profile should leave external ports attached. That is not what was discussed in bug 2035519 and is unrelated to rollback. It also seems conceptually rather ugly (that `nmcli connection up` does not bring the device into a fully known state, including dropping unknown ports).



this bug report discusses very little about the motivation or use-case of the problem. So it's unclear what problem we are trying to fix.
With respect to the reproducer script, NM does what it's implemented to do.

Comment 4 Thomas Haller 2022-04-19 10:12:07 UTC
> There, the port was attached during rollback.

s/attached/detached/

Comment 5 Thomas Haller 2022-04-20 10:01:42 UTC
(In reply to Thomas Haller from comment #3)
> this bug report discusses very little about the motivation or use-case of
> the problem. So it's unclear what problem we are trying to fix.
> With respect to the reproducer script, NM does what it's implemented to do.

Linking to https://bugzilla.redhat.com/show_bug.cgi?id=2035519#c5 is not sufficient.
For one, the scenario shown there (in the form of the log and what is discussed) is different from what the reproducer script here does.

It says:

> The desired behaviour for our use case is to put the bridge to the state at the time the Checkpoint/85 was captured.

but what about the 2 problems (comment 3)?


The question is still, why are you re-activating the bridge? Isn't that already a very disruptive operation, that must not be done while there are containers attached?
Maybe ActivateConnection() should have a mode to preserve attached, external ports...

Comment 6 Gris Ge 2022-05-17 07:43:48 UTC
The detailed use case and the original reporter are at bug 2035519.

Please check with engineers in that bug.

Comment 10 Vladimir Benes 2022-06-14 09:31:37 UTC
[root@gsm-r5s8-01 NetworkManager-ci]# rpm -q NetworkManager
NetworkManager-1.39.6-1.el8.x86_64
[root@gsm-r5s8-01 NetworkManager-ci]# sh bug.sh 
[root@gsm-r5s8-01 NetworkManager-ci]# nmcli  device 
DEVICE        TYPE      STATE         CONNECTION 
eth0          ethernet  connected     testeth0   
brtest0       bridge    connected     brtest0    
vethtest0     ethernet  connected     vethtest0  
eth1          ethernet  disconnected  --         
eth10         ethernet  disconnected  --         
eth2          ethernet  disconnected  --         
eth3          ethernet  disconnected  --         
eth4          ethernet  disconnected  --         
eth5          ethernet  disconnected  --         
eth6          ethernet  disconnected  --         
eth7          ethernet  disconnected  --         
eth8          ethernet  disconnected  --         
eth9          ethernet  disconnected  --         
vethtest0.ep  ethernet  unmanaged     --         
vethtest1     ethernet  unmanaged     --         
vethtest1.ep  ethernet  unmanaged     --         
lo            loopback  unmanaged     --  

and 
[root@gsm-r5s8-01 NetworkManager-ci]# rpm -q NetworkManager
NetworkManager-1.36.0-3.el8.x86_64
[root@gsm-r5s8-01 NetworkManager-ci]# sh bug.sh 
[root@gsm-r5s8-01 NetworkManager-ci]# nmcli  device 
DEVICE        TYPE      STATE         CONNECTION 
eth0          ethernet  connected     testeth0   
brtest0       bridge    connected     brtest0    
vethtest0     ethernet  connected     vethtest0  
eth1          ethernet  disconnected  --         
eth10         ethernet  disconnected  --         
eth2          ethernet  disconnected  --         
eth3          ethernet  disconnected  --         
eth4          ethernet  disconnected  --         
eth5          ethernet  disconnected  --         
eth6          ethernet  disconnected  --         
eth7          ethernet  disconnected  --         
eth8          ethernet  disconnected  --         
eth9          ethernet  disconnected  --         
vethtest0.ep  ethernet  unmanaged     --         
vethtest1     ethernet  unmanaged     --         
vethtest1.ep  ethernet  unmanaged     --         
lo            loopback  unmanaged     -- 


No idea where the difference is; I tend to move this back to ASSIGNED.
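
The two `nmcli device` listings above are identical, and neither shows which bridge a veth is attached to, so they may not be able to tell the two builds apart. A minimal sketch that checks bridge membership directly through sysfs is given below. The function names and the `SYSFS_ROOT` override are hypothetical; the override exists only so the functions can be tried against a fake directory tree.

```shell
#!/bin/sh
# List the ports currently attached to bridge $1, one per line, sorted.
# Ports of a linux bridge appear as entries under /sys/class/net/<bridge>/brif/.
bridge_ports() {
    brif="${SYSFS_ROOT:-/sys}/class/net/$1/brif"
    [ -d "$brif" ] || return 1
    ls -1 "$brif" | sort
}

# Succeed if port $2 is attached to bridge $1.
port_attached() {
    [ -e "${SYSFS_ROOT:-/sys}/class/net/$1/brif/$2" ]
}
```

After running the reproducer, `port_attached brtest0 vethtest1 && echo attached || echo detached` would then report whether the unmanaged port survived.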

Comment 11 Lubomir Rintel 2022-06-16 11:29:47 UTC
Please drop all occurrences of "2>/dev/null" from bug.sh,
so that we can see why nmstate didn't succeed.

Here's a more minimal test case that exercises what has been changed:

  $ nmcli c add type bridge con-name xbr0 ifname xbr0
  $ nmcli c modify xbr0 mtu 666
  $ nmcli d reapply xbr0

Should fail with the old build and succeed with the new one.

Comment 13 Vladimir Benes 2022-07-18 11:27:26 UTC
The new version of the test executed correctly with 1.39.10-1.

Comment 14 Gris Ge 2022-07-26 02:39:59 UTC
The reproducer in comment #0 failed on NetworkManager-1.39.10-30745.copr.8e8fed433f.el9.x86_64

Please investigate!

Comment 15 Thomas Haller 2022-07-28 12:23:42 UTC
Back to ASSIGNED based on comment 14.

Comment 25 errata-xmlrpc 2022-11-08 10:10:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7680

Comment 26 sfaye 2022-12-05 11:31:15 UTC
*** Bug 2076132 has been marked as a duplicate of this bug. ***

