Bug 2213258

Summary: NetworkManager MACSEC on a bond device - wobbly
Product: Red Hat Enterprise Linux 9
Reporter: lejeczek <peljasz>
Component: NetworkManager
Assignee: NetworkManager Development Team <nm-team>
Status: CLOSED MIGRATED
QA Contact: Desktop QE <desktop-qa-list>
Severity: high
Priority: unspecified
Version: CentOS Stream
CC: bgalvani, bstinson, ferferna, jwboyer, lrintel, rkhan, sfaye, sukulkar, thaller, till
Target Milestone: rc
Keywords: MigratedToJIRA, Reopened, Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Last Closed: 2023-09-21 10:19:52 UTC
Type: Bug
Attachments: debug NM boot log (flags: none)

Description lejeczek 2023-06-07 16:26:34 UTC
Description of problem:

Hi guys, this is perhaps not a bug but a suggestion for a future enhancement.
I have a MACSEC iface created on top of a 'bond' device; the bond is in 'broadcast' mode and is also managed by NM.
The connection is set with:
...
connection.interface-name:              macsec0
...
which results in the following (bear in mind that only a single 'macsec' iface is configured in NM):

-> $ ip macsec sh
8: macsec0: protect off validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off 
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: ee7e1fe4a2440001 on SA 0
    offload: off 
14: macsec1: protect on validate strict sc on sa on encrypt on send_sci on end_station off scb off replay off 
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 8cdcd4aae03c0001 on SA 2
        2: PN 4, state on, key bf...
    RXSC: a2c33455508c0001, state on
        2: PN 5403, state on, key bf..
    RXSC: 00110a6bf7b40001, state on

At which point - naturally - there is no connection to the peers.
I can do:
-> $ _CON=macsec-10.1.1-br; nmcli c d $_CON ; nmcli c u $_CON
and that "fixes" the issue:
-> $ ip macsec sh
16: macsec0: protect on validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off 
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 8cdcd4aae03c0001 on SA 0
    RXSC: 00110a6bf7b40001, state on
    RXSC: a2c33455508c0001, state on
    offload: off 
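
The difference between the two listings is easier to see when each interface is reduced to one line. The following is only a sketch and not part of the original report: the function name `macsec_summary` is made up for illustration, and it assumes the `ip macsec show` layout shown above (an interface header line starting with the ifindex, followed by indented `RXSC:` lines).

```shell
# Summarise `ip macsec show` output read from stdin: one line per interface
# with its "protect" setting and the number of receive channels (RXSC).
# Hypothetical helper; it assumes the output layout quoted in this report.
macsec_summary() {
  awk '
    # Interface header, e.g. "8: macsec0: protect off validate strict ..."
    /^[0-9]+: / {
      if (name != "") print name, protect, rxsc
      name = $2; sub(/:$/, "", name)
      protect = "?"
      for (i = 3; i < NF; i++) if ($i == "protect") protect = $(i + 1)
      rxsc = 0
      next
    }
    # Receive secure channel line, e.g. "    RXSC: a2c33455508c0001, state on"
    $1 == "RXSC:" { rxsc++ }
    END { if (name != "") print name, protect, rxsc }
  '
}
```

Used as `ip macsec show | macsec_summary`, the first listing above would give `macsec0 off 0` and `macsec1 on 2`, while the listing after the down/up cycle would give a single line, `macsec0 on 2`.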

I think you get the gist of it - I presume there is some "randomness" in how NM stands up the device stack and/or in how the system enumerates physical devices, or more.

I'm fiddling with it "this" way because, as I understand it, NM/kernel/drivers (and/or more) cannot enslave a 'macsec' device as of now; at least that is what NM tells me.

Here is 'nmcli' for macsec:
-> $ nmcli c add type macsec con-name macsec-10.1.1-br ifname macsec0 connection.autoconnect yes macsec.parent bond-1011 macsec.mode psk macsec.mka-cak d9... macsec.mka-ckn 7a... ipv4.method disabled ipv6.method disabled con.slave-type bridge con.master 99...

If..
-> $ nmcli c m macsec-10.1.1-br con.interface-name macsec1
then (after reboot)
8: macsec1: protect off validate strict sc off sa off encrypt on send_sci on end_station off scb off replay off 
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 0279de0480af0001 on SA 0
    offload: off 
14: macsec0: protect on validate strict sc on sa on encrypt on send_sci on end_station off scb off replay off 
    cipher suite: GCM-AES-128, using ICV length 16
    TXSC: 8cdcd4aae03c0001 on SA 2
        2: PN 12, state on, key 0ef9..
    RXSC: 00110a6aba140001, state on
        2: PN 1584, state on, key 0ef9..
    RXSC: 00110a6bf7b40001, state on
        2: PN 3204, state on, key 0ef92..
    offload: off 


Version-Release number of selected component (if applicable):

NetworkManager-1.43.8-1.el9.x86_64
NetworkManager-libnm-1.43.8-1.el9.x86_64
NetworkManager-team-1.43.8-1.el9.x86_64


Comment 1 Thomas Haller 2023-06-07 19:42:37 UTC
I don't understand what happens.

Are you saying, that you have one macsec profile in NetworkManager, but see two macsec profiles in ip-link? Who creates the other one?

You say, the wrong thing is present after reboot. Please enable `level=TRACE` logging as described in the DEBUGGING section in `man NetworkManager`. Then reboot and reproduce. Then provide the complete journal (of that boot). Also show the output of `nmcli connection`, `nmcli device`, and `ip -d link` and `ip macsec show`. And, for the relevant profiles also show `nmcli -o connection show "$PROFILE_NAME"`.
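
The outputs requested above could be gathered in one go with something like the following. This is a sketch only: the function name `collect_nm_state`, the output directory and the file names are assumptions, not anything NetworkManager provides. Note that journald rate limiting may drop messages at TRACE level, so it is worth checking the journal for gaps.

```shell
# To get TRACE logs from boot, one option is a drop-in such as
# (the file name is an assumption):
#   /etc/NetworkManager/conf.d/95-logging.conf
#     [logging]
#     level=TRACE
#     domains=ALL
# For the running daemon only: nmcli general logging level TRACE domains ALL

# Hypothetical helper: save the requested outputs into one directory.
# Failures are tolerated so that a missing tool does not abort the collection.
collect_nm_state() {
  out=${1:-./nm-debug}    # directory for the collected files
  profile=${2:-}          # optional profile name for "connection show"
  mkdir -p "$out" || return 1
  nmcli connection           > "$out/nmcli-connection.txt" 2>&1 || true
  nmcli device               > "$out/nmcli-device.txt"     2>&1 || true
  ip -d link                 > "$out/ip-d-link.txt"        2>&1 || true
  ip macsec show             > "$out/ip-macsec-show.txt"   2>&1 || true
  journalctl -b --no-pager   > "$out/journal-boot.txt"     2>&1 || true
  if [ -n "$profile" ]; then
    nmcli -o connection show "$profile" \
      > "$out/nmcli-connection-show.txt" 2>&1 || true
  fi
}
```

For example, `collect_nm_state ./nm-debug macsec-10.1.1-br` after the reboot that reproduces the problem; the resulting files can then be attached to the bug.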

Thank you.

> bear in mind only a single 'macsec' iface exists configured in NM

Btw, in NetworkManager you configure connection profiles (short "connections"), and by activating them the interface will be created by NetworkManager. It seems less confusing, if you think about having profiles, and what happens when you activate them, and which profiles are currently activated.

Comment 2 lejeczek 2023-06-08 07:09:06 UTC
I'll try my best to get that additional info, but in the meantime it should be trivial to reproduce at least certain bits:

a) I have a 2-port 10Gb NIC
b) I make a bond connection on top of that NIC, in "broadcast" mode
c) I make a MACSEC on top of that "bond" device/connection
d) lastly (probably optional and redundant; I might as well add the IP bits to the "bond" device/connection and stop there, though I've not tried that) I hook the MACSEC connection into a "bridge" connection/profile, where the bridge has all the IP bits set up for network communication.

All of the above is done with/under NM, as I described earlier.
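
Spelled out as commands, steps a) to d) might look roughly like the sketch below. Only `bond-1011`, `macsec0`, `macsec-10.1.1-br` and `bridge-conn` come from this report; the NIC port names, the bridge interface name `br0`, the address and the helper names `run` and `make_reproducer` are assumptions made for illustration. The MKA keys are elided in the report, so they have to be supplied through `MKA_CAK` and `MKA_CKN`. By default the commands are only printed.

```shell
# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Hypothetical reproducer for the bond -> macsec -> bridge stack.
make_reproducer() {
  port1=${PORT1:-ens1f0}   # assumed NIC port names
  port2=${PORT2:-ens1f1}
  if [ -z "${MKA_CAK:-}" ] || [ -z "${MKA_CKN:-}" ]; then
    echo "set MKA_CAK and MKA_CKN first" >&2
    return 1
  fi
  # d) bridge that carries the IP configuration (address is a placeholder)
  run nmcli connection add type bridge con-name bridge-conn ifname br0 \
      ipv4.method manual ipv4.addresses 192.0.2.10/24 ipv6.method disabled
  # b) bond in broadcast mode
  run nmcli connection add type bond con-name bond-1011 ifname bond-1011 \
      bond.options mode=broadcast ipv4.method disabled ipv6.method disabled
  # a) the two NIC ports, attached to the bond
  for p in "$port1" "$port2"; do
    run nmcli connection add type ethernet con-name "bond-1011-$p" ifname "$p" \
        connection.master bond-1011 connection.slave-type bond
  done
  # c) MACsec on top of the bond, attached to the bridge by interface name
  run nmcli connection add type macsec con-name macsec-10.1.1-br ifname macsec0 \
      connection.autoconnect yes macsec.parent bond-1011 macsec.mode psk \
      macsec.mka-cak "$MKA_CAK" macsec.mka-ckn "$MKA_CKN" \
      ipv4.method disabled ipv6.method disabled \
      connection.slave-type bridge connection.master br0
}
```

Running `make_reproducer` with the keys set prints the five `nmcli connection add` commands for review; `DRY_RUN=0 make_reproducer` would actually create the profiles, which is best tried on a test machine first.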

Yes - I end up having two MACSEC devices (they show up in "ip macsec") unless I do a "down & up" on the connection/profile (as described above), but that "fixes" the issue only until the next reboot, which brings back macsec0 and macsec1 again. "Who creates the other one?" - exactly!
I'd only swap "who" for "what".

I failed to find a workaround in NM for the two-macsecs-after-reboot issue, so I took the "poor man's" route and told cron to "down & up" the connection @reboot.
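
The cron workaround described above could be wrapped roughly as follows. This is a sketch under assumptions: the function name `bounce_profile`, the `NMCLI` override and the script path in the crontab comment are hypothetical, and the exact crontab entry used on the reporter's system is not shown in this report.

```shell
# Hypothetical wrapper for the cron workaround: take the profile down and
# bring it up again. NMCLI can be overridden, e.g. NMCLI=echo for a dry run.
bounce_profile() {
  con=$1
  if [ -z "$con" ]; then
    echo "usage: bounce_profile PROFILE" >&2
    return 2
  fi
  "${NMCLI:-nmcli}" connection down "$con" || true   # may already be down
  "${NMCLI:-nmcli}" connection up "$con"
}

# Example crontab entry (the script path is an assumption), run once at boot:
#   @reboot /usr/local/sbin/bounce-macsec.sh
```

A script calling `bounce_profile macsec-10.1.1-br` would be what cron runs. Since `@reboot` jobs start when cron starts, a short delay before the call may be needed so that NetworkManager has finished its own activation; that is a guess, not something confirmed in this report.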

This is a physical set-up I'm doing this on, but who knows, perhaps even in a KVM it might reproduce.

Comment 3 Thomas Haller 2023-06-08 07:43:51 UTC
It may be trivial to reproduce, when the involved configuration is clearly shown. But "make a MACSEC" is not a precise description of how to get there. What *exact* profiles did you configure?

Is there a problem with attaching the requested information from comment 1? Please attach it, the logfile is important. Thank you.

Comment 4 lejeczek 2023-06-08 08:04:35 UTC
I showed that in my first, original comment.
Here:
-> $ nmcli c add type macsec con-name macsec-10.1.1-br ifname macsec0 connection.autoconnect yes macsec.parent bond-1011 macsec.mode psk macsec.mka-cak d9... macsec.mka-ckn 7a... ipv4.method disabled ipv6.method disabled con.slave-type bridge con.master bridge-conn

I said I'll try my best. I feel like I have to state the less obvious: I'm trying to make a living, and I do that by keeping computer systems "up & running relatively okay"; in other words, I cannot afford to stop, tear down, restart and/or debug at whim all day long. Give me a while, please.

Comment 5 Thomas Haller 2023-06-08 09:54:07 UTC
The shown `nmcli c add` line creates only one macsec profile. For this setup to work, the exact bond, bridge and bond ports are also relevant. I can make them up, and when I activate my "reproducer" a macsec0 interface is created. There is no macsec1 interface, and obviously it cannot come out of nowhere.

If you cannot reboot your system with debug logs enabled and reproduce the issue, then please provide the information that you have right at hand, as explained in comment 1.

Check all the profiles you have (`nmcli connection`). Ensure there is no profile that would create a macsec1 interface (by either deleting it, or at least ensure that it's not activated). See activated devices and profiles with `nmcli device`. And show that output here.
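
One way to narrow the profile list down to MACsec profiles is to filter nmcli's terse output. The sketch below is an illustration only: the function name `macsec_profiles` is made up, and it assumes that profile names contain no ":" (terse mode escapes such characters, which this filter does not handle).

```shell
# Read `nmcli -t -f NAME,TYPE,DEVICE connection show` output on stdin and
# print each MACsec profile with the device it is currently active on.
# Hypothetical helper; profile names containing ":" are not handled.
macsec_profiles() {
  awk -F: '$2 == "macsec" { print $1, ($3 == "" ? "(not active)" : $3) }'
}
```

Used as `nmcli -t -f NAME,TYPE,DEVICE connection show | macsec_profiles`. If only one profile is listed while `ip macsec show` reports two interfaces, the second interface is not coming from a NetworkManager profile.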

Comment 6 sfaye 2023-07-07 13:35:06 UTC
Hi, 

We cannot move this bug forward without the requested information in comment 5. Therefore, we will close this as INSUFFICIENT DATA for now. Please feel free to reopen it when you have the information we need. 

Thanks

Comment 7 lejeczek 2023-07-08 08:34:27 UTC
Created attachment 1974626 [details]
debug NM boot log

Comment 8 lejeczek 2023-07-17 11:12:39 UTC
Nothing? Is the NM log I attached no good?

Comment 10 Fernando F. Mancera 2023-08-29 12:42:14 UTC
Hi! I tried to reproduce the bug and couldn't do it.

I am investigating the logs and this is what I found:

```
<trace> [1688803707.0575] platform-linux: event-notification: RTM_NEWLINK, flags 0, seq 0: 14: macsec1@9 <DOWN;broadcast,multicast> mtu 8968 arp 1 macsec* not-init addrgenmode eui64 addr 00:11:0A:6A:BA:14 brd FF:FF:FF:FF:FF:FF tx-queue-l>
<debug> [1688803707.0576] platform: (macsec1) signal: link   added: 14: macsec1@9 <DOWN;broadcast,multicast> mtu 8968 arp 1 macsec* not-init addrgenmode eui64 addr 00:11:0A:6A:BA:14 brd FF:FF:FF:FF:FF:FF driver macsec tx-queue-len 1000 g>
<trace> [1688803707.0576] l3cfg[fe147c7569505772,ifindex=14]: created (netns=47f6f64c2294fb5f)
<trace> [1688803707.0577] l3cfg[fe147c7569505772,ifindex=14]: link ifname changed: "macsec1" (initial)
<trace> [1688803707.0577] l3cfg[fe147c7569505772,ifindex=14]: commit type register (type "none", source "device", existing 8e68224f1b81f55d) -> 8e68224f1b81f55d
<debug> [1688803707.0577] device[f35e50bc171aa7cb] (macsec1): ifindex: set ifindex 14 (l3cfg: fe147c7569505772)
<trace> [1688803707.0577] dbus-object[2b6f2890630d728e]: export: "/org/freedesktop/NetworkManager/IP4Config/14"
<trace> [1688803707.0578] dbus-object[ffaff660f296a932]: export: "/org/freedesktop/NetworkManager/IP6Config/14"
<debug> [1688803707.0579] device[f35e50bc171aa7cb] (macsec1): constructed (NMDeviceMacsec)
<debug> [1688803707.0579] device[f35e50bc171aa7cb] (macsec1): start setup of NMDeviceMacsec, kernel ifindex 14
<debug> [1688803707.0579] device[f35e50bc171aa7cb] (macsec1): hw-addr: hardware address now 00:11:0A:6A:BA:14
<debug> [1688803707.0579] device[f35e50bc171aa7cb] (macsec1): hw-addr: update initial MAC address 00:11:0A:6A:BA:14
<debug> [1688803707.0580] platform-linux: error reading net:/sys/class/net/macsec1/phys_port_id: error reading 4096 bytes from file descriptor: Operation not supported
<debug> [1688803707.0581] platform-linux: sysctl: reading 'net:/sys/class/net/macsec1/dev_id': '0x0'
<trace> [1688803707.0581] ethtool[14]: ETHTOOL_GDRVINFO, macsec1: success
<debug> [1688803707.0581] platform-linux: error reading net:/sys/class/net/macsec1/device/sriov_numvfs: Failed to open file "device/sriov_numvfs" with openat: No such file or directory
<debug> [1688803707.0582] device[f35e50bc171aa7cb] (macsec1): parent: ifindex 9, device cd7d4c0ecf21992f, bond-1011
<debug> [1688803707.0582] device[f35e50bc171aa7cb] (macsec1): add_pending_action (1): 'recheck-available'
<debug> [1688803707.0582] device[f35e50bc171aa7cb] (macsec1): unmanaged: flags set to [platform-init,external-down=0x104/0x104/unmanaged/unrealized], set-unmanaged [external-down=0x100])
<debug> [1688803707.0582] device[f35e50bc171aa7cb] (macsec1): unmanaged: flags set to [platform-init,external-down,!sleeping=0x104/0x105/unmanaged/unrealized], set-managed [sleeping=0x1])
<trace> [1688803707.0582] dbus-object[f35e50bc171aa7cb]: export: "/org/freedesktop/NetworkManager/Devices/14"
```

Here I can tell that NM detects that a new link, macsec1, has been created, but NM is not the one creating it. There must be another component on the system that is creating it. NM just creates a device object to track the link; it does not perform any operation on it. Do you know of any other component on your system that might be creating this macsec1 interface?
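
The same check can be repeated on a full log with a small filter. The sketch below is an assumption-laden illustration: the function name `external_devices` is made up, the pattern relies on the debug message format shown in the excerpt above, and it treats the `external-down` unmanaged flag as the marker for a link that NM found already present and down rather than one it created itself.

```shell
# Read a NetworkManager debug log on stdin and print the names of devices
# whose unmanaged flags include "external-down".
# Hypothetical helper; the pattern follows the log lines quoted above.
external_devices() {
  sed -n 's/.*device\[[0-9a-f]*\] (\([^)]*\)): unmanaged: flags set to \[[^]]*external-down.*/\1/p' |
    sort -u
}
```

For example, `journalctl -b -u NetworkManager --no-pager | external_devices`. On the excerpt above this prints `macsec1`.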

Comment 11 RHEL Program Management 2023-09-21 10:16:06 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 12 RHEL Program Management 2023-09-21 10:19:52 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.

Comment 13 Red Hat Bugzilla 2024-01-20 04:25:45 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days