Bug 2151905
| Summary: | SR-IOV VFs not attached to Linux bond | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Carlos Goncalves <cgoncalves> | ||||||
| Component: | nmstate | Assignee: | Gris Ge <fge> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Mingyu Shi <mshi> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 9.0 | CC: | ferferna, jiji, jishi, network-qe, sfaye, till | ||||||
| Target Milestone: | rc | Keywords: | Triaged | ||||||
| Target Release: | --- | Flags: | pm-rhel: mirror+ | ||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | nmstate-2.2.3-2.el9 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2023-05-09 07:31:50 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
# oc get nns cnfdc8-worker-1 -o yaml
apiVersion: nmstate.io/v1beta1
kind: NodeNetworkState
metadata:
  creationTimestamp: "2022-12-04T16:20:54Z"
  generation: 1
  labels:
    nmstate.io/force-nns-refresh: "1670509523521952843"
  name: cnfdc8-worker-1
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: cnfdc8-worker-1
    uid: c7f0dc92-fd13-4cb8-a6f8-4cbf73d650d1
  resourceVersion: "2415162"
  uid: 0a07afbd-6138-42e0-9f49-0b9fbe070cc5
status:
  currentState:
    [...]
    interfaces:
    - accept-all-mac-addresses: false
      ethtool:
        feature:
          rx-lro: false
          rx-vlan-hw-parse: true
          tx-generic-segmentation: false
          tx-tcp-segmentation: false
      ipv4:
        address: []
        enabled: false
      ipv6:
        address: []
        enabled: false
      link-aggregation:
        mode: balance-rr
        options:
          all_slaves_active: dropped
          arp_all_targets: any
          arp_interval: 0
          arp_validate: none
          downdelay: 0
          miimon: 0
          packets_per_slave: 1
          resend_igmp: 1
          updelay: 0
          use_carrier: true
        port: [] <------------------- NO PORTS
      mac-address: EE:59:64:AC:A0:B1
      mtu: 1500
      name: bond0
      state: up
      type: bond
    [...]
Created attachment 1931068 [details]
NetworkManager journalctl
Attaching NetworkManager journalctl with log level=TRACE.
Hi Carlos, could you attach the NNCE or, even better, the nmstate logs?
Created attachment 1932848 [details]
must-gather nmstate
Attached nmstate must-gather archive.
Node is cnfdc8-worker-1 with corresponding log directory nmstate-handler-svxrc/
Hi Carlos, I am afraid I cannot find the root cause from the logs either. Could you provide me with a login for live debugging? Thank you!
Sure. Credentials will follow in a private comment. Please do let me know if you need any help. We can share a tmux/tmate session and reproduce live together.
Root cause analysis:
* The `/etc/udev/rules.d/10-nm-unmanaged.rules` file marks every SR-IOV VF as unmanaged in NetworkManager:
SUBSYSTEM=="net", ACTION=="add|move", ATTRS{phys_switch_id}!="", ATTR{phys_port_name}=="pf*vf*", ENV{NM_UNMANAGED}="1"
* When `bond0-policy.yaml` is applied, those VF ports are reported as `state: ignore` because NetworkManager does not manage them. nmstate therefore removes the two VFs from the bond port list, and it does so without raising any error.
Possible fixes:
A: Add these lines to the desired state to manually convert the VFs from `ignore` to `up`:
- name: ens1f0v0
  type: ethernet
  state: up
- name: ens1f0v1
  type: ethernet
  state: up
B: Patch nmstate to automatically convert an interface from `ignore` to `up` when it is listed as a port. This might be dangerous in some CNV use cases where `br-ex` contains many unmanaged interfaces.
So my suggestion is to explicitly mark the SR-IOV VF interfaces as `state: up` in the desired state. Option B is too risky.
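For anyone generating policies programmatically, option A can be applied before the desired state is handed to nmstate. The following is only a minimal Python sketch and not part of nmstate: the helper name `mark_ports_up` is hypothetical, it assumes the desired state is a plain dict in the nmstate schema, it only looks at bonds, and it assumes every listed port is an Ethernet device (true for the VFs in this report).

```python
import copy


def mark_ports_up(desired_state):
    """Return a copy of desired_state in which every bond port that has no
    interface entry of its own gets an explicit `state: up` entry.

    Hypothetical helper for illustration; assumes all ports are ethernet.
    """
    state = copy.deepcopy(desired_state)
    ifaces = state.setdefault("interfaces", [])
    known = {iface["name"] for iface in ifaces}
    # Iterate over a snapshot because new entries are appended to ifaces.
    for iface in list(ifaces):
        if iface.get("type") != "bond":
            continue
        for port in iface.get("link-aggregation", {}).get("port", []):
            if port not in known:
                ifaces.append({"name": port, "type": "ethernet", "state": "up"})
                known.add(port)
    return state


# Desired state taken from bond0-policy.yaml in this report (IP config omitted).
desired = {
    "interfaces": [
        {
            "name": "bond0",
            "type": "bond",
            "state": "up",
            "link-aggregation": {
                "mode": "active-backup",
                "options": {"primary": "ens1f0v0"},
                "port": ["ens1f0v0", "ens1f0v1"],
            },
        }
    ]
}

patched = mark_ports_up(desired)
for iface in patched["interfaces"]:
    print(iface["name"], iface["state"])
```

The input dict is left untouched, and ports that already have an entry (for example one explicitly set to `state: ignore`) are not overridden.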
Hi Carlos,
On second thought, nmstate can be smart:
* For an existing bridge/bond that contains unmanaged ports, do not auto-convert `ignore` to `up` for ports.
* For an existing or new bridge/bond without unmanaged ports, do auto-convert `ignore` to `up` for ports.
But if possible, I would like to do that in RHEL 9 (nmstate 2.x branch, Rust based). For RHEL 8 (nmstate 1.x branch, Python based), I recommend option A, explicitly marking the VFs as `state: up`. Is that OK with you?
Good findings! I think your suggestion makes sense, yes.
Perhaps consider adding a warning log message when a user hits the first case (no auto-conversion from 'ignore' to 'up' for unmanaged ports).
That udev rule is explicit about the port name ("pf*vf*"). Other NICs such as Mellanox ConnectX usually have different SR-IOV VF port names (e.g. eno1v0, eno2v2), so NetworkManager will manage them. From an OCP user-experience perspective, it would be easier to document in the OCP docs that SR-IOV VFs must be set to state 'up' regardless of the NIC vendor or port name. This is outside the scope of nmstate; I am just sharing that we can seemingly live without a patched nmstate.
Thanks a lot, Gris!
My previous comment about the udev rule is incorrect. The udev rule matches every SR-IOV VF interface (any NIC vendor and port name).
The acceptance criterion is: nmstate will automatically convert an ignored interface to managed when it is listed as a port of a controller with __no__ other ignored interfaces.
Carlos will follow up with the documentation team about setting `state: up` on VF interfaces explicitly when using VFs in a bridge, bond, VLAN, etc.
For QE: to reproduce this problem, run `nmcli d set <vf_iface_name> managed false`.
Patch sent to upstream: https://github.com/nmstate/nmstate/pull/2181
With that patch, when a desired controller lists currently ignored interfaces as its ports, nmstate will automatically convert these ignored interfaces from 'state: ignore' to 'state: up' only when:
1. The ignored port is not mentioned in the desired state.
2. The ignored port is listed as a port of a desired controller.
3. The controller interface is new or does not currently contain ignored interfaces.
(In reply to Gris Ge from comment #14)
> Patch sent to upstream: https://github.com/nmstate/nmstate/pull/2181
>
>
> With that patch, when desired controller listed currently ignored interfaces
> as its port,
> nmstate will automatically convert these ignored interfaces from 'state:
> ignore' to 'state: up' only when:
>
> 1. This ignored port is not mentioned in desire state.
Does "mentioned in desired state" mean that iface0 is explicitly listed as *state: ignore* in the desired state?
---
interfaces:
- name: bond0
  link-aggregation:
    port:
    - iface0
- name: iface0
  state: ignore
> 2. This ignored port is listed as port of a desired controller.
> 3. Controller interface is new or does not contain ignored interfaces
> currently.
I don't get how this could happen. Is it something like:
ip link add veth1 type veth
ip link add bond0 type bond
ip link set veth1 master bond0
and then attaching *unmanaged* iface0 to bond0 via nmstate? Does it mean that, since bond0 already contains unmanaged veth1, the desired attachment of iface0 to bond0 won't be done?
Or could you please give me an example? Thanks!
(In reply to Mingyu Shi from comment #17)
> (In reply to Gris Ge from comment #14)
> > Patch sent to upstream: https://github.com/nmstate/nmstate/pull/2181
> >
> >
> > With that patch, when desired controller listed currently ignored interfaces
> > as its port,
> > nmstate will automatically convert these ignored interfaces from 'state:
> > ignore' to 'state: up' only when:
> >
> > 1. This ignored port is not mentioned in desire state.
> does "mentioned in desire" mean iface0 explictly listed as *state: ignore*
> in desired state:
> ---
> interfaces:
> - name: bond0
> link-aggregation:
> port:
> - iface0
> - name: iface0
> state: ignore
Yes, your YAML file illustrates the condition exactly. If an interface is explicitly marked as ignored, we do not auto-manage it.
>
> > 2. This ignored port is listed as port of a desired controller.
> > 3. Controller interface is new or does not contain ignored interfaces
> > currently.
> I don't get how this could happen, is it something like:
> ip link add veth1 type veth
> ip link add bond0 type bond
> ip link set veth1 master bond0
>
> then attach *unmanaged* iface0 to bond0 via nmstate --> does it mean: since
> bond0 already contained unmanaged veth1, so desired attaching iface0 to
> bond0 won't be done?
> Or could you please give me an example, thanks!
For condition 3:
A: If you use nmstate to create a new controller with unmanaged ports (not explicitly marked as ignored), all ports are auto-managed.
B: If the controller already exists and has mixed managed and unmanaged ports, nmstate will __not__ auto-manage.
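To make the three conditions easier to reason about for test planning, here is a rough Python sketch of the decision as it is described in comment #14. It is not the nmstate implementation (the actual patch is in the upstream PR); the function names are made up, interfaces are plain dicts, only bond port lists are considered, and condition 3 is read literally as "the existing controller currently has no ignored port".

```python
def bond_ports(iface):
    """Port names of a bond-style interface dict (empty list if none)."""
    return iface.get("link-aggregation", {}).get("port", [])


def ports_to_auto_manage(desired_ifaces, current_ifaces):
    """Return the sorted names of currently ignored ports that would be
    converted from `state: ignore` to `state: up` under conditions 1-3.

    Hypothetical helper for illustration only.
    """
    current = {iface["name"]: iface for iface in current_ifaces}
    mentioned = {iface["name"] for iface in desired_ifaces}

    def is_ignored(name):
        return current.get(name, {}).get("state") == "ignore"

    result = set()
    for ctrl in desired_ifaces:
        existing = current.get(ctrl["name"])
        # Condition 3: the controller is new, or it exists and none of its
        # current ports is ignored.
        if existing is not None and any(is_ignored(p) for p in bond_ports(existing)):
            continue
        for port in bond_ports(ctrl):
            # Condition 2 holds because we iterate the desired port list.
            # Condition 1: the port has no entry of its own in desired state.
            if port not in mentioned and is_ignored(port):
                result.add(port)
    return sorted(result)


# Case from this bug: new bond0 over two unmanaged VFs.
current_vfs = [
    {"name": "ens1f0v0", "state": "ignore"},
    {"name": "ens1f0v1", "state": "ignore"},
]
desired_bond0 = [
    {"name": "bond0", "type": "bond", "state": "up",
     "link-aggregation": {"mode": "active-backup",
                          "port": ["ens1f0v0", "ens1f0v1"]}},
]
new_bond_result = ports_to_auto_manage(desired_bond0, current_vfs)

# Case B: bond1 already exists with a mix of managed and unmanaged ports.
current_mixed = [
    {"name": "bond1", "type": "bond", "state": "up",
     "link-aggregation": {"mode": "balance-rr",
                          "port": ["veth0", "veth1", "veth2"]}},
    {"name": "veth0", "state": "up"},
    {"name": "veth1", "state": "up"},
    {"name": "veth2", "state": "ignore"},
    {"name": "veth3", "state": "ignore"},
]
desired_bond1 = [
    {"name": "bond1", "type": "bond", "state": "up",
     "link-aggregation": {"mode": "balance-rr",
                          "port": ["veth0", "veth1", "veth2", "veth3"]}},
]
mixed_result = ports_to_auto_manage(desired_bond1, current_mixed)

print(new_bond_result)
print(mixed_result)
```

In the first case both VFs are selected for auto-management; in the second, nothing is selected because bond1 already holds an ignored port.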
Verified with:
nmstate-2.2.7-1.el9.x86_64
nispor-1.2.10-1.el9.x86_64
NetworkManager-1.42.0-1.el9.x86_64
DISTRO=RHEL-9.2.0-20230223.23
For condition 3 in comment #18, the *existing* controller MUST contain both managed and unmanaged interfaces; only then will nmstate not auto-attach a NEW unmanaged interface.
Current:
---
interfaces:
- name: bond1
  type: bond
  state: up
  link-aggregation:
    mode: balance-rr
    port:
    - veth0
    - veth1
    - veth2 # unmanaged
Desired YAML file:
interfaces:
- name: bond1
  type: bond
  state: up
  link-aggregation:
    mode: balance-rr
    port:
    - veth0
    - veth1
    - veth2 # unmanaged
    - veth3 # unmanaged
After applying the desired file, veth3 is not touched.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:2190 |
Description of problem:
A Linux bond with one or more SR-IOV VFs as bond ports is defined, but the desired state is not applied. The bond is created, but the VFs are not attached. The SR-IOV VFs were created via a NodeNetworkConfigurationPolicy CR prior to creating the bond. Kubernetes-nmstate reports that the NodeNetworkConfigurationPolicy has been successfully configured.
Version-Release number of selected component (if applicable):
- Red Hat Enterprise Linux CoreOS 412.86.202212030032-0
- Kernel 4.18.0-372.32.1.el8_6.x86_64
- OpenShift 4.12.0-0.nightly-2022-12-03-062812
How reproducible:
100%
Steps to Reproduce:
1. oc apply -f sriov-policy.yaml
2. oc apply -f bond0-policy.yaml
3.
Actual results:
# oc get nnce
NAME                           STATUS      REASON
cnfdc8-worker-1.bond0-policy   Available   SuccessfullyConfigured
cnfdc8-worker-1.sriov-policy   Available   SuccessfullyConfigured
# ip link show bond0
53: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 56:3b:ac:41:92:49 brd ff:ff:ff:ff:ff:ff
# ip link show ens1f0v0
38: ens1f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6a:ec:6b:a5:03:ea brd ff:ff:ff:ff:ff:ff
# ip link show ens1f0v1
39: ens1f0v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 1a:07:3e:b9:34:e3 brd ff:ff:ff:ff:ff:ff
Expected results:
Interfaces ens1f0v0 and ens1f0v1 should have been set with bond0 as master.
Interfaces can be attached manually as follows:
# ip link set ens1f0v0 master bond0
# ip link set ens1f0v1 master bond0
# ip link set bond0 up
# ip link show bond0
53: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:ec:6b:a5:03:ea brd ff:ff:ff:ff:ff:ff
# ip link show ens1f0v0
38: ens1f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 6a:ec:6b:a5:03:ea brd ff:ff:ff:ff:ff:ff
# ip link show ens1f0v1
39: ens1f0v1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 1a:07:3e:b9:34:e3 brd ff:ff:ff:ff:ff:ff
Additional info:
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: sriov-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "cnfdc8-worker-1"
  desiredState:
    interfaces:
    - name: ens1f0
      type: ethernet
      ethernet:
        sr-iov:
          total-vfs: 4
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: bond0-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: "cnfdc8-worker-1"
  desiredState:
    interfaces:
    - name: bond0
      type: bond
      state: up
      link-aggregation:
        mode: active-backup
        options:
          primary: ens1f0v0
        port:
        - ens1f0v0
        - ens1f0v1
      ipv4:
        address: []
        dhcp: false
        enabled: false
      ipv6:
        address: []
        autoconf: false
        dhcp: false
        enabled: false