Description of problem:
https://issues.redhat.com/browse/OCPBUGS-14107

When an SR-IOV parameter fails to apply (for example, min_tx_rate is not supported), NetworkManager deactivates the connection. This disables SR-IOV, which removes the pre-existing VFs before activation. The user's network connection breaks when those pre-existing VFs are used in a VLAN/bond/bridge.

Version-Release number of selected component (if applicable):
NetworkManager-1.43.8-32322.copr.d07383d3f3.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create SR-IOV VFs using nmstate YAML:
```
---
interfaces:
- name: eth1
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 2
- name: eth2
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 2
```
2. Create a VLAN over each VF:
```
---
interfaces:
- name: eth1v1.101
  type: vlan
  state: up
  vlan:
    base-iface: eth1v1
    id: 101
- name: eth2v1.101
  type: vlan
  state: up
  vlan:
    base-iface: eth2v1
    id: 101
```
3. Assign the VLANs to a bond:
```
---
interfaces:
- name: bond0
  type: bond
  state: up
  link-aggregation:
    mode: balance-rr
    port:
    - eth1v1.101
    - eth2v1.101
```
4. Apply an invalid SR-IOV configuration:
```
interfaces:
- name: eth1
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 5
      vfs:
      - id: 2
        max-tx-rate: 200
```

Actual results:
* eth1v1 and eth1v1.101 are removed and re-added.
* bond0 loses eth1v1.101.

Expected results:
* eth1v1 and eth1v1.101 are left untouched when the SR-IOV configuration fails.
* bond0 still has eth1v1.101 and eth2v1.101, with no detach/reattach.

Additional info:
This is a known limitation in NM: it disables SR-IOV if setting an SR-IOV parameter fails. Disabling SR-IOV causes the VFs to be removed from the system, which breaks network access.
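For anyone re-running the reproducer, the actual/expected results for bond0 can be checked with a small script. This is only a sketch, not part of any fix: it assumes the bonding driver's sysfs file `/sys/class/net/<bond>/bonding/slaves` lists the attached ports separated by whitespace, and the function names and the `sysfs_root` parameter are made up for illustration.

```python
from pathlib import Path


def bond_ports(bond, sysfs_root="/sys/class/net"):
    """Return the set of port names currently attached to `bond`.

    Assumption: <sysfs_root>/<bond>/bonding/slaves holds the port names
    separated by whitespace (an empty file means no ports).
    """
    slaves_file = Path(sysfs_root) / bond / "bonding" / "slaves"
    return set(slaves_file.read_text().split())


def missing_ports(bond, expected, sysfs_root="/sys/class/net"):
    """Return the expected ports that are no longer attached, sorted."""
    return sorted(set(expected) - bond_ports(bond, sysfs_root))
```

Run after step 4, `missing_ports("bond0", ["eth1v1.101", "eth2v1.101"])` would be expected to report `eth1v1.101` on an affected build (matching the actual results) and an empty list once the problem is fixed.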
Patch sent upstream: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1682

With this patch applied, nmstate reports the error as:

NmstateError: VerificationError: Verification failure: enp196s0f0.interface.ethernet.sr-iov.vfs[1].max-tx-rate desire '200', current '0'

instead of:

NmstateError: VerificationError: Verification failure: enp196s0f0.interface.ethernet.sr-iov.total-vfs: desire '2', current '0'

In addition, NetworkManager no longer deactivates SR-IOV during `nmstatectl apply` when an SR-IOV VF parameter fails to apply.
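To tell the two failure modes apart in test automation, the error line can be parsed. A rough sketch, assuming the message layout matches the two examples quoted in this comment (the optional colon after the property path is taken from those examples, and the function names are hypothetical, not nmstate API):

```python
import re

# Layout assumed from the two error lines quoted in this comment:
#   Verification failure: <iface>.interface.<property>[:] desire 'X', current 'Y'
_VERIFY_RE = re.compile(
    r"Verification failure: (?P<path>\S+?):? "
    r"desire '(?P<desired>[^']*)', current '(?P<current>[^']*)'"
)


def parse_verification_failure(message):
    """Split a verification error into interface, property and values.

    Returns None when the message does not match the assumed layout.
    """
    match = _VERIFY_RE.search(message)
    if match is None:
        return None
    iface, _, prop = match.group("path").partition(".interface.")
    return {
        "iface": iface,
        "property": prop,
        "desired": match.group("desired"),
        "current": match.group("current"),
    }


def is_vf_parameter_failure(message):
    """True if the failing property is a per-VF setting (sr-iov.vfs[N]...)."""
    parsed = parse_verification_failure(message)
    return parsed is not None and ".vfs[" in parsed["property"]
```

With the patch, the failure should classify as a per-VF parameter error; without it, the failing property is `total-vfs`, which indicates that SR-IOV itself was disabled.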
A RHEL 9.2 z-stream scratch build can be found at https://people.redhat.com/fge/bz_2210164/
Verification of the custom RPM has partially failed.

Setup:
Server Version: 4.14.0-ec.2
RPM https://people.redhat.com/fge/bz_2210164/

Scenario:
1) Apply a configuration that creates a bond interface with a VLAN interface based on a VF. It takes several minutes, but it is eventually created.
===
interfaces:
- name: ens1f0
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 2
- name: ens1f0v0.481
  type: vlan
  state: up
  vlan:
    base-iface: ens1f0v0
    id: 481
- name: bond3
  type: bond
  state: up
  link-aggregation:
    mode: balance-rr
    port:
    - ens1f0v0.481
===
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:bf:f2:bc brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1 link/ether ee:52:88:6c:6c:bc brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    altname enp59s0f0
802: bond3: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
803: ens1f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v0
804: ens1f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether ee:52:88:6c:6c:bc brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v1
806: ens1f0v0.481@ens1f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond3 state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
===
2) Apply an incorrect configuration (the Intel NIC doesn't support min-tx-rate)
===
interfaces:
- name: ens1f0
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 2
      vfs:
      - id: 1
        max-tx-rate: 200
        min-tx-rate: 100
===

Result:
1) Failed to apply the wrong policy, as expected
===
$ oc get nncp
NAME                       STATUS      REASON
all-interface-worker-0     Available   SuccessfullyConfigured
wrong-interface-worker-1   Degraded    FailedToConfigure
===
[2023-07-06T22:37:57Z INFO nmstate::query_apply::net_state] Rollbacked to checkpoint /org/freedesktop/NetworkManager/Checkpoint/14
NmstateError: VerificationError: Verification failure: ens1f0.interface.ethernet.sr-iov.vfs[1].min-tx-rate desire '100', current '0'

2) VLAN configuration is removed - Failed
===
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:bf:f2:bc brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether 62:29:a2:8f:4c:f3 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1 link/ether ba:f7:f3:04:69:ef brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    altname enp59s0f0
802: bond3: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether ca:c1:75:cf:72:5b brd ff:ff:ff:ff:ff:ff
809: ens1f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 62:29:a2:8f:4c:f3 brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v0
810: ens1f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether ba:f7:f3:04:69:ef brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v1
===
sh-4.4# cat /sys/class/net/bond3/bonding/slaves
sh-4.4#
===
[core@worker-0 ~]$ nmcli con
NAME                          UUID                                  TYPE           DEVICE
ovs-if-br-ex                  13953df0-3a2d-4742-b4f0-46c2afa2b933  ovs-interface  br-ex
lo                            74ec3513-cc05-4cff-8202-d285285d7904  loopback       lo
bond3                         68fa1e24-11cf-45cb-84db-68a6d4e65256  bond           bond3
br-ex                         9350dc78-da4f-4663-9aee-3b0f49ebfe23  ovs-bridge     br-ex
ens1f0                        93d576f4-20cb-4f83-9696-d61d116a4c9a  ethernet       ens1f0
ens3f0np0                     ea1316d7-eab7-456c-b706-36ee8cd46f18  ethernet       ens3f0np0
ens3f0v0                      fe1e62b3-7043-4502-90fb-9455f8aae8c2  ethernet       ens3f0v0
ens3f0v0.481-slave-ovs-clone  49498d70-e77e-4f18-bcb9-6db2c61a4fa0  vlan           ens3f0v0.481
ens3f1np1                     c3feed57-f0e9-43ed-a8d7-3d8585d6df51  ethernet       ens3f1np1
ens3f1v0                      936f4ac1-d994-4592-a9a6-265ef5755883  ethernet       ens3f1v0
ens3f1v0.481-slave-ovs-clone  492806be-b419-442d-a2cd-182e6d32d937  vlan           ens3f1v0.481
ovs-if-phys0                  0024026e-e74c-470f-8cf2-240f005427e0  bond           bond0
ovs-port-br-ex                31dc7054-f9a8-401e-b7f1-ade4225d684f  ovs-port       br-ex
ovs-port-phys0                ec80211d-3e8e-4850-a046-43535fd5e6d3  ovs-port       bond0
Wired connection 1            4394027f-0b8a-3d66-9e2c-f67da5937f0a  ethernet       --
Wired connection 10           5936893f-2701-3a02-a2f4-cf597fac4d5b  ethernet       --
Wired connection 11           4a6dc456-e01e-3667-beab-20d7daba0f51  ethernet       --
Wired connection 12           1e514bf3-dc3a-3dc9-ae07-86077254bfd0  ethernet       --
Wired connection 13           2fcc8186-6ed7-35a4-a831-d41d0a69f484  ethernet       --
Wired connection 2            97d1bc50-4945-347a-9f73-feab16caa5a4  ethernet       --
Wired connection 3            97e3255b-e92e-31cc-b935-22e159bbad5c  ethernet       --
Wired connection 4            c50f87fd-72a4-3936-9d62-caea9991a81a  ethernet       --
Wired connection 5            ed052a43-17df-38e9-82c1-2dda1189a16c  ethernet       --
Wired connection 6            653ae237-87c5-3e6a-8a71-dd6a8716adb0  ethernet       --
Wired connection 7            d9e79c4e-2cb0-3d6b-b1cb-12b47e34dcca  ethernet       --
Wired connection 8            63aef5bb-4390-3435-b254-f85b83bff12f  ethernet       --
Wired connection 9            8d5592a4-7d39-3955-84a6-07ab18230717  ethernet       --
bond0                         fa3a2ece-ea18-4e0b-b504-2818b93bc977  bond           --
ens1f0v0.481                  52aab5b1-6366-440a-bf9d-dedba7fafb41  vlan           --
ens3f0v0.481                  b559ffc9-a185-49e7-84c7-259e16d45b19  vlan           --
ens3f1v0.481                  0e06b134-9904-4b16-a5f7-cf73c87c6cdd  vlan           --

Journalctl: http://pastebin.test.redhat.com/1104374
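The two link listings in the comment above can also be compared mechanically. The sketch below is an illustration only: it assumes the listings are plain `ip link`-style text in which each device starts a line with `<ifindex>: <name>[@<parent>]:`, and the function names are hypothetical. Since the kernel assigns an ifindex when a device is registered, a name that comes back with a different ifindex was removed and created again.

```python
import re

# Matches the first line of each device, e.g. "806: ens1f0v0.481@ens1f0v0: <...>".
_LINK_RE = re.compile(r"^[ \t]*(\d+):[ \t]+([^:@\s]+)(?:@[^:\s]+)?:", re.MULTILINE)


def parse_ifindexes(ip_link_text):
    """Map interface name -> ifindex from `ip link`-style output."""
    return {name: int(index) for index, name in _LINK_RE.findall(ip_link_text)}


def diff_links(before_text, after_text):
    """Report interfaces that were re-created or removed between two listings.

    'recreated': same name, different ifindex.
    'removed':   present before, absent after.
    """
    before = parse_ifindexes(before_text)
    after = parse_ifindexes(after_text)
    recreated = sorted(n for n in before if n in after and after[n] != before[n])
    removed = sorted(n for n in before if n not in after)
    return {"recreated": recreated, "removed": removed}
```

Fed the two listings from this comment, such a check would flag ens1f0v0 (803 to 809) and ens1f0v1 (804 to 810) as re-created and ens1f0v0.481 as removed, which is consistent with bond3 ending up without ports.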
Resetting needinfo. Info provided by Evgeny in comment #11.
Hi Evgeny Levin,

Thanks for the test feedback. It looks like bug https://bugzilla.redhat.com/show_bug.cgi?id=2217903, whose fix is due for release on Aug 01.

I have uploaded a new scratch build, NetworkManager-1.42.2-6.sriov.el9, to https://people.redhat.com/fge/bz_2210164/

Can you try again?
While checking an Intel ice SR-IOV NIC, we found kernel bug https://bugzilla.redhat.com/show_bug.cgi?id=2222597 regarding support for max_tx_rate/min_tx_rate.