Bug 2210164 - Do not disable SR-IOV when activation failed due to SR-IOV parameter failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: NetworkManager
Version: 9.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Gris Ge
QA Contact: Matej Berezny
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-05-26 02:22 UTC by Gris Ge
Modified: 2024-03-07 04:25 UTC (History)

Fixed In Version: NetworkManager-1.43.11-1.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-11-07 08:38:04 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker CNF-7461 0 None None None 2023-06-28 14:44:51 UTC
Red Hat Issue Tracker NMT-581 0 None None None 2023-05-26 02:26:37 UTC
Red Hat Issue Tracker RHELPLAN-158243 0 None None None 2023-05-26 02:26:42 UTC
Red Hat Product Errata RHBA-2023:6585 0 None None None 2023-11-07 08:38:44 UTC
freedesktop.org Gitlab NetworkManager NetworkManager-ci merge_requests 1470 0 None opened sriov: added sriov_dont_disable_on_acitvation_fail test 2023-08-07 15:13:54 UTC
freedesktop.org Gitlab NetworkManager NetworkManager merge_requests 1682 0 None opened sriov: Do not fail activation on SR-IOV VF failures 2023-06-28 14:27:13 UTC

Description Gris Ge 2023-05-26 02:22:48 UTC
Description of problem:

https://issues.redhat.com/browse/OCPBUGS-14107

When an SR-IOV parameter fails to apply (for example, min_tx_rate is not
supported), NetworkManager deactivates the connection, which disables SR-IOV
and removes the VFs that existed before activation. This breaks the user's
network connection when pre-existing VFs are used in a VLAN, bond, or bridge.

Version-Release number of selected component (if applicable):
NetworkManager-1.43.8-32322.copr.d07383d3f3.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create SR-IOV VFs using nmstate YAML:

```
---
interfaces:
  - name: eth1
    type: ethernet
    state: up
    ethernet:
      sr-iov:
        total-vfs: 2
  - name: eth2
    type: ethernet
    state: up
    ethernet:
      sr-iov:
        total-vfs: 2
```

2. Create VLAN over VF:


```
---
interfaces:
  - name: eth1v1.101
    type: vlan
    state: up
    vlan:
      base-iface: eth1v1
      id: 101
  - name: eth2v1.101
    type: vlan
    state: up
    vlan:
      base-iface: eth2v1
      id: 101
```

3. Assign VLAN to a bond:

```
---
interfaces:
- name: bond0
  type: bond
  state: up
  link-aggregation:
    mode: balance-rr
    port:
    - eth1v1.101
    - eth2v1.101
```

4. Apply invalid SR-IOV configuration:

```
interfaces:
- name: eth1
  type: ethernet
  state: up
  ethernet:
    sr-iov:
      total-vfs: 5
      vfs:
      - id: 2
        max-tx-rate: 200

Actual results:

 * eth1v1 and eth1v1.101 are removed and re-added.
 * bond0 loses eth1v1.101.

Expected results:

 * eth1v1 and eth1v1.101 are left untouched when the SR-IOV configuration fails.
 * bond0 still has eth1v1.101 and eth2v1.101, with no detach/reattach.

Additional info:

This is a known limitation in NM: it disables SR-IOV if setting an SR-IOV
parameter fails. Disabling SR-IOV removes the VFs from the system, which
breaks network access.
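
The expected result above (bond0 keeps both ports) could be checked with a small script along these lines. This is only a sketch: the helper names are hypothetical, and it assumes the bond's ports are read from the standard bonding sysfs file `/sys/class/net/<bond>/bonding/slaves`. It compares snapshots taken before and after the failing apply, so it detects a lost port but not a brief detach/reattach in between.

```python
from pathlib import Path


def parse_bond_ports(slaves_text: str) -> set[str]:
    """Parse the whitespace-separated content of a bonding 'slaves' sysfs file."""
    return set(slaves_text.split())


def read_bond_ports(bond: str, sysfs_root: str = "/sys/class/net") -> set[str]:
    """Read the ports currently attached to a bond from sysfs."""
    path = Path(sysfs_root) / bond / "bonding" / "slaves"
    return parse_bond_ports(path.read_text())


def lost_ports(before: set[str], after: set[str]) -> set[str]:
    """Return ports that were attached before the apply but are missing afterwards."""
    return before - after
```

In practice one would call `read_bond_ports("bond0")` before step 4, call it again afterwards, and expect `lost_ports(before, after)` to be empty once the bug is fixed.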

Comment 4 Gris Ge 2023-06-28 14:25:13 UTC
Patch sent to upstream: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1682

With this patch applied, nmstate will show the error as:

NmstateError: VerificationError: Verification failure: enp196s0f0.interface.ethernet.sr-iov.vfs[1].max-tx-rate desire '200', current '0'

instead of:

NmstateError: VerificationError: Verification failure: enp196s0f0.interface.ethernet.sr-iov.total-vfs: desire '2', current '0'

NetworkManager will also no longer disable SR-IOV during `nmstatectl apply` when an SR-IOV VF parameter fails to apply.
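
For test automation, the two messages above can be told apart by the property path they report. The following is a rough sketch only: the regular expression is inferred from the two quoted messages, not taken from nmstate, and the function names are hypothetical.

```python
import re
from typing import Optional

# Pattern inferred from the two error messages quoted above (an assumption,
# not an nmstate API). The path may or may not be followed by a colon.
_VERIFY_RE = re.compile(
    r"Verification failure: (?P<path>\S+?):? "
    r"desire '(?P<desired>[^']*)', current '(?P<current>[^']*)'"
)


def parse_verification_error(message: str) -> Optional[dict]:
    """Extract the property path, desired value and current value, if present."""
    match = _VERIFY_RE.search(message)
    return match.groupdict() if match else None


def is_vf_parameter_failure(message: str) -> bool:
    """True when the failing property is a per-VF setting rather than total-vfs."""
    parsed = parse_verification_error(message)
    return parsed is not None and ".sr-iov.vfs[" in parsed["path"]
```

With the patched build, `is_vf_parameter_failure` should return true for the reported error; a failure on `total-vfs` would suggest SR-IOV was disabled during the apply.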

Comment 5 Gris Ge 2023-06-28 14:26:48 UTC
A RHEL 9.2 z-stream scratch build can be found at https://people.redhat.com/fge/bz_2210164/

Comment 11 elevin 2023-07-06 22:59:11 UTC
Verification of the custom RPM has partially failed.

Setup:
Server Version: 4.14.0-ec.2
RPM https://people.redhat.com/fge/bz_2210164/ 

Scenario:
1) Apply a configuration that creates a bond interface over a VLAN interface based on a VF. It takes several minutes, but it is eventually created.
===  
   interfaces:
      - name: ens1f0
        type: ethernet
        state: up
        ethernet:
          sr-iov:
            total-vfs: 2
      - name: ens1f0v0.481
        type: vlan
        state: up
        vlan:
          base-iface: ens1f0v0
          id: 481
      - name: bond3
        type: bond
        state: up
        link-aggregation:
          mode: balance-rr
          port:
          - ens1f0v0.481
===
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:bf:f2:bc brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether ee:52:88:6c:6c:bc brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    altname enp59s0f0
802: bond3: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
803: ens1f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v0
804: ens1f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether ee:52:88:6c:6c:bc brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v1
806: ens1f0v0.481@ens1f0v0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond3 state UP mode DEFAULT group default qlen 1000
    link/ether 2e:a1:1e:af:13:35 brd ff:ff:ff:ff:ff:ff
===

2) Apply an incorrect configuration (the Intel NIC doesn't support min-tx-rate)
===
     interfaces:
     - name: ens1f0
       type: ethernet
       state: up
       ethernet:
         sr-iov:
           total-vfs: 2
           vfs:
           - id: 1
             max-tx-rate: 200
             min-tx-rate: 100
===

Result:
1) The wrong policy failed to apply, as expected
===
 $ oc get nncp
NAME                       STATUS      REASON
all-interface-worker-0     Available   SuccessfullyConfigured
wrong-interface-worker-1   Degraded    FailedToConfigure
===
[2023-07-06T22:37:57Z INFO  nmstate::query_apply::net_state] Rollbacked to checkpoint /org/freedesktop/NetworkManager/Checkpoint/14
NmstateError: VerificationError: Verification failure: ens1f0.interface.ethernet.sr-iov.vfs[1].min-tx-rate desire '100', current '0'

2) VLAN configuration is removed (FAILED)
===
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:bf:f2:bc brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 62:29:a2:8f:4c:f3 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether ba:f7:f3:04:69:ef brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    altname enp59s0f0
802: bond3: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether ca:c1:75:cf:72:5b brd ff:ff:ff:ff:ff:ff

809: ens1f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 62:29:a2:8f:4c:f3 brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v0
810: ens1f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether ba:f7:f3:04:69:ef brd ff:ff:ff:ff:ff:ff
    altname enp59s0f0v1
===
sh-4.4# cat /sys/class/net/bond3/bonding/slaves 
sh-4.4# 
===
[core@worker-0 ~]$ nmcli con
NAME                          UUID                                  TYPE           DEVICE       
ovs-if-br-ex                  13953df0-3a2d-4742-b4f0-46c2afa2b933  ovs-interface  br-ex        
lo                            74ec3513-cc05-4cff-8202-d285285d7904  loopback       lo           
bond3                         68fa1e24-11cf-45cb-84db-68a6d4e65256  bond           bond3        
br-ex                         9350dc78-da4f-4663-9aee-3b0f49ebfe23  ovs-bridge     br-ex        
ens1f0                        93d576f4-20cb-4f83-9696-d61d116a4c9a  ethernet       ens1f0       
ens3f0np0                     ea1316d7-eab7-456c-b706-36ee8cd46f18  ethernet       ens3f0np0    
ens3f0v0                      fe1e62b3-7043-4502-90fb-9455f8aae8c2  ethernet       ens3f0v0     
ens3f0v0.481-slave-ovs-clone  49498d70-e77e-4f18-bcb9-6db2c61a4fa0  vlan           ens3f0v0.481 
ens3f1np1                     c3feed57-f0e9-43ed-a8d7-3d8585d6df51  ethernet       ens3f1np1    
ens3f1v0                      936f4ac1-d994-4592-a9a6-265ef5755883  ethernet       ens3f1v0     
ens3f1v0.481-slave-ovs-clone  492806be-b419-442d-a2cd-182e6d32d937  vlan           ens3f1v0.481 
ovs-if-phys0                  0024026e-e74c-470f-8cf2-240f005427e0  bond           bond0        
ovs-port-br-ex                31dc7054-f9a8-401e-b7f1-ade4225d684f  ovs-port       br-ex        
ovs-port-phys0                ec80211d-3e8e-4850-a046-43535fd5e6d3  ovs-port       bond0        
Wired connection 1            4394027f-0b8a-3d66-9e2c-f67da5937f0a  ethernet       --           
Wired connection 10           5936893f-2701-3a02-a2f4-cf597fac4d5b  ethernet       --           
Wired connection 11           4a6dc456-e01e-3667-beab-20d7daba0f51  ethernet       --           
Wired connection 12           1e514bf3-dc3a-3dc9-ae07-86077254bfd0  ethernet       --           
Wired connection 13           2fcc8186-6ed7-35a4-a831-d41d0a69f484  ethernet       --           
Wired connection 2            97d1bc50-4945-347a-9f73-feab16caa5a4  ethernet       --           
Wired connection 3            97e3255b-e92e-31cc-b935-22e159bbad5c  ethernet       --           
Wired connection 4            c50f87fd-72a4-3936-9d62-caea9991a81a  ethernet       --           
Wired connection 5            ed052a43-17df-38e9-82c1-2dda1189a16c  ethernet       --           
Wired connection 6            653ae237-87c5-3e6a-8a71-dd6a8716adb0  ethernet       --           
Wired connection 7            d9e79c4e-2cb0-3d6b-b1cb-12b47e34dcca  ethernet       --           
Wired connection 8            63aef5bb-4390-3435-b254-f85b83bff12f  ethernet       --           
Wired connection 9            8d5592a4-7d39-3955-84a6-07ab18230717  ethernet       --           
bond0                         fa3a2ece-ea18-4e0b-b504-2818b93bc977  bond           --           
ens1f0v0.481                  52aab5b1-6366-440a-bf9d-dedba7fafb41  vlan           --           
ens3f0v0.481                  b559ffc9-a185-49e7-84c7-259e16d45b19  vlan           --           
ens3f1v0.481                  0e06b134-9904-4b16-a5f7-cf73c87c6cdd  vlan           --

Journalctl:
http://pastebin.test.redhat.com/1104374
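
The two `ip link` listings above show the VFs coming back with new interface indexes (803/804 before, 809/810 after) and the VLAN missing, which indicates the VFs were deleted and recreated. A comparison like that could be automated roughly as follows. This is a sketch with hypothetical helper names; it assumes the input is plain `ip link` text in the format shown in this comment.

```python
import re

# Matches header lines such as "806: ens1f0v0.481@ens1f0v0: <...>".
# Indented continuation lines (link/ether, vf, altname) do not match.
_LINK_RE = re.compile(
    r"^(?P<index>\d+): (?P<name>[^:@\s]+)(?:@[^:\s]+)?:", re.MULTILINE
)


def parse_ifindexes(ip_link_output: str) -> dict:
    """Map interface name to ifindex from `ip link` text output."""
    return {m["name"]: int(m["index"]) for m in _LINK_RE.finditer(ip_link_output)}


def compare_links(before: dict, after: dict):
    """Return (removed, recreated) interface names, each sorted.

    An interface counts as recreated when its name survives but its
    ifindex changed between the two snapshots.
    """
    removed = sorted(set(before) - set(after))
    recreated = sorted(
        name for name in before.keys() & after.keys() if before[name] != after[name]
    )
    return removed, recreated
```

Applied to the listings in this comment, the comparison would report `ens1f0v0.481` as removed and both VFs as recreated, while `ens1f0` and `bond3` keep their indexes.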

Comment 12 Carlos Goncalves 2023-07-10 11:52:13 UTC
Resetting needinfo. Info provided by Evgeny in comment #11.

Comment 13 Gris Ge 2023-07-11 07:47:01 UTC
Hi Evgeny Levin,


Thanks for the test feedback. It looks like bug https://bugzilla.redhat.com/show_bug.cgi?id=2217903, whose fix is due for release on Aug 01.

I have uploaded a new scratch build, NetworkManager-1.42.2-6.sriov.el9, to https://people.redhat.com/fge/bz_2210164/

Can you try again?

Comment 14 Gris Ge 2023-07-13 08:37:53 UTC
While checking the Intel ice SR-IOV NIC, we found kernel bug https://bugzilla.redhat.com/show_bug.cgi?id=2222597 concerning support for max_tx_rate/min_tx_rate.

Comment 18 Carlos Goncalves 2023-09-18 07:08:05 UTC
Evgeny, I'm not sure whether you tested Gris' scratch build linked in comment #13. I recall you discussed it with him on Slack, but I'm not sure whether you tested the build and provided feedback.

Comment 20 errata-xmlrpc 2023-11-07 08:38:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6585

Comment 21 Red Hat Bugzilla 2024-03-07 04:25:50 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

