Bug 2160416 - SR-IOV VF not disabled as desired, gets IPv4 and default route via DHCP
Summary: SR-IOV VF not disabled as desired, gets IPv4 and default route via DHCP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: nmstate
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Gris Ge
QA Contact: Mingyu Shi
URL:
Whiteboard:
Depends On:
Blocks: 2169642 2169643
Reported: 2023-01-12 10:54 UTC by Carlos Goncalves
Modified: 2023-10-16 03:21 UTC

Fixed In Version: nmstate-1.4.2-4.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 2169642 2169643
Environment:
Last Closed: 2023-05-16 08:26:40 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
nmstatectl output log (116.12 KB, text/plain)
2023-01-12 10:58 UTC, Carlos Goncalves


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | nmstate nmstate pull 2215 | 0 | None | open | [nmstate-1.4] nm: Fix error when mark unmanaged interface as down | 2023-02-02 03:17:37 UTC
Red Hat Issue Tracker | NMT-116 | 0 | None | None | None | 2023-01-22 14:16:39 UTC
Red Hat Issue Tracker | RHELPLAN-144929 | 0 | None | None | None | 2023-01-12 10:56:42 UTC
Red Hat Product Errata | RHBA-2023:2772 | 0 | None | None | None | 2023-05-16 08:26:53 UTC

Description Carlos Goncalves 2023-01-12 10:54:42 UTC
Description of problem:
Nmstate does not set SR-IOV VF state to down, even though it suggests that the desired state was applied and exits without an error code.

With the VF up, it may eventually get an IP address and a default route configured via DHCP. The problematic part in my lab is the default route, which overlaps with another default route. OCP deployments will fail.

Logging in to the node and manually removing the default route via the VF interface is a workaround to continue OCP deployment.


Version-Release number of selected component (if applicable):
nmstate-1.3.3-1.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. The main scenario is an assisted-based installer but it can be simply reproduced by creating a file with the desired state:

  interfaces:
    - name: eno1v1
      type: ethernet
      state: down
      ipv4:
        dhcp: false
        enabled: false
      ipv6:
        dhcp: false
        enabled: false

2. nmstatectl apply eno1v1-state-down.yaml

Actual results:

VF interface (eno1v1) is up, gets an IPv4 and default route via DHCP.
The following output is from an assisted-based installer deployment, as soon as the node boots up from the agent ISO (pre-installation):

[root@cnfdc8-worker-1 system-connections]# ls
bond0.nmconnection  eno1.nmconnection  eno1v0.nmconnection

[root@cnfdc8-worker-1 system-connections]# nmcli
eno1v1: connected to Wired connection 14
        "Mellanox MT27710"
        ethernet (mlx5_core), 2E:52:81:48:5F:97, hw, mtu 1500
        ip4 default, ip6 default
        inet4 10.19.17.169/23
        route4 10.19.16.0/23 metric 108
        route4 default via 10.19.17.254 metric 108
        inet6 2620:52:0:1310:4e4a:d1d6:89b0:1085/64
        inet6 fe80::4e29:8245:6810:537d/64
        route6 fe80::/64 metric 1024
        route6 2620:52:0:1310::/64 metric 108
        route6 default via fe80::200:5eff:fe00:201 metric 108
        route6 default via fe80::9e8a:cb00:6704:ab00 metric 108
        route6 default via fe80::9e8a:cb00:6704:9200 metric 108

cni-podman0: connected (externally) to cni-podman0
        "cni-podman0"
        bridge, 66:71:3B:66:27:6E, sw, mtu 1500
        inet4 10.88.0.1/16
        route4 10.88.0.0/16 metric 0
        inet6 fe80::6471:3bff:fe66:276e/64
        route6 fe80::/64 metric 256

bond0: connected to bond0
        "bond0"
        bond, 66:BA:78:DF:91:03, sw, mtu 1500
        inet4 10.19.16.57/23
        route4 default via 10.19.17.254 metric 300
        route4 10.19.16.0/23 metric 300

eno1: connected to eno1
        "Mellanox MT27710"
        ethernet (mlx5_core), 0C:42:A1:55:F3:06, hw, mtu 1500

eno1v0: connected to eno1v0
        "Mellanox MT27710"
        ethernet (mlx5_core), 66:BA:78:DF:91:03, hw, mtu 1500
        master bond0

[root@cnfdc8-worker-1 system-connections]# ip r
default via 10.19.17.254 dev eno1v1 proto dhcp src 10.19.17.169 metric 108   <---------------------------
default via 10.19.17.254 dev bond0 proto static metric 300
10.19.16.0/23 dev eno1v1 proto kernel scope link src 10.19.17.169 metric 108
10.19.16.0/23 dev bond0 proto kernel scope link src 10.19.16.57 metric 300
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 linkdown
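
The overlap is visible in the `ip r` output above: two default routes point at the same gateway, and the one with the lower metric (the DHCP route on the VF) is preferred. A minimal Python sketch of that comparison, assuming the plain-text `ip route` format shown in this report. The helper names `default_routes` and `preferred_default` are hypothetical and are not part of nmstate or iproute2.

```python
import re

# Sample copied from the `ip r` output in this report.
IP_ROUTE_OUTPUT = """\
default via 10.19.17.254 dev eno1v1 proto dhcp src 10.19.17.169 metric 108
default via 10.19.17.254 dev bond0 proto static metric 300
10.19.16.0/23 dev eno1v1 proto kernel scope link src 10.19.17.169 metric 108
10.19.16.0/23 dev bond0 proto kernel scope link src 10.19.16.57 metric 300
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 linkdown
"""


def default_routes(text):
    """Return a list of (device, metric) for every default route line."""
    routes = []
    for line in text.splitlines():
        if not line.startswith("default "):
            continue
        dev = re.search(r"\bdev (\S+)", line)
        metric = re.search(r"\bmetric (\d+)", line)
        if dev:
            # A route printed without a metric is treated as metric 0.
            routes.append((dev.group(1), int(metric.group(1)) if metric else 0))
    return routes


def preferred_default(text):
    """Return the device of the lowest-metric default route, or None."""
    routes = default_routes(text)
    return min(routes, key=lambda route: route[1])[0] if routes else None


if __name__ == "__main__":
    print(default_routes(IP_ROUTE_OUTPUT))
    print(preferred_default(IP_ROUTE_OUTPUT))
```

With the sample above, the preferred default route belongs to `eno1v1` (metric 108) rather than `bond0` (metric 300), which matches the failure described in this report.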

Expected results:

VF should be disabled (state off, no IPv4, no routes).


Additional info:

VF netdevice can manually be disabled:

# ip link show eno1v1
18: eno1v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 8e:9e:e7:11:de:a5 brd ff:ff:ff:ff:ff:ff
# ip link set down eno1v1
# ip link show eno1v1
18: eno1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether 8e:9e:e7:11:de:a5 brd ff:ff:ff:ff:ff:ff
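
To check the same thing from a script, the flag list in angle brackets can be inspected: the `UP` flag is present before `ip link set down` and absent afterwards. A small sketch, assuming the `ip link show` line format shown above. `link_flags` and `is_admin_up` are hypothetical helper names.

```python
import re

# First lines of the two `ip link show eno1v1` outputs in this report.
BEFORE = ("18: eno1v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq "
          "state UP mode DEFAULT group default qlen 1000")
AFTER = ("18: eno1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq "
         "state DOWN mode DEFAULT group default qlen 1000")


def link_flags(ip_link_output):
    """Return the set of flags found between '<' and '>'."""
    match = re.search(r"<([^>]*)>", ip_link_output)
    if not match or not match.group(1):
        return set()
    return set(match.group(1).split(","))


def is_admin_up(ip_link_output):
    """True if the interface carries the UP flag (LOWER_UP alone does not count)."""
    return "UP" in link_flags(ip_link_output)


if __name__ == "__main__":
    print(is_admin_up(BEFORE))
    print(is_admin_up(AFTER))
```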

Comment 1 Carlos Goncalves 2023-01-12 10:58:01 UTC
Created attachment 1937572
nmstatectl output log

Comment 2 Carlos Goncalves 2023-01-13 10:42:28 UTC
Just a nit in comment #0, "assisted-based installer" is inaccurate. It should have been "agent-based installer".

Comment 5 Gris Ge 2023-02-01 03:30:20 UTC
Hi Carlos Goncalves,

The PF and VF are different NICs from the point of view of the kernel network stack (SR-IOV is a PCI-level technology). By design, the link status of the PF is not reflected on the VF. Depending on how you use the VF, you may:
 A: Use `state: absent` on the PF, which removes the PF's config and resets the SR-IOV config, so the VF gets removed.
 B: Set the VF interface to `state: down`.

If neither of the above fits your use case, please elaborate on why you want the PF link state to be reflected on the VF in your setup.

Thank you!

Comment 6 Gris Ge 2023-02-01 03:36:10 UTC
My bad. I misread the desired state. You are setting `state: down` on the VF. Let me investigate more.

Comment 7 Gris Ge 2023-02-02 03:17:37 UTC
I have sent the patch to upstream: https://github.com/nmstate/nmstate/pull/2215
This is purely from log investigation.

Still confirming whether it fixes the reporter's issue.

Comment 8 Gris Ge 2023-02-07 16:01:48 UTC
Problem fixed by four patches:
 * https://github.com/nmstate/nmstate/pull/2215
 * https://github.com/nmstate/nmstate/pull/2221
 * https://github.com/nmstate/nmstate/pull/2222
 * https://github.com/nmstate/nmstate/pull/2223

Root cause analysis result:

# Current state
 * `eno1` is an SR-IOV capable NIC.
 * `eno1` has SR-IOV disabled and is assigned to the OVS bridge `br-ex`.
 * `br-ex` has an interface of the same name that provides the default gateway.
 * `NetworkManager-config-server` is not installed, so NetworkManager automatically sets ipv6.auto and ipv4.auto on any newly plugged-in NIC.

# Desired state
 * Enable SR-IOV on `eno1` with total_vfs 1.
 * Have the VF `eno1v1` enabled with IPv6 and IPv4 disabled.

# Problems:
 * Once SR-IOV is enabled on `eno1`, `eno1v1` gets a default gateway via DHCP, which breaks the `br-ex` gateway and causes the nmstate handler to lose its network connection.
 * The network state is rolled back once the default gateway is gone.

# Fixes:
 * Allow enabling SR-IOV and configuring the VF in a single desired state YAML.
 * Do not touch the OVS port of `eno1` when changing SR-IOV settings, because doing so would also break the `br-ex` gateway connection.

Carlos has tested with a scratch-build RPM.

Comment 9 Gris Ge 2023-02-07 16:06:36 UTC
This is the YAML that solves the use case:

```yaml
interfaces:
- ethernet:
    sr-iov:
      total-vfs: 1
  name: eno1
  type: ethernet
- name: eno1v0
  type: ethernet
  state: up
  ipv4:
    enabled: false
  ipv6:
    enabled: false
```
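
For setups with more than one VF, the same desired state can be generated instead of written by hand. A minimal Python sketch that builds the structure above as a dictionary and prints it as JSON. Two assumptions are made here: that VFs are named `<pf>v<index>` (as with `eno1v0` in this report; actual names depend on the system's device naming), and that the resulting JSON is an acceptable input for `nmstatectl apply`. The function name `sriov_desired_state` is hypothetical.

```python
import json


def sriov_desired_state(pf, total_vfs):
    """Build a desired state enabling SR-IOV on `pf` with IP disabled on each VF."""
    interfaces = [{
        "name": pf,
        "type": "ethernet",
        "ethernet": {"sr-iov": {"total-vfs": total_vfs}},
    }]
    for index in range(total_vfs):
        interfaces.append({
            # Assumed VF naming scheme: <pf>v<index>.
            "name": f"{pf}v{index}",
            "type": "ethernet",
            "state": "up",
            "ipv4": {"enabled": False},
            "ipv6": {"enabled": False},
        })
    return {"interfaces": interfaces}


if __name__ == "__main__":
    print(json.dumps(sriov_desired_state("eno1", 1), indent=2))
```

Calling `sriov_desired_state("eno1", 1)` yields the same two interface entries as the YAML in this comment.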

Comment 22 errata-xmlrpc 2023-05-16 08:26:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2772

