Bug 2160416
| Summary: | SR-IOV VF not disabled as desired, gets IPv4 and default route via DHCP | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Carlos Goncalves <cgoncalves> | ||||
| Component: | nmstate | Assignee: | Gris Ge <fge> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Mingyu Shi <mshi> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 8.0 | CC: | fbaudin, ferferna, jiji, jishi, network-qe, sfaye, till | ||||
| Target Milestone: | rc | Keywords: | Triaged, ZStream | ||||
| Target Release: | --- | Flags: | pm-rhel: mirror+ | ||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | nmstate-1.4.2-4.el8 | Doc Type: | No Doc Update | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | |||||||
| : | 2169642 2169643 (view as bug list) | Environment: | |||||
| Last Closed: | 2023-05-16 08:26:40 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 2169642, 2169643 | ||||||
| Attachments: | |||||||
Created attachment 1937572 [details]
nmstatectl output log
Just a nit in comment #0: "assisted-based installer" is inaccurate. It should have been "agent-based installer".

Hi Carlos Goncalves,

The PF and VF are different NICs from the point of view of the kernel network stack (SR-IOV is a PCI-level technology). By design, the link status of the PF is not reflected on the VF.

Depending on how you use the VF, you may:

A: Use `state: absent` on the PF, which removes the PF config and resets the SR-IOV config, so the VF gets removed.
B: Set the VF interface to `state: down`.

If neither of the above fits your use case, please elaborate on why you want the PF link state to be reflected on the VF in your setup.

Thank you!

My bad. I read the desired state wrong. You are setting `state: down` for the VF. Let me investigate more.

I have sent the patch upstream: https://github.com/nmstate/nmstate/pull/2215
This is purely from log investigation. Still confirming whether it fixes the reporter's issue or not.

Problem fixed by four patches:
* https://github.com/nmstate/nmstate/pull/2215
* https://github.com/nmstate/nmstate/pull/2221
* https://github.com/nmstate/nmstate/pull/2222
* https://github.com/nmstate/nmstate/pull/2223

The root cause analysis result:

# Current state
* `eno1` is an SR-IOV capable NIC.
* `eno1` has SR-IOV disabled and is assigned to the OVS bridge `br-ex`.
* `br-ex` has an interface of the same name that provides the default gateway.
* `NetworkManager-config-server` is not installed, hence NetworkManager automatically sets ipv6.auto and ipv4.auto on any newly plugged-in NIC.

# Desired state
* Enable SR-IOV on `eno1` with total_vfs 1.
* Want the VF `eno1v1` enabled with ipv6 and ipv4 disabled.

# Problems
* Once SR-IOV is enabled on `eno1`, `eno1v1` gets a default gateway via DHCP, which breaks the `br-ex` gateway and causes the nmstate handler to lose its network connection.
* The network state triggers a rollback once the default gateway is gone.

# Fixes
* Allow enabling SR-IOV and configuring the VF in a single desired-state YAML.
* Do not touch the OVS port of `eno1` when changing SR-IOV settings, because this would also break the `br-ex` gateway connection.

Carlos has tested with a scratch-build RPM. This is the YAML that solves the use case:
```yaml
interfaces:
- ethernet:
    sr-iov:
      total-vfs: 1
  name: eno1
  type: ethernet
- name: eno1v0
  type: ethernet
  state: up
  ipv4:
    enabled: false
  ipv6:
    enabled: false
```
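As a rough illustration, a desired state of this shape could also be generated programmatically, for example when the PF name or the VF count differs per host. The sketch below only builds the dictionary and prints it as JSON. The helper name `build_sriov_state` is hypothetical, and the `<pf>v<index>` VF naming is an assumption taken from the interface names seen in this report, not something nmstate guarantees.

```python
import json


def build_sriov_state(pf_name: str, total_vfs: int) -> dict:
    """Build a desired state that enables SR-IOV on a PF and brings up
    each VF with IPv4 and IPv6 disabled (hypothetical helper)."""
    interfaces = [
        {
            "name": pf_name,
            "type": "ethernet",
            "ethernet": {"sr-iov": {"total-vfs": total_vfs}},
        }
    ]
    for index in range(total_vfs):
        interfaces.append(
            {
                # Assumption: VFs are named <pf>v<index>, as in this report.
                "name": f"{pf_name}v{index}",
                "type": "ethernet",
                "state": "up",
                "ipv4": {"enabled": False},
                "ipv6": {"enabled": False},
            }
        )
    return {"interfaces": interfaces}


if __name__ == "__main__":
    # Print the state for one VF on eno1, matching the YAML above.
    print(json.dumps(build_sriov_state("eno1", 1), indent=2))
```

The printed JSON is meant to be saved to a file and reviewed before applying. Whether a given nmstate build accepts it unchanged has not been verified here.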
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2772
Description of problem:
Nmstate does not set SR-IOV VF state to down, even though it suggests that the desired state was applied and exits without an error code. With the VF up, it may eventually get an IP address and a default route configured via DHCP. The problematic part in my lab is the default route, which overlaps with another default route. OCP deployments will fail. Logging in to the node and manually removing the default route via the VF interface is a workaround to continue OCP deployment.

Version-Release number of selected component (if applicable):
mstate-1.3.3-1.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. The main scenario is an assisted-based installer but it can be simply reproduced by creating a file with the desired state:

    interfaces:
    - name: eno1v1
      type: ethernet
      state: down
      ipv4:
        dhcp: false
        enabled: false
      ipv6:
        dhcp: false
        enabled: false

2. nmstatectl apply eno1v1-state-down.yam

Actual results:
VF interface (eno1v1) is up, gets an IPv4 and default route via DHCP.
The following output is from an assisted-based installer deployment, as soon as the node boots up from the agent ISO (pre-installation):

    [root@cnfdc8-worker-1 system-connections]# ls
    bond0.nmconnection  eno1.nmconnection  eno1v0.nmconnection
    [root@cnfdc8-worker-1 system-connections]# nmcli
    eno1v1: connected to Wired connection 14
            "Mellanox MT27710"
            ethernet (mlx5_core), 2E:52:81:48:5F:97, hw, mtu 1500
            ip4 default, ip6 default
            inet4 10.19.17.169/23
            route4 10.19.16.0/23 metric 108
            route4 default via 10.19.17.254 metric 108
            inet6 2620:52:0:1310:4e4a:d1d6:89b0:1085/64
            inet6 fe80::4e29:8245:6810:537d/64
            route6 fe80::/64 metric 1024
            route6 2620:52:0:1310::/64 metric 108
            route6 default via fe80::200:5eff:fe00:201 metric 108
            route6 default via fe80::9e8a:cb00:6704:ab00 metric 108
            route6 default via fe80::9e8a:cb00:6704:9200 metric 108

    cni-podman0: connected (externally) to cni-podman0
            "cni-podman0"
            bridge, 66:71:3B:66:27:6E, sw, mtu 1500
            inet4 10.88.0.1/16
            route4 10.88.0.0/16 metric 0
            inet6 fe80::6471:3bff:fe66:276e/64
            route6 fe80::/64 metric 256

    bond0: connected to bond0
            "bond0"
            bond, 66:BA:78:DF:91:03, sw, mtu 1500
            inet4 10.19.16.57/23
            route4 default via 10.19.17.254 metric 300
            route4 10.19.16.0/23 metric 300

    eno1: connected to eno1
            "Mellanox MT27710"
            ethernet (mlx5_core), 0C:42:A1:55:F3:06, hw, mtu 1500

    eno1v0: connected to eno1v0
            "Mellanox MT27710"
            ethernet (mlx5_core), 66:BA:78:DF:91:03, hw, mtu 1500
            master bond0

    [root@cnfdc8-worker-1 system-connections]# ip r
    default via 10.19.17.254 dev eno1v1 proto dhcp src 10.19.17.169 metric 108 <---------------------------
    default via 10.19.17.254 dev bond0 proto static metric 300
    10.19.16.0/23 dev eno1v1 proto kernel scope link src 10.19.17.169 metric 108
    10.19.16.0/23 dev bond0 proto kernel scope link src 10.19.16.57 metric 300
    10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 linkdown

Expected results:
VF should be disabled (state off, no IPv4, no routes).
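The conflict visible in the `ip r` output above (two IPv4 default routes, with the DHCP-learned one on the VF preferred because of its lower metric) could be checked for in a script. The following is a minimal sketch that parses plain `ip r` text. The function names are hypothetical, the sample lines are copied from this report, and it assumes the one-route-per-line text format shown here rather than any stable machine-readable output.

```python
from collections import defaultdict

# Route lines copied from the `ip r` output in this report.
SAMPLE = """\
default via 10.19.17.254 dev eno1v1 proto dhcp src 10.19.17.169 metric 108
default via 10.19.17.254 dev bond0 proto static metric 300
10.19.16.0/23 dev bond0 proto kernel scope link src 10.19.16.57 metric 300
"""


def _value_after(fields, keyword):
    """Return the token that follows `keyword`, or None if it is absent."""
    if keyword in fields:
        pos = fields.index(keyword)
        if pos + 1 < len(fields):
            return fields[pos + 1]
    return None


def default_routes(ip_route_output):
    """Map each device to the metrics of its IPv4 default routes."""
    routes = defaultdict(list)
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        dev = _value_after(fields, "dev")
        metric = _value_after(fields, "metric")
        if dev is not None:
            # A route printed without a metric is treated as metric 0.
            routes[dev].append(int(metric) if metric is not None else 0)
    return dict(routes)


def preferred_device(routes):
    """Return the device holding the lowest-metric default route."""
    if not routes:
        return None
    return min(routes, key=lambda dev: min(routes[dev]))


if __name__ == "__main__":
    found = default_routes(SAMPLE)
    print(found)
    print(preferred_device(found))
```

More than one key in the returned mapping indicates competing default routes, which is the condition that broke the deployment described here.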
Additional info:
VF netdevice can manually be disabled:

    # ip link show eno1v1
    18: eno1v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
        link/ether 8e:9e:e7:11:de:a5 brd ff:ff:ff:ff:ff:ff
    # ip link set down eno1v1
    # ip link show eno1v1
    18: eno1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
        link/ether 8e:9e:e7:11:de:a5 brd ff:ff:ff:ff:ff:ff
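To confirm from a script that the VF really is administratively down after such a change, the flag list in the `ip link show` output can be inspected. This is a small sketch under the assumption that the first line of the output carries the `<...>` flag list as shown in this report. The function names are hypothetical, and the two sample lines are taken from the output above.

```python
import re

# First lines of `ip link show eno1v1` before and after `ip link set down`.
BEFORE = ("18: eno1v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq "
          "state UP mode DEFAULT group default qlen 1000")
AFTER = ("18: eno1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq "
         "state DOWN mode DEFAULT group default qlen 1000")


def link_flags(ip_link_line):
    """Return the set of flags found between '<' and '>'."""
    match = re.search(r"<([^>]*)>", ip_link_line)
    if match is None:
        raise ValueError("no flag list found in input")
    return {flag for flag in match.group(1).split(",") if flag}


def is_admin_up(ip_link_line):
    """True if the interface carries the UP flag (distinct from LOWER_UP)."""
    return "UP" in link_flags(ip_link_line)


if __name__ == "__main__":
    print(is_admin_up(BEFORE))
    print(is_admin_up(AFTER))
```

Matching on the exact `UP` flag rather than searching for the substring avoids confusing it with `LOWER_UP`, which reports carrier state rather than administrative state.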