Bug 2005240
Summary: | Fail to attach linux bond interface to ovs-bridge | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | Radim Hrazdil <rhrazdil>
Component: | nmstate | Assignee: | Gris Ge <fge>
Status: | CLOSED ERRATA | QA Contact: | Mingyu Shi <mshi>
Severity: | medium | Priority: | high
Version: | 8.5 | CC: | acabral, bgalvani, bstinson, ferferna, fge, jiji, jishi, jwboyer, lrintel, mshi, network-qe, rkhan, sfaye, sukulkar, till
Target Milestone: | rc | Keywords: | Triaged, ZStream
Target Release: | 8.6 | Doc Type: | No Doc Update
Hardware: | Unspecified | OS: | Unspecified
Last Closed: | 2022-11-08 09:17:50 UTC | Type: | Bug
Bug Blocks: | 2128233 | |
Hi Beniamino,

Could you take a quick look at the attached NM logs for this error:

reason=<enum NM_ACTIVE_CONNECTION_STATE_REASON_DEVICE_DISCONNECTED of type NM.ActiveConnectionStateReason><enum NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED of type NM.DeviceStateReason>

The Rust-based nmstate 2.x in RHEL 9.x does not have this problem because it retries on activation failure. In nmstate 1.x on RHEL 8.x, getting such a retry to work properly is far too complex, so I will wait for the NM team to fix this issue on their end.

Assigning to the NetworkManager component in the hope that they can fix this `NM_ACTIVE_CONNECTION_STATE_REASON_DEVICE_DISCONNECTED` failure.

To reproduce this problem on RHEL 8:

sudo ip netns add tmp
sudo ip link add eth2 type veth peer name eth2peer
sudo ip link add eth1 type veth peer name eth1peer
sudo ip link set eth1 up
sudo ip link set eth2 up
sudo ip link set eth1peer netns tmp
sudo ip link set eth2peer netns tmp
sudo ip netns exec tmp ip link set eth1peer up
sudo ip netns exec tmp ip link set eth2peer up
sudo nmcli device set eth1 managed yes
sudo nmcli device set eth2 managed yes
echo 'interfaces:
- link-aggregation:
    mode: active-backup
    options:
      miimon: 140
      primary: eth1
    port:
    - eth1
    - eth2
  name: bond101
  state: up
  type: bond
- bridge:
    options:
      stp: false
    port:
    - name: bond101
  name: br22
  state: up
  type: ovs-bridge' | sudo nmstatectl set -

Also revoking dev_ack, as we are not sure whether this can be fixed in NM.

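The comment above notes that nmstate 2.x avoids this failure by retrying on activation failure. The following is only a minimal sketch of that general idea, not the nmstate implementation: `ActivationError`, `activate_with_retry`, and the `activate` callable are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


class ActivationError(Exception):
    """Hypothetical stand-in for a profile activation failure
    (for example one reported with reason "dependency-failed")."""


def activate_with_retry(
    activate: Callable[[], None],
    retries: int = 3,
    delay: float = 0.0,
) -> int:
    """Call activate() until it succeeds and return the attempts used.

    After the first failure, up to `retries` further attempts are made.
    The last ActivationError is re-raised once they are exhausted.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            activate()
            return attempt
        except ActivationError:
            if attempt > retries:
                raise
            # Give the controller time to settle before trying again.
            time.sleep(delay)
```

A retry like this hides the race rather than removing it: the port activation simply succeeds on a later attempt, once the controller has finished deactivating and reactivating.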
Hi,

the desired configuration is:

        br22 (ovs-bridge)
              ^
              |
   ovs-port-bond101 (ovs-port)
              ^
              |
        bond101 (bond)
          ^       ^
          |       |
        eth1     eth2

and the sequence of operations performed by nmstate is:

    [1631736148.8830] audit: op="checkpoint-create" arg="/org/freedesktop/NM/Checkpoint/15" pid=377695 uid=0 result="success"
    [1631736148.9020] audit: op="connection-add" uuid="2baca9ad-a992-476b-880c-6f377246403b" name="bond101"
    [1631736148.9037] audit: op="connection-add" uuid="ae3e6fa3-f8f9-447b-b940-15184c8dfddd" name="br22"
    [1631736148.9081] audit: op="connection-add" uuid="7223b3ae-2270-48ea-b960-0f926b2f1e03" name="ovs-port-bond101"
(1) [1631736148.9128] audit: op="connection-activate" uuid="ae3e6fa3-f8f9-447b-b940-15184c8dfddd" name="br22"
    [1631736148.9394] audit: op="connection-activate" uuid="2baca9ad-a992-476b-880c-6f377246403b" name="bond101"
(2) [1631736149.0126] audit: op="connection-activate" uuid="7223b3ae-2270-48ea-b960-0f926b2f1e03" name="ovs-port-bond101"
(3) [1631736149.0670] audit: op="connection-activate" uuid="cf81a923-5999-4a21-bd7e-dc96c7977451" name="eth1"
    [1631736149.0881] audit: op="connection-activate" uuid="cdb090be-2087-43e6-8e54-c973a2393b65" name="eth2"
    [1631736149.1707] audit: op="checkpoint-rollback" arg="/org/freedesktop/NM/Checkpoint/15" pid=377695 uid=0 result="success"

At (1), br22 is activated manually, and since it has connection.autoconnect-slaves=yes, ovs-port-bond101 gets activated as well (all the remaining ports are also activated recursively).

Then at (2), nmstate asks to activate ovs-port-bond101 again. This disconnects the device and causes the disconnection of bond101, which enters the deactivating state.

At (3), eth1 is activated. When the device enters the prepare state, the controller interface (bond101) is still deactivating, and thus the activation fails with reason "dependency-failed".

I don't think this is easily solvable by NM, as there are activations interrupting each other. In particular, eth1 gets activated when the controller (bond101) has not settled yet and is still deactivating. Can nmstate wait until the controller is ready before starting a new activation?

Sure. Let me try from my end.

Patch sent to upstream: https://github.com/nmstate/nmstate/pull/1963

Previously, we treated a controller profile in the `IP_CONFIG` state as activated, which led to the race described above. Since an OVS bridge and an OVS port are not allowed to hold IP addresses, we now wait for their activation to reach `NM.ActiveConnectionState.ACTIVATED`.

*** Bug 1966478 has been marked as a duplicate of this bug. ***

Verified with:
nmstate-1.3.2-1.el8.x86_64
nispor-1.2.7-1.el8.x86_64
NetworkManager-1.39.12-1.el8.x86_64
openvswitch2.15-2.15.0-114.el8fdp.x86_64

Ran 50 times; all runs passed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7465

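The fix described in this thread (no longer treating OVS bridge and OVS port profiles in `IP_CONFIG` as activated) can be pictured with the sketch below. It is an illustration under stated assumptions, not the code from the upstream pull request: `AcState` and `DevState` are local stand-ins for the libnm enums, and `is_activated`, `NO_IP_TYPES`, and the `is_controller` parameter are hypothetical names.

```python
from enum import Enum, IntEnum


class AcState(Enum):
    """Local stand-in for NM.ActiveConnectionState (illustration only)."""
    ACTIVATING = "activating"
    ACTIVATED = "activated"
    DEACTIVATING = "deactivating"
    DEACTIVATED = "deactivated"


class DevState(IntEnum):
    """Local stand-in for a subset of NM.DeviceState, ordered by progress."""
    PREPARE = 40
    CONFIG = 50
    IP_CONFIG = 70
    ACTIVATED = 100
    DEACTIVATING = 110


# Interface types that cannot hold IP addresses, so reaching IP_CONFIG
# says nothing useful about whether they have settled.
NO_IP_TYPES = frozenset({"ovs-bridge", "ovs-port"})


def is_activated(iface_type: str, is_controller: bool,
                 ac_state: AcState, dev_state: DevState) -> bool:
    """Decide whether a profile may be treated as activated."""
    if ac_state is AcState.ACTIVATED:
        return True
    if ac_state is not AcState.ACTIVATING:
        return False
    if iface_type in NO_IP_TYPES:
        # New behaviour: wait for the full ACTIVATED state.
        return False
    # Old shortcut, kept for controllers that can hold IP configuration.
    return is_controller and (
        DevState.IP_CONFIG <= dev_state <= DevState.ACTIVATED
    )
```

With this rule, ports such as eth1 are only activated after br22 and ovs-port-bond101 have fully settled, which is the ordering requested in the analysis above.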
Created attachment 1823747
NetworkManager+nmstatectl logs

Description of problem:
When attaching a linux bond interface to an ovs-bridge, nmstatectl sometimes fails with an error:

libnmstate.error.NmstateLibnmError: Activate profile uuid:cf81a923-5999-4a21-bd7e-dc96c7977451 iface:eth1 type: ethernet failed: reason=<enum NM_ACTIVE_CONNECTION_STATE_REASON_DEVICE_DISCONNECTED of type NM.ActiveConnectionStateReason><enum NM_DEVICE_STATE_REASON_DEPENDENCY_FAILED of type NM.DeviceStateReason>

Version-Release number of selected component (if applicable):
NetworkManager-1.32.10-2.el8.x86_64
NetworkManager-ovs-1.32.10-2.el8.x86_64
NetworkManager-config-server-1.32.8-1.el8.noarch
NetworkManager-libnm-1.32.10-2.el8.x86_64
NetworkManager-team-1.32.10-2.el8.x86_64
NetworkManager-tui-1.32.10-2.el8.x86_64
python3-libnmstate-1.1.0-3.el8.noarch
nmstate-1.1.0-3.el8.noarch
openvswitch2.15-2.15.0-35.el8s.x86_6

How reproducible:
50%

Steps to Reproduce:
1. cat ds.yaml
interfaces:
- link-aggregation:
    mode: active-backup
    options:
      miimon: 140
      primary: eth1
    port:
    - eth1
    - eth2
  name: bond101
  state: up
  type: bond
- bridge:
    options:
      stp: false
    port:
    - name: bond101
  name: br22
  state: up
  type: ovs-bridge

2. nmstatectl set ds.yaml

Additional info:
NetworkManager trace logs and nmstatectl output attached

Current state before applying the desired state:
[vagrant@node02 ~]$ nmstatectl show
Unhandled IFLA_INFO_DATA for iface type Other("IpTun")
2021-09-15 19:38:09,623 root DEBUG NetworkManager version 1.32.10
2021-09-15 19:38:09,626 root DEBUG Async action: Retrieve applied config: ethernet eth0 started
2021-09-15 19:38:09,627 root DEBUG Async action: Retrieve applied config: ethernet eth1 started
2021-09-15 19:38:09,627 root DEBUG Async action: Retrieve applied config: ethernet eth2 started
2021-09-15 19:38:09,630 root DEBUG Async action: Retrieve applied config: ethernet eth0 finished
2021-09-15 19:38:09,631 root DEBUG Async action: Retrieve applied config: ethernet eth1 finished
2021-09-15 19:38:09,631 root DEBUG Async action: Retrieve applied config: ethernet eth2 finished
2021-09-15 19:38:09,634 root DEBUG Interface ethernet.eth0 found. Merging the interface information.
2021-09-15 19:38:09,634 root DEBUG Interface ethernet.eth1 found. Merging the interface information.
2021-09-15 19:38:09,634 root DEBUG Interface ethernet.eth2 found. Merging the interface information.
Unhandled IFLA_INFO_DATA for iface type Other("IpTun")
Unhandled IFLA_INFO_DATA for iface type Other("IpTun")
---
dns-resolver:
  config: {}
  running:
    search: []
    server:
    - 192.168.66.2
    - fe80::cd:2dff:feda:21a%eth0
    - fd00::1
route-rules:
  config: []
routes:
  config:
  - destination: fd10:244::8c40/128
    metric: 1024
    next-hop-address: '::'
    next-hop-interface: cali100ec759187
    table-id: 254
  - destination: fd10:244::bac0/122
    metric: 1024
    next-hop-address: fd00::103
    next-hop-interface: eth0
    table-id: 254
  - destination: fd10:244::c480/122
    metric: 1024
    next-hop-address: fd00::101
    next-hop-interface: eth0
    table-id: 254
  - destination: fd10:244::f8c0/122
    metric: 1024
    next-hop-address: fd00::104
    next-hop-interface: eth0
    table-id: 254
  - destination: 10.244.140.64/26
    metric: 0
    next-hop-address: 0.0.0.0
    next-hop-interface: ''
    table-id: 254
  - destination: 10.244.140.65/32
    metric: 0
    next-hop-address: 0.0.0.0
    next-hop-interface: cali100ec759187
    table-id: 254
  - destination: 10.244.186.192/26
    metric: 0
    next-hop-address: 192.168.66.103
    next-hop-interface: tunl0
    table-id: 254
  - destination: 10.244.196.128/26
    metric: 0
    next-hop-address: 192.168.66.101
    next-hop-interface: tunl0
    table-id: 254
  - destination: 10.244.248.192/26
    metric: 0
    next-hop-address: 192.168.66.104
    next-hop-interface: tunl0
    table-id: 254
  running:
  - destination: fd00::102/128
    metric: 103
    next-hop-address: '::'
    next-hop-interface: eth0
    table-id: 254
  - destination: fd00::/64
    metric: 103
    next-hop-address: '::'
    next-hop-interface: eth0
    table-id: 254
  - destination: fd10:244::8c40/128
    metric: 1024
    next-hop-address: '::'
    next-hop-interface: cali100ec759187
    table-id: 254
  - destination: fd10:244::bac0/122
    metric: 1024
    next-hop-address: fd00::103
    next-hop-interface: eth0
    table-id: 254
  - destination: fd10:244::c480/122
    metric: 1024
    next-hop-address: fd00::101
    next-hop-interface: eth0
    table-id: 254
  - destination: fd10:244::f8c0/122
    metric: 1024
    next-hop-address: fd00::104
    next-hop-interface: eth0
    table-id: 254
  - destination: fe80::/64
    metric: 103
    next-hop-address: '::'
    next-hop-interface: eth0
    table-id: 254
  - destination: fe80::/64
    metric: 256
    next-hop-address: '::'
    next-hop-interface: cali100ec759187
    table-id: 254
  - destination: ::/0
    metric: 103
    next-hop-address: fe80::cd:2dff:feda:21a
    next-hop-interface: eth0
    table-id: 254
  - destination: 0.0.0.0/0
    metric: 103
    next-hop-address: 192.168.66.2
    next-hop-interface: eth0
    table-id: 254
  - destination: 10.244.140.64/26
    metric: 0
    next-hop-address: 0.0.0.0
    next-hop-interface: ''
    table-id: 254
  - destination: 10.244.140.65/32
    metric: 0
    next-hop-address: 0.0.0.0
    next-hop-interface: cali100ec759187
    table-id: 254
  - destination: 10.244.186.192/26
    metric: 0
    next-hop-address: 192.168.66.103
    next-hop-interface: tunl0
    table-id: 254
  - destination: 10.244.196.128/26
    metric: 0
    next-hop-address: 192.168.66.101
    next-hop-interface: tunl0
    table-id: 254
  - destination: 10.244.248.192/26
    metric: 0
    next-hop-address: 192.168.66.104
    next-hop-interface: tunl0
    table-id: 254
  - destination: 192.168.66.0/24
    metric: 103
    next-hop-address: 0.0.0.0
    next-hop-interface: eth0
    table-id: 254
interfaces:
- name: cali100ec759187
  type: veth
  state: up
  accept-all-mac-addresses: false
  ethtool:
    feature:
      highdma: true
      rx-checksum: true
      rx-gro: false
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      rx-vlan-hw-parse: true
      rx-vlan-stag-hw-parse: true
      tx-checksum-ip-generic: true
      tx-checksum-sctp: true
      tx-generic-segmentation: true
      tx-gre-csum-segmentation: true
      tx-gre-segmentation: true
      tx-ipxip4-segmentation: true
      tx-ipxip6-segmentation: true
      tx-nocache-copy: false
      tx-scatter-gather-fraglist: true
      tx-sctp-segmentation: true
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: true
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
      tx-udp_tnl-csum-segmentation: true
      tx-udp_tnl-segmentation: true
      tx-vlan-hw-insert: true
      tx-vlan-stag-hw-insert: true
  ipv4:
    enabled: false
    address: []
  ipv6:
    enabled: true
    address:
    - ip: fe80::ecee:eeff:feee:eeee
      prefix-length: 64
  mac-address: EE:EE:EE:EE:EE:EE
  mtu: 1480
  veth:
    peer: eth2
- name: eth0
  type: ethernet
  state: up
  accept-all-mac-addresses: false
  ethernet:
    auto-negotiation: false
  ethtool:
    feature:
      rx-gro: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      tx-checksum-ip-generic: true
      tx-generic-segmentation: true
      tx-nocache-copy: false
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: false
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
    ring:
      rx: 256
      tx: 256
  ipv4:
    enabled: true
    address:
    - ip: 192.168.66.102
      prefix-length: 24
    auto-dns: true
    auto-gateway: true
    auto-route-table-id: 0
    auto-routes: true
    dhcp: true
  ipv6:
    enabled: true
    address:
    - ip: fd00::102
      prefix-length: 128
    - ip: fe80::909:a9f1:bca7:3b3c
      prefix-length: 64
    auto-dns: true
    auto-gateway: true
    auto-route-table-id: 0
    auto-routes: true
    autoconf: true
    dhcp: true
  lldp:
    enabled: false
  mac-address: 52:55:00:D1:55:02
  mtu: 1500
- name: eth1
  type: ethernet
  state: up
  accept-all-mac-addresses: false
  ethernet:
    auto-negotiation: false
  ethtool:
    feature:
      rx-gro: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      tx-checksum-ip-generic: true
      tx-generic-segmentation: true
      tx-nocache-copy: false
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: false
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
    ring:
      rx: 256
      tx: 256
  ipv4:
    enabled: false
    address: []
    dhcp: false
  ipv6:
    enabled: false
    address: []
    autoconf: false
    dhcp: false
  lldp:
    enabled: false
  mac-address: 52:55:00:D1:56:02
  mtu: 1500
- name: eth2
  type: ethernet
  state: up
  accept-all-mac-addresses: false
  ethernet:
    auto-negotiation: false
  ethtool:
    feature:
      rx-gro: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      tx-checksum-ip-generic: true
      tx-generic-segmentation: true
      tx-nocache-copy: false
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: false
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
    ring:
      rx: 256
      tx: 256
  ipv4:
    enabled: false
    address: []
    dhcp: false
  ipv6:
    enabled: false
    address: []
    autoconf: false
    dhcp: false
  lldp:
    enabled: false
  mac-address: 52:55:00:D1:56:03
  mtu: 1500
- name: lo
  type: unknown
  state: up
  accept-all-mac-addresses: false
  ethtool:
    feature:
      rx-gro: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      tx-generic-segmentation: true
      tx-sctp-segmentation: true
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: true
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
  ipv4:
    enabled: true
    address:
    - ip: 127.0.0.1
      prefix-length: 8
  ipv6:
    enabled: true
    address:
    - ip: ::1
      prefix-length: 128
  mac-address: 00:00:00:00:00:00
  mtu: 65536
- name: tunl0
  type: unknown
  state: up
  accept-all-mac-addresses: false
  ethtool:
    feature:
      highdma: true
      rx-gro: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      tx-checksum-ip-generic: true
      tx-generic-segmentation: true
      tx-nocache-copy: false
      tx-scatter-gather-fraglist: true
      tx-sctp-segmentation: true
      tx-tcp-ecn-segmentation: true
      tx-tcp-mangleid-segmentation: true
      tx-tcp-segmentation: true
      tx-tcp6-segmentation: true
  ipv4:
    enabled: true
    address:
    - ip: 10.244.140.64
      prefix-length: 32
  ipv6:
    enabled: false
    address: []
  mac-address: 00:00:00:00
  mtu: 1480

[vagrant@node02 ~]$ nmcli con
NAME  UUID                                  TYPE      DEVICE
eth0  1c45a8cb-36d0-406f-80fa-fae1fd8d9ec1  ethernet  eth0
eth1  cf81a923-5999-4a21-bd7e-dc96c7977451  ethernet  eth1
eth2  cdb090be-2087-43e6-8e54-c973a2393b65  ethernet  eth2