Bug 2207690

Summary: VLAN of bond will not get autoconnect when bond port link revived
Product: Red Hat Enterprise Linux 8
Reporter: Gris Ge <fge>
Component: NetworkManager
Assignee: Gris Ge <fge>
Status: VERIFIED ---
QA Contact: Matej Berezny <mberezny>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.8
CC: andbartl, bgalvani, bnemec, lrintel, rkhan, sfaye, sukulkar, thaller, till, vbenes
Target Milestone: rc
Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: NetworkManager-1.40.16-8.el8
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2217894 2217899 2217900 2217901
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2217894, 2217899, 2217900, 2217901    
Attachments:
Description Flags
NetworkManager trace log none

Description Gris Ge 2023-05-16 14:25:19 UTC
Description of problem:

Given a system with a VLAN over a bond, holding a static IPv4 address, created by nmstate
When the user brings all bond port links down and brings them up again after 10 seconds
Then the user expects the VLAN of the bond to be reactivated with the originally desired IPv4 address.


Version-Release number of selected component (if applicable):
NetworkManager-1.30.0-14.el8_4.x86_64

How reproducible:
100%

Steps to Reproduce:
1. wget http://file.apac.redhat.com/fge/OCPBUGS-11300/bug.sh
2. chmod +x bug.sh
3. sudo ./bug.sh

Actual results:

`bond0.1656` has no IPv4 address assigned.

Expected results:

`bond0.1656` has its IPv4 address assigned.

Additional info:

This is a customer case tracked in https://issues.redhat.com/browse/OCPBUGS-11300

Comment 1 Gris Ge 2023-05-16 14:25:46 UTC
Created attachment 1964907
NetworkManager trace log

Comment 2 Gris Ge 2023-05-16 14:26:22 UTC
The content of http://file.apac.redhat.com/fge/OCPBUGS-11300/bug.sh is:


#!/bin/bash -x

ip netns add tmp
ip link add eth2 type veth peer name eth2peer
ip link add eth1 type veth peer name eth1peer
ip link set eth1 up
ip link set eth2 up
ip link set eth1peer netns tmp
ip link set eth2peer netns tmp
ip netns exec tmp ip link set eth1peer up
ip netns exec tmp ip link set eth2peer up

nmcli device set eth1 managed yes
nmcli device set eth2 managed yes


echo 'interfaces:
- name: bond0
  type: bond
  state: up
  ipv4:
    enabled: false
  ipv6:
    enabled: false
  link-aggregation:
    mode: active-backup
    options:
      miimon: 100
    port:
    - eth1
    - eth2
- name: bond0.1656
  type: vlan
  state: up
  ipv4:
    enabled: true
    dhcp: false
    address:
    - ip: 192.168.122.10
      prefix-length: 24
  ipv6:
    enabled: false
  vlan:
    base-iface: bond0
    id: 1656
- name: eth2
  type: ethernet
  state: up
- name: eth1
  type: ethernet
  state: up' | sudo nmstatectl set -

echo 'interfaces:
- name: bond0
  type: bond
  state: up
  ipv4:
    enabled: false
  ipv6:
    enabled: false
  link-aggregation:
    mode: active-backup
    options:
      miimon: 100
    port:
    - eth1
    - eth2
- name: bond0.1656
  type: vlan
  state: up
  ipv4:
    enabled: true
    dhcp: false
    address:
    - ip: 192.168.122.10
      prefix-length: 24
  ipv6:
    enabled: false
  vlan:
    base-iface: bond0
    id: 1656
- name: eth2
  type: ethernet
  state: up
- name: eth1
  type: ethernet
  state: up' | sudo nmstatectl set -

ip addr show bond0.1656

ip link set eth1 down
ip link set eth2 down
sleep 10
ip addr show bond0.1656

ip link set eth1 up
ip link set eth2 up
sleep 10

ip addr show bond0.1656
nmcli c
nmcli d
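
To check the result without reading the `ip addr` output by eye, a small helper along these lines could be appended to the script. This is only a sketch and not part of the original bug.sh: `has_ipv4` is a hypothetical name, and it assumes the one-line format printed by `ip -o -4 addr show`.

```shell
# Hypothetical helper, not part of the original bug.sh.
# Reads `ip -o -4 addr show dev <iface>` output on stdin and succeeds
# only if the expected CIDR address ($1) is listed.
has_ipv4() {
    # In one-line (-o) output, field 3 is the address family ("inet")
    # and field 4 is the address in CIDR notation.
    awk -v want="$1" '$3 == "inet" && $4 == want { found = 1 } END { exit !found }'
}

# Example use after the final `sleep 10` in the script:
#   ip -o -4 addr show dev bond0.1656 | has_ipv4 192.168.122.10/24 \
#       && echo "PASS: address restored" \
#       || echo "FAIL: address missing"
```

With the bug present, the check would be expected to report FAIL after the ports come back up, matching the "Actual results" above.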

Comment 4 Beniamino Galvani 2023-05-31 09:06:09 UTC
From the log and the discussion we had, it seems that the problem is that nmstate blocks autoconnect when adding/updating the connection; this prevents the connection from going up when the bond gets carrier again.

The solution is that nmstate should unblock autoconnect when it's done. It seems to me there is no API currently to do that, so we should either:

  1. unblock the connection when the Reapply() D-Bus method is called. This probably solves the nmstate use case, but I can't think of a reason why this should be expected from an API point of view;

  2. introduce a new Update() flag to unblock autoconnect; this seems the best approach to me;

  3. maybe, use the AUTOCONNECT property on the device instead to block autoconnect?
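
For reference, the device-level autoconnect property mentioned in option 3 can be inspected and set with nmcli. The sketch below is an illustration only: `device_autoconnect_enabled` is a hypothetical helper, and it assumes the terse output form `GENERAL.AUTOCONNECT:yes`. It shows only the device property and is not claimed to reveal the profile-level block that nmstate sets.

```shell
# Hypothetical helper: reads terse nmcli output such as
# "GENERAL.AUTOCONNECT:yes" on stdin and succeeds only when the value is "yes".
device_autoconnect_enabled() {
    awk -F: '$1 == "GENERAL.AUTOCONNECT" { value = $2 } END { exit !(value == "yes") }'
}

# Example use on the reproducer system:
#   nmcli -t -f GENERAL.AUTOCONNECT device show bond0.1656 | device_autoconnect_enabled \
#       && echo "device autoconnect is on"
#
# Setting the device property explicitly:
#   nmcli device set bond0.1656 autoconnect yes
```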

Comment 5 Gris Ge 2023-06-01 07:11:26 UTC
Patch sent upstream: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1649

NetworkManager will unblock autoconnect when a reapply succeeds, since reapply serves the same purpose as activation.

Comment 6 Gris Ge 2023-06-07 11:00:42 UTC
Forgot to mention: to reproduce the problem, you also need to create /etc/NetworkManager/conf.d/ignore_carrier.conf with the following content:


[main]
ignore-carrier=no
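
A minimal sketch for creating that drop-in from a shell follows. `write_ignore_carrier_conf` is a hypothetical helper name; the optional directory argument is there only so the function can be tried against a scratch directory first.

```shell
# Hypothetical helper: writes the drop-in shown above into a configuration
# directory (default: /etc/NetworkManager/conf.d).
write_ignore_carrier_conf() {
    conf_dir="${1:-/etc/NetworkManager/conf.d}"
    mkdir -p "$conf_dir" || return 1
    printf '[main]\nignore-carrier=no\n' > "$conf_dir/ignore_carrier.conf"
}

# Example use (writing to the default location requires root):
#   write_ignore_carrier_conf
#   systemctl restart NetworkManager
```

Restarting NetworkManager is the conservative way to make sure the new setting is picked up before running bug.sh.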

Comment 7 Gris Ge 2023-06-08 08:46:04 UTC
RHEL 8.4 scratch build uploaded to https://people.redhat.com/fge/bz_2207690/

Comment 8 sfaye 2023-06-12 08:12:34 UTC
*** Bug 2211456 has been marked as a duplicate of this bug. ***