Bug 1900038
| Summary: | [RFE] don't take down vlan if parent interface doesn't get configured | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Dusty Mabe <dustymabe> |
| Component: | NetworkManager | Assignee: | NetworkManager Development Team <nm-team> |
| Status: | CLOSED WONTFIX | QA Contact: | Desktop QE <desktop-qa-list> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.2 | CC: | acardace, bgalvani, derekh, djuran, ferferna, fge, lrintel, rkhan, sukulkar, thaller, till |
| Target Milestone: | rc | Keywords: | FutureFeature, Triaged |
| Target Release: | 8.0 | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Enhancement | |
| Doc Text: | Feature: don't take down vlan if parent interface doesn't get configured. Reason: I have a situation where my vlan device gets taken down because a device it's built on top of doesn't get DHCP. Result: | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-05-20 07:27:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Hi Dusty,

Can you try the following?

`nmcli connection modify <parent_iface> ipv4.dhcp-timeout infinity ipv6.dhcp-timeout infinity`

It should not bring down the interface due to DHCP failures.

Hey Gris,

Sorry for the late reply, I've been away. I added the timeout (Fedora 33 system, NetworkManager-1.26.6-1.fc33.x86_64), but the bond and vlan still get taken down eventually.

This is admittedly a misconfiguration, but I think we can do better. Once DHCP fails for the bond, we should be able to simply check whether the bond is used by any other devices that are successfully up before we take it down. We can leave the bond in a degraded state (yellow in the `nmcli c show` view) but not take it down, because higher-level devices are using it.

Hi Dusty,

This might be related to `ipv6.ra-timeout`. Let me experiment.

Hi Dusty,

I checked: you need to use this command to set the DHCP and IPv6 autoconf timeouts to infinity:

`nmcli connection modify <connection_id> ipv4.dhcp-timeout infinity ipv6.dhcp-timeout infinity ipv6.ra-timeout infinity`

We are planning this use case at https://docs.google.com/document/d/17LIu6xml9OrJHghS6t3RVVceN1fWtknqH73WwIlWubo/edit targeting 8.6/9.1.

*** Bug 1908302 has been marked as a duplicate of this bug. ***

Workaround exists:

`nmcli connection modify <connection_id> ipv4.dhcp-timeout infinity ipv6.dhcp-timeout infinity ipv6.ra-timeout infinity`

Acceptance criteria: NetworkManager should not remove a virtual interface on DHCP timeout when that interface is being used as a VLAN parent or as a bridge/bond/etc. controller.

Hence set to medium priority.

We are out of capacity for 8.6. Postponing to further planning.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
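The workaround quoted in the comments above could be wrapped in a small helper. This is only a sketch and not something posted in the bug: the function name `apply_infinite_timeouts` and the `DRY_RUN` switch are hypothetical, and `bond0` is just the parent connection from the reproducer. The three properties and the `infinity` value are taken verbatim from the workaround comment. By default the helper only prints the command, so it can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug thread): apply the workaround
# timeouts to one connection profile.
# DRY_RUN=1 (the default in this sketch) prints the nmcli command
# instead of running it; set DRY_RUN=0 to actually modify the profile.

apply_infinite_timeouts() {
    conn="$1"
    # Build the exact command quoted in the workaround comment.
    set -- nmcli connection modify "$conn" \
        ipv4.dhcp-timeout infinity \
        ipv6.dhcp-timeout infinity \
        ipv6.ra-timeout infinity
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: the parent connection is called bond0, as in the reproducer.
apply_infinite_timeouts "${1:-bond0}"
```

Running it with `DRY_RUN=0` would presumably need to be followed by re-activating the connection for the new timeouts to apply; note that the reporter saw the bond and vlan go down even with the two DHCP timeouts set, before `ipv6.ra-timeout` was added to the command.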
Description of problem:

I have a situation where my vlan device gets taken down because a device it's built on top of doesn't get DHCP.

NOTE: this probably applies to more cases, but vlan on top of a bond is an easy way to reproduce the issue.

This can happen easily if you configure a vlan on top of a bond, but forget to disable both ipv4 and ipv6 DHCP on the bond itself. So we have something like `bond0` and `bond0.100` where `bond0` won't get DHCP from anywhere (either ipv4 or ipv6), but only one of them is disabled in configuration. In that case, after ipv6 times out, the entire setup will be taken down (including the vlan), even though the vlan was up and working fine.

```
[core@dhcpvlanbond ~]$ nmcli c show
NAME              UUID                                  TYPE      DEVICE
bond0             75ac1a13-dbce-36e4-8ecb-c6ed6fce5322  bond      bond0
bond0.100         bc927f10-6620-3b6c-9946-9186cc4df6aa  vlan      bond0.100
bond0-slave-ens2  4fb61355-f5fd-3ade-940e-5fbd7d6d3f63  ethernet  ens2
bond0-slave-ens3  7a194b59-78a7-3447-83fb-f5336d848e19  ethernet  ens3
[core@dhcpvlanbond ~]$
[   39.004144] bond0: (slave ens2): Releasing backup interface
[   39.005104] bond0: (slave ens2): the permanent HWaddr of slave - 52:54:00:ea:6b:17 - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
[   39.007009] bond0: (slave ens3): making interface the new active one
[   39.053150] IPv6: ADDRCONF(NETDEV_UP): ens2: link is not ready
[   39.053827] 8021q: adding VLAN 0 to HW filter on device ens2
[   39.055303] bond0: (slave ens3): Releasing backup interface
[   39.099861] IPv6: ADDRCONF(NETDEV_UP): ens3: link is not ready
[   39.099862] 8021q: adding VLAN 0 to HW filter on device ens3
[   39.101850] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
[   39.106281] bond0 (unregistering): Released all slaves
[   39.126264] IPv6: ADDRCONF(NETDEV_UP): ens2: link is not ready
[   39.128190] IPv6: ADDRCONF(NETDEV_UP): ens3: link is not ready
[   41.055559] e1000: ens2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   41.057642] IPv6: ADDRCONF(NETDEV_CHANGE): ens2: link becomes ready
[   41.119575] e1000: ens3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[   41.121723] IPv6: ADDRCONF(NETDEV_CHANGE): ens3: link becomes ready

[core@dhcpvlanbond ~]$ nmcli c show
NAME              UUID                                  TYPE      DEVICE
bond0             75ac1a13-dbce-36e4-8ecb-c6ed6fce5322  bond      --
bond0-slave-ens2  4fb61355-f5fd-3ade-940e-5fbd7d6d3f63  ethernet  --
bond0-slave-ens3  7a194b59-78a7-3447-83fb-f5336d848e19  ethernet  --
bond0.100         bc927f10-6620-3b6c-9946-9186cc4df6aa  vlan      --
```

We should probably be able to leave `bond0` up in this case, even though it's slightly misconfigured, because there is a device on top of it that needs it to stay up.

Version-Release number of selected component (if applicable):

NetworkManager-1.22.8-6.el8_2.x86_64

How reproducible:

Always

Steps to Reproduce:

Set up a vlan on top of a bond. Something like:

```
[core@dhcpvlanbond ~]$ sudo tail -n 100 /etc/NetworkManager/system-connections/*
==> /etc/NetworkManager/system-connections/bond0-slave-ens2.nmconnection <==
[connection]
id=bond0-slave-ens2
type=ethernet
interface-name=ens2
master=bond0
slave-type=bond

==> /etc/NetworkManager/system-connections/bond0-slave-ens3.nmconnection <==
[connection]
id=bond0-slave-ens3
type=ethernet
interface-name=ens3
master=bond0
slave-type=bond

==> /etc/NetworkManager/system-connections/bond0.100.nmconnection <==
[connection]
id=bond0.100
type=vlan
interface-name=bond0.100

[vlan]
egress-priority-map=
flags=1
id=100
ingress-priority-map=
parent=bond0

[ipv4]
dns-search=
may-fail=false
method=auto

==> /etc/NetworkManager/system-connections/bond0.nmconnection <==
[connection]
id=bond0
type=bond
interface-name=bond0

[bond]
miimon=100
mode=active-backup

[ipv4]
method=disabled
```

Note that there is no `ipv6.method=disabled` in the `bond0.nmconnection` file.

Actual results:

The bond0 gets taken down when ipv6 never succeeds, and this takes down the bond0.100 as well.

Expected results:

The bond0.100 stays up.

Additional info:
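Because the trigger in this report is a parent profile with no `ipv6.method=disabled`, it may help to scan the keyfiles for profiles that leave a method unset. The sketch below is an illustration only: `keyfile_method` is a hypothetical helper, it assumes the plain keyfile layout shown in the reproducer (one `[section]` header per line, `method=` directly under it), and reading `/etc/NetworkManager/system-connections/` normally requires root.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetworkManager): print the value of
# "method=" inside one section of a keyfile, or nothing if it is not set.
#   $1 = path to a .nmconnection keyfile
#   $2 = section name, e.g. ipv4 or ipv6
keyfile_method() {
    awk -v want="[$2]" '
        /^\[/ { in_section = ($0 == want) }   # track which section we are in
        in_section && /^method=/ { sub(/^method=/, ""); print; exit }
    ' "$1"
}

# Report every profile that does not set an IPv6 method explicitly.
# In the reproducer, bond0.nmconnection would be reported, since it only
# contains "[ipv4] method=disabled".
for f in /etc/NetworkManager/system-connections/*.nmconnection; do
    [ -r "$f" ] || continue   # skips unreadable files and an unmatched glob
    if [ -z "$(keyfile_method "$f" ipv6)" ]; then
        echo "$f: no explicit ipv6 method"
    fi
done
```

A reported parent profile can then be fixed with `nmcli connection modify bond0 ipv6.method disabled`, which addresses the misconfiguration itself rather than the behaviour this RFE asks NetworkManager to change. The scan also flags profiles where an unset IPv6 method is intentional, so its output is a list to review, not a list of errors.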