Bug 1680970 - Static IPv6 Address is lost on host deploy if NM manages the interface
Summary: Static IPv6 Address is lost on host deploy if NM manages the interface
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.30.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.0
Assignee: Edward Haas
QA Contact: Roni
URL:
Whiteboard:
Duplicates: 1747314 1806346
Depends On: 1683597 1820988
Blocks: 1791555 1688008 1741578 1807256
 
Reported: 2019-02-25 15:26 UTC by arend.lapere
Modified: 2020-07-14 14:37 UTC
CC List: 10 users

Fixed In Version: rhv-4.4.0-28
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1688008 1807256
Environment:
Last Closed: 2020-05-20 20:02:25 UTC
oVirt Team: Network
Flags: sbonazzo: ovirt-4.4?


Attachments
Engine.log (965.36 KB, text/plain), 2019-02-26 16:29 UTC, arend.lapere
vdsm.log (367.32 KB, text/plain), 2019-02-26 16:29 UTC, arend.lapere
ovirt-engine-setup-20190226170604-v83a48.log (2.94 MB, text/plain), 2019-02-26 16:30 UTC, arend.lapere


Links
Red Hat Knowledge Base (Solution) 3981311: [RHV 4.3] Static IPv6 Address is lost on host deploy if NM manages the interface (last updated 2019-03-12 14:18:15 UTC)
oVirt gerrit 102857 (MERGED): net, nmstate: Use autconf report from nmstate for getCapabilities (last updated 2020-11-13 12:05:09 UTC)

Description arend.lapere 2019-02-25 15:26:55 UTC
Description of problem:
Static IPv6 address is lost when the interface is moved to the ovirtmgmt bridge

Version-Release number of selected component (if applicable):
VDSM 4.30.8

How reproducible:
Always

Steps to Reproduce:
1. Provision a clean server (CentOS 7.6 in my case) with both a DHCP IPv4 address and a static IPv6 address.
2. Run the Ansible oVirt installation scripts to add the host (the result should be the same as adding a host via the web UI).
3. Observe that the static IPv6 address is dropped.

Actual results:

This is the ifcfg file that is generated for the ovirtmgmt bridge:
# Generated by VDSM version 4.30.8.1
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=yes

However, the em1 interface's config was different:
BOOTPROTO="dhcp"
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6ADDR=fd00::1:1298:36ff:fea3:d613
IPV6_DEFAULTGW=fd00::1:20d:b9ff:fe4a:c80c
IPV6_PEERDNS=no
DOMAIN="democustomer.televic.com"
DEVICE=em1
HWADDR="10:98:36:a3:d6:13"
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
DEFROUTE=yes
MTU=1500

Expected results:
# Generated by VDSM version 4.30.8.1
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6ADDR=fd00::1:1298:36ff:fea3:d613
IPV6_DEFAULTGW=fd00::1:20d:b9ff:fe4a:c80c
IPV6_PEERDNS=no

Additional info:

Comment 1 Dominik Holler 2019-02-26 07:28:31 UTC
Hello Arend,
thank you for reporting this bug!

(In reply to arend.lapere from comment #0)
> Description of problem:
> Static IPv6 address is lost when the interface is moved to the ovirtmgmt bridge
> 
> Version-Release number of selected component (if applicable):
> VDSM 4.30.8
> 
> How reproducible:
> Always
> 
> Steps to Reproduce:
> 1. Provision a clean server (CentOS 7.6 in my case) with both a DHCP IPv4
> address and a static IPv6 address.
> 2. Run the Ansible oVirt installation scripts to add the host (the result
> should be the same as adding a host via the web UI).

Can you please share the commands by which the Ansible oVirt installation
scripts are triggered?
If possible, sharing log files, especially engine.log from the engine's host/VM
and vdsm.log from the host, would be helpful.

> 3. Observe that the static IPv6 address is dropped.

Comment 2 Dominik Holler 2019-02-26 11:41:15 UTC
Arend, which version of oVirt Engine did you use? This behavior might be fixed in 4.3.1.

Comment 3 arend.lapere 2019-02-26 15:08:14 UTC
Hey Dominik,


I installed yesterday or Friday, using this oVirt repository: https://resources.ovirt.org/pub/yum-repo/ovirt-release43.rpm
I didn't check which version I used, and I've since rebuilt my system using version 4.2.

I'll re-install using the latest and greatest version and check whether it is solved; otherwise I'll attach all the logs that you requested, including the part of the Ansible script that calls "add host".

Thank you so far for helping :-)


Kr,
Arend

Comment 4 arend.lapere 2019-02-26 16:28:42 UTC
Okay, just tried this using:
Software Version:4.3.1.2-0.0.master.20190225111554.git314f81b.el7

Result: same as before, the IPv6 address is lost. Not sure if it is relevant, but in the past I've used bonded interfaces (also with a static IPv6 address), and those (at least sometimes) seemed to "remember" the IPv6 address when the ovirtmgmt network bridge was configured...

The commands that triggered this can be observed below (an excerpt from the role: https://github.com/oVirt/ovirt-ansible-infra/blob/master/roles/ovirt.hosts/tasks/main.yml#L26)

- name: Add hosts
  ovirt_host:
    auth: "{{ ovirt_auth }}"
    state: "{{ item.state | default(omit) }}"
    name: "{{ item.name }}"
    address: "{{ item.address }}"
    cluster: "{{ item.cluster }}"
    password: "{{ item.password | default(omit) }}"
    public_key: "{{ item.public_key | default(omit) }}"
    override_iptables: true
    timeout: "{{ item.timeout | default(ovirt_hosts_add_timeout) }}"
    poll_interval: "{{ item.poll_interval | default(20) }}"
    hosted_engine: "{{ item.hosted_engine | default(omit) }}"
  with_items:
    - "{{ hosts | default([]) }}"
  loop_control:
    label: "{{ item.name }}"
  async: "{{ ovirt_hosts_max_timeout }}"
  poll: 0
  register: add_hosts
  tags:
    - hosts
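
A hedged example of the `hosts` variable this task iterates over (all values are illustrative; the keys mirror the item fields the task reads):

hosts:
  - name: gb60kf2
    address: gb60kf2.democustomer.televic.com
    cluster: Default
    password: "{{ vault_host_password }}"
    state: present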

This is a single-host oVirt setup (the engine runs on bare metal).

In the attached engine.log, an interesting line pops up around 17:12:11:
2019-02-26 17:12:11,000+01 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (EE-ManagedThreadFactory-engine-Thread-1) [b2c13fc] START, HostSetupNetworksVDSCommand(HostName = gb60kf2.democustomer.televic.com, HostSetupNetworksVdsCommandParameters:{hostId='ece4fdc2-ce29-4673-86b6-7391abdc5673', vds='Host[gb60kf2.democustomer.televic.com,ece4fdc2-ce29-4673-86b6-7391abdc5673]', rollbackOnFailure='true', commitOnSuccess='false', connectivityTimeout='120', networks='[HostNetwork:{defaultRoute='true', bonding='false', networkName='ovirtmgmt', vdsmName='ovirtmgmt', nicName='em1', vlan='null', vmNetwork='true', stp='false', properties='null', ipv4BootProtocol='DHCP', ipv4Address='null', ipv4Netmask='null', ipv4Gateway='null', ipv6BootProtocol='AUTOCONF', ipv6Address='null', ipv6Prefix='null', ipv6Gateway='null', nameServers='null'}]', removedNetworks='[]', bonds='[]', removedBonds='[]', clusterSwitchType='LEGACY', managementNetworkChanged='true'}), log id: 6c5a4be4

Although my EM1 has the following config, it still states it is autoconf'ed:
BOOTPROTO="dhcp"
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6ADDR=fd00::1:1298:36ff:fea3:d613
IPV6_DEFAULTGW=fd00::1:20d:b9ff:fe4a:c80c
IPV6_PEERDNS=no
DOMAIN="democustomer.televic.com"
DEVICE=em1
HWADDR="10:98:36:a3:d6:13"
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
DEFROUTE=yes
MTU=1500

I've also attached vdsm.log and the ovirt-engine-setup log.

Comment 5 arend.lapere 2019-02-26 16:29:33 UTC
Created attachment 1538903 [details]
Engine.log

Comment 6 arend.lapere 2019-02-26 16:29:53 UTC
Created attachment 1538904 [details]
vdsm.log

Comment 7 arend.lapere 2019-02-26 16:30:59 UTC
Created attachment 1538905 [details]
ovirt-engine-setup-20190226170604-v83a48.log

Comment 8 Dominik Holler 2019-02-27 09:38:58 UTC
Arend, thanks for adding this information.

(In reply to arend.lapere from comment #4)

> Although my EM1 has the following config, it still states it is autoconf'ed:
> BOOTPROTO="dhcp"
> IPV6INIT=yes
> IPV6_AUTOCONF=no

https://bugzilla.redhat.com/show_bug.cgi?id=1665153#c3 makes it look like
IPV6_AUTOCONF is ignored by NetworkManager.
Let's check if this is documented.

> IPV6ADDR=fd00::1:1298:36ff:fea3:d613
> IPV6_DEFAULTGW=fd00::1:20d:b9ff:fe4a:c80c
> IPV6_PEERDNS=no
> DOMAIN="democustomer.televic.com"
> DEVICE=em1
> HWADDR="10:98:36:a3:d6:13"
> ONBOOT=yes
> PEERDNS=yes
> PEERROUTES=yes
> DEFROUTE=yes
> MTU=1500
> 
> Also attached the vdsm.log and ovirt-engine-setup log

Comment 9 Dominik Holler 2019-02-27 10:07:35 UTC
(In reply to Dominik Holler from comment #8)
> Arend, thanks for adding this information.
> 
> (In reply to arend.lapere from comment #4)
> 
> > Although my EM1 has the following config, it still states it is autoconf'ed:
> > BOOTPROTO="dhcp"
> > IPV6INIT=yes
> > IPV6_AUTOCONF=no
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1665153#c3 makes it look like
> IPV6_AUTOCONF is ignored by NetworkManager.
> Let's check if this is documented.
> 

According to man nm-settings-ifcfg-rh,
IPV6_AUTOCONF should not be ignored, but NetworkManager's behavior is not what
oVirt/VDSM expects. I created bug 1683597 to discuss this.

> > IPV6ADDR=fd00::1:1298:36ff:fea3:d613
> > IPV6_DEFAULTGW=fd00::1:20d:b9ff:fe4a:c80c
> > IPV6_PEERDNS=no
> > DOMAIN="democustomer.televic.com"
> > DEVICE=em1
> > HWADDR="10:98:36:a3:d6:13"
> > ONBOOT=yes
> > PEERDNS=yes
> > PEERROUTES=yes
> > DEFROUTE=yes
> > MTU=1500
> > 
> > Also attached the vdsm.log and ovirt-engine-setup log

Comment 11 arend.lapere 2019-03-08 10:59:59 UTC
I've got some new information: when running this on a server with a bonded interface, this behaviour does not occur and the IPv6 address is set correctly!

Comment 14 Edward Haas 2019-06-24 07:11:34 UTC
I think this is related to a bug we have in VDSM when NetworkManager is running.
VDSM checks for a running dhclient on the relevant interface, and from that it infers whether DHCP is enabled.

Therefore, I suspect that when the host is added by Engine, VDSM reports the interface as if it had DHCPv6 enabled, and Engine applies that back through setupNetworks, requesting DHCPv6 and no static IP.
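
A minimal sketch of that kind of check (an assumption about the general approach, not VDSM's actual code; the /proc scan and dhclient flags are illustrative):

import glob
import os

def dhclient_runs_on(iface, family=6):
    """Guess whether a dhclient of the given IP family manages `iface`."""
    for cmdline in glob.glob("/proc/[0-9]*/cmdline"):
        try:
            with open(cmdline, "rb") as f:
                argv = f.read().split(b"\0")
        except OSError:
            continue  # the process exited while we were scanning
        if not argv or not os.path.basename(argv[0]).startswith(b"dhclient"):
            continue
        args = [a.decode(errors="replace") for a in argv[1:]]
        # dhclient is started with -6 for DHCPv6; without it, it serves IPv4
        if iface in args and (("-6" in args) == (family == 6)):
            return True
    return False

Such a heuristic only sees processes, not intent: under NetworkManager the set of running DHCP clients need not match the ifcfg file, which is how the misreport described above can happen.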

As a workaround, just make sure NM is not managing the specific interface (see the illustration below).
OR
Re-apply the required configuration after adding the host to Engine.
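
A hedged illustration of the first workaround (the device name is taken from this report; adjust it to your environment):

# Persistently mark the device as unmanaged in its ifcfg file:
echo "NM_CONTROLLED=no" >> /etc/sysconfig/network-scripts/ifcfg-em1
# Or tell NetworkManager directly (runtime only, not persistent):
nmcli device set em1 managed no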

Solving this issue in 4.3 may be challenging.
For 4.4 we plan to use nmstate/NM as the back-end, where I do not expect to see the problem.

Comment 15 Dominik Holler 2019-06-24 09:57:18 UTC
> As a workaround, just make sure NM is not managing the specific interface.

This is documented by bug 1688008, so we are fine in this regard.

> Re-apply the required configuration after adding the host to Engine.

This might be a problem if the affected interface is the management interface.


Comment https://bugzilla.redhat.com/show_bug.cgi?id=1683597#c8 sounds like VDSM should check if NM is installed, running, managing the interface, and ask NM instead of /proc about dhcpv6 and/or autoconf/ra.
Is this correct?

Comment 16 Dominik Holler 2019-06-25 13:52:48 UTC
Deferred, since RHV-4.4 is going to use NM.

Comment 17 Edward Haas 2019-06-25 16:33:57 UTC
(In reply to Dominik Holler from comment #15)
> 
> Comment https://bugzilla.redhat.com/show_bug.cgi?id=1683597#c8 sounds like
> VDSM should check if NM is installed, running, managing the interface, and
> ask NM instead of /proc about dhcpv6 and/or autoconf/ra.
> Is this correct?

Yes.
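
A minimal sketch of the confirmed approach, here using nmcli for illustration (VDSM could equally query NM over D-Bus; the helper and its fallback are assumptions):

import subprocess

def nm_ipv6_method(device):
    """Ask NetworkManager for ipv6.method of the connection on `device`."""
    con = subprocess.check_output(
        ["nmcli", "-g", "GENERAL.CONNECTION", "device", "show", device],
        universal_newlines=True).strip()
    if not con:
        return None  # NM does not manage the device; use the old heuristics
    return subprocess.check_output(
        ["nmcli", "-g", "ipv6.method", "connection", "show", con],
        universal_newlines=True).strip()

A result of 'manual' means a static address, so neither DHCPv6 nor autoconf should be reported for the interface; 'auto' corresponds to SLAAC and 'dhcp' to DHCPv6.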

Comment 18 Germano Veit Michel 2019-09-02 23:53:07 UTC
*** Bug 1747314 has been marked as a duplicate of this bug. ***

Comment 19 Dominik Holler 2020-01-13 10:03:06 UTC
We should ensure that this flow works with nmstate.
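
A hedged sketch of exercising that flow through the libnmstate API (the /64 prefix is an assumption, as the original IPV6ADDR carries no explicit prefix):

import libnmstate

desired = {
    "interfaces": [{
        "name": "em1",
        "state": "up",
        "ipv6": {
            "enabled": True,
            "dhcp": False,
            "autoconf": False,
            "address": [{"ip": "fd00::1:1298:36ff:fea3:d613",
                         "prefix-length": 64}],
        },
    }]
}

libnmstate.apply(desired)  # configure the static IPv6 address via NM
state = libnmstate.show()  # the readback that getCapabilities builds on
                           # should now show dhcp and autoconf as False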

Comment 23 Germano Veit Michel 2020-02-25 22:14:59 UTC
*** Bug 1806346 has been marked as a duplicate of this bug. ***

Comment 26 Roni 2020-05-19 09:23:48 UTC
Verified on v4.4.0.2-0.1

ovirt-engine-4.4.0-0.33
vdsm-4.40.16-1
nmstate-0.2.6-13
NetworkManager-1.22.8-4

Comment 27 Michael Burman 2020-05-19 09:30:16 UTC
With ansible-runner-service-1.0.2-1.el8ev.noarch

Comment 28 Sandro Bonazzola 2020-05-20 20:02:25 UTC
This bugzilla is included in the oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be resolved in the oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

