Bug 1760179

Summary:

IPv6 address never assigned, possibly "linklocal6: waiting for link-local addresses failed due to timeout"

Product:

[Fedora] Fedora

Reporter:

Ian Wienand <iwienand>

Component:

NetworkManager

Assignee:

Lubomir Rintel <lkundrak>

Status:

CLOSED NOTABUG

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

CC:

bgalvani, dcbw, fgiudici, gnome-sig, john.j5live, lkundrak, mclasen, rhughes, rstrode, sandmann, tdecacqu

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Last Closed:

2019-10-11 05:36:38 UTC

Type:

Bug

Attachments:

Description	Flags
Boot where IPv6 is not configured	none
Boot where IPv6 is configured correctly	none
Boot where IPv6 is not configured (timestamps removed for side-by-side)	none
Boot where IPv6 is configured correctly (timestamps removed for side-by-side)	none
Boot where IPv6 is not configured, then a restart of the NetworkManager service	none
Increased link-local timeout	none

Description Ian Wienand 2019-10-10 04:47:50 UTC

The host starts, but the global IPv6 address is never assigned.  We have seen this problem repeatedly with several versions of Fedora and CentOS 7 on several different cloud providers.  It is racy: sometimes it occurs, sometimes not.  One consistent point, however, is that once the host is up, restarting NetworkManager gets the global IPv6 address allocated.

We're using a pretty standard ifcfg file for the interface:

---
# Automatically generated, do not edit
DEVICE=ens3
BOOTPROTO=dhcp
HWADDR=fa:16:3e:32:10:68
ONBOOT=yes
NM_CONTROLLED=yes
TYPE=Ethernet
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
---

I have captured a "good" and a "bad" boot with DEBUG logging on.  I have also attached "cut" versions that strip the timestamps, etc., to allow easier comparison.

The first thing I notice is that, on a good boot, ens3 is listed as UP and there is a "tentative" fe80:: link-local address.

---- good boot ---
platform-linux: create (ignore netns, initial netns, use udev)
platform-linux: Netlink socket for events established: port=811, fd=7
platform-linux: populate platform cache
platform-linux: kernel-support: IFLA_INET6_ADDR_GEN_MODE: detected
platform: signal: link   added: 1: lo <UP,LOWER_UP;loopback,up,running,lowerup> mtu 65536 arp 772 loopback? not-init addrgenmode eui64 addr 00:00:00:00:00:00 driver unknown rx:0,0 tx:0,0
platform: signal: link   added: 2: ens3 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1450 arp 1 ethernet? not-init addrgenmode eui64 addr FA:16:3E:32:10:68 driver virtio_net rx:7,602 tx:0,0
platform: signal: address 4   added: 127.0.0.1/8 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 1 flags permanent src kernel
platform: signal: address 6   added: ::1/128 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 1 flags permanent src kernel
platform: signal: address 6   added: fe80::f816:3eff:fe32:1068/64 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 2 flags permanent,tentative src kernel
platform-linux: kernel-support: RTA_PREF: ability to set router preference for IPv6 routes: detected
platform: signal: route   6   added: ::1/128 via :: dev 1 metric 256 mss 0 rt-src rt-kernel
platform: signal: route   6   added: fe80::/64 via :: dev 2 metric 256 mss 0 rt-src rt-kernel
platform: signal: route   6   added: table 255 ff00::/8 via :: dev 2 metric 256 mss 0 rt-src rt-boot
platform: signal: qdisc   added: noqueue dev 1 family 0 handle 0 parent ffffffff info 2
platform: signal: qdisc   added: fq_codel dev 2 family 0 handle 0 parent ffffffff info 2
platform: signal: link changed: 2: ens3 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1450 arp 1 ethernet? init addrgenmode eui64 addr FA:16:3E:32:10:68 driver virtio_net rx:7,602 tx:0,0
platform: signal: link changed: 1: lo <UP,LOWER_UP;loopback,up,running,lowerup> mtu 65536 arp 772 loopback? init addrgenmode eui64 addr 00:00:00:00:00:00 driver unknown rx:0,0 tx:0,0
---

On a bad boot, by contrast, it is DOWN and there is no link-local (LL) address:

--- bad boot ---
platform-linux: create (ignore netns, initial netns, use udev)
platform-linux: Netlink socket for events established: port=749, fd=7
platform-linux: populate platform cache
platform-linux: kernel-support: IFLA_INET6_ADDR_GEN_MODE: detected
platform: signal: link   added: 1: lo <UP,LOWER_UP;loopback,up,running,lowerup> mtu 65536 arp 772 loopback? not-init addrgenmode eui64 addr 00:00:00:00:00:00 driver unknown rx:0,0 tx:0,0
platform: signal: link   added: 2: ens3 <DOWN;broadcast,multicast> mtu 1450 arp 1 ethernet? not-init addrgenmode eui64 addr FA:16:3E:32:10:68 driver virtio_net rx:0,0 tx:0,0
platform: signal: address 4   added: 127.0.0.1/8 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 1 flags permanent src kernel
platform: signal: address 6   added: ::1/128 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 1 flags permanent src kernel
platform-linux: kernel-support: RTA_PREF: ability to set router preference for IPv6 routes: detected
platform: signal: route   6   added: ::1/128 via :: dev 1 metric 256 mss 0 rt-src rt-kernel
platform: signal: qdisc   added: noqueue dev 1 family 0 handle 0 parent ffffffff info 2
platform: signal: link changed: 2: ens3 <DOWN;broadcast,multicast> mtu 1450 arp 1 ethernet? init addrgenmode eui64 addr FA:16:3E:32:10:68 driver virtio_net rx:0,0 tx:0,0
platform: signal: link changed: 1: lo <UP,LOWER_UP;loopback,up,running,lowerup> mtu 65536 arp 772 loopback? init addrgenmode eui64 addr 00:00:00:00:00:00 driver unknown rx:0,0 tx:0,0
---
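The link-local address that shows up in the good boot is the modified EUI-64 form of the interface's MAC address (the log reports "addrgenmode eui64"). As a minimal sketch of that derivation, not NetworkManager's or the kernel's actual code, and with a function name that is purely illustrative:

```python
import ipaddress


def eui64_link_local(mac: str) -> str:
    """Build the fe80::/64 address for a 48-bit MAC using modified EUI-64."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the MAC to get a 64-bit identifier.
    interface_id = octets[:3] + [0xFF, 0xFE] + octets[3:]
    prefix = [0xFE, 0x80] + [0] * 6  # fe80::/64
    return str(ipaddress.IPv6Address(bytes(prefix + interface_id)))


print(eui64_link_local("fa:16:3e:32:10:68"))
```

For the HWADDR in the ifcfg file this yields fe80::f816:3eff:fe32:1068, the same address the kernel reports as tentative in the good boot.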

Later in the good boot, we seem to go through all of the IPv6 setup:

--- good boot ---
device[0x561cb462d4b0] (ens3): queued IP6 config change
device[0x561cb462d4b0] (ens3): ip6-config: update (commit=0, new-config=0x561cb4681730)
device[0x561cb462d4b0] (ens3): ip6-config: update IP Config instance (/org/freedesktop/NetworkManager/IP6Config/3)
dns-mgr: (device_ip_config_changed): queueing DNS updates (1)
dns-mgr: (device_ip_config_changed): DNS configuration did not change
dns-mgr: (device_ip_config_changed): no DNS changes to commit (0)
device[0x561cb462d4b0] (ens3): linklocal6: waiting for link-local addresses successful, continue with method auto
device[0x561cb462d4b0] (ens3): addrconf6: using the device EUI-64 identifier
device[0x561cb462d4b0] (ens3): ip6-config: update (commit=1, new-config=0x561cb4681730)
platform: address: adding or updating IPv6 address: fe80::f816:3eff:fe32:1068/64 lft forever pref forever lifetime 17-0[4294967295,4294967295] dev 2 flags permanent,noprefixroute src unknown
platform-linux: do-add-ip6-address[2: fe80::f816:3eff:fe32:1068]: success
device[0x561cb462d4b0] (ens3): ip6-config: update IP Config instance (/org/freedesktop/NetworkManager/IP6Config/3)
dns-mgr: (device_ip_config_changed): queueing DNS updates (1)
dns-mgr: (device_ip_config_changed): DNS configuration did not change
dns-mgr: (device_ip_config_changed): no DNS changes to commit (0)
platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/ens3/accept_ra' to '1' (current value is '0')
platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/ens3/accept_ra_defrtr' to '0' (current value is identical)
platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/ens3/accept_ra_pinfo' to '0' (current value is identical)
platform-linux: sysctl: setting '/proc/sys/net/ipv6/conf/ens3/accept_ra_rtr_pref' to '0' (current value is identical)
ndisc[0x561cb46a50e0,"ens3"]: starting neighbor discovery: 2
ndisc-lndp[0x561cb46a50e0,"ens3"]: processing libndp events
ndisc[0x561cb46a50e0,"ens3"]: scheduling RA timeout in 30 seconds
ndisc[0x561cb46a50e0,"ens3"]: scheduling explicit router solicitation request in 0 seconds.
ndisc[0x561cb46a50e0,"ens3"]: router solicitation sent
ndisc[0x561cb46a50e0,"ens3"]: scheduling router solicitation retry in 4 seconds.
ndisc-lndp[0x561cb46a50e0,"ens3"]: processing libndp events
ndisc-lndp[0x561cb46a50e0,"ens3"]: received router advertisement at 18
... and so on ...
---

On the bad boot, by contrast, we get just:

--- bad boot ---
device[0x55aa5477d990] (ens3): linklocal6: waiting for link-local addresses failed due to timeout
device[0x55aa5477d990] (ens3): activation-stage: schedule activate_stage4_ip6_config_timeout,v6 (id 125)
device[0x55aa5477d990] (ens3): activation-stage: invoke activate_stage4_ip6_config_timeout,v6 (id 125)
device[0x55aa5477d990] (ens3): activation-stage: complete activate_stage4_ip6_config_timeout,v6 (id 125)
platform: signal: address 6 changed: fe80::f816:3eff:fe32:1068/64 lft forever pref forever lifetime 20-0[4294967295,4294967295] dev 2 flags permanent,noprefixroute src kernel
device[0x55aa5477d990] (ens3): queued IP6 config change
device[0x55aa5477d990] (ens3): ip6-config: update (commit=0, new-config=0x55aa547cd6c0)
device[0x55aa5477d990] (ens3): ip6-config: update IP Config instance (/org/freedesktop/NetworkManager/IP6Config/3)
---

It never tries to set up the interface, and IPv6 remains unconfigured.

I am guessing we're tickling something a little odd, because about the only Google matches for the message "linklocal6: waiting for link-local addresses failed due to timeout" are links to the source code; there do not seem to be many people reporting this error.

On the bad boot we see

--- bad boot ---

[1570668746.9664] device[0x55aa5477d990] (ens3): linklocal6: starting IPv6 with method 'auto', but the device has no link-local addresses configured. Wait.
[1570668746.9665] device[0x55aa5477d990] (ens3): linklocal6: generated EUI-64 IPv6LL address fe80::f816:3eff:fe32:1068
[1570668762.0184] device[0x55aa5477d990] (ens3): linklocal6: waiting for link-local addresses failed due to timeout
---

On the good boot we see

--- good boot ---
[1570669580.0219] device[0x561cb462d4b0] (ens3): linklocal6: starting IPv6 with method 'auto', but the device has no link-local addresses configured. Wait.
[1570669580.0219] device[0x561cb462d4b0] (ens3): linklocal6: generated EUI-64 IPv6LL address fe80::f816:3eff:fe32:1068
[1570669595.3932] device[0x561cb462d4b0] (ens3): linklocal6: waiting for link-local addresses successful, continue with method auto
---
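To compare the two boots, the time spent waiting can be read straight off the log timestamps. The following is a small sketch that assumes log lines in the "[epoch.fraction] ... linklocal6: ..." shape shown above; the helper name is made up for illustration.

```python
import re
from decimal import Decimal

# Leading "[1570668746.9664]" style timestamp.
STAMP = re.compile(r"^\[(\d+\.\d+)\]")


def linklocal6_wait(lines):
    """Seconds between the 'Wait.' message and the success/timeout message."""
    start = end = None
    for line in lines:
        match = STAMP.match(line.strip())
        if not match or "linklocal6:" not in line:
            continue
        if "Wait." in line:
            start = Decimal(match.group(1))
        elif "waiting for link-local addresses" in line:
            end = Decimal(match.group(1))
    if start is None or end is None:
        raise ValueError("start or end message not found")
    return end - start
```

Fed the two excerpts above, this gives 15.0520 seconds for the bad boot and 15.3713 seconds for the good boot, so in both cases the wait ran right up to the 15-second timeout.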

This must mean we have gone into "check_and_add_ipv6ll_addr" [2] and are calling "ip_config_merge_and_apply".

It feels like we must be in

---
 static void
 linklocal6_check_complete(NMDevice *self)
 {
  ...
	if (   !priv->ext_ip6_config_captured
	    || !nm_ip6_config_find_first_address (priv->ext_ip6_config_captured,
	                                            NM_PLATFORM_MATCH_WITH_ADDRTYPE_LINKLOCAL
	                                          | NM_PLATFORM_MATCH_WITH_ADDRSTATE_NORMAL)) {
		/* we don't have a non-tentative link local address yet. Wait longer. */
		return;
	}
 }
---

I rebuilt NM with a debug message in there, which was hit:

---
<debug> [1570679718.1360] device[0x563b351930f0] (ens3): linklocal6: no non-tentative address yet
---

So I went further and doubled the timeout:

---
diff -ur ./NetworkManager-1.12.6-orig/src/devices/nm-device.c ./NetworkManager-1.12.6/src/devices/nm-device.c
--- a/src/devices/nm-device.c   2019-10-10 03:58:12.950912422 +0000
+++ b/src/devices/nm-device.c   2019-10-10 04:00:15.503037320 +0000
@@ -8359,7 +8359,7 @@
         * FIXME: use dad/retrans sysctl values if they are higher than a minimum time.
         * (rh #1101809)
         */
-       priv->linklocal6_timeout_id = g_timeout_add_seconds (15, linklocal6_timeout_cb, self);
+       priv->linklocal6_timeout_id = g_timeout_add_seconds (30, linklocal6_timeout_cb, self);
        return FALSE;
 }
---

With this increased timeout in place, I can *not* seem to replicate the failure to get an address.  The host starts, the initial logs look the same (DOWN state), and it takes a little while to get going on IPv6 (I can actually log in via IPv4 and see the tentative IPv6 address), but it gets its global IPv6 address after a short period (I've rebooted the host 10+ times now).

I have not determined exactly why the interface starts DOWN sometimes and UP other times.  Perhaps it has something to do with IPv6 autoconfiguration and DAD requests from the kernel?  Note that I have no idea what in the cloud provider responds to these requests; perhaps sometimes we get lucky and get a faster response, and other times not?

But this seems to match the empirical behaviour that a "service NetworkManager restart" consistently brings up the interface: by this time the LL address is populated and NM goes on to configure the interface.  Because the timeout is too short, NM gives up on the initial try and leaves the interface unconfigured.

We run exactly the same VM images in quite a few different clouds and only seem to hit this on a small minority of them.  But I think that lends further credence to the idea that "in the wild" 15 seconds isn't enough time to wait for the LL address to be configured.  Raising the timeout might be a solution, or making it a configurable value might help too.  If the message were something like "waiting for link-local addresses failed due to timeout; consider increasing IPV6_LINKLOCAL_TIMEOUT", I imagine that would be really helpful.

[1] https://github.com/NetworkManager/NetworkManager/blob/master/src/devices/nm-device.c#L9094
[2] https://github.com/NetworkManager/NetworkManager/blob/e36c297fd8c6b1b57cd120739cc5ee8eab57aa08/src/devices/nm-device.c#L9143

Comment 1 Ian Wienand 2019-10-10 04:49:48 UTC

Created attachment 1624174 [details]
Boot where IPv6 is not configured

Comment 2 Ian Wienand 2019-10-10 04:50:28 UTC

Created attachment 1624175 [details]
Boot where IPv6 is configured correctly

Comment 3 Ian Wienand 2019-10-10 04:51:14 UTC

Created attachment 1624176 [details]
Boot where IPv6 is not configured (timestamps removed for side-by-side)

Comment 4 Ian Wienand 2019-10-10 04:51:51 UTC

Created attachment 1624177 [details]
Boot where IPv6 is configured correctly (timestamps removed for side-by-side)

Comment 5 Ian Wienand 2019-10-10 04:52:34 UTC

Created attachment 1624178 [details]
Boot where IPv6 is not configured, then a restart of the NetworkManager service

Comment 6 Ian Wienand 2019-10-10 04:54:55 UTC

Created attachment 1624179 [details]
Increased link-local timeout

Comment 7 Beniamino Galvani 2019-10-10 06:11:13 UTC

From the first log file (bad.txt):

<debug> [1570668746.9669] platform: signal: address 6   added: fe80::f816:3eff:fe32:1068/64 lft forever pref forever lifetime 1-0[4294967295,4294967295] dev 2 flags permanent,noprefixroute,tentative src kernel
...
<debug> [1570668765.3705] platform: signal: address 6 changed: fe80::f816:3eff:fe32:1068/64 lft forever pref forever lifetime 20-0[4294967295,4294967295] dev 2 flags permanent,noprefixroute src kernel

So, yes, the problem is that the link-local address remains tentative for too long due to duplicate address detection done by the kernel (~18 seconds, while the timeout in NM is 15). It should usually take 1 or 2 seconds. Do you have any special sysctl configuration for IPv6? What is the output of:

 sysctl -a --pattern 'net.ipv6.conf.(all|ens3)'

?
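The "~18 seconds" figure can be reproduced from the two platform log lines quoted above by looking at when the "tentative" flag disappears. This is only a sketch that assumes the NM debug log line shape shown in this comment; the regular expression and function name are illustrative.

```python
import re
from decimal import Decimal

# Matches NM platform debug lines for IPv6 address add/change signals.
LINE = re.compile(
    r"\[(?P<ts>\d+\.\d+)\] platform: signal: address 6\s+(?:added|changed): "
    r"(?P<addr>\S+) .* flags (?P<flags>\S+) src"
)


def tentative_duration(lines, addr):
    """Seconds from the first 'tentative' sighting of addr until the flag is gone."""
    first_tentative = None
    for line in lines:
        match = LINE.search(line)
        if not match or match.group("addr") != addr:
            continue
        flags = match.group("flags").split(",")
        stamp = Decimal(match.group("ts"))
        if "tentative" in flags:
            if first_tentative is None:
                first_tentative = stamp
        elif first_tentative is not None:
            return stamp - first_tentative
    return None  # never seen tentative, or never left the tentative state
```

For fe80::f816:3eff:fe32:1068/64 and the two lines from bad.txt, the result is 18.4036 seconds.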

Comment 8 Beniamino Galvani 2019-10-10 06:42:47 UTC

> I have not exactly determined why the interface starts DOWN sometimes and UP other times? Perhaps something to do with ipv6 autoconfiguration and DAD requests from the kernel?  Note I have no idea what's happening in the cloud provider that responds to these requests -- perhaps sometimes we get lucky and get a faster response but other times not?

I don't think this is the case; the kernel sends a neighbor discovery probe for the tentative address, and if there is no response within 1 second it promotes the address to non-tentative. If somebody else were using the same address, the address would get the dadfailed flag, which doesn't happen according to the logs.

> But this seems to match the empirical behaviour that a "service NetworkManager restart" consistently brings up the interface -- by this time the LL address is populated and NM continues to configure the interface.  The timeout being too short means that on the initial try NM gives up, and leaves the interface unconfigured.

> We run exactly the same VM images in quite a few different clouds and we only seem to hit this on a small minority of them.  But I think that lends further credence to the idea that "in the wild" 15 seconds isn't enough time to wait for the LL address to be configured.  

Yes, that is strange, especially because after restarting NM the address becomes non-tentative much faster. But I think the interval depends only on the kernel, not on external factors.

Comment 9 Ian Wienand 2019-10-10 06:50:56 UTC

(In reply to Beniamino Galvani from comment #7)
>It should usually take 1 or 2 seconds. 

Now I know what question to ask :) I can certainly talk to the cloud provider about why the response might be taking so long.  There might be logs on their end that will help.

> Do you have special sysctl configuration for IPv6? 

No, nothing in particular

> What is the output of: 
> 
>  sysctl -a --pattern 'net.ipv6.conf.(all|ens3)'

# sysctl -a --pattern 'net.ipv6.conf.(all|ens3)'
net.ipv6.conf.all.accept_dad = 0
net.ipv6.conf.all.accept_ra = 1
net.ipv6.conf.all.accept_ra_defrtr = 1
net.ipv6.conf.all.accept_ra_from_local = 0
net.ipv6.conf.all.accept_ra_min_hop_limit = 1
net.ipv6.conf.all.accept_ra_mtu = 1
net.ipv6.conf.all.accept_ra_pinfo = 1
net.ipv6.conf.all.accept_ra_rt_info_max_plen = 0
net.ipv6.conf.all.accept_ra_rt_info_min_plen = 0
net.ipv6.conf.all.accept_ra_rtr_pref = 1
net.ipv6.conf.all.accept_redirects = 1
net.ipv6.conf.all.accept_source_route = 0
net.ipv6.conf.all.addr_gen_mode = 0
net.ipv6.conf.all.autoconf = 1
net.ipv6.conf.all.dad_transmits = 1
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.all.disable_policy = 0
net.ipv6.conf.all.drop_unicast_in_l2_multicast = 0
net.ipv6.conf.all.drop_unsolicited_na = 0
net.ipv6.conf.all.enhanced_dad = 1
net.ipv6.conf.all.force_mld_version = 0
net.ipv6.conf.all.force_tllao = 0
net.ipv6.conf.all.forwarding = 0
net.ipv6.conf.all.hop_limit = 64
net.ipv6.conf.all.ignore_routes_with_linkdown = 0
net.ipv6.conf.all.keep_addr_on_down = 0
net.ipv6.conf.all.max_addresses = 16
net.ipv6.conf.all.max_desync_factor = 600
net.ipv6.conf.all.mc_forwarding = 0
net.ipv6.conf.all.mldv1_unsolicited_report_interval = 10000
net.ipv6.conf.all.mldv2_unsolicited_report_interval = 1000
net.ipv6.conf.all.mtu = 1280
net.ipv6.conf.all.ndisc_notify = 0
net.ipv6.conf.all.ndisc_tclass = 0
net.ipv6.conf.all.optimistic_dad = 0
net.ipv6.conf.all.proxy_ndp = 0
net.ipv6.conf.all.regen_max_retry = 3
net.ipv6.conf.all.router_probe_interval = 60
net.ipv6.conf.all.router_solicitation_delay = 1
net.ipv6.conf.all.router_solicitation_interval = 4
net.ipv6.conf.all.router_solicitation_max_interval = 3600
net.ipv6.conf.all.router_solicitations = -1
net.ipv6.conf.all.seg6_enabled = 0
net.ipv6.conf.all.seg6_require_hmac = 0
net.ipv6.conf.all.suppress_frag_ndisc = 1
net.ipv6.conf.all.temp_prefered_lft = 86400
net.ipv6.conf.all.temp_valid_lft = 604800
net.ipv6.conf.all.use_oif_addrs_only = 0
net.ipv6.conf.all.use_optimistic = 0
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.ens3.accept_dad = 1
net.ipv6.conf.ens3.accept_ra = 1
net.ipv6.conf.ens3.accept_ra_defrtr = 0
net.ipv6.conf.ens3.accept_ra_from_local = 0
net.ipv6.conf.ens3.accept_ra_min_hop_limit = 1
net.ipv6.conf.ens3.accept_ra_mtu = 1
net.ipv6.conf.ens3.accept_ra_pinfo = 0
net.ipv6.conf.ens3.accept_ra_rt_info_max_plen = 0
net.ipv6.conf.ens3.accept_ra_rt_info_min_plen = 0
net.ipv6.conf.ens3.accept_ra_rtr_pref = 0
net.ipv6.conf.ens3.accept_redirects = 1
net.ipv6.conf.ens3.accept_source_route = 0
net.ipv6.conf.ens3.addr_gen_mode = 1
net.ipv6.conf.ens3.autoconf = 1
net.ipv6.conf.ens3.dad_transmits = 1
net.ipv6.conf.ens3.disable_ipv6 = 0
net.ipv6.conf.ens3.disable_policy = 0
net.ipv6.conf.ens3.drop_unicast_in_l2_multicast = 0
net.ipv6.conf.ens3.drop_unsolicited_na = 0
net.ipv6.conf.ens3.enhanced_dad = 1
net.ipv6.conf.ens3.force_mld_version = 0
net.ipv6.conf.ens3.force_tllao = 0
net.ipv6.conf.ens3.forwarding = 0
net.ipv6.conf.ens3.hop_limit = 64
net.ipv6.conf.ens3.ignore_routes_with_linkdown = 0
net.ipv6.conf.ens3.keep_addr_on_down = 0
net.ipv6.conf.ens3.max_addresses = 16
net.ipv6.conf.ens3.max_desync_factor = 600
net.ipv6.conf.ens3.mc_forwarding = 0
net.ipv6.conf.ens3.mldv1_unsolicited_report_interval = 10000
net.ipv6.conf.ens3.mldv2_unsolicited_report_interval = 1000
net.ipv6.conf.ens3.mtu = 1450
net.ipv6.conf.ens3.ndisc_notify = 0
net.ipv6.conf.ens3.ndisc_tclass = 0
net.ipv6.conf.ens3.optimistic_dad = 0
net.ipv6.conf.ens3.proxy_ndp = 0
net.ipv6.conf.ens3.regen_max_retry = 3
net.ipv6.conf.ens3.router_probe_interval = 60
net.ipv6.conf.ens3.router_solicitation_delay = 30
net.ipv6.conf.ens3.router_solicitation_interval = 4
net.ipv6.conf.ens3.router_solicitation_max_interval = 3600
net.ipv6.conf.ens3.router_solicitations = -1
net.ipv6.conf.ens3.seg6_enabled = 0
net.ipv6.conf.ens3.seg6_require_hmac = 0
net.ipv6.conf.ens3.suppress_frag_ndisc = 1
net.ipv6.conf.ens3.temp_prefered_lft = 86400
net.ipv6.conf.ens3.temp_valid_lft = 604800
net.ipv6.conf.ens3.use_oif_addrs_only = 0
net.ipv6.conf.ens3.use_optimistic = 0
net.ipv6.conf.ens3.use_tempaddr = 0
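With this many keys it is easy to miss where the interface deviates from the "all" values. A small sketch for listing those differences from `sysctl -a` style text follows; the function name is hypothetical, and it assumes interface names without dots.

```python
def sysctl_overrides(text, iface):
    """Map key -> (all_value, iface_value) for keys where iface differs from 'all'."""
    values = {"all": {}, iface: {}}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        parts = name.strip().split(".")
        # Expect exactly net.ipv6.conf.<scope>.<key>; skip anything else.
        if not sep or len(parts) != 5 or parts[:3] != ["net", "ipv6", "conf"]:
            continue
        scope, key = parts[3], parts[4]
        if scope in values:
            values[scope][key] = value.strip()
    return {
        key: (values["all"][key], val)
        for key, val in values[iface].items()
        if key in values["all"] and values["all"][key] != val
    }
```

Run over the output above with iface "ens3", the result includes, among others, router_solicitation_delay with ("1", "30").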

Comment 10 Ian Wienand 2019-10-10 06:54:46 UTC

(In reply to Beniamino Galvani from comment #8)
> > I have not exactly determined why the interface starts DOWN sometimes and UP other times? Perhaps something to do with ipv6 autoconfiguration and DAD requests from the kernel?  Note I have no idea what's happening in the cloud provider that responds to these requests -- perhaps sometimes we get lucky and get a faster response but other times not?
> 
> I don't think this is the case; the kernel sends a neighbor discovery for the
> tentative address and if there is no response within 1 second it promotes
> the address to non-tentative. If somebody else is using the same address the
> address should get the dadfailed flag, which doesn't happen according to
> logs.

Hrm, it certainly remains tentative for longer than that.  The host can actually boot, and I can log in via ssh (over IPv4), quickly run "ip addr", and still see the address as tentative before it changes:

---
-bash-4.2# ssh root.48.15
 Last login: Thu Oct 10 04:28:35 2019 from 2001:44b8:3177:ac00:b321:eb44:783:2320
 [root@ianw-test-glean-centos ~]# ip addr
 ...
 2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP group default qlen 1000
     link/ether fa:16:3e:32:10:68 brd ff:ff:ff:ff:ff:ff
     inet 192.168.48.15/24 brd 192.168.48.255 scope global dynamic noprefixroute ens3
        valid_lft 86395sec preferred_lft 86395sec
     inet6 fe80::f816:3eff:fe32:1068/64 scope link tentative noprefixroute 
        valid_lft forever preferred_lft forever
---

> Yes, that is strange, especially because after restarting NM the address
> becomes non-tentative much faster. But I think the interval only depends on
> the kernel, not on external factors.

I think that by the time I can restart NM, the address is already permanent, so it never goes back into the tentative state ("becomes non-tentative much faster", i.e. when NM is restarted after login, the address no longer needs to change state).

Comment 11 Ian Wienand 2019-10-11 05:36:38 UTC

Upon further debugging, I think that we have caused this ourselves by increasing the RA delay [1] (the router_solicitation_delay sysctl, which is 30 for ens3 in comment 9).

The root cause is (again, I think) that our network configuration tool brings the interface UP before NetworkManager starts [2] (it does this in an attempt to probe which interfaces seem active and thus should be configured).

This means that in some cases the interface can already have accepted an RA and have an IPv6 address assigned; after that NM refuses to configure the interface any further.  We had increased the RA delay to try to work around this before we understood this behaviour (it is very similar to what is described in [3]).

So I think this "bug" is a red herring related to all that.  It still might be better if the timeout were configurable, but I don't think it's worth the effort.  Hopefully this leaves some breadcrumbs for anyone else who experiences something similar.

[1] https://opendev.org/openstack/diskimage-builder/src/commit/5b5385cf84a422b0394f6bd95d7700f2f8a9bf86/diskimage_builder/elements/simple-init/post-install.d/80-simple-init#L60
[2] https://review.opendev.org/#/c/688031/
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755202

Comment 12 Beniamino Galvani 2019-10-11 08:11:51 UTC

Ah, right. I didn't notice router_solicitation_delay also delays DAD. To avoid the problem with NM picking up the existing configuration, I think a better solution would be to flush addresses on the interface and bring it down.
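As a rough way to see why 15 seconds is not enough here, the numbers from comment 9 can be turned into an upper bound. This is a back-of-the-envelope sketch, not kernel or NM code: it assumes the start of DAD can be delayed by up to router_solicitation_delay seconds (as noted above) and that each DAD probe waits one retransmission interval, taken here as 1000 ms, which is an assumed default that does not appear in the sysctl output in this bug. The constant and function names are made up.

```python
NM_LINKLOCAL6_TIMEOUT_S = 15  # the hard-coded wait discussed in this bug


def worst_case_tentative_seconds(rs_delay_s, dad_transmits, retrans_ms=1000):
    """Upper bound on how long a link-local address may stay tentative.

    Assumption: DAD may start up to rs_delay_s late, then sends
    dad_transmits probes spaced retrans_ms apart.
    """
    return rs_delay_s + dad_transmits * retrans_ms / 1000


# Values for ens3 in comment 9: router_solicitation_delay = 30, dad_transmits = 1.
bound = worst_case_tentative_seconds(30, 1)
print(bound, bound > NM_LINKLOCAL6_TIMEOUT_S)
```

With the "all" value of router_solicitation_delay (1 second) the same bound is 2 seconds, which lines up with the "1 or 2 seconds" expected in comment 7.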