2091885 – ipv6 default entry from RA stays after a change in the link-local address of the router

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	NetworkManager
Sub Component:
Version:	8.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Beniamino Galvani
QA Contact:	Matej Berezny
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	2096175 2097623 2097624

Reported:	2022-05-31 09:08 UTC by Konstantinos
Modified:	2022-11-08 11:23 UTC
CC List:	15 users
Fixed In Version:	NetworkManager-1.39.7-2.el9
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	2096175 2097623 2097624
Environment:
Last Closed:	2022-11-08 10:10:38 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
Reproducer script (1.66 KB, application/x-shellscript) 2022-06-01 13:55 UTC, Beniamino Galvani

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	RHELPLAN-123736	0	None	None	None	2022-05-31 09:16:12 UTC
Red Hat Product Errata	RHBA-2022:7680	0	None	None	None	2022-11-08 10:12:04 UTC

Description Konstantinos 2022-05-31 09:08:09 UTC

Description of problem:
In an IPv6-only deployment of OCP 4.10:
The router sends RAs from the link-local address fe80::5054:ff:fe3d:d3f4/64. When that link-local address changes for some reason (new device, VRRP change), the OCP nodes are left with a wrong default route.



Version-Release number of selected component (if applicable):
```
oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.5    True        True          18d    
[root@openshift-master-0 ~]# nmcli --version
nmcli tool, version 1.30.0-13.el8_4
[root@openshift-master-0 ~]# uname -r
4.18.0-305.40.2.el8_4.x86_64
```

How reproducible:


Steps to Reproduce:
1. Have a router sending RA (e.g. radvd)
2. Change the link-local address 
```
sudo ip addr add fe80::1/64 dev hub-baremetal
sudo ip addr del fe80::5054:ff:fe3d:d3f4/64 dev hub-baremetal
sudo systemctl restart radvd
```
3. Inspect the routing table on the nodes with `ip -6 -d route` and look at the default route.

Actual results:

```
[root@openshift-master-0 ~]# ndptool monitor -t ra
NDP payload len 80, from addr: fe80::5054:ff:fe3d:d3f4, iface: br-ex
  Type: RA
  Hop limit: 64
  Managed address configuration: yes
  Other configuration: no
  Default router preference: medium
  Router lifetime: 3600s
  Reachable time: unspecified
  Retransmit time: unspecified
  Source linkaddr: 52:54:00:3d:d3:f4
  Prefix: 2620:52:0:1305::/64, valid_time: 86400s, preferred_time: 14400s, on_link: yes, autonomous_addr_conf: yes, router_addr: yes
  Route: ::/0, lifetime: 9000s, preference: low
[root@openshift-master-0 ~]# ip -6 -d route 
...
unicast default via fe80::5054:ff:fe3d:d3f4 dev br-ex proto ra scope global metric 49 pref medium
```

After the link-local address change:
```
[root@openshift-master-0 ~]# ndptool monitor -t ra
NDP payload len 80, from addr: fe80::1, iface: br-ex <---- link-local change
  Type: RA
  Hop limit: 64
  Managed address configuration: yes
  Other configuration: no
  Default router preference: medium
  Router lifetime: 3600s
  Reachable time: unspecified
  Retransmit time: unspecified
  Source linkaddr: 52:54:00:3d:d3:f4
  Prefix: 2620:52:0:1305::/64, valid_time: 86400s, preferred_time: 14400s, on_link: yes, autonomous_addr_conf: yes, router_addr: yes

[root@openshift-master-0 ~]# ip -6 -d route 
unicast default proto ra scope global metric 49 pref medium
        nexthop via fe80::5054:ff:fe3d:d3f4 dev br-ex weight 1 <-- this is a dead entry
        nexthop via fe80::1 dev br-ex weight 1
```

Expected results:

This entry should not exist:
```
        nexthop via fe80::5054:ff:fe3d:d3f4 dev br-ex weight 1
```
The expiry information should also be visible in the output of `ip -6 route`, e.g.:
```
default via fe80::1a8b:9dff:fed4:822 dev enp0s31f6 proto ra metric 1024 expires 1798sec mtu 1500 pref medium
```
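
A minimal sketch of how the expected result could be checked automatically. It assumes the text layout of `ip -6 route` shown in this report; the function names `default_route_gateways` and `stale_gateways` are hypothetical helpers, not part of any existing tool.

```python
def default_route_gateways(ip_route_output):
    """Return every gateway listed for IPv6 default routes.

    Handles both the single-line form ("default via X dev ...") and the
    multipath form, where gateways appear on indented "nexthop via X" lines.
    """
    gateways = []
    in_default = False
    for line in ip_route_output.splitlines():
        tokens = line.split()
        if not tokens:
            in_default = False
            continue
        if not line[0].isspace():
            # A non-indented line starts a new route entry. With `ip -d`
            # the route type ("unicast") comes first and is skipped here.
            if tokens[0] == "unicast":
                tokens = tokens[1:]
            in_default = bool(tokens) and tokens[0] == "default"
        elif tokens[0] != "nexthop":
            continue
        if in_default and "via" in tokens[:-1]:
            gateways.append(tokens[tokens.index("via") + 1])
    return gateways


def stale_gateways(ip_route_output, current_routers):
    """Gateways in the default route that no current router announces."""
    return [gw for gw in default_route_gateways(ip_route_output)
            if gw not in current_routers]


# Output copied from the "After" state in this report.
SAMPLE_ROUTES = """\
unicast default proto ra scope global metric 49 pref medium
        nexthop via fe80::5054:ff:fe3d:d3f4 dev br-ex weight 1
        nexthop via fe80::1 dev br-ex weight 1
"""

print(stale_gateways(SAMPLE_ROUTES, {"fe80::1"}))
```

With the router now advertising only from fe80::1, any address returned by `stale_gateways` is a next-hop that should have been removed.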
Additional info:

In the node we have
```
[root@openshift-master-0 ~]# sysctl -a|grep "net.ipv6.conf.br-ex.accept_ra "
net.ipv6.conf.br-ex.accept_ra = 0
```
Is that expected? I would have guessed that we need value `2`
Why do we not see the expiry information?

Comment 2 Konstantinos 2022-05-31 09:18:38 UTC

I would expect that on a graceful stop of the router, an RA is sent out with lifetime 0 so that the route is removed instantly. Without a graceful shutdown, I would expect the route entry to expire (Router lifetime 3600s in the RA).

Comment 3 Beniamino Galvani 2022-05-31 09:52:18 UTC

Hi,

the default route has an expiration that is tracked internally to NM.

Initially it is:

 <debug> [1653986718.9211] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3600.000

After the gateway changes, it becomes:

 <debug> [1653986796.3509] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3580.010
 <debug> [1653986796.3510] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::1 pref medium exp 3600.000

 ...

 <debug> [1653987214.5928] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3161.769
 <debug> [1653987214.5928] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::1 pref medium exp 3600.000

The first route will be removed eventually when the gateway information from RA expires.
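
Since the expiration is currently visible only in the logs, a small parser can pull it out. This is a sketch that assumes the debug line layout shown in this comment (other NM versions may log differently). `GATEWAY_RE` and `latest_gateway_expiry` are hypothetical names.

```python
import re

# Matches lines such as:
#   <debug> [1653986796.3509] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::1 pref medium exp 3600.000
GATEWAY_RE = re.compile(
    r'\[(?P<ts>\d+\.\d+)\]\s+ndisc\[[^,\]]+,"(?P<iface>[^"]+)"\]:\s+'
    r'gateway (?P<gateway>\S+) pref (?P<pref>\w+) exp (?P<exp>\d+(?:\.\d+)?)'
)


def latest_gateway_expiry(log_text):
    """Map each gateway to (log timestamp, remaining seconds) from its last log line."""
    latest = {}
    for line in log_text.splitlines():
        match = GATEWAY_RE.search(line)
        if match:
            latest[match["gateway"]] = (float(match["ts"]), float(match["exp"]))
    return latest


# Lines copied from this comment.
SAMPLE_LOG = """\
 <debug> [1653986796.3509] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3580.010
 <debug> [1653986796.3510] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::1 pref medium exp 3600.000
 <debug> [1653987214.5928] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3161.769
 <debug> [1653987214.5928] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::1 pref medium exp 3600.000
"""

for gateway, (timestamp, remaining) in latest_gateway_expiry(SAMPLE_LOG).items():
    print(gateway, timestamp, remaining)
```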

Comment 4 Beniamino Galvani 2022-05-31 09:53:52 UTC

(In reply to Konstantinos from comment #0)
> In the node we have
> ```
> [root@openshift-master-0 ~]# sysctl -a|grep "net.ipv6.conf.br-ex.accept_ra "
> net.ipv6.conf.br-ex.accept_ra = 0
> ```
> Is that expected? I would have guessed that we need value `2`

Yes, NM manages IPv6 in userspace and to do so disables some kernel sysctls.
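
For reference, a sketch of reading that sysctl directly and interpreting it. The value meanings follow the kernel's ip-sysctl documentation. `read_accept_ra` and its `proc_sys` parameter are hypothetical names introduced here so the function can be pointed at a test directory.

```python
from pathlib import Path

# Meanings of net.ipv6.conf.<iface>.accept_ra per the kernel documentation.
ACCEPT_RA_MEANING = {
    0: "do not accept Router Advertisements",
    1: "accept Router Advertisements if forwarding is disabled",
    2: "accept Router Advertisements even if forwarding is enabled",
}


def read_accept_ra(iface, proc_sys="/proc/sys"):
    """Read the kernel's accept_ra setting for one interface."""
    path = Path(proc_sys) / "net" / "ipv6" / "conf" / iface / "accept_ra"
    return int(path.read_text().strip())


def describe_accept_ra(value):
    """Translate an accept_ra value into a short description."""
    return ACCEPT_RA_MEANING.get(value, "unknown value")
```

On an interface managed by NM, a value of 0 only means that the kernel itself ignores RAs; it says nothing about whether NM processes them in userspace.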

Comment 5 Beniamino Galvani 2022-05-31 10:04:04 UTC

> I would expect that on a graceful stop of the router, an RA will be sent out with lifetime 0 so the route to be removed instantly

From the radvd.conf man page it seems that you need to explicitly set "AdvDefaultLifetime 0" to announce a zero lifetime for the gateway.

Comment 6 Konstantinos 2022-05-31 11:01:58 UTC

Thanks for the replies!

> The first route will be removed eventually when the gateway information from RA expires.

This is not happening AFAICT. Is there a way to see the remaining expiry time per route entry (probably yes from the logs, but is there any other way)?

Here are the logs after running for around 2 hours:
```
May 31 08:46:16 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653986776.3611] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 3600.000
May 31 09:46:07 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653990367.2916] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 9.070
[root@openshift-master-0 ~]# date
Tue May 31 10:56:31 UTC 2022
[root@openshift-master-0 ~]# ip -6 route|grep -A2 "default proto ra"
default proto ra metric 49 pref medium
        nexthop via fe80::5054:ff:fe3d:d3f4 dev br-ex weight 1
        nexthop via fe80::1 dev br-ex weight 1


[root@openshift-master-0 ~]# journalctl -b _SYSTEMD_UNIT=NetworkManager.service | grep "fe80::5054:ff:fe3d:d3f4" |tail -4
May 31 09:46:07 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653990367.2916] ndisc[0x55b5fe757cf0,"br-ex"]:   gateway fe80::5054:ff:fe3d:d3f4 pref medium exp 9.070
May 31 09:46:07 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653990367.2923] platform: (br-ex) route: append     IPv6 route: type unicast ::/0 via fe80::5054:ff:fe3d:d3f4 dev 11 metric 49 mss 0 rt-src ndisc
May 31 09:46:07 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653990367.2923] platform-linux: do-add-ip6-route[type unicast ::/0 via fe80::5054:ff:fe3d:d3f4 dev 11 metric 49 mss 0 rt-src rt-ra]: failure 17 (File exists)
May 31 09:46:07 openshift-master-0.hub-virtual.lab NetworkManager[3245]: <debug> [1653990367.2923] platform: (br-ex) route-sync: adding route type unicast ::/0 via fe80::5054:ff:fe3d:d3f4 dev 11 metric 49 mss 0 rt-src ndisc failed with EEXIST, however we cannot find such a route
```


> From the radvd.conf man page it seems that you need to explicitly set "AdvDefaultLifetime 0" to announce a zero lifetime for the gateway.
Ack, I can check that as well. Nevertheless, radvd is just my setup and I cannot tell what happens with real routers.

Thx!
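
The timing in the logs above can be checked with simple arithmetic. This is a sketch that uses only the timestamps, the `exp` values and the `date` output quoted in this comment; `expiry_time` is a hypothetical helper.

```python
from datetime import datetime, timezone


def expiry_time(log_timestamp, remaining_seconds):
    """Absolute expiry (epoch seconds) implied by one ndisc log line."""
    return log_timestamp + remaining_seconds


# The two log lines for gateway fe80::5054:ff:fe3d:d3f4.
first = expiry_time(1653986776.3611, 3600.000)
last = expiry_time(1653990367.2916, 9.070)

# Output of `date` when the routing table was inspected.
checked_at = datetime(2022, 5, 31, 10, 56, 31, tzinfo=timezone.utc).timestamp()
overdue_seconds = checked_at - last

print(datetime.fromtimestamp(last, tz=timezone.utc).strftime("%H:%M:%S"))
print(round(overdue_seconds / 60))
```

Both log lines agree on an expiry at 09:46:16 UTC, so the old next-hop was still in the routing table about 70 minutes after NM's own timer said it should be gone.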

Comment 7 Konstantinos 2022-05-31 11:24:39 UTC

btw, when I run `systemctl stop radvd`, I observe by default the following RA:
```
interface hub-baremetal
{
        AdvSendAdvert on;
        # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
        AdvManagedFlag on;
        AdvOtherConfigFlag off;
        AdvReachableTime 0;
        AdvRetransTimer 0;
        AdvCurHopLimit 64;
        AdvDefaultLifetime 0;
        AdvHomeAgentFlag off;
        AdvDefaultPreference medium;
        AdvSourceLLAddress on;

        prefix 2620:52:0:1305::/64
        {
                AdvValidLifetime 86400;
                AdvPreferredLifetime 14400;
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr on;
        }; # End of prefix definition


        route ::/0
        {
                AdvRoutePreference low;
                AdvRouteLifetime 0;
        }; # End of route definition

}; 

```

and that is probably because of this radvd.conf option:
```
RemoveRoute on|off
Upon shutdown, announce this route with a zero second lifetime. This should cause the route to be immediately removed from the receiving end-nodes' route table.

Default: on
```

Comment 8 Beniamino Galvani 2022-05-31 11:49:24 UTC

(In reply to Konstantinos from comment #6)
> Thanks for the replies!
> 
> > The first route will be removed eventually when the gateway information from RA expires.
> 
> This is not happening AFAICT

Can you attach full logs?

> is there a way to see the remaining expiring
> time per route entry(probably yes from logs but any other way)?

At the moment no, it's only in logs. I think NM should set the lifetime also in kernel so that it can be seen with "ip route".

Comment 9 Beniamino Galvani 2022-05-31 11:50:20 UTC

(In reply to Konstantinos from comment #7)
> btw when I do `systemctl stop radvd` then by default I observe an RA
> ...
> and probably because  of that 
> ```
> RemoveRoute on|off

Right, good catch.

Comment 11 Konstantinos 2022-05-31 12:16:29 UTC

I have attached the logs after almost 5 hours.

> I think NM should set the lifetime also in kernel so that it can be seen with "ip route"
I see that this is not the case for me (I tried `ip -6 -d route`). Is it a bug?

Comment 12 Beniamino Galvani 2022-06-01 13:51:54 UTC

> > The first route will be removed eventually when the gateway information from RA expires.
>
> This is not happening AFAICT

It's a bug in NetworkManager caused by the presence of a multipath
IPv6 route. When there are two default routes, the kernel merges them
as:

 default proto ra metric 101 pref medium
	 nexthop via fe80::38e1:1bff:fe6a:e266 dev eth0 weight 1
	 nexthop via fe80::1 dev eth0 weight 1

Before commit [1], NM doesn't parse multipath routes, so it might
get a partial view of what is configured in the kernel. Specifically,
it is aware of only one of the next-hops; then, when one of the two
routes expires and must be removed, NM could decide that the route is
already missing from the kernel and that no action is necessary.

[1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/dac12a8d6178a6d82fc0913ad8825c8556380ba8
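
The failure mode described above can be illustrated with a toy model. It is not NetworkManager's code. The rule "only delete what is believed to be present" and all names are assumptions made for illustration, and which next-hop ends up in the partial view is also assumed.

```python
def removal_planned(expired_gateway, nexthops_known):
    """Return True when a delete request would be issued for the expired gateway.

    Toy rule: a route is deleted only if the daemon believes it is present.
    """
    return expired_gateway in nexthops_known


# What the kernel actually holds after merging the two default routes.
KERNEL_NEXTHOPS = {"fe80::38e1:1bff:fe6a:e266", "fe80::1"}

# Assumption: without multipath parsing, only one next-hop is visible.
PARTIAL_VIEW = {"fe80::1"}

EXPIRED = "fe80::38e1:1bff:fe6a:e266"

print(removal_planned(EXPIRED, PARTIAL_VIEW))
print(removal_planned(EXPIRED, KERNEL_NEXTHOPS))
```

With the partial view, the expired gateway looks absent and nothing is deleted, so it stays in the kernel. With the full set of next-hops, the removal is planned.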

Comment 13 Beniamino Galvani 2022-06-01 13:53:26 UTC

(In reply to Konstantinos from comment #11)
> > I think NM should set the lifetime also in kernel so that it can be seen with "ip route"
> I see that this is not the case for me, (I try ip -6 -d route), is it a bug?

I meant, ideally it should do that in the future. It doesn't at the moment.

Comment 14 Beniamino Galvani 2022-06-01 13:55:28 UTC

Created attachment 1885802 [details]
Reproducer script

Comment 27 Beniamino Galvani 2022-06-23 07:33:05 UTC

Moving to MODIFIED as this is already fixed in RHEL 8.7.

Comment 31 errata-xmlrpc 2022-11-08 10:10:38 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7680
