Bug 2060684

Summary: platform cache inconsistency with `ip route change` for IPv6 multipath routes
Product: Red Hat Enterprise Linux 9 Reporter: Thomas Haller <thaller>
Component: NetworkManager Assignee: Thomas Haller <thaller>
Status: CLOSED ERRATA QA Contact: Matej Berezny <mberezny>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 9.2 CC: bgalvani, djasa, ferferna, lrintel, mberezny, rkhan, sfaye, sukulkar, till, vbenes
Target Milestone: rc Keywords: Triaged
Target Release: --- Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: NetworkManager-1.41.90-1.el9 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-05-09 08:17:27 UTC Type: Bug

Description Thomas Haller 2022-03-04 00:26:03 UTC
Try this:

>>>

ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v

ip netns exec x ip monitor route &

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::3 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v

<<<


First we add 3 routes, which the kernel merges into a single multipath route.

Then `ip route change` drops them all. The RTM_NEWROUTE message carries the `NLM_F_REPLACE` flag, but the kernel actually replaced the entire multipath route, not just one next hop.

NetworkManager now splits such ECMP routes into multiple single-hop routes. Our current handling of `NLM_F_REPLACE` only removes the first route (in an ordered list), so the remaining routes leak in the cache:

`./src/core/platform/tests/monitor -p` will show:

```
 <debug> [1646353215.3356] platform: (v) signal: route   6   added: type unicast 1:2:3:4::/64 dev 12 metric 256 mss 0 rt-src rt-kernel
 <debug> [1646353215.3357] platform: (v) signal: address 6   added: 1:2:3:4::100/64 lft forever pref forever lifetime 7-0[4294967295,4294967295] dev 12 flags tentative,permanent src kernel
 <debug> [1646353215.3602] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::1 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3681] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::2 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3749] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::3 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3866] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::5 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3867] platform: (v) signal: route   6 removed: type unicast 5:1::1/128 via 1:2:3:4::1 dev 12 metric 1024 mss 0 rt-src rt-boot
```



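This is wrong: only the route via 1:2:3:4::1 is reported as removed, so the routes via 1:2:3:4::2 and 1:2:3:4::3 stay in the cache although the kernel dropped them.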
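
The difference between the two behaviours can be sketched with a toy model in plain shell. This is not NetworkManager code: the cache is modelled as a newline-separated list of `DEST via GATEWAY` strings, and `replace_first` and `replace_all` are hypothetical names made up for illustration. The model ignores metric, table and all other route attributes.

```shell
# Toy model, not NetworkManager code. Function names are hypothetical.
# Arguments: DEST NEW_ROUTE CACHE (CACHE is one "DEST via GATEWAY" per line).

# Handling described above: the replace drops only the first cached route to DEST.
replace_first() (
    dest=$1 new=$2 cache=$3 dropped=0
    printf '%s\n' "$cache" | while IFS= read -r route; do
        [ -n "$route" ] || continue
        if [ "$dropped" -eq 0 ] && [ "${route%% *}" = "$dest" ]; then
            dropped=1
            continue
        fi
        printf '%s\n' "$route"
    done
    printf '%s\n' "$new"
)

# What the kernel did in the reproducer: every next hop of DEST is dropped.
replace_all() (
    dest=$1 new=$2 cache=$3
    printf '%s\n' "$cache" | while IFS= read -r route; do
        [ -n "$route" ] || continue
        [ "${route%% *}" = "$dest" ] || printf '%s\n' "$route"
    done
    printf '%s\n' "$new"
)

cache='5:1::1/128 via 1:2:3:4::1
5:1::1/128 via 1:2:3:4::2
5:1::1/128 via 1:2:3:4::3'

echo "first-only handling:"
replace_first 5:1::1/128 '5:1::1/128 via 1:2:3:4::5' "$cache"
echo "whole-route handling:"
replace_all 5:1::1/128 '5:1::1/128 via 1:2:3:4::5' "$cache"
```

With the three cached next hops from the reproducer, `replace_first` leaves the routes via `1:2:3:4::2` and `1:2:3:4::3` behind, matching the monitor output, while `replace_all` leaves only the new route via `1:2:3:4::5`.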

Comment 1 Thomas Haller 2022-03-04 00:28:30 UTC
Also, we need to take care that on-link routes (without a gateway) are treated specially:

# ip -netns x route append 5:1::1/128 nexthop dev v
Error: Device only routes can not be added for IPv6 using the multipath API.


They also don't get merged! So if the first route in the list of routes to be deleted is such a route, we must delete only that route (not all ECMP routes).
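
One possible reading of that rule, as a toy shell model. This is only a sketch, not NetworkManager code and not a statement about what the kernel does in every case: `replace_aware` is a hypothetical name, routes are plain strings (`DEST via GATEWAY` for gateway routes, `DEST dev IFACE` for on-link routes), and metric, source address and all other attributes are ignored.

```shell
# Toy model, hypothetical helper. Arguments: DEST NEW_ROUTE CACHE.
# The first cached route to DEST decides what the replace removes:
#   - on-link route first:  remove only that one route
#   - gateway route first:  remove it and every other gateway route to DEST
#                           (they form one merged multipath route)
replace_aware() (
    dest=$1 new=$2 cache=$3 mode=
    printf '%s\n' "$cache" | while IFS= read -r route; do
        [ -n "$route" ] || continue
        if [ "${route%% *}" = "$dest" ]; then
            case $route in
                *' via '*) kind=multipath ;;  # has a gateway, can be merged
                *)         kind=onlink ;;     # device-only, never merged
            esac
            # First match: remember its kind and drop it.
            [ -n "$mode" ] || { mode=$kind; continue; }
            # Later matches are dropped only as part of the same multipath route.
            [ "$mode" = multipath ] && [ "$kind" = multipath ] && continue
        fi
        printf '%s\n' "$route"
    done
    printf '%s\n' "$new"
)

cache='5:1::1/128 dev v
5:1::1/128 via 1:2:3:4::1
5:1::1/128 via 1:2:3:4::2'

replace_aware 5:1::1/128 '5:1::1/128 via 1:2:3:4::5' "$cache"
```

In this example the on-link route comes first, so only it is dropped and both gateway routes stay in the model's cache next to the new route.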

Comment 2 Thomas Haller 2022-03-04 00:40:09 UTC
Try also:

```
ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v

ip netns exec x ip monitor route &

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::3 dev v
ip -netns x route append 5:1::1/128 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v
```

See also:

```
ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v nodad

ip netns exec x ip monitor route &

sleep 3

ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::3 dev v

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::11 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::12 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::13 dev v

ip -netns x route append 5:1::1/128 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v
```


*SIGH*

Comment 3 Till Maas 2022-03-04 07:02:26 UTC
Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1837254 is actually not fully fixed?

Comment 4 Thomas Haller 2022-03-04 07:11:57 UTC
(In reply to Till Maas from comment #3)
> Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1837254 is
> actually not fully fixed?

that depends on your definition of 1837254. But no, it's "for the most part" fixed.

Comment 5 Thomas Haller 2022-03-04 07:21:33 UTC
This issue mostly concerns `ip route change` and `ip route replace` (or rather, what they correspond to on netlink).

NetworkManager doesn't use those (it uses the equivalent of `ip route append`), so by using NetworkManager alone on an interface, you cannot confuse the cache this way.
(There are probably other, less understood ways to confuse the cache, if you try hard enough.)

Also, bug 1837254 has existed forever, and it is now significantly harder to introduce a cache inconsistency. For that reason, as far as bug 1837254 is concerned, it is fixed.

Comment 6 David Jaša 2022-03-30 10:06:36 UTC
(In reply to Thomas Haller from comment #5)
> NetworkManager doesn't use that (it uses the equivalent of `ip route
> append`), so by using NetworkManager alone on an interface, you cannot
> confuse the cache this  way.

Just to confirm: should the testing be based on observing whether NM reports routes correctly after an externally run 'ip route change'?
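
Such a check could be phrased as a comparison of two route listings. The following is a minimal sketch under assumptions: `stale_routes` is a hypothetical helper, both listings are assumed to be already normalized to the same one-route-per-line text format, and how the listings are obtained (kernel side and NM side) is deliberately left open.

```shell
# Hypothetical helper, not part of NetworkManager or iproute2.
# stale_routes KERNEL_ROUTES REPORTED_ROUTES
# Prints every reported route that is missing from the kernel listing.
stale_routes() (
    kernel=$1
    printf '%s\n' "$2" | while IFS= read -r route; do
        [ -n "$route" ] || continue
        # -F fixed string, -x whole-line match, -q no output
        printf '%s\n' "$kernel" | grep -Fxq -- "$route" || printf '%s\n' "$route"
    done
)

# Sample data modelled on the reproducer in the description.
kernel='5:1::1/128 via 1:2:3:4::5'
reported='5:1::1/128 via 1:2:3:4::2
5:1::1/128 via 1:2:3:4::3
5:1::1/128 via 1:2:3:4::5'

stale_routes "$kernel" "$reported"
```

An empty result would mean that every reported route still exists in the kernel; with the sample data the two leaked next hops are printed.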

Comment 8 Ana Cabral 2022-08-10 13:30:03 UTC
This was agreed to be in 8.8, in order to prioritize team commitments.

Comment 9 Ana Cabral 2022-08-10 14:44:23 UTC
We updated Jira but forgot to update here.

Comment 18 Thomas Haller 2023-01-19 11:47:57 UTC
This is mostly fixed now, with commit https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/adad1d435814e6e195c97836073bce62bac40a47

There are remaining issues, but I think they cannot be fixed without kernel fixes. I reported bug 2162315 and bug 2161994.
There are also the older bug reports bug 1337855 and bug 1337860, which were never fixed, but fixing them would, I think, also help.


As far as this issue is concerned, it has been moved forward as far as it can be.

Comment 20 Thomas Haller 2023-02-01 21:56:33 UTC
The upstream unit test now passes: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/04fb0429652d7597ea53c1e054c7232f0b17c44a

In addition to the bugs reported in comment 18, I also reported bug 2165720.

Comment 21 Thomas Haller 2023-02-09 12:37:09 UTC
All relevant code changes are already in, as of NetworkManager-1.41.90-1.el9.

Later, the unit tests were improved further (which doesn't affect the NetworkManager RPM).

Comment 25 errata-xmlrpc 2023-05-09 08:17:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2485