Bug 2060684 - platform cache inconsistency with `ip route change` for IPv6 multipath routes
Summary: platform cache inconsistency with `ip route change` for IPv6 multipath routes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: NetworkManager
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Thomas Haller
QA Contact: Matej Berezny
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-04 00:26 UTC by Thomas Haller
Modified: 2023-05-09 10:22 UTC (History)

Fixed In Version: NetworkManager-1.41.90-1.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-09 08:17:27 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker NMT-235 0 None None None 2023-02-01 07:30:46 UTC
Red Hat Issue Tracker RHELPLAN-114437 0 None None None 2022-03-04 00:34:16 UTC
Red Hat Product Errata RHBA-2023:2485 0 None None None 2023-05-09 08:17:51 UTC
freedesktop.org Gitlab NetworkManager NetworkManager-ci merge_requests 1294 0 None merged [mr/1494] test routes in NetworkManager cache 2023-02-13 14:23:41 UTC
freedesktop.org Gitlab NetworkManager NetworkManager merge_requests 1210 0 None opened Draft: platform cache inconsistency with `ip route change` for IPv6 multipath routes 2022-05-05 10:01:56 UTC

Description Thomas Haller 2022-03-04 00:26:03 UTC
Try this:




>>>

ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v

ip netns exec x ip monitor route &

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::3 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v

<<<


First we add 3 routes (which the kernel merges into one multipath route).

Then, `ip route change` will drop them all. The RTM_NEWROUTE message has the `NLM_F_REPLACE` flag set, but the kernel actually replaced the entire multipath route.

NetworkManager now splits such ECMP routes into multiple single-hop routes. Our current handling of `NLM_F_REPLACE` will only remove the first route (in an ordered list). It thus will leak the remaining ones:

`./src/core/platform/tests/monitor -p` will show:

```
 <debug> [1646353215.3356] platform: (v) signal: route   6   added: type unicast 1:2:3:4::/64 dev 12 metric 256 mss 0 rt-src rt-kernel
 <debug> [1646353215.3357] platform: (v) signal: address 6   added: 1:2:3:4::100/64 lft forever pref forever lifetime 7-0[4294967295,4294967295] dev 12 flags tentative,permanent src kernel
 <debug> [1646353215.3602] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::1 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3681] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::2 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3749] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::3 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3866] platform: (v) signal: route   6   added: type unicast 5:1::1/128 via 1:2:3:4::5 dev 12 metric 1024 mss 0 rt-src rt-boot
 <debug> [1646353215.3867] platform: (v) signal: route   6 removed: type unicast 5:1::1/128 via 1:2:3:4::1 dev 12 metric 1024 mss 0 rt-src rt-boot
```



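This is wrong: only the route via 1:2:3:4::1 is removed from the cache, while the routes via 1:2:3:4::2 and 1:2:3:4::3 stay, although the kernel dropped them too.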
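To make the leak easier to spot, the route signals from a monitor log like the one above can be replayed to see which gateways the platform cache still believes in. This is only a minimal sketch: the helper name `cached_gateways` is hypothetical (it is not part of NetworkManager), and it assumes the log line layout shown above.

```shell
# Hypothetical helper: replay "route ... added/removed" signals from a
# platform monitor log (stdin) and print the gateways that the cache
# still holds for the prefix given as $1.
cached_gateways() {
    awk -v prefix="$1" '
        $0 ~ /signal: route/ {
            action = ""; dst = ""; gw = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "added:" || $i == "removed:") action = $i
                if ($i == "unicast") dst = $(i + 1)
                if ($i == "via") gw = $(i + 1)
            }
            # Only routes for the requested prefix that have a gateway.
            if (dst != prefix || gw == "") next
            if (action == "added:") held[gw] = 1
            else if (action == "removed:") delete held[gw]
        }
        END { for (gw in held) print gw }
    ' | sort
}
```

Feeding the log into `cached_gateways 5:1::1/128` and comparing the result with the kernel's view from `ip -netns x -6 route show` should show the gateways that only exist in the cache.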

Comment 1 Thomas Haller 2022-03-04 00:28:30 UTC
also, we need to take care that on-link hosts (without a gateway) are treated specially

# ip -netns x route append 5:1::1/128 nexthop dev v
Error: Device only routes can not be added for IPv6 using the multipath API.


they also don't get merged! So if the first route in the list of routes to be deleted is such a route, then we need to only delete that route (not all ECMP routes).
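The rule above could be modelled roughly like this. It is a simplified sketch, not NetworkManager's actual code: the function name `replaced_routes` and the input format (one route per line in cache order, either `via <gateway>` or `dev-only`) are made up for illustration, and it assumes that all gateway routes in the list are siblings of the same multipath route.

```shell
# Hypothetical model: read the ordered list of candidate routes from
# stdin and print the ones that a replace should remove from the cache.
replaced_routes() {
    awk '
        # The first route in the list is always the one being replaced.
        NR == 1 { first_has_gw = ($1 == "via"); print; next }
        # If it was part of an ECMP route, all gateway siblings go too.
        # A device-only first route is never merged, so nothing else goes.
        first_has_gw && $1 == "via" { print }
    '
}
```

With a gateway route first, every `via` line is printed and `dev-only` lines are kept. With a `dev-only` route first, only that single line is printed.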

Comment 2 Thomas Haller 2022-03-04 00:40:09 UTC
try also:



```
ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v

ip netns exec x ip monitor route &

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::3 dev v
ip -netns x route append 5:1::1/128 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v
```





See also:

```


ip netns del x
ip netns add x
ip -netns x link add v type veth peer w
ip -netns x link set v up
ip -netns x link set w up
ip -netns x addr add 1:2:3:4::100/64 dev v nodad

ip netns exec x ip monitor route &

sleep 3

ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::1 dev v
ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::2 dev v
ip -netns x route append 5:1::1/128 src 1:2:3:4::100 nexthop via 1:2:3:4::3 dev v

ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::11 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::12 dev v
ip -netns x route append 5:1::1/128 nexthop via 1:2:3:4::13 dev v

ip -netns x route append 5:1::1/128 dev v

ip -netns x route change 5:1::1/128 nexthop via 1:2:3:4::5 dev v
```


*SIGH*

Comment 3 Till Maas 2022-03-04 07:02:26 UTC
Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1837254 is actually not fully fixed?

Comment 4 Thomas Haller 2022-03-04 07:11:57 UTC
(In reply to Till Maas from comment #3)
> Does this mean that https://bugzilla.redhat.com/show_bug.cgi?id=1837254 is
> actually not fully fixed?

that depends on your definition of 1837254. But no, it's "for the most part" fixed.

Comment 5 Thomas Haller 2022-03-04 07:21:33 UTC
this issue is mostly about when using `ip route change` or `ip route replace` (or better: what that corresponds to on netlink).

NetworkManager doesn't use that (it uses the equivalent of `ip route append`), so by using NetworkManager alone on an interface, you cannot confuse the cache this way.
(there are probably other, less understood ways to confuse the cache, if you try hard enough).
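For reference, this is how the `ip route` subcommands map to netlink request flags, as far as I can tell from the iproute2 sources. The helper name `nlm_flags_for` is hypothetical and only there to write the mapping down. The table is a sketch to show why `change` and `replace` behave differently from `append`.

```shell
# Hypothetical helper: print the netlink flags that iproute2 is believed
# to set on RTM_NEWROUTE for a given "ip route" subcommand.
nlm_flags_for() {
    case "$1" in
        add)     echo "NLM_F_CREATE|NLM_F_EXCL" ;;
        change)  echo "NLM_F_REPLACE" ;;
        replace) echo "NLM_F_CREATE|NLM_F_REPLACE" ;;
        prepend) echo "NLM_F_CREATE" ;;
        append)  echo "NLM_F_CREATE|NLM_F_APPEND" ;;
        *)       echo "unknown subcommand: $1" >&2; return 1 ;;
    esac
}
```

Only `change` and `replace` carry `NLM_F_REPLACE`, which is the flag whose cache handling this bug is about.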

Also, bug 1837254 has existed forever, and it is now significantly harder to introduce a cache inconsistency. For that reason, as far as 1837254 is concerned, the bug is fixed.

Comment 6 David Jaša 2022-03-30 10:06:36 UTC
(In reply to Thomas Haller from comment #5)
> NetworkManager doesn't use that (it uses the equivalent of `ip route
> append`), so by using NetworkManager alone on an interface, you cannot
> confuse the cache this way.

Just to confirm: the testing should be based on observing if NM reports routes correctly after externally run 'ip route change'?

Comment 8 Ana Cabral 2022-08-10 13:30:03 UTC
This was agreed to be in 8.8, in order to prioritize team commitments.

Comment 9 Ana Cabral 2022-08-10 14:44:23 UTC
We updated Jira but forgot to update here.

Comment 18 Thomas Haller 2023-01-19 11:47:57 UTC
this is mostly fixed now, with commit https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/adad1d435814e6e195c97836073bce62bac40a47

there are remaining issues, but I think they cannot be fixed without kernel fixes. I reported bug 2162315 and bug 2161994. 
There are also older bug reports bug 1337855 and bug 1337860, which were never fixed, but which I think would also help.


As far as this issue is concerned, it has been moved forward as far as it can be.

Comment 20 Thomas Haller 2023-02-01 21:56:33 UTC
upstream unit test now passes: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/04fb0429652d7597ea53c1e054c7232f0b17c44a

in addition to the reported bugs from comment 18, I also reported bug 2165720

Comment 21 Thomas Haller 2023-02-09 12:37:09 UTC
all relevant code changes are already in, since NetworkManager-1.41.90-1.el9.

Later, unit tests were improved further (which doesn't affect the NetworkManager RPM).

Comment 25 errata-xmlrpc 2023-05-09 08:17:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (NetworkManager bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2485

