906505 – Provide IPv6 router discovery API and handle IPv6 flags in link notifications properly

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Jiri Pirko
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	891245
Depends On:
Blocks:	880347

Reported:	2013-01-31 18:16 UTC by Pavel Šimerda (pavlix)
Modified:	2015-05-05 01:23 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-03-07 15:57:58 UTC
Target Upstream Version:
Embargoed:

Description Pavel Šimerda (pavlix) 2013-01-31 18:16:45 UTC

From NetworkManager mailing list by Stuart D Gathman:

I have a single default router sending RAs, and another router which
does *not* advertise a default route, but instead advertises two
specific routes.  I'm not sure whether NM or the kernel is to blame, but
while radvdump shows both RAs arriving, the only route installed is the
default route.  The specific routes are ignored.

#
# radvd configuration generated by radvdump 1.8.5
# based on Router Advertisement from fe80::20c:f1ff:fecb:6de9
# received by interface em1
#

interface em1
{
     AdvSendAdvert on;
     # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
     AdvManagedFlag off;
     AdvOtherConfigFlag off;
     AdvReachableTime 0;
     AdvRetransTimer 0;
     AdvCurHopLimit 64;
     AdvDefaultLifetime 0;
     AdvHomeAgentFlag off;
     AdvDefaultPreference medium;
     AdvSourceLLAddress on;

     route 2001:4830:1659:8888::/64
     {
         AdvRoutePreference high;
         AdvRouteLifetime 90;
     }; # End of route definition


     route 2001:470:8:809::/64
     {
         AdvRoutePreference high;
         AdvRouteLifetime 90;
     }; # End of route definition

}; # End of interface definition
#
# radvd configuration generated by radvdump 1.8.5
# based on Router Advertisement from fe80::862b:2bff:fe5a:191f
# received by interface em1
#

interface em1
{
     AdvSendAdvert on;
     # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
     AdvManagedFlag on;
     AdvOtherConfigFlag on;
     AdvReachableTime 0;
     AdvRetransTimer 0;
     AdvCurHopLimit 64;
     AdvDefaultLifetime 1800;
     AdvHomeAgentFlag off;
     AdvDefaultPreference medium;
     AdvSourceLLAddress on;

     prefix 2001:470:8:488::/64
     {
         AdvValidLifetime 2592000;
         AdvPreferredLifetime 604800;
         AdvOnLink on;
         AdvAutonomous on;
         AdvRouterAddr off;
     }; # End of prefix definition

}; # End of interface definition


# ip -6 route show

2001:470:8:488::1 via 2001:470:8:488::1 dev em1  metric 0
     cache
2001:470:8:488::50 via 2001:470:8:488::50 dev em1  metric 0
     cache
2001:470:8:488::/64 dev em1  proto kernel  metric 256
2001:470:8:809::1 via fe80::862b:2bff:fe5a:191f dev em1  metric 0
     cache
2607:f8b0:4004:801::1001 via fe80::862b:2bff:fe5a:191f dev em1 metric 0
     cache
2607:f8b0:4004:801::1008 via fe80::862b:2bff:fe5a:191f dev em1 metric 0
     cache
fe80::/64 dev em1  proto kernel  metric 256
default via fe80::862b:2bff:fe5a:191f dev em1  proto static  metric 1
default via fe80::862b:2bff:fe5a:191f dev em1  proto ra  metric 1024  expires 1794sec

Comment 2 Stuart D Gathman 2013-02-01 18:29:34 UTC

I discovered you have to set net.ipv6.conf.eth0.accept_ra_rt_info_max_plen=64 (or even higher number) for specific routes to be accepted.  The default is 0.
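
A minimal sketch of applying that setting from a script, assuming a Linux system that exposes the sysctl under /proc/sys. The helper name `set_rt_info_max_plen` and its `root` parameter are hypothetical; `root` exists only so the function can be tried against a scratch directory.

```python
from pathlib import Path


def set_rt_info_max_plen(iface: str, plen: int, root: str = "/proc/sys") -> Path:
    """Write accept_ra_rt_info_max_plen for one interface and return the file path.

    On a real system this needs root privileges. `root` is an illustrative
    parameter, not part of any kernel or NetworkManager API.
    """
    if not 0 <= plen <= 128:
        raise ValueError("prefix length must be between 0 and 128")
    path = Path(root, "net", "ipv6", "conf", iface, "accept_ra_rt_info_max_plen")
    path.write_text(f"{plen}\n")
    return path
```

For the report above this would be called as `set_rt_info_max_plen("em1", 64)`; the new limit only affects Router Advertisements processed after the write.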

Comment 3 Pavel Šimerda (pavlix) 2013-02-03 14:57:29 UTC

(In reply to comment #2)
> I discovered you have to set
> net.ipv6.conf.eth0.accept_ra_rt_info_max_plen=64 (or even higher number) for
> specific routes to be accepted.  The default is 0.

Thanks for the information. There is still the problem with the default route that should have been added, right?

Comment 4 Stuart D Gathman 2013-02-04 00:32:24 UTC

No, the default route is added correctly.  This is not a bug.  I asked around in IP6 group why the default max_plen is 0, and the answer given was because specific routes make it easy to surreptitiously grab packets.  I guess advertising another default route would be more visible, since it would duplicate the legit default route(s), but a specific route (like 2000::/4) would be more invisible. 

Whether or not you agree with that, it is certainly a good idea to increase max_plen *only* on trusted interfaces (where you expect some specific routes).  So the default max_plen is probably good.  (Not that I'm an expert.)

Comment 5 Pavel Šimerda (pavlix) 2013-02-04 11:50:30 UTC

(In reply to comment #4)
> No, the default route is added correctly.

Why?

> This is not a bug.

Why?

I'm sorry but statements without explanations or sources are not really helpful.

> I asked
> around in IP6 group why the default max_plen is 0, and the answer given was
> because specific routes make it easy to surreptitiously grab packets.

While default route doesn't? I have *no* problems with kernel defaults as we can change them.

But unfortunately, the kernel model of accepting/declining acquired route information is IMO flawed. I must be able to change the settings at any time with immediate effect. I can only guess that the current implementation doesn't remember routes that weren't put into the routing table because of the original setting, and therefore cannot put them to the routing table when I *change* the setting.

I must admit that the IPv6 autoconf specifications don't really make it easy.

> I guess advertising another default route would be more visible, since it
> would duplicate the legit default route(s), but a specific route (like
> 2000::/4) would be more invisible.

It depends solely on the UI.

> Whether or not you agree with that, it is certainly a good idea to increase
> max_plen *only* on trusted interfaces (where you expect some specific
> routes).

I don't feel so much certainty, exactly because this looks more like nitpicking than security. The only thing I'm certain of is that we should talk about it in the NetworkManager team.

> So the default max_plen is probably good.  (Not that I'm an expert.)

As I described earlier, we will definitely need to fix kernel autoconfiguration to do the proper work (for the network management daemon), or we will need to *abandon* kernel autoconfiguration implementation, disable it, and run a userspace autoconf daemon (or one integrated in NM, if it proves simple enough).

I'm still inclined towards using the kernel implementation, that's why I'm working with Jiří Pírko on that. But I can't prevent the kernel guys from choosing the opposite (i.e. from asking us to use userspace implementation for any nontrivial IPv6 configuration instead of providing all necessary features).

Comment 6 Stuart D Gathman 2013-02-04 19:54:27 UTC

It's more of a distro default.  It is easy enough to add a different max_plen to /etc/sysctl.conf.  I plan to just add changing max_plen in sysctl as part of my customization - and do it only for the LAN interface.  

You can certainly change max_plen on the fly - and the routes get added on the very next RA (which is within 10 seconds on our LAN).  If NM wants to change max_plen for interfaces it is managing (and I think it should - to avoid the end user confusion that I experienced on my client only desktop), that is very easy to do (and maybe put it back when stopping NM).

So, if the component is NetworkManager, I would agree this is a bug (autoconf did not work correctly).  But the kernel did exactly what it was supposed to given the sysctl settings.  You can't expect an end user to know to fix /etc/sysctl.conf just so IP6 autoconf will work.  So NM should set max_plen for them.
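
The "set it, and maybe put it back when stopping" idea can be sketched as a save-and-restore wrapper. This is only an illustration under the assumption that the sysctl is a plain readable and writable file; `temporary_sysctl` is a hypothetical helper, not something NetworkManager provides.

```python
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_sysctl(path, value):
    """Set a sysctl file to `value` and restore the previous content on exit."""
    target = Path(path)
    previous = target.read_text()          # remember the current setting
    target.write_text(f"{value}\n")
    try:
        yield previous.strip()             # hand the old value to the caller
    finally:
        target.write_text(previous)        # put it back, even after an error
```

A manager process could hold `temporary_sysctl("/proc/sys/net/ipv6/conf/em1/accept_ra_rt_info_max_plen", 64)` open for as long as it manages the interface.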

Comment 7 Pavel Šimerda (pavlix) 2013-02-04 22:00:03 UTC


(In reply to comment #5)
> (In reply to comment #4)
> > No, the default route is added correctly.
> 
> Why?
> 
> > This is not a bug.
> 
> Why?

Stuart: I still have no answer to that. From my point of view, what kernel does is clearly against the RFC.

(In reply to comment #6)
> You can certainly change max_plen on the fly - and the routes get added on
> the very next RA

And I would still consider that wrong.

Unfortunately, the administrators' point of view (as well as NetworkManager's one) is not 100% compatible with the configuration model described in IPv6 autoconf standards. That means different people will have different opinions according to their source of information.

But that gets us back to the easiest (afaik currently unavailable) workaround: the ability to ask the kernel to perform router discovery immediately without disturbing anything.

Jiří: Any comments? Should I create a separate bug report for explicit request to trigger router discovery?

> (which is within 10 seconds on our LAN).

And which is 10 minutes by default on some router advertisement software and may be much longer on some networks.

> So, if the component is NetworkManager, I would agree this is a bug
> (autoconf did not work correctly).

We are certainly going to revisit this in NetworkManager after we get through the kernel problems.

Comment 8 Stuart D Gathman 2013-02-05 02:22:54 UTC

Router solicitation is done whenever the interface is brought up.  It would be useful for there to also be a way to trigger this without bringing the interface down.  As a workaround, NM can set max_plen before bringing the interface up.

In fact, as a power user, I can simply do "ifconfig iface down", and NM brings it up again within a few seconds.  This triggers router solicitations.

(This is a strange situation, where I just asked a question on the NM mailing list, and it gets a bug, and I'm arguing that it isn't really a kernel bug.)

Comment 9 Stuart D Gathman 2013-02-05 02:28:23 UTC

(In reply to comment #7)
> > > This is not a bug.
> > 
> > Why?
> 
> Stuart: I still have no answer to that. From my point of view, what kernel
> does is clearly against the RFC.

So set max_plen=128 as a distro default on all interfaces, and we are RFC compliant.  Hmm, a specific route for a single IP6?  Maybe that's what the kernel guys were thinking wasn't a good idea.

> > So, if the component is NetworkManager, I would agree this is a bug
> > (autoconf did not work correctly).
> 
> We are certainly going to revisit this in NetworkManager after we get
> through the kernel problems.

Comment 10 Pavel Šimerda (pavlix) 2013-02-05 15:37:22 UTC

(In reply to comment #8)
> It would be useful for there to also be a way to trigger this without
> bringing the interface down.

Exactly.

> As a workaround, NM can set max_plen before bringing the interface up.

That is not possible, as NetworkManager should never bring an interface down just to be able to bring it up again.

> (This is a strange situation, where I just asked a question on the NM
> mailing list, and it gets a bug, and I'm arguing that it isn't really a
> kernel bug.)

It is a kernel bug to install a default route when the RA says AdvDefaultLifetime=0, isn't it?

(In reply to comment #9)
> (In reply to comment #7)
> > Stuart: I still have no answer to that. From my point of view, what kernel
> > does is clearly against the RFC.
> 
> So set max_plen=128 as a distro default on all interfaces, and we are RFC
> compliant.  Hmm, a specific route for a single IP6?  Maybe that's what the
> kernel guys were thinking wasn't a good idea.

Sorry for the confusion, I referred to the default route, not the specific routes.

How to deal with autoconf security issues is another topic that will most probably have to be handled in NM configuration. I wouldn't care so much about kernel defaults if we can change them from userspace and then at any time, without setting down the interface, trigger new router discovery.

I must say that your feedback pushed me more towards the idea that we really need the API to trigger router discovery on a specific running interface. Maybe it's not a perfect solution, but it at least solves most of the problems in one turn.

See also bug #905880.

Comment 11 Stuart D Gathman 2013-02-06 03:55:16 UTC

(In reply to comment #10)

> It is a kernel bug to install a default route when the RA says
> AdvDefaultLifetime=0, isn't it?

In this scenario, the kernel did *not* install the default route from
the RA with AdvDefaultLifetime=0.  There were 2 routers, and it (correctly)
installed the default route from the RA advertising it.  It merely failed to
install the *specific* routes from the RA advertising AdvDefaultLifetime=0
(and the two specific routes), due to the max_plen restriction.

> Sorry for the confusion, I referred to the default route, not the specific
> routes.
> 
> How to deal with autoconf security issues is another topic that will most
> probably have to be handled in NM configuration. I wouldn't care so much
> about kernel defaults if we can change them from userspace and then at any
> time, without setting down the interface, trigger new router discovery.
> 
> I must say that your feedback pushed me more towards the idea that we really
> need the API to trigger router discovery on a specific running interface.
> Maybe it's not a perfect solution, but it at least solves most of the
> problems in one turn.

Yes, I agree that would let NM or even user scripts deal with most policy
issues.  (Any that I can think of, in fact, at my current IP6 experience level.)

However, can't a user space program with RAW IP access just send out the necessary packet?
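
For reference, the packet in question is small. The following is a rough sketch of building and sending a Router Solicitation from userspace, assuming Linux and the message layout from RFC 4861; the function names are hypothetical, and sending requires raw-socket privileges.

```python
import socket
import struct

ND_ROUTER_SOLICIT = 133      # ICMPv6 message type (RFC 4861)
ALL_ROUTERS = "ff02::2"      # link-local all-routers multicast group


def build_router_solicitation(src_mac: bytes = b"") -> bytes:
    """Return an ICMPv6 Router Solicitation with the checksum field left at zero."""
    # Fixed header: type, code, checksum, 4 reserved bytes
    packet = struct.pack("!BBHI", ND_ROUTER_SOLICIT, 0, 0, 0)
    if src_mac:
        if len(src_mac) != 6:
            raise ValueError("expected a 6-byte Ethernet address")
        # Source Link-Layer Address option: type 1, length 1 (units of 8 bytes)
        packet += struct.pack("!BB", 1, 1) + src_mac
    return packet


def send_router_solicitation(iface: str, src_mac: bytes = b"") -> None:
    """Send one RS on `iface`; typically needs root (CAP_NET_RAW) on Linux."""
    with socket.socket(socket.AF_INET6, socket.SOCK_RAW, socket.IPPROTO_ICMPV6) as s:
        # Neighbor Discovery messages must be sent with hop limit 255
        s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_MULTICAST_HOPS, 255)
        # Assumption: the kernel fills in the ICMPv6 checksum on this socket type
        s.sendto(build_router_solicitation(src_mac),
                 (ALL_ROUTERS, 0, 0, socket.if_nametoindex(iface)))
```

Sending the packet is the easy part; retransmission, timeouts and matching the kernel's reaction to the resulting RA are left entirely to the caller.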

Comment 12 Pavel Šimerda (pavlix) 2013-02-06 14:30:58 UTC

(In reply to comment #11)
> (In reply to comment #10)
> 
> > It is a kernel bug to install a default route when the RA says
> > AdvDefaultLifetime=0, isn't it?
> 
> In this scenario, the kernel did *not* install the default route from
> the RA with AdvDefaultLifetime=0.

Aha, this is the clarification I needed. Also, thank you for the feedback on the static route limitation.

> > I must say that your feedback pushed me more towards the idea that we really
> > need the API to trigger router discovery on a specific running interface.
> > Maybe it's not a perfect solution, but it at least solves most of the
> > problems in one turn.
> 
> Yes, I agree that would let NM or even user scripts deal with most policy
> issues.  (Any that I can think of, in fact, at my current IP6 experience
> level.)
> 
> However, can't a user space program with RAW IP access just send out the
> necessary packet?

It is unnecessarily complex to handle router discovery in userspace when it is already implemented in the kernel. And there are quite a number of timing issues. See below:

1) The userspace program sends a Router Solicitation directly and schedules the next one.

1b) After some time, the program re-sends the Router Solicitation and schedules another.

1c) After several attempts, we need to cancel the whole action.

2) The program receives a router advertisement. We don't want to parse it, as the kernel will send us the information anyway, but we must stop the userspace router discovery.

3) We receive the information from the kernel, but without any indication of whether it's new or old. This may lead to a race condition.

4) At some point, we have to be able to say that we have received all the information we need and that we can proceed.

I already created an experimental implementation for this and I don't really like it. The next step is either to do more stuff in userspace and ignore the kernel implementation, or fix the kernel API.

So, what I believe we need now is...

1) To be able to activate kernel router discovery without side effects.

2) To be notified about the started router discovery (e.g. via netlink notification with IF_RS_SENT), which is how I think it already works. What I don't know is whether rtnl_link_get_kernel() returns the link object with IPv6 flag information or not.

I'm afraid it doesn't, which would mean there's a possible race condition of processing an obsolete link notification with the IF_RA_RCVD flag set.

3a) To be notified about the successful router discovery. A new notification with IF_RA_RCVD should be good enough. But it must be possible to check it against the current kernel status via rtnl_link_get_kernel() to avoid processing an obsolete notification as described above.

3b) To be notified about unsuccessful router discovery. I guess link notification without IF_RS_SENT and IF_RA_RCVD should be good enough. Actually, when discovery is unsuccessful, all flags should be IMO removed.
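
The flag handling proposed in points 2) to 3b) can be sketched as follows. This illustrates the proposal, not existing kernel or libnl behavior. The numeric values are the ones I believe the kernel defines in include/net/if_inet6.h and should be treated as an assumption; both function names are hypothetical.

```python
# Assumed flag values (Linux include/net/if_inet6.h); verify before relying on them.
IF_RS_SENT = 0x10   # a Router Solicitation has been sent
IF_RA_RCVD = 0x20   # a Router Advertisement has been received


def discovery_state(flags: int) -> str:
    """Classify router discovery progress from the IPv6 link flags."""
    if flags & IF_RA_RCVD:
        return "succeeded"
    if flags & IF_RS_SENT:
        return "in progress"
    # Per point 3b), a failed discovery would clear both flags
    return "idle or failed"


def notification_is_current(notified_flags: int, kernel_flags: int) -> bool:
    """True when a notification still matches freshly queried kernel state."""
    mask = IF_RS_SENT | IF_RA_RCVD
    return (notified_flags & mask) == (kernel_flags & mask)
```

The second helper addresses the race described above: a notification carrying IF_RA_RCVD is acted on only if a fresh query of the link reports the same flags.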

Changing the summary accordingly.

As a side note, we will probably never fully support situations with multiple routers that send different information through RA, as we don't know whether more RAs will arrive or not. This is probably the biggest drawback of IPv6 stateless autoconfiguration, which was apparently designed with embedded TCP/IP stacks in mind rather than the modern networking world with lots of different connection and security methods.

Comment 13 Stuart D Gathman 2013-02-27 20:36:57 UTC

(In reply to comment #12)
> As a side note, we will probably never fully support situations with
> multiple routers that send different information through RA as we don't know
> whether more RAs will arrive or not. This is probably the biggest drawback
> of IPv6 stateless autoconfiguration that was apparently designed with
> embedded TCP/IP stacks in mind rather than the modern networking world with lots of
> different connection and security methods.

Just adjust config with each RA arrival.  I assume you are thinking about 
the problem of whether a new RA with a different router addr *replaces*
or *supplements* existing RA(s).  

My interpretation would be that new RA(s) supplement existing ones if there 
is no route overlap.  (As is the case in my example.)  This will cover the
vast majority of practical applications (routers to specific subnets).

   --

If there *is* route overlap, then there is ambiguity.  Let's consider just
RAs with only the default route for starters.  Here is a possible
interpretation:

  If the priority is the same, then replace.
  If the priority is different, then supplement (one is a backup router).

Here is another:

  Always supplement.  TTLs should be short enough to provide for reasonable
  recovery should one of a pair of equal priority routers die (load sharing).

   --

For mixed default and specific routes, the simplest is to always supplement
(and hope TTLs are reasonable).  The next simplest is to treat each route
independently (replacing a route with the same destination and priority).
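
A small sketch of that last policy (each route handled independently, keyed by destination and priority). The `Route` type and `merge_ra` function are hypothetical. Treating a zero lifetime as a withdrawal by the same router is my assumption, added so that an RA with AdvDefaultLifetime 0 does not disturb another router's default route.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    dest: str        # e.g. "::/0" or "2001:470:8:809::/64"
    gateway: str     # link-local address of the advertising router
    preference: str  # "low", "medium" or "high"
    lifetime: int    # seconds; 0 means the router withdraws the route


def merge_ra(table: dict, advertised: list) -> dict:
    """Return a new table with the routes of one RA merged in."""
    merged = dict(table)
    for route in advertised:
        key = (route.dest, route.preference)
        if route.lifetime > 0:
            # Same destination and priority: the newer advertisement replaces
            merged[key] = route
        elif key in merged and merged[key].gateway == route.gateway:
            # Zero lifetime only removes a route this router announced itself
            del merged[key]
    return merged
```

With the two RAs from the original report, the default route from fe80::862b:2bff:fe5a:191f and both /64 routes from fe80::20c:f1ff:fecb:6de9 end up side by side in the table.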

Comment 14 Pavel Šimerda (pavlix) 2013-02-28 09:34:10 UTC

I believe we (NetworkManager developers) came to a consensus that the current way kernel handles IPv6 autoconfiguration will never properly work for us. If you want a more thorough description of the problem, see:

https://fedoraproject.org/wiki/Networking/Ideas/AutomaticConfiguration

I don't think it is worth adding all of this stuff to the kernel as a userspace implementation will be much easier to debug and modify. I'm almost sure we're going for one of the userspace solutions.

Therefore I would like to state that we no longer consider this functionality a requirement for NetworkManager's future IPv6 support.

Comment 15 Pavel Šimerda (pavlix) 2013-02-28 09:45:40 UTC

(In reply to comment #13)
> Just adjust config with each RA arrival.  I assume you are thinking about 
> the problem of whether a new RA with a different router addr *replaces*
> or *supplements* existing RA(s).

Not at all. This doesn't need more discussion in this bug report (and it can be safely closed as WONTFIX), but you deserve a few details from me:

1) The user expects his application to connect to a server.

2) The application waits for NetworkManager to broadcast that the connection is fully configured. Or at least a new connection attempt is triggered by the signal from NetworkManager.

3a) IPv4: NetworkManager waits for DHCP contract and when it is negotiated, it considers the connection done.

3b) IPv6: NetworkManager triggers router solicitation and waits for the first router advertisement to arrive. Then it considers the network connection done.

(Let's ignore the problems that are caused by doing #3a and #3b together)

The problem with #3b is that the specific route for the server may be in a second router advertisement that comes after the first one. It's perfectly OK to amend route information according to the standards.

But that means the application will receive the “connectivity established” signal *before* the necessary route (from the second RA) is actually there.

DHCP has the big advantage of being contract-based. As you cannot negotiate any routes with DHCPv6, you only have the asynchronous configuration and you cannot say at any point in time that you are finished.

The only way to do that is to issue a best-practice document describing which (still standards-compliant) behavior should actually be avoided.



Comment 16 Jiri Pirko 2013-03-07 15:57:58 UTC

This will likely be resolved in userspace. Closing as wontfix for now. Please feel free to reopen if the situation changes.

Comment 17 Jiri Pirko 2013-03-08 12:54:11 UTC

*** Bug 891245 has been marked as a duplicate of this bug. ***
