Bug 1362528 - Unable to disable IPv6 RAs acceptance on interfaces by using nic templates/os-net-config
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 10.0 (Newton)
Assignee: Dan Sneddon
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks: 1364498 1396696
Reported: 2016-08-02 12:50 UTC by Marius Cornea
Modified: 2020-12-11 12:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Due to logic in the kernel, setting net.ipv6.conf.default.autoconf and net.ipv6.conf.default.accept_ra is not by itself sufficient to disable those features if IP forwarding is turned off. Since the OpenStack overcloud nodes typically do not have IP forwarding enabled, these settings may be overridden.
Consequence: In the default configuration, systems deployed with IPv6 on OSP 9 may accept unwanted routes and IP addresses from routers on connected networks.
Fix: To avoid this issue, set the kernel sysctl settings net.ipv6.conf.all.autoconf and net.ipv6.conf.all.accept_ra to "0". This can be done by editing a file in the openstack-tripleo-heat-templates used for the deployment: in openstack-tripleo-heat-templates/puppet/hieradata/common.yaml, locate the settings for net.ipv6.conf.default.autoconf and net.ipv6.conf.default.accept_ra, and add the same settings for net.ipv6.conf.all.autoconf and net.ipv6.conf.all.accept_ra (all four should have a value of "0").
Result: When Puppet applies the new settings, the kernel stops accepting new autoconf IP addresses or routes from router advertisements (RAs). This leaves the system functioning with static IPs and routes as designed.
Clone Of:
: 1364498 1396696 (view as bug list)
Environment:
Last Closed: 2016-11-19 01:08:31 UTC
Target Upstream Version:
Embargoed:



Links
System            ID       Private  Priority  Status  Summary                                                 Last Updated
Launchpad         1609125  0        None      None    None                                                    2016-08-03 21:04:08 UTC
Launchpad         1632830  0        None      None    None                                                    2016-10-12 20:28:04 UTC
OpenStack gerrit  385603   0        None      MERGED  Disable IPv6 RAs & Autoconf For All (Not Just Default)  2021-01-14 10:52:44 UTC

Description Marius Cornea 2016-08-02 12:50:36 UTC
Description of problem:

Currently we cannot disable DHCPv6 on interfaces by using the NIC templates. As a result, we cannot control IP assignment and IPv6 routes on interfaces connected to a network that runs a DHCPv6 server.

In particular scenarios an IPv6 deployment can fail: an IPv6 default route may be provided by the DHCPv6 server and installed before the static route configured through ExternalInterfaceDefaultRoute, so the undercloud cannot reach the public VIP during postconfig.

Version-Release number of selected component (if applicable):
os-net-config-0.2.4-3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Specify the following configuration in the controller nic template:

-
  type: interface
  name: nic5
  use_dhcp: false
  use_dhcpv6: false
  addresses:
    -
      ip_netmask: {get_param: ManagementIpSubnet}

2. Check the resulting ifcfg script:
[root@overcloud-controller-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth4 
# This file is autogenerated by os-net-config
DEVICE=eth4
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
BOOTPROTO=static
IPADDR=172.16.17.161
NETMASK=255.255.255.128

3. Check the actual interface configuration:

[root@overcloud-controller-0 ~]# ip a s dev eth4
6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:10:9c:b3 brd ff:ff:ff:ff:ff:ff
    inet 172.16.17.161/25 brd 172.16.17.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet6 2001:db8:ca2:3:5054:ff:fe10:9cb3/64 scope global mngtmpaddr dynamic 
       valid_lft 3529sec preferred_lft 3529sec
    inet6 fe80::5054:ff:fe10:9cb3/64 scope link 
       valid_lft forever preferred_lft forever

[root@overcloud-controller-0 ~]# ip -6 r | grep default
default via fe80::5054:ff:fe4f:248b dev eth4  proto ra  metric 1024  expires 1705sec hoplimit 64


Actual results:
Note that the default route received on the eth4 interface was preferred over the static one configured for the external network VLAN:

[root@overcloud-controller-0 ~]# cat /etc/sysconfig/network-scripts/route6-vlan100 
default via 2001:db8:ca2:4::1 dev vlan100
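
One way to confirm which default route won is to look for the "proto ra" tag in the "ip -6 route" output. The following is a minimal Python sketch, not part of os-net-config; the names ra_default_routes and _value_after are hypothetical, and the code only parses text in the format shown above.

```python
def _value_after(fields, key):
    """Return the token following `key` in a split route line, or None."""
    if key in fields:
        i = fields.index(key)
        if i + 1 < len(fields):
            return fields[i + 1]
    return None


def ra_default_routes(ip6_route_output):
    """Return (gateway, device) pairs for default routes learned from RAs.

    Expects text in the format printed by `ip -6 route`.
    """
    found = []
    for line in ip6_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        # Routes installed from router advertisements carry "proto ra".
        if _value_after(fields, "proto") != "ra":
            continue
        found.append((_value_after(fields, "via"), _value_after(fields, "dev")))
    return found


# Sample line copied from the output in this report.
sample = (
    "default via fe80::5054:ff:fe4f:248b dev eth4  proto ra  metric 1024"
    "  expires 1705sec hoplimit 64\n"
)
print(ra_default_routes(sample))
```

A static default route such as the one in route6-vlan100 has no "proto ra" tag, so it is not reported by this check.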


Expected results:
use_dhcpv6 is honored: no IPv6 address and no IPv6 routes are received on the NIC it is configured for.

Additional info:

Comment 2 Marius Cornea 2016-08-02 13:39:23 UTC
In order to avoid this, one needs to make sure that there's no dhcpv6 server running on the networks the nodes are connected to.

Comment 3 Dan Sneddon 2016-08-02 17:03:35 UTC
(In reply to Marius Cornea from comment #0)
> Description of problem:

I'm not sure why this is happening. The default for network interfaces is to have DHCPv6 disabled:

"""
DHCPV6C=answer
    where answer is one of the following:

        yes — Use DHCP to obtain an IPv6 address for this interface.
        no — Do not use DHCP to obtain an IPv6 address for this interface. This is the default value. 
"""
From: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-networkscripts-interfaces.html

However, even if DHCPv6 is picking up an IPv6 default route, that will only be used for routing IPv6 traffic. The default route for IPv4 traffic will be unaffected. You can verify this by running "ip r | grep default".

If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on one of the IPv4 networks, this might be a problem, but probably not. Static IP routes are favored over those learned via DHCP (via metrics), so if there was a static IPv6 default route it would also not be overridden by a DHCPv6 server.

So I think that this bug is mostly cosmetic.

Comment 4 Marius Cornea 2016-08-02 20:35:45 UTC
(In reply to Dan Sneddon from comment #3)
> (In reply to Marius Cornea from comment #0)
> > Description of problem:
> 
> I'm not sure why this is happening. The default for network interfaces is to
> have DHCPv6 disabled:
> 
> """
> DHCPV6C=answer
>     where answer is one of the following:
> 
>         yes — Use DHCP to obtain an IPv6 address for this interface.
>         no — Do not use DHCP to obtain an IPv6 address for this interface.
> This is the default value. 
> """
> From:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/
> html/Deployment_Guide/s1-networkscripts-interfaces.html
> 
> However, even if DHCPv6 is picking up an IPv6 default route, that will only
> be used for routing IPv6 traffic. The default route for IPv4 traffic will be
> unaffected. You can verify this by running "ip r | grep default".
> 
> If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on
> one of the IPv4 networks, this might be a problem, but probably not. Static
> IP routes are favored over those learned via DHCP (via metrics), so if there
> was a static IPv6 default route it would also not be overridden by a DHCPv6
> server.

This is what I expected as well, but it looks like this is not the case. I'm using an IPv4 management network which also runs the DHCPv6 server, and an IPv6 external network with the default IPv6 route assigned. Even though the route6-vlan100 (external network VLAN) script is set correctly, the default route that gets installed in the routing table is the one learned via the management network:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/route6-vlan100 
default via 2001:db8:ca2:4::1 dev vlan100

[root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
default via fe80::5054:ff:fe4f:248b dev eth4  proto ra  metric 1024  expires 1301sec hoplimit 64

This leads me to believe that the static ipv6 route in route6-vlan100 wasn't actually applied.  

> So I think that this bug is mostly cosmetic.

Comment 7 Dan Sneddon 2016-08-02 21:01:37 UTC
This route is actually being learned through RAs, not through DHCPv6, and the address on eth4 is configured through SLAAC (based on the MAC address + base address received from the router via RAs).
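
The SLAAC derivation can be checked against the values in this report. The sketch below is a minimal Python illustration that assumes the modified EUI-64 scheme (flip the universal/local bit of the MAC, insert ff:fe in the middle) and a /64 prefix; the function name slaac_address is hypothetical, and the prefix 2001:db8:ca2:3::/64 is inferred from the address shown on eth4.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Build the modified EUI-64 SLAAC address for a /64 prefix and a MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("SLAAC with EUI-64 expects a /64 prefix")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


# MAC of eth4 as shown in the "ip a s dev eth4" output of this report.
print(slaac_address("2001:db8:ca2:3::/64", "52:54:00:10:9c:b3"))
print(slaac_address("fe80::/64", "52:54:00:10:9c:b3"))
```

Both results match the global and link-local addresses listed on eth4 in the description, which is consistent with the address coming from SLAAC rather than from a DHCPv6 lease.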

I believe the workaround is to add the following to /etc/sysconfig/network-scripts/ifcfg-eth4:

IPV6_AUTOCONF=no

I opened up an upstream bug on this:
https://bugs.launchpad.net/os-net-config/+bug/1609125

Comment 9 Marius Cornea 2016-08-03 07:48:51 UTC
I added IPV6_AUTOCONF=no to ifcfg-eth4 and rebooted the machine, but it came back up with the route learned via RAs:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/ifcfg-eth4
# This file is autogenerated by os-net-config
DEVICE=eth4
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
BOOTPROTO=static
IPADDR=172.16.17.160
NETMASK=255.255.255.128
IPV6_AUTOCONF=no

[root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
default via fe80::5054:ff:fe4f:248b dev eth4  proto ra  metric 1024  expires 1733sec hoplimit 64

Comment 10 Marius Cornea 2016-08-03 08:03:51 UTC
Nevertheless, it seems that the script is working as expected, as it disables accept_ra for that interface.

[root@overcloud-controller-0 ~]# sysctl -a | grep net.ipv6.conf.eth4.accept_ra
net.ipv6.conf.eth4.accept_ra = 0
net.ipv6.conf.eth4.accept_ra_defrtr = 1
net.ipv6.conf.eth4.accept_ra_pinfo = 1
net.ipv6.conf.eth4.accept_ra_rt_info_max_plen = 0
net.ipv6.conf.eth4.accept_ra_rtr_pref = 1

Comment 11 Marius Cornea 2016-08-03 09:25:17 UTC
OK, probably the initscript which sets the sysctl value runs after the RAs were received, and by that moment the address/routes were already set. I noticed that after the expiration interval both the IP address and the route get removed.

Comment 12 Bob Fournier 2016-08-03 14:45:06 UTC
I wonder if ifcfg-eth4 also needs to have the static default gateway defined by
IPV6_DEFAULTGW=<gw>

Marius - a couple questions...

Can you provide the snippet from the templates that shows the define for the static IPv6 default gateway?

You indicated that after the expiration interval the ip address and route get removed, do you ever see the static IPv6 default gateway you have configured in the routing table?

I agree that this isn't related to dhcpv6, IIRC a node gets its routes from neighbor discovery via RAs, not from DHCPv6 so the use_dhcpv6 setting will not come into play here. The summary should probably be updated.

You could also stop the RAs at the source by disabling them on the IPv6 router, but I'm not sure what control you have over the router. That would also prevent any nodes from getting an autoconfigured address, so all IPv6 config would have to be static. This seems to be the way many IPv6 rollouts are going due to security concerns.

Comment 13 Marius Cornea 2016-08-03 15:38:10 UTC
(In reply to Bob Fournier from comment #12)
> I wonder if ifcfg-eth4 also needs to have the static default gateway
> defined by
> IPV6_DEFAULTGW=<gw>
> 
> Marius - a couple questions...
> 
> Can you provide the snippet from the templates that shows the define for the
> static IPv6 default gateway?

Sure:

            -
              type: ovs_bridge
              name: {get_input: bridge_name}
              use_dhcp: false
              members:
                -
                  type: interface
                  name: nic2
                  primary: true
                -
                  type: vlan
                  vlan_id: {get_param: ExternalNetworkVlanID}
                  dns_servers: {get_param: DnsServers}
                  addresses:
                  -
                    ip_netmask: {get_param: ExternalIpSubnet}
                  routes:
                    -
                      default: true
                      next_hop: {get_param: ExternalInterfaceDefaultRoute}


> You indicated that after the expiration interval the ip address and route
> get removed, do you ever see the static IPv6 default gateway you have
> configured in the routing table?

No, after the expiration interval the routing table doesn't have any default ipv6 route installed anymore.  

> I agree that this isn't related to dhcpv6, IIRC a node gets its routes from
> neighbor discovery via RAs, not from DHCPv6 so the use_dhcpv6 setting will
> not come into play here. The summary should probably be updated.

Agree, will update the title.

> In order to prevent the IPv6 router from sending RAs you could also disable
> the sending of RAs there but I'm not sure what control you have over the
> router. That would also prevent any nodes from getting an autoconfigured
> address, so all IPv6 config would have to be static.  This seems to be the
> way many IPv6 rollouts are going due to security concerns.

This is actually a libvirt network in a virtual environment. I tried several configurations but wasn't able to disable the RAs, so I was expecting to be able to control this via the node configuration.

Another approach that I tried was to completely disable IPv6 per interface via sysctl:

parameter_defaults:
  controllerExtraConfig:
    sysctl_settings:
      net.ipv6.conf.eth4.disable_ipv6:
        value: 1

While this removed the autoconfigured address/route, the controller nodes ended up without a default route, so the deployment failed in postconfig. This happened because the static route failed to install when os-net-config first ran, due to a default route already learned via RAs, and the sysctl value was only applied in a later step of the deployment.

Comment 15 Dan Sneddon 2016-08-03 17:14:56 UTC
I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the controller nodes? This might be done with an ExtraConfig, or perhaps we can edit the overcloud-full images to include this sysctl setting on first boot.

Comment 18 Marius Cornea 2016-08-03 18:12:09 UTC
(In reply to Dan Sneddon from comment #15)
> I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the
> controller nodes? This might be done with an ExtraConfig, or perhaps we can
> edit the overcloud-full images to include this sysctl setting on first boot.

I added net.ipv6.conf.all.accept_ra = 0 in /etc/sysctl.conf inside the overcloud-full image but the autoconfigured address still showed up on eth4 after deployment. Note that there are some other ipv6 sysctl parameters set during deployment in sysctl.conf:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysctl.conf 
# HEADER: This file was autogenerated at 2016-08-03 13:42:25 -0400
# HEADER: by puppet.  While it can still be managed manually, it
# HEADER: is definitely not recommended.
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv6.conf.all.accept_ra=0
net.ipv4.ip_nonlocal_bind=1
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_ra=0
net.netfilter.nf_conntrack_max=500000
net.core.netdev_max_backlog=10000
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_time=5
net.ipv4.tcp_keepalive_probes=5
net.nf_conntrack_max=500000
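
The fix recorded in the Doc Text of this bug sets four keys to 0: autoconf and accept_ra, for both "all" and "default". The following is a minimal Python sketch for checking a sysctl.conf against that list; the names parse_sysctl_conf and missing_ra_settings are hypothetical. It only inspects the file contents, not the running kernel state, and it does not address the ordering problem discussed in this thread.

```python
REQUIRED = (
    "net.ipv6.conf.all.accept_ra",
    "net.ipv6.conf.all.autoconf",
    "net.ipv6.conf.default.accept_ra",
    "net.ipv6.conf.default.autoconf",
)


def parse_sysctl_conf(text):
    """Parse key=value lines, skipping blanks and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_ra_settings(text):
    """Return the required keys that are absent or not set to 0."""
    settings = parse_sysctl_conf(text)
    return [key for key in REQUIRED if settings.get(key) != "0"]


# IPv6-related lines copied from the sysctl.conf shown in this comment.
sample_conf = """\
# HEADER: by puppet.
net.ipv6.conf.all.accept_ra=0
net.ipv4.ip_nonlocal_bind=1
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_ra=0
"""
print(missing_ra_settings(sample_conf))
```

For the file shown here, the only key reported is net.ipv6.conf.all.autoconf.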

Comment 19 Dan Sneddon 2016-08-03 19:13:20 UTC
I found some anecdotal information on the Web which seems to indicate that this doesn't function as you would expect:

net.ipv6.conf.all.accept_ra=0

In fact, this doesn't disable accepting RAs on a particular interface (why not? RHEL bug?).

To me, this is smelling more and more like RHEL isn't handling IPv6 routing correctly. I think we may want to open a bug against RHEL and see if we can get some help from kernel/network developers.

Comment 20 Dan Sneddon 2016-08-03 20:24:43 UTC
(In reply to Dan Sneddon from comment #19)
> I found some anecdotal information on the Web which seems to indicate that
> this doesn't function as you would expect:
> 
> net.ipv6.conf.all.accept_ra=0
> 
> In fact, this doesn't disable accepting RAs on a particular interface (why
> not? RHEL bug?).
> 
> To me, this is smelling more and more like RHEL isn't handling IPv6 routing
> correctly. I think we may want to open a bug against RHEL and see if we can
> get some help from kernel/network developers.

I've been working with bfournie on this. I confirmed what I think is a potential bug in RHEL:

# cat /etc/sysctl.d/99-sysctl.conf | grep autoconf
net.ipv6.conf.default.autoconf=0

Yet, IPv6 autoconf is still enabled for the individual interfaces:

# sysctl -a | grep autoconf | grep eth4
net.ipv6.conf.eth4.autoconf = 1

So it appears that setting the autoconf default has no effect on particular interfaces.
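
To spot this mismatch across all interfaces at once, the per-interface values can be compared with the "default" entry. This is a minimal Python sketch over "sysctl -a" style text; the names ipv6_conf_values and overriding_interfaces are hypothetical. A likely explanation for the mismatch, not confirmed in this thread, is that the "default" values are only copied to interfaces created after the value is set.

```python
def ipv6_conf_values(sysctl_output, setting):
    """Map interface name -> value for net.ipv6.conf.<iface>.<setting>."""
    prefix = "net.ipv6.conf."
    suffix = "." + setting
    values = {}
    for line in sysctl_output.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith(prefix) and key.endswith(suffix):
            iface = key[len(prefix):-len(suffix)]
            values[iface] = value.strip()
    return values


def overriding_interfaces(sysctl_output, setting):
    """List interfaces whose value differs from the 'default' entry."""
    values = ipv6_conf_values(sysctl_output, setting)
    default = values.get("default")
    return sorted(
        iface
        for iface, value in values.items()
        if iface not in ("all", "default") and value != default
    )


# The two values reported above, combined into one sample for illustration.
sample_sysctl = (
    "net.ipv6.conf.default.autoconf = 0\n"
    "net.ipv6.conf.eth4.autoconf = 1\n"
)
print(overriding_interfaces(sample_sysctl, "autoconf"))
```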

We also discovered that this setting will prevent the default route from being learned through RAs:

net.ipv6.conf.eth4.accept_ra_defrtr = 0

So now we are exploring several possible fixes:

1) See whether net.ipv6.conf.default.accept_ra_defrtr will be properly applied to individual interfaces. If so, we can add this to the default sysctl settings.

2) Research whether there are ifcfg settings that influence this setting

3) Update os-net-config to implement a workaround. It's possible to make os-net-config modify sysctl settings, but I'd rather not if we can find another solution.

Comment 21 Dan Sneddon 2016-08-03 20:47:05 UTC
The following test was successful:

1) Added this line to ifcfg-eth4:

IPV6_AUTOCONF=no

2) Restarted eth4

Result: no default route

3) Restarted vlan100

Result: default via 2001:db8:ca2:4::1 dev vlan100  metric 1024

So I believe the solution is that os-net-config should add IPV6_AUTOCONF=no when use_dhcpv6 is set to no or false.
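
The proposed behavior can be sketched as follows. This is a hypothetical Python illustration of the rule only, not os-net-config's actual code; the function name render_ifcfg and its parameters are assumptions, and the fixed keys simply mirror the ifcfg-eth4 file shown earlier in this bug.

```python
def render_ifcfg(device, ipaddr, netmask, use_dhcpv6=False):
    """Render a minimal ifcfg body (illustrative, not os-net-config code)."""
    lines = [
        "DEVICE=%s" % device,
        "ONBOOT=yes",
        "HOTPLUG=no",
        "NM_CONTROLLED=no",
        "PEERDNS=no",
        "BOOTPROTO=static",
        "IPADDR=%s" % ipaddr,
        "NETMASK=%s" % netmask,
    ]
    if use_dhcpv6:
        # DHCPV6C=yes is the documented switch for DHCPv6 on an interface.
        lines.append("DHCPV6C=yes")
    else:
        # Proposed rule: opt out of SLAAC when DHCPv6 is not requested.
        lines.append("IPV6_AUTOCONF=no")
    return "\n".join(lines) + "\n"


print(render_ifcfg("eth4", "172.16.17.161", "255.255.255.128"))
```

Tying IPV6_AUTOCONF to use_dhcpv6 keeps the NIC template unchanged, at the cost of coupling two mechanisms (DHCPv6 and SLAAC) that are otherwise independent.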

Comment 22 Dan Sneddon 2016-08-03 20:57:48 UTC
I think a quick and ugly workaround exists:

If the os-net-config NIC config templates have a static IPv6 address, then IPV6_AUTOCONF=no will be added along with the static IP.

We could add a bogus IPv6 IP to eth4, which would result in IPV6_AUTOCONF=no being added to the ifcfg-eth4 file.

If IPv6 autoconf were later desired for eth4, this line could be removed from the ifcfg file once the deployment has completed, along with setting the sysctl net.ipv6.conf.eth4.accept_ra_defrtr to 0.

-
  type: interface
  name: nic5
  use_dhcp: false
  use_dhcpv6: false
  addresses:
    -
      ip_netmask: {get_param: ManagementIpSubnet}
    -
      ip_netmask: fd00:fd00:dead:beef::1/64

The above will configure a valid IPv4 address on eth4, along with an unused IPv6 address. The presence of the IPv6 address will cause os-net-config to disable IPv6 autoconf.

Comment 23 Dan Sneddon 2016-08-03 21:39:25 UTC
Created upstream patch for os-net-config. This patch will disable IPV6_AUTOCONF if use_dhcpv6 is false.

https://review.openstack.org/#/c/350794/

The upstream bug is here:

https://bugs.launchpad.net/os-net-config/+bug/1609125

Comment 24 Marius Cornea 2016-08-04 09:43:56 UTC
(In reply to Dan Sneddon from comment #23)
> Created upstream patch for os-net-config. This patch will disable
> IPV6_AUTOCONF if use_dhcpv6 is false.
> 
> https://review.openstack.org/#/c/350794/

I applied the patch to the overcloud image and it looks like it fixes the issue during the deployment. Once the deployment is finished I get:

default via 2001:db8:ca2:4::1 dev vlan100  metric 1024 

But if I reboot a controller node, when it comes back online the autoconf address/route is there. Here is the network journal:

overcloud-controller-0.localdomain network[771]: Bringing up interface eth4:  RTNETLINK answers: File exists
overcloud-controller-0.localdomain network[771]: [  OK  ]
overcloud-controller-0.localdomain ovs-vsctl[1655]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-ex vlan100 tag=100 -- set Interface vlan100 type=internal
overcloud-controller-0.localdomain network[771]: Bringing up interface vlan100:  RTNETLINK answers: File exists
overcloud-controller-0.localdomain network[771]: [  OK  ]

I think this happens because eth4 already has the autoconf IP set when the initscript is run.

> The upstream bug is here:
> 
> https://bugs.launchpad.net/os-net-config/+bug/1609125

