Bug 1362528
Summary: | Unable to disable IPv6 RAs acceptance on interfaces by using nic templates/os-net-config | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> | |
Component: | rhosp-director | Assignee: | Dan Sneddon <dsneddon> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Omri Hochman <ohochman> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 9.0 (Mitaka) | CC: | adahms, bfournie, bschmaus, dbecker, dsneddon, gkadam, gkeegan, jason.dobies, jcoufal, jslagle, mburns, mcornea, mlammon, morazi, rhel-osp-director-maint, tvignaud, vcojot | |
Target Milestone: | ga | Keywords: | Reopened | |
Target Release: | 10.0 (Newton) | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: Due to logic in the kernel, net.ipv6.conf.default.autoconf and net.ipv6.conf.default.accept_ra are not by themselves sufficient to disable those features if IP forwarding is turned off. Since the OpenStack overcloud nodes typically do not have IP forwarding enabled, these settings may be overridden.
Consequence: In the default configuration, systems deployed with IPv6 on OSP 9 may accept unwanted routes and IP addresses from routers on connected networks.
Fix: To avoid this issue, set the kernel sysctl settings net.ipv6.conf.all.autoconf and net.ipv6.conf.all.accept_ra to "0". This may be done by editing a file in the openstack-tripleo-heat-templates used for the deployment. In the openstack-tripleo-heat-templates/puppet/hieradata/common.yaml file, locate the settings for net.ipv6.conf.default.autoconf and net.ipv6.conf.default.accept_ra, and add the same settings for net.ipv6.conf.all.autoconf and net.ipv6.conf.all.accept_ra (all four should have a value of "0").
Result: When Puppet applies the new settings, the kernel will stop accepting new autoconf IP addresses or routes from router advertisements (RAs). This will leave the system functioning with static IPs and routes as designed.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1364498 1396696 | Environment: | |
Last Closed: | 2016-11-19 01:08:31 UTC | Type: | Bug | |
Bug Blocks: | 1364498, 1396696 |
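The four sysctl values named in the Doc Text above can be sketched in shell. This is only an illustration for applying them by hand on a single node; the fix described in this bug goes through the Puppet hieradata in common.yaml instead. The function name `ra_sysctl_settings` and the drop-in file name in the usage note are hypothetical.

```shell
# Print the sysctl settings listed in the Doc Text: autoconf and accept_ra
# set to 0 for both the "all" and the "default" scope.
# This function only writes to stdout; it does not change any kernel setting.
ra_sysctl_settings() {
  for scope in all default; do
    for key in autoconf accept_ra; do
      printf 'net.ipv6.conf.%s.%s=0\n' "$scope" "$key"
    done
  done
}
```

One possible use, assuming root access on the node: `ra_sysctl_settings > /etc/sysctl.d/99-disable-ra.conf` followed by `sysctl --system`. As the thread shows, addresses and routes that were already learned from RAs stay in place until they expire.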
Description
Marius Cornea
2016-08-02 12:50:36 UTC
In order to avoid this, one needs to make sure that there's no DHCPv6 server running on the networks the nodes are connected to.

(In reply to Marius Cornea from comment #0)
> Description of problem:

I'm not sure why this is happening. The default for network interfaces is to have DHCPv6 disabled:

"""
DHCPV6C=answer
where answer is one of the following:
    yes — Use DHCP to obtain an IPv6 address for this interface.
    no — Do not use DHCP to obtain an IPv6 address for this interface. This is the default value.
"""
From: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-networkscripts-interfaces.html

However, even if DHCPv6 is picking up an IPv6 default route, that will only be used for routing IPv6 traffic. The default route for IPv4 traffic will be unaffected. You can verify this by running "ip r | grep default".

If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on one of the IPv4 networks, this might be a problem, but probably not. Static IP routes are favored over those learned via DHCP (via metrics), so if there was a static IPv6 default route it would also not be overridden by a DHCPv6 server.

So I think that this bug is mostly cosmetic.

(In reply to Dan Sneddon from comment #3)
> (In reply to Marius Cornea from comment #0)
> > Description of problem:
>
> I'm not sure why this is happening. The default for network interfaces is to
> have DHCPv6 disabled:
>
> """
> DHCPV6C=answer
> where answer is one of the following:
>
> yes — Use DHCP to obtain an IPv6 address for this interface.
> no — Do not use DHCP to obtain an IPv6 address for this interface.
> This is the default value.
> """
> From:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-networkscripts-interfaces.html
>
> However, even if DHCPv6 is picking up an IPv6 default route, that will only
> be used for routing IPv6 traffic. The default route for IPv4 traffic will be
> unaffected.
> You can verify this by running "ip r | grep default".
>
> If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on
> one of the IPv4 networks, this might be a problem, but probably not. Static
> IP routes are favored over those learned via DHCP (via metrics), so if there
> was a static IPv6 default route it would also not be overridden by a DHCPv6
> server.

This is what I expected as well, but it looks like this is not the case. I'm using an IPv4 management network, which also runs the DHCPv6 server, and an IPv6 network for the external network with the default IPv6 route assigned. Even though the route6-vlan100 (external network VLAN) script is correctly set, the default route that gets installed in the routing table is the one learned via the management network:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/route6-vlan100
default via 2001:db8:ca2:4::1 dev vlan100

[root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1301sec hoplimit 64

This leads me to believe that the static IPv6 route in route6-vlan100 wasn't actually applied.

> So I think that this bug is mostly cosmetic.

This route is actually being learned through RAs, not through DHCPv6, and the address on eth4 is configured through SLAAC (derived from the MAC address plus the base address, i.e. the prefix, received from the router via RAs).
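To illustrate the "MAC address + base address" derivation mentioned above, here is a small shell sketch of the classic modified EUI-64 scheme applied to the link-local prefix fe80::/64. It assumes that scheme is in use (a system can be configured to generate interface identifiers differently). The function name `mac_to_link_local` is hypothetical, and the MAC 52:54:00:4f:24:8b in the usage note is inferred from the router's link-local address shown in the thread; it is not stated anywhere in this bug.

```shell
# Derive the EUI-64 based IPv6 link-local address for a MAC address.
# The body runs in a subshell, so the IFS and globbing changes do not leak.
mac_to_link_local() (
  set -f          # no filename expansion while splitting
  IFS=:
  set -- $1       # split the MAC into $1..$6
  # Flip the universal/local bit (0x02) of the first octet, insert ff:fe
  # between the third and fourth octets, and print four 16-bit groups.
  printf 'fe80::%x:%x:%x:%x\n' \
    "$(( ((0x$1 ^ 0x02) << 8) | 0x$2 ))" \
    "$(( (0x$3 << 8) | 0xff ))" \
    "$(( 0xfe00 | 0x$4 ))" \
    "$(( (0x$5 << 8) | 0x$6 ))"
)
```

For example, `mac_to_link_local 52:54:00:4f:24:8b` prints fe80::5054:ff:fe4f:248b, the next hop of the RA-learned default route above. A SLAAC global address uses the same interface identifier with the advertised /64 prefix in place of fe80::.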
I believe the workaround is to add the following to /etc/sysconfig/network-scripts/ifcfg-eth4:

IPV6_AUTOCONF=no

I opened up an upstream bug on this: https://bugs.launchpad.net/os-net-config/+bug/1609125

I added IPV6_AUTOCONF=no to ifcfg-eth4 and rebooted the machine, but it came back with the route learned via RAs:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/ifcfg-eth4
# This file is autogenerated by os-net-config
DEVICE=eth4
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
BOOTPROTO=static
IPADDR=172.16.17.160
NETMASK=255.255.255.128
IPV6_AUTOCONF=no

[root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1733sec hoplimit 64

Nevertheless, it seems that the script is working as expected, as it disables accept_ra for that interface:

[root@overcloud-controller-0 ~]# sysctl -a | grep net.ipv6.conf.eth4.accept_ra
net.ipv6.conf.eth4.accept_ra = 0
net.ipv6.conf.eth4.accept_ra_defrtr = 1
net.ipv6.conf.eth4.accept_ra_pinfo = 1
net.ipv6.conf.eth4.accept_ra_rt_info_max_plen = 0
net.ipv6.conf.eth4.accept_ra_rtr_pref = 1

OK, probably the initscript which sets the sysctl value is run after the RAs were received, and at that moment the address/routes were already set. I noticed that after the expiration interval both the IP address and route get removed.

I wonder if ifcfg-eth4 also needs to have the static default gateway defined by
IPV6_DEFAULTGW=<gw>

Marius - a couple questions...

Can you provide the snippet from the templates that shows the definition of the static IPv6 default gateway?

You indicated that after the expiration interval the IP address and route get removed. Do you ever see the static IPv6 default gateway you have configured in the routing table?

I agree that this isn't related to DHCPv6. IIRC a node gets its routes from neighbor discovery via RAs, not from DHCPv6, so the use_dhcpv6 setting will not come into play here.
The summary should probably be updated.

In order to prevent the IPv6 router from sending RAs you could also disable the sending of RAs there, but I'm not sure what control you have over the router. That would also prevent any nodes from getting an autoconfigured address, so all IPv6 config would have to be static. This seems to be the way many IPv6 rollouts are going due to security concerns.

(In reply to Bob Fournier from comment #12)
> I wonder if ifcfg-eth4 also needs to have the static default gateway
> defined by
> IPV6_DEFAULTGW=<gw>
>
> Marius - a couple questions...
>
> Can you provide the snippet from the templates that shows the define for the
> static IPv6 default gateway?

Sure:

- type: ovs_bridge
  name: {get_input: bridge_name}
  use_dhcp: false
  members:
    - type: interface
      name: nic2
      primary: true
    - type: vlan
      vlan_id: {get_param: ExternalNetworkVlanID}
      dns_servers: {get_param: DnsServers}
      addresses:
        - ip_netmask: {get_param: ExternalIpSubnet}
      routes:
        - default: true
          next_hop: {get_param: ExternalInterfaceDefaultRoute}

> You indicated that after the expiration interval the ip address and route
> get removed, do you ever see the static IPv6 default gateway you have
> configured in the routing table?

No, after the expiration interval the routing table doesn't have any default IPv6 route installed anymore.

> I agree that this isn't related to dhcpv6, IIRC a node gets its routes from
> neighbor discovery via RAs, not from DHCPv6 so the use_dhcpv6 setting will
> not come into play here. The summary should probably be updated.

Agree, will update the title.

> In order to prevent the IPv6 router from sending RAs you could also disable
> the sending of RAs there but I'm not sure what control you have over the
> router. That would also prevent any nodes from getting an autoconfigured
> address, so all IPv6 config would have to be static. This seems to be the
> way many IPv6 rollouts are going due to security concerns.
This is actually a libvirt network in a virtual environment. I tried several configurations but wasn't able to disable the RAs, so I was expecting to be able to control this via the nodes' configuration.

Another approach that I tried was to completely disable IPv6 per interface via sysctl:

parameter_defaults:
  controllerExtraConfig:
    sysctl_settings:
      net.ipv6.conf.eth4.disable_ipv6:
        value: 1

While this removed the autoconfigured address/route, the controller nodes ended up without a default route, so the deployment failed in postconfig. This happened because the static route failed to get installed when os-net-config first ran, due to the existence of a default route already learned via RAs, and the sysctl value was applied in a later step of the deployment.

I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the controller nodes? This might be done with an ExtraConfig, or perhaps we can edit the overcloud-full images to include this sysctl setting on first boot.

(In reply to Dan Sneddon from comment #15)
> I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the
> controller nodes? This might be done with an ExtraConfig, or perhaps we can
> edit the overcloud-full images to include this sysctl setting on first boot.

I added net.ipv6.conf.all.accept_ra = 0 in /etc/sysctl.conf inside the overcloud-full image, but the autoconfigured address still showed up on eth4 after deployment. Note that there are some other IPv6 sysctl parameters set during deployment in sysctl.conf:

[root@overcloud-controller-0 heat-admin]# cat /etc/sysctl.conf
# HEADER: This file was autogenerated at 2016-08-03 13:42:25 -0400
# HEADER: by puppet. While it can still be managed manually, it
# HEADER: is definitely not recommended.
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv6.conf.all.accept_ra=0
net.ipv4.ip_nonlocal_bind=1
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_ra=0
net.netfilter.nf_conntrack_max=500000
net.core.netdev_max_backlog=10000
net.ipv4.tcp_keepalive_intvl=1
net.ipv4.tcp_keepalive_time=5
net.ipv4.tcp_keepalive_probes=5
net.nf_conntrack_max=500000

I found some anecdotal information on the Web which seems to indicate that this doesn't function as you would expect:

net.ipv6.conf.all.accept_ra=0

In fact, this doesn't disable accepting RAs on a particular interface (why not? RHEL bug?).

To me, this is smelling more and more like RHEL isn't handling IPv6 routing correctly. I think we may want to open a bug against RHEL and see if we can get some help from kernel/network developers.

(In reply to Dan Sneddon from comment #19)
> I found some anecdotal information on the Web which seems to indicate that
> this doesn't function as you would expect:
>
> net.ipv6.conf.all.accept_ra=0
>
> In fact, this doesn't disable accepting RAs on a particular interface (why
> not? RHEL bug?).
>
> To me, this is smelling more and more like RHEL isn't handling IPv6 routing
> correctly. I think we may want to open a bug against RHEL and see if we can
> get some help from kernel/network developers.

I've been working with bfournie on this. I confirmed what I think is a potential bug in RHEL:

# cat /etc/sysctl.d/99-sysctl.conf | grep autoconf
net.ipv6.conf.default.autoconf=0

Yet, IPv6 autoconf is still enabled for the individual interfaces:

# sysctl -a | grep autoconf | grep eth4
net.ipv6.conf.eth4.autoconf = 1

So it appears that setting the autoconf default has no effect on particular interfaces. We also discovered that this setting will prevent the default route from being learned through RAs:

net.ipv6.conf.eth4.accept_ra_defrtr = 0

So now we are exploring several possible fixes:

1) See whether net.ipv6.conf.default.accept_ra_defrtr will be properly applied to individual interfaces.
If so, we can add this to the default sysctl settings.
2) Research whether there are ifcfg settings that influence this setting.
3) Update os-net-config to implement a workaround. It's possible to make os-net-config modify sysctl settings, but I'd rather not if we can find another solution.

The following test was successful:

1) Added this line to ifcfg-eth4: IPV6_AUTOCONF=no
2) Restarted eth4
   Result: no default route
3) Restarted vlan100
   Result: default via 2001:db8:ca2:4::1 dev vlan100 metric 1024

So I believe the solution is that os-net-config should add IPV6_AUTOCONF=no when use_dhcpv6 is set to no or false.

I think a quick and ugly workaround exists: if the os-net-config NIC config templates have a static IPv6 address, then IPV6_AUTOCONF=no will be added along with the static IP. We could add a bogus IPv6 IP to eth4, which would result in IPV6_AUTOCONF=no being added to the ifcfg-eth4 file. If IPv6 autoconf were later desired for eth4, this line could be removed from the ifcfg files along with setting the sysctl settings for net.ipv6.conf.eth4.accept_ra_defrtr to 0 after the deployment was completed.

- type: interface
  name: nic5
  use_dhcp: false
  use_dhcpv6: false
  addresses:
    - ip_netmask: {get_param: ManagementIpSubnet}
    - ip_netmask: fd00:fd00:dead:beef::1/64

The above will configure a valid IPv4 address on eth4, along with an unused IPv6 address. The presence of the IPv6 address will cause os-net-config to disable IPv6 autoconf.

Created upstream patch for os-net-config. This patch will disable IPV6_AUTOCONF if use_dhcpv6 is false.

https://review.openstack.org/#/c/350794/

The upstream bug is here:

https://bugs.launchpad.net/os-net-config/+bug/1609125

(In reply to Dan Sneddon from comment #23)
> Created upstream patch for os-net-config. This patch will disable
> IPV6_AUTOCONF is use_dhcpv6 is false.
>
> https://review.openstack.org/#/c/350794/

I applied the patch to the overcloud image and it looks like it fixes the issue during the deployment.
Once the deployment is finished I get:

default via 2001:db8:ca2:4::1 dev vlan100 metric 1024

But if I reboot a controller node, when it comes back online the autoconf address/route is there. Here is the network journal:

overcloud-controller-0.localdomain network[771]: Bringing up interface eth4: RTNETLINK answers: File exists
overcloud-controller-0.localdomain network[771]: [ OK ]
overcloud-controller-0.localdomain ovs-vsctl[1655]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-ex vlan100 tag=100 -- set Interface vlan100 type=internal
overcloud-controller-0.localdomain network[771]: Bringing up interface vlan100: RTNETLINK answers: File exists
overcloud-controller-0.localdomain network[771]: [ OK ]

I think this happens because eth4 already has the autoconf IP set when the initscript is run.

> The upstream bug is here:
>
> https://bugs.launchpad.net/os-net-config/+bug/1609125
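Since the symptom reappears after a reboot, a quick check for RA-learned default routes may help when re-testing. The sketch below only filters text in the format of the `ip -6 r` output quoted in this thread, where such routes carry `proto ra`. The function names `ra_default_routes` and `has_ra_default_route` are hypothetical.

```shell
# Read `ip -6 route` output on stdin and print only the default routes
# that are marked as learned from router advertisements ("proto ra").
ra_default_routes() {
  awk '$1 == "default" && / proto ra( |$)/'
}

# Succeed (exit 0) if stdin contains at least one RA-learned default route.
has_ra_default_route() {
  [ -n "$(ra_default_routes)" ]
}
```

On a node this could be run as `ip -6 route show | has_ra_default_route && echo "RA-learned default route present"`. A static route such as `default via 2001:db8:ca2:4::1 dev vlan100 metric 1024` is not matched, because it has no `proto ra` field.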