Description of problem:
Currently we cannot disable DHCPv6 on interfaces by using the NIC templates, so we cannot control IP assignment and IPv6 routes on interfaces connected to a network that runs a DHCPv6 server. In particular, IPv6 deployments can fail when an IPv6 default route provided by the DHCPv6 server is installed before the static route configured through ExternalInterfaceDefaultRoute, so the undercloud is not able to reach the public VIP during postconfig.

Version-Release number of selected component (if applicable):
os-net-config-0.2.4-3.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Specify the following configuration in the controller nic template:

    - type: interface
      name: nic5
      use_dhcp: false
      use_dhcpv6: false
      addresses:
        - ip_netmask: {get_param: ManagementIpSubnet}

2. Check the resulting ifcfg script:

    [root@overcloud-controller-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth4
    # This file is autogenerated by os-net-config
    DEVICE=eth4
    ONBOOT=yes
    HOTPLUG=no
    NM_CONTROLLED=no
    PEERDNS=no
    BOOTPROTO=static
    IPADDR=172.16.17.161
    NETMASK=255.255.255.128

3. Check the actual interface configuration:

    [root@overcloud-controller-0 ~]# ip a s dev eth4
    6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 52:54:00:10:9c:b3 brd ff:ff:ff:ff:ff:ff
        inet 172.16.17.161/25 brd 172.16.17.255 scope global eth4
           valid_lft forever preferred_lft forever
        inet6 2001:db8:ca2:3:5054:ff:fe10:9cb3/64 scope global mngtmpaddr dynamic
           valid_lft 3529sec preferred_lft 3529sec
        inet6 fe80::5054:ff:fe10:9cb3/64 scope link
           valid_lft forever preferred_lft forever

    [root@overcloud-controller-0 ~]# ip -6 r | grep default
    default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1705sec hoplimit 64

Actual results:
Note that the default route received on the eth4 interface was preferred over the static one configured for the external network vlan:

    [root@overcloud-controller-0 ~]# cat /etc/sysconfig/network-scripts/route6-vlan100
    default via 2001:db8:ca2:4::1 dev vlan100

Expected results:
use_dhcpv6 is honored, and the NIC it is configured on receives no IPv6 address and no IPv6 routes.

Additional info:
In order to avoid this, one needs to make sure that there's no dhcpv6 server running on the networks the nodes are connected to.
(In reply to Marius Cornea from comment #0)
> Description of problem:

I'm not sure why this is happening. The default for network interfaces is to have DHCPv6 disabled:

"""
DHCPV6C=answer
where answer is one of the following:

    yes - Use DHCP to obtain an IPv6 address for this interface.
    no - Do not use DHCP to obtain an IPv6 address for this interface. This is the default value.
"""
From: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-networkscripts-interfaces.html

However, even if DHCPv6 is picking up an IPv6 default route, that will only be used for routing IPv6 traffic. The default route for IPv4 traffic will be unaffected. You can verify this by running "ip r | grep default".

If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on one of the IPv4 networks, this might be a problem, but probably not. Static IP routes are favored over those learned via DHCP (via metrics), so if there was a static IPv6 default route it would also not be overridden by a DHCPv6 server.

So I think that this bug is mostly cosmetic.
(In reply to Dan Sneddon from comment #3)
> (In reply to Marius Cornea from comment #0)
> > Description of problem:
>
> I'm not sure why this is happening. The default for network interfaces is to
> have DHCPv6 disabled:
>
> """
> DHCPV6C=answer
> where answer is one of the following:
>
> yes - Use DHCP to obtain an IPv6 address for this interface.
> no - Do not use DHCP to obtain an IPv6 address for this interface.
> This is the default value.
> """
> From:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s1-networkscripts-interfaces.html
>
> However, even if DHCPv6 is picking up an IPv6 default route, that will only
> be used for routing IPv6 traffic. The default route for IPv4 traffic will be
> unaffected. You can verify this by running "ip r | grep default".
>
> If you are trying to use IPv6 on some interfaces, and DHCPv6 is running on
> one of the IPv4 networks, this might be a problem, but probably not. Static
> IP routes are favored over those learned via DHCP (via metrics), so if there
> was a static IPv6 default route it would also not be overridden by a DHCPv6
> server.

This is what I expected as well, but it looks like this is not the case. I'm using an IPv4 management network which also runs the DHCPv6 server, and an IPv6 network for the external network with the default IPv6 route assigned. Even though the route6-vlan100 (external network vlan) script is correctly set, the default route that gets installed in the routing table is the one learned via the management network:

    [root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/route6-vlan100
    default via 2001:db8:ca2:4::1 dev vlan100

    [root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
    default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1301sec hoplimit 64

This leads me to believe that the static IPv6 route in route6-vlan100 wasn't actually applied.

> So I think that this bug is mostly cosmetic.
This route is actually being learned through RAs, not through DHCPv6, and the address on eth4 is configured through SLAAC (derived from the MAC address plus the prefix received from the router via RAs).

I believe the workaround is to add the following to /etc/sysconfig/network-scripts/ifcfg-eth4:

    IPV6_AUTOCONF=no

I opened up an upstream bug on this: https://bugs.launchpad.net/os-net-config/+bug/1609125
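To make the SLAAC point concrete, here is a minimal Python sketch of the modified EUI-64 derivation. It assumes the interface uses the classic EUI-64 interface identifier (not privacy or stable-privacy addresses) and that the advertised prefix is 2001:db8:ca2:3::/64, which is inferred from the global address in the "ip a" output earlier in this report. The function name slaac_address is made up for illustration.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Return the modified EUI-64 address for a /64 prefix and a 48-bit MAC."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the MAC to get a 64-bit identifier.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return network.network_address + int.from_bytes(interface_id, "big")


# MAC of eth4 as shown in the report; the global /64 prefix is an assumption
# inferred from the address on that interface.
print(slaac_address("2001:db8:ca2:3::/64", "52:54:00:10:9c:b3"))
print(slaac_address("fe80::/64", "52:54:00:10:9c:b3"))
```

Both results match the inet6 addresses reported on eth4, which is consistent with the address coming from RAs plus the MAC rather than from a DHCPv6 lease.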
I added IPV6_AUTOCONF=no to ifcfg-eth4 and rebooted the machine, but it came back up with the route learned via RAs:

    [root@overcloud-controller-0 heat-admin]# cat /etc/sysconfig/network-scripts/ifcfg-eth4
    # This file is autogenerated by os-net-config
    DEVICE=eth4
    ONBOOT=yes
    HOTPLUG=no
    NM_CONTROLLED=no
    PEERDNS=no
    BOOTPROTO=static
    IPADDR=172.16.17.160
    NETMASK=255.255.255.128
    IPV6_AUTOCONF=no

    [root@overcloud-controller-0 heat-admin]# ip -6 r | grep default
    default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1733sec hoplimit 64
Nevertheless, it seems that the script is working as expected, as it disables accept_ra for that interface:

    [root@overcloud-controller-0 ~]# sysctl -a | grep net.ipv6.conf.eth4.accept_ra
    net.ipv6.conf.eth4.accept_ra = 0
    net.ipv6.conf.eth4.accept_ra_defrtr = 1
    net.ipv6.conf.eth4.accept_ra_pinfo = 1
    net.ipv6.conf.eth4.accept_ra_rt_info_max_plen = 0
    net.ipv6.conf.eth4.accept_ra_rtr_pref = 1
OK, probably the initscript which sets the sysctl value is run after the RAs have been received, by which point the address and routes are already set. I noticed that after the expiration interval both the IP address and the route get removed.
I wonder if ifcfg-eth4 also needs to have the static default gateway defined by

    IPV6_DEFAULTGW=<gw>

Marius - a couple of questions...

Can you provide the snippet from the templates that defines the static IPv6 default gateway?

You indicated that after the expiration interval the IP address and route get removed. Do you ever see the static IPv6 default gateway you have configured in the routing table?

I agree that this isn't related to DHCPv6. IIRC a node gets its routes from neighbor discovery via RAs, not from DHCPv6, so the use_dhcpv6 setting will not come into play here. The summary should probably be updated.

To stop the RAs at the source, you could also disable sending them on the IPv6 router, but I'm not sure what control you have over the router. That would also prevent any nodes from getting an autoconfigured address, so all IPv6 config would have to be static. This seems to be the way many IPv6 rollouts are going due to security concerns.
(In reply to Bob Fournier from comment #12)
> I wonder if ifcfg-eth4 also needs to have the static default gateway
> defined by
> IPV6_DEFAULTGW=<gw>
>
> Marius - a couple questions...
>
> Can you provide the snippet from the templates that shows the define for the
> static IPv6 default gateway?

Sure:

    - type: ovs_bridge
      name: {get_input: bridge_name}
      use_dhcp: false
      members:
        - type: interface
          name: nic2
          primary: true
        - type: vlan
          vlan_id: {get_param: ExternalNetworkVlanID}
          dns_servers: {get_param: DnsServers}
          addresses:
            - ip_netmask: {get_param: ExternalIpSubnet}
          routes:
            - default: true
              next_hop: {get_param: ExternalInterfaceDefaultRoute}

> You indicated that after the expiration interval the ip address and route
> get removed, do you ever see the static IPv6 default gateway you have
> configured in the routing table?

No, after the expiration interval the routing table no longer has any default IPv6 route installed.

> I agree that this isn't related to dhcpv6, IIRC a node gets its routes from
> neighbor discovery via RAs, not from DHCPv6 so the use_dhcpv6 setting will
> not come into play here. The summary should probably be updated.

Agreed, I will update the title.

> In order to prevent the IPv6 router from sending RAs you could also disable
> the sending of RAs there but I'm not sure what control you have over the
> router. That would also prevent any nodes from getting an autoconfigured
> address, so all IPv6 config would have to be static. This seems to be the
> way many IPv6 rollouts are going due to security concerns.

This is actually a libvirt network in a virtual environment. I tried several configurations but wasn't able to disable the RAs, so I was expecting to be able to control this via the node configuration.
Another approach that I tried was to completely disable IPv6 per interface via sysctl:

    parameter_defaults:
      controllerExtraConfig:
        sysctl_settings:
          net.ipv6.conf.eth4.disable_ipv6:
            value: 1

While this removed the autoconfigured address/route, the controller nodes ended up without a default route, so the deployment failed in postconfig. This happened because the static route failed to get installed when os-net-config first ran, due to the existence of a default route already learned via RAs, and the sysctl value was only applied in a later step of the deployment.
I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the controller nodes? This might be done with an ExtraConfig, or perhaps we can edit the overcloud-full images to include this sysctl setting on first boot.
(In reply to Dan Sneddon from comment #15)
> I wonder if it would help if we set "net.ipv6.conf.all.accept_ra = 0" on the
> controller nodes? This might be done with an ExtraConfig, or perhaps we can
> edit the overcloud-full images to include this sysctl setting on first boot.

I added net.ipv6.conf.all.accept_ra = 0 in /etc/sysctl.conf inside the overcloud-full image but the autoconfigured address still showed up on eth4 after deployment. Note that there are some other ipv6 sysctl parameters set during deployment in sysctl.conf:

    [root@overcloud-controller-0 heat-admin]# cat /etc/sysctl.conf
    # HEADER: This file was autogenerated at 2016-08-03 13:42:25 -0400
    # HEADER: by puppet. While it can still be managed manually, it
    # HEADER: is definitely not recommended.
    # System default settings live in /usr/lib/sysctl.d/00-system.conf.
    # To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
    #
    # For more information, see sysctl.conf(5) and sysctl.d(5).
    net.ipv6.conf.all.accept_ra=0
    net.ipv4.ip_nonlocal_bind=1
    net.ipv6.conf.default.autoconf=0
    net.ipv6.conf.default.accept_ra=0
    net.netfilter.nf_conntrack_max=500000
    net.core.netdev_max_backlog=10000
    net.ipv4.tcp_keepalive_intvl=1
    net.ipv4.tcp_keepalive_time=5
    net.ipv4.tcp_keepalive_probes=5
    net.nf_conntrack_max=500000
I found some anecdotal information on the Web which seems to indicate that this doesn't function as you would expect: net.ipv6.conf.all.accept_ra=0 In fact, this doesn't disable accepting RAs on a particular interface (why not? RHEL bug?). To me, this is smelling more and more like RHEL isn't handling IPv6 routing correctly. I think we may want to open a bug against RHEL and see if we can get some help from kernel/network developers.
(In reply to Dan Sneddon from comment #19)
> I found some anecdotal information on the Web which seems to indicate that
> this doesn't function as you would expect:
>
> net.ipv6.conf.all.accept_ra=0
>
> In fact, this doesn't disable accepting RAs on a particular interface (why
> not? RHEL bug?).
>
> To me, this is smelling more and more like RHEL isn't handling IPv6 routing
> correctly. I think we may want to open a bug against RHEL and see if we can
> get some help from kernel/network developers.

I've been working with bfournie on this. I confirmed what I think is a potential bug in RHEL:

    # cat /etc/sysctl.d/99-sysctl.conf | grep autoconf
    net.ipv6.conf.default.autoconf=0

Yet, IPv6 autoconf is still enabled for the individual interfaces:

    # sysctl -a | grep autoconf | grep eth4
    net.ipv6.conf.eth4.autoconf = 1

So it appears that setting the autoconf default has no effect on particular interfaces.

We also discovered that this setting will prevent the default route from being learned through RAs:

    net.ipv6.conf.eth4.accept_ra_defrtr = 0

So now we are exploring several possible fixes:

1) See whether net.ipv6.conf.default.accept_ra_defrtr will be properly applied to individual interfaces. If so, we can add this to the default sysctl settings.

2) Research whether there are ifcfg settings that influence this setting.

3) Update os-net-config to implement a workaround. It's possible to make os-net-config modify sysctl settings, but I'd rather not if we can find another solution.
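The mismatch between the "default" and per-interface values can be spotted mechanically. The sketch below is only an illustration in Python: the helper names parse_sysctl and differs_from_default are made up, and the sample text combines values quoted in different comments of this thread, so it is not a single capture from the node. One possible explanation for the mismatch, not verified on this system, is that the kernel copies the "default" values to an interface only when the interface is created, so a default set later would not reach an interface that already exists.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines (sysctl -a or sysctl.conf style) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def differs_from_default(settings, iface):
    """Return {knob: (default_value, iface_value)} where the two disagree."""
    default_prefix = "net.ipv6.conf.default."
    iface_prefix = "net.ipv6.conf." + iface + "."
    result = {}
    for key, default_value in settings.items():
        if not key.startswith(default_prefix):
            continue
        knob = key[len(default_prefix):]
        iface_value = settings.get(iface_prefix + knob)
        if iface_value is not None and iface_value != default_value:
            result[knob] = (default_value, iface_value)
    return result


# Values quoted in this thread, combined here for illustration only.
SAMPLE = """\
net.ipv6.conf.default.autoconf=0
net.ipv6.conf.default.accept_ra=0
net.ipv6.conf.eth4.autoconf = 1
net.ipv6.conf.eth4.accept_ra = 0
"""

print(differs_from_default(parse_sysctl(SAMPLE), "eth4"))
```

With the sample above only autoconf is reported, since accept_ra already agrees between "default" and eth4. On a real node the input would be the full output of "sysctl -a".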
The following test was successful:

1) Added this line to ifcfg-eth4:

    IPV6_AUTOCONF=no

2) Restarted eth4
   Result: no default route

3) Restarted vlan100
   Result: default via 2001:db8:ca2:4::1 dev vlan100 metric 1024

So I believe the solution is that os-net-config should add IPV6_AUTOCONF=no when use_dhcpv6 is set to no or false.
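The proposed behaviour could look roughly like the following. This is a hypothetical Python sketch, not the actual os-net-config code and not the upstream patch: render_ifcfg and its parameters are invented for illustration, and emitting DHCPV6C=yes in the DHCPv6 case is an assumption based on the RHEL documentation quoted earlier in this thread.

```python
def render_ifcfg(device, ipaddr, netmask, use_dhcpv6=False):
    """Hypothetical ifcfg renderer; NOT the real os-net-config implementation."""
    lines = [
        "DEVICE=" + device,
        "ONBOOT=yes",
        "HOTPLUG=no",
        "NM_CONTROLLED=no",
        "PEERDNS=no",
        "BOOTPROTO=static",
        "IPADDR=" + ipaddr,
        "NETMASK=" + netmask,
    ]
    if use_dhcpv6:
        # Assumption: DHCPv6 is requested through the documented DHCPV6C key.
        lines.append("DHCPV6C=yes")
    else:
        # Proposed behaviour: opt out of autoconf when DHCPv6 is disabled.
        lines.append("IPV6_AUTOCONF=no")
    return "\n".join(lines) + "\n"


print(render_ifcfg("eth4", "172.16.17.161", "255.255.255.128"), end="")
```

The point of the sketch is only the final branch: with use_dhcpv6 false, the generated file ends with IPV6_AUTOCONF=no, which is the line that made the manual test succeed.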
I think a quick and ugly workaround exists: if the os-net-config NIC config templates have a static IPv6 address, then IPV6_AUTOCONF=no will be added along with the static IP. We could add a bogus IPv6 address to eth4, which would result in IPV6_AUTOCONF=no being added to the ifcfg-eth4 file. If IPv6 autoconf were later desired for eth4, this line could be removed from the ifcfg file after the deployment completes, with net.ipv6.conf.eth4.accept_ra_defrtr set to 0 via sysctl at the same time.

    - type: interface
      name: nic5
      use_dhcp: false
      use_dhcpv6: false
      addresses:
        - ip_netmask: {get_param: ManagementIpSubnet}
        - ip_netmask: fd00:fd00:dead:beef::1/64

The above will configure a valid IPv4 address on eth4, along with an unused IPv6 address. The presence of the IPv6 address will cause os-net-config to disable IPv6 autoconf.
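If this workaround is used, it may be worth checking that the placeholder address cannot collide with a real network. A small Python sketch follows; placeholder_is_safe is a made-up helper, and the two "in use" prefixes are assumptions inferred from addresses shown in this thread (the /64 lengths in particular are guesses). The check only confirms that the placeholder sits in the unique local range fc00::/7 and does not overlap the listed prefixes.

```python
import ipaddress

# Unique local address range; the fd00:... placeholder above falls inside it.
ULA = ipaddress.IPv6Network("fc00::/7")


def placeholder_is_safe(placeholder, networks_in_use):
    """True if the placeholder is a ULA and overlaps none of the given networks."""
    network = ipaddress.IPv6Interface(placeholder).network
    if not network.subnet_of(ULA):
        return False
    return not any(
        network.overlaps(ipaddress.IPv6Network(used)) for used in networks_in_use
    )


# Assumed prefixes, inferred from the eth4 address and the vlan100 gateway.
IN_USE = ["2001:db8:ca2:3::/64", "2001:db8:ca2:4::/64"]

print(placeholder_is_safe("fd00:fd00:dead:beef::1/64", IN_USE))
```

Any other isolation prefixes in the deployment that use ULA space would need to be added to the list for the check to be meaningful.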
Created upstream patch for os-net-config. This patch will disable IPV6_AUTOCONF if use_dhcpv6 is false.

https://review.openstack.org/#/c/350794/

The upstream bug is here:

https://bugs.launchpad.net/os-net-config/+bug/1609125
(In reply to Dan Sneddon from comment #23)
> Created upstream patch for os-net-config. This patch will disable
> IPV6_AUTOCONF if use_dhcpv6 is false.
>
> https://review.openstack.org/#/c/350794/

I applied the patch to the overcloud image and it looks like it fixes the issue during the deployment. Once the deployment is finished I get:

    default via 2001:db8:ca2:4::1 dev vlan100 metric 1024

But if I reboot a controller node, the autoconf address/route is there again when it comes back online. Here is the network journal:

    overcloud-controller-0.localdomain network[771]: Bringing up interface eth4: RTNETLINK answers: File exists
    overcloud-controller-0.localdomain network[771]: [ OK ]
    overcloud-controller-0.localdomain ovs-vsctl[1655]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --may-exist add-port br-ex vlan100 tag=100 -- set Interface vlan100 type=internal
    overcloud-controller-0.localdomain network[771]: Bringing up interface vlan100: RTNETLINK answers: File exists
    overcloud-controller-0.localdomain network[771]: [ OK ]

I think this happens because eth4 already has the autoconf IP set when the initscript is run.

> The upstream bug is here:
>
> https://bugs.launchpad.net/os-net-config/+bug/1609125
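For checking after a reboot whether an RA-learned default route has come back, the "ip -6 r" output can be filtered for default routes marked "proto ra". The Python sketch below is illustrative only: the helper names are made up, and the sample text reuses the two route lines quoted in this report rather than a live capture.

```python
def field(tokens, key):
    """Return the token following `key`, or None if `key` is absent."""
    if key in tokens[:-1]:
        return tokens[tokens.index(key) + 1]
    return None


def ra_default_routes(ip6_route_output):
    """List (gateway, device) for default routes marked 'proto ra'."""
    found = []
    for line in ip6_route_output.splitlines():
        tokens = line.split()
        if tokens[:1] != ["default"]:
            continue
        if field(tokens, "proto") == "ra":
            found.append((field(tokens, "via"), field(tokens, "dev")))
    return found


# Route lines as quoted in this report: one learned via RA, one static.
SAMPLE = """\
default via fe80::5054:ff:fe4f:248b dev eth4 proto ra metric 1024 expires 1705sec hoplimit 64
default via 2001:db8:ca2:4::1 dev vlan100 metric 1024
"""

print(ra_default_routes(SAMPLE))
```

On a node, the input could come from running "ip -6 route" through subprocess.run with capture_output=True and text=True. An empty result would mean no default route was installed from RAs at that moment.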