Bug 1445054 - Setting ipv6.disable=1 prevents both IPv4 and IPv6 socket opening for VXLAN tunnels
Summary: Setting ipv6.disable=1 prevents both IPv4 and IPv6 socket opening for VXLAN t...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Benc
QA Contact: Jan Tluka
URL:
Whiteboard:
Duplicates: 1437778 1452611 1467387
Depends On:
Blocks: 1323132 1454636
Reported: 2017-04-24 19:58 UTC by Robb Manes
Modified: 2020-09-22 12:07 UTC (History)
CC List: 11 users

Fixed In Version: kernel-3.10.0-668.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1454636
Environment:
Last Closed: 2017-08-02 06:13:10 UTC
Target Upstream Version:


Links
System ID | Priority | Status | Summary | Last Updated
Red Hat Bugzilla 1490811 | medium | CLOSED | [rhel-alt]fail to set vxlan up with error Address family not supported by protocol when ipv6 is disabled | 2020-10-14 00:28:05 UTC
Red Hat Knowledge Base (Solution) 3039771 | None | None | None | 2017-05-19 14:54:00 UTC
Red Hat Product Errata RHSA-2017:1842 | normal | SHIPPED_LIVE | Important: kernel security, bug fix, and enhancement update | 2017-08-01 18:22:09 UTC

Internal Links: 1490811

Description Robb Manes 2017-04-24 19:58:09 UTC
Description of problem:
When booting with ipv6.disable=1, vxlan fails to initialize with the error "vxlan: Cannot bind port 4789, err=-97", where -97 is -EAFNOSUPPORT.  Failing to open the IPv6 socket is expected in this configuration.  The problem is that when the first, IPv6 call of __vxlan_sock_add() fails with EAFNOSUPPORT, the second, non-IPv6 call of __vxlan_sock_add() never occurs.

The failure comes from vxlan_sock_add->__vxlan_sock_add->vxlan_create_sock->udp_sock_create() refusing the AF_INET6 address.  Once that first call returns an error, the IPv4 call is not attempted, so no vxlan socket is made at all.  This is due to the check done in vxlan_sock_add():

static int vxlan_sock_add(struct vxlan_dev *vxlan)
{
        bool ipv6 = vxlan->flags & VXLAN_F_IPV6;
        bool metadata = vxlan->flags & VXLAN_F_COLLECT_METADATA;
        int ret = 0; 

        RCU_INIT_POINTER(vxlan->vn4_sock, NULL);
#if IS_ENABLED(CONFIG_IPV6)
        RCU_INIT_POINTER(vxlan->vn6_sock, NULL);
        if (ipv6 || metadata)
                ret = __vxlan_sock_add(vxlan, true);
#endif
        if (!ret && (!ipv6 || metadata))
                ret = __vxlan_sock_add(vxlan, false);
        if (ret < 0) 
                vxlan_sock_release(vxlan);
        return ret; 
}

Fix, as authored by Marcelo Leitner, is as below:

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ebc98bb..2dfeda8 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2832,7 +2832,7 @@ static int vxlan_sock_add(struct vxlan_dev *vxlan)
        if (ipv6 || metadata)
                ret = __vxlan_sock_add(vxlan, true);
 #endif
-       if (!ret && (!ipv6 || metadata))
+       if ((!ret || ret == -EAFNOSUPPORT) && (!ipv6 || metadata))
                ret = __vxlan_sock_add(vxlan, false);
        if (ret < 0)
                vxlan_sock_release(vxlan);

With this change, a return value of -EAFNOSUPPORT from the IPv6 call no longer blocks the IPv4 call: we fall through and open an IPv4 socket using __vxlan_sock_add(vxlan, false).

Version-Release number of selected component (if applicable):
Occurs in latest net-next and affects RHEL7 by proxy.

How reproducible:
Every time.

Steps to Reproduce:
1. Configure vxlan tunnel (I used an OVS bridge) on a host and attach a tap or veth to it:
	# ovs-vsctl add-br vxlan-br
	# ovs-vsctl add-port vxlan-br vx1 -- set interface vx1 type=vxlan options:remote_ip=192.168.222.21
	# ovs-vsctl add-port vxlan-br $TAP/VETH
2. Assign an address on a separate network to the tap or veth device.
3. Reboot with ipv6.disable=1 on the kernel command line. The vxlan tunnel will no longer function.

Actual results:
The vxlan socket is not created for IPv4 even though only IPv6 is disabled.

Expected results:
Only the IPv6 socket should fail to be created; the IPv4 socket should still be made.

Comment 1 Robb Manes 2017-04-24 20:13:56 UTC
Valid workaround is to remove ipv6.disable=1, which will open both sockets.

Full reproduction steps:

- Configure two individual hosts on the same subnet, using 192.168.1.2 and 192.168.1.3 as an example.
- On each host (two separate hosts):
  - Add an internal OpenVSwitch bridge to the host:
    # ovs-vsctl add-br br-int
  - Add a VXLAN port to the br-int device with remote_ip set to the address of the peer system:
    # ovs-vsctl add-port br-int vx1 -- set interface vx1 type=vxlan options:remote_ip=192.168.1.2
      ||OR||
    # ovs-vsctl add-port br-int vx1 -- set interface vx1 type=vxlan options:remote_ip=192.168.1.3
  - Create a veth pair, attach one end to the OVS bridge, and give the other end an address on a separate network (10.10.10.2 and 10.10.10.3 on each host used in this example):
    # ip link add type veth
    # ip link set veth0 up
    # ip link set veth1 up
    # ovs-vsctl add-port br-int veth1
    # ip addr add 10.10.10.2/24 dev veth0
      ||OR||
    # ip addr add 10.10.10.3/24 dev veth0

Both sides should be able to ping their peer's 10.10.10.* address over the vxlan tunnel.  Then add ipv6.disable=1 to the "GRUB_CMDLINE_LINUX" line in /etc/sysconfig/grub and rebuild the GRUB2 configuration:
    # grub2-mkconfig -o /boot/grub2/grub.cfg

Reboot, and VXLAN should no longer function.  Applying the patch from the description will resolve the problem.

Comment 2 Marcelo Ricardo Leitner 2017-04-24 20:22:40 UTC
Hi Jiri. I believe this is a follow-up for b1be00a6c39f ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device").

The patch above will cause it to log an error for the ipv6 socket while staying silent for ipv4 if that works. That doesn't seem very friendly, but the admin did disable ipv6 for that interface, so maybe that's okay.

Comment 3 Jiri Benc 2017-04-27 19:29:14 UTC
Patches submitted upstream:

http://patchwork.ozlabs.org/patch/756113/
http://patchwork.ozlabs.org/patch/756114/

Comment 5 Ben Bennett 2017-05-08 18:42:59 UTC
*** Bug 1437778 has been marked as a duplicate of this bug. ***

Comment 7 Rafael Aquini 2017-05-17 13:28:03 UTC
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing

Comment 9 Rafael Aquini 2017-05-18 13:45:20 UTC
Patch(es) available on kernel-3.10.0-668.el7

Comment 11 Paolo Abeni 2017-05-19 13:57:21 UTC
*** Bug 1452611 has been marked as a duplicate of this bug. ***

Comment 13 Jan Tluka 2017-05-19 16:20:35 UTC
Reproduced on 3.10.0-632.el7 kernel using the steps in comment 1. Adding ipv6.disable=1 to the kernel cmdline made vxlan stop working.

Verified on 3.10.0-668.el7 kernel. The vxlan works with or without ipv6.disable=1 on the kernel cmdline.

Comment 16 Steven Walter 2017-07-05 18:56:48 UTC
*** Bug 1467387 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2017-08-02 06:13:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1842

