Description of problem:
VXLAN packets destined to a host are received by the host's ethernet interface, but the vxlan sys port does not pick them up.

Version-Release number of selected component (if applicable):
OSP stable/newton overcloud

[root@controller-0 ~]# cat /etc/*release*
cat: /etc/lsb-release.d: Is a directory
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.3"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 (Maipo)"

[root@controller-0 ~]# uname -a
Linux controller-0.localdomain 3.10.0-513.el7.x86_64 #1 SMP Wed Oct 12 09:41:28 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@controller-0 ~]# rpm -q openvswitch
openvswitch-2.5.0-14.git20160727.el7fdb.x86_64

How reproducible:
Reproduced 3 times (on different setups) with a single compute node and a single control node.

Steps to Reproduce:
1. Configure bridges with vxlan ports on each node.
2. Verify ping between nodes across the IPs used for the vxlan ports.
3. Configure the bridge's local port with an IP on each node.
4. Ping from one local bridge port to the other.

Actual results:
The ARP request packet is seen on ingress to the node, with a correct vxlan header. However, the packet never gets picked up by the vxlan port for OVS and does not make it to the OVS local port.

Expected results:
The ARP request enters the OVS bridge and reaches the local port, and an ARP reply is sent back to the opposite node.
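The reproduction steps above can be sketched as the commands below. This is a minimal sketch, not the exact commands used for this report: the bridge name br1, the port name vxlan0, and the addresses are taken from the outputs in this report, the /24 prefix is inferred from the 255.255.255.0 netmask, and the helper name print_vxlan_setup is hypothetical. The helper only prints the commands for steps 1 and 3 so that they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for steps 1 and 3.
# $1 = bridge name, $2 = remote underlay IP, $3 = local overlay IP/prefix
print_vxlan_setup() {
    bridge=$1 remote_ip=$2 overlay_cidr=$3
    # Step 1: create the bridge and attach a vxlan port pointing at the peer.
    echo "ovs-vsctl add-br $bridge"
    echo "ovs-vsctl add-port $bridge vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=$remote_ip"
    # Step 3: give the bridge's local (internal) port an overlay address.
    echo "ip addr add $overlay_cidr dev $bridge"
    echo "ip link set $bridge up"
}

# Control node values as they appear in this report.
print_vxlan_setup br1 15.0.0.2 16.0.0.2/24
```

For the compute node the same helper would be called with the mirrored values, for example print_vxlan_setup br1 15.0.0.1 16.0.0.1/24.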
Additional info:

Outputs:

###control node###
[root@controller-0 ~]# ifconfig br1
br1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 16.0.0.2  netmask 255.255.255.0  broadcast 16.0.0.255
        inet6 fe80::7cac:ccff:fed5:bb4e  prefixlen 64  scopeid 0x20<link>
        ether 7e:ac:cc:d5:bb:4e  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5  bytes 438 (438.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@controller-0 ~]# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 15.0.0.1  netmask 255.255.255.0  broadcast 15.0.0.255
        inet6 fe80::5054:ff:feb7:e86d  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:b7:e8:6d  txqueuelen 1000  (Ethernet)
        RX packets 19610  bytes 975230 (952.3 KiB)
        RX errors 0  dropped 9114  overruns 0  frame 0
        TX packets 365  bytes 41420 (40.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@controller-0 ~]# ovs-vsctl show
adad57f0-efae-4976-9f32-3d2e9a3af3e2
    Manager "ptcp:6640"
    Bridge "br1"
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {remote_ip="15.0.0.2"}
        Port "br1"
            Interface "br1"
                type: internal
    ovs_version: "2.5.0"

[root@controller-0 ~]# ping 15.0.0.2
PING 15.0.0.2 (15.0.0.2) 56(84) bytes of data.
64 bytes from 15.0.0.2: icmp_seq=1 ttl=64 time=0.272 ms
64 bytes from 15.0.0.2: icmp_seq=2 ttl=64 time=0.226 ms
64 bytes from 15.0.0.2: icmp_seq=3 ttl=64 time=0.213 ms
^C
--- 15.0.0.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2005ms
rtt min/avg/max/mdev = 0.213/0.237/0.272/0.025 ms

###compute node ####
[root@compute-0 ~]# ifconfig br1
br1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 16.0.0.1  netmask 255.255.255.0  broadcast 16.0.0.255
        inet6 fe80::2887:54ff:fe7f:6648  prefixlen 64  scopeid 0x20<link>
        ether 2a:87:54:7f:66:48  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 356  bytes 15180 (14.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@compute-0 ~]# ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 15.0.0.2  netmask 255.255.255.0  broadcast 15.0.0.255
        inet6 fe80::5054:ff:fe79:998a  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:79:99:8a  txqueuelen 1000  (Ethernet)
        RX packets 9742  bytes 532432 (519.9 KiB)
        RX errors 0  dropped 249  overruns 0  frame 0
        TX packets 10352  bytes 494510 (482.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@compute-0 ~]# ovs-vsctl show
9477dad5-0379-4d18-988d-c2416ce67726
    Manager "ptcp:6640"
    Bridge "br1"
        Port "br1"
            Interface "br1"
                type: internal
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {remote_ip="15.0.0.1"}
    ovs_version: "2.5.0"

####ping from compute br1 port to br1 port on control node###
[root@compute-0 ~]# ping 16.0.0.2
PING 16.0.0.2 (16.0.0.2) 56(84) bytes of data.
From 16.0.0.1 icmp_seq=1 Destination Host Unreachable
From 16.0.0.1 icmp_seq=2 Destination Host Unreachable
From 16.0.0.1 icmp_seq=3 Destination Host Unreachable

####capture of arp request on control node ETH1###
[root@controller-0 ~]# tcpdump -i eth1 port 4789 -e -xx -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
20:45:02.823458 52:54:00:79:99:8a > 52:54:00:b7:e8:6d, ethertype IPv4 (0x0800), length 92: 15.0.0.2.57311 > 15.0.0.1.4789: VXLAN, flags [I] (0x08), vni 0
2a:87:54:7f:66:48 > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 16.0.0.2 tell 16.0.0.1, length 28
        0x0000:  5254 00b7 e86d 5254 0079 998a 0800 4500
        0x0010:  004e e5eb 4000 4011 36b1 0f00 0002 0f00
        0x0020:  0001 dfdf 12b5 003a 0000 0800 0000 0000
        0x0030:  0000 ffff ffff ffff 2a87 547f 6648 0806
        0x0040:  0001 0800 0604 0001 2a87 547f 6648 1000
        0x0050:  0001 0000 0000 0000 1000 0002
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

###capture of arp request on control node br1 (nothing)###
[root@controller-0 ~]# tcpdump -i br1 -e -xx -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br1, link-type EN10MB (Ethernet), capture size 65535 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel

###controller ovs stats##
[root@controller-0 ~]# ovs-ofctl dump-ports br1
OFPST_PORT reply (xid=0x2): 2 ports
  port LOCAL: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=5, bytes=438, drop=0, errs=0, coll=0
  port  2: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=5, bytes=438, drop=0, errs=0, coll=0

###capture on compute node ETH1####
[root@compute-0 ~]# tcpdump -i eth1 port 4789 -n -xx -e
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
20:47:50.864632 52:54:00:79:99:8a > 52:54:00:b7:e8:6d, ethertype IPv4 (0x0800), length 92: 15.0.0.2.57311 > 15.0.0.1.4789: VXLAN, flags [I] (0x08), vni 0
2a:87:54:7f:66:48 > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 16.0.0.2 tell 16.0.0.1, length 28
        0x0000:  5254 00b7 e86d 5254 0079 998a 0800 4500
        0x0010:  004e 4067 4000 4011 dc35 0f00 0002 0f00
        0x0020:  0001 dfdf 12b5 003a 0000 0800 0000 0000
        0x0030:  0000 ffff ffff ffff 2a87 547f 6648 0806
        0x0040:  0001 0800 0604 0001 2a87 547f 6648 1000
        0x0050:  0001 0000 0000 0000 1000 0002
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

###capture on compute node br1###
[root@compute-0 ~]# tcpdump -i br1 -n -xx -e
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br1, link-type EN10MB (Ethernet), capture size 65535 bytes
20:48:35.877332 2a:87:54:7f:66:48 > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 16.0.0.2 tell 16.0.0.1, length 28
        0x0000:  ffff ffff ffff 2a87 547f 6648 0806 0001
        0x0010:  0800 0604 0001 2a87 547f 6648 1000 0001
        0x0020:  0000 0000 0000 1000 0002

###compute node ovs stats###
[root@compute-0 ~]# ovs-ofctl dump-ports br1
OFPST_PORT reply (xid=0x2): 2 ports
  port LOCAL: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=596, bytes=25260, drop=0, errs=0, coll=0
  port  1: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=596, bytes=25260, drop=0, errs=0, coll=0
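As a cross-check of the eth1 capture on the control node, the outer headers can be decoded by hand from the hex dump above. The sketch below assumes an untagged Ethernet frame (14 bytes) and an IPv4 header without options (20 bytes, matching the leading 0x45), so the UDP header starts at byte 34 and the VXLAN header at byte 42. The helper name hex_bytes is hypothetical.

```shell
#!/bin/sh
# First 50 bytes of the frame captured on the control node's eth1
# (Ethernet + IPv4 + UDP + VXLAN headers), copied from the hex dump.
frame="525400b7e86d52540079998a08004500\
004ee5eb4000401136b10f0000020f00\
0001dfdf12b5003a0000080000000000\
0000"

# Hypothetical helper: hex_bytes OFFSET COUNT
# Prints COUNT bytes starting at byte OFFSET (0-based) as hex digits.
hex_bytes() {
    start=$(( $1 * 2 + 1 ))
    end=$(( start + $2 * 2 - 1 ))
    printf '%s\n' "$frame" | cut -c "${start}-${end}"
}

dst_port=$(printf '%d' "0x$(hex_bytes 36 2)")   # UDP destination port
vxlan_flags=$(hex_bytes 42 1)                   # VXLAN flags byte
vni=$(printf '%d' "0x$(hex_bytes 46 3)")        # 24-bit VXLAN network identifier

echo "UDP dst port: $dst_port, VXLAN flags: 0x$vxlan_flags, VNI: $vni"
```

The decoded values agree with tcpdump's own summary line (port 4789, flags 0x08, vni 0), so the frame arriving on eth1 is a well-formed VXLAN packet addressed to the standard VXLAN port.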
Hi Tim,

I looked at your setup today. The VXLAN UDP ports were being blocked by iptables. Adding an exception allowed traffic to pass on the overlay.

[heat-admin@compute-0 ~]$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     udp  --  anywhere             anywhere             udp dpt:4789
...

[heat-admin@compute-0 ~]$ ping 16.0.0.1
PING 16.0.0.1 (16.0.0.1) 56(84) bytes of data.
64 bytes from 16.0.0.1: icmp_seq=1 ttl=64 time=0.611 ms
64 bytes from 16.0.0.1: icmp_seq=2 ttl=64 time=0.206 ms
64 bytes from 16.0.0.1: icmp_seq=3 ttl=64 time=0.236 ms
^C
--- 16.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.206/0.351/0.611/0.184 ms
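The exact command used to add the exception is not shown in this comment. A rule matching the iptables -L line above could be checked for and inserted roughly as follows. This is a sketch under that assumption; the helper name vxlan_rule_cmd is hypothetical, and it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print an iptables command for the VXLAN UDP port.
# $1 = action flag: -C checks whether the rule exists, -I inserts it
#      at the top of the INPUT chain.
vxlan_rule_cmd() {
    echo "iptables $1 INPUT -p udp --dport 4789 -j ACCEPT"
}

# Print the check and insert commands for review.
vxlan_rule_cmd -C
vxlan_rule_cmd -I
```

A rule inserted by hand like this is not persistent across a firewall reload or a reboot, so it is suitable for confirming the diagnosis rather than as a lasting fix.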
Thanks Eric for debugging on my setup. As you thought, it looks like there is a bug in how the firewall is being configured with TripleO. I filed it upstream as
https://bugs.launchpad.net/tripleo/+bug/1641191

There is no bug with OVS or the kernel. The problem is that VXLAN traffic is being blocked by iptables because the TripleO firewall is not configured to allow it when the neutron OVS agent is not being used.

Going to move this bug to OSP Director and provide a fix upstream.
Code is merged in master branch. Will need to be backported to stable/newton.
(In reply to Tim Rozet from comment #2)
> Thanks Eric for debugging on my setup. As you thought it looks like
> there is a bug in how firewall is being configured with TripleO. I filed it
> upstream as
> https://bugs.launchpad.net/tripleo/+bug/1641191
>
> There is no bug with OVS or the kernel, the problem is VXLAN traffic is
> being blocked by iptables because TripleO firewall is not configured to
> allow it if neutron OVS agent is not being used.
>
> Going to move this bug to OSP Director and provide a fix upstream.

Thanks Eric and Tim for the collaboration and quick turnaround!

/Nir
Verified with openstack-tripleo-heat-templates-5.1.0-3.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html