Bug 1580217
| Summary: | [ovn] IPv6 load balancer for layer 4 on logical router doesn't work | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | haidong li <haili> |
| Component: | openvswitch | Assignee: | Mark Michelson <mmichels> |
| Status: | CLOSED ERRATA | QA Contact: | haidong li <haili> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.5 | CC: | atelang, atragler, kfida, lmanasko, mmichels, pvauter, tredaelli |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | openvswitch-2.9.0-69.el7fdn | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-11-05 14:59:03 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
**Description** (haidong li, 2018-05-21 03:47:01 UTC)
I tried to reproduce this and I was unable to. When I set up an IPv6 load balancer with a port, it worked as expected. I noticed something suspicious in the output of `ovn-nbctl ls-lb-list`:

```
[root@dell-per730-19 ~]# ovn-nbctl ls-lb-list s2
UUID                                    LB     PROTO      VIP              IPs
a7b0f293-8897-43bd-ada5-61b67382ce45           tcp/udp    300::1           2001:db8:103::11,2001:db8:103::12
                                               (null)     [300::1]:8000    [2001:db8:103::11]:80,[2001:db8:103::12]:80
```

Notice how the PROTO is "(null)" for the VIP with a port number. When I run the command on my machine, it looks like this:

```
[vagrant@central ~]$ sudo ovn-nbctl lr-lb-list ro0
UUID                                    LB     PROTO      VIP                               IPs
ad707ab6-3f78-4547-9e50-c8a0e1d8bb2d    lb0    tcp        [fd0f:f07:71c6:b050::100]:8000    [fd0f:0f07:71c6:af56::194]:8000,[fd0f:0f07:71c6:af56::195]:8000
                                               tcp/udp    fd0f:f07:71c6:b050::100           fd0f:0f07:71c6:af56::194
```

Notice that the PROTO is "tcp" for the VIP with a port number. The way I created my load balancer was to issue the following commands:

```
ovn-nbctl lb-add lb0 fd0f:0f07:71c6:b050::100 fd0f:0f07:71c6:af56::194
ovn-nbctl lb-add lb0 [fd0f:0f07:71c6:b050::100]:8000 [fd0f:0f07:71c6:af56::194]:8000,[fd0f:0f07:71c6:af56::195]:8000
ovn-nbctl lr-lb-add ro0 lb0
```

Notice that I did not specify a protocol, but it defaulted to "tcp". Did you create your load balancers this way, or did you add them directly to the database? If you added them directly to the database and specified "tcp" as the protocol, does this issue still occur?

---

Yes, I created the load balancers directly in the database, but the issue still exists in my environment after I set the TCP protocol or use the commands you mentioned:

```
[root@hp-dl380g9-04 ovn]# ovn-nbctl lb-add lb0 300::1 2001:db8:103::11,2001:db8:103::12
[root@hp-dl380g9-04 ovn]# ovn-nbctl lb-add lb0 [300::1]:8000 [2001:db8:103::11]:80,[2001:db8:103::12]:80
[root@hp-dl380g9-04 ovn]# ovn-nbctl lr-lb-add r1 lb0
[root@hp-dl380g9-04 ovn]# ovn-nbctl lr-lb-list r1
UUID                                    LB     PROTO      VIP              IPs
22bdef9d-dc3d-45e0-8055-c68fd2f0cd73    lb0    tcp/udp    300::1           2001:db8:103::11,2001:db8:103::12
                                               tcp        [300::1]:8000    [2001:db8:103::11]:80,[2001:db8:103::12]:80
[root@hp-dl380g9-04 ovn]# ovn-nbctl lb-list
UUID                                    LB     PROTO      VIP              IPs
34b7145f-0d91-45e8-b3ad-42922d1a8b38           tcp/udp    30.0.0.2         172.16.103.11,172.16.103.12
                                               (null)     30.0.0.2:8000    172.16.103.11:80,172.16.103.12:80
6529a06b-6e5b-4c12-8aca-9ea7798a906d           tcp/udp    300::1           2001:db8:103::11,2001:db8:103::12
                                               tcp        [300::1]:8000    [2001:db8:103::11]:80,[2001:db8:103::12]:80
2e6e7e49-b0af-444e-95ad-8cad040b6483           tcp/udp    30.0.0.1         172.16.103.11,172.16.103.12
                                               (null)     30.0.0.1:8000    172.16.103.11:80,172.16.103.12:80
22bdef9d-dc3d-45e0-8055-c68fd2f0cd73    lb0    tcp/udp    300::1           2001:db8:103::11,2001:db8:103::12
                                               tcp        [300::1]:8000    [2001:db8:103::11]:80,[2001:db8:103::12]:80
[root@hp-dl380g9-04 ovn]# virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]
[root@localhost ~]# ping6 300::1
PING 300::1(300::1) 56 data bytes
64 bytes from 2001:db8:103::11: icmp_seq=1 ttl=63 time=1.51 ms
64 bytes from 2001:db8:103::11: icmp_seq=2 ttl=63 time=0.590 ms

--- 300::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.590/1.050/1.510/0.460 ms
[root@localhost ~]# curl -g [300::1]:8000 >> log3.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
[root@localhost ~]# curl -g [2001:db8:103::11]:80
i am vm1
[root@localhost ~]# curl -g [2001:db8:103::12]:80
i am vm2
```

Note that curl to the VIP with a port hangs and transfers nothing, while curl directly to either backend succeeds. By the way, if convenient, can you please log in to the machines I used and check the configuration? The password is redhat:

```
hp-dl380g9-04.rhts.eng.pek2.redhat.com
hp-dl388g8-09.rhts.eng.pek2.redhat.com
```
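"Adding directly to the database" here means creating the northbound `Load_Balancer` row by hand rather than going through `lb-add`. The report does not show the exact commands used for that; a minimal sketch of the approach, reusing the VIP, backends, and router name `r1` from the transcript above, could look like this (a hypothetical reconstruction, not the reporter's actual commands):

```
# Hypothetical reconstruction: create the NB Load_Balancer row with an
# explicit protocol, set the VIP-to-backend mapping, and attach the row to
# logical router r1. Functionally equivalent to the lb-add commands above.
lb=$(ovn-nbctl create load_balancer protocol=tcp)
ovn-nbctl set load_balancer "$lb" \
    'vips:"[300::1]:8000"="[2001:db8:103::11]:80,[2001:db8:103::12]:80"'
ovn-nbctl add logical_router r1 load_balancer "$lb"
```

Either route populates the same `Load_Balancer` table; the "(null)" PROTO entries in the listings above would correspond to rows created with a port in the VIP but no `protocol` value set.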
---

I figured out how to reproduce this locally. In my setup, I had set options:chassis="central" on my logical router. In your setup, you set options:redirect-chassis="hv1" on the r1_s2 logical router port. I changed my configuration to use redirect-chassis on the logical router port, and now I see the same problem. I will look into why this is happening and report back when I have a fix.

---

I figured out the problem and have created a fix locally. The issue is that the rule that un-DNATs return traffic from the load balancer destination does not get installed when IPv6 is used. The fix is to install this rule for IPv6 as well.

---

I have submitted this patch for review upstream: https://patchwork.ozlabs.org/patch/935066/

---

This has been committed upstream in OVS master and OVS 2.9.

---

On second inspection, it turns out this is committed to master but not to 2.9. I am putting this back into POST until it is committed to the upstream 2.9 branch. I have sent an e-mail to Ben Pfaff requesting the backport.

---

This is now backported to the 2.9 branch as well.

---

This issue is verified on the latest version:

```
[root@localhost ~]# curl -g [300::1]:8000 >> log3.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100     9  100     9    0     0    461      0 --:--:-- --:--:-- --:--:--   500
[root@localhost ~]# echo $?
0
[root@localhost ~]# logout

Red Hat Enterprise Linux Server 7.5 (Maipo)
Kernel 3.10.0-862.el7.x86_64 on an x86_64

localhost login: spawn virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]

Red Hat Enterprise Linux Server 7.5 (Maipo)
Kernel 3.10.0-862.el7.x86_64 on an x86_64

localhost login: root
Password:
Last login: Tue Oct  9 10:30:13 on ttyS0
[root@localhost ~]# curl -g [300::1]:8000 >> log3.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100     9  100     9    0     0   1428      0 --:--:-- --:--:-- --:--:--  1800
```

Job link: https://beaker.engineering.redhat.com/jobs/2911725

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3500
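As a supplementary check when verifying a build that carries the fix, the presence of the IPv6 un-DNAT rule described above can be confirmed by dumping the router's logical flows. This is a sketch rather than part of the recorded verification; it assumes the router name `r1` from the reproducer and that `lr_out_undnat` is the ovn-northd stage handling un-DNAT of load-balancer return traffic:

```
# Sketch: list the logical flows for router r1 and inspect the egress
# un-DNAT stage. With the fix applied, flows matching ip6 return traffic
# from the backends (e.g. source [2001:db8:103::11]:80) should appear here
# and apply ct_dnat, alongside the pre-existing ip4 flows.
ovn-sbctl lflow-list r1 | grep lr_out_undnat
```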