+++ This bug was initially created as a clone of Bug #1719291 +++

When trying to reach the FIP of an OVN Load Balancer, traffic never gets delivered to a member. However, it works with the fixed IP.

--- Additional comment from Daniel Alvarez Sanchez on 2019-06-11 12:56:52 UTC ---

Just to expand a bit after a conversation with Numan: the problem is that we can see the DNAT action happening from the FIP to the VIP of the LB, but a second DNAT action is missing to translate from the VIP to the member IP address.

--- Additional comment from Numan Siddique on 2019-06-24 11:23:53 UTC ---

These patches address this issue: https://review.opendev.org/#/c/667027 and https://review.opendev.org/#/c/667028/

--- Additional comment from RHEL Program Management on 2019-06-27 09:56:00 UTC ---

This item has been properly Triaged and planned for the release, and Target Release is now set to match the release flag. For details, see https://mojo.redhat.com/docs/DOC-1144661#jive_content_id_OSP_Release_Planning

--- Additional comment from Maciej Józefczyk on 2019-07-22 10:04:08 UTC ---

Target release for this change is OSP15 because the Kuryr team will use Octavia from OSP15 on an OSP13 deployment.

--- Additional comment from RHEL Program Management on 2019-07-22 10:04:14 UTC ---

This bugzilla has been removed from the release since it does not have an acked release flag. For details, see https://mojo.redhat.com/docs/DOC-1144661#jive_content_id_OSP_Release_Planning

--- Additional comment from Maciej Józefczyk on 2019-08-07 06:28:04 UTC ---

The fix has been merged to upstream stable/stein: https://review.opendev.org/#/c/672647/ Waiting for sync.

--- Additional comment from errata-xmlrpc on 2019-12-03 17:11:45 UTC ---

Bug report changed to ON_QA status by Errata System. A QE request has been submitted for advisory RHBA-2019:49192-01 https://errata.devel.redhat.com/advisory/49192

--- Additional comment from Roman Safronov on 2019-12-23 13:34:18 UTC ---

According to Carlos (cgoncalves), OSP 15 + Octavia + OVN provider driver is not supported. https://review.opendev.org/#/c/696139/ hasn't been included in OSP 15 yet, so it is not possible to create a load balancer with the OVN provider on OSP15. The fix should be verified on OSP16.

--- Additional comment from Scott Lewis on 2020-01-02 17:08:40 UTC ---

This item has been properly Triaged and planned for the appropriate release, and is being tagged for tracking.

--- Additional comment from Roman Safronov on 2020-01-06 11:26:50 UTC ---

Tried to verify on 16.0-RHEL-8/RHOS_TRUNK-16.0-RHEL-8-20191217.n.1 with python3-networking-ovn-7.0.1-0.20191205040313.2ef5322.el8ost.noarch

Still no ping from a FIP assigned to an OVN Load Balancer VIP.

According to Maciej, everything was set as expected: ACLs are set, the request is passed to the client, and the client ACKed the TCP connection, but the ACK is not sent back via the tunnel to the client.

Note: the tested environment had DVR enabled.

--- Additional comment from Maciej Józefczyk on 2020-01-07 08:14:05 UTC ---

After debugging with Numan, it looks like it's a regression in core OVN. If a member IP also has a FIP attached, the traffic is not routed back via the LB FIP.

Numan, could you please add and link a BZ related to the issue in OVN? Thanks!
The fix is available in 2.11.1-29. This should be available in FDP 20.B.
Hi Numan, could you suggest how to reproduce the issue? Thanks.
Hi Jianlin,

Create a load balancer, for example: 10.0.0.4:80="10.0.0.10:80,20.0.0.10:80"

Then create a dnat_and_snat entry for the logical port (which has 10.0.0.4). Suppose the provider network CIDR is 172.16.0.0/24; then 10.0.0.4 <-> 172.16.0.4.

Then update the LB with another VIP: "172.16.0.4:80=10.0.0.10:80,20.0.0.10:80".

From an external host, curl 172.16.0.4:80 and it should work.

Let me know if you have more questions.
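The steps above could be sketched with ovn-nbctl roughly as follows. This is only an illustration: the load balancer name `lb0`, the router name `lr0`, the `run` wrapper, and attaching the load balancer to the router are assumptions, not part of the comment. By default the function only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch of the reproduction steps; names marked "assumed" are illustrative.
LB=lb0                              # assumed load balancer name
LR=lr0                              # assumed logical router name
VIP=10.0.0.4                        # internal VIP (from the comment)
FIP=172.16.0.4                      # floating IP on the provider network (from the comment)
MEMBERS=10.0.0.10:80,20.0.0.10:80   # members (from the comment)

# Print each command instead of running it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

repro_lb_fip() {
    # 1. Load balancer with the internal VIP.
    run ovn-nbctl lb-add "$LB" "$VIP:80" "$MEMBERS"
    # 2. dnat_and_snat entry mapping the FIP to the VIP.
    run ovn-nbctl lr-nat-add "$LR" dnat_and_snat "$FIP" "$VIP"
    # 3. Second VIP on the same load balancer, keyed on the FIP.
    run ovn-nbctl lb-add "$LB" "$FIP:80" "$MEMBERS"
    # 4. Attach the load balancer to the router (assumed attachment point).
    run ovn-nbctl lr-lb-add "$LR" "$LB"
    # 5. Then, from an external host: curl 172.16.0.4:80
}
```

Calling `repro_lb_fip` without RUN=1 prints the four commands so they can be reviewed before being applied to a real northbound database.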
Reproduced on ovn2.11.1-20 with the following steps:

On one system:

#!/bin/bash
systemctl restart openvswitch
systemctl restart ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external-ids:system-id=hv0 external-ids:ovn-remote=tcp:20.0.30.26:6642 external-ids:ovn-encap-type=geneve external-ids:ovn-encap-ip=20.0.30.26
systemctl restart ovn-controller

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 00:01:02:01:01:01
ovn-nbctl lsp-add ls1 ls1p2
ovn-nbctl lsp-set-addresses ls1p2 00:01:02:01:01:02

ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip netns add server0
ip link set vm1 netns server0
ip netns exec server0 ip link set lo up
ip netns exec server0 ip link set vm1 up
ip netns exec server0 ip link set vm1 address 00:01:02:01:01:01
ip netns exec server0 ip addr add 192.168.0.1/24 dev vm1
ovs-vsctl set Interface vm1 external_ids:iface-id=ls1p1

ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip netns add server1
ip link set vm2 netns server1
ip netns exec server1 ip link set lo up
ip netns exec server1 ip link set vm2 up
ip netns exec server1 ip link set vm2 address 00:01:02:01:01:02
ip netns exec server1 ip addr add 192.168.0.2/24 dev vm2
ovs-vsctl set Interface vm2 external_ids:iface-id=ls1p2

ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1ls1 00:01:02:0d:01:01 192.168.0.254/24
ovn-nbctl lsp-add ls1 ls1lr1
ovn-nbctl lsp-set-type ls1lr1 router
ovn-nbctl lsp-set-options ls1lr1 router-port=lr1ls1
ovn-nbctl lsp-set-addresses ls1lr1 "00:01:02:0d:01:01 192.168.0.254"

ovn-nbctl lrp-add lr1 lr1p 00:01:02:0d:0f:01 172.16.1.254/24
ovn-nbctl ls-add public
ovn-nbctl lsp-add public plr1
ovn-nbctl lsp-set-type plr1 router
ovn-nbctl lsp-set-options plr1 router-port=lr1p
ovn-nbctl lsp-set-addresses plr1 "00:01:02:0d:0f:01 172.16.1.254"
ovn-nbctl lsp-add public ln_public
ovn-nbctl lsp-set-type ln_public localnet
ovn-nbctl lsp-set-addresses ln_public unknown
ovn-nbctl lsp-set-options ln_public network_name=provider

ovs-vsctl add-br br-provider
ovs-vsctl set open . external-ids:ovn-bridge-mappings=provider:br-provider
ip link set br-provider up

#ovn-nbctl set logical_router_port lr1p options:redirect-chassis=hv0
ovn-nbctl lrp-set-gateway-chassis lr1p hv0 20
ovn-nbctl lrp-set-gateway-chassis lr1p hv1 10

ovn-nbctl lb-add lb0 192.168.2.1:80 192.168.0.1:80,192.168.0.2:80
ovn-nbctl lb-add lb0 172.16.1.10:80 192.168.0.1:80,192.168.0.2:80
ovn-nbctl lr-lb-add lr1 lb0

ip netns add client0
ip link add veth0_c0 type veth peer name veth0_c0_p
ip link set veth0_c0 netns client0
ip netns exec client0 ip link set lo up
ip netns exec client0 ip link set veth0_c0 up
ip netns exec client0 ip addr add 172.16.1.1/24 dev veth0_c0
ip netns exec client0 ip route add default via 172.16.1.254
ovs-vsctl add-port br-provider veth0_c0_p
ip link set veth0_c0_p up
ovs-vsctl add-port br-provider ens4f4d1

ip netns exec server0 ip route add default via 192.168.0.254
ip netns exec server1 ip route add default via 192.168.0.254

ovn-nbctl lr-nat-add lr1 dnat_and_snat 172.16.1.10 192.168.2.1

On the other system:

#!/bin/bash
systemctl restart openvswitch
ovs-vsctl set open . external-ids:system-id=hv1 external-ids:ovn-remote=tcp:20.0.30.26:6642 external-ids:ovn-encap-type=geneve external-ids:ovn-encap-ip=20.0.30.25
systemctl restart ovn-controller

ovs-vsctl add-br br-provider
ovs-vsctl set open . external-ids:ovn-bridge-mappings=provider:br-provider
ip link set br-provider up

ip netns add client1
ip link add veth0_c1 type veth peer name veth0_c1_p
ip link set veth0_c1 netns client1
ip netns exec client1 ip link set lo up
ip netns exec client1 ip link set veth0_c1 up
ip netns exec client1 ip addr add 172.16.1.2/24 dev veth0_c1
ip netns exec client1 ip route add default via 172.16.1.254
ovs-vsctl add-port br-provider veth0_c1_p
ip link set veth0_c1_p up
ovs-vsctl add-port br-provider p4p2

After setup on ovn2.11.1-20, run arping on client0:

ip netns exec client0 arping 172.16.1.10 -c 1

Two ARP replies are received:

[root@hp-dl380pg8-12 ovn2.11.1-20]# ip netns exec client0 tcpdump -i any -nnle arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
07:42:13.361136 Out 1a:87:42:f6:89:5c ethertype ARP (0x0806), length 44: Request who-has 172.16.1.10 (ff:ff:ff:ff:ff:ff) tell 172.16.1.1, length 28
07:42:13.361400 In 00:01:02:0d:0f:01 ethertype ARP (0x0806), length 44: Reply 172.16.1.10 is-at 00:01:02:0d:0f:01, length 28
07:42:13.361882 In 00:01:02:0d:0f:01 ethertype ARP (0x0806), length 62: Reply 172.16.1.10 is-at 00:01:02:0d:0f:01, length 46    <=== two ARP replies

Verified on 2.11.1-32:

[root@hp-dl380pg8-12 bz1788456]# ip netns exec client0 arping 172.16.1.10 -c 1
ARPING 172.16.1.10 from 172.16.1.1 veth0_c0
Unicast reply from 172.16.1.10 [00:01:02:0D:0F:01]  0.759ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
[root@hp-dl380pg8-12 bz1788456]# ^C
[root@hp-dl380pg8-12 bz1788456]# rpm -qa | grep -E "openvswitch|ovn"
ovn2.11-2.11.1-32.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-14.el7fdp.noarch
ovn2.11-central-2.11.1-32.el7fdp.x86_64
openvswitch2.11-2.11.0-35.el7fdp.x86_64
ovn2.11-host-2.11.1-32.el7fdp.x86_64

[root@hp-dl380pg8-12 ovn2.11.1-32]# ip netns exec client0 tcpdump -i any -nnle arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
07:57:09.962138 Out be:a7:5e:a6:fd:a5 ethertype ARP (0x0806), length 44: Request who-has 172.16.1.10 (ff:ff:ff:ff:ff:ff) tell 172.16.1.1, length 28
07:57:09.962378 In 00:01:02:0d:0f:01 ethertype ARP (0x0806), length 44: Reply 172.16.1.10 is-at 00:01:02:0d:0f:01, length 28    <=== only one ARP reply received
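
The pass/fail criterion used here (two ARP replies for 172.16.1.10 before the fix, one after) could be checked mechanically. A minimal sketch, assuming the tcpdump text output is piped in on stdin; the helper name `count_arp_replies` is hypothetical and not part of the original test:

```shell
#!/bin/sh
# Hypothetical helper: count ARP reply lines for the IP given as $1
# in tcpdump text output read from stdin.
count_arp_replies() {
    # -c prints the number of matching lines; -F matches the pattern literally.
    # grep exits non-zero when the count is 0, so mask the status.
    grep -cF "Reply $1 is-at" || true
}
```

For example, saving the capture shown above to a file and running `count_arp_replies 172.16.1.10 < capture.txt` would be expected to print 2 on the affected build and 1 on the fixed one (`capture.txt` is an assumed file name).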
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0750