Description of problem:
The control flow and the data flow of the same program are load balanced to different destinations.

Version-Release number of selected component (if applicable):
ovn2.13.0-11

How reproducible:
Always

Steps to Reproduce:
1. set up a load balancer: VIP 30.0.0.1, backends 192.168.1.1,192.168.1.3
2. start netserver on 192.168.1.1 and 192.168.1.3
3. start netperf on another host repeatedly

Actual results:
netperf sometimes fails with the error:
netperf: send_omni: connect_data_socket failed: Connection refused

Expected results:
netperf should not fail

Additional info:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.48.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.48.25
systemctl restart ovn-controller

ip netns add server0
ip link add veth0_s0 netns server0 type veth peer name veth0_s0_p
ip netns exec server0 ip link set lo up
ip netns exec server0 ip link set veth0_s0 up
ip netns exec server0 ip link set veth0_s0 address 00:00:00:01:01:02
ip netns exec server0 ip addr add 192.168.1.1/24 dev veth0_s0
ip netns exec server0 ip -6 addr add 2001::1/64 dev veth0_s0
ip netns exec server0 ip route add default via 192.168.1.254 dev veth0_s0
ip netns exec server0 ip -6 route add default via 2001::a dev veth0_s0
ovs-vsctl add-port br-int veth0_s0_p
ip link set veth0_s0_p up
ovs-vsctl set interface veth0_s0_p external_ids:iface-id=ls1p1

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
#ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02 2001::1 192.168.1.1"
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:02 192.168.1.1 2001::1"
ovn-nbctl lsp-add ls1 ls1p2
ovn-nbctl lsp-set-addresses ls1p2 "00:00:00:01:02:02 192.168.1.2 2001::2"

ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 192.168.1.254/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1
ovn-nbctl lsp-set-addresses ls1-lr1 "00:00:00:00:00:01 192.168.1.254 2001::a"
ovn-nbctl lsp-set-type ls1-lr1 router
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1

ovn-nbctl lrp-add lr1 lr1-ls2 00:00:00:00:00:02 192.168.2.254/24 2002::a/64
ovn-nbctl ls-add ls2
ovn-nbctl lsp-add ls2 ls2-lr1
ovn-nbctl lsp-set-addresses ls2-lr1 "00:00:00:00:00:02 192.168.2.254 2002::a"
ovn-nbctl lsp-set-type ls2-lr1 router
ovn-nbctl lsp-set-options ls2-lr1 router-port=lr1-ls2
ovn-nbctl lsp-add ls2 ls2p1
ovn-nbctl lsp-set-addresses ls2p1 "00:00:00:02:01:02 192.168.2.1 2002::1"

ovn-nbctl lsp-add ls1 ls1p3
ovn-nbctl lsp-set-addresses ls1p3 "00:00:00:01:03:02 192.168.1.3 2001::3"

ip netns add server2
ip link add veth0_s2 netns server2 type veth peer name veth0_s2_p
ip netns exec server2 ip link set lo up
ip netns exec server2 ip link set veth0_s2 up
ip netns exec server2 ip link set veth0_s2 address 00:00:00:01:03:02
ip netns exec server2 ip addr add 192.168.1.3/24 dev veth0_s2
ip netns exec server2 ip -6 addr add 2001::3/64 dev veth0_s2
ip netns exec server2 ip route add default via 192.168.1.254 dev veth0_s2
ip netns exec server2 ip -6 route add default via 2001::a dev veth0_s2
ovs-vsctl add-port br-int veth0_s2_p
ip link set veth0_s2_p up
ovs-vsctl set interface veth0_s2_p external_ids:iface-id=ls1p3

ip netns add client0
ip link add veth0_c0 netns client0 type veth peer name veth0_c0_p
ip netns exec client0 ip link set lo up
ip netns exec client0 ip link set veth0_c0 up
ip netns exec client0 ip link set veth0_c0 address 00:00:00:02:01:02
ip netns exec client0 ip addr add 192.168.2.1/24 dev veth0_c0
ip netns exec client0 ip -6 addr add 2002::1/64 dev veth0_c0
ip netns exec client0 ip route add default via 192.168.2.254 dev veth0_c0
ip netns exec client0 ip -6 route add default via 2002::a dev veth0_c0
ovs-vsctl add-port br-int veth0_c0_p
ip link set veth0_c0_p up
ovs-vsctl set interface veth0_c0_p external_ids:iface-id=ls2p1

ovn-nbctl set logical_router lr1 options:chassis=hv1
ovn-nbctl lb-add lb0 30.0.0.1 192.168.1.1,192.168.1.3
ovn-nbctl lr-lb-add lr1 lb0

ip netns exec server0 netserver -d
ip netns exec server2 netserver -d

for i in `seq 1 10`
do
    if ! ip netns exec client0 netperf -4 -H 30.0.0.1 -t TCP_STREAM -l 1
    then
        echo "FAIL"
        break
    fi
done

[root@dell-per740-42 ~]# rpm -qa | grep -E "openvswitch|ovn"
ovn2.13-central-2.13.0-11.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-23.el8fdp.noarch
kernel-kernel-networking-openvswitch-ovn_ha-1.0-55.noarch
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
openvswitch2.13-2.13.0-9.el8fdp.x86_64
python3-openvswitch2.13-2.13.0-9.el8fdp.x86_64
ovn2.13-2.13.0-11.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-load_balance-1.0-4.noarch
ovn2.13-host-2.13.0-11.el8fdp.x86_64

[root@dell-per740-42 ~]# ip netns exec client0 netperf -4 -H 30.0.0.1 -t TCP_STREAM -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 30.0.0.1 () port 0 AF_INET
netperf: send_omni: connect_data_socket failed: Connection refused

tcpdump on 192.168.1.1 and 192.168.1.3 when netperf fails:

[root@dell-per740-42 test]# ip netns exec server2 tcpdump -i veth0_s2 -nnle -v -c 5
tcpdump: listening on veth0_s2, link-type EN10MB (Ethernet), capture size 262144 bytes
21:57:29.042351 00:00:00:00:00:01 > 00:00:00:01:03:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 63, id 14920, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.2.1.51397 > 192.168.1.3.12865: Flags [S], cksum 0x1216 (correct), seq 233147885, win 29200, options [mss 1460,sackOK,TS val 231643869 ecr 0,nop,wscale 7], length 0
21:57:29.042399 00:00:00:01:03:02 > 00:00:00:00:00:01, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.3.12865 > 192.168.2.1.51397: Flags [S.], cksum 0x8483 (incorrect -> 0x3288), seq 3973882037, ack 233147886, win 28960, options [mss 1460,sackOK,TS val 2788076715 ecr 231643869,nop,wscale 7], length 0
21:57:29.043062 00:00:00:00:00:01 > 00:00:00:01:03:02, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 63, id 14921, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.2.1.51397 > 192.168.1.3.12865: Flags [.], cksum 0xd18e (correct), ack 1, win 229, options [nop,nop,TS val 231643870 ecr 2788076715], length 0
21:57:29.044163 00:00:00:00:00:01 > 00:00:00:01:03:02, ethertype IPv4 (0x0800), length 722: (tos 0x0, ttl 63, id 14922, offset 0, flags [DF], proto TCP (6), length 708)
    192.168.2.1.51397 > 192.168.1.3.12865: Flags [P.], cksum 0x1e0b (correct), seq 1:657, ack 1, win 229, options [nop,nop,TS val 231643871 ecr 2788076715], length 656
21:57:29.044187 00:00:00:01:03:02 > 00:00:00:00:00:01, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 64, id 5905, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.1.3.12865 > 192.168.2.1.51397: Flags [.], cksum 0x847b (incorrect -> 0xcef3), ack 657, win 237, options [nop,nop,TS val 2788076717 ecr 231643871], length 0
5 packets captured
5 packets received by filter
0 packets dropped by kernel

<=== the control connection is load balanced to 192.168.1.3, which negotiates the port for the data connection

[root@dell-per740-42 test]# ip netns exec server0 tcpdump -i veth0_s0 -nnle -v -c 5
tcpdump: listening on veth0_s0, link-type EN10MB (Ethernet), capture size 262144 bytes
21:57:29.050209 00:00:00:00:00:01 > 00:00:00:01:01:02, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 63, id 55888, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.2.1.45127 > 192.168.1.1.39151: Flags [S], cksum 0x88d1 (correct), seq 1759735293, win 29200, options [mss 1460,sackOK,TS val 231643877 ecr 0,nop,wscale 7], length 0
21:57:29.050238 00:00:00:01:01:02 > 00:00:00:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.1.1.39151 > 192.168.2.1.45127: Flags [R.], cksum 0x0b65 (correct), seq 0, ack 1759735294, win 0, length 0

<=== the data connection is load balanced to 192.168.1.1 and sent to port 39151, but port 39151 is not open on 192.168.1.1, so the TCP connection is reset
This bug depends on https://bugzilla.redhat.com/show_bug.cgi?id=1707513 . OVN currently uses conntrack to ensure that traffic for a connection keeps going to the same load balancer backend. Conntrack keys on a 4-tuple (source IP, source port, destination IP, destination port), so two connections from the same client that differ only in their ports can land on different backends. Allowing a different hashing method would let us send traffic from the same IP to the same destination for a period of time. This way, separate signaling and data connections can be sent to the same destination.

As a short-term workaround, consider using the -N option to netperf if possible. It disables some netperf features, but it may still let you collect the data you need.
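
To illustrate the difference between the two hashing approaches, here is a minimal shell sketch. It is only an analogy: cksum stands in for the real hash, and the function names (pick_backend, select_by_tuple, select_by_src_dst) are hypothetical and not part of OVN or OVS. The addresses and ports are the ones from the tcpdump in the description.

```shell
#!/bin/sh
# Illustrative sketch only. cksum is a stand-in hash; it is NOT the
# hash function OVN/OVS use. All function names here are hypothetical.

BACKENDS="192.168.1.1 192.168.1.3"

# pick_backend KEY: hash KEY and print one backend from $BACKENDS.
pick_backend() {
    h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    # Load the backends into the positional parameters and skip
    # (h mod number-of-backends) of them.
    set -- $BACKENDS
    shift $((h % $#))
    echo "$1"
}

# Hash on all four fields: src ip, src port, dst ip, dst port.
select_by_tuple() {
    pick_backend "$1:$2>$3:$4"
}

# Hash on ip_src and ip_dst only; the port arguments are ignored.
select_by_src_dst() {
    pick_backend "$1>$3"
}

# Control connection (to netserver port 12865) and data connection
# (to the negotiated port 39151), both from client 192.168.2.1.
echo "4-tuple control: $(select_by_tuple 192.168.2.1 51397 30.0.0.1 12865)"
echo "4-tuple data:    $(select_by_tuple 192.168.2.1 45127 30.0.0.1 39151)"
echo "src,dst control: $(select_by_src_dst 192.168.2.1 51397 30.0.0.1 12865)"
echo "src,dst data:    $(select_by_src_dst 192.168.2.1 45127 30.0.0.1 39151)"
```

With the 4-tuple key the two connections may or may not pick the same backend, since the ports feed into the hash. With the ip_src,ip_dst key both connections hash the same input, so they always pick the same backend.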
Yeah the BZ 1707513 will address this use case.
To test this, set the load balancer's selection_fields to ip_src and ip_dst. See the commit: https://github.com/ovn-org/ovn/commit/5af304e7478adcf5ac50ed41e96a55bebebff3e8#diff-9878c4b324a3aff23893a548d48354fd
Verified on ovn2.13.0-33:

Set up the environment following the steps in the description, then add selection_fields to the load balancer:

[root@dell-per740-12 bz1825037]# ovn-nbctl list load_balancer
_uuid               : 2002c903-196d-4461-8bf3-70e8a104cfeb
external_ids        : {}
health_check        : []
ip_port_mappings    : {}
name                : lb0
protocol            : tcp
selection_fields    : []
vips                : {"30.0.0.1"="192.168.1.1,192.168.1.3"}

[root@dell-per740-12 bz1825037]# ovn-nbctl set load_balancer 2002c903-196d-4461-8bf3-70e8a104cfeb selection_fields="ip_src,ip_dst"

[root@dell-per740-12 bz1825037]# ovn-nbctl list load_balancer
_uuid               : 2002c903-196d-4461-8bf3-70e8a104cfeb
external_ids        : {}
health_check        : []
ip_port_mappings    : {}
name                : lb0
protocol            : tcp
selection_fields    : [ip_dst, ip_src]
vips                : {"30.0.0.1"="192.168.1.1,192.168.1.3"}

[root@dell-per740-12 bz1825037]# ip netns exec client0 netperf -4 -H 30.0.0.1 -t TCP_STREAM -l 1
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 30.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    1.00     10281.27

<=== ran netperf 10 times, all passed
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2941