Description of problem:

An OVN load balancer can be applied to multiple OVN logical switches and/or logical routers. Currently this is done by adding the load balancer UUID to the corresponding Logical_Switch.load_balancer/Logical_Router.load_balancer set in the Northbound database. In specific scenarios (e.g., ovn-kubernetes) a significant number of load balancers must be applied to the same set of logical switches/routers (e.g., all the node switches and all the node gateway routers). In these cases, having a way to group switches/routers would simplify configuration to something like:
- create an LS-group containing the UUIDs of all node logical switches
- create an LR-group containing the UUIDs of all node gateway logical routers
- apply the load balancer to the LS-group and to the LR-group

Such a change will not affect the way load balancers are represented in the Southbound database. It has a number of advantages:
- it will reduce complexity on the CMS side
- it will reduce network traffic when adding a new service/load balancer
- it will reduce the NBDB in-memory transaction log (smaller transaction JSONs)
- it should reduce NBDB CPU usage when processing new load balancers
- it might make further optimizations in ovn-northd simpler when datapath groups are enabled
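For context, a minimal sketch of the current model, where the same load balancer has to be attached to every datapath one by one. The node-switch-*/node-gw-router-* names and the VIP/backends are hypothetical; ls-lb-add/lr-lb-add are the existing ovn-nbctl commands for the Logical_Switch/Logical_Router load_balancer columns:

# Create one load balancer (hypothetical VIP and backends).
lb=$(ovn-nbctl create load_balancer vips:30.0.0.1="172.16.0.103,172.16.0.102")
# The same LB must be added to each node switch ...
for ls in node-switch-1 node-switch-2 node-switch-3; do
    ovn-nbctl ls-lb-add $ls $lb
done
# ... and to each node gateway router individually.
for lr in node-gw-router-1 node-gw-router-2 node-gw-router-3; do
    ovn-nbctl lr-lb-add $lr $lb
done

With thousands of services, every new load balancer means one NB update per switch and per router.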
Further investigation shows that it's even more beneficial to move to load balancer groups instead of logical switch/router groups. Changing the subject accordingly.
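With load balancer groups, the CMS creates one Load_Balancer_Group, fills it with load balancers, and references the single group from each switch/router. A minimal sketch, using hypothetical datapath names and the same generic ovn-nbctl create/add commands exercised in the verification below:

# Create the group and add load balancers to it once (names are hypothetical).
lb1=$(ovn-nbctl create load_balancer vips:30.0.0.1="172.16.0.103,172.16.0.102")
lbg=$(ovn-nbctl create load_balancer_group name=cluster_lbs)
ovn-nbctl add load_balancer_group $lbg load_balancer $lb1
# Each datapath references the single group instead of every individual LB.
ovn-nbctl add logical_switch node-switch-1 load_balancer_group $lbg
ovn-nbctl add logical_router node-gw-router-1 load_balancer_group $lbg
# Adding a new service later only touches the group row:
# ovn-nbctl add load_balancer_group $lbg load_balancer $new_lb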
V1 up for review: http://patchwork.ozlabs.org/project/ovn/list/?series=265023&state=*
Accepted upstream, will be available in 21.12 (downstream ovn-2021-21.12).
Was backported upstream to v21.09.1 and downstream to ovn-2021-21.09.1-20.el8fdp.
test on version:
# rpm -qa | grep ovn
ovn-2021-central-21.09.1-23.el8fdp.x86_64
ovn-2021-21.09.1-23.el8fdp.x86_64
ovn-2021-host-21.09.1-23.el8fdp.x86_64

test script:

ovn-nbctl ls-add public
uuid1=`ovn-nbctl create load_balancer vips:30.0.0.1="172.16.0.103,172.16.0.102"`
uuid2=`ovn-nbctl create load_balancer vips:30.0.0.2="172.16.0.103,172.16.0.101"`
uuid3=`ovn-nbctl create load_balancer vips:30.0.0.3="172.16.0.102,172.16.0.101"`
uuid4=`ovn-nbctl create load_balancer vips:30.0.0.4="172.16.0.103,172.16.0.102"`
lbg=$(ovn-nbctl create load_balancer_group name=lbg -- \
    add load_balancer_group lbg load_balancer $uuid1 -- \
    add load_balancer_group lbg load_balancer $uuid2 -- \
    add load_balancer_group lbg load_balancer $uuid3 -- \
    add load_balancer_group lbg load_balancer $uuid4)
ovn-nbctl --wait=sb add logical_switch public load_balancer_group $lbg

# r1
i=1
for m in `seq 0 9`; do
  for n in `seq 1 99`; do
    ovn-nbctl lr-add r${i}
    ovn-nbctl lrp-add r${i} r${i}_public 00:de:ad:ff:$m:$n 172.16.$m.$n/16
    ovn-nbctl lrp-add r${i} r${i}_s${i} 00:de:ad:fe:$m:$n 173.$m.$n.1/24
    ovn-nbctl lr-nat-add r${i} dnat_and_snat 172.16.${m}.$((n+100)) 173.$m.$n.2
    ovn-nbctl lrp-set-gateway-chassis r${i}_public hv1
    ovn-nbctl --wait=sb add logical_router r${i} load_balancer_group $lbg

    # s1
    ovn-nbctl ls-add s${i}
    ovn-nbctl --wait=sb add logical_switch s${i} load_balancer_group $lbg

    # s1 - r1
    ovn-nbctl lsp-add s${i} s${i}_r${i}
    ovn-nbctl lsp-set-type s${i}_r${i} router
    ovn-nbctl lsp-set-addresses s${i}_r${i} "00:de:ad:fe:$m:$n 173.$m.$n.1"
    ovn-nbctl lsp-set-options s${i}_r${i} router-port=r${i}_s${i}

    # s1 - vm1
    ovn-nbctl lsp-add s$i vm$i
    ovn-nbctl lsp-set-addresses vm$i "00:de:ad:01:$m:$n 173.$m.$n.2"

    ovn-nbctl lrp-add r$i r${i}_public 40:44:00:00:$m:$n 172.16.$m.$n/16
    ovn-nbctl lsp-add public public_r${i}
    ovn-nbctl lsp-set-type public_r${i} router
    ovn-nbctl lsp-set-addresses public_r${i} router
    ovn-nbctl lsp-set-options public_r${i} router-port=r${i}_public nat-addresses=router

    let i++
    if [ $i -gt 3 ]; then break; fi
  done
  if [ $i -gt 3 ]; then break; fi
done

ovn-nbctl lsp-add public ln_p1
ovn-nbctl lsp-set-addresses ln_p1 unknown
ovn-nbctl lsp-set-type ln_p1 localnet
ovn-nbctl lsp-set-options ln_p1 network_name=nattest

ovn-nbctl show
ovn-sbctl show
ovs-vsctl show

rlRun "ovn-sbctl dump-flows|grep 'table=4 (lr_in_unsnat ), priority=0 , match=(1), action=(next;)'"
rlRun "ovn-sbctl dump-flows|grep 'table=0.*ls_out_pre_lb.*priority=100.*match=(ip)'"

# add host vm1
ip netns add vm1
ovs-vsctl add-port br-int vm1 -- set interface vm1 type=internal
ip link set vm1 netns vm1
ip netns exec vm1 ip link set vm1 address 00:de:ad:01:00:01
ip netns exec vm1 ip addr add 173.0.1.2/24 dev vm1
ip netns exec vm1 ip link set vm1 up
ovs-vsctl set Interface vm1 external_ids:iface-id=vm1

ip netns add vm2
ovs-vsctl add-port br-int vm2 -- set interface vm2 type=internal
ip link set vm2 netns vm2
ip netns exec vm2 ip link set vm2 address 00:de:ad:01:00:02
ip netns exec vm2 ip addr add 173.0.2.2/24 dev vm2
ip netns exec vm2 ip link set vm2 up
ovs-vsctl set Interface vm2 external_ids:iface-id=vm2

ip netns add vm3
ovs-vsctl add-port br-int vm3 -- set interface vm3 type=internal
ip link set vm3 netns vm3
ip netns exec vm3 ip link set vm3 address 00:de:ad:01:00:03
ip netns exec vm3 ip addr add 173.0.3.2/24 dev vm3
ip netns exec vm3 ip link set vm3 up
ovs-vsctl set Interface vm3 external_ids:iface-id=vm3

# set provider network
ovs-vsctl add-br nat_test
ip link set nat_test up
ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=nattest:nat_test
#ovs-vsctl add-port nat_test $nic_test2
#ip link set $nic_test2 up

ip netns add vm0
ovs-vsctl add-port nat_test vm0 -- set interface vm0 type=internal
ip link set vm0 netns vm0
ip netns exec vm0 ip link set vm0 address 00:00:00:00:00:01
ip netns exec vm0 ip addr add 172.16.0.100/16 dev vm0
ip netns exec vm0 ip link set vm0 up
ovs-vsctl set Interface vm0 external_ids:iface-id=vm0

ip netns exec vm1 ip route add default via 173.0.1.1
ip netns exec vm2 ip route add default via 173.0.2.1
ip netns exec vm3 ip route add default via 173.0.3.1

ovn-nbctl lr-nat-del r1 dnat_and_snat 172.16.0.101
ovn-nbctl lr-nat-add r1 dnat_and_snat 172.16.0.101 173.0.1.2 vm1 00:00:00:01:02:03

Then ping the VIPs; all pass:
ip netns exec vm1 ping 30.0.0.1 -c 5
ip netns exec vm2 ping 30.0.0.2 -c 5
ip netns exec vm3 ping 30.0.0.3 -c 5
ip netns exec vm2 ping 30.0.0.4 -c 5
ip netns exec vm3 ping 30.0.0.1 -c 5

# delete one of the load balancers in the LB group
ovn-nbctl lb-del $uuid1
ovn-nbctl --wait=sb sync
ovn-nbctl --wait=hv sync

Then ping the VIPs; 30.0.0.1 fails, the others pass:
ip netns exec vm1 ping 30.0.0.1 -c 5
PING 30.0.0.1 (30.0.0.1) 56(84) bytes of data.
--- 30.0.0.1 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 84ms

ip netns exec vm2 ping 30.0.0.2 -c 5
ip netns exec vm3 ping 30.0.0.3 -c 5
ip netns exec vm2 ping 30.0.0.4 -c 5
ip netns exec vm3 ping 30.0.0.1 -c 5

# add vip 30.0.0.1 to a load balancer in the LB group
ovn-nbctl set load-balancer $uuid2 vips:"30.0.0.1"="172.16.0.103,172.16.0.102"

Then ping the VIPs; all pass.

# delete the LB group from the logical switches and routers
ovn-nbctl clear logical_switch s1 load_balancer_group
ovn-nbctl clear logical_switch s2 load_balancer_group
ovn-nbctl clear logical_switch s3 load_balancer_group
ovn-nbctl clear logical_router r1 load_balancer_group
ovn-nbctl clear logical_router r2 load_balancer_group
ovn-nbctl clear logical_router r3 load_balancer_group

Then ping the VIPs; all fail.

# add the LB group back to the logical switches
ovn-nbctl add logical_switch s1 load_balancer_group $lbg
ovn-nbctl add logical_switch s2 load_balancer_group $lbg
ovn-nbctl add logical_switch s3 load_balancer_group $lbg

Then ping the VIPs; all pass.

Restart northd, then ping the VIPs; all pass.

I also created more than 100 logical switches and routers with the above script, and the test passed as well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0049