Bug 1703162 - [RFE] Support for load balancing health checks in OVN
Summary: [RFE] Support for load balancing health checks in OVN
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.12
Version: FDP 20.A
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-25 16:48 UTC by Carlos Goncalves
Modified: 2023-09-14 05:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1784081 (view as bug list)
Environment:
Last Closed: 2020-01-21 06:20:15 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:0167 0 None None None 2020-01-21 06:20:23 UTC

Description Carlos Goncalves 2019-04-25 16:48:34 UTC
Health checks are periodic connection attempts that verify whether load balancer members are healthy. Unhealthy members should be marked as out of service, and incoming traffic should not be forwarded to them.

Presently, OVN does not support health checks for load balancing members, so traffic can be forwarded to unhealthy members.

This BZ requests an enhancement to OVN to support health checks for members. TCP probably has higher priority than UDP, as it is more commonly used by applications.
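
To make the requested behaviour concrete, here is a minimal sketch of how a series of probe results could be folded into a member status. The function name `member_status` and the rule "flip the status only after N consecutive identical results" are assumptions made for illustration; they are not taken from the OVN implementation.

```shell
# Hypothetical helper: fold probe results into a member status.
# Args: success_count failure_count result...   (each result is "ok" or "fail")
# Assumption: the status changes only after that many consecutive identical results.
member_status() {
    local need_ok=$1 need_fail=$2
    shift 2
    local status="unknown" ok_run=0 fail_run=0 r
    for r in "$@"; do
        if [ "$r" = "ok" ]; then
            ok_run=$((ok_run + 1)); fail_run=0
            if [ "$ok_run" -ge "$need_ok" ]; then status="online"; fi
        else
            fail_run=$((fail_run + 1)); ok_run=0
            if [ "$fail_run" -ge "$need_fail" ]; then status="offline"; fi
        fi
    done
    echo "$status"
}

# Three successes bring the member online; two failures are not yet enough
# to take it offline when failure_count is 3.
member_status 3 3 ok ok ok fail fail
```

With thresholds like these, a single dropped probe does not remove a member from service, which is the usual reason for counting consecutive results.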

Comment 3 qding 2019-04-26 00:31:58 UTC
Take

Comment 4 qding 2019-04-26 05:47:02 UTC
I mistakenly set the Assignee to myself; restoring it to the OVN component default.

Comment 8 Numan Siddique 2019-12-05 07:21:55 UTC
This feature is backported to OVN2.12-2.12.0-19 FDN.

Comment 10 ying xu 2019-12-26 08:49:56 UTC
Hi Numan,
Could you please give me some suggestions on how to test this feature?
Thanks very much!

Comment 11 ying xu 2020-01-08 01:59:24 UTC
Verified on the version below:

[root@dell-per730-57 basic]# rpm -qa|grep ovn
ovn2.11-host-2.11.1-24.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-6.noarch
ovn2.11-central-2.11.1-24.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-basic-1.0-16.noarch
kernel-kernel-networking-openvswitch-ovn-qos-1.0-1.noarch
ovn2.11-2.11.1-24.el7fdp.x86_64


# add a load balancer to the logical router
                uuid=`ovn-nbctl create load_balancer vips:30.0.0.1="172.16.103.11,172.16.103.12"`
                ovn-nbctl set load_balancer $uuid vips:'"30.0.0.1:8000"'='"172.16.103.11:80,172.16.103.12:80"'
                # create the load balancer health check
                uuid3=`ovn-nbctl --id=@hc create Load_Balancer_Health_Check vip="30.0.0.1\:8000" -- add Load_Balancer $uuid health_check @hc`
                ovn-nbctl set Load_Balancer_Health_Check $uuid3 options:interval=5 options:timeout=20 options:success_count=3 options:failure_count=3
                ovn-nbctl set logical_router r1 load_balancer=$uuid
                ovn-nbctl --wait=sb set load_balancer $uuid ip_port_mappings:172.16.103.12=hv0_vm01_vnet1:172.16.103.1
                ovn-sbctl list service_monitor|grep "status.*\[\]" || ((result += 1))
                rlRun "ovn-sbctl list service_monitor"

[root@dell-per730-57 basic]# ovn-sbctl list service_monitor
_uuid               : af85a476-b867-4a64-b8a9-99fba2c2a8d0
external_ids        : {}
ip                  : "172.16.103.12"
logical_port        : "hv0_vm01_vnet1"
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "de:94:9a:e5:bf:c4"
status              : []
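
The record above can be reduced to one line per monitored backend, which is easier to assert on than a bare `grep` for the status. The following is a sketch only: `monitor_summary` is a hypothetical helper, and printing `unset` for an empty `[]` status is a labelling choice made here, not OVN terminology.

```shell
# Hypothetical helper: read `ovn-sbctl list service_monitor` text on stdin and
# print one "ip port status" line per record.
monitor_summary() {
    awk -F' *: *' '
        function emit() {
            if (ip != "") print ip, port, status
            ip = ""; port = ""; status = ""
        }
        { sub(/[ \t\r]+$/, "") }            # drop trailing whitespace
        $1 == "_uuid"  { emit() }           # a new record starts at _uuid
        $1 == "ip"     { gsub(/"/, "", $2); ip = $2 }
        $1 == "port"   { port = $2 }
        $1 == "status" { status = ($2 == "[]") ? "unset" : $2 }
        END            { emit() }
    '
}

# Intended use (assumes ovn-sbctl is installed and reachable):
#   ovn-sbctl list service_monitor | monitor_summary
```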



                for ((i=0; i<10; i++))
                do
                    vmsh run_cmd $(vm_name $hv 0) 'curl 30.0.0.1:8000 >> log.txt'
                done
                vmsh run_cmd $(vm_name $hv 0) 'cat log.txt | grep vm1' || ((result += 1))
                vmsh run_cmd $(vm_name $hv 0) 'cat log.txt | grep vm2' || ((result += 1))
                echo "result=$result"
                vmsh run_cmd $(vm_name $hv 0) 'cat log.txt'
                echo "result=$result"

                vmsh run_cmd $(vm_name 1 1) 'ip link set down dev eth1'
                sleep 30
                rlRun "ovn-sbctl list service_monitor"
[root@dell-per730-57 basic]# ovn-sbctl list service_monitor
_uuid               : af85a476-b867-4a64-b8a9-99fba2c2a8d0
external_ids        : {}
ip                  : "172.16.103.12"
logical_port        : "hv0_vm01_vnet1"
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "de:94:9a:e5:bf:c4"
status              : offline

                rlRun "ovn-sbctl list service_monitor|grep \"status.*offline\""
                ovn-sbctl list service_monitor|grep "status.*offline" || ((result += 1))
                vmsh run_cmd $(vm_name 1 1) 'ip link set up dev eth1'
                sleep 30
                rlRun "ovn-sbctl list service_monitor"
                rlRun "ovn-sbctl list service_monitor|grep \"status.*online\""
                ovn-sbctl list service_monitor|grep "status.*online" 
[root@dell-per730-57 basic]# ovn-sbctl list service_monitor
_uuid               : af85a476-b867-4a64-b8a9-99fba2c2a8d0
external_ids        : {}
ip                  : "172.16.103.12"
logical_port        : "hv0_vm01_vnet1"
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "de:94:9a:e5:bf:c4"
status              : online
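
The fixed `sleep 30` before each status check could be replaced by polling, so the test finishes as soon as the monitor changes state and fails only after a bounded number of attempts. This is a sketch under assumptions: `wait_until` and `monitor_is` are hypothetical names, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Args: max_attempts delay_seconds command [args...]
wait_until() {
    local attempts=$1 delay=$2 i=0
    shift 2
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Intended use in the test script (assumes ovn-sbctl is available):
#   monitor_is() { ovn-sbctl list service_monitor | grep -q "status.*$1"; }
#   wait_until 30 1 monitor_is offline || ((result += 1))
```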

#change the src_mac
[root@dell-per730-57 basic]# ovn-nbctl set NB_Global . options:svc_monitor_mac="fe:a0:65:a2:01:03"
[root@dell-per730-57 basic]# ovn-sbctl list service_monitor
_uuid               : 89eec067-6f39-4c29-8d7b-3633de6d0bda
external_ids        : {}
ip                  : "172.16.103.11"
logical_port        : "hv0_vm00_vnet1"
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "fe:a0:65:a2:01:03"
status              : online

[root@localhost ~]# tcpdump -i eth1 -e -nn -v 
[60764.285786] device eth1 entered promiscuous mode
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:58:44.509627 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    172.16.103.1.44719 > 172.16.103.11.80: Flags [S], cksum 0x37bf (correct), seq 1969041168, win 65160, length 0
20:58:44.509661 00:de:ad:00:00:01 > fe:a0:65:a2:01:03, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
    172.16.103.11.80 > 172.16.103.1.44719: Flags [S.], cksum 0x264c (incorrect -> 0x4ef7), seq 1130240533, ack 1969041169, win 29200, options [mss 1460], length 0
20:58:44.510345 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    172.16.103.1.44719 > 172.16.103.11.80: Flags [R.], cksum 0xda36 (correct), seq 2, ack 1, win 65160, length 0

Comment 12 ying xu 2020-01-08 03:00:51 UTC
Sorry for the mess.
Here is the log for ovn2.12:

[root@dell-per730-19 basic]# rpm -qa|grep ovn
ovn2.12-central-2.12.0-19.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-6.noarch
kernel-kernel-networking-openvswitch-ovn-basic-1.0-16.noarch
ovn2.12-2.12.0-19.el7fdp.x86_64
ovn2.12-host-2.12.0-19.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-qos-1.0-1.noarch

:: [ 21:54:48 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor'
_uuid               : df3e1fa8-87fa-4b7f-b361-3510a867ea4f
external_ids        : {}
ip                  : "172.16.103.12"
logical_port        : hv0_vm01_vnet1
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "62:d5:ba:75:8b:9f"
status              : offline
:: [ 21:54:48 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor' (Expected 0, got 0)
:: [ 21:54:48 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor|grep "status.*offline"'
status              : offline
:: [ 21:54:48 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor|grep "status.*offline"' (Expected 0, got 0)
SYNC_NC: sync_set client test_load_balance_health_check
SYNC_NC: sent "test_load_balance_health_check" to dell-per730-19.rhts.eng.pek2.redhat.com
SYNC_NC: sync_wait client test_load_balance_health_check
SYNC_NC: waiting "dell-per730-19.rhts.eng.pek2.redhat.com"
SYNC_NC: got "test_load_balance_health_check" from dell-per730-19.rhts.eng.pek2.redhat.com
:: [ 21:55:24 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor'
_uuid               : df3e1fa8-87fa-4b7f-b361-3510a867ea4f
external_ids        : {}
ip                  : "172.16.103.12"
logical_port        : hv0_vm01_vnet1
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "62:d5:ba:75:8b:9f"
status              : online
:: [ 21:55:24 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor' (Expected 0, got 0)
:: [ 21:55:24 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor|grep "status.*online"'
status              : online
:: [ 21:55:24 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor|grep "status.*online"' (Expected 0, got 0)

:: [ 21:58:06 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor'
_uuid               : caf48d03-c312-4d54-9316-f75c6afc76a1
external_ids        : {}
ip                  : "172.16.103.11"
logical_port        : hv0_vm00_vnet1
options             : {failure_count="3", interval="5", success_count="3", timeout="20"}
port                : 80
protocol            : tcp
src_ip              : "172.16.103.1"
src_mac             : "62:d5:ba:75:8b:9f"
status              : online
:: [ 21:58:06 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor' (Expected 0, got 0)
:: [ 21:58:06 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor|grep "status.*online"'
status              : online
:: [ 21:58:06 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor|grep "status.*online"' (Expected 0, got 0)
:: [ 21:58:06 ] :: [  BEGIN   ] :: Running 'ovn-sbctl list service_monitor|grep "fe:a0:65:a2:01:03"'
src_mac             : "fe:a0:65:a2:01:03"
:: [ 21:58:06 ] :: [   PASS   ] :: Command 'ovn-sbctl list service_monitor|grep "fe:a0:65:a2:01:03"' (Expected 0, got 0)
[root@localhost ~]# tcpdump -r a.pcap -e -nn -v|grep fe:a0:65:a2:01:03
reading from file a.pcap, link-type EN10MB (Ethernet)
21:58:14.388988 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 172.16.103.1 is-at fe:a0:65:a2:01:03, length 28
21:58:15.390830 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 172.16.103.1 is-at fe:a0:65:a2:01:03, length 28
21:58:16.780564 00:de:ad:00:00:01 > fe:a0:65:a2:01:03, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
21:58:16.781260 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:21.786355 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:21.786385 00:de:ad:00:00:01 > fe:a0:65:a2:01:03, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
21:58:21.786752 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:26.791913 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:26.791941 00:de:ad:00:00:01 > fe:a0:65:a2:01:03, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
21:58:26.792370 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:31.797498 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
21:58:31.797524 00:de:ad:00:00:01 > fe:a0:65:a2:01:03, ethertype IPv4 (0x0800), length 58: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
21:58:31.797948 fe:a0:65:a2:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 54: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 40)
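
In the capture above the backend (00:de:ad:00:00:01) replies once per probe, so the spacing of its frames can be compared with the configured `interval=5`. The sketch below computes those gaps; `frame_gaps` is a hypothetical helper, and it assumes `tcpdump -e` output with the timestamp in the first field and the source MAC in the second, within a single day (no midnight wrap).

```shell
# Hypothetical helper: read `tcpdump -e` lines on stdin and print the gap, in
# seconds, between consecutive frames sent by the given source MAC.
frame_gaps() {
    awk -v mac="$1" '
        $2 == mac {
            split($1, t, ":")                       # HH:MM:SS.micro
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) printf "%.1f\n", now - prev
            prev = now; seen = 1
        }
    '
}

# Intended use (a.pcap is the capture file read in the command above):
#   tcpdump -r a.pcap -e -nn | frame_gaps 00:de:ad:00:00:01
```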

Comment 14 errata-xmlrpc 2020-01-21 06:20:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0167

Comment 15 Red Hat Bugzilla 2023-09-14 05:27:36 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

