Bug 1952961
| Summary: | [ovn] dnat_snat traffic becomes centralized during VIP failover | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Jakub Libosvar <jlibosva> | |
| Component: | ovn2.13 | Assignee: | lorenzo bianconi <lorenzo.bianconi> | |
| Status: | CLOSED ERRATA | QA Contact: | Jianlin Shi <jishi> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | FDP 21.B | CC: | ctrautma, dalvarez, dcbw, eolivare, jishi, kfida, lorenzo.bianconi, mhofmann, mmichels, ralongi | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | ovn2.13-20.12.0-150.el8fdp | Doc Type: | If docs needed, set a value | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | ||||
| : | 2035079 | Environment: | ||
| Last Closed: | 2022-12-15 00:21:16 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2035079, 2083527 | |||
Description
Jakub Libosvar
2021-04-23 16:20:06 UTC
Just to emphasise the outcome - the FIP becomes unreachable for some time until switches learn the right port where the mac is. Is it possible that some flows are removed when OVN claims the virtual port and virtual parents are updated - that the traffic becomes centralized by mistake because of the way flows are matched?

Tested with following script:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:1.1.170.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=1.1.170.25
systemctl restart ovn-controller
ovs-vsctl add-br br-public
ovs-vsctl set open . external-ids:ovn-bridge-mappings=public:br-public
ovs-vsctl add-port br-public p1p2
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-vir
ovn-nbctl lsp-set-addresses sw0-vir "50:54:00:00:00:10 10.0.0.10"
ovn-nbctl lsp-set-port-security sw0-vir "50:54:00:00:00:10 10.0.0.10"
ovn-nbctl lsp-set-type sw0-vir virtual
ovn-nbctl set logical_switch_port sw0-vir options:virtual-ip=10.0.0.10
ovn-nbctl set logical_switch_port sw0-vir options:virtual-parents=sw0-p1,sw0-p2
ovn-nbctl lsp-add sw0 sw0-p1
ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3"
ovn-nbctl lsp-add sw0 sw0-p2
ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4"
ovn-nbctl lr-add lr0
ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24
ovn-nbctl lsp-add sw0 sw0-lr0
ovn-nbctl lsp-set-type sw0-lr0 router
ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:ff:01
ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0
ovn-nbctl ls-add public
ovn-nbctl lrp-add lr0 lr0-public 00:00:20:20:12:13 172.168.0.100/24
ovn-nbctl lsp-add public public-lr0
ovn-nbctl lsp-set-type public-lr0 router
ovn-nbctl lsp-set-addresses public-lr0 router
ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
ovn-nbctl lsp-add public ln-public
ovn-nbctl lsp-set-type ln-public localnet
ovn-nbctl lsp-set-addresses ln-public unknown
ovn-nbctl lsp-set-options ln-public network_name=public
ovn-nbctl --wait=hv lrp-set-gateway-chassis lr0-public hv1 20
ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.168.0.50 10.0.0.10 sw0-vir 10:54:00:00:00:10
ovn-sbctl list port_binding sw0-vir
ovn-sbctl lflow-list lr0 | grep lr_in_gw_redirect

on ovn2.13-20.12.0-149.el7:
[root@wsfd-advnetlab16 bz1952961]# ovn-sbctl lflow-list lr0 | grep lr_in_gw_redirect
table=17(lr_in_gw_redirect ), priority=100 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public" && is_chassis_resident("sw0-vir")), action=(eth.src = 10:54:00:00:00:10; reg1 = 172.168.0.50; next;)
table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
table=17(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)

on ovn2.13-20.12.0-173.el7:
[root@wsfd-advnetlab16 bz1952961]# ovn-sbctl lflow-list lr0 | grep lr_in_gw_redirect
table=17(lr_in_gw_redirect ), priority=100 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public" && is_chassis_resident("sw0-vir")), action=(eth.src = 10:54:00:00:00:10; reg1 = 172.168.0.50; next;)
table=17(lr_in_gw_redirect ), priority=80 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public"), action=(drop;) <=== one drop flow is added
table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
table=17(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)

We can verify that the drop rule is added in the latest ovn version. but we can't reproduce the initial issue described in the Description.
jlibosva, could you help to test with ovn2.13-20.12.0-173.el7 located at http://download-node-02.eng.bos.redhat.com/brewroot/packages/ovn2.13/20.12.0/173.el7fdp/? thanks

also verified on ovn2.13-20.12.0-173.el8:
+ ovn-sbctl list port_binding sw0-vir
_uuid : 417656bd-8669-411c-8e2f-36a61d431e27
chassis : []
datapath : 6e8294d3-1693-4267-8d71-851ada3eba52
encap : []
external_ids : {}
gateway_chassis : []
ha_chassis_group : []
logical_port : sw0-vir
mac : ["50:54:00:00:00:10 10.0.0.10"]
nat_addresses : []
options : {virtual-ip="10.0.0.10", virtual-parents="sw0-p1,sw0-p2"}
parent_port : []
tag : []
tunnel_key : 1
type : virtual
up : false
virtual_parent : []
+ ovn-sbctl lflow-list lr0
+ grep lr_in_gw_redirect
table=17(lr_in_gw_redirect ), priority=100 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public" && is_chassis_resident("sw0-vir")), action=(eth.src = 10:54:00:00:00:10; reg1 = 172.168.0.50; next;)
table=17(lr_in_gw_redirect ), priority=80 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public"), action=(drop;)
table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
table=17(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)
[root@dell-per740-12 bz1952961]# rpm -qa | grep ovn2.13
ovn2.13-20.12.0-173.el8fdp.x86_64
ovn2.13-host-20.12.0-173.el8fdp.x86_64
ovn2.13-central-20.12.0-173.el8fdp.x86_64
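
The manual check used here (grep the `lr_in_gw_redirect` stage and look for the priority-80 drop flow) could be scripted. The following is a minimal sketch and not part of the original report: the function name `has_gw_redirect_drop` is hypothetical, and it only inspects text in the `ovn-sbctl lflow-list` output format shown in this bug, read from standard input.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): reads logical flows in the
# "ovn-sbctl lflow-list" text format on stdin and succeeds (exit 0) only if
# the lr_in_gw_redirect stage contains a drop flow for the given source IP.
# Assumption: the IP is followed by a space (as in "ip4.src == 10.0.0.10 &&"),
# which keeps 10.0.0.10 from also matching 10.0.0.100.
has_gw_redirect_drop() {
    src_ip="$1"
    grep -F 'lr_in_gw_redirect' \
        | grep -F "ip4.src == ${src_ip} " \
        | grep -F 'action=(drop;)' >/dev/null
}
```

On a host with the reproducer configured, it might be used as `ovn-sbctl lflow-list lr0 | has_gw_redirect_drop 10.0.0.10 && echo "drop flow present"`.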
also verified on ovn-2021-21.06.0-18.el8:
+ ovn-sbctl list port_binding sw0-vir
_uuid : e0a99b66-2f9f-4bb5-b3af-9d9e0d8ede3a
chassis : []
datapath : 96f6365d-87ed-4167-9f56-7dde99e82d37
encap : []
external_ids : {}
gateway_chassis : []
ha_chassis_group : []
logical_port : sw0-vir
mac : ["50:54:00:00:00:10 10.0.0.10"]
nat_addresses : []
options : {virtual-ip="10.0.0.10", virtual-parents="sw0-p1,sw0-p2"}
parent_port : []
tag : []
tunnel_key : 1
type : virtual
up : false
virtual_parent : []
+ ovn-sbctl lflow-list lr0
+ grep lr_in_gw_redirect
table=17(lr_in_gw_redirect ), priority=100 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public" && is_chassis_resident("sw0-vir")), action=(eth.src = 10:54:00:00:00:10; reg1 = 172.168.0.50; next;)
table=17(lr_in_gw_redirect ), priority=80 , match=(ip4.src == 10.0.0.10 && outport == "lr0-public"), action=(drop;)
table=17(lr_in_gw_redirect ), priority=50 , match=(outport == "lr0-public"), action=(outport = "cr-lr0-public"; next;)
table=17(lr_in_gw_redirect ), priority=0 , match=(1), action=(next;)
[root@dell-per740-12 bz1952961]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-21.06.0-18.el8fdp.x86_64
openvswitch2.15-2.15.0-35.el8fdp.x86_64
ovn-2021-central-21.06.0-18.el8fdp.x86_64
ovn-2021-host-21.06.0-18.el8fdp.x86_64
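
To see why the priority-80 flow matters, the `lr_in_gw_redirect` flows listed in this report can be modelled as a highest-priority-match-wins table. This is only an illustrative sketch under assumptions: the function `gw_redirect_action` and its arguments are hypothetical, the addresses and port names are the ones from this reproducer, and the real evaluation is done by ovn-controller and OVS, not by a shell script.

```shell
#!/bin/sh
# Hypothetical model of the lr_in_gw_redirect flows shown in this report.
# Arguments:
#   $1  source IP of the packet (ip4.src)
#   $2  logical output port (outport)
#   $3  "yes" if sw0-vir is resident on the local chassis, else "no"
#   $4  "yes" if the priority-80 drop flow exists (builds with the fix)
# Prints which flow would win: distributed, drop, centralized or next.
gw_redirect_action() {
    src="$1"; outport="$2"; resident="$3"; has_drop="$4"
    # priority=100: NAT is applied locally on the chassis hosting sw0-vir
    if [ "$src" = "10.0.0.10" ] && [ "$outport" = "lr0-public" ] \
        && [ "$resident" = "yes" ]; then
        echo distributed
    # priority=80: only present in builds that carry the fix
    elif [ "$has_drop" = "yes" ] && [ "$src" = "10.0.0.10" ] \
        && [ "$outport" = "lr0-public" ]; then
        echo drop
    # priority=50: redirect to the gateway chassis port cr-lr0-public
    elif [ "$outport" = "lr0-public" ]; then
        echo centralized
    # priority=0: default, continue to the next table
    else
        echo next
    fi
}
```

In this model, `gw_redirect_action 10.0.0.10 lr0-public no no` falls through to the priority-50 redirect, which appears to correspond to the centralized behaviour named in the summary; with the last argument set to `yes`, the same packet hits the drop flow instead.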
(In reply to Jianlin Shi from comment #8)
> We can verify that the drop rule is added in the latest ovn version. but we
> can't reproduce the initial issue described in the Description.
> jlibosva, could you help to test with ovn2.13-20.12.0-173.el7
> located at
> http://download-node-02.eng.bos.redhat.com/brewroot/packages/ovn2.13/20.12.0/173.el7fdp/? thanks

I will clone this BZ to OpenStack and we will verify it.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:9044

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days