Bug 2043543

Summary: Load balancer IP is not reachable from inside the VM
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Yatin Karel <ykarel>
Component: ovn-2021
Assignee: lorenzo bianconi <lorenzo.bianconi>
Status: CLOSED CURRENTRELEASE
QA Contact: ying xu <yinxu>
Severity: unspecified
Priority: high
Version: RHEL 8.0
CC: ctrautma, dceara, jiji, jishi, lorenzo.bianconi, ltomasbo, mmichels
Last Closed: 2023-03-13 07:09:25 UTC
Type: Bug
Attachments: OVN and OVS DB's archive (flags: none)

Description Yatin Karel 2022-01-21 13:48:35 UTC
Created attachment 1852501
OVN and OVS DB's archive

Description of problem:
In an OpenStack setup with OVN, a load balancer (port forwarding from a floating IP to a VM) is not reachable from inside the VM, although it is reachable from the host.

Details:-
# ovn-sbctl --version
ovn-sbctl 21.06.0
Open vSwitch Library 2.15.4
DB Schema 20.17.0

Load balancer:-
# ovn-nbctl list load_balancer dc6461c2-8d55-4d83-8e71-19a687f47b07
_uuid               : dc6461c2-8d55-4d83-8e71-19a687f47b07
external_ids        : {"neutron:device_owner"=port_forwarding_plugin, "neutron:fip_id"="e712c45b-1620-4088-b845-5ddabbbcd203", "neutron:revision_number"="2", "neutron:router_name"=neutron-e30f310a-29dd-43a2-8bde-dedd103d9db9}
health_check        : []
ip_port_mappings    : {}
name                : pf-floatingip-e712c45b-1620-4088-b845-5ddabbbcd203-tcp
options             : {}
protocol            : tcp
selection_fields    : []
vips                : {"172.24.4.194:2255"="192.0.2.220:22"}

# Logical Router
# ovn-nbctl list logical_router be34095b-e5b3-4dcc-9a80-d9d487f36fd0
_uuid               : be34095b-e5b3-4dcc-9a80-d9d487f36fd0
enabled             : true
external_ids        : {"neutron:availability_zone_hints"="", "neutron:gw_port_id"="b6f80253-c7e4-4779-bc58-fd77198f30d9", "neutron:revision_number"="6", "neutron:router_name"=router1}
load_balancer       : [dc6461c2-8d55-4d83-8e71-19a687f47b07]
name                : neutron-e30f310a-29dd-43a2-8bde-dedd103d9db9
nat                 : [08d867eb-73b1-45c4-b379-31c8a88f90b7, 8c132d30-c038-4198-9a36-9151ecd9499c]
options             : {always_learn_from_arp_request="false", dynamic_neigh_routers="true"}
policies            : []
ports               : [74a5fdec-86b4-4dfe-9dc1-2fc916e63d6f, 9bb9c470-826c-4923-9f56-d6204d3ca09e, 9e03c3ed-c3cc-44f7-8c59-e4acf055d85d, bf48439d-368c-4096-9185-fd229d4ec801]
static_routes       : [a900273a-4a9f-4c06-a1ea-676f340defa5, bc3fdd3e-da2f-470a-9be7-404679896865]

# Logical Switch
# ovn-nbctl list logical_switch 39bdfd3a-9f22-4ea0-a720-8b35d26e7d36
_uuid               : 39bdfd3a-9f22-4ea0-a720-8b35d26e7d36
acls                : []
dns_records         : [db5c2fdd-485e-4b4c-85cb-08908d1593b7]
external_ids        : {"neutron:availability_zone_hints"="", "neutron:mtu"="1442", "neutron:network_name"=net1, "neutron:revision_number"="2"}
forwarding_groups   : []
load_balancer       : []
name                : neutron-a86e2e4b-59f8-444d-9012-9820f13c0353
other_config        : {mcast_flood_unregistered="false", mcast_snoop="false", vlan-passthru="false"}
ports               : [0b7af6a2-49e3-4e38-80ad-72a9fd2c43f4, 694da56e-0538-4f0e-b6bb-b98b0dc565c8, a2e9d602-05f3-4955-99ab-b070c47f5545, f823d3de-1049-48f7-bfa3-3b0605716177]
qos_rules           : []


# VMS
Client: tap9e0f4837-42  - 192.0.2.225
Server: tape187de7e-94  - 192.0.2.220


Accessing 172.24.4.194:2255 from the VM with curl, telnet, or ssh fails. tcpdump shows that no reply is received.


Version-Release number of selected component (if applicable):


How reproducible:
Always, with this type of setup. The OVN and OVS databases are attached for reference.


Actual results:
Access to 172.24.4.194:2255 from tap9e0f4837-42 or tape187de7e-94 fails.

Expected results:
Access to 172.24.4.194:2255 from tap9e0f4837-42 or tape187de7e-94 should pass.

Additional info:

Workaround:-
# Also add the LB to the logical_switch. This duplicates the LB on both the router and the switch, even though the switch is already connected to the router.
# ovn-nbctl set logical_switch 39bdfd3a-9f22-4ea0-a720-8b35d26e7d36 load_balancer=dc6461c2-8d55-4d83-8e71-19a687f47b07
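
Note that `ovn-nbctl set` replaces the whole load_balancer column, which is harmless here only because the switch's load_balancer list is empty. A minimal sketch of an additive variant follows, assuming the `ls-lb-add` subcommand with `--may-exist` is available in the installed ovn-nbctl. The helper name `ls_lb_add_argv` is hypothetical, and the script only prints the command (dry run) rather than executing it.

```python
import shlex
from typing import List


def ls_lb_add_argv(switch: str, lb: str) -> List[str]:
    """Build the ovn-nbctl command that attaches one LB to one logical switch.

    Hypothetical helper for illustration; it does not run anything.
    """
    # --may-exist turns the call into a no-op when the switch already
    # references the load balancer, so the command can be re-run safely.
    return ["ovn-nbctl", "--may-exist", "ls-lb-add", switch, lb]


# UUIDs taken from the report above.
SWITCH = "39bdfd3a-9f22-4ea0-a720-8b35d26e7d36"
LB = "dc6461c2-8d55-4d83-8e71-19a687f47b07"

argv = ls_lb_add_argv(SWITCH, LB)
command_line = shlex.join(argv)
print(command_line)  # dry run: print the command instead of executing it
```

Unlike `set`, this form appends to any load balancers already attached to the switch instead of overwriting them.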

Comment 1 Mark Michelson 2022-01-21 14:32:33 UTC
On router neutron-e30f310a-29dd-43a2-8bde-dedd103d9db9, what are the configured networks on its ports? Is 172.24.4.194 in any of the router port subnets?

Comment 2 Yatin Karel 2022-01-21 14:47:15 UTC
> On router neutron-e30f310a-29dd-43a2-8bde-dedd103d9db9, what are the configured networks on its ports? Is 172.24.4.194 in any of the router port subnets?

Yes, the router has the corresponding subnet attached. Output:

# ovn-nbctl list logical_router_port
_uuid               : 9bb9c470-826c-4923-9f56-d6204d3ca09e
enabled             : []
external_ids        : {"neutron:network_name"=neutron-a86e2e4b-59f8-444d-9012-9820f13c0353, "neutron:revision_number"="2", "neutron:router_name"="e30f310a-29dd-43a2-8bde-dedd103d9db9", "neutron:subnet_ids"="a124530e-9ff5-4245-bad6-4f2b8cb9d7e8"}
gateway_chassis     : []
ha_chassis_group    : []
ipv6_prefix         : []
ipv6_ra_configs     : {}
mac                 : "fa:16:3e:37:c6:58"
name                : lrp-5b104bcc-c570-4e0d-ae39-b16e1226e95c
networks            : ["192.0.2.1/24"]
options             : {}
peer                : []

_uuid               : 74a5fdec-86b4-4dfe-9dc1-2fc916e63d6f
enabled             : []
external_ids        : {"neutron:network_name"=neutron-718e9b43-6997-44b2-8750-f239e87bb1c8, "neutron:revision_number"="3", "neutron:router_name"="e30f310a-29dd-43a2-8bde-dedd103d9db9", "neutron:subnet_ids"="9afcbd74-a443-41b3-9c1f-52af7c0e1db3"}
gateway_chassis     : []
ha_chassis_group    : []
ipv6_prefix         : []
ipv6_ra_configs     : {address_mode=slaac, mtu="1442", send_periodic="true"}
mac                 : "fa:16:3e:ad:12:a4"
name                : lrp-487a501e-4f70-4dad-bcef-372c51baba17
networks            : ["fd9e:91f4:dd2f::1/64"]
options             : {}
peer                : []

_uuid               : 9e03c3ed-c3cc-44f7-8c59-e4acf055d85d
enabled             : []
external_ids        : {"neutron:network_name"=neutron-718e9b43-6997-44b2-8750-f239e87bb1c8, "neutron:revision_number"="3", "neutron:router_name"="e30f310a-29dd-43a2-8bde-dedd103d9db9", "neutron:subnet_ids"="e630213e-3aec-4f97-b3ca-ceef7312b23e"}
gateway_chassis     : []
ha_chassis_group    : []
ipv6_prefix         : []
ipv6_ra_configs     : {}
mac                 : "fa:16:3e:72:f0:a0"
name                : lrp-6595ab21-3b99-413b-a393-25befbe4d4af
networks            : ["10.0.0.1/26"]
options             : {}
peer                : []

_uuid               : bf48439d-368c-4096-9185-fd229d4ec801
enabled             : []
external_ids        : {"neutron:network_name"=neutron-29f97601-d55b-49c7-a71b-8335844bdd0f, "neutron:revision_number"="7", "neutron:router_name"="e30f310a-29dd-43a2-8bde-dedd103d9db9", "neutron:subnet_ids"="8227997d-452a-45d8-afcd-4361380dde5a fb434b49-ae92-44a8-b6d8-c1bad41f8da8"}
gateway_chassis     : [8dc6c9c1-c621-4183-ba36-267d89356a8c]
ha_chassis_group    : []
ipv6_prefix         : []
ipv6_ra_configs     : {}
mac                 : "fa:16:3e:9b:d7:39"
name                : lrp-b6f80253-c7e4-4779-bc58-fd77198f30d9
networks            : ["172.24.4.156/24", "2001:db8::1/64"]
options             : {}
peer                : []
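
The subnet question from comment 1 can be checked mechanically against the `networks` values listed above. The following is a minimal sketch using Python's standard `ipaddress` module. The function name `ports_containing` is hypothetical, and the port data is copied by hand from the output in this comment.

```python
import ipaddress
from typing import Dict, List, Tuple


def ports_containing(ip: str, port_networks: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (port name, subnet) pairs whose subnet contains the given IP."""
    addr = ipaddress.ip_address(ip)
    matches = []
    for name, networks in port_networks.items():
        for cidr in networks:
            # "172.24.4.156/24" is an interface address; .network yields 172.24.4.0/24.
            net = ipaddress.ip_interface(cidr).network
            # Compare only within the same address family (IPv4 vs IPv6).
            if net.version == addr.version and addr in net:
                matches.append((name, str(net)))
    return matches


# networks column of each logical_router_port, copied from the listing above.
ROUTER_PORTS = {
    "lrp-5b104bcc-c570-4e0d-ae39-b16e1226e95c": ["192.0.2.1/24"],
    "lrp-487a501e-4f70-4dad-bcef-372c51baba17": ["fd9e:91f4:dd2f::1/64"],
    "lrp-6595ab21-3b99-413b-a393-25befbe4d4af": ["10.0.0.1/26"],
    "lrp-b6f80253-c7e4-4779-bc58-fd77198f30d9": ["172.24.4.156/24", "2001:db8::1/64"],
}

vip_matches = ports_containing("172.24.4.194", ROUTER_PORTS)
backend_matches = ports_containing("192.0.2.220", ROUTER_PORTS)
print(vip_matches)      # the VIP falls in the gateway port's 172.24.4.0/24
print(backend_matches)  # the backend falls in the tenant port's 192.0.2.0/24
```

With this data, the VIP 172.24.4.194 matches only lrp-b6f80253-c7e4-4779-bc58-fd77198f30d9, the port that has gateway_chassis set.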

Comment 3 Dumitru Ceara 2022-02-04 08:11:59 UTC
(In reply to Yatin Karel from comment #0)

> Workaround:-
> # Add the lb to logical_switch also, but this duplicates lb on both router
> and switch, switch is connected to the router already.
> # ovn-nbctl set logical_switch 39bdfd3a-9f22-4ea0-a720-8b35d26e7d36
> load_balancer=dc6461c2-8d55-4d83-8e71-19a687f47b07

This needs investigation, but if it is determined that the only solution is to also apply the load balancer to the logical switch, then we probably need to open a separate RFE BZ to investigate whether ovn-northd can do this automatically. That would simplify the work of the CMS, provided it does not change behavior for other CMSs (e.g., ovn-k8s) that apply some load balancers only on gateway routers.

Comment 4 lorenzo bianconi 2022-02-22 17:18:14 UTC
To run the "pf-floatingip" LB on the logical router only, the "chassis" option must be set on the router (the gateway router scenario). The packet is then sent to the hypervisor running the router before the lr_in_dnat stage is performed, and the conntrack operation is executed properly. In the scenario with a gateway router port, it is currently mandatory to also configure the LB on the logical switches; otherwise lr_in_dnat is not executed on the hypervisor running the gateway router port.
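
The rule described in this comment can be written down as a small decision function. This is only a sketch of the comment's reasoning, not OVN code. The function name `lb_placement` and the returned labels are hypothetical, and the case of a router with neither a "chassis" option nor a gateway port is not discussed in this report, so it is reported as such.

```python
from typing import Dict


def lb_placement(router_options: Dict[str, str], has_gateway_port: bool) -> str:
    """Where the LB must be configured, following the rule stated in comment 4."""
    if "chassis" in router_options:
        # Gateway router: packets reach the hypervisor running the router
        # before lr_in_dnat, so the LB on the router alone is enough.
        return "router-only"
    if has_gateway_port:
        # Gateway router port: the LB must also be applied to the switches.
        return "router-and-switches"
    # Not covered by the discussion in this bug.
    return "not-covered"


# Router from this report: no "chassis" option, and one port
# (lrp-b6f80253-c7e4-4779-bc58-fd77198f30d9) has gateway_chassis set.
reported_options = {
    "always_learn_from_arp_request": "false",
    "dynamic_neigh_routers": "true",
}
reported_case = lb_placement(reported_options, has_gateway_port=True)
print(reported_case)
```

For the router in this report the function returns "router-and-switches", which matches the workaround from the description.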

Comment 5 Yatin Karel 2022-03-16 05:58:58 UTC
Pushed https://review.opendev.org/c/openstack/neutron/+/833620 to also add the LB to the logical switches connected to the logical router. Once OVN supports this natively, the change can be reverted in Neutron.
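
As an illustration of what "logical switches connected to the logical router" means in the northbound DB, the sketch below selects switches that own a port of type "router" whose options:router-port names one of the router's ports. This is not the code from the Neutron change. The function name `switches_attached_to_router` and all sample rows (sw-a, sw-b, lrp-a, lrp-b) are hypothetical, since the report does not list the logical switch ports.

```python
from typing import Dict, Iterable, List


def switches_attached_to_router(
    router_port_names: Iterable[str],
    switch_ports: Dict[str, List[dict]],
) -> List[str]:
    """Return the switches that have a router-type port peered with the router."""
    wanted = set(router_port_names)
    attached = []
    for switch, ports in switch_ports.items():
        for port in ports:
            peer = port.get("options", {}).get("router-port")
            if port.get("type") == "router" and peer in wanted:
                attached.append(switch)
                break  # one matching port is enough for this switch
    return sorted(attached)


# Hypothetical rows shaped like Logical_Switch_Port records.
SAMPLE_SWITCH_PORTS = {
    "sw-a": [
        {"type": "", "options": {}},                               # VM port
        {"type": "router", "options": {"router-port": "lrp-a"}},  # to our router
    ],
    "sw-b": [
        {"type": "router", "options": {"router-port": "lrp-other"}},  # another router
    ],
    "sw-c": [
        {"type": "localnet", "options": {"network_name": "public"}},
    ],
}

targets = switches_attached_to_router(["lrp-a", "lrp-b"], SAMPLE_SWITCH_PORTS)
print(targets)  # switches that would also receive the router's load balancers
```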