Bug 2104476

Summary: [OSP 16.2][neutron][ovn] - FIP to FIP communication issues when multiple subnets exist on the floating IP network and floating IP's subnet is different than router's subnet
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Flavio Piccioni <fpiccion>
Component: openvswitch2.15
Assignee: Adrián Moreno <amorenoz>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: qding
Severity: high
Docs Contact:
Priority: urgent
Version: FDP 21.E
CC: chrisw, ctrautma, dceara, egarciar, fleitner, jhsiao, jiji, jlibosva, mmichels, mtomaska, ralongi, scohen, ssigwald, tdoucet
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-14 18:32:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Flavio Piccioni 2022-07-06 11:27:14 UTC
Description of problem:
Setup: RHOSP 16.2 z1 - OVN (DVR)

The client needed to extend the floating IP network, so they added a second IPv4 subnet to the existing one; let's call them subnet.a and subnet.b.
- When neutron routers are created, their external interface gets an address from one of the external subnets, either assigned randomly or chosen manually.
- This works fine for most communication. General external access via the subnet.a or subnet.b FIPs works without issues.
- However, TCP/UDP FIP <-> FIP communication is broken.

ROUTER: external gw on FIP subnet.a
VM1: FIP on subnet.a
VM2: FIP on subnet.b
Both VMs are on the same internal network, which is connected to the same router.
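
The topology above can be restated as a subnet-membership check, which is the condition the bug summary points at (FIP subnet differs from the router gateway's subnet). The following is only a minimal sketch: every address is a made-up value from the RFC 5737 documentation ranges, and the helper names are hypothetical, not part of neutron or OVN.

```python
import ipaddress

# Hypothetical addressing (RFC 5737 documentation ranges); not from the report.
EXTERNAL_SUBNETS = {
    "subnet.a": ipaddress.ip_network("192.0.2.0/24"),
    "subnet.b": ipaddress.ip_network("198.51.100.0/24"),
}
ROUTER_GATEWAY_IP = ipaddress.ip_address("192.0.2.5")  # router external gw on subnet.a
FIPS = {
    "VM1": ipaddress.ip_address("192.0.2.10"),     # FIP on subnet.a
    "VM2": ipaddress.ip_address("198.51.100.10"),  # FIP on subnet.b
}


def subnet_name(addr):
    """Return the name of the external subnet containing addr, or None."""
    for name, network in EXTERNAL_SUBNETS.items():
        if addr in network:
            return name
    return None


def shares_gateway_subnet(fip):
    """True if the FIP sits on the same external subnet as the router gateway."""
    gw_subnet = subnet_name(ROUTER_GATEWAY_IP)
    return gw_subnet is not None and subnet_name(fip) == gw_subnet


# Print, per VM, which external subnet its FIP is on and whether that
# subnet is also the router gateway's subnet.
for vm, fip in FIPS.items():
    print(vm, fip, subnet_name(fip), shares_gateway_subnet(fip))
```

In this setup VM2 is the VM whose FIP is off the gateway subnet, and it is the connections initiated from VM2 that fail.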

VM1 can reach VM2 via its FIP without any issue.
VM2 cannot reliably reach VM1 via its FIP; for example, a simple curl receives a RST packet after a while.


[root@test-vm-2 ~]# time curl "vm1.fip"
curl: (56) Recv failure: Connection reset by peer

real	0m53.694s
user	0m0.052s
sys	0m0.047s

From tcpdump I can see:
- the request is sent from vm2 to vm1
- the reply is sent from vm1 to vm2
- vm2 does not receive the reply.

###ICMP Works without issues###

I'll provide more details in comments.

Version-Release number of selected component (if applicable):
OSP 16.2 z1 - OVN (DVR)
OSP 16.2 z3 - OVN (DVR) (my lab)

How reproducible:
100%

Steps to Reproduce:
1. Create an external network with 2 subnets.
2. Create a router connected to the external network (let's say with A.external.sub as gw).
3. Create 2 VMs in the same tenant subnet.
4. Assign each VM one FIP, each from a different external subnet (vm1: A.external.sub.fip - vm2: B.external.sub.fip).
5. Traffic from vm1 (A sub FIP) to vm2 (B sub FIP) works without issues.
6. TCP/UDP traffic from vm2 (B sub FIP) to vm1 (A sub FIP) does not seem to work properly (ICMP works).
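
Steps 1-4 could be scripted roughly as below. This is a dry-run sketch that only prints `openstack` CLI commands and executes nothing. All resource names, CIDRs, FIP addresses, image and flavor are hypothetical placeholders, and deployment-specific options (provider network type and physical network for the external network, security group rules allowing TCP/UDP) are left out on purpose.

```python
import shlex


def build_repro_commands(
    ext_net="ext-net", subnet_a="subnet-a", subnet_b="subnet-b",
    cidr_a="192.0.2.0/24", cidr_b="198.51.100.0/24",
    fip_a="192.0.2.10", fip_b="198.51.100.10",
    image="cirros", flavor="m1.tiny",
):
    """Return reproduction steps 1-4 as a list of argv lists (nothing is run).

    Every default value is a placeholder, not taken from the bug report.
    """
    o = ["openstack"]
    return [
        # Step 1: external network with two subnets (provider options omitted).
        o + ["network", "create", "--external", ext_net],
        o + ["subnet", "create", "--network", ext_net,
             "--subnet-range", cidr_a, "--no-dhcp", subnet_a],
        o + ["subnet", "create", "--network", ext_net,
             "--subnet-range", cidr_b, "--no-dhcp", subnet_b],
        # Step 2: router whose external gateway is taken from subnet A.
        o + ["router", "create", "router1"],
        o + ["router", "set", "--external-gateway", ext_net,
             "--fixed-ip", f"subnet={subnet_a}", "router1"],
        # Step 3: one tenant subnet behind the router, two VMs on it.
        o + ["network", "create", "tenant-net"],
        o + ["subnet", "create", "--network", "tenant-net",
             "--subnet-range", "10.0.0.0/24", "tenant-subnet"],
        o + ["router", "add", "subnet", "router1", "tenant-subnet"],
        o + ["server", "create", "--image", image, "--flavor", flavor,
             "--network", "tenant-net", "vm1"],
        o + ["server", "create", "--image", image, "--flavor", flavor,
             "--network", "tenant-net", "vm2"],
        # Step 4: one FIP per VM, each from a different external subnet.
        o + ["floating", "ip", "create", "--subnet", subnet_a,
             "--floating-ip-address", fip_a, ext_net],
        o + ["floating", "ip", "create", "--subnet", subnet_b,
             "--floating-ip-address", fip_b, ext_net],
        o + ["server", "add", "floating", "ip", "vm1", fip_a],
        o + ["server", "add", "floating", "ip", "vm2", fip_b],
    ]


if __name__ == "__main__":
    # Print the commands for review; steps 5 and 6 are then checked by hand
    # (for example with curl and ping between the two FIPs).
    for argv in build_repro_commands():
        print(shlex.join(argv))
```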

Actual results:
TCP/UDP traffic between 2 FIPs from different subnets (same network) connected to the same router does not work properly.

Expected results:
Traffic between 2 FIPs from different subnets (same network) connected to the same router works properly.


Additional info:
This seems similar to bug 1929901 and its clones, but since that should already be fixed, I opened a new BZ.