Bug 2123168

Summary: [OSP17][OVN][IPv6][TLS-E] The connection/ping to a FIP does not work
Product: Red Hat OpenStack
Reporter: Marian Krcmarik <mkrcmari>
Component: openstack-neutron
Assignee: Jakub Libosvar <jlibosva>
Status: CLOSED DUPLICATE
QA Contact: Eran Kuris <ekuris>
Severity: high
Docs Contact:
Priority: urgent
Version: 17.0 (Wallaby)
CC: chrisw, jlibosva, oblaut, scohen, skaplons, spower
Target Milestone: ---
Keywords: AutomationBlocker, Regression, TestOnly, Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-01-11 20:27:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2119194
Bug Blocks:

Description Marian Krcmarik 2022-08-31 20:32:09 UTC
Description of problem:
The FIP address of a newly created VM is not responsive on a DCN/Spine&Leaf environment with overcloud networks on IPv6 and TLS-E, with OVN as the networking backend. It does not work even on the central site/stack, so I assume the fact that it's a multi-stack DCN env does not play a significant role. I did not observe the problem on the almost identical IPv4-based job, where the only change is that the overcloud control plane networks are IPv4 and not IPv6.

Miro T. and Kuba L. did some debugging and identified the reason as:
ICMP pings arrive at the VM and the VM responds, but its response never makes it back to the undercloud (the ping requester). We discovered that this is due to the router not having ARP entries for the undercloud. One way to force the "ARP learn" is to ping the router GW IP from the undercloud.
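
A minimal sketch of that workaround, assuming it is run from the undercloud with the Linux `ping` utility available. The function names and the example addresses are hypothetical and not taken from the affected environment; the idea is only to ping the router GW IP first, then the FIP.

```python
import subprocess


def build_ping(target, count=3):
    """Build a ping command line that sends `count` echo requests to `target`."""
    return ["ping", "-c", str(count), target]


def ping_fip_with_warmup(router_gw_ip, fip, run=subprocess.run):
    """Ping the router gateway first, then the FIP.

    The first ping is the workaround described above: it gives the router a
    chance to learn the neighbour entry for the undercloud. Its result is
    deliberately ignored. Returns True if the FIP ping exits with status 0.
    """
    run(build_ping(router_gw_ip), check=False)
    result = run(build_ping(fip), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical addresses from the documentation range; replace with the
    # router GW IP and the FIP of the VM under test.
    print(ping_fip_with_warmup("192.0.2.1", "192.0.2.50"))
```

If the FIP only answers after the gateway ping, that is consistent with the missing ARP entry described above, but it does not by itself confirm the root cause.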

I'll provide some logs and references in the next comment. If a live env for debugging is needed, I can provide that too.

Version-Release number of selected component (if applicable):
python3-neutronclient-7.3.0-0.20220707060727.4963c7a.el9ost.noarch
python3-neutron-lib-2.10.2-0.20220712120440.6bbae46.el9ost.noarch
python3-neutron-18.4.1-0.20220705190435.5258354.el9ost.noarch
openstack-neutron-common-18.4.1-0.20220705190435.5258354.el9ost.noarch
openstack-neutron-18.4.1-0.20220705190435.5258354.el9ost.noarch
openstack-neutron-ml2-18.4.1-0.20220705190435.5258354.el9ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy OSP17.0 DCN (Spine&Leaf) with OVN as the network backend, overcloud networks on IPv6, and TLS-E
2. Create a VM on the central site and associate a FIP with the VM
3. Ping the FIP of the VM

Actual results:
The ping is lost

Expected results:
Successful connection to the FIP, as it works on an identical setup with IPv4.

Additional info:
It could be that the problem is in the underlying infrastructure, since this particular setup was never tested by anyone imo. The same setup but with OVS as the network backend works on 16.2, though.

Comment 6 spower 2022-09-07 11:16:04 UTC
PM (MariaB) has approved this as an exception: it can be shipped for GA but must be fixed for 17.0.1.