Switching the component to the OVN driver for neutron because the root cause is in how the driver handles a router port attached to an empty AZ. (It then lands the port on all chassis.)
AFAIU, a workaround that can be tried in the environment without code changes is to make sure that any AZ a router is assigned to has at least one chassis. The issue in the env is triggered by ports whose AZ has no chassis. If you avoid setting an AZ for such ports, neutron should fall back to scheduling them on chassis that are explicitly marked with ovn-cms-options=enable-chassis-as-gw, which are the controller nodes in this cluster. This will stop neutron from landing the ports on compute nodes.
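To make the workaround easier to reason about, here is a toy model of the scheduling behavior as described in this comment. It is not neutron code: the Chassis class, the candidates() and empty_azs() helpers, and the chassis and AZ names are all made up for illustration, and the model assumes that an AZ hint is matched against any chassis in that AZ.

```python
from dataclasses import dataclass, field


@dataclass
class Chassis:
    """Hypothetical stand-in for an OVN chassis record."""
    name: str
    azs: set = field(default_factory=set)
    # Models ovn-cms-options=enable-chassis-as-gw being set on the chassis.
    is_gateway: bool = False


def candidates(chassis, az_hints):
    """Return chassis names a router port could land on (toy model only)."""
    if not az_hints:
        # No AZ set: fall back to chassis explicitly marked as gateways.
        return [c.name for c in chassis if c.is_gateway]
    in_az = [c.name for c in chassis if c.azs & set(az_hints)]
    if in_az:
        return in_az
    # AZ has no chassis: per the comment above, the port lands on all chassis.
    return [c.name for c in chassis]


def empty_azs(chassis, az_hints):
    """Return the AZ hints that no chassis belongs to."""
    populated = set().union(*(c.azs for c in chassis))
    return sorted(set(az_hints) - populated)
```

Running empty_azs() over the AZs used by routers would flag the ones that need either a chassis added or the AZ setting dropped before the ports can be expected to stay off the compute nodes.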
I don't think we need to collect more logs at this point. The issue is clear, and it's not in the OVN / OVS BFD implementation. It's an upstream switch misconfiguration / bond issue of some sort.
The original issue reported here (BFD flapping) was fixed by migrating the gw ports back to where they belong: the controller nodes. The remaining issue turned out to be an upstream switch misconfiguration of some sort. I am closing the bug. If there are other issues to follow up on, a new bz should be created.
Created https://bugzilla.redhat.com/show_bug.cgi?id=2195898#c9 to follow up on neutron AZ scheduler behavior as mentioned in comment 6.
Created a documentation bz to follow up on the recommendation not to locate gw ports on compute nodes: https://bugzilla.redhat.com/show_bug.cgi?id=2209100