Bug 1860395

Summary: RARP packets after live migration are dropped by "br-link" due to a timing issue between packet generation and OpenFlow rule installation.
Product: Red Hat OpenStack
Reporter: Andre <afariasa>
Component: openstack-neutron
Assignee: Rodolfo Alonso <ralonsoh>
Status: CLOSED ERRATA
QA Contact: Alex Katz <akatz>
Severity: high
Priority: high
Version: 16.0 (Train)
CC: amuller, bjarolim, chrisw, dhill, fbaudin, ffernand, knoha, ltamagno, mivollme, ralonsoh, scohen, skaplons, smooney, svmichel
Target Milestone: z4
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-neutron-15.2.1-1.20201114025044.el8ost
Doc Type: No Doc Update
Last Closed: 2021-05-26 13:49:36 UTC
Type: Bug

Description Andre 2020-07-24 13:00:15 UTC
Description of problem:
A customer live-migrated an instance from one compute node to another, and the network was not notified that the MAC address had moved.
They captured traffic on the outgoing bond interface and saw no RARP packets at all. On RHOSP 13 they saw 5x RARP packets after every live migration.
On further investigation, they captured traffic at each possible location between the instance and the outgoing interface on the destination compute node:

* tap: 5x RARP packets were seen
* qbr: 5x RARP packets were seen
* qvb: 5x RARP packets were seen
* qvo: 5x RARP packets were seen
* br-int: -
* br-link: No RARP packets were seen
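
The per-interface check above comes down to spotting RARP frames (EtherType 0x8035) in a capture. A minimal Python sketch of that test follows. The helper names `is_rarp` and `build_rarp_announce` are hypothetical, and the announce frame layout (broadcast destination, opcode 3, zeroed IP fields, padding to 60 bytes) is an assumption about what the hypervisor sends after migration, not something taken from the customer capture.

```python
import struct

ETHERTYPE_RARP = 0x8035  # EtherType assigned to RARP
ETHERTYPE_VLAN = 0x8100  # 802.1Q VLAN tag


def is_rarp(frame: bytes) -> bool:
    """Return True if an Ethernet frame carries RARP, with or without one 802.1Q tag."""
    if len(frame) < 14:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        # Skip the 4-byte VLAN tag and read the inner EtherType.
        if len(frame) < 18:
            return False
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    return ethertype == ETHERTYPE_RARP


def build_rarp_announce(mac: bytes) -> bytes:
    """Build a RARP reverse-request frame for testing (assumed layout, see lead-in)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ethernet = b"\xff" * 6 + mac + struct.pack("!H", ETHERTYPE_RARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, opcode=3 (reverse request)
    header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 3)
    body = header + mac + b"\x00" * 4 + mac + b"\x00" * 4
    # Pad to the 60-byte minimum Ethernet frame size (without FCS).
    return (ethernet + body).ljust(60, b"\x00")


if __name__ == "__main__":
    frame = build_rarp_announce(bytes.fromhex("fa163e000001"))
    print(is_rarp(frame))
```

Running the same predicate over captures taken at tap, qbr, qvb, qvo and the provider bridge would show the hop at which the frames disappear.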

Afterwards, it was clear that "br-link" was dropping the RARP packets. They therefore repeated the test and checked the OpenFlow rules of "br-link" on the destination compute node (the compute node was otherwise empty). According to the outputs, the RARP packets were generated and sent correctly, but the required OpenFlow rules were not yet present when the packets arrived, so the packets were dropped. Details of how they tracked this down are in the next comment, which is private because it contains customer-sensitive data.
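
The race described above can be sketched as a simple timeline comparison: each RARP packet is forwarded only if the bridge's flows are already installed when it arrives. The following Python snippet is an illustrative model only. The names `Outcome` and `classify_announcements` are hypothetical, and the timestamps are made-up values, not measurements from this case.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Outcome:
    forwarded: int
    dropped: int


def classify_announcements(packet_times: Iterable[float], flows_ready_at: float) -> Outcome:
    """Count packets that arrive before (dropped) or after (forwarded) flow installation."""
    times = list(packet_times)
    forwarded = sum(1 for t in times if t >= flows_ready_at)
    return Outcome(forwarded=forwarded, dropped=len(times) - forwarded)


if __name__ == "__main__":
    # Hypothetical send times (seconds after migration completes) for 5 RARP packets.
    rarp_times = [0.0, 0.1, 0.2, 0.3, 0.4]
    # Flows installed only after all announcements were sent: every packet is lost.
    print(classify_announcements(rarp_times, flows_ready_at=1.0))
    # Flows installed before the first announcement: every packet gets through.
    print(classify_announcements(rarp_times, flows_ready_at=0.0))
```

In this model the outcome reported here corresponds to the first call, where the rules land after the last announcement has already been sent.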


Version-Release number of selected component (if applicable):
openvswitch2.11-2.11.0-50.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-22.el8fdp.noarch
rhosp-openvswitch-2.11-0.6.el8ost.noarch

Additional info:
Logs on supportshell under '/cases/02710209/'

Comment 2 Bernard Cafarelli 2020-07-27 13:50:10 UTC
We may have seen something similar upstream (this RARP packet comes from qemu)

Comment 33 Alex Katz 2021-03-14 12:23:41 UTC
Verified on RHOS-16.1-RHEL-8-20210304.n.0 in an ml2/ovs environment.

Comment 63 errata-xmlrpc 2021-05-26 13:49:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.6 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2097