
Bug 1481557

Summary: neutron needs to send an unsolicited gratuitous ARP request to update QFABRIC
Product: Red Hat OpenStack
Reporter: Nilesh <nchandek>
Component: openstack-neutron
Assignee: Assaf Muller <amuller>
Status: CLOSED ERRATA
QA Contact: Toni Freger <tfreger>
Severity: medium
Docs Contact:
Priority: medium
Version: 10.0 (Newton)
CC: amuller, chrisw, cpatters, nyechiel, pablo.iranzo, srevivo
Target Milestone: z6
Keywords: TestOnly, Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-neutron-9.4.0-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-15 13:53:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Nilesh 2017-08-15 05:25:37 UTC
When live-migrating an instance between two compute nodes, neutron only sends a gratuitous ARP_REPLY packet to the upstream switches to update their MAC tables. However, for Juniper QFABRIC we also need to send an unsolicited ARP_REQUEST.

Because only an ARP_REPLY is sent, the QFABRIC keeps the stale MAC table entry cached until it ages out, which by default takes around 20 minutes.
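
For illustration only, here is a minimal sketch of how the two gratuitous ARP variants differ on the wire. The helper name build_gratuitous_arp and the example MAC and IP values are hypothetical and do not come from neutron. Zeroing the target hardware address is an assumption made for this sketch; real senders may fill that field differently. The point is that the two payloads are identical apart from the opcode.

```python
import socket
import struct

ARP_REQUEST = 1  # ARP opcode for a request
ARP_REPLY = 2    # ARP opcode for a reply


def build_gratuitous_arp(opcode, mac, ip):
    """Return a 28-byte ARP payload for a gratuitous ARP (hypothetical helper)."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    # Assumption for this sketch: the target hardware address is zeroed.
    # Real senders may use broadcast or their own MAC in this field.
    target_mac = b"\x00" * 6
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,           # hardware type: Ethernet
        0x0800,      # protocol type: IPv4
        6,           # hardware address length
        4,           # protocol address length
        opcode,      # 1 = request, 2 = reply
        sender_mac,
        sender_ip,
        target_mac,
        sender_ip,   # gratuitous ARP: target IP equals sender IP
    )


request = build_gratuitous_arp(ARP_REQUEST, "fa:16:3e:00:00:01", "192.0.2.10")
reply = build_gratuitous_arp(ARP_REPLY, "fa:16:3e:00:00:01", "192.0.2.10")
```

A switch that only refreshes its table on requests would ignore the second payload, which matches the behaviour reported here.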

Looking at https://github.com/openstack/neutron/blob/stable/newton/neutron/agent/linux/ip_lib.py, the upstream stable/newton code is already patched, whereas the RHOSP code is not.

Can you please release a hotfix for this issue while the fix is being mainlined?

GitHub
def _arping(ns_name, iface_name, address, count, log_exception):
    # Due to a Linux kernel bug*, it's advised to spread gratuitous updates
    # more, injecting an interval between consequent packets that is longer
    # than 1s which is currently hardcoded** in arping. To achieve that, we
    # call arping tool the 'count' number of times, each issuing a single ARP
    # update, and wait between iterations.
    #
    # *  https://patchwork.ozlabs.org/patch/760372/
    # ** https://github.com/iputils/iputils/pull/86
    first = True
    for i in range(count):
        if not first:
            # hopefully enough for kernel to get out of locktime loop
            time.sleep(2)
        first = False

        # some Linux kernels* don't honour REPLYs. Send both gratuitous REQUEST
        # and REPLY packets (REQUESTs are left for backwards compatibility for
        # in case if some network peers, vice versa, honor REPLYs and not
        # REQUESTs)
        #
        # * https://patchwork.ozlabs.org/patch/763016/
        for arg in ('-U', '-A'):
            arping_cmd = ['arping', arg, '-I', iface_name, '-c', 1,
                          # Pass -w to set timeout to ensure exit if interface
                          # removed while running
                          '-w', 1.5, address]
            try:
                ip_wrapper = IPWrapper(namespace=ns_name)
                # Since arping is used to send gratuitous ARP, a response is
                # not expected. In some cases (no response) and with some
                # platforms (>=Ubuntu 14.04), arping exit code can be 1.
                ip_wrapper.netns.execute(arping_cmd, extra_ok_codes=[1])
            except Exception as exc:
                # Since this is spawned in a thread and executed 2 seconds
                # apart, the interface may have been deleted while we were
                # sleeping. Downgrade message to a warning and return early.
                exists = device_exists(iface_name, namespace=ns_name)
                msg = _("Failed sending gratuitous ARP to %(addr)s on "
                        "%(iface)s in namespace %(ns)s: %(err)s")
                logger_method = LOG.exception
                if not (log_exception or exists):
                    logger_method = LOG.warning
                logger_method(msg, {'addr': address,
                                    'iface': iface_name,
                                    'ns': ns_name,
                                    'err': exc})
                if not exists:
                    LOG.warning(_LW("Interface %s might have been deleted "
                                    "concurrently"), iface_name)
                    return


RHOSP
The arping_cmd in RHOSP is:

            arping_cmd = ['arping', '-A', '-I', iface_name, '-c', 1,
                          '-w', 1.5, address]
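
To make the difference between the two variants concrete, the following is a small sketch rather than the neutron implementation. The function build_garp_commands and the interface and address values are hypothetical. It only builds the argument lists and does not run arping. In iputils arping, -U sends unsolicited ARP REQUEST packets and -A does the same with ARP REPLY packets.

```python
def build_garp_commands(iface_name, address, modes=("-U", "-A")):
    """Build one arping argument list per mode (hypothetical helper).

    -U sends a gratuitous ARP REQUEST, -A a gratuitous ARP REPLY.
    """
    return [
        ["arping", mode, "-I", iface_name, "-c", "1", "-w", "1.5", address]
        for mode in modes
    ]


# Upstream stable/newton behaviour: both a REQUEST and a REPLY are sent.
patched = build_garp_commands("qr-example", "192.0.2.10")

# Behaviour of the RHOSP build reported here: REPLY only.
unpatched = build_garp_commands("qr-example", "192.0.2.10", modes=("-A",))
```

With the default modes, each iteration issues two arping invocations per address, so peers that honour only one of the two packet types still see an update.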


RPMS:
python-neutron-9.1.1-2.el7ost.noarch

Comment 1 Assaf Muller 2017-08-15 15:05:42 UTC
https://review.openstack.org/#/c/467427/ was merged in upstream stable/newton and is included in the upstream 9.4 release, which will ship in the next OSP 10 release.

Comment 10 Lon Hohberger 2017-09-06 19:58:37 UTC
According to our records, this should be resolved by openstack-neutron-9.4.0-2.el7ost.  This build is available now.

Comment 13 errata-xmlrpc 2017-11-15 13:53:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3234