1626217 – OVN support for deterministic MAC addresses

Summary: OVN support for deterministic MAC addresses

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	openvswitch
Sub Component:
Version:	7.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	lorenzo bianconi
QA Contact:	haidong li
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1648272

Reported:	2018-09-06 18:37 UTC by Dan Winship
Modified:	2019-01-02 17:54 UTC
CC List:	10 users
Fixed In Version:	openvswitch-2.9.0-81.el7fdp
Doc Type:	Enhancement
Doc Text:	This update introduces a deterministic relationship between the IP and MAC addresses that OVN allocates dynamically. As a result, a pod remains reachable even when it is assigned an IP address that OVN previously gave to another pod.
Clone Of:
Clones:	1648272
Environment:
Last Closed:	2019-01-02 17:54:40 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2019:0014	0	None	None	None	2019-01-02 17:54:52 UTC

Description Dan Winship 2018-09-06 18:37:48 UTC

In OpenShift we've seen a problem where, when pods are created and destroyed at a high rate, you eventually end up in the following scenario:

  - pod A is talking to pod B, which has, say, IP 10.0.1.5 and
    MAC bb:bb:bb:bb:bb:bb

      - pod A ends up with an entry 10.0.1.5 -> bb:bb:bb:bb:bb:bb in
        its ARP cache

  - pod B exits / is destroyed

  - Around 255 other pods on pod B's node are created/destroyed in a
    short amount of time, and the IP address assignment range wraps
    around back to the beginning again.

  - pod C is created and gets assigned IP 10.0.1.5 and
    MAC cc:cc:cc:cc:cc:cc

  - pod A tries to talk to pod C, finds that it already has an ARP
    cache entry for 10.0.1.5, and so tries to send packets to
    IP 10.0.1.5, MAC bb:bb:bb:bb:bb:bb

      - These packets go nowhere because nobody currently has that MAC

  - pod A's attempt to talk to pod C eventually times out. Things
    start failing
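
The sequence above can be sketched as a toy model. This is not real networking code: the "network" is just a dict from MAC address to owning pod, and the names `arp_cache_a`, `live_macs`, and `deliver` are made up for illustration.

```python
# Toy model of the stale-ARP scenario. All names here are hypothetical.

arp_cache_a = {}   # pod A's ARP cache: IP -> MAC
live_macs = {}     # MACs currently attached to the network: MAC -> pod name


def deliver(arp_cache, dst_ip):
    """Return the pod that receives a packet sent to dst_ip, or None if it is lost."""
    mac = arp_cache.get(dst_ip)
    return live_macs.get(mac)


# pod B comes up with 10.0.1.5, and pod A resolves and caches its MAC.
live_macs["bb:bb:bb:bb:bb:bb"] = "pod B"
arp_cache_a["10.0.1.5"] = "bb:bb:bb:bb:bb:bb"

# pod B is destroyed; later pod C reuses the same IP with a different MAC.
del live_macs["bb:bb:bb:bb:bb:bb"]
live_macs["cc:cc:cc:cc:cc:cc"] = "pod C"

# pod A never re-resolves, so its packets are addressed to a MAC nobody owns.
result = deliver(arp_cache_a, "10.0.1.5")
print(result)  # None
```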

(This is not a problem in a VM-based world for two reasons: (a) VMs come and go less quickly than containers, so other VMs are less likely to still hold stale ARP cache entries when an IP is reused; and (b) VMs, like bare-metal hosts, tend to have startup scripts that send out gratuitous ARPs when they bring up their network connection, so any stale ARP cache entry would be corrected.)

In OpenShift SDN, our fix for this was to just assign pods deterministic MAC addresses that were based on their IPs; specifically they get 0a:58:ww:xx:yy:zz, where ww:xx:yy:zz is the IP converted to hex. (The code for this comes from CNI and is used by some other plugins as well. I don't know who chose the prefix "0a:58" or why.)
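
A minimal sketch of that mapping, assuming IPv4 only. The function name `ip_to_mac` is made up for illustration; this is not the CNI code itself, just the same 0a:58 prefix followed by the four IP octets in hex.

```python
import ipaddress


def ip_to_mac(ip: str, prefix: bytes = b"\x0a\x58") -> str:
    """Build a MAC from a 2-byte prefix followed by the 4 bytes of an IPv4 address."""
    packed = ipaddress.IPv4Address(ip).packed  # 4 bytes, network order
    return ":".join(f"{b:02x}" for b in prefix + packed)


print(ip_to_mac("10.0.1.5"))  # 0a:58:0a:00:01:05
```

Because the MAC is a pure function of the IP, any pod that is later assigned 10.0.1.5 gets the same MAC, so a stale ARP cache entry for that IP still points at the right place.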


With ovn-kubernetes, we will need to either

  1. also have deterministic IP-to-MAC mappings, OR
  2. send out ARP announcements whenever a pod is created

The latter would be possible, but is less efficient if lots of pods are being created, especially if they are attached to logical switches that are spread across multiple hosts.
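
For reference, option 2 would mean broadcasting one frame like the following for every new pod. This is a sketch of a standard ARP announcement (an ARP request whose sender and target IP are both the announced address); the function name `arp_announcement` is made up, and the code only builds the bytes, it does not send them or reflect anything ovn-kubernetes actually does.

```python
import socket
import struct


def arp_announcement(mac: str, ip: str) -> bytes:
    """Build the raw Ethernet frame for an ARP announcement of (ip, mac)."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet, IPv4, hlen=6, plen=4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, zeroed target MAC, target IP equal to sender IP.
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


frame = arp_announcement("0a:58:0a:00:01:05", "10.0.1.5")
print(len(frame))  # 42
```

Each such frame is a broadcast, so it has to reach every port on the logical switch, which is where the cost comes from when ports are spread across hosts.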


We don't handle IPv6 yet, and I'm not sure what the situation is there; in theory the kernel automatically handles the "announcement" part, so there might not be a problem. Unless the announcements get sent out before OVN is ready to forward them to other ports, which might be the case. Also, even if the announcements do get sent out, and do work, it would still be more efficient to *not* forward them, if they were known to be unnecessary.

Comment 2 Daniel Alvarez Sanchez 2018-10-29 12:28:43 UTC

Perhaps this has to do with old MAC_Binding entries in the SB DB. In OpenStack I sent a patch to work around the issue and delete them:

https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047604.html

Comment 3 Dan Winship 2018-10-29 13:34:02 UTC

No, it's the pods that are caching old MAC addresses in this case, not OVN. Though the bug you pointed out might cause additional OVN-level problems on top of the pod-level problems too I guess. (We haven't actually gotten to the point in OVN testing where we've encountered this issue yet.)

Comment 4 Daniel Alvarez Sanchez 2018-10-29 13:38:41 UTC

Yeah, got it now, thanks Dan! It perhaps can cause additional OVN problems, as you point out. Something to keep in mind now is that if we generate the MAC address deterministically based on the IP address, then stale MAC_Binding entries have to be removed when updating/upgrading OVS :)

Comment 5 lorenzo bianconi 2018-10-31 10:42:09 UTC

upstream patches (not applied yet):
- https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353327.html
- https://mail.openvswitch.org/pipermail/ovs-dev/2018-October/353328.html

Comment 7 haidong li 2018-12-03 03:44:04 UTC

verified on the latest version:
[root@hp-dl388g8-19 ovn]# uname -a
Linux hp-dl388g8-19.rhts.eng.pek2.redhat.com 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@hp-dl388g8-19 ovn]# rpm -qa | grep openvswitch
openvswitch-2.9.0-81.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-108.noarch
openvswitch-ovn-common-2.9.0-81.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-81.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-8.el7fdp.noarch
openvswitch-ovn-central-2.9.0-81.el7fdp.x86_64
[root@hp-dl388g8-19 ovn]#  ovn-nbctl ls-add sw6
[root@hp-dl388g8-19 ovn]# ovn-nbctl  set NB_Global . options:mac_prefix="00:11:22:33:44:55"
[root@hp-dl388g8-19 ovn]# ovn-nbctl  set Logical-Switch sw6 other_config:subnet=192.168.100.0/24
[root@hp-dl388g8-19 ovn]# ovn-nbctl lsp-add sw6 p6 -- lsp-set-addresses p6 dynamic
[root@hp-dl388g8-19 ovn]# ovn-nbctl get Logical-Switch-Port p6 dynamic_addresses
"00:11:22:a8:64:03 192.168.100.2"
[root@hp-dl388g8-19 ovn]# ovn-nbctl lsp-add sw6 p7 -- lsp-set-addresses p7 dynamic
[root@hp-dl388g8-19 ovn]#  ovn-nbctl get Logical-Switch-Port p7 dynamic_addresses
"00:11:22:a8:64:04 192.168.100.3"
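
The two results above are consistent with the following mapping: the first three bytes of mac_prefix, followed by the low 24 bits of the IPv4 address plus one. This is only inferred from the two outputs shown here, not taken from the OVN source, and the function name `observed_dynamic_mac` is made up.

```python
import ipaddress


def observed_dynamic_mac(mac_prefix: str, ip: str) -> str:
    """Reproduce the MACs seen above; an inference from the output, not OVN's code."""
    oui = mac_prefix.split(":")[:3]                       # only 3 prefix bytes appear to be used
    low24 = int(ipaddress.IPv4Address(ip)) & 0xFFFFFF     # low 24 bits of the IP
    suffix = (low24 + 1) & 0xFFFFFF                       # assumed +1 offset, per the output
    return ":".join(oui + [f"{b:02x}" for b in suffix.to_bytes(3, "big")])


print(observed_dynamic_mac("00:11:22:33:44:55", "192.168.100.2"))  # 00:11:22:a8:64:03
print(observed_dynamic_mac("00:11:22:33:44:55", "192.168.100.3"))  # 00:11:22:a8:64:04
```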

Comment 9 errata-xmlrpc 2019-01-02 17:54:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0014
