Bug 1298876 - Problem of enabling promiscuous mode on a virtual switch to use RHEL OSP-director on ESXi
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks: 1347518
 
Reported: 2016-01-15 10:45 UTC by Erwan Gallen
Modified: 2019-12-16 05:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-07 10:04:04 UTC
Target Upstream Version:
Embargoed:
ealcaniz: needinfo-



Description Erwan Gallen 2016-01-15 10:45:35 UTC
Description of problem:

When you want to run RHEL OSP-director services in a VM hosted on ESXi, you need to enable promiscuous mode on the ESXi virtual switch:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004099

In a nested deployment environment, the ESXi anti-spoofing mechanism causes deployment issues, because multiple MAC addresses are used by one physical interface (DHCP namespaces, etc.).

When promiscuous mode is enabled, the security level is lowered and network performance is also reduced.

Expected results:

Users should not have to enable promiscuous mode on ESXi in order to use RHEL OSP-director.

One solution could be to use an ARP responder together with a MAC address translation mechanism that rewrites the source MAC address. This could be added to Neutron, which would inject the corresponding flow rules into OVS. There is already an ARP responder in Neutron, so maybe this component could be reused.
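
As a rough illustration of the proposal above, the sketch below only builds ovs-ofctl style flow strings; it does not talk to OVS or Neutron. The function names, the table numbers (21 and 0), the priorities and the example addresses are assumptions made for illustration, and the ARP responder actions are modelled on the general shape of an OVS ARP responder flow rather than copied from any particular Neutron release.

```python
import ipaddress


def arp_responder_flow(ip: str, mac: str, table: int = 21) -> str:
    """Build a flow that answers ARP requests for `ip` with `mac`.

    The table number is an assumption; adjust it to the bridge layout in use.
    """
    ip_hex = f"{int(ipaddress.IPv4Address(ip)):#x}"
    mac_hex = "0x" + mac.replace(":", "").lower()
    actions = [
        "move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[]",   # reply goes back to the asker
        f"mod_dl_src:{mac}",                         # Ethernet source = answered MAC
        "load:0x2->NXM_OF_ARP_OP[]",                 # ARP opcode 2 = reply
        "move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[]",   # asker becomes the ARP target
        "move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[]",
        f"load:{mac_hex}->NXM_NX_ARP_SHA[]",         # sender hardware address
        f"load:{ip_hex}->NXM_OF_ARP_SPA[]",          # sender protocol address
        "in_port",                                   # send out the ingress port
    ]
    match = f"table={table},priority=1,arp,arp_tpa={ip}"
    return f"{match},actions={','.join(actions)}"


def source_mac_rewrite_flow(namespace_mac: str, vnic_mac: str) -> str:
    """Build a flow that rewrites a namespace MAC to the VM's own vNIC MAC.

    Hypothetical egress rule: ESXi would then only ever see the vNIC MAC.
    """
    return (
        f"table=0,priority=10,dl_src={namespace_mac},"
        f"actions=mod_dl_src:{vnic_mac},NORMAL"
    )


if __name__ == "__main__":
    print(arp_responder_flow("192.0.2.10", "fa:16:3e:00:00:01"))
    print(source_mac_rewrite_flow("fa:16:3e:00:00:01", "00:50:56:aa:bb:cc"))
```

A complete solution would also need the reverse translation for incoming frames (mapping the vNIC MAC back to the right namespace MAC per destination IP), which is not shown here.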

Comment 3 Lucas Alvares Gomes 2016-01-15 14:11:32 UTC
Hi Erwan,

Quick questions...

Is this about running the undercloud on an ESXi node?

Also, do we have to enable promiscuous mode in order to lower the security level and thereby disable the anti-spoofing mechanism?

If so, it would be good if we could disable only the anti-spoofing mechanism. I don't know much about ESXi, but it seems to be possible [0] (search for "Prevent spoofing").

[0] http://faq.sanbarrow.com/index.php?action=artikel&cat=7&id=80&artlang=en

Cheers,
Lucas

Comment 7 Dan Sneddon 2016-02-17 18:07:47 UTC
(In reply to Lucas Alvares Gomes from comment #3)
> Hi Erwan,
> 
> Quick questions...
> 
> Is this about running the undercloud on an ESXi node?
> 
> Also, do we have to enable promiscuous mode in order to lower the security
> level and thereby disable the anti-spoofing mechanism?
> 
> If so, it would be good if we could disable only the anti-spoofing
> mechanism. I don't know much about ESXi, but it seems to be possible [0]
> (search for "Prevent spoofing").
> 
> [0] http://faq.sanbarrow.com/index.php?action=artikel&cat=7&id=80&artlang=en
> 
> Cheers,
> Lucas

I would disagree that you are making the VM "less safe" by enabling promiscuous mode. There is nothing to stop a bare-metal host from running in promiscuous mode, and in fact it is required in order for Overcloud hosts to function properly.

By default, VMware filters the traffic between the host and the rest of the network, and enabling promiscuous mode turns off that filter.

Specifically, floating IP support requires promiscuous mode. This is well known, just look through these results:

https://www.google.com/search?q=openstack+vmware+promiscuous&ie=utf-8&oe=utf-8

If you really can't enable it for all network interfaces, promiscuous mode will AT MINIMUM need to be enabled for all network interfaces that will be carrying external Neutron routers. That is, the floating IP network(s).

-Dan

Comment 8 Mike Burns 2016-04-07 21:03:37 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 10 Harald Jensås 2016-05-18 09:09:46 UTC
These are the issues the customer has with running with promiscuous mode enabled long term:

a) Enabling promiscuous mode allows the VM to see all traffic to/from other VMs in the same ESXi PortGroup, i.e. the network segment. (It is not the VM itself they are worried about; it is the fact that this VM can now see other VMs' traffic on the same segment. This is not allowed in their environment.)

b) Enabling promiscuous mode has a performance impact. [1] [2]
Essentially, all packets to any host on the vSwitch Port Group are replicated to all VMs in the portgroup, causing a big overhead.

The performance issue can be mitigated by applying the ESXi Mac Learning dvFilter [3].


Why does OSP need promiscuous mode? The ESXi vSwitch does not do MAC learning.

For most ESX use cases, MAC learning is not required, as ESX knows exactly which MAC address will be used by a VM. However, it is needed when multiple MAC addresses are used by one physical interface, e.g. by the namespaces used by external Neutron routers.
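
To make the MAC learning point concrete, here is a toy model of a learning switch. It is a generic textbook sketch, not the ESXi vSwitch or the dvFilter implementation; the class name, port names and MAC addresses are made up for illustration.

```python
class LearningSwitch:
    """Minimal MAC learning switch: learn source MACs, flood unknown destinations."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the set of ports a frame is delivered to."""
        # Learn (or refresh) where the source MAC lives.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return self.ports - {in_port}
        # Known destination: deliver to that single port only.
        return set() if out_port == in_port else {out_port}


if __name__ == "__main__":
    sw = LearningSwitch(["vm1", "vm2", "undercloud"])
    # A Neutron namespace behind the undercloud port sends a broadcast;
    # the switch learns its self-generated MAC from the source address.
    print(sw.forward("undercloud", "fa:16:3e:aa:bb:cc", "ff:ff:ff:ff:ff:ff"))
    # A later frame to that MAC goes to the undercloud port only.
    print(sw.forward("vm1", "00:50:56:00:00:01", "fa:16:3e:aa:bb:cc"))
```

With learning, a frame for a namespace MAC reaches one port instead of being copied to every promiscuous port, which is the overhead described in b).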


[1] http://anthonyspiteri.net/nested-esxi-reduced-network-throughput-with-promiscuous-mode-portgroups/
[2] http://www.virtuallyghetto.com/2014/08/new-vmware-fling-to-improve-networkcpu-performance-when-using-promiscuous-mode-for-nested-esxi.html
[3] https://labs.vmware.com/flings/esxi-mac-learning-dvfilter

Comment 13 Dan Sneddon 2016-07-06 17:03:26 UTC
(In reply to Harald Jensås from comment #10)
> These are the issues the customer has with running with promiscuous mode
> enabled long term:
> 
> a) Enabling promiscuous mode allows the VM to see all traffic to/from other
> VMs in the same ESXi PortGroup, i.e. the network segment. (It is not the VM
> itself they are worried about; it is the fact that this VM can now see other
> VMs' traffic on the same segment. This is not allowed in their environment.)
> 
> b) Enabling promiscuous mode has a performance impact. [1] [2]
> Essentially, all packets to any host on the vSwitch Port Group are replicated
> to all VMs in the portgroup, causing a big overhead.
> 
> The performance issue can be mitigated by applying the ESXi Mac Learning
> dvFilter [3].
> 
> 
> Why does OSP need promiscuous mode? The ESXi vSwitch does not do MAC learning.
> 
> For most ESX use cases, MAC learning is not required, as ESX knows exactly
> which MAC address will be used by a VM. However, it is needed when multiple
> MAC addresses are used by one physical interface, e.g. by the namespaces used
> by external Neutron routers.
> 
> 
> [1] http://anthonyspiteri.net/nested-esxi-reduced-network-throughput-with-promiscuous-mode-portgroups/
> [2] http://www.virtuallyghetto.com/2014/08/new-vmware-fling-to-improve-networkcpu-performance-when-using-promiscuous-mode-for-nested-esxi.html
> [3] https://labs.vmware.com/flings/esxi-mac-learning-dvfilter

OpenStack requires that the Neutron services be run with promiscuous mode. This is because the virtual routers and DHCP agents use self-generated MAC addresses, and need to be able to receive traffic sent to those self-generated MACs. VMware is only aware of the MAC address that the VM itself is using, not the various MACs that Neutron uses in network namespaces.

In order for traffic to be sent from a virtual MAC address, ESXi requires that "MAC address changes" and "forged transmits" be enabled. In order for traffic to be received on the virtual address, promiscuous mode needs to be enabled. The reason is that VMware otherwise filters out frames addressed to the self-generated MAC addresses and drops them.
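
The send and receive rules described above can be sketched as a small model. This is a deliberate simplification and not how ESXi is implemented: the class and function names are hypothetical, and the "MAC address changes" setting is left out because this sketch does not model the guest changing its own vNIC MAC.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class PortPolicy:
    """Subset of the vSwitch security policy relevant to this discussion."""
    promiscuous: bool = False
    forged_transmits: bool = False


@dataclass
class VSwitchPort:
    assigned_mac: str  # the MAC the hypervisor assigned to the vNIC
    policy: PortPolicy = field(default_factory=PortPolicy)


def may_send(port: VSwitchPort, src_mac: str) -> bool:
    """A frame leaves the VM if its source is the vNIC MAC or forging is allowed."""
    return src_mac == port.assigned_mac or port.policy.forged_transmits


def may_receive(port: VSwitchPort, dst_mac: str) -> bool:
    """A frame reaches the VM if addressed to the vNIC MAC, broadcast, or promiscuous."""
    return dst_mac in (port.assigned_mac, BROADCAST) or port.policy.promiscuous


if __name__ == "__main__":
    dhcp_ns_mac = "fa:16:3e:12:34:56"  # self-generated MAC in a Neutron namespace
    port = VSwitchPort(assigned_mac="00:50:56:aa:bb:cc")
    print(may_send(port, dhcp_ns_mac), may_receive(port, dhcp_ns_mac))
    port.policy = PortPolicy(promiscuous=True, forged_transmits=True)
    print(may_send(port, dhcp_ns_mac), may_receive(port, dhcp_ns_mac))
```

In this model the DHCP namespace on the Undercloud can neither send nor receive with the default policy, and both directions work once the two settings are enabled.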

On the Undercloud, we use the Neutron DHCP agent for PXE booting the nodes. This agent is designed to use a self-generated MAC inside of a namespace. It's unfortunate that VMware takes a big performance hit when operating in this mode, but I'm not sure there is anything we can do about it (short of re-engineering OpenStack to not use self-generated MAC addresses, which isn't likely to happen any time soon).

There are certain things that can help. They can run fewer VMs on the ESXi host that is housing the Undercloud. They might have better results if they used PCI passthrough to assign an entire network card to the VM, which could then be used for the Provisioning interface (although we haven't tested that to see if it works). You linked to a VMware Fling, a learning-switch plugin for ESXi. That might help, but that testing seems to be outside our purview. Perhaps VMware can provide support for testing that?

Comment 16 Hugh Brock 2016-07-25 11:55:05 UTC
reassigning this to the correct owner

Comment 17 Edu Alcaniz 2016-08-19 07:19:54 UTC
Good morning, could we get an update on this BZ, please?

Comment 18 Edu Alcaniz 2016-09-07 10:04:04 UTC
Closed by customer, VMware will 

Hi Edu,

From my perspective, vSphere 2016 (a.k.a. v6.5) should provide the MAC learning feature in the VDS. We could then avoid the requirement for promiscuous mode.

At the moment, vSphere 2016 is expected to reach RTM at the end of October and GA in mid-November.

Best regards,

Jerome

Jerome Asseray – Global Solution Consultant Orange Group

VMware France | Tour Franklin 100-101 Terrasse Boieldieu 92042 Paris La Defense Cedex
Email: jasseray| Mobile: +33 7 87 47 15 64

