This bug has been migrated to another issue tracking site. It has been closed here and may no longer be monitored.

If you would like to get updates for this issue, or to participate in it, you may do so at Red Hat Issue Tracker.
Bug 2111556 - [RFE] Layer 2 encryption of control plane using macsec
Keywords:
Status: CLOSED MIGRATED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 18.0 (Zed)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: OSP Team
QA Contact: Nobody
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-27 13:23 UTC by Eric Nothen
Modified: 2023-08-22 01:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-22 01:12:15 UTC
Target Upstream Version:
Embargoed:


Links
System                 ID         Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker  OSP-17884  0        None      None    None     2022-07-27 13:30:18 UTC
Red Hat Issue Tracker  OSP-27642  0        None      None    None     2023-08-22 01:13:14 UTC
Red Hat Issue Tracker  OSPRH-307  0        None      None    None     2023-08-22 01:12:15 UTC

Description Eric Nothen 2022-07-27 13:23:14 UTC
Description of problem:

We currently support encryption of internal and admin OpenStack endpoints at the application layer, for which an IdM server needs to be deployed and integrated in order to manage the large number of certificates required (one for each endpoint on each overcloud node).

Large customers that already have internal CA and DNS infrastructure deployed (especially those using dedicated appliances) are reluctant to deploy IdM solely to serve the overcloud deployment, and perceive it as a single point of failure.

As an alternative to L7 encryption, the RHOSP control plane could be deployed on top of macsec interfaces, which would take care of L2 encryption at the OS level, simplifying the deployment and configuration of RHOSP itself. This way, all RHOSP control plane traffic would be encrypted, including not only the services for which we already support certificate-based encryption, but also services that don't, such as Memcached (currently tech-preview), Keystone middleware and Django. The MACsec standard is included and supported out of the box in RHEL 8 and 9.

This RFE is to support the deployment of macsec interfaces by TripleO as the foundation of the control plane subnets.


How reproducible:

This setup can be reproduced today by deploying an overcloud with bridged interfaces, then manually replacing the slaves (regular NICs) with macsec interfaces.

Steps to Reproduce:
1. Deploy undercloud, set MTU of provisioning network to 1468 in advance
2. Deploy overcloud using a bridge for the control plane. Set MTU of all internal vlans to 1468.
3. Stop fencing

On director and each overcloud node:

4. Un-enslave the NIC from the control plane bridge (br-ctlplane or whichever bridge is in use)
5. Set MTU of the NIC to 1500
6. Create the macsec interface
7. Add one TX key, plus one RX key for each of the other nodes
8. Enslave the macsec interface to the bridge

Actual results:

All traffic between director and the overcloud nodes, as well as between the overcloud nodes themselves, is encrypted:

~~~
[root@overcloud-controller-0 ~]# tcpdump -i enp2s0 -vv | head -20
dropped privs to tcpdump
tcpdump: listening on enp2s0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:04:26.810962 52:54:01:72:b1:47 (oui Unknown) > 52:54:01:f4:17:cc (oui Unknown), ethertype Unknown (0x88e5), length 98: 
	0x0000:  2c00 0020 c800 5254 0172 b147 0001 9468  ,.....RT.r.G...h
	0x0010:  cd54 31de dde0 5aff afae 8f22 256f 070f  .T1...Z...."%o..
	0x0020:  8b1c 109c 80e7 78f7 68b1 b415 f8f2 0aae  ......x.h.......
	0x0030:  3ed5 bea5 0cbc fd84 b50b c800 6b64 4aa9  >...........kdJ.
	0x0040:  c34e e22a 4395 8384 72f1 8e8e f497 e9c2  .N.*C...r.......
	0x0050:  2568 c62c                                %h.,
13:04:26.810980 52:54:01:72:b1:47 (oui Unknown) > 52:54:01:f4:17:cc (oui Unknown), ethertype Unknown (0x88e5), length 98: 
	0x0000:  2c00 0020 c801 5254 0172 b147 0001 9494  ,.....RT.r.G....
	0x0010:  f92d a91e e5c6 6066 18ea b3a4 bc4f b85d  .-....`f.....O.]
	0x0020:  04d2 7759 f6bd b2bf faae 9514 e1ed 2d3b  ..wY..........-;
	0x0030:  29bd c889 204b 0f66 9457 4021 6836 fa43  )....K.f.W@!h6.C
	0x0040:  b0b4 2bce bd87 aa77 71a6 3e84 6c5c 0746  ..+....wq.>.l\.F
	0x0050:  9414 b711                                ....
13:04:26.811051 52:54:01:f4:17:cc (oui Unknown) > 52:54:01:72:b1:47 (oui Unknown), ethertype Unknown (0x88e5), length 98: 
	0x0000:  2c00 0024 c42d 5254 01f4 17cc 0001 e849  ,..$.-RT.......I
	0x0010:  9223 a181 076c 43b4 b667 8ecd 4d7a 28b7  .#...lC..g..Mz(.
	0x0020:  2c50 fa0a 492f ae5a 3c44 9952 cbbf afa3  ,P..I/.Z<D.R....
	0x0030:  870c f491 5caf 0c21 a72c e5c7 70b3 e8ae  ....\..!.,..p...
	0x0040:  c02c 6a7f 967d 0a5c f52c 077d fad4 6f42  .,j..}.\.,.}..oB
tcpdump: Unable to write output: Broken pipe
[root@overcloud-controller-0 ~]# 
~~~
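
The bytes that follow EtherType 0x88e5 in the capture above are the MACsec SecTAG. As a sanity check, a minimal sketch of a decoder is shown here, assuming the IEEE 802.1AE layout (TCI/AN octet, short length, 32-bit packet number, optional 8-byte SCI). The `SecTag` class and `parse_sectag` function are hypothetical helpers, not part of any tool used in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecTag:
    version: int
    end_station: bool
    sci_present: bool
    single_copy_broadcast: bool
    encrypted: bool
    changed_text: bool
    association_number: int
    short_length: int
    packet_number: int
    sci_mac: Optional[str]
    sci_port: Optional[int]


def parse_sectag(data: bytes) -> SecTag:
    """Decode a SecTAG; `data` starts right after the 0x88e5 EtherType."""
    tci = data[0]
    sci_present = bool(tci & 0x20)
    sci_mac = sci_port = None
    if sci_present:
        # SCI = 6-byte system MAC address followed by a 2-byte port number
        sci_mac = ":".join(f"{b:02x}" for b in data[6:12])
        sci_port = int.from_bytes(data[12:14], "big")
    return SecTag(
        version=(tci >> 7) & 1,
        end_station=bool(tci & 0x40),
        sci_present=sci_present,
        single_copy_broadcast=bool(tci & 0x10),
        encrypted=bool(tci & 0x08),
        changed_text=bool(tci & 0x04),
        association_number=tci & 0x03,
        short_length=data[1] & 0x3F,
        packet_number=int.from_bytes(data[2:6], "big"),
        sci_mac=sci_mac,
        sci_port=sci_port,
    )


if __name__ == "__main__":
    # First 16 bytes of the first frame in the capture above
    first = bytes.fromhex("2c00 0020 c800 5254 0172 b147 0001 9468")
    print(parse_sectag(first))
```

For the first frame this decodes to an encrypted frame (E and C bits set) on association number 0 with packet number 2148352, and an SCI made of MAC 52:54:01:72:b1:47 and port 1, which matches the source MAC tcpdump printed. The second frame carries packet number 2148353, one higher, as expected for consecutive frames on the same TX SA.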

However, when capturing on the macsec interface itself, all traffic is visible in cleartext:

~~~
[root@overcloud-controller-0 ~]# tcpdump -i macsec1 -vv | head -20
dropped privs to tcpdump
tcpdump: listening on macsec1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:04:59.790263 IP (tos 0x0, ttl 64, id 23829, offset 0, flags [DF], proto TCP (6), length 100)
    overcloud-controller-1.macsec.lab.51423 > overcloud-controller-0.macsec.lab.25672: Flags [P.], cksum 0x8321 (correct), seq 3584495374:3584495422, ack 2595989109, win 5969, options [nop,nop,TS val 1085936595 ecr 4114347458], length 48
13:04:59.790882 IP (tos 0x0, ttl 64, id 4524, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-0.macsec.lab.25672 > overcloud-controller-1.macsec.lab.51423: Flags [.], cksum 0x2ceb (correct), seq 1, ack 48, win 1412, options [nop,nop,TS val 4114347479 ecr 1085936595], length 0
13:04:59.809426 IP (tos 0xc0, ttl 255, id 18294, offset 0, flags [none], proto VRRP (112), length 40)
    director.ctlplane.macsec.lab > 224.0.0.18: vrrp director.ctlplane.macsec.lab > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 103, authtype none, intvl 1s, length 20, addrs: 192.168.24.3
13:04:59.809466 IP (tos 0xc0, ttl 255, id 18294, offset 0, flags [none], proto VRRP (112), length 40)
    director.ctlplane.macsec.lab > 224.0.0.18: vrrp director.ctlplane.macsec.lab > 224.0.0.18: VRRPv2, Advertisement, vrid 52, prio 103, authtype none, intvl 1s, length 20, addrs: 192.168.24.2
13:04:59.823187 IP (tos 0x0, ttl 64, id 8776, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-0.macsec.lab.memcache > overcloud-controller-2.macsec.lab.49976: Flags [.], cksum 0xe16b (correct), seq 1659315160, ack 1405736867, win 222, options [nop,nop,TS val 876043046 ecr 2359961051], length 0
13:04:59.823192 IP (tos 0x0, ttl 64, id 29077, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-0.macsec.lab.memcache > overcloud-controller-2.macsec.lab.50012: Flags [.], cksum 0x4d78 (correct), seq 3826278804, ack 2854966827, win 222, options [nop,nop,TS val 876043046 ecr 2359961051], length 0
13:04:59.823238 IP (tos 0x0, ttl 64, id 57606, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-0.macsec.lab.memcache > overcloud-controller-2.macsec.lab.50020: Flags [.], cksum 0xf08d (correct), seq 3094920941, ack 1039330693, win 222, options [nop,nop,TS val 876043046 ecr 2359961051], length 0
13:04:59.823283 IP (tos 0x0, ttl 64, id 51912, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud.ctlplane.localdomain.mysql > overcloud-controller-2.macsec.lab.33959: Flags [.], cksum 0x4fdb (correct), seq 589975137, ack 1970773866, win 746, options [nop,nop,TS val 1996389862 ecr 1738800647], length 0
13:04:59.823498 IP (tos 0x0, ttl 64, id 40508, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-2.macsec.lab.49976 > overcloud-controller-0.macsec.lab.memcache: Flags [.], cksum 0xd733 (correct), seq 1, ack 1, win 433, options [nop,nop,TS val 2359966107 ecr 875253974], length 0
13:04:59.823517 IP (tos 0x0, ttl 64, id 22546, offset 0, flags [DF], proto TCP (6), length 52)
    overcloud-controller-2.macsec.lab.50012 > overcloud-controller-0.macsec.lab.memcache: Flags [.], cksum 0x43be (correct), seq 1, ack 1, win 316, options [nop,nop,TS val 2359966107 ecr 875253965], length 0
tcpdump: Unable to write output: Broken pipe
[root@overcloud-controller-0 ~]# 
~~~
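
The frame sizes in the two captures, and the MTU of 1468 used in the steps to reproduce, are consistent with a fixed MACsec overhead of 32 bytes per frame. The following is a small sketch of that arithmetic, assuming a SecTAG that carries an explicit SCI and the default 16-byte ICV; the constant and function names are illustrative only.

```python
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
SECTAG = 16      # EtherType 0x88e5 (2) + TCI/AN (1) + SL (1) + PN (4) + SCI (8)
ICV = 16         # integrity check value, assuming the default 16-byte length
MACSEC_OVERHEAD = SECTAG + ICV


def macsec_mtu(nic_mtu: int) -> int:
    """MTU left for the macsec interface on top of a NIC with `nic_mtu`."""
    return nic_mtu - MACSEC_OVERHEAD


def wire_length(ip_length: int) -> int:
    """On-wire frame length (without FCS) of an IP packet sent via MACsec."""
    return ETH_HEADER + ip_length + MACSEC_OVERHEAD


if __name__ == "__main__":
    print(macsec_mtu(1500))  # 1468, the MTU set on the internal networks
    print(wire_length(52))   # 98, the frame length seen on enp2s0
```

A 52-byte TCP ACK like the ones seen on macsec1 therefore becomes a 98-byte frame on enp2s0, the same length as the encrypted frames in the first capture, although the captures were taken at different times and do not necessarily show the same packets.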

Basic functionality such as flavor, image, network, subnet, router, and server create/delete works, and I have no reason to think that other services would fail, since RHOSP services are unaware of the L2 encryption. Even an overcloud deploy to update configuration completes successfully, as long as the network configuration does not need to be changed.


Additional info:

Test configuration based on the following documentation:

- MACsec: Encryption for the wired LAN - Sabrina Dubroca
  https://legacy.netdevconf.info/1.1/proceedings/papers/MACsec-Encryption-for-the-wired-LAN.pdf

- MACsec: a different solution to encrypt network traffic - Sabrina Dubroca
  https://developers.redhat.com/blog/2016/10/14/macsec-a-different-solution-to-encrypt-network-traffic#

- RHEL8 - Securing Networks / Using MACsec to encrypt layer-2 traffic in the same physical network
  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/securing_networks/assembly_using-macsec-to-encrypt-layer-2-traffic-in-the-same-physical-network_securing-networks

Comment 1 Eric Nothen 2023-05-03 13:07:09 UTC
While I would very much like to see this RFE make progress and eventually become a feature, I doubt the request is compatible with the roadmap of RHOSP, in particular regarding the future deployment restrictions (the process described in this BZ relies on configuration performed at the OS level of the overcloud nodes, both controllers and computes).

Can someone on the OSP team decide if we want to CLOSE WONTFIX this BZ or if it is something that can be worked out as described or in some other way so that I can pass the right message to the customer?

Comment 2 Harald Jensås 2023-05-05 09:33:47 UTC
Hi Eric,

In the future, nmstate (https://nmstate.io/) is more likely to be used to manage networking. os-net-config is adding support for using it via the Python library, and OCP and future OSP will most likely support using NMState directly. I.e., I think the path forward is to ensure that nmstate in RHEL and RHCOS supports configuring MACsec.

I would suggest closing this RFE as MIGRATED(?) and opening a new RFE, not OSP-specific, against RHEL nmstate.


Regards
Harald

Comment 3 Eric Nothen 2023-05-05 09:57:18 UTC
(In reply to Harald Jensås from comment #2)

Harald, thank you for the information. What I don't understand is how nmstate or even os-net-config would be part of OSP starting with 18, when there are no hosts for the OSP control plane, just containers running on top of RHOCP. I would guess OCP networking would have to deal with the network encryption in the same way it would for any other workload, thus making this RFE incompatible. Is this a valid assumption?

Comment 4 Harald Jensås 2023-05-05 11:17:31 UTC
(In reply to Eric Nothen from comment #3)

OCP uses nmstate[1] to configure node interfaces. The containers running on OCP use Multus[2] to attach multiple network interfaces. Multus network attachments use a "master" device on each OCP worker node. The master device can be a physical interface, bridge, bond, VLAN, or *MACsec* (if nmstate[1] can configure that). The compute nodes (external dataplane nodes) in OSP18 will run on baremetal RHEL hosts.

I imagine that, if nmstate supports MACsec:
* you can configure MACsec on the worker node interfaces, and pods with Multus could use the MACsec interface as the "master", or a bridge/bond on top of MACsec interfaces; i.e., you have encrypted traffic on the OCP worker node interfaces used for OSP isolated traffic.
* you can also configure MACsec on the external dataplane nodes to get encrypted traffic on these nodes.

This might change, but both os-net-config and nmstate are likely to be available for network configuration on the external dataplane nodes. nmstate as a backend is a feature being added to os-net-config; i.e., os-net-config support for MACsec depends on nmstate's support.


[1] https://access.redhat.com/documentation/en-us/openshift_container_platform/4.7/html/networking/kubernetes-nmstate
[2] https://cloud.redhat.com/blog/how-to-use-kubernetes-services-on-secondary-networks-with-multus-cni

Comment 5 Eric Nothen 2023-06-28 10:03:39 UTC
Thanks for the clarification. 

I've now opened BZ#2218137 requesting support for MACsec interfaces in nmstate.

Comment 6 Eric Nothen 2023-06-28 10:05:28 UTC
I think we could close this BZ. If this is ever supported by nmstate, I will then ask for OSP to support using it, but I'm guessing that's not going to be anytime soon.

